Test Setup Overview:
Testing Methodology: The best parameters for text-to-image (T2I) and text-to-video (T2V) generation are based on the official beta scheduler for WAN 2.2, which I have subjectively confirmed to be the best scheduler available. All tests are carried out without acceleration for both the high-noise and low-noise stages, with a total of 24 steps: 14 for high-noise and 10 for low-noise.
Efficiency vs Quality: Generating high-quality images or videos with WAN 2.2 is time-consuming, so the trade-off between speed and quality is critical: the higher the quality, the longer the generation takes, which makes speed nearly as important as quality. The high-noise stage mainly determines composition, so I used faster, low-order samplers there, while the low-noise stage mainly affects image detail and texture. In the unipc and sa_solver configurations the choice of sampler did not significantly affect the outcome, so I used the same sampler for both the high- and low-noise stages. For all other tests, I mostly used the Heun sampler for the low-noise stage, which preserves fine details naturally and realistically.
Single Seed & Prompt: Because the testing process is lengthy, all parameters were tested with the same seed and prompt. Note that a different seed could introduce issues such as broken anatomy; the focus here is on texture, composition, prompt adherence, and noise.
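The setup above can be summarized as a simple two-stage configuration. This is a plain-Python illustration, not a real API; the sampler names are examples drawn from the tests below.

```python
# Illustrative two-stage WAN 2.2 text-to-image setup mirroring the
# parameters described above (not a real API).
T2I_CONFIG = {
    "scheduler": "beta",         # official beta scheduler, used for both stages
    "high_noise": {
        "steps": 14,             # first 14 of 24 total steps
        "sampler": "unipc_bh2",  # fast, low-order sampler: sets composition
    },
    "low_noise": {
        "steps": 10,             # remaining 10 steps
        "sampler": "heun",       # refines detail and texture
    },
    "acceleration": None,        # no acceleration in either stage
}

total_steps = (T2I_CONFIG["high_noise"]["steps"]
               + T2I_CONFIG["low_noise"]["steps"])  # 24
```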
prompt:
Epic fantasy, post-apocalyptic wasteland, worship of strength, the art of war (realistic). A vast, boundless battlefield, shrouded in swirling dust, littered with tattered banners and scattered weapons. Countless frail human soldiers are swept off their feet or hurled into the air by the sheer force of the mighty air currents whipped up by a dragon’s tail. In the distance, shattered city walls and pillars of fire pierce the sky.
A majestic silver-winged dragon, its scales shimmering with a metallic luster, spreads its enormous wings. Its tail is thick, muscular, and tipped with razor-sharp bone spurs. Clinging with unyielding strength to the base of the dragon’s tail is a formidable warrior clad in battle armor. Captured through time-lapse photography, the dragon—its tail gripped tightly by the warrior—executes an epic, sweeping rotation across the battlefield at breathtaking speed.
This moment is broken down into a sequence of continuous, highly dynamic, motion-blurred images, depicting the dragon’s tail carving an immense arc through the air, magnifying to the extreme the overwhelming force and devastating impact capable of mowing down entire ranks. Soldiers are flung in all directions like weightless scraps of paper by the centrifugal force, the scene drenched in speed, impact, and the sheer release of power.
A fisheye lens distorts the entire battlefield to the extreme, placing the dragon and the warrior at the center, making them appear impossibly massive and warped, while the soldiers and terrain around them are compressed and stretched to the edges of the frame—creating an intense visual shock and an immersive sense of chaos. The primary light source comes from lightning in the heavens or magical radiance emanating from the dragon itself, casting high-contrast, dramatic shadows. The dominant palette is a cold metallic blue and the gray-brown of the battlefield, accented with the flicker of firelight and splatters of blood.
unipc_bh2-beta:

euler-beta:

unipc-beta:

deis_beta:

sa_solver:

res_multistep-beta:

euler_a-beta:

er_sde-beta:

res_multistep_a-beta:

LCM-beta:

dpm_fast-beta:

Testing Results - Text-to-Image:
If I were to rate the above pictures, I think the order from highest to lowest would be:
unipc_bh2-beta: 10 points
euler-beta, unipc-beta, deis_beta: 9 points
sa_solver, res_multistep-beta, euler_a-beta: 8 points
er_sde-beta: 7 points
res_multistep_a-beta: 6 points
LCM-beta, dpm_fast-beta: Below Expectations
Key Notes:
Although in text-to-image generation—especially for photorealistic and static images—using high-order samplers at high noise levels (such as the DPM++ series and various 2M or 2S high-order samplers) can yield better results, these samplers take twice as long to process. If you are pursuing the ultimate image quality when generating pictures with WAN 2.2, it might be worth trying them. Here, I’m showcasing the results achievable when using only low-order samplers at high noise levels.
In my actual testing, when it comes to video generation, some high-order samplers do not perform better than low-order samplers in the high-noise stage. This is because the high-noise stage mainly determines composition, color, movement, and so on. In practice, I used high-order samplers like Heun or DPM++ 2S_a—which are more time-consuming—during the low-noise stage. If you put them in the high-noise stage, you could spend two or even three times more processing time without necessarily getting better results. However, when placed in the low-noise stage, the extra time consumption compared to using other low-order samplers like Euler or DEIS is almost negligible, yet the detail quality improves significantly. I believe this is a cost-effective sampling strategy.
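The cost argument above can be made concrete with a rough count of model evaluations, assuming one evaluation per step for first-order samplers such as Euler or DEIS and two per step for a second-order sampler such as Heun. This is a back-of-envelope sketch, not a benchmark.

```python
# Rough cost comparison for placing a second-order sampler in the
# high- vs low-noise stage, counted in model evaluations.
# Assumption: 1 eval/step for first-order samplers, 2 for Heun.
EVALS_PER_STEP = {"euler": 1, "deis": 1, "heun": 2}

def total_evals(high_sampler, low_sampler, high_steps=14, low_steps=10):
    """Total model evaluations for a two-stage 14+10 run."""
    return (EVALS_PER_STEP[high_sampler] * high_steps
            + EVALS_PER_STEP[low_sampler] * low_steps)

baseline = total_evals("euler", "euler")   # 14 + 10 = 24 evals
expensive = total_evals("heun", "euler")   # 28 + 10 = 38 evals (Heun high-noise)
cheap = total_evals("euler", "heun")       # 14 + 20 = 34 evals (Heun low-noise)
```

Putting Heun only in the low-noise stage adds a modest number of evaluations over the baseline while the high-noise placement costs far more, which is why the low-noise placement is the cost-effective choice.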
For text-to-video generation, the following setup was used:
High-noise: 6 steps without acceleration
Low-noise: 4 steps with LightX acceleration
Total Steps: 10 (6+4)
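The 6+4 setup can be summarized as follows. This is an illustrative dictionary, not a real API, and "lightx2v" is my shorthand label for the LightX acceleration LoRA.

```python
# Illustrative 6+4 text-to-video setup described above (not a real API).
T2V_CONFIG = {
    "high_noise": {"steps": 6, "acceleration": None},        # no speed-up LoRA
    "low_noise":  {"steps": 4, "acceleration": "lightx2v"},  # LightX-accelerated
}

total = (T2V_CONFIG["high_noise"]["steps"]
         + T2V_CONFIG["low_noise"]["steps"])  # 10
```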
Based on the text-to-image results, can the parameters that performed excellently there do equally well in the 6+4 combination for text-to-video? The test prompt calls for a complex camera movement that combines a high-altitude dive, rotation around the subject, and a pull-back shot, and it also requires the final camera movement to reveal text with artistic processing. With only 6+4 steps, which combinations can accomplish this flawlessly?
prompt: A beautiful aerial fault, with a floating sea of flowers and shattered isles suspended within the thin mist. In the early morning, the slanted sunlight pours in from the front right, weaving together with the drifting haze. A white-haired humanoid woman, draped in a transparent cape, steps onto a gravel path surrounded by petals, her long hair and the blossoms dancing in the wind.
(The camera dives swiftly down through the clouds, plummeting at high speed before rapidly closing in, then circling the heroine as it ascends and glides. It then performs a swift 270-degree shoulder-level rotation, pushing forward to the edge of the fault, where the vast and distorted space beyond slowly comes into view. Suddenly, the camera pulls sharply back, revealing the flower sea, floating isles, and drifting petals, which, from a macro perspective, naturally combine to form the seven giant letters “Civitai.”)
The texture matches the rocky gravel path, entwined with vines and scattered petals, blending seamlessly with the colors of the environment. Under the interplay of sunlight and mist-light, delicate shadows and highlights refract across the scene. The strong contrast between cool and warm tones, paired with cinematic photorealistic textures and lighting, finally forms a breathtaking spiral of time and space.
euler-beta:
euler_a-beta:
er_sde-beta:
sa_solver-beta:
unipc-linear_quadratic (high) + heun-beta (low):
unipc_linear_quadratic:
dpmpp_sde-beta:
LCM-beta:
unipc-beta, unipc-simple, unipc-normal:
Summary of Best Parameters for Video Generation:
Unfortunately, unipc no longer works properly in 6+4 text-to-video tasks. The only setup in which unipc behaves normally is when it is paired with the linear_quadratic scheduler, but the image quality in that case is clearly not good enough. As a compromise, I turned off “return noise” for the high-noise sampler and turned on “add noise” for the low-noise part. This allows more sampler combinations, and even makes it possible to use different schedulers for the high- and low-noise stages. However, it affects the continuity between the high-noise and low-noise sections and has some impact on image quality.
It can be seen that unipc-linear_quadratic (high) combined with heun-beta (low) does indeed preserve unipc’s camera movement, composition, and dynamics from the high-noise part, while the beta scheduler with the heun sampler adds more realistic and natural details. However, due to the inconsistent schedulers and the noise being added in a way that does not follow the normal official process, the final image quality is still not as natural as using the beta combination throughout.
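In ComfyUI terms, the workaround described above corresponds roughly to two advanced-sampler passes like the following. The field names mirror ComfyUI's KSamplerAdvanced node, but treat the values as a sketch of the idea rather than the exact workflow settings.

```python
# Sketch of the mixed-scheduler workaround using ComfyUI-style advanced
# sampler settings (field names follow the KSamplerAdvanced node;
# values are illustrative).
high_noise_pass = {
    "sampler_name": "uni_pc",
    "scheduler": "linear_quadratic",          # keeps unipc's motion/composition
    "steps": 6,
    "start_at_step": 0,
    "end_at_step": 6,
    "add_noise": "enable",                    # fresh noise for the first pass
    "return_with_leftover_noise": "disable",  # "return noise" turned OFF
}
low_noise_pass = {
    "sampler_name": "heun",
    "scheduler": "beta",                      # a different scheduler is now allowed
    "steps": 4,
    "start_at_step": 0,
    "end_at_step": 4,
    "add_noise": "enable",                    # "add noise" turned ON
    "return_with_leftover_noise": "disable",
}
```

Because the high-noise pass no longer hands leftover noise to the low-noise pass, the two passes are decoupled, which is what permits mixing schedulers at the cost of some continuity between stages.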
If I were to rank the above video results from best to worst, I would say:
Best: euler_a-beta, er_sde-beta (10 points)
Good: euler-beta (9 points)
Average: sa_solver-beta, unipc-linear_quadratic (high) + heun-beta (low), dpmpp_sde-beta (8 points)
Low: unipc_linear_quadratic, LCM-beta, unipc_bh2-linear_quadratic (7 points)
Unacceptable: unipc_beta, unipc-simple, unipc-normal
Workflow Used in Testing:
I have uploaded the text-to-image and text-to-video workflows used for testing here. You can also click the links below to instantly receive 1,000 RH coins, enough to generate nearly a hundred images and about thirty videos. All of the videos above were generated on RunningHub.
t2i:👉https://www.runninghub.ai/post/1959217923569168386/?inviteCode=rh-v1216
t2v:👉https://www.runninghub.ai/post/1959138293399105538/?inviteCode=rh-v1216
Alternatively, you can download the images and videos, which include the generation metadata. I hope the above tests can be helpful to you.
Supplementary test conclusion for image-to-video:
Under the 6+4 acceleration scheme, DDIM-beta offers the best overall combination of speed, image quality, and prompt adherence; there is simply no contender.
