In my last article, I covered my setup for Wan 2.2 image2video using SwarmUI. In this one, I'll cover just the basic changes needed to the Swarm setup to get text2video working (with the appropriate wan 2.2 t2v models and lightx2v t2v LORAs, of course).
Key Model Setup
Make sure you've downloaded the wan 2.2 T2V model files, both HIGH and LOW, in whichever format you prefer (safetensors or GGUF).
In swarm, you'll select the HIGH t2v model as your "base" model (either in the Models tab or from the Model dropdown at the bottom left).
Then, under the Refine/Upscale settings, set the refiner control percentage to 0.5 and the refiner method to Step-Swap (SDXL Refiner Model Original). Note: NOT Step-Swap Noisy, which gave me garbage outputs.
Then, under Refiner Param Overrides (also in the Refine/Upscale settings), change the Refiner Model to your wan 2.2 LOW t2v model.
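To make the 0.5 setting a bit more concrete, here is a rough Python sketch of how the steps get divided between the HIGH and LOW models. It assumes the refiner control percentage is simply the fraction of total steps handed to the refiner model at the end of the run; the exact rounding Swarm uses internally may differ. The function name and the 8-step example are hypothetical, not values from my preset.

```python
def split_steps(total_steps: int, refiner_control: float) -> tuple[int, int]:
    """Return (high_steps, low_steps) for a step-swap run.

    Assumption: the refiner (LOW model) takes the last
    `refiner_control` fraction of the steps, and the base (HIGH model)
    takes whatever comes before that.
    """
    if total_steps < 1:
        raise ValueError("total_steps must be at least 1")
    if not 0.0 <= refiner_control <= 1.0:
        raise ValueError("refiner_control must be between 0 and 1")
    low_steps = round(total_steps * refiner_control)
    high_steps = total_steps - low_steps
    return high_steps, low_steps


# Hypothetical 8-step run with the 0.5 setting: half HIGH, half LOW.
print(split_steps(8, 0.5))  # (4, 4)
```

With 0.5, the HIGH model handles the first half of the steps and the LOW model finishes the second half, which is the high/low split wan 2.2 is built around.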
Swarm Prompt Tags
This is where the Swarm magic happens to connect the dots with lightx2v (and any additional high/low LORAs you want to apply).
In i2v, we use <video> and <videoswap> tags to delineate the high and low passes.
In t2v, we need to use <base> and <refiner> tags.
So at the END of my t2v prompts, I have this as a starting point:
<base><lora:video/control/loras/Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank64:1.5><refiner><lora:video/control/loras/Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank64:1.5>
As you start typing <lora: you'll get a dropdown, so you can find and select the correct high and low LORAs easily. This is also where you add the appropriate high and low versions of any additional LORAs (high versions after the <base> tag, low versions after the <refiner> tag).
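If you build prompts in a script rather than typing them into the UI, the tag layout above can be sketched in Python like this. It is only string assembly, not part of Swarm itself; the helper names `lora_tag` and `build_t2v_prompt` and the example prompt text are hypothetical, while the LORA path and weight are the ones from my starting point.

```python
def lora_tag(name: str, weight: float) -> str:
    """Format one LORA as a Swarm prompt tag, e.g. <lora:some/path:1.5>."""
    return f"<lora:{name}:{weight:g}>"


def build_t2v_prompt(prompt, high_loras, low_loras):
    """Append the <base> (HIGH pass) and <refiner> (LOW pass) sections.

    high_loras and low_loras are lists of (name, weight) pairs.
    """
    high = "".join(lora_tag(name, weight) for name, weight in high_loras)
    low = "".join(lora_tag(name, weight) for name, weight in low_loras)
    return f"{prompt} <base>{high}<refiner>{low}"


LIGHTX2V = "video/control/loras/Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank64"

# Hypothetical prompt text; the tags always go at the END of the prompt.
print(build_t2v_prompt(
    "a red fox running through snow",
    high_loras=[(LIGHTX2V, 1.5)],
    low_loras=[(LIGHTX2V, 1.5)],
))
```

To add another LORA pair, append its high version to `high_loras` and its low version to `low_loras`, so each lands under the right tag.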
That's really it for the basic "get it working" changes versus my i2v setup.
Note on resolution: I'm playing with 480x720 for speed right now, but wan 2.2 can handle higher resolutions. Play around as you see fit. I'm getting 2.5-minute total generation times on a 3090 at that resolution.
The attached preset has everything I've got configured. I'm not 100% happy with the sigma shift, LORA weights, etc., but that's down to tweaking. The basic flow is working.
Have fun!