*Note on the recent update to Lora Manager - If you have the most recent update, you can no longer pass a string of syntax into the string input of the loader. As the WF is configured, this will result in nothing being loaded. However, the updated nodes accommodate the change with a new node called WanVideo Lora Select From Text (LoraManager). All you need to do is take the string of syntax that comes out of the GET nodes (L1H, L1L, L2H, etc.) and move it from the loader string input to the 'lora_syntax' input of the new node. Connect the 'lora' output of the new node to the 'prev_lora' input of the loader. Not a huge hassle, just bridge all the loaders and you're golden.*
This is a 4-stage workflow for WAN 2.2 that uses SVI Pro nodes to transition seamlessly between each 81-frame generation. Each generation can use unique LoRAs. Prompts and associated LoRAs are saved in groups and switched together; once they are set, all you need to do is select the appropriate COLOR group and DIRECTIONAL subgroup. Set up your scenes and access them at your leisure, without the constant annoying and error-prone mix and match. The outgoing latent is saved so you can continue the generation indefinitely on subsequent runs. A 5th stage is included for FLF, to create an infinite loop. This stage isn't as good as my other FLF workflows yet, not even close, but it is there.
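Quick back-of-the-envelope on the frame count (a sketch, not node logic; the 5-frame overlap is my assumption, chosen because it reproduces the 309-frame total quoted at the end of this page):

```python
def total_frames(stages: int, frames_per_stage: int, overlap: int) -> int:
    """Total unique frames when each consecutive stage shares `overlap`
    frames with the previous one at the transition seam."""
    return stages * frames_per_stage - (stages - 1) * overlap

# 4 stages of 81 frames, assuming a 5-frame seam overlap:
print(total_frames(4, 81, 5))  # 309
```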
The workflow uses WRAPPER nodes only, and LoRA Manager is required (alternate LoRA loaders can be substituted in each stage if needed; all data up to that point is LoRA syntax switched as strings). A triple kSampler and a full base model loader are included as an alternate inference stage.
v2 info:
This here v2 is what v1 should have been. Apologies to anyone who tried to use the dumpster-fire that was v1. Major bugs zapped. Much cleaner and easier to follow. The stages now read from left to right: brown, red, yellow, cyan, with the optional green looping stage down below. The loop works, but not nearly as well as it will when completed. I haven't added the cleanup and blending that is part of the dedicated SVI FLF workflow yet, so if you want to try it out, save the 4-stage merge as well in case it sucks.
Here's what is improved; anything referenced here supersedes information from v1 below:
No more radio button toggles. Orientation is automatic, detected from the input, and first-frame selection is done from the dark blue FAST GROUP MUTER. The input can be a drag-and-drop image, the last frame from a video PATH, or the last frame of a folder (if you are saving your output as frames). The folder option also uses a PATH, but you only enter your master frame folder once in the purple string node. This sets the root folder, so saving as frames goes to a folder named automatically from your input image; you call this folder by entering just the folder name in the frame loader group.
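The automatic naming amounts to something like this (a hypothetical sketch; `frames_folder` and the "image stem becomes the folder name" rule are my assumptions about what the purple string node is doing):

```python
from pathlib import Path

def frames_folder(master_folder: str, input_image: str) -> Path:
    """Derive the per-run frame folder from the input image's filename.
    Hypothetical helper mirroring the purple string node's behavior."""
    return Path(master_folder) / Path(input_image).stem

# e.g. an input image "beach_scene_001.png" gets its own frame folder:
print(frames_folder("D:/comfy/output/frames", "beach_scene_001.png").as_posix())
```

So later, in the frame loader group, "beach_scene_001" is all you'd type to call that folder back.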
Continue from latent is fixed. It has its own switch. I didn't link this with the last saved folder, however (in case you want to do something out of order), so make sure to enter the folder name that was created by the last run so it grabs the appropriate previous frame for the anchor when you continue from latent. Don't forget to specify a latent directory, or you'll get an error with my paths.
LoRAs are now in ONE subgraph instead of TWO. That was stupid of me. Well, I did it because they were originally all loaders instead of string nodes, which made everything crawl, and it was a pain in the ass to join them after I fixed it. But they are together now as they should be. The slots are the same exact layout as the prompts on the main graph (four colors, four groups per color) and H/L are next to each other, with high on the left and low on the right. Use one of the loader nodes on either side for autocomplete and thumbnails, then copy the syntax into the appropriate slots.
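If you want to sanity-check the syntax you paste into the slots, the common `<lora:name:strength>` form can be pulled apart like this (a sketch; verify against what LoRA Manager actually emits for your files, since the exact format is my assumption):

```python
import re

# Matches <lora:name:strength> tags; name may not contain ':' or '>'.
LORA_RE = re.compile(r"<lora:([^:>]+):([0-9.]+)>")

def parse_lora_syntax(s: str) -> list[tuple[str, float]]:
    """Extract (name, strength) pairs from a LoRA syntax string."""
    return [(name, float(w)) for name, w in LORA_RE.findall(s)]

print(parse_lora_syntax("<lora:wan22_detail_high:0.8> <lora:camera_move:1.0>"))
```

Handy for checking that the names in your slots still match files on disk, given the renamed-file gotcha mentioned below.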
I’ve added a bunch of new prompt sets, I think maybe only five empty spots are left, not counting the temporary experimenting set on the bottom left. Don’t forget to check the syntax of the strings - I still have a bunch of renamed files on my end from before I started using the LoRA Manager.
TRIPLE KSAMPLER nodes are now included; they have their own inference subgraphs, along with a BASE model loader in case you want to use it. It works great, but it's slow, so I've bypassed them by default. Just swap out the subgraphs if you want to use them. The base model can be toggled on with a switch.
Upscale and interpolation group is included. Currently it’s linked to the output of stage 4, not the loop. FYI. Make sure you know what you’re doing if you enable these. Without good memory management, this is a one-way ticket to OOM. Aside from a few RAM purges, that is up to you. There is a 4090 specific management node as well. Obviously don’t use that if you misplaced your 4090.
NAG is now incorporated. Using it necessitates using the cached text encode node, as it's the only one that has the separate embed outputs that NAG needs.
I probably mentioned this below, but make sure to go through all of the green string and concatenate nodes to align everything with your comfy directory setup. All of the string nodes at the top too. Anything with a string I guess. Or just run it and fix the errors as they pop up.
——————————ORIGINAL v1 DESCRIPTION BELOW————
Ok, I've finally got this somewhat presentable for sharing. There are plenty of SVI workflows posted by now. Most of them better than mine, I'm sure. I really like SVI though, so the more the merrier. The main idea here is to automate the switching of the prompts and LoRAs, so you can just pick a preset scenario and crank it out. No typing, no selecting from drop-downs, no oh shit look what I did they're all backwards type scenarios. You should definitely be proficient with comfy and WAN before trying SVI- do not jump straight into this stuff if you're getting started. You will lose hair. It's still pretty messy, and it sure as hell isn't one of these magnificent beasts. My starting point was this, which is great and a much easier intro to SVI if that is what you need. But it's now my main WF, and I'm really pleased with the way it came out. Switching four prompts and four LoRAs independently is seriously annoying.
Here are the main features:
Each stage gets its own set of prompts; you can pick from one of 16 sets (I have only completed 7 of them so far, the rest are currently empty). You only need to use TWO switches to do this. You pick one of the colors: brown, red, yellow, or cyan. These are the big blocks with four sets in each of them. Then you pick a slot. You don't have to worry about the indices; I have labeled them Top Right, Top Left, Bot Right, Bot Left, so it should be clear enough to pick a set within a color group. Selection is done with the two Fast Group Muter nodes. I'm sure that if I add any updates to this WF it will have more slots filled in. It is annoying to get a group all set and locked in, but once it's done, it's done.
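As a mental model of the two-switch layout (this is not what the muter nodes compute, just the 4 x 4 arithmetic behind the 16 sets):

```python
# Two switches: a color group, then a slot within that group.
COLORS = ["brown", "red", "yellow", "cyan"]
SLOTS = ["Top Right", "Top Left", "Bot Right", "Bot Left"]

def set_index(color: str, slot: str) -> int:
    """Flat index of a prompt/LoRA set: 4 colors x 4 slots = 16 sets."""
    return COLORS.index(color) * 4 + SLOTS.index(slot)

print(set_index("red", "Bot Right"))  # 6
```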
You do NOT need to put image quality or SVI-specific motion prompts into the sets- I've already done that. It happens in the stages themselves. If you do need to change any of that, look at the two concats that take the GET nodes and go to the T5 encoder. Each stage has them. That's where you can put specific transition/picture quality stuff. So keep your prompts limited to actions and descriptions.
Here's the nice part: setting your prompt also sets your LoRAs, so you only need to do it once for every set you make. Take a journey down into the subgraph to configure these. The groups are laid out as exact copies of the promptage, so just throw the syntax into the appropriate nodes. All of the sub-subgraphs were initially Manager loaders, but this caused huge lag, so they are strings now. I did put loaders in on the side, so if you need previews and auto-complete, do it in there and copy your string to the set. I took a snapshot of my filtered 2.2 LoRAs and popped it into a load image node; I suggest you do the same, as it is very helpful when setting up your groups to have a bunch of thumbnails of exactly what you have. Of course you don't have to use LoRA Manager: the data is strings until it gets to the stages, and that's where you would change loaders.
You can start with an image or a frame from a video with the INPUT radio button.
To continue from the last saved latent in your latent directory, click the USE LAST LATENT radio button. Obviously you need to have one saved first; the default directory is 'latents' in your comfy output folder. I'm still working on this stage, so it might be buggy. If the output starts from the original frame instead of the last frame, just grab the frame you want, put it in the anchor spot of the first SVI node, and make sure the latent gets to the 'prev_latent' input. This spot is normally empty in a first run. I'll get this straightened out if it's not working right, and I'm going to add an option to embed frames from an already-encoded video like you can with my SVI FLF workflow, so you can continue straight from a video (and get motion data instead of just grabbing a frame, which is the point of SVI).
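Grabbing "the last saved thing in a directory" is just a newest-file lookup; if you ever need to do it outside the graph (or debug which file the workflow should be picking up), something like this works (a hypothetical helper, not a node in the WF):

```python
from pathlib import Path

def newest_file(folder: str, pattern: str = "*.png"):
    """Return the most recently written file matching `pattern` in `folder`,
    or None if there isn't one. E.g. the last saved frame to use as the
    anchor, or the last latent with pattern '*.latent'."""
    files = sorted(Path(folder).glob(pattern), key=lambda p: p.stat().st_mtime)
    return files[-1] if files else None
```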
The default model is https://civitai.com/models/2053259?modelVersionId=2477539 . I really like this one; great camera prompting. Then there is SVI PRO, of course. If you want to use full models with lightning LoRAs, there are loaders in there for them, disabled by default. Strength for lightning and SVI is set at the top left with all the loaders. Oh, don't forget the T5 encoder that works with the wrapper. It's a naughty encoder.
I've got my own preferred resolutions pre-set with a switch and an aspect ratio flipper, you can ditch all this if it irks you. Resize input goes through Contrast Adaptive Sharpening. This is really important. I highly suggest you try it. I guarantee you that half of your generation faceplants are from garbage input. Ask me how I know. CAS won't fix garbage, but what it does fix is the weird blurring you get sometimes from resizing. I've put a comparison node in there so you can slide to the middle and have one hand free to slap your forehead.
Uhh, what else is there... oh, filenames. Check the string nodes and concats and set them to your preferred filing, prefix, suffix, etc. If you want to save straight to video you can, but I'd advise saving frames. This is the default; it makes a unique folder (iterated int suffix to avoid saving into a previous folder; the number doesn't mean anything). A video is set to save at the end, but it is tagged as a preview, with a high crf.
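The iterated-int-suffix trick is just "find the next free number" (a sketch of the idea; the `run_` prefix and zero-padding here are my own choices, not what the WF names things):

```python
from pathlib import Path

def unique_run_folder(base: str, prefix: str = "run") -> Path:
    """Create the next free `prefix_NNN` folder under `base`, so a new run
    never writes into a previous run's folder. The suffix is just an
    iterator; the number itself doesn't mean anything."""
    root = Path(base)
    n = 0
    while (root / f"{prefix}_{n:03d}").exists():
        n += 1
    folder = root / f"{prefix}_{n:03d}"
    folder.mkdir(parents=True)
    return folder
```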
Unfortunately the demands of this setup necessitate the exclusion of an upscale and interpolation stage. You can add it, but it requires a lot of extra offloading and purges that can screw up your next generation. I've offloaded that stage to a different machine. Mac Studio with M2 Ultra actually handles it great, I was surprised. 4x upscale models can freak it out with huge batches, but for the most part it works great. But I digress. Point is, output is raw WAN.
Stages 1, 2, 3 have their own previews, the last preview is concatenated. The 1+2, 1+2+3, etc. previews are there but hidden and minimized in their stage groups. Too many previews.
Is that everything? Ah yes, bottom left prompt box has LoRA loaders next to it (will give you preview thumbs). That's there so you can do your experimental setups without going into the subgraph, play around, get it nailed down, then copy it to an open slot.
Obviously, by now, this is using WRAPPER nodes and LoRA Manager, 13 custom node packs in all. That's not a lot. Get all of them. If an idiot like me has a node, there is no reason at all why a genius like you does not. Sometimes you can get away with substituting core nodes, sometimes you can't. Some people actually know what they are doing. I hope to be one of those people when I grow up.
There are copious notes in the WF. Anything noteworthy has a note. Please do let me know if you use this and encounter bugs. I love being embarrassed. And it's hard to debug every permutation of every setting, so there are bound to be landmines in here.
Oh did I mention that this workflow is a monster? It will spit out 309 frames. I run comfy on a Cray X-MP with a neural net processor, a learning computer. But it gets so hot when I run this that I can't even sit on the seats. You have been warned: do not sit on the computer - you will get burned.
128GB of RAM will get you through safely. You'll probably have a few percent left over at the end, but unload what you can first. Definitely turn off defender, firewalls, and anti-virus stuff whenever using any of my workflows. They suck RAM like vacuums.
Try VRAM too. You need some of that, but not as much as the RAM. If you are one of those GGUF pussies, I can't help you. Go away.
