
Yet Another Workflow for Wan 2.2 - Step By Step with RunPod + Template (v0.38b)


I had someone ask for very plain instructions on how to get rolling with RunPod and my ComfyUI workflow, Yet Another Workflow, so I put together this article to walk through getting the RunPod template up and running. This is aimed at folks who want really basic instructions or are looking for help with problems. There are some gaps here in basic computer skills, like how to use the terminal and move files around, that I may flesh out later, but the intention is that following these steps gets you making stuff.

This article will be updated with each major release of the workflow to capture new learnings. This version has been updated for the v0.38b template release. The previous version is here. Additionally, there will be a guide for LTX 2.3, which I will link here in the near future.

Along with some general clean-up, this version adds advanced guidance on setting up a custom script to perform additional changes during pod initialization.

Pre-requisites

  • A general willingness to poke around. I'm not going to explain how to navigate the RunPod website or the basics of ComfyUI at the moment. Just be curious, and you'll get there.

  • You'll want to be willing to explore using the command line / terminal / PowerShell. You will use it to get files back to your computer with runpodctl. You don't need to know much, and it's a very useful thing to learn.

The goal I have here is to reduce friction, but I cannot eliminate it. Downloads can fail. Message me if you want hands-on help.

Why RunPod?

If you're not using it, please use my link here. We'll both get some free credit. But why use it at all? GPUs are extremely expensive these days, and fast GPUs even more so. Once you break down the usage, the cost of hardware plus electricity doesn't make much sense. (I'm paying less than $1 an hour for access to an RTX 5090.) You're not locked into your investment when you rent. When the next generation comes, you'll get a faster card for less than you would have paid upfront. Chances are you're not running your card 100% of the time, so for most folks, buying a card is absolutely the wrong answer in this market.

I've got some cost estimates on the workflow page, if you'd like to read a bit more.

Why Wan 2.2?

With the emergence of LTX-2, it's worth noting where Wan still has its advantages. (LTX-2.3 as of this writing.) There are two clear advantages with LTX-2: video length and sound. However, Wan has significantly better prompt adherence, exponentially better workflow and LoRA support, and higher quality output.

Beyond just being poor at prompt adherence, LTX-2.3 introduces more opportunity for issues with the addition of audio. In general, I have found I need to do significantly more generations with LTX-2.3 to arrive at a gen I like. Additionally, Wan seems to "know more" by default. It will happily guess at what you mean with a vague prompt and will even work with incorrect prompting at times.

LTX-2.3 requires extremely verbose prompts, and it still doesn't always listen. You must always be very specific and fully describe your intent. LTX-2.3 is also prone to quality hits, visual artifacts, and distortion in some situations.

In this way, Wan 2.2 remains a much more reliable tool at the moment. They are both worth spending some time with. Wan has moved to closed weights for the foreseeable future, so LTX-Video is likely to close the gaps and fully exceed Wan in a future release. We will see!

If you are already familiar with YAW, there is an LTX 2.3 template you can use with a beta of the workflow. (Full guide coming soon.)

How RunPod + Yet Another Workflow?

  1. Once you have an account and some credit loaded on RunPod, go to the "Pods" section. I'll suggest $10 as a good test amount; it will get you ~11 hours of use. I try to keep my balance at about $5-10; if you hit $0, the pod will be killed, which deletes everything, so give yourself a little buffer.

  2. Select either the RTX 5090 or the H100 SXM (or whichever CUDA 12.8 GPU you want). I've found the RTX 5090 to be the best value in terms of speed/cost, but the H100 SXM will give you a substantial speed boost if you're willing to pay a bit more per video. Adjust for your budget and patience. Please be aware that demand for these cards seems to be rising! You may need a bit of patience to find availability at a nearby data center. That said, I've noted a recent improvement here.

    1. There are options at the top: Additional Features > CUDA Version > 12.8

    2. You can adjust the region: 'Any Region' by default. Closer to you will generally be more responsive. I've had some poor download times with some data centers, so watch the pod's System log during initialization. (See notes below for more troubleshooting tips.)

  3. Select the "Yet Another Workflow - ComfyUI - CUDA12.8 - Wan2.2" template.

  4. You can Edit the template for additional options.

    1. You may want to adjust the Container Disk; set it based on how many LoRAs you end up wanting to download. The videos you make are small, so you really just need enough room for the models and LoRAs. You can get away with less if you don't mind downloading models as you need them, or go bigger if you want the flexibility. Container Disk is very cheap! I do not recommend Volume Disk. (See addendum below.)

    2. If you want to download a few LoRAs, expand the Environment Variables and set civitai_token to your CivitAI API key. Then change LORAS_IDS_TO_DOWNLOAD to a comma-separated list of CivitAI LoRA AIR codes. (Search the model page for "AIR"; it's the second number. Ex: 123456, 123457, 123458)

    3. There are Environment Variables for Smooth Mix model downloads for the Smooth Mix workflow. Please add 80 GB to your Container Disk size if you enable them! You must set the CivitAI API key if you want to download Smooth Mix! In addition, you can choose to disable the base model downloads (download_2_2_14B_fp16_models_115gb, download_2_2_14B_Kijai_fp8_models_60gb). If you do this, the other workflows will not work!

    4. Advanced Users: You can now also specify a URL for a command-line script to perform additional terminal commands before ComfyUI starts. Some examples: downloading text encoders from Hugging Face, installing additional custom nodes, exploring a different checkpoint model, or managing your CivitAI LoRA downloads manually. There are lots of possible reasons you might do this. I've attached a script to this article that you can use as a starting point (download_loras_blank.sh). You'll have to sort out how to provide a valid share point for the file, but the startup script will attempt to download and run whatever you provide here; any publicly accessible web URL should work. Please be careful with where and how you share files containing API keys. Need help? Shoot me a message here or on Discord.
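To make these options concrete, here is a minimal sketch of what a custom pre-start script could look like, in the spirit of download_loras_blank.sh. The civitai_url helper, the file names, and the placeholder repo paths are my own illustrations, not the template's actual code; the CivitAI download endpoint is real, and the /ComfyUI paths mirror the ones used elsewhere in this guide:

```shell
#!/usr/bin/env bash
# Sketch of a custom pre-start script. Assumes civitai_token and
# LORAS_IDS_TO_DOWNLOAD are set as environment variables, per the steps above.
set -euo pipefail

civitai_token="${civitai_token:-YOUR_TOKEN}"
LORAS_IDS_TO_DOWNLOAD="${LORAS_IDS_TO_DOWNLOAD:-123456, 123457, 123458}"

# Build a CivitAI download URL for a model version ID.
civitai_url() {
  printf 'https://civitai.com/api/download/models/%s?token=%s' "$1" "$civitai_token"
}

# Split the comma-separated list and handle each ID.
IFS=',' read -ra ids <<< "$LORAS_IDS_TO_DOWNLOAD"
for id in "${ids[@]}"; do
  id="${id//[[:space:]]/}"   # strip stray spaces around the commas
  echo "would fetch LoRA version $id"
  # wget -q -O "/ComfyUI/models/loras/${id}.safetensors" "$(civitai_url "$id")"
done

# Other things a pre-start script might do (placeholders, not real repos):
# wget -q -P /ComfyUI/models/text_encoders/ "https://huggingface.co/<repo>/resolve/main/<file>"
# git clone "https://github.com/<author>/<node>.git" /ComfyUI/custom_nodes/<node>
```

Host a file like this anywhere publicly reachable and point the URL variable at it, but keep the actual downloads commented out until you've tested, and keep your token out of anything public.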

  5. Double check your settings and press "Deploy on Demand".

  6. The pod will initialize. Check the Logs tab on the pod instance to see what it's doing. Eventually, the log will report that ComfyUI is up, and the service will show as ready. Startup takes an average of 15 minutes to do all of the installation without extras. This can go much faster or slower, so watch the System and Container logs in the Logs tab of the pod.

  7. Open one of the YAW workflows in the Workflows menu on the left.

  8. Press "Run" on the workflow, and you should be good to go!

Additional steps you should take:

Install runpodctl on your local machine. This is the best way to get your generations off the RunPod remote machines. Once installed, you can open the Web Terminal and type:

runpodctl send /ComfyUI/output/

This will generate a .zip file of all of the videos you've made and create a link for you to run locally. Be patient; it can be slow to create the connection between the computers. Once you have it, paste it into your local command line, and it will send you the files very fast.

Addendum: Which workflow should I use?

A good and reasonable question! In general, use the standard v0.38 workflow. It's my main workflow. Use the others if you have a specific reason to use them. They each exist to let you try different things with the same interface. Learn one, and the rest are there for you.

If you are brand new to ComfyUI, you may find the MoE version a bit easier to use. It removes some options and uses a combined sampler to make things slightly more visually simple. I recommend moving to standard once you've got a bit of a feel for things.

If you're excited about Smooth Mix, use that version! You must specify the optional downloads as described above for this to work.

If you're an advanced user, consider the WanVideo version for a change. This version produces different results for many reasons and uses completely different nodes from a normal workflow. It's also very fussy and can be difficult to adjust. While it is powerful, there are way more options to consider if you are making changes, so please avoid this until you have a good amount of video generation experience under your belt.

Addendum: Why not Volume Disk?

Someone asked about the startup time: Why do 15 minutes of initialization every time when you can store things on their network storage volumes?

The spin-up time is mostly downloading models. You certainly CAN do this, but I don't recommend it unless you want to pay for Network Volume storage (and are fine with the first point below), which I do not; I'll explain. I'm very cheap about this, so adjust to your budget and time concerns. I want my money to go towards compute time, not storage.

  1. First, an important point: Volume Disk is not portable! You have to create it at a specific datacenter. If your preferred datacenter does not have the GPU you want, you will have to wait till one becomes free. Container disk is always available.

  2. 300 GB of storage is $21 a month, so ~$0.70 a day. I'd probably end up closer to 400 GB if I were keeping everything around, but we'll use 300 GB as an example, since that's what I generally use.

  3. The same 300 GB of container disk storage is about $0.04 an hour.

  4. My total compute costs for an L40S run about $0.90 per hour, so a 15-minute boot is a ~$0.22 startup cost.

  5. That $21 buys just over 22 hours of compute time including the container disk, so it's a matter of preferring to maximize my resources that way for the minor inconvenience of waiting for the boot-up. That wait is never 0 anyway; the image still has to initialize, so we're talking a 5-10 minute difference. Weirdly, as far as I can tell, the Volume Disk cannot act as the main disk for the container; a template must still be initialized.

  6. I don't make videos every day. If I take a break, I'm not spending.
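For what it's worth, the math behind points 2-5 works out like this (prices as quoted above, in integer cents; they will drift with RunPod's pricing):

```shell
# Hours of compute (plus container disk) that the monthly volume fee buys instead.
volume_cents_month=2100    # 300 GB network volume: ~$21/month
compute_cents_hr=90        # L40S compute: ~$0.90/hr
container_cents_hr=4       # 300 GB container disk: ~$0.04/hr

hours=$(( volume_cents_month / (compute_cents_hr + container_cents_hr) ))
echo "break-even: about ${hours} hours of compute per month"   # about 22 hours
```

In other words, skipping the volume only costs you money once you're past roughly a day of rendering per month, and even then the difference is the boot wait, not the total spend.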

So! All of that is to say, that's why I use container disk. I can often fill the boot-up time with other things I want to do: posting videos here on Civit, organizing files, making images for i2v.

As a point of clarification: I do actually use a volume for SD image making, because the space requirement is so much smaller and therefore cheaper. I mostly do this because LoRA management for SD models is a nightmare, whereas Wan is very clean.

Help! Startup is slow!

Popularity of these services is growing. Here are some observations and tips.

  • First tip: If the GPU is in good availability, it can be worth just killing a pod that's being slow to start and trying another data center. This is a die roll, though. Sometimes it helps, sometimes it doesn't.

  • In general, my startup times are good, especially with the H100 SXM. (I tend to have the most pain here with 5090s. I suspect they have lower network priority than the "big" machines running the server-grade cards.)

  • Demand has a big effect on things; this is the big factor. Part of this is popularity: both of the preferred cards are either in low supply or unavailable at times, as data center demand is growing. Depending on the data center you land in, its particular network saturation, and the total demand hitting Hugging Face or Docker Hub, you can get some gnarly slowdown. This also means there's a time-of-day consideration.

    • To elaborate a bit more: The big downloads are the RunPod template itself (Docker Hub) and then the models (Hugging Face, or CivitAI for Smooth Mix).

  • Second tip: If you don't expect to use a specific model, you can adjust that in the template environment variables. (For example, I usually only use the fp16 models, so I often skip downloading the fp8 models.)

  • RunPod doesn't cache images across their network. There's no way to force this or pay for it on my end either, though I would if I could. Images are pulled from the Docker Hub servers. If you magically hit a machine that already has the template, it will be instant. (This has been rare in my experience, but it has happened.)

  • If you're experiencing slowdown in the UI, that's probably an issue with your local computer's hardware. I've read that you can turn off hardware acceleration for your browser to improve performance, but that's outside of my personal experience.

  • If you see the following cycle in the system log, terminate the pod and create a new one, and try to specify a different data center:

image pull: docker.io/boobkake22/comfyui-wan-yaw:latest: pending error
 creating container: image pull: docker.io/boobkake22/comfyui-wan-yaw:latest:
 pending create container docker.io/boobkake22/comfyui-wan-yaw:latest

If demand continues to grow, expect providers to add capacity, so availability and performance should improve over time.
