With WAN 2.1 text2video being all the rage, I tried it on my RX 7800 XT. It has 16 GB of VRAM and should deliver decent output.
I used ComfyUI-ZLUDA (https://github.com/patientx/ComfyUI-Zluda). Follow the installation instructions there.
I downloaded the latest ZLUDA 4.x version (https://github.com/vosen/ZLUDA).
You might need to edit comfyui.bat so that it uses this ZLUDA version.
Check the attached workflow.
I used WAN's biggest t2v model, with 14B parameters and fp16 tensors.
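To put that model size in perspective, here is a rough back-of-the-envelope sketch of how much memory the weights alone take. It assumes exactly 14e9 parameters at 2 bytes each (fp16) and ignores the text encoder, the VAE and all activations, so treat the number as a lower bound rather than a measurement. The function name is just something I made up for illustration.

```python
# Rough estimate of the weight size only (assumption: 14e9 params, 2 bytes each).
# Text encoder, VAE and activations are NOT included.

def weight_size_gib(num_params: float, bytes_per_param: int) -> float:
    """Return the raw size of the weights in GiB."""
    return num_params * bytes_per_param / 2**30

PARAMS_14B = 14e9   # nominal parameter count of the 14B model
FP16_BYTES = 2      # one fp16 value takes 2 bytes
VRAM_GIB = 16       # VRAM of the card used here

size = weight_size_gib(PARAMS_14B, FP16_BYTES)
print(f"fp16 weights: {size:.1f} GiB")
print(f"fits in {VRAM_GIB} GiB VRAM: {size <= VRAM_GIB}")
```

So the fp16 weights come to about 26 GiB, well over the card's 16 GB. Presumably a good part of the model ends up in system RAM, which would go some way toward explaining the long render times.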
Creating a 4+ second video takes a lot of patience. It took roughly 2.5 hours for me.
I tried several batches. The very first frame is always a bit brighter than the rest of the video, which comes across as a bit washed out. I'm currently repeating the run on an RTX 3060 with 12 GB of VRAM. Let's see whether that effect changes on original NVIDIA hardware.
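If you want to put a number on the brightness difference instead of eyeballing it, something like the following sketch could help when comparing the two cards. It assumes you have already decoded the video into grayscale frames (rows of 0-255 pixel values) with whatever tool you prefer; the decoding step is not shown, and the function names and the tiny test clip are made up for illustration.

```python
from statistics import mean, median

def frame_brightness(frame):
    """Mean of all grayscale pixel values in one frame (rows of 0-255 ints)."""
    return mean(p for row in frame for p in row)

def first_frame_offset(frames):
    """Brightness of frame 0 minus the median brightness of the remaining frames."""
    levels = [frame_brightness(f) for f in frames]
    return levels[0] - median(levels[1:])

# Hypothetical 2x2 grayscale clip: the first frame is brighter than the rest.
clip = [
    [[130, 130], [130, 130]],
    [[120, 120], [120, 120]],
    [[121, 119], [120, 120]],
    [[120, 120], [120, 120]],
]
print(first_frame_offset(clip))
```

A clearly positive value means the first frame is brighter than the typical later frame. Running the same check on the output from both GPUs would show whether the offset is specific to one setup.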
