Wan 2.1 Text-to-Image!
Who knew Wan 2.1 was an absolute beast at generating stunning, single-frame text-to-image outputs? Well… you do now.
Originally trained for video generation, Wan 2.1 wasn’t meant to be a full-blown T2I model - but it turns out this thing absolutely slaps when it comes to creating high-detail, expressive, and stylish compositions from simple prompts. Whether it’s anime scenes, suggestive portraits, or moody cinematic stills, Wan 2.1 brings an uncensored edge and a surprising amount of depth and lighting finesse to every gen.
This model card contains everything you need to start running Wan 2.1 as an image generator on roughly 12 to 16 GB of VRAM by using GGUF models. You will need some custom ComfyUI nodes, so make sure ComfyUI is up to date, then install them with ComfyUI Manager!