v1.2b Adds Canny edge detection and ControlNet for SDXL and QWEN image generations.
v1.2a INTRODUCING QWEN MORPH !!
v1.2 (pre) Brings AI generated background music to your video creations (DEMO VIDEOS INCLUDE AI GENERATED MUSIC, JUST UNMUTE THE VIDEOS). Also, as nice as .png quality was for video frames, over the long run it created HUGE files, so I've switched the pipeline to .webp (with very little compression). To bypass ComfyUI file size limitations I've added a note so you can set your launch args to handle very large files. To calculate the generated audio length so it matches the created video length, I had to write a custom Math-Divide node into the JPS nodes; I'm now providing the modified JPS node python main, so you can replace yours with it to make my custom math node available. I also noticed that in the last version I forgot to lower the steps for the SDXL to QWEN refiner (because of the distilled model), so I've fixed that too.
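For reference, here is a minimal sketch of what a divide node looks like in ComfyUI's custom node format, so you can see what the Math-Divide addition does (frame count divided by fps gives the audio duration to generate). The class name, category, and parameter names are illustrative only, not the actual JPS implementation; use the modified python main I provide.

```python
# Illustrative sketch only -- NOT the actual JPS implementation I ship.
# A minimal ComfyUI node that divides two numbers, e.g. frame_count / fps,
# which gives the audio duration (in seconds) the music generator should match.
class MathDivide:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "numerator": ("FLOAT", {"default": 0.0, "min": 0.0, "step": 0.01}),
                "denominator": ("FLOAT", {"default": 1.0, "min": 0.0001, "step": 0.01}),
            }
        }

    RETURN_TYPES = ("FLOAT",)
    FUNCTION = "divide"
    CATEGORY = "JPS Nodes/Math"  # hypothetical category name

    def divide(self, numerator, denominator):
        # e.g. 81 frames / 16 fps = 5.0625 seconds of audio
        return (numerator / denominator,)


NODE_CLASS_MAPPINGS = {"MathDivide": MathDivide}
NODE_DISPLAY_NAME_MAPPINGS = {"MathDivide": "Math Divide"}
```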
v1.1 adds an advanced dual LoRA setup, TeaCache, and Enhance-A-Video, with explanations.
v1.1a LightX2V proof of concept: crazy fast WAN 2.2 renders using only 5 steps and CFG 1.
v1.1b is an iteration of v1.1a where I've added an A/B video comparison function and took the opportunity to use the first WAN 2.2 LoRA available for demonstration.
v1.1c T2V2V Proof-Of-Concept: This version creates a monster !! WARNING !! Don't even attempt this unless you can spare 144GB of virtual memory (system RAM + swap file). It will juggle 4 WAN 2.2 models (T2V + I2V) in one connected pipeline: it takes the last frame of the T2V render, continues the I2V generation from it, and joins the two videos seamlessly. If you are using fp8 scaled models like I do, this takes a massive 130-ish GB of virtual memory to swap everything around. I have 64GB RAM and 32GB VRAM, and I had to raise my page file in Windows to 80GB for this to complete. Once you have all that, though, it actually takes surprisingly little time to finish. So, MAKE SURE your system RAM + page file adds up to at least 144GB (if you are using quantized GGUF versions this will obviously be much smaller). In the upcoming v1.2 major upgrade this workflow will evolve into a complete production suite; think of this as a teaser for that ;)
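If you're curious what the T2V-to-I2V hand-off amounts to under the hood, here's a tiny sketch. The assumptions are mine: frames are [N, H, W, C] image batches the way ComfyUI passes them between nodes, and the I2V clip starts on a copy of the bridge frame, which is why its first frame gets dropped at the join.

```python
import torch

def last_frame(frames: torch.Tensor) -> torch.Tensor:
    """Return the final frame of a clip as a 1-frame batch (the I2V start image)."""
    return frames[-1:]

def join_clips(t2v_frames: torch.Tensor, i2v_frames: torch.Tensor) -> torch.Tensor:
    """Concatenate the T2V clip with the I2V clip that was generated from its last
    frame. Assumes the I2V clip starts on (a copy of) that bridge frame, so its
    first frame is dropped to avoid a visible stutter at the seam."""
    return torch.cat([t2v_frames, i2v_frames[1:]], dim=0)
```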
v1.1d SDXL to Video Pipeline 🔥🔥🔥 Well, basically your dreams come true 😉 I'll take your buzz for this one though 😈 haha, just kidding. Whatever I do, I do for love, not for fame 😘 Enjoy !
v1.1e DON'T DOWNLOAD THIS, GET THE (FIXED) VERSION INSTEAD
SDXL2V InfiniteChain to create a virtually infinite-length video with as many different poses/actions as you like. It works by generating an image you like in SDXL and, from then on, creating a feedback loop as you feed that image into WAN 2.2 I2V. It will automatically extract the last frame of your rendered video so you can start your next generation from it. In case that video's last frame is not clean, there is also a Crop-A-Video suite, so you can crop your video to exactly the frame you want it to end on, again extracting a last frame for you. It also features a video combiner suite so you can keep chaining the videos you create back-to-back, in effect allowing for infinite video length. From this version on, all creations will use the AWS (Advanced Workflow Suite) folder in your outputs and designated subfolders within it. I strongly recommend you keep this intact, as with the upcoming v1.2, AWS will become a full-fledged all-in-one suite with cross-referencing generators, etc.
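Conceptually, the InfiniteChain loop boils down to something like the sketch below. The two generator functions are just dummy stand-ins for the SDXL and WAN 2.2 I2V stages (so the loop actually runs as-is); the real work happens in the workflow's node graph, not in code like this.

```python
import torch

def generate_sdxl_image(prompt: str) -> torch.Tensor:
    """Stand-in for the SDXL stage: returns a dummy [1, H, W, C] start image."""
    return torch.rand(1, 512, 512, 3)

def wan22_i2v(start_image: torch.Tensor, prompt: str, num_frames: int = 81) -> torch.Tensor:
    """Stand-in for the WAN 2.2 I2V stage: returns a dummy clip seeded by start_image."""
    return start_image.repeat(num_frames, 1, 1, 1)

def infinite_chain(sdxl_prompt: str, segment_prompts: list[str]) -> torch.Tensor:
    start_image = generate_sdxl_image(sdxl_prompt)
    segments = []
    for prompt in segment_prompts:            # one pose/action per segment
        clip = wan22_i2v(start_image, prompt)
        segments.append(clip)
        start_image = clip[-1:]               # feed the last frame back into I2V
    return torch.cat(segments, dim=0)         # the video combiner step, back-to-back

video = infinite_chain("portrait of a dancer", ["waves hello", "turns around", "bows"])
```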
v1.1e(fixed) I noticed a bizarre bug in VHS video decode where, over long video combines (like this version does), it keeps adding a weird hue at the beginning of the video after a while (hence the purple face on the v1.1e demo video). I have no idea why, and this really should not happen. I fiddled with it a lot but it doesn't go away, so I've decided to use the in-house ComfyUI video creation nodes instead, and now there is no problem. Please (I'm sorry) re-download this version instead.
v1.1f SDXL distilled last frame proof of concept. Brings the SDXL distilled last frame refinement I was mentioning in the previous version. And it works even better than I expected, to be honest :) So now, whenever a last frame is generated (by an I2V render or by cropping a video), that last frame is refined using distilled SDXL, enhancing details. This needs to be done very carefully so as not to cause glitches in the video flow or any sudden noticeable changes. We also sample one frame before the last frame and interpolate, so the switch to the enhanced frame is smoother. This way, at the end of every generation cycle, we are effectively injecting some SDXL magic back into the WAN 2.2 video chain, helping against long-video context degradation. Both the refiner steps and the interpolate/sharpen steps still need fine tuning; I'll perfect these in upcoming versions, but for now I wanted to release this as a proof of concept because even as it is, it's already a massive quality upgrade. The whole process is automated, by the way: you don't need to do anything, just create your videos as usual and the whole thing will be handled seamlessly.
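To give you an idea of the interpolation part, here is a rough sketch of a linear cross-fade toward the refined last frame. The actual workflow does this with interpolation/sharpen nodes inside the graph; the blend weights below are placeholders I made up for illustration, not the tuned values.

```python
import torch

def splice_refined_last_frame(frames: torch.Tensor,
                              refined_last: torch.Tensor,
                              blend: float = 0.5) -> torch.Tensor:
    """frames: [N, H, W, C] clip; refined_last: [1, H, W, C] SDXL-refined last frame.
    Eases the clip into the enhanced frame instead of swapping it in abruptly."""
    out = frames.clone()
    # Nudge the second-to-last frame partway toward the refined look...
    out[-2:-1] = (1.0 - blend * 0.5) * frames[-2:-1] + (blend * 0.5) * refined_last
    # ...then replace the last frame with a stronger mix of the refined frame.
    out[-1:] = (1.0 - blend) * frames[-1:] + blend * refined_last
    return out
```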
Please read the "read me" sections in the workflow.
N-Joy !!