JoyAI-Echo — GGUF (for low-VRAM ComfyUI)
Quantized GGUF weights and extracted components for running JD's JoyAI-Echo — a text → video + audio model — on consumer GPUs.
The original release is a single ~46 GB bf16 checkpoint built for ~48 GB-class hardware. These files let it run on 8 GB VRAM + 16 GB RAM through the companion ComfyUI nodes.
▶ Nodes / how to run: https://github.com/RealRebelAI/ComfyUI_JoyAI_Echo_GGUF_Nodes
▶ Model Files: https://huggingface.co/realrebelai/JoyAI-Echo_GGUF/tree/main
▶ Colorful Int Nodes (IN WORKFLOW): https://github.com/RealRebelAI/Rebels_Better_Int_Node
▶ Animated Nodes (IN WORKFLOW): https://github.com/RealRebelAI/Rebels_Animated_Nodes
⚠️ Experimental. These run, but on small hardware loads and the first encode are slow. See the node repo for expectations and troubleshooting.
What's in this repo
Files
JoyAI-Echo-DiT-Q2_K.ggufDiffusion transformer, Q2_KSmallest / lowest RAM. Start here on 8 GB.JoyAI-Echo-DiT-Q4_K_M.ggufDiffusion transformer, Q4_K_MHigher quality, more RAM.joyai_echo_video_vae.safetensorsVideo VAERequired.joyai_echo_audio_vae.safetensorsAudio VAERequired.joyai_echo_vocoder.safetensorsVocoderRequired for audio.joyai_echo_embeddings_processor.safetensorsText-embedding connectorRequired.joyai_echo_config.jsonArchitecture configAlso ships inside the node pack.
(Sizes are approximate; check the file listing. Q2_K is the lighter option, Q4_K_M the heavier/higher-quality one.)
Why GGUF (and why these specific files)
Stock ComfyUI GGUF loaders build the wrong architecture for JoyAI-Echo — it's a modified LTX-2.3 (different scale_shift_table and connector dimensions). The companion nodes rebuild it correctly with JoyAI's own configurator and keep the DiT weights packed in GGUF form, dequantizing on the fly so the model fits in limited RAM/VRAM.
The VAEs, vocoder, and connector are extracted from the original checkpoint as standalone files so you don't need the 46 GB bundle at runtime.
Usage
Install the ComfyUI nodes.
Download the files above and place them in the ComfyUI folders listed in the node README (
models/unet,models/vae,models/audio_encoders,models/text_encoders).Add the RebelsJE_StagedPipeline node →
CreateVideo→SaveVideo, pick the files from the dropdowns, prompt, and queue.
Pick Q2_K first if you're on 8 GB; move up to Q4_K_M if you have RAM headroom and want better quality.
Hardware
Minimum target: 8 GB VRAM + 16 GB RAM (RTX 3070 class).
SSD strongly recommended — the files load much faster than from a spinning disk.
Acknowledgements
JoyAI-Echo — Echo Team @ Joy Future Academy, JD
LTX-2.3 — Lightricks
Gemma-3 — Google
GGUF tooling: city96/ComfyUI-GGUF
Rebel AI - Quantization of model and accompanying model files
Rebel AI - GGUF format Custom Node Set
License
These quantized weights are derived from JoyAI-Echo and are released for academic research and non-commercial use only, following the upstream JoyAI-Echo license. Gemma components (downloaded separately) are governed by Google's Gemma license. By downloading you agree to those terms.

