Ace Step 1.5: Turbo and SFT models with Ollama, text to audio/song (examples below)
Ace Step uses TAGS and LYRICS to create a song. These can be generated by Ollama or written as your own prompts.
You can use any song or artist as a reference, or any other description, to generate tags and lyrics.
Outputs up to two songs: one generated by the Turbo model, the other by the SFT model (experimental).
Key scales, BPM and song duration can be randomized.
Supports dynamic prompts.
Creates a suitable song title and filenames with Ollama.
LoRA loader included, hope to see some LoRAs soon!
Important: Do not use sage attention in your ComfyUI startup parameters, and avoid the --lowvram setting, as this might force the text encoder to run very slowly on the CPU instead of the GPU.
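To illustrate the randomization of key scale, BPM and duration mentioned above, here is a minimal Python sketch. It is not the workflow's own node logic: the value pools, the ranges and the function name random_song_settings are assumptions made for illustration only.

```python
import random

# Hypothetical value pools; the workflow's actual choices are not stated here.
KEYS = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MODES = ["major", "minor"]


def random_song_settings(seed, bpm_range=(60, 180), duration_range=(30, 240)):
    """Pick a key scale, BPM and duration (seconds) reproducibly from a seed."""
    rng = random.Random(seed)  # local generator, so the same seed gives the same song settings
    return {
        "keyscale": f"{rng.choice(KEYS)} {rng.choice(MODES)}",
        "bpm": rng.randint(*bpm_range),            # inclusive on both ends
        "duration": rng.randint(*duration_range),  # inclusive on both ends
    }


if __name__ == "__main__":
    print(random_song_settings(seed=1234))
```

Using a seeded generator keeps a randomized song reproducible: reuse the seed to get the same key scale, BPM and duration again.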
Download Files:
Ace Step 1.5 TURBO model: https://huggingface.co/Comfy-Org/ace_step_1.5_ComfyUI_files/tree/main/split_files/diffusion_models
Ace Step 1.5 SFT model: https://huggingface.co/ACE-Step/acestep-v15-sft/tree/main (download model.safetensors and rename it)
Text encoders: https://huggingface.co/Comfy-Org/ace_step_1.5_ComfyUI_files/tree/main/split_files/text_encoders (Qwen_0.6b and Qwen_4b required, 1.7b is a smaller alternative to 4b)
VAE: https://huggingface.co/Comfy-Org/ace_step_1.5_ComfyUI_files/tree/main/split_files/vae
Ollama models (required for tags, lyrics and song title). You can use one, two or three different models: tags and lyrics may need a bigger model (>7b), while the song title can use a smaller one:
https://ollama.com/huihui_ai/qwen3-vl-abliterated (for tags and lyrics, able to use thinking)
https://ollama.com/artifish/llama3.2-uncensored (small and fast for songtitle and tags)
https://ollama.com/mirage335/Llama-3-NeuralDaredevil-8B-abliterated-virtuoso (all-round model, fast, usable for tags, lyrics and song title)
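For readers who want to try tag, lyric or title generation outside ComfyUI, the sketch below sends a prompt to a locally running Ollama server through its /api/generate endpoint. The prompt wording, the helper names build_request and ask_ollama, and the assumption that the model is addressed by the name in its Ollama URL are illustrative; the workflow's own prompts are not reproduced here.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address

# Illustrative prompt templates only, not the workflow's actual prompts.
PROMPTS = {
    "tags": "Write a comma-separated list of music style tags for: {reference}",
    "lyrics": "Write song lyrics with verse and chorus sections for: {reference}",
    "title": "Suggest one short song title for: {reference}",
}


def build_request(model, reference, task):
    """Build a non-streaming generate request for one task (tags, lyrics or title)."""
    payload = {
        "model": model,
        "prompt": PROMPTS[task].format(reference=reference),
        "stream": False,  # ask for one JSON object instead of a token stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def ask_ollama(model, reference, task):
    """Send the request and return the generated text (needs a running Ollama server)."""
    with urllib.request.urlopen(build_request(model, reference, task)) as resp:
        return json.loads(resp.read())["response"].strip()


if __name__ == "__main__":
    # Assumed model name, taken from the Ollama URL listed above.
    model = "mirage335/Llama-3-NeuralDaredevil-8B-abliterated-virtuoso"
    print(ask_ollama(model, "melancholic 80s synth pop", "tags"))
```

Because each task takes its own model argument, a larger model can handle tags and lyrics while a smaller one handles the title.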
Update 9th of Feb 26: Alternative Turbo and SFT models:
Turbo continuous: https://huggingface.co/ACE-Step/acestep-v15-turbo-continuous/tree/main
SFT-Shift1: https://huggingface.co/ACE-Step/acestep-v15-turbo-shift1/tree/main
SFT-Shift3: https://huggingface.co/ACE-Step/acestep-v15-turbo-shift3/tree/main
Merges of SFT, Turbo and Base model: https://huggingface.co/Aryanne/acestep-v15-test-merges/tree/main
Which models to start with? => Turbo, SFT-Shift1 and Llama3-NeuralDaredevil for Ollama.
Save Location:
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │   └── acestep_v1.5_turbo.safetensors
│   ├── 📂 text_encoders/
│   │   ├── qwen_0.6b_ace15.safetensors
│   │   └── qwen_4b_ace15.safetensors (or 1.7b)
│   └── 📂 vae/
│       └── ace_1.5_vae.safetensors
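A quick way to confirm the files landed in the right folders is a small check script. This is a sketch, not part of the workflow: the function name missing_model_files is hypothetical, and the file list simply mirrors the tree above (swap in the 1.7b encoder or the SFT model name if you use those).

```python
from pathlib import Path

# Paths relative to the ComfyUI folder, copied from the save-location tree.
EXPECTED_FILES = [
    "models/diffusion_models/acestep_v1.5_turbo.safetensors",
    "models/text_encoders/qwen_0.6b_ace15.safetensors",
    "models/text_encoders/qwen_4b_ace15.safetensors",
    "models/vae/ace_1.5_vae.safetensors",
]


def missing_model_files(comfy_root, expected=EXPECTED_FILES):
    """Return the expected files that are not present under comfy_root."""
    root = Path(comfy_root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_model_files("ComfyUI")  # adjust to your install path
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected model files found.")
```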
Custom nodes used:
Optional: RES4LYF, which provides the Beta57 scheduler for a bit more punch: https://github.com/ClownsharkBatwing/RES4LYF
Examples in various styles:


