Chroma is a fantastic and highly versatile model capable of producing photo-like results, but it can require careful prompting. This finetune aims to improve reliability in realistic/photo-based styles while preserving Chroma’s broad concept knowledge (subjects, objects, scenes, etc.). Chroma can probably do anything this model can, but UnCanny aims to be more lenient.
Personally I'd recommend downloading the non-flash model; then you can experiment with steps, CFG, and flash-LoRA ranks to suit your needs. The flash version has a rank-128 LoRA baked in. Some example images were made using a flash or low-step LoRA - see settings below. GGUFs are on HuggingFace.
Example Generation Notes
Prompting: For photos, simply describing what you want to see in natural sentences works well. Tags tend to push the model toward art/anime, while natural language pushes it toward photos. For photos, use terms like "photo" or "photography", but avoid "photorealistic"; that term seems better suited to photorealistic art.
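As a rough illustration of the tag-vs-natural-language contrast described above (the prompt wording here is my own, purely illustrative):

```python
# Hypothetical prompt pair illustrating the guidance above: tag-style
# prompts lean art/anime, full sentences lean photographic.

tag_style = "1girl, forest, sunset, masterpiece, best quality"  # leans art/anime
natural_style = (
    "A photo of a woman standing at the edge of a pine forest at sunset, "
    "soft golden light on her face, shot on a 50mm lens."
)  # leans photo

for label, prompt in [("tags", tag_style), ("natural", natural_style)]:
    print(f"{label}: {prompt}")
```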
Example settings (not necessarily optimal):
Workflow: Chroma template workflow in ComfyUI
Steps (flash lora): 15 works well with rank-128. Depends on flash-lora rank.
Steps (base): ~30-35
CFG (flash lora): 1 works well with rank-128. Depends on flash-lora rank.
CFG (base): ~3.5
Sampler: res_2m
Scheduler: bong_tangent
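If you script generations (for example, by submitting workflows to ComfyUI's HTTP API), the example settings above could be collected as plain data. This is just a sketch; the dictionary keys are my own naming, not ComfyUI node fields:

```python
# Sketch: the example settings above as Python data for batch scripting.
# Key names are illustrative, not actual ComfyUI node inputs.

SETTINGS = {
    "flash": {"steps": 15, "cfg": 1.0},  # with the rank-128 flash LoRA
    "base":  {"steps": 32, "cfg": 3.5},  # ~30-35 steps without it
    "sampler": "res_2m",
    "scheduler": "bong_tangent",
}

def pick(variant: str) -> dict:
    """Return sampler settings for 'flash' or 'base'."""
    return {**SETTINGS[variant],
            "sampler": SETTINGS["sampler"],
            "scheduler": SETTINGS["scheduler"]}

print(pick("base"))
```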
Support
Have too much money? Want to support further training?
https://ko-fi.com/dawncreates
Training Details
The model was trained locally, using Chroma-HD as the base. Each epoch included images at 3-5 different resolutions, though only a subset of the dataset was used per epoch. Apart from the extra resolutions, OneTrainer's default config for 24 GB Chroma finetuning was used. The dataset consists almost exclusively of SFW images of people and landscapes, so to retain Chroma-HD's original conceptual understanding, several layers were merged back at various ratios. All the juice - compositions, subjects, and concepts - comes from Chroma itself; my model just nudges it toward realism. Honestly, this version is more of a showcase of how good Chroma is than a great finetune in itself, but I do think it shows how much potential Chroma has for finetuning - so get to work, Chroma finetuners!
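To illustrate the idea of merging finetuned layers back toward the base at per-layer ratios, here is a minimal sketch with toy weights. Real Chroma checkpoints would be tensors loaded via safetensors; the layer names and ratios below are made up:

```python
# Sketch: blend selected layers of a finetune back toward the base model.
# A ratio r keeps r * tuned + (1 - r) * base; r = 1.0 keeps the finetune.
# Layer names and values are toy placeholders, not real Chroma layers.

def merge_layers(base, tuned, ratios, default=1.0):
    merged = {}
    for name, t in tuned.items():
        b = base[name]
        # Pick the first matching prefix ratio, else keep the finetune as-is.
        r = next((v for k, v in ratios.items() if name.startswith(k)), default)
        merged[name] = [r * tw + (1 - r) * bw for tw, bw in zip(t, b)]
    return merged

base = {"blocks.0.attn": [1.0, 1.0], "blocks.1.mlp": [0.0, 0.0]}
tuned = {"blocks.0.attn": [2.0, 2.0], "blocks.1.mlp": [4.0, 4.0]}
out = merge_layers(base, tuned, {"blocks.1": 0.5})
print(out)  # blocks.0.attn kept fully tuned; blocks.1.mlp blended 50/50
```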
I aim to continue finetuning and experimenting, but the current version has some juice.
All images were captioned using JoyCaption: https://github.com/fpgaminer/joycaption
The model was trained using OneTrainer: https://github.com/Nerogar/OneTrainer
NOTE: The original v1 had some bugged layer names; this is now fixed (as of the evening of October 31st). Having the wrong version shouldn't affect generation in ComfyUI, but it might affect things like training and quantization.
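If you want to check which version you have, one approach is to diff the checkpoint's layer names against a known-good copy. A toy sketch (real key sets would come from safetensors' keys(); the key names below are placeholders, not actual Chroma layer names):

```python
# Sketch: diff two checkpoints' layer-name sets to spot renamed/bugged keys.

def key_diff(reference: set, candidate: set) -> dict:
    return {
        "missing": sorted(reference - candidate),    # expected but absent
        "unexpected": sorted(candidate - reference), # present but unknown
    }

ref = {"blocks.0.attn.weight", "blocks.0.mlp.weight"}
bad = {"blocks.0.attn.weight", "block_0.mlp.weight"}  # bugged name
print(key_diff(ref, bad))
```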
