Updated: Oct 26, 2025
This page contains fp8 scaled DiT models of Neta Lumina:
Neta Lumina (NT)
NetaYume Lumina (NTYM)
And an fp8 scaled Gemma 2 2b (the text encoder) as well.
All credit belongs to the original model authors. The license is the same as the original models'.
FYI:
The fp8 scaled DiT of Lumina 2 is only 2.5GB. Yes, this means you can run it without swapping layers even on a 3GB GPU. Not that it matters much, since nobody is actually going to run it on a GTX 1050 nowadays; it's just a way to show off the efficiency.
About "scaled fp8":
"scaled fp8" is not fp8. "scaled fp8" can give you identical quality comparing to the original model.
About 50% lower VRAM usage than bf16.
ComfyUI supports it out of the box. You don't have to change anything; just load it as a normal model using the same loader node.
Unfortunately there is no full fp8 calculation support yet (as of Oct 20, 2025); all calculations are still done in bf16. I tried, but got overflows. (A sketch of how the scaled weights round-trip through bf16 is shown after this list.)
May run a little bit faster if your GPU is bottlenecked by memory bandwidth. Otherwise no difference.
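To make the distinction concrete, here is a minimal PyTorch sketch of the per-tensor "scaled fp8" idea: weights are stored as fp8 plus one scale factor, and are upcast back to bf16 before any actual computation. This is only an illustration of the concept, not ComfyUI's actual loader code; the function and variable names are mine.

```python
import torch

def to_scaled_fp8(w: torch.Tensor):
    """Store a weight tensor as fp8 plus one per-tensor scale.

    The scale maps the tensor's max magnitude onto the fp8 range,
    which is what preserves quality compared to a plain fp8 cast.
    """
    fp8_max = torch.finfo(torch.float8_e4m3fn).max  # 448.0 for e4m3fn
    scale = w.abs().max().float() / fp8_max
    w_fp8 = (w.float() / scale).to(torch.float8_e4m3fn)
    return w_fp8, scale

def from_scaled_fp8(w_fp8: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Upcast back to bf16: the actual matmuls still run in bf16."""
    return w_fp8.to(torch.bfloat16) * scale.to(torch.bfloat16)

w = torch.randn(4096, 4096, dtype=torch.bfloat16)
w_fp8, scale = to_scaled_fp8(w)              # ~1 byte/param instead of 2
w_restored = from_scaled_fp8(w_fp8, scale)
print((w - w_restored).abs().max())          # small quantization error
```

The real checkpoint format stores a scale like this alongside each fp8 weight tensor, which is roughly why the same loader node can pick it up transparently.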
fp8 scaled Gemma 2 2b:
Usually not necessary, since the text encoder only runs once and is then offloaded to the CPU. It is useful when your RAM is also short. E.g. the full bf16 models need ~10GB of RAM to load (DiT: 4.8GB, TE: 5.5GB), which is a problem if your total system RAM is <=16GB. The full fp8 scaled models only need ~5.5GB (2.5 + 3).
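If you want to sanity-check the RAM math against your own setup, here is a tiny self-contained sketch using only the sizes quoted above; the function name and the 10GB budget are my own illustrative choices:

```python
def fits_in_ram(models_gb: dict, budget_gb: float) -> bool:
    """Sum model load sizes (GB) and compare against a RAM budget."""
    total = sum(models_gb.values())
    print(f"total: {total:.1f} GB / budget: {budget_gb:.1f} GB")
    return total <= budget_gb

# Sizes quoted above; a ~10 GB budget is illustrative of a 16 GB
# system with the OS and other apps taking the rest.
fits_in_ram({"dit_bf16": 4.8, "te_bf16": 5.5}, 10.0)  # 10.3 GB -> False
fits_in_ram({"dit_fp8": 2.5, "te_fp8": 3.0}, 10.0)    # 5.5 GB -> True
```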

