Running the Hyper FLUX.1-dev 16-Steps LoRA on Low VRAM

@16 Steps : Prompt executed in 44.23 seconds

According to ByteDance, Hyper-SD is one of the new state-of-the-art (SOTA) acceleration techniques for diffusion models. On Aug 26, 2024, they released 8-step and 16-step LoRAs for FLUX.1-dev, which are new checkpoints distilled from FLUX.1-dev. They recommend a LoRA scale of around 0.125 for better results, and the default FLUX guidance scale of 3.5 can be kept.
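For readers who prefer scripting over a node graph, the recommended values can be written down as a small Python sketch using the diffusers library. This is not the ComfyUI workflow attached to this post, only an illustration of the same settings. The Hugging Face repo id and the exact LoRA file name below are assumptions, FLUX.1-dev is a gated model that requires an authenticated Hugging Face account, and the CPU offload call is one possible low-VRAM option rather than something tested here.

```python
# Settings quoted in the post. The repo id and file name are assumptions.
HYPER_FLUX_SETTINGS = {
    "base_model": "black-forest-labs/FLUX.1-dev",
    "lora_repo": "ByteDance/Hyper-SD",  # assumed Hugging Face repo id
    "lora_file": "Hyper-FLUX.1-dev-16steps-lora.safetensors",  # assumed file name
    "lora_scale": 0.125,        # LoRA scale recommended by ByteDance
    "guidance_scale": 3.5,      # default FLUX guidance scale
    "num_inference_steps": 16,  # matches the 16-step LoRA
}


def generate(prompt, settings=HYPER_FLUX_SETTINGS):
    """Generate one image with the Hyper LoRA fused into FLUX.1-dev."""
    # Heavy imports stay local so the settings can be inspected without a GPU.
    import torch
    from diffusers import FluxPipeline
    from huggingface_hub import hf_hub_download

    pipe = FluxPipeline.from_pretrained(
        settings["base_model"], torch_dtype=torch.bfloat16
    )
    lora_path = hf_hub_download(settings["lora_repo"], settings["lora_file"])
    pipe.load_lora_weights(lora_path)
    pipe.fuse_lora(lora_scale=settings["lora_scale"])
    # Moves submodules to the GPU only while they run: slower, but uses less VRAM.
    pipe.enable_model_cpu_offload()

    result = pipe(
        prompt,
        num_inference_steps=settings["num_inference_steps"],
        guidance_scale=settings["guidance_scale"],
    )
    return result.images[0]
```

Calling generate("a photo of a cat").save("cat.png") would download the weights on first use, so expect a long first run.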

To test the performance of this LoRA, I am sharing this basic workflow, especially for those with lower-end hardware like mine: an NVIDIA GeForce RTX 4060 Ti with 16 GB of GPU memory, and 32 GB of RAM.

On my configuration, the results are quite interesting: about 40 s for 16 steps, instead of the community average of about 60 s for 20 steps.
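To put these two figures side by side, here is a quick back-of-the-envelope calculation in Python. It uses only the rounded numbers quoted above; the community average is my own rough estimate, not a benchmark, and the function name is just for illustration.

```python
def seconds_per_step(total_seconds: float, steps: int) -> float:
    """Average wall-clock time spent on each sampling step."""
    return total_seconds / steps


hyper_total, hyper_steps = 40.0, 16        # this workflow, rounded
community_total, community_steps = 60.0, 20  # rough community average

hyper_rate = seconds_per_step(hyper_total, hyper_steps)
community_rate = seconds_per_step(community_total, community_steps)
time_saved = 1 - hyper_total / community_total

print(f"Hyper LoRA: {hyper_rate:.2f} s/step")      # 2.50 s/step
print(f"Community:  {community_rate:.2f} s/step")  # 3.00 s/step
print(f"Total time saved per image: {time_saved:.0%}")  # 33%
```

Taken at face value, that is about a third less time per image.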

Hyper FLUX.1 Dev-16Steps-lora workflow v.0.1

The simple ComfyUI workflow looks like this:

So I'm proud to share this workflow with you (see attached file). It is designed around the default FLUX.1-dev model released by the Black Forest Labs team.

You will also need the following file to use the workflow: Hyper-FLUX.1-dev-(8 or 16) steps-lora.safetensors.

Important: If your image displays only noise, make sure the LoRA strength is set to 0.125.
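If you export a workflow in ComfyUI's API (JSON) format, a few lines of Python can check the strength for you. This is a minimal sketch: it assumes the LoRA is loaded by one of the stock LoraLoader or LoraLoaderModelOnly nodes with a strength_model input, which may not match the node used in every workflow. The example dictionary is made up for the demonstration and is not the attached file.

```python
LORA_NODE_TYPES = {"LoraLoader", "LoraLoaderModelOnly"}  # assumed node types
EXPECTED_STRENGTH = 0.125


def find_bad_lora_strengths(workflow, expected=EXPECTED_STRENGTH, tol=1e-6):
    """Return (node_id, strength) for LoRA nodes whose strength is not `expected`."""
    bad = []
    for node_id, node in workflow.items():
        if node.get("class_type") not in LORA_NODE_TYPES:
            continue
        strength = node.get("inputs", {}).get("strength_model")
        # A missing value or a link to another node cannot be checked here,
        # so it is reported as well.
        if not isinstance(strength, (int, float)) or abs(strength - expected) > tol:
            bad.append((node_id, strength))
    return bad


# Hypothetical API-format workflow fragment, for illustration only.
example = {
    "10": {
        "class_type": "LoraLoaderModelOnly",
        "inputs": {
            "lora_name": "Hyper-FLUX.1-dev-16steps-lora.safetensors",
            "strength_model": 1.0,
        },
    },
    "11": {"class_type": "KSampler", "inputs": {"steps": 16}},
}

print(find_bad_lora_strengths(example))  # [('10', 1.0)]
```

With a real export, load the file with json.load and pass the resulting dictionary to the function. An empty list means every LoRA node it recognized is set to 0.125.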

I hope you find this workflow useful. Leave a message if you have any questions, requests or tips. Thanks!

Zovir