Experiment with txt2img double model generation

For flux2klein 9b there are two models available:

  • The base model (non-distilled, we will call it the base)

  • The distilled model (we will call it the turbo model)

It seems that both models have strengths and weaknesses:

  • base strengths: good random generation, more flexible, accepts a negative prompt

  • turbo strengths: fast (fewer total steps) and good at details

So the idea is to use the base model for the first generation steps and then the turbo model for the last steps, where the details are refined. This is nothing new or groundbreaking: someone already did it with zImg base+turbo, so I just took that work and tested it on flux => https://www.reddit.com/r/StableDiffusion/comments/1tgkoag/two_staged_workflow_zib_to_zit/

What is specific to flux2 is the scheduler and the guidance. For the base it's good to use the cfgguider node (we have a high cfg and pos/neg prompts). For the turbo we use the fluxGuidance+basicGuider nodes (better for a distilled model, which runs with cfg=1 and no negative prompt).
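To show why cfg=1 goes together with "no negative prompt", here is a minimal sketch of the standard classifier-free guidance mix. This is not the ComfyUI node code: the function name `cfg_combine` and the toy numbers are made up for illustration, and the predictions are plain lists instead of tensors.

```python
def cfg_combine(cond, uncond, scale):
    """Standard CFG mix: start from the negative-prompt prediction and
    push toward the positive-prompt prediction by `scale`."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


# Toy per-element predictions (hypothetical values, not model output).
cond = [1.0, 2.0]    # prediction with the positive prompt
uncond = [0.0, 1.0]  # prediction with the negative prompt

high_cfg = cfg_combine(cond, uncond, 4.0)  # base-style: negative prompt matters
no_cfg = cfg_combine(cond, uncond, 1.0)    # turbo-style: equals `cond`
print(high_cfg, no_cfg)
```

With scale 1 the result is just the positive-prompt prediction, so the negative prompt has no effect. That is consistent with using a guider that takes only the positive conditioning for the turbo part.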

The last important thing is the flux2scheduler feeding the two ksamplers (base & turbo). We can't just give the same schedule to base and turbo, because the models use different schedules: base uses between 24 and 40 steps, turbo uses 4 to 9 steps.

So the approach is to set the base scheduler to 24 total steps and keep only the first 11 steps (with the splitSigma node). For the turbo part we set the scheduler to 8 total steps but use only the last 4 steps. The image gives more detail: the blue curve is the base steps' sigmas, the orange curve is the turbo steps' sigmas. The black curve is what we get when we use a single scheduler with the 11+4 effective steps, so we can check whether our combined curve is close to the ideal one. I also checked that the end of the blue curve and the start of the turbo/orange curve are close together.
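The split described above can be sketched in plain Python. Big assumption: `shifted_sigmas` is a stand-in schedule (a simple time-shift curve with a made-up `shift` value), not the actual flux2scheduler math, so the printed numbers only illustrate the method. `split_sigmas` follows the slicing I understand the splitSigma node to do: the first part keeps the sigma at the split point and the second part starts from it.

```python
def shifted_sigmas(steps, shift=3.0):
    """Stand-in schedule (assumption, NOT the real flux2scheduler):
    returns steps + 1 sigmas going from 1.0 down to 0.0."""
    sigmas = []
    for i in range(steps + 1):
        t = 1.0 - i / steps
        sigmas.append(shift * t / (1.0 + (shift - 1.0) * t))
    return sigmas


def split_sigmas(sigmas, step):
    """Split a schedule at `step`; both halves share the boundary sigma."""
    return sigmas[:step + 1], sigmas[step:]


# Base: 24-step schedule, keep the first 11 steps (12 sigma values).
base_high, _ = split_sigmas(shifted_sigmas(24), 11)
# Turbo: 8-step schedule, keep the last 4 steps (5 sigma values).
_, turbo_low = split_sigmas(shifted_sigmas(8), 4)
# Reference: one schedule with the 11 + 4 effective steps.
reference = shifted_sigmas(15)

# How far apart are the end of the base part and the start of the turbo part?
handoff_gap = base_high[-1] - turbo_low[0]
print(f"base ends at {base_high[-1]:.3f}, turbo starts at {turbo_low[0]:.3f}")
print(f"handoff gap: {handoff_gap:.3f}")
print(f"reference sigma at step 11: {reference[11]:.3f}")
```

A small handoff gap means the turbo model starts denoising at roughly the noise level where the base model stopped. To reproduce the real curves, replace `shifted_sigmas` with the sigmas exported from the actual scheduler nodes.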

I may adjust the parameters after some more tests, but this is good for now: 11/24 base => 4/8 turbo.

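[image.png: sigma curves for the base steps (blue), the turbo steps (orange), and a single scheduler with the 11+4 effective steps (black)]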
