Image composition is HARD!

When Forge starts on my laptop, the cold hard reality is clearly stated:

[GPU Setting] You will use 50.01% GPU memory (3072.00 MB) to load weights, and use 49.99% GPU memory (3071.00 MB) to do matrix computation.

And when generating, even in the best-case scenario (few LoRAs and a short prompt), it's just sad...

[Memory Management] Target: KModel, Free GPU: 4484.83 MB, Model Require: 2448.29 MB, Previously Loaded: 0.00 MB, Inference Require: 3111.62 MB, Remaining: -1075.09 MB, CPU Swap Loaded (blocked method): 1398.05 MB, GPU Loaded: 1050.25 MB
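
The numbers in that log line can be checked by hand. This is a minimal sketch, assuming (inferred from the figures, not from Forge's source code) that "Remaining" is the free VRAM minus the model and inference requirements, and that the model weights are split between the GPU and CPU swap:

```python
# Values copied from the [Memory Management] log line above (in MB).
free_gpu = 4484.83
model_require = 2448.29
inference_require = 3111.62
cpu_swap_loaded = 1398.05
gpu_loaded = 1050.25

# Assumption: "Remaining" is what is left once the model and the
# inference workspace are both subtracted from the free VRAM.
remaining = free_gpu - model_require - inference_require

# Assumption: the model weights are split between GPU and CPU swap.
split_total = cpu_swap_loaded + gpu_loaded

print(f"remaining   = {remaining:.2f} MB")
print(f"split total = {split_total:.2f} MB")
print(f"share of weights kept on GPU = {gpu_loaded / model_require:.0%}")
```

The result lands within 0.01 MB of the logged -1075.09 MB, and the two loaded parts add up to the model size within rounding. In other words, less than half of the weights fit on the GPU and the rest is swapped to CPU, which is why generation is so slow here.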

With this kind of setup, building a good picture composition is a challenge. What if you want not one but two characters? And they are different? Each in a specific, DIFFERENT pose? SDXL models really struggle with this. And I can't generate a LOT of pictures and hope for the best if I use LoRAs and lots of tokens...

That's when ControlNet saves the day, but... what could the control image be?

For this experiment, I tried something a bit different: mass-generate pictures with a limited setup, hope for the best, and build from there.

The idea here was to have 2girls, one sitting, one standing. Go ahead and try it, you will get a lot of:

  • both are sitting

  • the standing one has her legs cut off

  • mixed monstrosity of legs and stuff

This happens especially when the prompt is long because it has to describe outfits, decor elements, view angle, etc.

So here I went for something simple: a quick sampler (Euler a), few steps (fewer than 20), no ADetailer, no hires fix or anything else, and a SHORT prompt (fewer than 75 tokens):

PonyScores7, (very aesthetic, masterpiece, best quality:1.3), 2girls, (a woman, standing):1.5, (a woman, sitting, armchair):1.5, feet out of frame, looking at viewer, from below, black bodysuit, at night, moonlight
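
The seed hunt can also be scripted. This is a minimal sketch, assuming Forge was launched with the `--api` flag and exposes the A1111-style `/sdapi/v1/txt2img` endpoint on `http://127.0.0.1:7860`. The resolution, exact step count, seed range and helper names are illustrative choices, not taken from the run above:

```python
import json
import urllib.request

# Assumption: a local Forge instance started with --api.
API_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"

PROMPT = (
    "PonyScores7, (very aesthetic, masterpiece, best quality:1.3), 2girls, "
    "(a woman, standing):1.5, (a woman, sitting, armchair):1.5, "
    "feet out of frame, looking at viewer, from below, black bodysuit, "
    "at night, moonlight"
)


def draft_payload(seed: int) -> dict:
    """Cheap draft settings: fast sampler, few steps, no hires fix."""
    return {
        "prompt": PROMPT,
        "seed": seed,
        "sampler_name": "Euler a",
        "steps": 16,         # illustrative; the article only says "fewer than 20"
        "width": 832,        # illustrative SDXL-friendly size
        "height": 1216,
        "enable_hr": False,  # no hires fix for drafts
    }


def txt2img(payload: dict) -> dict:
    """POST one payload; the response holds base64 images under "images"."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Around 20 drafts with known seeds, so a good one can be replayed later.
drafts = [draft_payload(seed) for seed in range(1000, 1020)]
```

Calling `txt2img(p)` for each entry in `drafts` and saving the returned images gives a set of cheap compositions, each tied to a seed that can be reused with heavier settings once one of them works.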

After around 20 pictures, I got the seed I wanted and redid it with a better sampler and ADetailer to get this:

Good enough! But not yet what I wanted. It was time to start playing with ControlNet. Now, OpenPose gives nice results, but it still leaves the model too much freedom, and sometimes you get ugly stuff:

What happened? The floor ate the poor girl's legs! (Don't mind the back-and-forth model switching: I only kept a few intermediate images for this article. Most were done with BancinXL, but this one wasn't.)

That's why I switched to CN-anytest from iroiro. From there, I did a first version to get the outfit in:

And then I used the newly generated picture as a new control picture for a better version, with LoRA and more steps:
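
This feedback loop, where each output becomes the next control image, can be sketched in a few lines. `Pass`, `refine` and the `generate` callable are hypothetical names used for illustration only; `generate` stands in for whatever runs one ControlNet-guided generation (the UI by hand, or an API call), and the step counts and LoRA name are made up:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Pass:
    """Settings for one generation pass (all names are illustrative)."""
    label: str
    steps: int
    loras: List[str]


def refine(first_control: str,
           passes: List[Pass],
           generate: Callable[[str, Pass], str]) -> List[str]:
    """Run the passes in order, feeding each result back as the control image."""
    control = first_control
    history = []
    for p in passes:
        result = generate(control, p)  # one ControlNet-guided generation
        history.append(result)
        control = result               # the new picture becomes the next control
    return history


# Stand-in generator that only builds a name, so the flow can be checked
# without a GPU. Replace it with a real generation call.
def fake_generate(control: str, p: Pass) -> str:
    return f"{p.label}<-{control}"


passes = [
    Pass("outfit", steps=20, loras=[]),             # first version: get the outfit in
    Pass("detail", steps=30, loras=["some_lora"]),  # hypothetical LoRA, more steps
]
print(refine("seed_draft.png", passes, fake_generate))
```

The point of the design is that each pass only has to fix one thing (pose, then outfit, then detail), so the model never has to get everything right from a long prompt in a single shot.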

And now, to test models, it was time to use THIS new picture as a control picture and go for the home run: https://civitai.com/posts/10527818