Sign In

How to make SD1.5 more easily controllable

How to make SD1.5 more easily controllable


I don't know English, the following translation is from chatGPT.If there are any shortcomings, please forgive me.


你是不是也像我一样,在使用SD1.5时经常被它的构图能力所劝退。比如明明提示词写的是portrait,它却跑出个close-up甚至upper body,要是偶尔一两次倒也罢了,但事实是跑对的概率并不高,跟抽奖一样,无奈的抽卡党们的万恶之源,家人们心累谁懂啊

有人可能会建议使用 ControlNet 来解决,当然,如果我们有合适的参考图,ControlNet又恰巧顺利无误的提取了其中的信息,在SD不抽其他风的前提下,我们是可以得到较好结果的,不过,这也太不人工智能了吧。

今天我想探讨的是,是否有一种简单的方法,既不像 ControlNet 那么繁琐,又可以比较准确的控制构图,比如用一个lora。这里是案例。

Many have struggled with the composition control in SD1.5. Often, when you intend to get a close-up shot, it provides you with a portrait or even upper body shot, which can be incredibly frustrating, testing our patience. Some may suggest that using ControlNet can address this issue. Certainly, finding an appropriate composition image and then extracting information through ControlNet can offer better control. However, what we are exploring today is whether there is a simpler method, such as using Lora.The lora link: SDXLrender


Here's my approach:

  1. Data Preparation: I used images rendered with SDXL. This way, SD1.5 can learn not only composition but also the lighting and textures from SDXL.


  2. Concise Prompting: The simpler, the better. I used only two prompt words, "character" and "composition," to make SD's learning more focused

    第二步就是打标,由于我只针对构图提示词进行训练,所以只打了构图提示词的标,让SD更加聚焦。例图中只用了两个词“1girl, from above”

  3. Training Data: I configured my setup to train only UNET, this is very important! meaning I only trained the images and not the prompt words. This significantly reduces contamination from the prompt words, preventing SD from learning irrelevant content.




Here are the results of the training


For training and testing purposes, I've uploaded its dataset


The lora link: SDXLrender

希望这篇分享对你有所帮助,hope you have fun!

You can view more content sharing on my homepage →