| Field | Value |
| --- | --- |
| Type | |
| Stats | 68 0 |
| Reviews | (7) |
| Published | Mar 6, 2025 |
| Base Model | |
| Training | Steps: 2,500 · Epochs: 10 |
| Usage Tips | Clip Skip: 1 · Strength: 1 |
| Hash | AutoV2 59D565CC49 |
If you find this model helpful, you can follow my Bilibili or YouTube account.
Multi-character consistency has always been a challenge in ComfyUI. Previously, I trained a separate LoRA per character to keep each single character consistent.
I then merged the datasets and annotated the captions, but the annotation strategy I chose suffered from semantic pollution, so the model did not achieve the expected results.
label:
In the photo with a pure white background, susuxi stands with hands on hips on the left side of the picture, wearing a white shirt and black pants with a yellow belt on the pants and yellow pockets on the shirt. The expression is happy and the mouth is laughing. dreamoo is in the upper body photo on the right side of the picture, looking at the gray top and red short sleeved shirt on the right side
However, testing showed that a prompt without regional control leads to feature fusion between the two characters, e.g. with:
susuxi and dreamoo are swinging on the swing,
Subsequently, inspired by In-Context LoRA, I modified the labeling method to strengthen the Flux model's perception of image regions: different prompt forms correspond to the features of different regions in the image. With this scheme, I completed the LoRA training.
label:
[Two different characters scene], <dreamoo><ssx> group photo, <ssx stands with hands on hips, wearing a white shirt and black pants, with a yellow pocket on the shirt>, <dreamoo wears a gray shirt with a red inner layer, facing right>, pure white background,
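Captions in this bracketed region format can be assembled programmatically. The sketch below is a hypothetical helper (not the author's actual tooling), assuming each character's description is keyed by its trigger word:

```python
def build_caption(char_descriptions, scene="pure white background"):
    """Assemble a training caption in the bracketed region format:
    a scene header, concatenated trigger words, one <trigger description>
    region block per character, then the background/scene description.

    char_descriptions: dict mapping trigger word -> attire/pose description,
    e.g. {"ssx": "stands with hands on hips, wearing a white shirt"}.
    """
    triggers = "".join(f"<{t}>" for t in char_descriptions)
    regions = ", ".join(
        f"<{t} {desc}>" for t, desc in char_descriptions.items()
    )
    return (
        f"[Two different characters scene], {triggers} group photo, "
        f"{regions}, {scene},"
    )

caption = build_caption({
    "ssx": "stands with hands on hips, wearing a white shirt and black pants",
    "dreamoo": "wears a gray shirt with a red inner layer, facing right",
})
print(caption)
```

Dict insertion order is preserved in Python 3.7+, so the trigger-word order in the caption matches the order characters are listed.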
I recommend writing prompts in the following format and describing the shared scene at the end; this places the different characters in the same scene more reliably.
[Two different characters in one scene], <dreamoo><ssx> Describe the scenario, <Use trigger words to describe the attire and state of character one>, <Use trigger words to describe the attire and state of character two>, overall scene description,
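The recommended inference ordering can be sketched the same way. This is a hypothetical helper (all names are assumptions, not part of any released tooling) that keeps the overall scene description last, as recommended:

```python
def build_prompt(scenario, char_states, overall_scene):
    """Build an inference prompt in the recommended order:
    header, trigger words + scenario, one region block per character,
    and the overall scene description placed last.

    char_states: dict mapping trigger word -> attire/state description.
    """
    triggers = "".join(f"<{t}>" for t in char_states)
    regions = ", ".join(f"<{t} {s}>" for t, s in char_states.items())
    return (
        f"[Two different characters in one scene], {triggers} {scenario}, "
        f"{regions}, {overall_scene},"
    )

prompt = build_prompt(
    "swinging on a swing",
    {"dreamoo": "wears a gray shirt with a red inner layer",
     "ssx": "wears a white shirt and black pants"},
    "in a sunny park",
)
print(prompt)
```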
The following are the test result images. They show that the model generalizes well and maintains character consistency.
Combined with the Wanxiang video generation model, it can also be used for video production, with very good results.