ComfyUI Multi-Subject Workflows

Last updated workflows:

  • Interaction OpenPose v2.2 -> v2.3

  • Latent Couple v4.0 -> v4.1

Please download from the model version itself, not from "Update [...]". I delete and recreate that entry every time I release a new version in order to push notifications, and deleting a version removes the statistics, reviews and comments made on it.

I have a working version of the Region Lora workflow; you can get the prototype by downloading the file attached to the update notification push, but be warned that it is a mess and then some. Comfy is as annoying as ever with the dumbest issues: what works perfectly as a bunch of nodes refuses to work after being converted into a group node (it generates the first 1-3 steps, then crashes). I'm trying to find a workaround, but don't get your expectations too high. I have a pretty bad record of staying motivated enough to put up with all of these dumb issues (I literally have a group node that works in one workflow but refuses to work when copied to, or even recreated in, another).


This is a collection of custom workflows for ComfyUI.

They can generate multiple subjects. Each subject has its own prompt.

They require some custom nodes to function properly, mostly to automate away or simplify some of the tediousness that comes with setting these things up.

To install any required custom nodes, the easiest way is to get ComfyUI Manager, then go to Manager and click "Install Missing Custom Nodes". If you're still missing nodes, refer to the dependencies listed in the "About this version" section for that workflow.

I can now run SDXL somewhat fine, so I will likely try to get these workflows working with it in the future.
------------------------------------------------------------------------------------------------------

There are five methods for multiple subjects included so far:

Latent Couple

Allows for more detailed control over image composition by applying different prompts to different parts of the image.

From my testing, this generally does better than Noisy Latent Composition.

This is pretty standard for ComfyUI; it just includes some QoL stuff from custom nodes.
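For reference, here is a minimal sketch of what a basic area-conditioning setup looks like in ComfyUI's API (JSON) prompt format, queued from Python. It's an illustration rather than the workflow file itself; the server address, checkpoint name, prompts and the 1024x512 left/right split are all placeholders.

```python
import json
import urllib.request

# Minimal API-format graph: one whole-image prompt plus two area-limited
# subject prompts, combined before a single KSampler pass.
g = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "model.safetensors"}},       # placeholder
    "2": {"class_type": "CLIPTextEncode",   # whole-image prompt
          "inputs": {"clip": ["1", 1], "text": "2girls, park, daylight"}},
    "3": {"class_type": "CLIPTextEncode",   # subject 1 prompt
          "inputs": {"clip": ["1", 1], "text": "blonde girl, red dress"}},
    "4": {"class_type": "CLIPTextEncode",   # subject 2 prompt
          "inputs": {"clip": ["1", 1], "text": "black-haired girl, blue dress"}},
    "5": {"class_type": "ConditioningSetArea",   # left half of a 1024x512 canvas
          "inputs": {"conditioning": ["3", 0], "width": 512, "height": 512,
                     "x": 0, "y": 0, "strength": 1.0}},
    "6": {"class_type": "ConditioningSetArea",   # right half
          "inputs": {"conditioning": ["4", 0], "width": 512, "height": 512,
                     "x": 512, "y": 0, "strength": 1.0}},
    "7": {"class_type": "ConditioningCombine",
          "inputs": {"conditioning_1": ["2", 0], "conditioning_2": ["5", 0]}},
    "8": {"class_type": "ConditioningCombine",
          "inputs": {"conditioning_1": ["7", 0], "conditioning_2": ["6", 0]}},
    "9": {"class_type": "CLIPTextEncode",   # negative prompt
          "inputs": {"clip": ["1", 1], "text": "lowres, bad anatomy"}},
    "10": {"class_type": "EmptyLatentImage",
           "inputs": {"width": 1024, "height": 512, "batch_size": 1}},
    "11": {"class_type": "KSampler",
           "inputs": {"model": ["1", 0], "seed": 42, "steps": 20, "cfg": 7.0,
                      "sampler_name": "euler", "scheduler": "normal",
                      "positive": ["8", 0], "negative": ["9", 0],
                      "latent_image": ["10", 0], "denoise": 1.0}},
    "12": {"class_type": "VAEDecode",
           "inputs": {"samples": ["11", 0], "vae": ["1", 2]}},
    "13": {"class_type": "SaveImage",
           "inputs": {"images": ["12", 0], "filename_prefix": "latent_couple"}},
}

# queue it on a locally running ComfyUI server (address is a placeholder)
req = urllib.request.Request("http://127.0.0.1:8188/prompt",
                             data=json.dumps({"prompt": g}).encode(),
                             headers={"Content-Type": "application/json"})
urllib.request.urlopen(req)
```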

Noisy Latent Composition (discontinued, workflows can be found in Legacy Workflows)

Generates each prompt on a separate image for a few steps (e.g. 4/20) so that only rough outlines of the major elements get created, then combines the latents and does the remaining steps with Latent Couple.

This is pretty standard for ComfyUI; it just includes some QoL stuff from custom nodes.
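For the curious, here is a hedged fragment of what that looks like in the same API format as the sketch above; it reuses the checkpoint loader "1", prompt encoders "2"/"3", negative "9" and combined area conditioning "8" from there. Step counts, node ids and the composite position are illustrative.

```python
# Fragment meant to be merged into the graph from the previous sketch:
# each prompt is denoised on its own latent for the first 4 of 20 steps,
# the noisy latents are composited, and a second sampler finishes the
# image on the combined area conditioning.
frag = {
    "20": {"class_type": "EmptyLatentImage",            # full canvas
           "inputs": {"width": 1024, "height": 512, "batch_size": 1}},
    "21": {"class_type": "EmptyLatentImage",            # subject-sized latent
           "inputs": {"width": 512, "height": 512, "batch_size": 1}},
    "22": {"class_type": "KSamplerAdvanced",            # rough whole-image pass
           "inputs": {"model": ["1", 0], "add_noise": "enable", "noise_seed": 1,
                      "steps": 20, "cfg": 7.0, "sampler_name": "euler",
                      "scheduler": "normal", "positive": ["2", 0],
                      "negative": ["9", 0], "latent_image": ["20", 0],
                      "start_at_step": 0, "end_at_step": 4,
                      "return_with_leftover_noise": "enable"}},
    "23": {"class_type": "KSamplerAdvanced",            # rough subject pass
           "inputs": {"model": ["1", 0], "add_noise": "enable", "noise_seed": 2,
                      "steps": 20, "cfg": 7.0, "sampler_name": "euler",
                      "scheduler": "normal", "positive": ["3", 0],
                      "negative": ["9", 0], "latent_image": ["21", 0],
                      "start_at_step": 0, "end_at_step": 4,
                      "return_with_leftover_noise": "enable"}},
    "24": {"class_type": "LatentComposite",             # paste subject onto canvas
           "inputs": {"samples_to": ["22", 0], "samples_from": ["23", 0],
                      "x": 512, "y": 0, "feather": 0}},
    "25": {"class_type": "KSamplerAdvanced",            # finish steps 4-20
           "inputs": {"model": ["1", 0], "add_noise": "disable", "noise_seed": 1,
                      "steps": 20, "cfg": 7.0, "sampler_name": "euler",
                      "scheduler": "normal", "positive": ["8", 0],
                      "negative": ["9", 0], "latent_image": ["24", 0],
                      "start_at_step": 4, "end_at_step": 20,
                      "return_with_leftover_noise": "disable"}},
}
```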

Character Interaction (Latent) (discontinued, workflows can be found in Legacy Workflows)

First of all, if you want something that actually works well, check Character Interaction (OpenPose) or Region Lora. This one doesn't; I'm leaving it up for archival purposes.

This is an """attempt""" at generating 2 characters interacting with each other, while retaining a high degree of control over their looks, without using ControlNets. Extremely inconsistent and unreliable.

We do this by generating the first few steps (e.g. 6/30) on a single prompt encompassing the whole image that describes what sort of interaction we want to achieve (plus the background and perspective; common features of both characters help too).

Then, for the remaining steps in the second KSampler, we add two more prompts, one for each character, limited to the area where we "expect" (read: guess) they'll appear, so mostly just the left or right half of the image with some overlap.

I'm not gonna lie, the results and consistency aren't great. If you want to try it, some settings to fiddle around with are the step at which the KSampler changes, the amount of overlap between the character prompts, and the prompt strengths. From my testing, the closest interaction I've been able to get out of this was a kiss; I've tried to go for a hug, but with no luck.

The later the step at which you switch KSamplers, the more consistently you'll get the desired interaction, but the weaker the effect of the character prompts (I've been switching somewhere between 20-35% of the total steps). You may be able to offset this a bit by increasing the character prompt strengths.
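If you want a starting point for those settings, here is a tiny, purely illustrative helper (the function name and defaults are made up for this example, not taken from the workflow) that computes the two overlapping half-image areas and the KSampler switch step:

```python
# Illustrative helper: split the canvas into overlapping left/right prompt
# areas and pick the step at which the second KSampler, with the
# per-character area prompts, takes over.
def interaction_layout(width, height, overlap_frac=0.15,
                       total_steps=30, switch_frac=0.25):
    # keep the overlap a multiple of 8 so the area boxes stay latent-aligned
    overlap = int(width * overlap_frac) // 8 * 8
    half = width // 2
    left  = {"x": 0,              "y": 0, "width": half + overlap, "height": height}
    right = {"x": half - overlap, "y": 0, "width": half + overlap, "height": height}
    # switch somewhere around 20-35% of the total steps
    switch_step = round(total_steps * switch_frac)
    return left, right, switch_step

left, right, switch = interaction_layout(1024, 512)
print(left, right, switch)   # boxes for ConditioningSetArea + the KSampler split step
```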

Character Interaction (OpenPose)

Another method of generating character interaction, except this time it actually works, and very consistently at that. To achieve this we simply run latent composition with ControlNet OpenPose mixed in. To make it more convenient to use, the OpenPose image can be pregenerated, so there is no need to hassle with supplying premade ones yourself. As a result, it's not much more complicated than a normal generation. You can find instructions in the notes inside the workflow itself after importing it into ComfyUI.

From a more technical side of things, implementing this is a bit more complicated than just applying OpenPose to the conditioning. Because we're dealing with a total of 3 (or more!) conditionings (the background and both subjects), we run into a problem: applying ControlNet to all three, whether before or after combining them, gives us the background with OpenPose applied correctly (the OpenPose image has the same dimensions as the background conditioning), but the subjects get the OpenPose image squeezed to fit their smaller areas, for a total of 3 non-aligned ControlNet images. For that reason, the unchanged OpenPose image can only be applied to the background. Stopping there, however, leaves no ControlNet guidance for the subjects, and the result has nothing to do with our OpenPose image. So instead, we crop the parts of the OpenPose image that correspond to the subject areas and apply those crops to each subject's conditioning before combining everything into the final conditioning. Only then do we generate the final image.
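In API-format terms, the cropping trick looks roughly like the fragment below; it's an illustration using core ComfyUI nodes, not an excerpt from the workflow. It assumes the prompt encoders "2"/"3"/"4" stand for the background, subject 1 and subject 2, and a 1024x512 canvas with the subjects in the left and right halves. The filenames are placeholders, and a LoadImage node stands in for the pose image (the real workflow generates that image itself, as described below).

```python
# Fragment in the same API format as the earlier sketches.
frag = {
    "30": {"class_type": "LoadImage",           # stand-in for the pose image
           "inputs": {"image": "pose.png"}},    # placeholder filename
    "31": {"class_type": "ControlNetLoader",
           "inputs": {"control_net_name": "control_openpose.safetensors"}},
    # full-size pose image -> background conditioning only
    "32": {"class_type": "ControlNetApply",
           "inputs": {"conditioning": ["2", 0], "control_net": ["31", 0],
                      "image": ["30", 0], "strength": 1.0}},
    # crop the pose to each subject's area before applying it there
    "33": {"class_type": "ImageCrop",           # subject 1 area (left half)
           "inputs": {"image": ["30", 0], "width": 512, "height": 512,
                      "x": 0, "y": 0}},
    "34": {"class_type": "ImageCrop",           # subject 2 area (right half)
           "inputs": {"image": ["30", 0], "width": 512, "height": 512,
                      "x": 512, "y": 0}},
    "35": {"class_type": "ControlNetApply",
           "inputs": {"conditioning": ["3", 0], "control_net": ["31", 0],
                      "image": ["33", 0], "strength": 1.0}},
    "36": {"class_type": "ControlNetApply",
           "inputs": {"conditioning": ["4", 0], "control_net": ["31", 0],
                      "image": ["34", 0], "strength": 1.0}},
    # "35"/"36" then go through ConditioningSetArea boxes matching the crops
    # and are combined with "32", exactly as in the Latent Couple sketch.
}
```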

The following image demonstrates our resulting conditioning:

btw the workflow will generate similar ones for you :)

Background conditioning covers the entire image and contains the entirety of the pose data.

Subject 1 is represented as the green area and contains a crop of the pose that is inside that area.

Subject 2 is represented as the blue area and contains a crop of the pose that is inside that area.

The pose-source image is generated first; then the pose data is extracted from it, cropped, applied to the conditionings, and used to generate the proper image. This saves you from needing to have suitable OpenPose images on hand.
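Sketched in the same format, the pose pregeneration might look like this; it would replace the placeholder LoadImage node "30" from the previous fragment. The OpenposePreprocessor node comes from the comfyui_controlnet_aux custom-node pack, and its exact class and input names may differ between versions, so treat that part as an assumption to verify against your install.

```python
# Fragment replacing the placeholder LoadImage node "30": a quick throwaway
# generation is decoded and run through a pose preprocessor.
frag = {
    "27": {"class_type": "KSampler",            # throwaway pose-source image
           "inputs": {"model": ["1", 0], "seed": 7, "steps": 20, "cfg": 7.0,
                      "sampler_name": "euler", "scheduler": "normal",
                      "positive": ["2", 0], "negative": ["9", 0],
                      "latent_image": ["10", 0], "denoise": 1.0}},
    "28": {"class_type": "VAEDecode",
           "inputs": {"samples": ["27", 0], "vae": ["1", 2]}},
    "30": {"class_type": "OpenposePreprocessor",  # custom node, see note above
           "inputs": {"image": ["28", 0], "detect_body": "enable",
                      "detect_hand": "enable", "detect_face": "enable",
                      "resolution": 512}},
}
```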

And here is the final result:

It includes a second pass after upscaling, face restoration, and additional upscaling at the end, all of which are part of the workflow.

A handy preview of the conditioning areas (see the first image) is also generated. Ideally, it would be generated before the proper image, but the means to control execution order are not yet implemented in ComfyUI, so sometimes it's the last thing the workflow does. Sadly, I can't do anything about that for now.

Some more use-related details are explained in the workflow itself.

Region Lora

As the name implies, this workflow will let you apply Lora models to specified areas of the image. Currently, the maximum is 2 such regions, but further development of ComfyUI or perhaps some custom nodes could extend this limit.

You can, for example, generate 2 characters, each from a different lora and with a different art style, or a single character with one set of loras applied to their face and another to the rest of the body - cosplay!

How does it work?

The secret is the TwoSamplersForMask node from Impact Pack. It allows us to generate parts of the image with different samplers based on masked areas. That means we can plug in different Lora models, or even different checkpoints, for the masked and unmasked areas. This is the central piece, but of course it's not actually as simple as just using that node. The full workflow is as follows:

First, we generate an image of our desired pose with a realistic checkpoint and pass it through a ControlNet OpenPose Preprocessor:

The raw OpenPose image is then applied to the conditioning of both subjects.

The next step happens in a Preview Bridge (another node from Impact Pack), which is essentially a preview image node with image and mask outputs. This is where the second of the above images ends up. Once it is there, we stop the image generation and open the image in the MaskEditor, where we can draw our mask over one of the characters like so:

This mask is now used to determine where each sampler is applied on a brand new image, while the OpenPose ControlNet applied to the conditioning ensures proper anatomy, framing, perspective, etc.
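Put together, the central wiring might look something like the fragment below; again an illustration, not the workflow itself. Node ids "1", "2", "3", "9" and "10" refer back to the first sketch (in the real workflow the positive conditionings would additionally have the OpenPose ControlNet applied first), the Impact Pack node names and input layouts (ToBasicPipe, KSamplerProvider, TwoSamplersForMask) are assumptions that may shift between versions, a LoadImageMask node stands in for the Preview Bridge + MaskEditor step, and all filenames are placeholders.

```python
# Illustrative fragment of the central Region Lora wiring (Impact Pack assumed).
frag = {
    # a second model with a character lora applied, used only inside the mask
    "40": {"class_type": "LoraLoader",
           "inputs": {"model": ["1", 0], "clip": ["1", 1],
                      "lora_name": "character_A.safetensors",
                      "strength_model": 1.0, "strength_clip": 1.0}},
    "41": {"class_type": "ToBasicPipe",          # base model pipe (unmasked area)
           "inputs": {"model": ["1", 0], "clip": ["1", 1], "vae": ["1", 2],
                      "positive": ["2", 0], "negative": ["9", 0]}},
    "42": {"class_type": "ToBasicPipe",          # lora model pipe (masked area)
           "inputs": {"model": ["40", 0], "clip": ["40", 1], "vae": ["1", 2],
                      "positive": ["3", 0], "negative": ["9", 0]}},
    "43": {"class_type": "KSamplerProvider",     # sampler for the unmasked area
           "inputs": {"seed": 42, "steps": 20, "cfg": 7.0, "sampler_name": "euler",
                      "scheduler": "normal", "denoise": 1.0, "basic_pipe": ["41", 0]}},
    "44": {"class_type": "KSamplerProvider",     # sampler for the masked area
           "inputs": {"seed": 42, "steps": 20, "cfg": 7.0, "sampler_name": "euler",
                      "scheduler": "normal", "denoise": 1.0, "basic_pipe": ["42", 0]}},
    "45": {"class_type": "LoadImageMask",        # hand-drawn region mask
           "inputs": {"image": "region_mask.png", "channel": "alpha"}},
    "46": {"class_type": "TwoSamplersForMask",   # masked area -> lora sampler
           "inputs": {"latent_image": ["10", 0], "base_sampler": ["43", 0],
                      "mask_sampler": ["44", 0], "mask": ["45", 0]}},
    # "46" is then decoded with VAEDecode and saved as usual.
}
```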

Final Result:

Mask: ヰ世界情緒 Isekai Joucho Nemophila ver

No Mask: Hoshimachi Suisei 星街すいせい / Hololive