Hello and welcome to my guide on how to create the artwork that you want. This article will teach you a way to create artworks with very specific characters (multiple characters in one image too) in specific poses/scenarios. We will be using very basic tools to achieve that. We are also dropping the usual process of "making one prompt to get the image you want"; instead, we are going to modify the prompt multiple times during the creation of our artwork. I usually use this to create some spicy nsfw artworks, but for the purpose of this guide (and to make it available for anyone) I'll stick to something sfw :3c
The prerequisite to follow this guide is going to be a decent understanding of:
a1111 txt2img, img2img and inpainting
External editing tools like GIMP, Paint.NET, Photoshop etc.
The methods used here do not require a ton of VRAM, so don't worry about that.
If you already understand and are able to use those, you should have no issue following this guide. If not, you can still read it and learn those things while following along!
Step 1: Think about what you want to make
This is the first and most important step. The entire process can take a few hours depending on what you are trying to make, so spending a little extra time thinking about what you want to make will save you potential struggles and wasted time later down the line. One more thing to consider at this step is whether there are LoRAs available for each part that you want to make.
Here is an example that I will use for this guide: I want to make an artwork with Amber, Mona, Fischl and Barbara from Genshin Impact sitting at a table enjoying a meal, talking and just having fun. So now I want to check if there are models for each part that I want in my artwork. For the characters there are, and since I don't need any specific poses/art styles that require an additional LoRA, I am good to start with the next step.
Step 2: Get a good base image for your pose/scenario
To achieve this there are two options: either you take an already existing image and let img2img make something similar from it, or you use txt2img to create it from scratch. If you went for something with a specific pose, you also want to use that LoRA for this step (this helps when making lewd stuff). My scenario is easily doable for most models, so I'll just prompt it in txt2img without any additional LoRA. You also want to consider the dimensions of your artwork in this step. I am going for a 3:2 aspect ratio since the artwork that I am imagining works better in landscape than in portrait. For the initial resolution I would advise going for something that can be cropped to a usable resolution later, so I am going for 1152x768 (aka 768x512 with hires fix upscaling by 1.5; if your GPU can't handle that resolution you can use an upscaling script on the original resolution instead).
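As a quick sanity check on those numbers, here is a minimal sketch of the resolution arithmetic. The helper names are my own and not part of a1111; the only inputs taken from this guide are the 768x512 base size and the 1.5 hires fix factor.

```python
from fractions import Fraction


def hires_resolution(base_w, base_h, scale):
    """Return the (width, height) produced by upscaling a base size by `scale`."""
    return round(base_w * scale), round(base_h * scale)


def aspect_ratio(width, height):
    """Reduce width:height to its simplest whole-number ratio."""
    ratio = Fraction(width, height)
    return ratio.numerator, ratio.denominator


final = hires_resolution(768, 512, 1.5)
print(final)                 # (1152, 768)
print(aspect_ratio(*final))  # (3, 2)
```

You can plug in your own base size and factor to see which final resolution and aspect ratio you will end up with before generating anything.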
Since I am making the base image myself I'll just make a simple prompt like
masterpiece, best quality, CG, wallpaper, HDR, high quality, high-definition, extremely detailed, 4girls, table, eating, fun, indoors, food, tavern
and let it give me a good batch until I find something that looks like a good base. One thing you can disregard in this step is how the characters in the image look, since we will replace them anyway. Here is what I decided to start with:
We will mess up some aspects of it later anyway, so don't bother fixing smaller things yet. Now that I have my base image I can move on to the next step.
Step 3: Cutting down the image / lots of inpainting
Before we start to inpaint we want to cut the image into smaller parts. We do this to get a smaller area on which models and LoRAs will have an easier time adding the things we want. I will use Paint.NET to do that, but you can use whatever you want to achieve this. The first part that I will be editing is the left half. I will split the image exactly 50/50, which for the 1152x768 base gives a 576 by 768 crop:
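If you would rather work out the crop regions with numbers than by eye, the split can be sketched like this. The function is a hypothetical helper of mine; it returns boxes as (left, upper, right, lower), which is the same convention Pillow's Image.crop uses, and the left/upper corner of each box is also where you would paste the finished crop back later.

```python
def split_boxes(width, height, columns=2):
    """Return (left, upper, right, lower) boxes that tile the image in equal-width columns."""
    step = width // columns
    boxes = []
    for i in range(columns):
        left = i * step
        # The last column absorbs any leftover pixels from the integer division.
        right = width if i == columns - 1 else left + step
        boxes.append((left, 0, right, height))
    return boxes


left_box, right_box = split_boxes(1152, 768)
print(left_box)   # (0, 0, 576, 768)
print(right_box)  # (576, 0, 1152, 768)
```

For the 1152x768 base from Step 2, an exact 50/50 split gives two halves that are each 576 pixels wide.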
The next step is to throw the cropped image and the original prompt into img2img Inpaint. Before we start inpainting we want to edit the prompt a bit to change anything that could mess with the cropped image. In my case, since I used 4girls in the original prompt, I want to change that to 2girls to avoid the model trying to add more characters while inpainting.
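The prompt edit for a crop, together with the LoRA tag and trigger words that get added in the next paragraph, can be sketched as a small string helper. This is only an illustration: the function, the LoRA file name and the trigger words in the usage line are made up, and the real trigger words depend on the LoRA you downloaded. The `<lora:name:weight>` form is the prompt syntax a1111 uses to load a LoRA.

```python
import re


def prompt_for_crop(prompt, girls_in_crop, lora_name=None, lora_weight=1.0, trigger_words=()):
    """Adapt the full-image prompt to a crop and optionally add one LoRA."""
    # Swap the character-count tag (e.g. "4girls") for the count visible in the crop.
    count_tag = "1girl" if girls_in_crop == 1 else f"{girls_in_crop}girls"
    prompt = re.sub(r"\b\d+girls?\b", count_tag, prompt)
    parts = [prompt]
    parts.extend(trigger_words)
    if lora_name:
        # a1111 prompt syntax for loading a LoRA: <lora:filename:weight>
        parts.append(f"<lora:{lora_name}:{lora_weight}>")
    return ", ".join(parts)


base = "masterpiece, best quality, 4girls, table, eating"
# "amber_example" and the trigger words are hypothetical placeholders.
print(prompt_for_crop(base, 2, "amber_example", 0.8, ["amber (genshin impact)"]))
```

Keeping the base prompt untouched and deriving each crop's prompt from it makes it easy to swap LoRAs in and out as you move from character to character.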
The next step is to apply the first LoRA, add its trigger words to the prompt and mask the first character we want to "swap out". I would generally advise you to start with the characters further in the back. If you were to start at the front, you might end up with the one in the back overlapping a bit. So for the first step we would do something like this:
Another thing that is important here is to not mask anything at the edge where you cropped the image, to avoid getting cut-off body parts. You can fix that later, but that is just a waste of time if you can avoid it in the first place. You can also mask more to give it room to add things that the character may need. In my case Amber's hair ribbon is too big to fit in the masked area, so I masked a bit more at the top.
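I paint my masks by hand in the Inpaint tab, but if you build them in an image editor or a script, the two rules above (give the character extra room, stay away from the cut edge) can be sketched as follows. The helper and the 16 pixel seam margin are assumptions of mine, not values from a1111.

```python
def expand_mask_box(box, crop_size, pad, seam_side, seam_margin=16):
    """Grow a mask bounding box by `pad` pixels, but keep it off the cut edge."""
    left, top, right, bottom = box
    width, height = crop_size
    # Grow in every direction, clamped to the bounds of the crop.
    left, top = max(0, left - pad), max(0, top - pad)
    right, bottom = min(width, right + pad), min(height, bottom + pad)
    # Pull the box back from the seam so nothing is repainted right at the cut.
    if seam_side == "right":
        right = min(right, width - seam_margin)
    elif seam_side == "left":
        left = max(left, seam_margin)
    return left, top, right, bottom


# Left-half crop (576x768): the cut edge is on the right side.
print(expand_mask_box((100, 120, 560, 700), (576, 768), pad=40, seam_side="right"))
# (60, 80, 560, 740)
```

The padding gives room for things like a big hair ribbon, while the seam margin keeps the repainted area from touching the edge that will be reattached later.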
Ok, now that we have the mask we want to let it run. Just keep in mind that the output might not be perfect. Either keep regenerating new outputs, or just inpaint the smaller parts that you don't like. If you feel like there are not enough details, just ignore that for now. We will add those at a later step. If you see that it tries to make the character bigger than it was originally, lower the denoising strength and find the sweet spot at which you'll get the right size. Also don't worry about hands yet, we will fix those later too. After a bit of fiddling around I got the following image:
Now we will repeat what we did for Amber for the next character in the image, in my case Mona. We will mask the second character and try to get a good output:
You might see here that parts of the Mona LoRA started to bleed onto Amber, so now we will go back and fix that by reapplying her LoRA:
Now we will reattach it to the full image:
For the next crop we just want to make sure that the character is fully in the image. In my case I got both in:
And now we repeat what we did before for the last 2 characters until we get a decent result, and then reattach the crop to the full image:
Which is decent, but you might be able to see that the background has suffered a bit. We'll now do some more inpainting to get it to not look so scuffed anymore. If you are on a weaker GPU you can crop it again, but I will just inpaint the full image to save me some time. This is the final result for this resolution:
Step 4: Upscaling
We now have the desired scenario, but it definitely looks rough at the moment. To fix that we want to upscale the image by a decent amount. We do this to make it look cleaner, to even out potential differences between the LoRAs' art styles, and just to add a ton of detail. Do not worry if you have a weak GPU, since you can upscale with tiling. There are multiple ways to upscale, but the one that I will be using is the SD upscale script in img2img. So we start by sending the last image we got into img2img. Then we want to prompt what we are seeing at the moment. You can either do that manually, or use an extension to get the tags. Do not add any character-specific tags to the prompt, since it will start to apply those to other characters too. Once you have done that, scroll to the bottom of img2img and select the SD upscale script. The default settings work fine, but I recommend using a different upscaler. I use 4x AnimeSharp:
Once that is done we also want to adjust our sampling steps and denoising strength. I recommend a low denoising strength (0.15 to 0.2) and a really high number of steps (I do 150). Do not have any of the character LoRAs active for this. Once we have that selected we run it:
This is starting to look better. At this step you might notice how certain details are starting to disappear, or how certain things like eye color are starting to be wrong, but don't worry for now, we will fix that later. Now we want to do it again and upscale the image one more time. If your GPU does not let you do this, downsize the image a bit first and then run the script again from the smaller resolution. You should be able to reach 3k by 2k pixels with 8GB of VRAM with the script:
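If you like to keep your settings written down, the values for this pass can be collected and checked like this. It is a minimal sketch: the function is hypothetical, and the key names simply mirror the prompt, steps and denoising strength fields of the web UI. The range check only encodes my own recommendation, it is not a hard limit of a1111.

```python
def upscale_settings(prompt, denoising_strength=0.2, steps=150):
    """Collect the low-denoise, high-step settings for an upscale pass."""
    if not 0.15 <= denoising_strength <= 0.2:
        raise ValueError("this guide suggests a denoising strength of 0.15 to 0.2")
    if "<lora:" in prompt:
        # Character LoRAs should be switched off for the upscale.
        raise ValueError("remove character LoRA tags before upscaling")
    return {"prompt": prompt, "denoising_strength": denoising_strength, "steps": steps}


settings = upscale_settings("4girls, table, eating, indoors, food, tavern")
print(settings)
```

The LoRA check is there because a leftover character LoRA in the upscale prompt would start applying that character's look to everyone in the image.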
Had to make it smaller, 5mb file size max. for articles.
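The "downsize first, then upscale again" step can be estimated ahead of time. The sketch below assumes an integer scale factor of 2 per pass and reads "3k by 2k" as a 3072x2048 limit; both are assumptions of mine, so adjust them to whatever your GPU and script settings allow.

```python
from fractions import Fraction


def plan_upscale(width, height, scale=2, max_width=3072, max_height=2048):
    """Return (input_size, output_size) for one upscale pass that stays within the limits."""
    # Largest shrink factor (at most 1) so that the upscaled result still fits.
    shrink = min(
        Fraction(1),
        Fraction(max_width, width * scale),
        Fraction(max_height, height * scale),
    )
    in_w, in_h = int(width * shrink), int(height * shrink)
    return (in_w, in_h), (in_w * scale, in_h * scale)


print(plan_upscale(1152, 768))   # ((1152, 768), (2304, 1536))
print(plan_upscale(2304, 1536))  # ((1536, 1024), (3072, 2048))
```

The first pass needs no downsizing, while for the second pass the image would be shrunk to 1536x1024 before upscaling so that the result lands exactly on the limit.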
Ok, with that done we can move on to the final step.
Step 5: Fixing the things that are bad
The last (and probably most annoying) thing we have to do is to fix everything that still looks messed up or is wrong. That would include eye color, bad hands, lost details that you want it to have etc., and to achieve that we want to once again crop out smaller parts of it and do more inpainting. Obviously you are not limited to just that. You can edit parts like hands with your image editing tool, or paste a hand shape that you want on top of it before inpainting. Since I explained everything before I will just put the final result here:
Once again downsized a bit.
I could definitely have done more, but this is enough for this example.
This basically summarizes my workflow. There are probably better methods out there, but this is definitely the best one if you don't want to install and use a bunch of extensions. This one will also work for almost anything. So if you have characters that are overlapping, are arm in arm or have any other more complicated pose, you will be able to pull it off. You can also use this for simpler tasks like generating a character and then inpainting the head to add a different LoRA's head to it.
And even if you decide that you don't need this procedure, maybe you still took away a thing or two. The main point that I tried to teach here is that you are not limited to just the standard tools. Be more creative: cropping out the areas you want to enhance when your PC can't handle large resolutions, or loading and unloading LoRAs while inpainting, can be useful things to know at some point.