Sharing a workflow and general approach that have been quite a game changer for me.
Required tools and models:
A good, recent checkpoint with good consistency and anatomy.
ComfyUI
4x_NMKD-Siax_200k upscaler https://huggingface.co/gemasai/4x_NMKD-Siax_200k/resolve/main/4x_NMKD-Siax_200k.pth
DMD2 lora https://huggingface.co/tianweiy/DMD2/resolve/main/dmd2_sdxl_4step_lora_fp16.safetensors
Important note: the PAG implementation in Forge and co is broken and will not produce better results than with it off. You need to use ComfyUI or SwarmUI.
Method:
Use Perturbed Attention Guidance (PAG) set between 0.8 and 2 for the initial image, with the dpmpp_2m_sde or dpmpp_sde sampler, Karras scheduling, low CFG (around 3), and 40-50 steps.
Upscale to 2x using 4x_NMKD-Siax_200k: the model scales the image to 4x, so downscale the result to half size using Lanczos.
Use DMD2 lora with LCM sampler, CFG 1, 4-12 steps for the upscaled diffusion.
Use the advanced KSampler and adjust the number of steps, the starting step, and the scheduler if you need fewer or more details: (steps - starting_step) / steps ~= denoise amount (as a fraction, so 0.4 means about 40%).
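To make the arithmetic in the steps above easier to play with, here is a small Python sketch. It is not part of the workflow file; the function names are made up for illustration, and `start_at_step` is assumed to correspond to the starting step setting of the advanced KSampler. It only computes numbers (the approximate denoise amount, a starting step for a target denoise, and the final size after the 4x upscale plus half-size downscale), it does not call ComfyUI.

```python
def effective_denoise(steps: int, start_at_step: int) -> float:
    """Approximate denoise fraction: (steps - starting_step) / steps."""
    if steps <= 0 or not 0 <= start_at_step <= steps:
        raise ValueError("need steps > 0 and 0 <= start_at_step <= steps")
    return (steps - start_at_step) / steps


def start_step_for(steps: int, denoise: float) -> int:
    """Starting step that gives roughly the requested denoise fraction."""
    if steps <= 0 or not 0.0 <= denoise <= 1.0:
        raise ValueError("need steps > 0 and 0 <= denoise <= 1")
    return round(steps * (1.0 - denoise))


def upscaled_size(width: int, height: int,
                  model_scale: int = 4, resize_factor: float = 0.5) -> tuple[int, int]:
    """Final size after a 4x model upscale followed by a half-size downscale."""
    return (int(width * model_scale * resize_factor),
            int(height * model_scale * resize_factor))


if __name__ == "__main__":
    # Example: 8 LCM steps on the upscaled image, starting at step 5.
    print(effective_denoise(8, 5))
    # Example: which starting step gives about 0.375 denoise with 8 steps?
    print(start_step_for(8, 0.375))
    # Example: a 1024x1024 initial image ends up at 2x.
    print(upscaled_size(1024, 1024))
```

Lowering the starting step (or raising the total steps while keeping the starting step) increases the denoise amount, which tends to add or change more detail; raising the starting step keeps the result closer to the upscaled input.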
You can use the workflow file attached to this article as an example and starting point.