
SDXL Portrait-to-Scene Master Workflow

Rating: 5 (2 reviews)
Author: Enomoto_Takane

Updated: Mar 31, 2026

style

inpainting character design controlnet txt2img sdxl

Download (3.9 KB)


Details

Type	Workflows
Stats	32 0
Reviews	Positive (2)
Published	Mar 24, 2026
Base Model	SDXL Lightning
Hash	AutoV2 28B26C8EB2

1 File

About this version

Phase 1: Decoupled Generation & Character Prototyping

Guided by an uploaded pose reference (OpenPose), the first pass generates only the character, with no background. This reduces randomness and keeps background complexity from interfering with character details, so the base subject closely matches your intended appearance, attire, and posture.

Phase 2: Interactive Composition & Spatial Layout

This phase introduces a "Quick Canvas" mechanism that lets you freely move and scale the generated character within the frame. Once the position is finalized, the system automatically extracts LineArt and Zoe Depth maps of the character at that location. These maps serve as a positional guide for the subsequent background generation, which helps avoid the common scale mismatch between character and environment.
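The placement step can be pictured as computing where the scaled character lands on the canvas. The following is only a minimal sketch of that geometry, not the workflow's actual node logic: the names place_character, Placement, and placement_mask are hypothetical, and the clamping behavior is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    left: int
    top: int
    width: int
    height: int


def place_character(char_size, canvas_size, scale, center):
    """Return the pixel box the scaled character occupies on the canvas.

    char_size and canvas_size are (width, height); center is the desired
    character center in canvas pixels. Hypothetical helper for illustration.
    """
    char_w, char_h = char_size
    canvas_w, canvas_h = canvas_size
    # Cap the scale so the character keeps its aspect ratio and still fits.
    scale = min(scale, canvas_w / char_w, canvas_h / char_h)
    width = min(canvas_w, max(1, round(char_w * scale)))
    height = min(canvas_h, max(1, round(char_h * scale)))
    # Turn the center point into a top-left corner, then clamp to the frame.
    left = min(max(0, center[0] - width // 2), canvas_w - width)
    top = min(max(0, center[1] - height // 2), canvas_h - height)
    return Placement(left, top, width, height)


def inside(box, x, y):
    """True when pixel (x, y) falls inside the placement box."""
    return (box.left <= x < box.left + box.width
            and box.top <= y < box.top + box.height)


def placement_mask(canvas_size, box):
    """Binary mask (rows of 0/1) marking where the character sits."""
    canvas_w, canvas_h = canvas_size
    return [[1 if inside(box, x, y) else 0 for x in range(canvas_w)]
            for y in range(canvas_h)]
```

A mask like this marks the region from which depth and line-art guidance would be taken; everything outside it is left for the background pass.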

Phase 3: Background Synthesis & Lighting Integration

Backgrounds are generated independently while the character keeps its designated position. The Qwen Instruct model then analyzes the lighting of the composite. Through the BlendMap node, the workflow blends and color-grades the image so that character edges, shadow depth, and ambient occlusion are consistent with the background's environmental lighting.
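As a rough illustration of what mask-based blending and color grading involve, here is a minimal sketch in plain Python. It is not the BlendMap node's implementation: the linear blend and the mean-matching gain are simple stand-in techniques, and all function names are hypothetical.

```python
def blend_pixel(fg, bg, alpha):
    """Linear blend of two RGB tuples; alpha=1.0 keeps the foreground."""
    return tuple(round(alpha * f + (1.0 - alpha) * b) for f, b in zip(fg, bg))


def blend_images(fg, bg, mask):
    """Composite fg over bg using a soft mask with values in [0, 1].

    Images are lists of rows of RGB tuples; the mask has the same shape
    with one float per pixel. Feathered mask edges give soft transitions.
    """
    return [
        [blend_pixel(f, b, a) for f, b, a in zip(fg_row, bg_row, mask_row)]
        for fg_row, bg_row, mask_row in zip(fg, bg, mask)
    ]


def match_mean_color(pixels, target_mean):
    """Scale each channel so the mean of `pixels` moves toward `target_mean`.

    A crude stand-in for color grading: one gain per channel, clipped to 255.
    """
    n = len(pixels)
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    gains = [t / m if m else 1.0 for t, m in zip(target_mean, means)]
    return [tuple(min(255, round(p[c] * gains[c])) for c in range(3))
            for p in pixels]
```

In this sketch the character pixels would first be graded toward the background's mean color and then composited with a feathered mask, so the edge pixels mix both sources.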

Phase 4: Qwen-VL Intelligent Self-Correction & Repair Loop

This is the core closed loop of the process. The system invokes Qwen-VL (a vision-language model) to scan the image for anatomical errors or logical inconsistencies, such as hand artifacts or unnatural limb postures. Qwen-VL returns specific repair instructions, which are fed back into the inpainting module for targeted structural correction.
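The control flow of such an inspect-and-repair loop might look like the sketch below. It assumes the loop may run more than once with a cap on rounds, which the description does not state; inspect and inpaint are hypothetical callables standing in for the Qwen-VL check and the inpainting module.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Issue:
    region: tuple      # (left, top, width, height) area to inpaint
    instruction: str   # repair text produced by the inspector


def repair_loop(image, inspect: Callable, inpaint: Callable, max_rounds: int = 3):
    """Inspect, repair, and re-inspect until clean or the budget is spent.

    inspect(image) returns a list of Issue objects (empty when clean);
    inpaint(image, region, instruction) returns the repaired image.
    Returns the final image and the issues found in each round.
    """
    history: List[List[Issue]] = []
    for _ in range(max_rounds):
        issues = inspect(image)
        if not issues:
            break  # nothing left to fix
        history.append(issues)
        for issue in issues:
            image = inpaint(image, issue.region, issue.instruction)
    return image, history
```

Capping the rounds is a design choice for this sketch: it keeps a persistently flagged region from looping forever, at the cost of possibly leaving some issues unrepaired.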

Phase 5: High-Res Resampling & Final Optimization

After the self-check, the image enters the Ultimate SD Upscale stage. Using Tiled Diffusion and high-definition upscale models, this phase preserves the established structure and lighting while enhancing skin, hair, and environmental textures, producing a high-resolution final image.
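Tiled processing works by splitting the enlarged image into overlapping boxes that are refined one at a time. The sketch below shows one way to compute such a grid; the tile size and overlap defaults are illustrative assumptions, not the workflow's settings, and tile_boxes is a hypothetical helper.

```python
def tile_boxes(width, height, tile=1024, overlap=64):
    """Return (left, top, right, bottom) boxes that cover the image.

    Neighboring tiles share at least `overlap` pixels so seams can be
    blended. Defaults are illustrative only.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be in [0, tile)")

    def starts(length):
        if length <= tile:
            return [0]  # a single tile already covers this axis
        step = tile - overlap
        positions = list(range(0, length - tile, step))
        positions.append(length - tile)  # last tile sits flush with the edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]
```

Calling tile_boxes(2048, 2048) yields a 3 x 3 grid here; smaller tiles lower peak memory use per step but increase the number of diffusion passes.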

Required Models & Resources

To ensure this workflow runs correctly, please download and place the following models in their respective folders:

1. Base Model & VAE

Checkpoint: wai-illustrious-SDXL (The core painting style)
Qwen VAE: qwen_image_vae.safetensors

2. ControlNet Models (SDXL/Illustrious)

OpenPose: NoobAI-XL Controlnet OpenPose
Depth: Illustrious-XL-Depth-Midas
LineArt: Illustrious-XL-LineArt-Anime

3. Multi-Modal & VLM (Qwen Series)

Qwen-Image GGUF: Qwen-Rapid-AIO-NSFW-v19_Q6_K.gguf
Qwen2.5-VL GGUF: Qwen2.5-VL-7B-Instruct-Q3_K_S.gguf
Special LoRA: Qwen-Image-Edit-F2P
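Before loading the workflow, it can help to confirm that the files are where the loaders expect them. The following is a minimal sketch that checks only the three exact filenames given above. The sub-folder names (vae, unet, text_encoders) are assumptions, since this page does not name the folders; adjust them to your installation. missing_models is a hypothetical helper.

```python
from pathlib import Path

# Assumed layout: sub-folder names are NOT stated on this page.
# Only the filenames are taken from the list above.
EXPECTED = {
    "vae": ["qwen_image_vae.safetensors"],
    "unet": ["Qwen-Rapid-AIO-NSFW-v19_Q6_K.gguf"],
    "text_encoders": ["Qwen2.5-VL-7B-Instruct-Q3_K_S.gguf"],
}


def missing_models(models_root, expected=EXPECTED):
    """Return 'folder/filename' entries that are not present under models_root."""
    root = Path(models_root)
    return [
        f"{folder}/{name}"
        for folder, names in expected.items()
        for name in names
        if not (root / folder / name).is_file()
    ]
```

An empty list means every listed file was found; anything else names what still needs to be downloaded or moved.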

⚠️ Hardware & Setup Note

VRAM Optimization: The Qwen3-VL Loader is configured to download necessary weights automatically.
For 8GB VRAM Users: If you encounter out-of-memory (OOM) errors, replace the loader or switch to a lower-bit quantization of the GGUF models to reduce memory use.
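To see why a lower-bit quantization helps, a back-of-the-envelope estimate of weight size is parameters times bits per weight divided by eight. The sketch below uses the nominal bits-per-weight of the two quantization types named in the model list; real GGUF files mix tensor types and runtime memory also includes activations and caches, so treat the numbers as rough estimates. The 7e9 parameter count is an assumption read off the "7B" in the filename.

```python
# Nominal bits per weight for the quantization types named in the model list.
# Real files mix tensor types, so actual sizes differ somewhat.
BITS_PER_WEIGHT = {"Q3_K_S": 3.4375, "Q6_K": 6.5625}


def weight_gib(param_count, bits_per_weight):
    """Approximate size of the weights alone, in GiB."""
    return param_count * bits_per_weight / 8 / 2**30


def fits(budget_gib, sizes_gib, headroom_gib=1.5):
    """True if the listed weights plus a safety headroom fit the VRAM budget.

    headroom_gib is an illustrative allowance for activations and caches.
    """
    return sum(sizes_gib) + headroom_gib <= budget_gib


if __name__ == "__main__":
    for quant, bpw in BITS_PER_WEIGHT.items():
        print(quant, round(weight_gib(7e9, bpw), 2), "GiB (weights only)")
```

Under these assumptions a 7B model at Q6_K needs roughly twice the weight memory of the same model at Q3_K_S, which is why stepping down a quantization level is a common first response to OOM errors.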


This workflow is designed to maintain a consistent SDXL artistic style by generating characters and backgrounds independently.

Key Features:

Interactive Layout: Use a manual canvas to freely adjust character position and scale.
Spatial Awareness: Automatically extracts Depth and LineArt maps from the character to guide background synthesis, keeping character and scene spatially consistent.
AI-Powered Refinement: Leverages Qwen-VL for intelligent image blending, automated self-checking of anatomical/logical issues, and targeted inpainting repairs.

It turns a standard one-pass generation process into a staged, feedback-driven production pipeline.