
z-image-turbo-flow-dpo

  • Updated: Feb 25, 2026

  • Published: Feb 25, 2026

  • Tag: style

  • Type: LyCORIS

  • Format: SafeTensor (Verified)

  • Base Model: ZImageTurbo

  • Hash (AutoV2): 1FD3C728AD

  • Stats: 703 · 6 · 2

  • Creator: fok3827

Z-Image-Turbo Photorealistic Lighting LoRA (Flow-DPO)

This is a specialized LoRA adapter for Alibaba-Tongyi/Z-Image-Turbo, fine-tuned with Flow-DPO (Direct Preference Optimization for Flow Matching) to significantly enhance photorealistic lighting, cinematic shadows, and overall image quality.

By utilizing Flow-DPO on perfectly spatially-aligned image pairs, this LoRA fixes the common "flat," "washed-out," or "plastic" artifacts often found in ultra-fast distilled models, delivering stunning, physically accurate lighting in just 8 inference steps.

🧠 Training Details & Methodology

This model was trained using a custom implementation of Flow-DPO (Improving Video Generation with Human Feedback, arXiv:2501.13918).

1. The Dataset (Strict Spatial Alignment)

To prevent the model from hallucinating or altering image structures (Catastrophic Forgetting), the preference dataset was constructed using strict spatial alignment:

  • Win (Chosen): High-quality, professional photographs with perfect lighting and textures.

  • Lose (Rejected): The exact same images, degraded programmatically (Gaussian blur, lowered contrast, extreme exposure shifts, Gaussian noise, and heavy JPEG compression artifacts).

  • Alignment: No cropping or warping was applied, ensuring the Flow Matching trajectory learned to solely correct lighting and texture.
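
The degradation recipe above can be sketched as a small Pillow/NumPy function. The specific parameters (blur radius, contrast factor, noise sigma, JPEG quality) are illustrative assumptions, not the values used in training:

```python
import io
import numpy as np
from PIL import Image, ImageEnhance, ImageFilter

def degrade(img: Image.Image, seed: int = 0) -> Image.Image:
    """Build the 'lose' sample from the 'win' image: same geometry,
    degraded quality. No crops or warps, so the pair stays spatially aligned."""
    rng = np.random.default_rng(seed)
    out = img.filter(ImageFilter.GaussianBlur(radius=2.0))        # Gaussian blur
    out = ImageEnhance.Contrast(out).enhance(0.6)                 # lowered contrast
    out = ImageEnhance.Brightness(out).enhance(
        float(rng.choice([0.5, 1.8])))                            # extreme exposure shift
    arr = np.asarray(out).astype(np.float32)
    arr += rng.normal(0.0, 12.0, arr.shape)                       # Gaussian noise
    out = Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
    buf = io.BytesIO()
    out.save(buf, format="JPEG", quality=10)                      # heavy JPEG artifacts
    buf.seek(0)
    return Image.open(buf).convert("RGB")

# Each preference pair is then (win=original, lose=degrade(original)).
```

Because every degradation is pixel-aligned, the preference signal carries no information about composition or geometry, only about lighting and texture quality.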

2. Discrete Timestep Distillation Preservation

Unlike standard diffusion models, where $t$ is sampled continuously over $t \in [0, 1]$, Z-Image-Turbo is a distilled model optimized specifically for 8 fixed timesteps. During Flow-DPO training, we dynamically extracted the exact discrete $t$-distribution from the FlowMatchEulerDiscreteScheduler and restricted random sampling to these exact 8 nodes. This ensures the LoRA retains the turbo model's extreme speed without causing output blurriness.

3. Hyperparameters

  • Base Model: Alibaba-Tongyi/Z-Image-Turbo (6B Single-Stream DiT)

  • Learning Rate: 1e-4

  • KL Penalty ($\beta$): 1.0

  • Effective Batch Size: 1

  • Mixed Precision: bfloat16
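
With these hyperparameters, the per-pair objective can be sketched as follows. This is a minimal scalar version in the spirit of Diffusion-DPO adapted to flow matching, not the exact training code; each `err_*` stands for a flow-matching error $\|v_\theta(x_t, t) - (x_1 - x_0)\|^2$ already computed elsewhere:

```python
import numpy as np

def flow_dpo_loss(err_w_policy: float, err_w_ref: float,
                  err_l_policy: float, err_l_ref: float,
                  beta: float = 1.0) -> float:
    """Preference loss over one (win, lose) pair.
    err_w_*: flow-matching error on the chosen (win) image,
    err_l_*: on the rejected (lose) image, under the trainable LoRA
    policy vs. the frozen reference model. beta is the KL penalty (1.0 here)."""
    margin = (err_w_policy - err_w_ref) - (err_l_policy - err_l_ref)
    # -log sigmoid(-beta * margin): minimized when the policy fits the
    # win image better, and the lose image worse, than the reference does.
    return float(-np.log(1.0 / (1.0 + np.exp(beta * margin))))
```

Because the reference errors appear only inside the margin, the frozen base model anchors the LoRA and limits drift, which is what the $\beta$ penalty controls.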

⚠️ Limitations

  • Not an Image-to-Image Restorer: This LoRA changes the prior distribution of the Text-to-Image generation. It is designed to generate better original images from text prompts, not to be used as an img2img filter to fix user-uploaded bad photos (unless combined with RF-Inversion techniques, which are highly unstable for 8-step models).

  • Color Saturation: Pushing the LoRA scale too high (e.g., > 1.5) may produce over-sharpened or oversaturated images, a side effect of DPO's margin maximization. Keep the scale around 0.6–1.0 for the most photorealistic results.