SD WebUI Forge Neo + Z-Image Turbo - Colab T4 Setup (March 2nd, 2026: fixed output)
What is this?
A single-cell Google Colab notebook that sets up a complete image generation environment on a free T4 GPU. It installs SD WebUI Forge Neo (a feature-rich Stable Diffusion interface) with Z-Image Turbo, a fast, high-quality text-to-image model that generates images in just 8 steps. Everything runs in the cloud. No local GPU or installation required.
What does it do exactly?
When you run the cell, the notebook automatically:
Detects the GPU and verifies you have a T4 (15 GB VRAM).
Mounts Google Drive and creates persistent folders for your LoRAs, outputs, and input images.
Clones the Forge Neo repository (branch neo) from GitHub.
Creates an isolated Python 3.11 virtual environment with PyTorch 2.6 (CUDA 12.4).
Installs the ADetailer extension for automatic face/hand detail enhancement.
Downloads three FP8 quantized models (~13.5 GB total) using fast multi-connection downloads: z-image-turbo-fp8-e4m3fn.safetensors (diffusion, 5.7 GB), qwen_3_4b.safetensors (text encoder, 7.5 GB), and ae.safetensors (VAE, 0.3 GB).
Creates symlinks so your LoRAs, outputs, and input images are stored on Google Drive and persist across sessions.
Launches the WebUI with a public Gradio link you can open in any browser.
The entire setup takes about 3 minutes on a fresh runtime. Re-running the cell in the same runtime skips everything already done and launches in under 30 seconds.
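The "skip everything already done" behaviour can be sketched as a simple check for which model files are still missing. This is only an illustration, not the notebook's actual code: the function names and the single flat models_dir folder are assumptions, and the sizes are the approximate figures quoted above.

```python
from pathlib import Path

# Approximate sizes in GB, taken from the list above.
MODELS = {
    "z-image-turbo-fp8-e4m3fn.safetensors": 5.7,
    "qwen_3_4b.safetensors": 7.5,
    "ae.safetensors": 0.3,
}


def missing_models(models_dir, models=MODELS):
    """Return the model file names that are not yet present in models_dir.

    Assumption: a file that exists is treated as fully downloaded.
    A real setup script may also want to verify sizes or checksums.
    """
    models_dir = Path(models_dir)
    return [name for name in models if not (models_dir / name).is_file()]


def download_size_gb(names, models=MODELS):
    """Sum the approximate download size (GB) of the given model names."""
    return round(sum(models[name] for name in names), 1)
```

On a fresh runtime all three files are missing, so the full ~13.5 GB is downloaded. On a re-run in the same runtime the list is empty and the download step can be skipped.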
How to use it
First launch:
Open the notebook in Google Colab.
Make sure the runtime is set to T4 GPU (Runtime → Change runtime type → T4 GPU).
Run the single cell. Authorize Google Drive access when prompted.
Wait for the setup to complete. A public URL will appear at the bottom (e.g. https://xxxxx.gradio.live).
Click the link to open the WebUI.
WebUI settings (first time only)
Once the interface loads, configure these settings:
UI Preset: zit
Checkpoint: z-image-turbo-fp8-e4m3fn.safetensors
VAE: ae.safetensors
Text Encoder: qwen_3_4b.safetensors
Diffusion in Low Bits: Automatic (fp16 LoRA)
Sampler: Euler / Schedule: Beta
Steps: 8 / CFG Scale: 1 / Shift: 3–6
Resolution: 1024×1024 or 896×1152 (portrait)
The "Diffusion in Low Bits" setting is found under Settings and is important if you plan to use LoRAs: it keeps VRAM usage manageable on the T4.
Google Drive folder mapping
Your files are stored on Google Drive and accessible in the WebUI through symlinks:
LoRA models → My Drive/SD-Forge/Lora (auto-detected by WebUI)
Generated images → My Drive/SD-Forge/outputs (saved here automatically)
Input images (img2img, batch) → My Drive/SD-Forge/Image input → WebUI path: /content/sd-webui-forge-neo/inputs
To use input images in Batch mode (img2img tab), enter /content/sd-webui-forge-neo/inputs as the input directory. For single img2img or inpainting, just drag and drop images directly into the browser.
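For the curious, the folder mapping above can be reproduced with a small helper like the one below. It is a minimal sketch, not the notebook's own code: link_to_drive is a hypothetical name, and the Drive path in the usage comment assumes Colab's usual /content/drive/MyDrive mount point.

```python
from pathlib import Path


def link_to_drive(drive_dir, local_path):
    """Make local_path a symlink pointing at drive_dir.

    Creates drive_dir if needed, and is safe to call again on a re-run.
    """
    drive_dir = Path(drive_dir)
    local_path = Path(local_path)
    drive_dir.mkdir(parents=True, exist_ok=True)

    if local_path.is_symlink():
        if local_path.resolve() == drive_dir.resolve():
            return local_path  # already linked to the right place
        local_path.unlink()  # stale link pointing elsewhere
    elif local_path.exists():
        # Refuse to overwrite a real file or folder.
        raise FileExistsError(f"{local_path} exists and is not a symlink")

    local_path.parent.mkdir(parents=True, exist_ok=True)
    local_path.symlink_to(drive_dir, target_is_directory=True)
    return local_path


# Example (assumed Drive mount point):
# link_to_drive("/content/drive/MyDrive/SD-Forge/Image input",
#               "/content/sd-webui-forge-neo/inputs")
```

Because the local path is only a link, anything the WebUI reads from or writes to it actually lives in the Drive folder, which is why it survives a runtime reset.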
Adding LoRAs
Drop .safetensors LoRA files into My Drive/SD-Forge/Lora on Google Drive. They will appear in the WebUI's LoRA browser a few minutes later (or drop them in before running the cell). The --lowvram flag and the Automatic (fp16 LoRA) setting ensure LoRAs work within the T4's memory limits.
Subsequent sessions
When your Colab runtime resets, just re-run the cell. Models are re-downloaded (Colab doesn't persist local storage), but your LoRAs, outputs, and input images remain safe on Google Drive.
Technical details
GPU: Tesla T4 (15 GB VRAM)
Models: FP8 quantized (full BF16 models exceed T4 VRAM)
PyTorch: 2.6.0 with CUDA 12.4
Attention: Native PyTorch SDPA (xformers is incompatible with this torch/CUDA combination)
VRAM mode: --lowvram (offloads models between steps to prevent out-of-memory errors with LoRAs)
Extensions: ADetailer (automatic face/body detail refinement)
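If you want to check the GPU yourself before running the cell, something like the following works. It is a rough sketch and not necessarily how the notebook does its own check: the function names are made up for illustration, and the sample line in the comment is an example value rather than output captured from a real session.

```python
import subprocess


def parse_gpu_line(line):
    """Parse one 'name, memory_in_MiB' CSV line, e.g. 'Tesla T4, 15360'."""
    name, mem_mib = (part.strip() for part in line.rsplit(",", 1))
    return name, int(mem_mib)


def is_t4(name):
    """True if the reported GPU name looks like a T4."""
    return "T4" in name


def query_gpu():
    """Ask nvidia-smi for the first GPU's name and total memory (MiB)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_line(out.splitlines()[0])


# Example, on a machine with an NVIDIA GPU:
# name, mem_mib = query_gpu()
# if not is_t4(name):
#     print(f"Warning: expected a T4, found {name} ({mem_mib} MiB)")
```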
EDIT: an "All on Google Drive" version has been added.

