Updated: May 27, 2026
This is the FLUX.2 semantic VAE used by Microsoft's Lens text-to-image model. Lens generates images in the FLUX.2 VAE's latent space rather than training its own, which is one of the design choices that lets it hit competitive quality on a much smaller training budget. This page mirrors the encoder/decoder weights so you can run Lens and Lens-Turbo on-site without pulling them separately.
The underlying VAE comes from FLUX.2, released by Black Forest Labs, and all credit for the VAE design and weights goes to them. The specific file bundled here was redistributed by Microsoft Research as part of the microsoft/Lens repository, where it lives under the vae/ folder alongside the Lens denoiser and GPT-OSS text encoder. Civitai hosts this mirror so creators can run Lens on-site; for the canonical weights and updates, head to the upstream repos.
Built by
Black Forest Labs built FLUX.2 and its semantic VAE. Microsoft Research selected the FLUX.2 VAE as the latent backbone for Lens and bundled it in their distribution. The Lens project leads are Dong Chen, Fangyun Wei, and Ziyu Wan, with core contributors Jiawei Zhang, Jinjing Zhao, Sirui Zhang, Yang Yue, and Zhiyang Liang.
How Lens uses it
The VAE is frozen - Lens does not finetune it. Latents from the FLUX.2 VAE define the space the 48-block MMDiT denoiser learns to model, and the decoder turns the final denoised latents back into pixels. Because Lens reuses an existing high-quality semantic latent space, it avoids paying the compute cost of training its own VAE from scratch.
Format
The bundled file is a 336 MB safetensors checkpoint in bf16, with a small config.json describing the architecture. It is the same VAE that ships with FLUX.2-dev, and it stays in bf16 even when Lens is run with a quantized text encoder or denoiser.
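If you want to confirm the dtype and size of a downloaded checkpoint without loading any weights, the safetensors header can be read directly. The sketch below uses only the Python standard library and relies on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The function names `read_safetensors_header` and `summarize` are illustrative, not part of Lens or FLUX.2, and the file path you pass in is whatever your local copy is called.

```python
import json
import struct
from collections import Counter
from math import prod


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as an unsigned little-endian 64-bit int.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional string-to-string map, not a tensor entry.
    header.pop("__metadata__", None)
    return header


def summarize(header):
    """Count parameters per dtype using each tensor's declared shape."""
    params_by_dtype = Counter()
    for entry in header.values():
        # prod([]) is 1, so scalar tensors count as one parameter.
        params_by_dtype[entry["dtype"]] += prod(entry["shape"])
    return dict(params_by_dtype)
```

Calling `summarize(read_safetensors_header(path))` on a checkpoint stored entirely in bf16 should return a single `"BF16"` key. At 2 bytes per bf16 value, a 336 MB file would correspond to roughly 168 million parameters, assuming the header itself is negligible in size.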
Upstream license
FLUX.2 is published under the FLUX Non-Commercial License, which is more restrictive than Lens's MIT license. If you are running Lens for non-commercial or research purposes, using the bundle is straightforward. If you intend any commercial use, read the FLUX.2 license terms on the FLUX.2-dev model card before relying on this VAE: the surrounding Lens code being MIT does not relax those terms.
What this model page is for
You do not generate images with this VAE alone. It is a dependency of Lens and Lens-Turbo, mirrored here so on-site generation can resolve the full pipeline without external fetches. If you are looking for the image generator itself, see the Lens model page.
Links
- Lens (parent model): huggingface.co/microsoft/Lens
- Lens-Turbo: huggingface.co/microsoft/Lens-Turbo
- FLUX.2 (upstream VAE source): huggingface.co/black-forest-labs/FLUX.2-dev
- Black Forest Labs: huggingface.co/black-forest-labs
- Lens code: github.com/microsoft/Lens
- Lens license: MIT
- FLUX.2 VAE license: FLUX Non-Commercial
