Evolution of Text-to-Image: 2025

Go to year: 2022 | 2023 | 2024 | 2025


Introduction

Most models are now good at text and are less prone to mangling human anatomy. A major addition this year was the release of multiple models for editing images with text prompts.

Prompts and project details can be found at the bottom of the article. High resolution versions of the comparison image grids are in this article's attachment.

The Models

Lumina 2.0

January 2025

Lumina is a more lightweight release than most that will follow it this year. It seems like a marginal improvement over base SDXL in some ways, but can't compete with most current models. It is better than SDXL at following instructions, but it still struggles with anatomy and text.

250100_Lumina Image 2_small.jpg

Lumina 2

✅free download (GitHub | Hugging Face: all-in-one, diffusion variant)

Firefly Image 4

April 2025

Adobe released Firefly Image 4 and Firefly Image 4 Ultra. Based on my tests, the Ultra version seems to heavily favor realism over stylized images. Overall, it seems to be a big improvement over Firefly 3, but probably not worth choosing unless you're already in the Adobe ecosystem.

250400_FFLY4_Firefly Image 4_small.jpg

Firefly Image 4

❌free download

250400_FFLY4_Firefly Image 4 Ultra_small.jpg

Firefly Image 4 Ultra

❌free download

GPT Image 1

April 2025

GPT Image 1 has great prompt adherence, is good at text, and can make aesthetically pleasing images even when given only minimal prompts.

250400_GPT1_GPT1 Image Medium_small.jpg

GPT Image 1

❌free download

HiDream-I1

April 2025

HiDream has great prompt adherence, is good with text, and leans toward stylized images. Both the Dev and Full versions are available to run locally.

250400_HIDR_HiDream-I1 Dev_small.jpg

HiDream-I1 Dev

✅free download (CivitAI | Hugging Face: fp8/bf16, gguf)

250400_HIDR_HiDream-I1 Full_small.jpg

HiDream-I1 Full

✅free download (CivitAI | Hugging Face: fp8/bf16)

Midjourney 7

April 2025

Midjourney's new model emphasizes personalization that can adapt to the user's preferences. This means you might get images that look very different from mine with the same prompts.

250400_MJD7_Midjourney 7_small.jpg

Midjourney 7

❌free download

Seedream 3

April 2025

Seedream 3 is ByteDance's text-to-image model. It's good with text and aesthetics, but sometimes seemed to prioritize style over following my instructions. This version is pretty good, but two more updates followed within only a few months.

250400_SEED3_Seedream 3_small.jpg

Seedream 3

❌free download

Google Imagen 4 (Regular and Ultra)

August 2025

This model from Google seems to favor stylized images. It's good at prompt adherence and great with text.

250800_GI4_Google Imagen 4_small.jpg

Google Imagen 4

❌free download

250800_GI4_Google Imagen 4 Ultra_small.jpg

Google Imagen 4 Ultra

❌free download

Nano Banana

August 2025

Also known as Gemini 2.5 Flash Image (its successor, Nano Banana Pro, is Gemini 3 Pro Image). Another model under the Google umbrella. It's good at prompt adherence, text, and aesthetics. It can be used to generate images directly and to edit existing images.

250800_NB_Nano Banana_small.jpg

Nano Banana

❌free download

Qwen Image

August 2025

Qwen is currently a community favorite with an Apache license. It's great at prompt adherence, text, anatomy, and styles. It's resource-intensive, but variants have been released that can run on lower-VRAM graphics cards.

250800_Qwen Image_small.jpg

✅free download (CivitAI | Hugging Face: fp8/bf16, gguf, nunchaku)
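
As a rough illustration of why the lower-precision variants listed above fit on smaller cards, the memory needed for model weights scales with the number of bytes stored per parameter. The sketch below is a back-of-the-envelope estimate only: the 20-billion parameter count is an assumed, illustrative figure rather than a number from this article, the "4-bit" row stands in for a generic quantized format, and the estimate ignores the text encoder, VAE, and activation memory.

```python
# Approximate storage cost per parameter for a few common precisions.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "4-bit": 0.5}


def weight_gib(num_params: int, precision: str) -> float:
    """Return the approximate size of the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[precision] / 1024**3


# Assumption for illustration only: a model with 20 billion parameters.
ASSUMED_PARAMS = 20_000_000_000

for precision in BYTES_PER_PARAM:
    print(f"{precision}: {weight_gib(ASSUMED_PARAMS, precision):.1f} GiB")
```

Halving the bytes per parameter halves the weight footprint, which is the main reason a model that is impractical at bf16 on a consumer card can become usable in a quantized form.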

Seedream 4.0

September 2025

ByteDance's second release of Seedream this year, but not its last.

250900_SEED3_Seedream 4_small.jpg

Seedream 4.0

❌free download

Hunyuan Image 2.1

September 2025

Hunyuan can be run locally but it is resource-intensive. It is good with text and prompt adherence. It can create aesthetically pleasing images, but the images tended to be more basic without extra guidance. Without a refiner, this model tends to be blurry and lack fine details.

250900_Hunyuan Image 2-1_small.jpg

✅free download (Hugging Face: official, gguf)

Kandinsky 5

November 2025

Kandinsky is a Russian model, and if you use it long enough, that may become obvious, since it sometimes randomly adds Russian text to images. It's one of the more inconsistent models I tested; sometimes it's great at prompt adherence, and other times it ignores basic requests and does its own thing.

251100_Kandinsky 5_small.jpg

✅free download (Hugging Face)

Z Image Turbo

November 2025

Z-Image Turbo is one of the few new models released this year that can run on lower VRAM and still offer significant improvements over the older Stable Diffusion models. It's good with text and realism, and is fairly good at prompt adherence (my experience was good, but not as good as with some of the other current models).

251100_Z Image Turbo_small.jpg

✅free download (CivitAI | Hugging Face: bf16, fp8, gguf)

Flux 2

November 2025

The latest in the Flux lineup has been released, and it's quite resource hungry. I haven't had much time to experiment with it yet.

251100_Flux Dev 2-0_small.jpg

Flux Dev 2.0

✅free download (CivitAI | Hugging Face: fp8, gguf)

251100_Flux Pro 2-0_small.jpg

Flux Pro 2.0

❌free download

Seedream 4.5

December 2025

And yet another Seedream release. I haven't used it much, but it looks like it has a focus on creating stylized images.

251200_Seedream 4-5_small.jpg

Seedream 4.5

❌free download

Project Details

Disclaimer

I'm not an insider with special access to anything or a programmer who understands how all this works under the hood. I took some time to research, but this is from information found online and I can't guarantee everything is accurate. This is a work in progress; I'm still working on filling in missing information.

Also note that this is only a comparison of base models. Some models can produce significantly better images by using trained checkpoints, styles, presets, or detail enhancers.

Criteria

  • Must still be publicly accessible in 2025 without a complicated setup.

  • I'm trying to keep no more than two of each release: a "top of the line" version and the smaller version released for consumers. I've left out variants such as turbo, editing, and less resource-intensive versions.

Process

  • I chose 15 prompts that show a variety of photorealism, art styles, people, animals, objects, specific instructions, open-ended short prompts, text, and abstract concepts.

  • All images come from the first generation set, and I never picked from a batch of more than four images.

  • When possible, I used the same seed across images, which can show differences between minor versions of the same model.

  • I used the recommended settings for each model or the default offered online.

  • I didn't use additional styles or presets.
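
For anyone who wants to repeat a comparison like this with a script, the process above could be organized as a simple job list: one entry per model and prompt pair, all sharing one seed and a batch of at most four images. This is only a minimal sketch of the bookkeeping, not the workflow actually used for this article. The `Job` class, the `build_jobs` helper, and the seed value are hypothetical names chosen for illustration, and the actual image generation call is left out because it differs for every model.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Job:
    """One generation request: a single prompt sent to a single model."""
    model: str
    prompt: str
    seed: int
    batch_size: int


def build_jobs(models, prompts, seed=12345, batch_size=4):
    """Create one job per (model, prompt) pair, all sharing the same seed."""
    if not 1 <= batch_size <= 4:
        raise ValueError("batch_size must be between 1 and 4")
    return [Job(m, p, seed, batch_size) for m, p in product(models, prompts)]


# Two models and two prompts taken from the lists in this article.
models = ["Qwen Image", "Z Image Turbo"]
prompts = ["artificial intelligence", "woman lying on the grass"]

jobs = build_jobs(models, prompts)
for job in jobs:
    print(job)
```

Keeping the seed and batch size in the job record makes it easy to check afterward that every image in a grid was produced under the same conditions.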

Prompts

  • african hydropunk princess

  • artificial intelligence

  • astronaut exploring an alien planet

  • overhead view of a breakfast plate with eggs, toast, strawberries, coffee, and a fork

  • exterior of a cafe watercolor painting

  • person wearing cyberpunk accessories in a high tech neon city

  • druid man character design

  • ethereal fairy in the style of oil painting

  • graphic design logo with fennec fox and succulents and text "Desert Design"

  • man and a woman in love

  • photo of a deer in an enchanted forest with cinematic lighting

  • Photo portrait of a woman with long black curly hair in natural light. She's wearing a fashionable purple blouse, a gold necklace with a locket, and hoop earrings. Bokeh background.

  • pixel art city street scene with shops and pedestrians at night

  • red potion bottle with text "health" on the left, blue potion bottle with text "mana" in the middle, green potion bottle with text "poison" on the right, on a wooden table in a dark alchemist's laboratory, in the style of a detailed digital painting

  • woman lying on the grass

Article Updates

