Evolution of Text-to-Image: 2024 part 1

Go to year: 2022 | 2023 | 2024 (pt 1) | 2024 (pt 2)


Introduction

There were so many new models in 2024 that I split the year in two. This first half covers January through July, and the second half covers August through December.

Prompts and project details can be found at the bottom of the article. High-resolution versions of the comparison image grids are in this article's attachment.

The Models

Pony v6 XL

January 2024

Pony is to Stable Diffusion XL what Niji is to Midjourney, except with a whole lot more NSFW. It's trained on anime and illustration styles. Although it's capable of high-quality output, it's a specialized model that covers a limited range of subjects and requires its own prompt format for best results.

✅free download (civitai)

Midjourney Niji 6

January 2024

Midjourney updated their Niji model to more closely match Midjourney 6's capabilities.

❌free download

Stable Cascade

February 2024

Stable Cascade is an experimental offshoot of Stable Diffusion built on the Würstchen architecture.

✅free download (civitai | huggingface)

Ideogram 1

February 2024

Ideogram 1 already showed potential for rendering legible text and following prompt instructions.

❌free download

Pixart Sigma

April 2024

Pixart Sigma is a model capable of directly generating images at 4K resolution. It's freely available, but hasn't gained the popularity of the Stable Diffusion models.

✅free download (github)

Firefly 3

April 2024

Adobe released a new version of Firefly that greatly improved image resolution, details, and prompt coherency.

❌free download

Meta AI

May 2024

Meta's text-to-image generator is free to use online but requires a Facebook or Instagram account. It can render legible text and has decent prompt coherency. It puts a watermark in the lower corner of the image.

❌free download

Stable Diffusion 3 (Medium)

June 2024

The long-awaited Stable Diffusion 3 was a flop. It was heavily censored and had major problems with anatomy. Its monstrous depictions of "girls lying on the grass" became a joke, which is what inspired me to add the prompt to my list for these comparisons. It's not all bad, though. It can still produce some nice-looking images with legible text.

✅free download (civitai | huggingface)

Kolors

July 2024

Kolors had surprisingly good output for a model I can't find much documentation on and rarely hear anyone mention. It's capable of making legible text and followed the instructions in all of my prompts on the first try. The images below are all from the generator on their official website. I couldn't find version information on the available models, so I'm not sure if it's the same as what's available for download on CivitAI.

✅free download (civitai | huggingface)

Midjourney 6.1

July 2024

Midjourney refined their version 6 model.

❌free download

Project Details

Disclaimer

I'm not an insider with special access to anything or a programmer who understands how all this works under the hood. I took some time to research, but this is from information found online and I can't guarantee everything is accurate. This is a work in progress; I'm still working on filling in missing information.

Also note that this is only a comparison of base models. Some models can produce significantly better images by using trained checkpoints, styles, presets, or detail enhancers.

Criteria

  • Must still be publicly accessible in 2024 without a complicated setup.

  • For this series, I've excluded turbo/fast versions of the models.

Process

  • I chose 15 prompts that cover a variety of photorealism, art styles, people, animals, objects, specific instructions, open-ended short prompts, text, and abstract concepts.

  • All images come from the first generation set, so I never picked from more than four images.

  • When possible, I used images from the same seed, which can show differences between minor versions of the same model.

  • I used the recommended settings for each model or the default offered online.

  • I didn't use additional styles or presets.
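
For anyone who wants to repeat this kind of comparison with locally run models, the process above could be planned roughly like the sketch below. It only builds the list of generation jobs and doesn't call any image generator. The names Job and build_jobs, the seed value 12345, and the batch size of 4 are all made up for illustration; they aren't the settings used for this article, and hosted generators may not let you set a seed at all.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Job:
    """One generation request: a model, a prompt, and shared settings."""
    model: str
    prompt: str
    seed: int
    batch_size: int


def build_jobs(models, prompts, seed=12345, batch_size=4):
    """Pair every model with every prompt, reusing one seed throughout.

    The seed and batch size defaults are placeholder assumptions.
    """
    # Only the first generation set is used, and it holds 1 to 4 images.
    if not 1 <= batch_size <= 4:
        raise ValueError("batch_size must be between 1 and 4")
    return [Job(m, p, seed, batch_size) for m, p in product(models, prompts)]


# Two models and two prompts taken from this article.
models = ["Stable Cascade", "Stable Diffusion 3 (Medium)"]
prompts = ["artificial intelligence", "woman lying on the grass"]

jobs = build_jobs(models, prompts)
for job in jobs:
    print(job)
```

Keeping the seed the same across jobs is what makes minor versions of the same model comparable. Each job would still need to be run with that model's recommended settings and no extra styles or presets.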

Prompts

  • african hydropunk princess

  • artificial intelligence

  • astronaut exploring an alien planet

  • overhead view of a breakfast plate with eggs, toast, strawberries, coffee, and a fork

  • exterior of a cafe watercolor painting

  • person wearing cyberpunk accessories in a high tech neon city

  • druid man character design

  • ethereal fairy in the style of oil painting

  • graphic design logo with fennec fox and succulents and text "Desert Design"

  • man and a woman in love

  • photo of a deer in an enchanted forest with cinematic lighting

  • Photo portrait of a woman with long black curly hair in natural light. She's wearing a fashionable purple blouse, a gold necklace with a locket, and hoop earrings. Bokeh background.

  • pixel art city street scene with shops and pedestrians at night

  • red potion bottle with text "health" on the left, blue potion bottle with text "mana" in the middle, green potion bottle with text "poison" on the right, on a wooden table in a dark alchemist's laboratory, in the style of a detailed digital painting

  • woman lying on the grass

Article Updates

  • Nov 26, 2024: Added download links.

