ChatGPT Images 2.0

Updated: Apr 21, 2026

base model

Type: Checkpoint Trained

Published: Apr 21, 2026

Base Model: OpenAI

License: OpenAI

Originally posted at https://openai.com/index/introducing-chatgpt-images-2-0/

Images are a language, not decoration. A good image does what a good sentence does—it selects, arranges, and reveals. It can explain a mechanism, stage a mood, test an idea, or make an argument.

A year ago, we released ChatGPT Images, showing that images created by AI can be both beautiful and useful. ChatGPT Images 2.0 is the next step: a state-of-the-art model that can take on complex visual tasks and produce precise, immediately usable visuals.

This model is a step change in detailed instruction following, placing and relating objects accurately, and rendering dense text, with the ability to generate across aspect ratios. Its sense of composition and visual taste means results feel less AI-generated and more intentionally designed. It’s accurate across languages and uses its expanded visual and world knowledge to fill in the gaps for you, so you get smarter images with less prompting.

Images 2.0 is also our first image model with thinking capabilities, which extend what it can do on the most complex tasks. When a thinking or pro model is selected in ChatGPT, Images 2.0 can search the web for real-time information, create multiple distinct images from one prompt, and double-check its own outputs. With thinking, the model can take on even more of the heavy lifting between idea and image, especially when accuracy, up-to-date information, consistency, and visual cohesion matter most.

With both the intelligence of OpenAI’s reasoning models and a vast understanding of the visual world, this model moves image generation from rendering to strategic design, from a tool to a visual system, helping people turn ideas into outputs they can understand, share, teach with, and build from. It’s available starting today to all users in ChatGPT, Codex, and the API.

Greater precision and control

Images 2.0 brings an unprecedented level of specificity and fidelity to image creation. It can conceptualize more sophisticated images and also carry that vision through: it follows instructions, preserves requested details, and renders the fine-grained elements that often break image models, such as small text, iconography, UI elements, dense compositions, and subtle stylistic constraints. It does this at up to 2K resolution in the API. Instead of getting something vaguely in the neighborhood of what you meant, you get something you can actually use.

Stronger across languages

To date, our image generation models have been more consistent in English and other Latin-script languages, but less precise beyond that, especially when text was complex or dense.

Images 2.0 moves beyond that barrier with stronger multilingual understanding and significant gains in non-Latin text rendering, particularly in Japanese, Korean, Chinese, Hindi, and Bengali. It can produce images in which non-English text is not only rendered correctly but also reads coherently.

That includes not just translating a label or two, but generating visually coherent outputs where language is part of the design itself, from posters and explainers to diagrams and comics. This makes the model more globally useful and helps people create visuals that work in the languages they actually use.

Stylistic sophistication and realism

Images 2.0 also shows significantly improved fidelity across a wide range of visual styles. It is better able to capture the defining characteristics of photos—including the tiny flaws that add realism—as well as cinematic stills, pixel art, manga, and other distinctive visual languages, with greater consistency in texture, lighting, composition, and fine detail.

As a result, the model can produce outputs that more faithfully reflect the style requested, rather than approximating it. This is especially useful for game prototyping, storyboarding, marketing creative, and creating assets in a particular medium or genre.

Flexible aspect ratios

The new model also gives you more flexibility in how those images are delivered. With support for aspect ratios as wide as 3:1 and as tall as 1:3, Images 2.0 can generate outputs that are ready to fit the formats you need, from wide banners and presentation slides to posters, mobile screens, bookmarks, and social graphics. Ask for the aspect ratio you want in the prompt, or select from preset options to regenerate any image in new dimensions.
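The ratio limits above can be checked before sending a request. The following is a minimal Python sketch, not part of any OpenAI SDK: the helper names are hypothetical, and it assumes "2K" means 2048 pixels on the longer side, which the announcement does not spell out.

```python
from fractions import Fraction

# Limits taken from the text: as wide as 3:1 and as tall as 1:3.
MAX_RATIO = Fraction(3, 1)
MIN_RATIO = Fraction(1, 3)

# Assumption: "2K" is read here as 2048 pixels on the longer side.
LONG_SIDE_PX = 2048


def parse_ratio(text: str) -> Fraction:
    """Parse a 'W:H' string such as '16:9' into an exact width/height fraction."""
    width, height = text.split(":")
    return Fraction(int(width), int(height))


def is_supported(ratio: Fraction) -> bool:
    """Return True if the ratio lies between 1:3 and 3:1 inclusive."""
    return MIN_RATIO <= ratio <= MAX_RATIO


def pixel_size(ratio: Fraction, long_side: int = LONG_SIDE_PX) -> tuple[int, int]:
    """Return (width, height) in pixels with the longer side fixed at long_side."""
    if not is_supported(ratio):
        raise ValueError(f"aspect ratio {ratio} is outside the 1:3 to 3:1 range")
    if ratio >= 1:
        # Landscape or square: width is the long side.
        return long_side, round(long_side / ratio)
    # Portrait: height is the long side.
    return round(long_side * ratio), long_side
```

For example, `pixel_size(parse_ratio("16:9"))` gives `(2048, 1152)` under the 2048-pixel assumption, while `parse_ratio("4:1")` fails the `is_supported` check. Exact sizes accepted by the API may differ, so treat the pixel values as illustrative.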

Real-world intelligence

Images 2.0 brings a more up-to-date understanding of the world into image creation, with a knowledge cutoff of December 2025, for more relevant and contextually accurate outputs. This is especially important for artifacts like explainers, educational graphics, and visual summaries, where correctness and clarity matter just as much as aesthetics.

Its intelligence allows it to expertly handle tasks end-to-end: synthesizing information, writing the story, and laying it out with clean structure, intentional whitespace, and strong visual flow.

A visual thought partner

When a thinking model is selected in ChatGPT, the model takes more time and works more agentically behind the scenes to understand and execute the task thoroughly. It can use the web to find relevant information, transform uploaded materials into clear visual explainers, and reason through the structure of the image before generating. In this mode, Images 2.0 acts more like a visual thought partner, helping carry a project from rough concept to finished asset with significantly less work on your part.

With thinking, it can also produce multiple distinct images at once, a first for image generation in ChatGPT. That opens up workflows that were previously cumbersome: a sequence of manga pages, a set of redesign directions for every room in a house, a family of poster concepts, or a collection of social graphics in different aspect ratios and languages.

Instead of prompting one image at a time and stitching the project together yourself, you can ask for a coherent set of up to eight outputs in one go, with character and object continuity, that build on one another sequentially.
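For projects larger than one set, the work has to be divided into requests of at most eight images each. The sketch below shows one way to plan that split in plain Python; `plan_sets` is a hypothetical helper, not an OpenAI API, and only the limit of eight comes from the text above.

```python
from typing import Iterator, Sequence

# Limit taken from the text: up to eight outputs per request.
MAX_IMAGES_PER_SET = 8


def plan_sets(
    items: Sequence[str], max_per_set: int = MAX_IMAGES_PER_SET
) -> Iterator[list[str]]:
    """Yield consecutive groups of at most max_per_set items, preserving order."""
    if max_per_set < 1:
        raise ValueError("max_per_set must be at least 1")
    for start in range(0, len(items), max_per_set):
        yield list(items[start:start + max_per_set])


# Example: a 20-page manga sequence becomes three requests.
pages = [f"manga page {n}" for n in range(1, 21)]
set_sizes = [len(group) for group in plan_sets(pages)]  # [8, 8, 4]
```

Keeping the groups in order matters here because the outputs are described as building on one another. Whether continuity carries across separate requests is not stated in the announcement, so that part remains an open assumption.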

Using image generation in Codex

Images in Codex brings visual creation into one workspace for creating, iterating, and shipping apps, slide decks, and other work, making Codex more useful for broader tasks across design, marketing, product, sales, and learning & development.

For example, you can generate multiple UI directions, concepts, and prototypes, compare options quickly, and then turn the strongest ideas into live products or website experiences without leaving the Codex app. You can create images in Codex with your ChatGPT subscription without creating a separate API key.