Step1X-Edit: GPT-4o-Style Image Editing
https://www.runninghub.ai/post/1916456042962817026
We have released Step1X-Edit, a state-of-the-art image editing model whose performance rivals closed-source models such as GPT-4o and Gemini2 Flash. The model uses a multimodal LLM to process the reference image together with the user's editing instruction. The latent embedding extracted by the LLM is then passed to a diffusion image decoder, which generates the target image. To train the model, we built a data generation pipeline that produces a high-quality dataset. For evaluation, we developed GEdit-Bench, a new benchmark rooted in real user instructions. Experimental results on GEdit-Bench show that Step1X-Edit significantly outperforms existing open-source baselines and approaches the performance of leading proprietary models, making a significant contribution to the field of image editing. For more details, please refer to our technical report.
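The data flow described above can be illustrated with a toy sketch. This is not the released model or its API: `encode_with_mllm`, `denoise_step`, and `edit_image` are hypothetical names, the "image" is a short list of floats, and the encoder and denoiser are trivial stand-ins. It only shows the ordering of the two stages: a conditioning embedding is computed from the reference image and the instruction, and a decoder then refines a noisy latent step by step under that condition.

```python
import random
from typing import List

Vector = List[float]


def encode_with_mllm(image: Vector, instruction: str, dim: int) -> Vector:
    """Hypothetical stand-in for the multimodal LLM.

    Maps (reference image, instruction) to a latent embedding. A real model
    would run a learned network; here we derive a deterministic pseudo-random
    vector from the inputs.
    """
    seed = sum(ord(c) for c in instruction) + int(sum(image) * 1000)
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def denoise_step(latent: Vector, condition: Vector, step_size: float) -> Vector:
    """Toy denoiser: move the latent part of the way toward the condition."""
    return [x + step_size * (c - x) for x, c in zip(latent, condition)]


def edit_image(image: Vector, instruction: str, steps: int = 20, seed: int = 0) -> Vector:
    """Toy editing loop: encode once, then iteratively refine from noise."""
    condition = encode_with_mllm(image, instruction, dim=len(image))
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in image]  # start from Gaussian noise
    for _ in range(steps):
        latent = denoise_step(latent, condition, step_size=0.5)
    return latent


if __name__ == "__main__":
    reference = [0.1, 0.5, 0.9]
    print(edit_image(reference, "make the sky pink"))
```

In this sketch the conditioning embedding is computed once and reused at every refinement step, which mirrors the description of the embedding being integrated with the decoder rather than recomputed per step.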