i1: A Simple and Fully Open Recipe for Strong Text-to-Image Models
Boya Zeng, Tianze Luo, Shu Pu, Jucheng Shen, Taiming Lu, Gabriel Sarch, Zhuang Liu
Princeton University
[arXiv][code][dataset][project page]
Introduction

We investigate the design space of text-to-image diffusion models to understand how modeling and data choices affect model capabilities. This exploration culminates in i1, a 3B-parameter model that, at 1024 resolution, performs competitively with leading open-weight models, as measured by the average percentage score across GenEval, DPG-Bench, PRISM, CVTG-2K, and LongText-Bench. We open-source our model, code, and data to support future research.
Quick Start
Install the PyTorch inference environment
conda create -n i1_torch_infer python=3.11 -y
conda activate i1_torch_infer
python -m pip install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu124
python -m pip install numpy==1.26.4 pillow tqdm transformers==4.57.1 diffusers==0.35.1 accelerate safetensors sentencepiece
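Before running inference, it can help to confirm that the environment actually picked up the pinned versions. The following is a minimal sketch, not part of the i1 codebase: the helper name `check_pins` and the `PINNED` table are hypothetical, and the version numbers are simply copied from the install commands above. It uses only the standard-library `importlib.metadata` module, so it runs even if a package failed to install.

```python
from importlib import metadata

# Versions copied from the pip commands above (unpinned packages are not checked).
PINNED = {
    "torch": "2.6.0",
    "numpy": "1.26.4",
    "transformers": "4.57.1",
    "diffusers": "0.35.1",
}


def check_pins(pinned=PINNED):
    """Hypothetical helper: return {package: (expected, installed_or_None, ok)}."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # CUDA wheels carry a local suffix such as "2.6.0+cu124";
        # compare only the part before the "+".
        ok = installed is not None and installed.split("+")[0] == expected
        report[name] = (expected, installed, ok)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in check_pins().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:<14} expected {expected:<8} installed {installed} [{status}]")
```

Run it inside the activated `i1_torch_infer` environment. A `MISMATCH` line usually means the commands were executed in a different environment or a later install replaced a pinned package.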
Generate an image with a custom prompt
git clone https://github.com/zlab-princeton/i1
cd i1/torch_inference
python generate.py \
  --prompt "Render the following text at the center of the image on a clean background: 'Flow on, river! flow with the flood-tide, and ebb with the ebb-tide! Frolic on, crested and scallop-edg'd waves!'"
More detailed instructions can be found in our codebase.
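To generate images for several prompts from Python rather than the shell, one option is to call the same script through `subprocess`. This is a sketch under stated assumptions: `build_command`, `run_prompts`, and `INFERENCE_DIR` are hypothetical names, the repository is assumed to be cloned into `./i1`, and `--prompt` is the only flag used because it is the only one shown above. Where the script writes its output is not stated here, so the sketch does not try to collect the images.

```python
import subprocess
import sys
from pathlib import Path

# Assumption: the repository was cloned into the current working directory.
INFERENCE_DIR = Path("i1") / "torch_inference"


def build_command(prompt):
    """Build the argument list for one generate.py call.

    Passing a list instead of a shell string means quotes and '!' inside
    the prompt reach the script unchanged, with no shell escaping needed.
    """
    return [sys.executable, "generate.py", "--prompt", prompt]


def run_prompts(prompts, workdir=INFERENCE_DIR):
    """Run generate.py once per prompt; raise if any call exits non-zero."""
    for prompt in prompts:
        subprocess.run(build_command(prompt), cwd=workdir, check=True)
```

For example, `run_prompts(["A red bicycle leaning against a brick wall"])` runs the script once from `i1/torch_inference`. Each call starts a fresh process, so this approach suits a handful of prompts; for larger batches, the codebase's own instructions are the better guide.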
Citation
If this model or the i1 recipe is useful for your research, please cite the following work:
@article{zeng2026i1,
title={i1: A Simple and Fully Open Recipe for Strong Text-to-Image Models},
author={Zeng, Boya and Luo, Tianze and Pu, Shu and Shen, Jucheng and Lu, Taiming and Sarch, Gabriel and Liu, Zhuang},
journal={arXiv preprint arXiv:2606.11289},
year={2026}
}