New Paper: Unified Text-Image Generation with Weakness-Targeted Post-Training
We post-train multimodal models to unify text and image generation in a single inference call, enabling the model to automatically transition between reasoning about an image and generating it.
Our weakness-targeted synthetic dataset and reward-function analysis lead to significant text-to-image performance improvements over the base model.
January 2026
New Paper: Multi-Modal Language Models as Text-to-Image Model Evaluators
We present a text-to-image (T2I) model evaluation method leveraging vision-language models as evaluator agents that generate image prompts and judge the generated images.
Our method's T2I model rankings match existing benchmarks' rankings while using 80× fewer prompts, and they achieve higher correlations with human judgments.
April 2025
"Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression" accepted at ECCV 2024
Paper on the text-to-image foundation model used across various Meta platforms.
July 2024
Joined the Generative AI Team at Meta
Joined the team that founded generative AI efforts at Meta, working on text-to-image diffusion models and their product applications.
The New York Times wrote about our team here.
December 2022