Yuheng Li

Hi! I am a research scientist at Adobe Research. Before joining Adobe, I received my PhD in Computer Science from the University of Wisconsin–Madison in 2024, under the supervision of Prof. Yong Jae Lee.

I am broadly interested in two main areas: (1) personalized AI systems that leverage user memory to create more tailored and context-aware experiences, and (2) controllable generative models, with a recent focus on unified multimodal models and video generation.

Feel free to contact me for collaboration.

Research

Personalized AI Systems

Building AI systems that learn personal visual concepts and maintain long-term memory for tailored, user-specific experiences.

  • CamRoll: A hierarchical-memory agent for personal camera rolls.
  • VisualMem: A hybrid visual–text memory architecture for personal visual memory.
  • PEARL: A personalized streaming video understanding model with persistent user memory.
  • Yo'Chameleon / Yo'LLaVA: Embedding user-defined visual concepts into multimodal LLMs for personalized recognition and generation.

Controllable Generative Models

Building controllable generative models with spatially grounded image synthesis and seamless multimodal integration.

  • GLIGEN: Grounded text-to-image generation conditioned on open-set spatial inputs including boxes, keypoints, and reference images.
  • X-Fusion: Grafting visual understanding and generation onto frozen LLMs, preserving language capabilities while enabling multimodal reasoning.
  • UniTemp: Autoregressive distillation for video generation in any temporal order, enabling flexible non-sequential video synthesis.

Publications

·
CamRoll: Personal AI Agent for Camera Roll VQA
Thao Nguyen, Krishna Kumar Singh, Donghyun Kim, Yong Jae Lee*, Yuheng Li*  (*equal advising)
arXiv, 2026
·
VisualMem: Personal Visual Memory from Explicit and Implicit Evidence
Viet Nguyen, Thao Nguyen, Vishal M. Patel*, Yuheng Li*  (*equal advising)
arXiv, 2026
·
PEARL: Personalized Streaming Video Understanding Model
Yuanhong Zheng, Ruichuan An, Xiaopeng Lin, Yuxing Liu, Sihan Yang, Huanyu Zhang, Haodong Li, Qintong Zhang, Renrui Zhang, Guopeng Li, Yifan Zhang, Yuheng Li*, Wentao Zhang*  (*equal advising)
arXiv, 2026
·
Group Diffusion: Enhancing Image Generation by Unlocking Cross-Sample Collaboration
Sicheng Mo, Thao Nguyen, Richard Zhang, Nick Kolkin, Siddharth Srinivasan Iyer, Eli Shechtman, Krishna Kumar Singh, Yong Jae Lee, Bolei Zhou, Yuheng Li
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2026
·
Relational Visual Similarity
Thao Nguyen, Sicheng Mo, Krishna Kumar Singh, Yilin Wang, Jing Shi, Nicholas Kolkin, Eli Shechtman, Yong Jae Lee*, Yuheng Li*  (*equal advising)
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2026
·
Learning an Image Editing Model without Image Editing Pairs
Nupur Kumari, Sheng-Yu Wang, Nanxuan Zhao, Yotam Nitzan, Yuheng Li, Krishna Kumar Singh, Richard Zhang, Eli Shechtman, Jun-Yan Zhu, Xun Huang
International Conference on Learning Representations (ICLR), 2026
·
Yo'Chameleon: Personalized Vision and Language Generation
Thao Nguyen, Krishna Kumar Singh, Jing Shi, Trung Bui, Yong Jae Lee*, Yuheng Li*  (*equal advising)
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2025
·
X-Fusion: Introducing New Modality to Frozen Large Language Models
Sicheng Mo, Thao Nguyen, Xun Huang, Siddharth Srinivasan Iyer, Yijun Li, Yuchen Liu, Abhishek Tandon, Eli Shechtman, Krishna Kumar Singh, Yong Jae Lee, Bolei Zhou, Yuheng Li
IEEE International Conference on Computer Vision (ICCV), 2025
·
Generating, Fast and Slow: Scalable Parallel Video Generation with Video Interface Networks
Bhishma Dedhia, David Bourgin, Krishna Kumar Singh, Yuheng Li, Yan Kang, Zhan Xu, Niraj K. Jha, Yuchen Liu
IEEE International Conference on Computer Vision (ICCV), 2025
·
Beyond Simple Edits: X-Planner for Complex Instruction-Based Image Editing
Chun-Hsiao Yeh, Yilin Wang, Nanxuan Zhao, Richard Zhang, Yuheng Li, Yi Ma, Krishna Kumar Singh
AAAI Conference on Artificial Intelligence (AAAI), 2025
·
Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding
Mu Cai, Zeyi Huang, Yuheng Li, Haohan Wang, Yong Jae Lee
IEEE Winter Conference on Applications of Computer Vision (WACV), 2025
·
Yo'LLaVA: Your Personalized Language and Vision Assistant
Thao Nguyen, Haotian Liu, Yuheng Li, Mu Cai, Utkarsh Ojha, Yong Jae Lee
Neural Information Processing Systems (NeurIPS), 2024
·
Removing Distributional Discrepancies in Captions Improves Image-Text Alignment
Yuheng Li, Haotian Liu, Mu Cai, Yijun Li, Eli Shechtman, Zhe Lin, Yong Jae Lee, Krishna Kumar Singh
European Conference on Computer Vision (ECCV), 2024
·
Edit One for All: Interactive Batch Image Editing
Thao Nguyen, Utkarsh Ojha, Yuheng Li, Haotian Liu, Yong Jae Lee
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
·
Improved Baselines with Visual Instruction Tuning (LLaVA-1.5)
Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
·
GLIGEN: Open-Set Grounded Text-to-Image Generation
Yuheng Li, Haotian Liu, Qingyang Wu, Fangzhou Mu, Jianwei Yang, Jianfeng Gao, Chunyuan Li, Yong Jae Lee
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
·
Visual Instruction Inversion: Image Editing via Visual Prompting
Thao Nguyen, Yuheng Li, Utkarsh Ojha, Yong Jae Lee
Neural Information Processing Systems (NeurIPS), 2023
·
What Knowledge Gets Distilled in Knowledge Distillation?
Utkarsh Ojha*, Yuheng Li*, Anirudh Sundara Rajan*, Yingyu Liang, Yong Jae Lee  (*equal contribution)
Neural Information Processing Systems (NeurIPS), 2023
·
Towards Universal Fake Image Detectors that Generalize Across Generative Models
Utkarsh Ojha*, Yuheng Li*, Yong Jae Lee  (*equal contribution)
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
·
Generate Anything Anywhere in Any Scene
Yuheng Li, Haotian Liu, Yangming Wen, Yong Jae Lee
arXiv, 2023
·
Contrastive Learning for Diverse Disentangled Foreground Generation
Yuheng Li, Yijun Li, Jingwan Lu, Eli Shechtman, Yong Jae Lee, Krishna Kumar Singh
European Conference on Computer Vision (ECCV), 2022
·
GIRAFFE HD: A High-Resolution 3D-aware Generative Model
Yang Xue, Yuheng Li, Krishna Kumar Singh, Yong Jae Lee
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
·
Delving Deeper into Anti-aliasing in ConvNets
Xueyan Zou, Fanyi Xiao, Zhiding Yu, Yuheng Li, Yong Jae Lee
International Journal of Computer Vision (IJCV), 2022
·
Collaging Class-specific GANs for Semantic Image Synthesis
Yuheng Li, Yijun Li, Jingwan Lu, Eli Shechtman, Yong Jae Lee, Krishna Kumar Singh
IEEE International Conference on Computer Vision (ICCV), 2021
·
PartGAN: Unsupervised Part Decomposition for Image Generation and Segmentation
Yuheng Li, Krishna Kumar Singh, Yong Jae Lee
British Machine Vision Conference (BMVC), 2021
·
MixNMatch: Multifactor Disentanglement and Encoding for Conditional Image Generation
Yuheng Li, Krishna Kumar Singh, Utkarsh Ojha, Yong Jae Lee
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020