About me
I work on pre-training at Anthropic. Previously, I co-founded Voyage AI, where I led research to build the industry's best embedding models and rerankers for semantic search and information retrieval. I completed my PhD at Stanford University, where I was affiliated with the Stanford AI Lab and the Stanford NLP Group. My research interests broadly lie in synthetic data, multimodal learning, and reasoning.
News
- 2025: New paper on Synthetic Bootstrapped Pretraining, the first synthetic pre-training method that does not rely on teacher distillation.
- 2025: MoCa scales multimodal embedding models with unlabeled interleaved multimodal data.
Language Models
- Synthetic Bootstrapped Pretraining
  ICLR 2026 · Twitter
- MoCa: Modality-aware Continual Pre-training Makes Better Bidirectional Multimodal Embeddings
  ACL 2026 · Oral · Code · Twitter
- Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
  ICLR 2024 · Twitter
- Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
  ICLR 2024 · Code · Twitter
- Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
  ICML 2023 · Oral · Code · Twitter
- Self-supervised Learning is More Robust to Dataset Imbalance
  ICLR 2022 · Spotlight · Code · Twitter
Domain Adaptation and Transfer Learning
- Cycle Self-Training for Domain Adaptation
  NeurIPS 2021 · Code
- Learning to Adapt to Evolving Domains
  NeurIPS 2020
- Meta-learning Transferable Representations with a Single Target Domain
  arXiv 2011.01418
- Towards Understanding the Transferability of Deep Representations
  arXiv 1909.12031
- Transferable Adversarial Training: A General Approach to Adapting Deep Classifiers
  ICML 2019 · Long Talk · Code
- Separate to Adapt: Open Set Domain Adaptation via Progressive Separation
  CVPR 2019 · Code
