1. Academic Summary
David Blei is a leading researcher in machine learning, Bayesian statistics, and probabilistic modeling, best known for pioneering topic models and latent Dirichlet allocation (LDA), a foundational method for discovering thematic structure in large text corpora.
His research spans variational inference, Bayesian nonparametrics, and large-scale data modeling, with applications to text, networks, images, and scientific data.
2. Education
B.S., Computer Science and Mathematics, Brown University, 1997.
Ph.D., Computer Science, University of California, Berkeley, 2004.
Doctoral Thesis: Probabilistic Models of Text and Images (Advisor: Michael I. Jordan).
3. Professional Experience
Professor of Statistics and Computer Science, Columbia University (2014–present).
Associate Professor, Department of Computer Science, Princeton University (2011–2014).
Assistant Professor, Department of Computer Science, Princeton University (2006–2011).
4. Research Interests
Bayesian statistics and probabilistic machine learning
Topic models and latent variable models
Variational inference and scalable Bayesian computation
Machine learning applications in text mining, recommendation systems, neuroscience, and computational social science
5. Honors & Awards
ACM Prize in Computing (then named the ACM-Infosys Foundation Award), 2013.
Fellow of the Association for Computing Machinery (ACM), 2015.
Guggenheim Fellowship, 2017.
Simons Investigator Award, 2019.
ACM-AAAI Allen Newell Award, 2023.
Presidential Early Career Award for Scientists and Engineers (PECASE), Office of Naval Research Young Investigator Award, and Sloan Research Fellowship.
6. Selected Contributions
Co-developer, with Andrew Ng and Michael I. Jordan, of latent Dirichlet allocation (LDA), a generative probabilistic model widely used to discover topics in large text collections.
Innovations in stochastic variational inference, enabling scalable Bayesian learning on massive datasets.
Extensive work on probabilistic graphical models, Bayesian nonparametrics, and causal inference in machine learning.