David Blei

1. Academic Summary



David Blei is a leading researcher in machine learning, Bayesian statistics, and probabilistic modeling, best known for his pioneering work on topic models, in particular latent Dirichlet allocation (LDA), a foundational method for discovering thematic structure in large text corpora.


His influential research spans variational inference, Bayesian nonparametrics, and large-scale data modeling, with applications across text, networks, images, and scientific data. 



2. Education


  • B.S., Computer Science and Mathematics, Brown University, 1997. 

  • Ph.D., Computer Science, University of California, Berkeley, 2004. 



Doctoral Thesis: Probabilistic Models of Text and Images (Advisor: Michael I. Jordan). 



3. Professional Experience


  • Professor of Statistics and Computer Science, Columbia University (2014–present). 

  • Associate Professor, Department of Computer Science, Princeton University (2011–2014). 

  • Assistant Professor, Department of Computer Science, Princeton University (2006–2011). 




4. Research Interests


  • Bayesian statistics and probabilistic machine learning 

  • Topic models and latent variable models 

  • Variational inference and scalable Bayesian computation 

  • Machine learning applications in text mining, recommendation systems, neuroscience, and computational social science 




5. Honors & Awards


  • ACM-Infosys Foundation Award (now the ACM Prize in Computing), 2013.

  • Fellow of the Association for Computing Machinery (ACM), 2015. 

  • Guggenheim Fellowship, 2017. 

  • Simons Investigator Award, 2019. 

  • ACM-AAAI Allen Newell Award, 2023.

  • Earlier honors include the Presidential Early Career Award for Scientists and Engineers (PECASE), the Office of Naval Research Young Investigator Award, and a Sloan Research Fellowship.




6. Selected Contributions


  • Co-developer, with Andrew Ng and Michael I. Jordan, of latent Dirichlet allocation (LDA), a generative statistical model widely used for discovering topics in large text collections.

  • Innovations in stochastic variational inference, enabling scalable Bayesian learning on massive datasets. 

  • Extensive work on probabilistic graphical models, Bayesian nonparametrics, and causal inference in machine learning.