Event
Mathematical Biology seminar: "Making Bayesian phylogenetics like training a neural network"
Erick Matsen (Fred Hutchinson Cancer Research Center)
Bayesian posterior distributions on phylogenetic trees remain difficult to sample despite decades of effort. Because tree models combine discrete structure (the topology) with continuous structure (the branch lengths), recent inference methods developed for Euclidean space are not easily applicable to the phylogenetic case. Thus, we are left with random-walk Markov chain Monte Carlo (MCMC) with uninformed tree modification proposals; these traverse tree space slowly because phylogenetic posteriors are concentrated on a small fraction of the very many possible trees.
In this talk, I will describe our wild adventure developing efficient alternatives to random-walk MCMC, which has concluded successfully with the development of a variational Bayes formulation of Bayesian phylogenetics. This formulation leverages a "factorization" of phylogenetic posterior distributions that we show is rich enough to capture the shape of posteriors inferred from real data. Our proof-of-concept implementation of variational inference using this method gives very promising results, and I will describe our ongoing efforts to develop an efficient implementation that integrates with modern modeling frameworks.
This line of work was started by Cheng Zhang (now faculty at Peking University) when he was in my group; it is being continued by Michael Karcher, Seong-Hwan Jun, Andy Magee, and Mathieu Fourment (University of Technology Sydney).