School Seminar Series: Statistically Consistent Estimation and Efficient Inference for Natural Language Parsing

Statistically Consistent Estimation and Efficient Inference for Natural Language Parsing
By Shay Cohen, University of Edinburgh.

Abstract:
In the past few years, there has been increased interest in the machine learning community in spectral algorithms for estimating models with latent variables. Examples include algorithms for estimating mixtures of Gaussians or the parameters of a hidden Markov model.

The EM algorithm has been the mainstay for estimation with latent variables, but because it is only guaranteed to converge to a local maximum of the likelihood, it is not a consistent estimator. Spectral algorithms, on the other hand, are often shown to be consistent. They are also often more computationally efficient than EM.
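As a rough illustration of EM's sensitivity to initialization (not taken from the talk), the sketch below runs EM on a toy one-dimensional mixture of two Gaussians. The unit variances, the synthetic data, and the function names `log_likelihood` and `em` are all assumptions made for this example. A symmetric starting point leaves both components identical forever, while a slightly asymmetric one recovers the two clusters.

```python
import math

def log_likelihood(data, weights, means):
    """Log-likelihood of 1-D data under a mixture of unit-variance Gaussians."""
    norm = math.sqrt(2.0 * math.pi)
    return sum(
        math.log(sum(w * math.exp(-0.5 * (x - m) ** 2) / norm
                     for w, m in zip(weights, means)))
        for x in data
    )

def em(data, init_means, n_iter=50):
    """EM for the mixing weights and means (variances fixed at 1, an assumption)."""
    means = list(init_means)
    k = len(means)
    weights = [1.0 / k] * k
    history = [log_likelihood(data, weights, means)]
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            scores = [w * math.exp(-0.5 * (x - m) ** 2)
                      for w, m in zip(weights, means)]
            z = sum(scores)
            resp.append([s / z for s in scores])
        # M-step: re-estimate weights and means from the responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
        history.append(log_likelihood(data, weights, means))
    return weights, means, history

# Synthetic data: two clusters centred at -2 and 3.
data = [-2.0 + 0.2 * i for i in range(-5, 6)] + [3.0 + 0.2 * i for i in range(-5, 6)]

# Asymmetric start: EM separates the two clusters.
_, good_means, good_history = em(data, [-1.0, 1.0])
# Symmetric start: both components stay identical, so EM never separates them.
_, stuck_means, stuck_history = em(data, [0.0, 0.0])

print("good start :", good_means, good_history[-1])
print("stuck start:", stuck_means, stuck_history[-1])
```

Both runs increase the likelihood at every iteration, yet they end at very different solutions, which is why EM is usually run from several starting points in practice.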

In this talk, I will present two types of spectral algorithms for latent-variable PCFGs (L-PCFGs), a model widely used in the NLP community for parsing. One algorithm is for consistent estimation of L-PCFGs, and the other is for efficient inference with L-PCFGs (or PCFGs). Both algorithms are based on a linear-algebraic formulation of L-PCFGs and PCFGs.
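To give a flavour of what a linear-algebraic view of a PCFG can look like, here is a minimal sketch of the standard inside algorithm for a grammar in Chomsky normal form, written as tensor contractions. It is not necessarily the formulation used in the talk. The toy grammar, the tensor `T` (binary rule probabilities), the matrix `Q` (lexical rule probabilities), and the helper names are all hypothetical. In an L-PCFG each nonterminal would additionally carry a vector over latent states.

```python
def contract(T, left, right):
    """Contract the rule tensor with two child vectors:
    out[a] = sum over b, c of T[a][b][c] * left[b] * right[c]."""
    n = len(T)
    return [sum(T[a][b][c] * left[b] * right[c]
                for b in range(n) for c in range(n))
            for a in range(n)]

def inside(words, T, Q, vocab):
    """Return the inside vector for the whole sentence (one entry per nonterminal)."""
    n, m = len(words), len(T)
    alpha = {}
    # Width-1 spans: read a column of the lexical matrix.
    for i, w in enumerate(words):
        alpha[i, i] = [Q[a][vocab[w]] for a in range(m)]
    # Wider spans: sum tensor contractions over all split points.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width - 1
            total = [0.0] * m
            for k in range(i, j):
                part = contract(T, alpha[i, k], alpha[k + 1, j])
                total = [t + p for t, p in zip(total, part)]
            alpha[i, j] = total
    return alpha[0, n - 1]

# Hypothetical toy grammar: S -> A B (1.0), B -> A B (0.4), A -> a (1.0), B -> b (0.6).
S, A, B = 0, 1, 2
T = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
T[S][A][B] = 1.0
T[B][A][B] = 0.4
vocab = {"a": 0, "b": 1}
Q = [[0.0, 0.0],   # S has no lexical rules
     [1.0, 0.0],   # A -> a
     [0.0, 0.6]]   # B -> b

p_ab = inside(["a", "b"], T, Q, vocab)[S]
p_aab = inside(["a", "a", "b"], T, Q, vocab)[S]
print(p_ab, p_aab)
```

The entry for the root symbol `S` is the total probability of the sentence under the grammar, and every step is a sum of products over the rule tensor.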

Bio:
Shay Cohen is a Chancellor’s Fellow (assistant professor) in the School of Informatics at the University of Edinburgh. Before that, he was a postdoctoral research scientist in the Department of Computer Science at Columbia University, where he held an NSF/CRA Computing Innovation Fellowship. He received his B.Sc. and M.Sc. from Tel Aviv University in 2000 and 2004, respectively, and his Ph.D. from Carnegie Mellon University in 2011. His research interests span a range of topics in natural language processing and machine learning, with a focus on structured prediction. He is especially interested in developing efficient and scalable parsing algorithms as well as learning algorithms for probabilistic grammars.

Event details

  • When: 21st January 2015 14:00 - 15:00
  • Where: Cole 1.33a
  • Series: School Seminar Series
  • Format: Talk