NO.036 Discrete Algorithms Meet Machine Learning
August 10 - 13, 2013 (Check-in: August 9, 2013 )
- Hal Daumé III
- University of Maryland, USA
- Kevin Duh
- Nara Institute of Science & Technology, Japan
- Samir Khuller
- University of Maryland, USA
Structured prediction is the task of learning a function that maps inputs to structured outputs, and forms the heart of problems in natural language processing. Syntactic analysis, word alignment for machine translation, semantic understanding, action/goal recognition and translation itself are all examples of structured prediction tasks. In all these tasks, the goal at prediction time is to produce a combinatorial structure, typically by exploiting an off-the-shelf or hand-rolled combinatorial optimization algorithm. The fact that a combinatorial optimization algorithm consumes the output of a machine learning system in order to make predictions renders the learning problem difficult: one must reason about statements like “if I change my model in such a way, how will this affect the matching that the Hungarian algorithm (for weighted matchings) would return?”
Our goal is to solve such questions not only for specific problems such as graphs but also to develop a robust framework in the language of modern combinatorial algorithms by generalizing from spanning trees to matroids, or matchings to matroid intersections. All of these frameworks are polynomial-time for making predictions, and our goal is to construct similar polynomial-time learning procedures.
Although studying polynomial time solvable problems are useful for some specific natural language processing problems, we also wish to turn our attention to approximation algorithms for NP-complete prediction tasks (such as natural language generation or machine translation). So far, such problems have been the bane of structured prediction, requiring either the theoretically ungrounded use of approximate prediction within learning, or efficient but lower-quality search-based solutions with heavy pruning.
The key idea we wish to pursue is that NP-complete problems are characterized by efficient verification. We propose to push this further to efficient separation at learning time (which appears to work for all polynomial-time algorithms we have looked at thus far). If this is possible, then we can explicitly train models to give correct solutions, even when a polynomial time approximation algorithm is run at test time. Such a result has the potential to change how the natural language processing community thinks about learning in computationally hard problems.
From the algorithmic perspective, for decades, our basic assumption has been that the input data is sacrosanct and essentially all basic optimization algorithms make this assumption. The problems we envision solving, now question this basic assumption, since we have to modify the input data, while minimizing the norm of the perturbation, so as to satisfy certain properties. We hope this meeting will unite the large sub-community in natural language processing community that deals with combinatorial prediction problems with the algorithms community that studies them.