evandrix · September 13, 2012 03:11
diff --git a/gistfile1.txt b/gistfile1.txt
 Parsing
 Klein & Manning: "Accurate Unlexicalized Parsing" (shows that lexicalization is not necessary to achieve reasonably good parsing accuracy)
 Klein & Manning: "Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency" (a revolution in unsupervised dependency parsing)
 Nivre "Deterministic Dependency Parsing of English Text" (shows that deterministic parsing actually works quite well)
 McDonald et al. "Non-Projective Dependency Parsing using Spanning-Tree Algorithms" (the other main method of dependency parsing, MST parsing)

 Machine Translation
 Knight "A statistical MT tutorial workbook" (easy to understand, use instead of the original Brown paper)
 Och "The Alignment-Template Approach to Statistical Machine Translation" (foundations of phrase-based systems)
 Wu "Inversion Transduction Grammars and the Bilingual Parsing of Parallel Corpora" (arguably the first realistic method for biparsing, which is used in many systems)
 Chiang "Hierarchical Phrase-Based Translation" (significantly improves accuracy by allowing for gappy phrases)

 Language Modeling
 Goodman "A bit of progress in language modeling" (describes just about everything related to n-gram language models)
 Teh "A Bayesian interpretation of Interpolated Kneser-Ney" (shows how to get state-of-the-art accuracy in a Bayesian framework, opening the path for other applications)

 Machine Learning for NLP
 Sutton & McCallum "An introduction to conditional random fields for relational learning" (everyone should know CRFs, and this paper is the easiest to understand)
 Knight "Bayesian Inference with Tears" (explains the general idea of Bayesian techniques quite well)
 Berg-Kirkpatrick et al. "Painless Unsupervised Learning with Features" (new at the time of writing, in 2010, and thus a bit of a gamble, but it has the potential to bring the power of discriminative methods to unsupervised learning)

 Information Extraction
 Hearst. Automatic Acquisition of Hyponyms from Large Text Corpora. COLING 1992. (The very first paper in the line of bootstrapping methods for NLP. It is a hypothetical work in the sense that it doesn't give experimental results, but it influenced its followers a lot.)
 Collins and Singer. Unsupervised Models for Named Entity Classification. EMNLP 1999. (It applies several variants of co-training-like IE methods to the NER task and explains the motivation for doing so. Students can learn from this work how to build the argument of a good NLP research paper.)

 Computational Semantics
 Gildea and Jurafsky. Automatic Labeling of Semantic Roles. Computational Linguistics 2002. (It started the trend toward semantic role labeling in NLP and was followed by several CoNLL shared tasks dedicated to SRL. It shows how linguistics and engineering can collaborate with each other. A shorter version appeared at ACL 2000.)
 Pantel and Lin. Discovering Word Senses from Text. KDD 2002. (Supervised WSD was explored a lot in the early 2000s thanks to the Senseval workshops, but few systems actually benefit from WSD because manually crafted sense mappings are hard to obtain. These days we see a lot of evidence that unsupervised clustering improves NLP tasks such as NER, parsing, and SRL, and this work is one of the roots of unsupervised clustering of words.)

 Automatic Text Summarization
 J. Clarke and M. Lapata. Modeling Compression with Discourse Constraints. EMNLP-CoNLL 2007. (shows the importance of joint inference)
 K. Knight and D. Marcu. Summarization beyond sentence extraction. Artificial Intelligence 139, 2002. (opens the door to statistical approaches to sentence compression)
 R. McDonald. A Study of Global Inference Algorithms in Multi-Document Summarization. ECIR 2007. (formulates the summarization task as a global optimization problem using integer linear programming)
 W. Yih et al. Multi-Document Summarization by Maximizing Informative Content-Words. IJCAI 2007. (introduces stack decoding to this field)