50.040 Natural Language Processing

Natural Language Processing (NLP) is an important area within the broader field of Artificial Intelligence (AI). Modern NLP systems rely on machine learning algorithms to solve a wide range of text processing problems. This course covers fundamental topics in NLP, including word representations, language modelling, sequence-to-sequence models, attention, and transformer architectures. Students will also explore advanced methods in the modern paradigm of large language models (LLMs), such as pre-training, fine-tuning, and efficient adaptation strategies. The course emphasizes both fundamental algorithms and state-of-the-art techniques. Students will gain hands-on experience implementing fundamental algorithms and experimenting with advanced deep learning–based NLP techniques, culminating in projects that explore real-world applications of LLMs.

Pre-requisite/Co-requisite
  1. 50.007 Machine Learning or 40.319 Statistical & Machine Learning, and
  2. A good foundation in: 1) programming, 2) design and analysis of algorithms, and 3) mathematics, including linear algebra, calculus, optimization, probability, and statistics.
Learning objectives

By the end of the course, students will be able to:

  1. Explain the fundamental tasks within NLP
  2. Explain possible algorithms as solutions to NLP tasks
  3. Implement the algorithms used for various NLP tasks
  4. Design novel algorithms for solving new NLP tasks, and use existing NLP technologies for solving real problems
Measurable outcomes
  1. Explain the major tasks within NLP that involve supervised structured prediction
  2. Explain the major tasks within NLP that involve unsupervised learning
  3. Apply the appropriate models for each task
  4. Apply the major guiding principles when choosing a model for a specific task within NLP
  5. Decide when to use, and when not to use, neural-network-based or deep learning methods for a specific task within NLP
  6. Design and implement fundamental algorithms used for various NLP tasks
  7. Analyse the time complexity of a specific NLP algorithm
  8. Evaluate the performance of an NLP model using standard evaluation metrics and datasets
Topics covered
  • Introduction to NLP Tasks
  • Recap on Machine Learning & Neural Networks
  • Word Vectors
  • Language Models & RNNs
  • Seq2Seq Models & Attention
  • Transformer Models
  • Pre-training
  • Post-training (SFT/RLHF)
  • Efficient Adaptation (Prompting/LoRA)
  • Benchmarking & Evaluation
  • LLM Agents
  • Advanced Topics (e.g., multilinguality)
Textbook(s) and/or other required material

Students are required to read relevant chapters from:

  • Dan Jurafsky and James H. Martin, Speech and Language Processing (3rd ed. draft, August 24, 2025 release)
  • Yoav Goldberg, Neural Network Methods for Natural Language Processing, 2017
Course instructor(s)

Prof Zhang Wenxuan