50.040 Natural Language Processing

Course Description

Natural Language Processing (NLP) is an important area within the general field of Artificial Intelligence (AI). Modern NLP models rely on machine learning algorithms to solve a wide range of text processing problems. This course covers fundamental topics in NLP, including part-of-speech tagging, word embeddings, chunking, syntactic parsing, semantic role labeling, semantic parsing, named entity recognition, sentiment analysis, text generation, summarization, and machine translation. Students will learn fundamental algorithms as well as state-of-the-art deep-learning-based techniques for NLP, and will have the opportunity to implement and experiment with advanced NLP algorithms and models.

Pre-requisite/Co-requisite
  1. 50.007 Machine Learning/40.319 Statistical & Machine Learning and
  2. A good foundation in (1) programming, (2) the design and analysis of algorithms, and (3) mathematics, including linear algebra, calculus, optimization, probability, and statistics.

Learning Objectives

By the end of the course, students will be able to:

  1. Explain the fundamental tasks within NLP
  2. Explain possible algorithms as solutions to NLP tasks
  3. Implement the algorithms used for various NLP tasks
  4. Design novel algorithms for solving new NLP tasks, and use existing NLP technologies for solving real problems

Measurable Outcomes
  1. Explain the major tasks within NLP that involve supervised structured prediction
  2. Explain the major tasks within NLP that involve unsupervised learning
  3. Apply the relevant models that need to be used for each task
  4. Apply the major guiding principles when choosing a model for a specific task within NLP
  5. Decide when, and when not, to use neural-network-based or deep learning methods for a specific task within NLP
  6. Design and implement fundamental algorithms used for various NLP tasks
  7. Analyze the time complexity of a specific NLP algorithm
  8. Evaluate the performance of an NLP model using appropriate evaluation metrics on standard datasets

Topics Covered
  • Introduction, Review of ML
  • Syntactic Tagging, Word Senses and Embeddings
  • Language Modeling
  • Chunking (Shallow Syntactic Parsing)
  • Information Extraction
  • Syntactic Parsing
  • Semantic Role Labeling (Shallow Semantic Parsing)
  • Semantic Parsing
  • Sentiment Analysis
  • Text Generation
  • Machine Translation

Textbook(s) and/or Other Required Material

Students are required to read relevant chapters from:

  • Chris Manning and Hinrich Schütze, Foundations of Statistical Natural Language Processing, MIT Press, Cambridge, MA, 1999
  • Dan Jurafsky and James H. Martin, Speech and Language Processing (3rd ed. draft), 2018
  • Yoav Goldberg, Neural Network Methods for Natural Language Processing, 2017

Course Instructor(s)

Prof Lu Wei

 
