50.035 Computer Vision

This is an advanced undergraduate-level course on the concepts, algorithms, and system design of computer vision. The course focuses on the underlying computational and mathematical principles, and on data-driven and neural network (“deep learning”) approaches. It introduces computer vision tasks such as image and video classification, localization, and detection, and discusses computational algorithms for these tasks, including recent deep learning methods such as convolutional neural networks (CNN), recurrent neural networks (RNN), long short-term memory (LSTM) networks, and Generative Adversarial Networks (GAN). Students will learn to design, implement, train, and debug their own systems and neural networks, and will gain both an understanding of cutting-edge computer vision technologies and the skills to use them. A semester-long 1-D design project requires students to design, implement, and train multi-million-parameter neural networks to address real-world computer vision problems.

Prerequisite

50.007 Machine Learning or
40.319 Statistical & Machine Learning (ESD)

Learning Objectives

1. List useful real-world applications of computer vision
2. Apply and design computer vision systems and algorithms
3. Evaluate appropriate computer vision algorithms for a variety of problems

Measurable Outcomes

Design image convolution and filtering using OpenCV [LO 1,2]
Design an image recognition system using a data-driven approach and linear classification [LO 2]
Design convolutional neural networks using TensorFlow [LO 2]
Implement the training of convolutional neural networks using backpropagation and stochastic gradient descent [LO 2]
Design an image recognition system using convolutional neural networks [LO 1,2,3]
Implement the training of convolutional neural networks using GPU programming [LO 2]
Design convolutional neural networks using dropout and batch normalization [LO 2]
Design image segmentation systems using convolutional neural networks [LO 1,2,3]
Design object detection and localization systems using convolutional neural networks [LO 1,2,3]
Design video activity recognition systems using recurrent neural networks [LO 1,2,3]
Implement the training of LSTM networks using TensorFlow [LO 2]

Topics Covered

Image, Filtering, Convolution
Image Histogram
Image classification, data-driven approach, k-nearest neighbors (k-NN)
Linear Classifier
Gradient Descent
Deep Learning
Convolutional Neural Network
CNN Architectures
Backpropagation
Regularization
Object detection and segmentation
Advanced gradient descent
Generative Adversarial Networks (GAN)
Improved GAN and Applications

Course Instructor(s)

Prof Cheung Ngai-Man