50.041 Distributed Systems and Computing

Course Description

This course introduces fundamental concepts for designing and implementing large‐scale distributed systems. It covers both the design of distributed systems and the fundamental principles that ensure correctness in a distributed environment. We will apply the concepts through hands-on assignments in the Go programming language. The course will also examine specific topics in depth, e.g. designing distributed file systems (such as the Google File System) to accommodate arbitrarily many application‐level users. Finally, we will discuss how to recover from faults (both normal and Byzantine) in a distributed system.

Prerequisites

50.004 Algorithms or consultation with the instructor

Learning Objectives
  1. Design and implement a distributed system from scratch.
  2. Apply key ideas to maintain correctness in distributed systems.
  3. Learn techniques to design and develop massively parallel systems using the Go programming language.
  4. Learn techniques to design and implement a distributed file system.
  5. Learn and apply techniques to recover from faults in distributed systems.
Measurable Outcomes
  1. Build models of distributed systems [LO 1].
  2. Prototype distributed software systems [LO 1,2,3,4,5].
  3. Build distributed algorithms using an industry-strength programming language [LO 3].
  4. Build algorithms to analyse the correctness of distributed systems [LO 2].
  5. Prototype software and systems to manage files and records in a distributed environment [LO 4].
  6. Build algorithms to analyse and test possible faults in distributed systems [LO 1, 5].
  7. Build techniques to recover from faults in distributed systems [LO 5].
  8. Build techniques at the level of supervisory software to support distributed applications [LO 1,4,5].
Topics Covered
  • Introduction, Clocks, Election
  • Clocks and Election
  • Distributed Mutual Exclusion
  • Consistency Models
  • Distributed File Systems
  • Fault Tolerance 1
  • Fault Tolerance 2
  • Byzantine Faults
Textbook(s) and/or Other Required Material

Recommended Book
Andrew S. Tanenbaum and Maarten van Steen. Distributed Systems: Principles and Paradigms. Pearson International Edition (Chapters 6, 7, 8, 11 and 13).

 

Programming Language
The Go programming language will be used: https://golang.org/

 

There is no fixed textbook or set of literature that covers all of the course material. The course material will be drawn from selected literature, especially during the latter half of the course.

Course Instructor(s)

Prof Sudipta Chattopadhyay