B649 Home Page

B649
Topics in Distributed Systems

Instructor

Scott Stoller

Hours

Lectures: Tuesday & Thursday, 2:30 PM - 3:45 PM. Ballantine Hall 233.
Note: The listing in the printed "Spring 1997 Schedule of Classes" is obsolete.
The Registrar's on-line Schedule of Classes is correct.

Office Hours: Tuesday & Thursday, 3:45 PM - 4:45 PM, or by appointment, or by chance. Lindley Hall 201D.

Course Description

This course discusses the core problems of synchronization (consistency), fault-tolerance, and security that arise in the design of distributed systems. The course will explore concepts, techniques, and protocols that can provide a foundation for robust distributed applications.

Textbooks

The primary textbook (hereafter called "CDK") is: Here are Errata to the fourth impression.

Optional secondary references are:

The lack of errata for these two books should not be interpreted to mean that they lack errors.

These books will be supplemented with articles as needed.

Tentative Syllabus

Here's a tentative syllabus. Normally there are 30 lectures in the spring semester, but we have missed the first 2. For each topic, I have indicated the corresponding chapters (or sections) of the primary textbook, if any.
LecturesTopicReading
1 Introduction 1-2
2-3 Models of Distributed Systems and Communication 3-5
4 Name Services 9
5-6 Distributed File Systems 7,8 (or Mullender, ch. 14)
7-8 Logical and physical time; Global snapshots 10.3, System Models (or Mullender, ch. 4)
9-12 Coordination: Leader Election 10.4, Garcia-Molina, Lynch
13-15 Active Replication: Group Commun, Gossip Kaashoek & Tanenbaum, 18.5, 11
16-20 Atomic Transactions 12, 14, 15.1-15.2. Babaoglu & Toueg
21-22 Passive Replication: Primary-backup Mullender, ch. 8
23-28 Security 16

Chapters 3-5 discuss the basics of networks and interprocess communication. P536 and B538 together cover most of that material, so I will discuss it only briefly. If you are not already familiar with that material, don't despair: just start reading those chapters on your own. Of course, you should feel free to ask questions on those chapters or any other topic at any point during the course.

Project

The project will not be "one size fits all". Students will be allowed (and expected!) to do a project of their choice. Some suggestions will be made; students are encouraged to suggest their own projects as well. Projects may be theoretical or implementation-oriented (or both!). Students may work alone or in teams of 2 or 3 on the project.

A list of suggestions for possible projects will be distributed in early March. If you have ideas (even vague ideas) for a project earlier in the semester, please come discuss them with me.

A proposal describing your proposed project is due on Monday, March 10. Each proposal should specify the name(s) of the student(s) on the team. During the week of March 10, I will meet with each team, to discuss its proposal.

The project is due on Saturday, May 3.

Grading

Grades will be based on homework assignments (mostly problem sets, possibly some programming), the project, and an exam. Weights are as follows:
Project 45%
Homework 40%
Final exam 15%

Grading Statistics

ItemMeanStddev
Homework 1 (out of 5)4.01.2
Homework 2 (out of 5)4.50.7
Homework 3 (out of 10)100
Homework 4 (out of 15)12.23.5
Homework 5 (out of 15)14.24.4
Homework 6 (out of 5)3.71.9
Homework 7 (out of 10)8.12.8
Final Exam (out of 55)34.17.7


Weekly Announcements


Week of May 5

Plot of scores on final exam:

[scores on final exam]

Solutions to Final Exam: LaTeX, compressed PostScript.

Week of April 28

Homework: Here are Solutions to Homework 7.

Final Exam: A few Practice Problems for Final Exam are available. The final exam will be longer and cover more topics than these practice problems. Nevertheless, these problems can be a useful part of your preparations for the exam. Solutions to these problems will not be distributed. When you have tried your best, if you are unsure of whether some of your answers are correct, you are welcome to ask each other, or ask me.

Lecture is cancelled for Tue, April 29, since I will be out of town.

Week of April 21

On-line Course Evalation Form

Homework: Homework 7 is due on Thu, May 1.

Project: The last round of project meetings will be held May 4-8 (instead of the week of April 28), so I will have an opportunity to read the report describing your project before the last meeting. If this schedule change inconveniences anyone, please let me as soon as possible, and we will make alternative arrangements.

Final Exam: 5PM-7PM on Thursday, May 8, in the usual place. The final exam covers the entire course, though it does not contain questions specifically about name services, leader election, or the gossip architecture. During the exam, you may use only your notes, the required and optional textbooks, and handouts distributed during lecture or available via the course web page.

Grading: The tentative weights listed at the beginning of the semester have been finalized, with a small amount shifted from the project to the final exam (see above table).

Week of April 14

Reading: Chapter 16 of CDK, on security. I have decided to abort the discussion of the part-time parliament protocol, which we started briefly on April 10, since I think continuing it would leave too little time to talk about security.

Week of April 7

Reading: Primary-backup is discussed briefly in Section 15.4 of CDK. It is covered in more detail in Chapter 8 of Mullender. Sorry, I don't have a good on-line reference for this material. You can copy Chapter 8 of Mullender, if you like.

Week of March 31

Homework: Here are Solutions to Homework 6.

Reading: Recovery for atomic transactions is covered in Sections 15.1-15.2 of CDK and in the tech report by Babaoglu and Toueg (see link below).

Project: This is a reminder of what is expected for the project meetings this week. For non-implementation projects, you should be prepared to summarize 2 to 4 (depending on their length) papers that you have carefully read. Bring copies of those papers to the meeting! Also, bring copies (or at least names) of the papers you plan to read next (i.e., during the following 2 weeks). For implementation projects, you should be prepared to describe the design and functionality of your proposed system and the initial steps of the implementation. Also, be prepared to describe the next step in your implementation effort (i.e., what will be accomplished during the following two weeks).

Week of March 24

Homework: Homework 6 is due on Sat, Apr 5.

Erratum: There was a minor flaw in the description of the recovery protocol in lecture on Mar 27. I said that a participant records <VOTE, v> in its log (on stable storage) when it sends <VOTE, v> to the coordinator. I should have added that the coordinator records <VOTE, v> in its log at some appropriate point, e.g., immediately before or after sending the VOTE_REQ messages.

Questionaire: If you did not fill out the Mid-course Questionnaire in class on Mar 27, I would appreciate it if you would print it, fill it out, and put it in my mailbox.

Reading: Chapter 12, as an introduction to transactions. Sections 14.1-14.3 on atomic commitment, including Sections 15.1-15.2 on recovery. The presentation of atomic commitment in lecture is based on Understanding Non-Blocking Atomic Commitment, by Babaoglu and Toueg. That work is also described in Chapter 6 of Mullender, though in abbreviated form (specifically, Chapter 6 of Mullender lacks the discussion of recovery).

Week of March 17

[Spring Break] Spring Break!

Week of March 10

Reading: Chapter 11 (if you didn't finish it last week). Start reading Chapter 12, as an introduction to transactions.

Week of March 3

Homework: Brief Solutions to Homework 5.

Reading: Chapter 11 of CDK, covering group communication and gossip.

Project: Here is a more specific suggestion for a group-communication implementation project: implement and measure performance of a variant of the broadcast protocol used in Amoeba, in which (when r>0) instead of the sequencer receiving ack's and broadcasting accept for each message, other processes take turns doing this. This helps prevent the sequencer from being a bottleneck (cf. page 34 of the paper we discussed on Tuesday).

Project: Here is a more specific suggestion for a security-related implementation project: read and evaluate the papers on partially-authenticated protocols for distributed agreement by Malte Borcherding. Estimate the performance for varying degrees of authentication. Implement the protocols and measure the performance.

Project: Here is a handout with details about the course project.

Reading: Efficient Reliable Group Communication for Distributed Systems, by M. Frans Kaashoek and Andrew S. Tanenbaum. This paper describes the protocols used in the group communication system of Amoeba, a distributed OS (see Section 18.5 of CDK for an overview of Amoeba).

Week of February 24

Handout: The bully-alg-PFD handout has been updated again (and is available via the links below). The timestamp in the title of the new version is Thu Feb 27 14:06:55. This version was distributed in lecture on Feb 27. Naturally, when doing Homework 5, you should refer to this version, not the version of Wed Feb 26.

Homework: Homework 5 is due on Friday, Mar 7. Note: Problem 4 was revised at 13:37 on Feb 27.

Handout: The handout distributed in lecture on Feb 25 (entitled bully-alg-PFD Tue Feb 25 14:07:28) contains errors. A revised version is now available (ASCII, PostScript). I apologize for the confusion (in the handout and the lecture).

Homework: Brief Solutions to Homework 4.

Homework: To save ink, I no longer repeat the homework policy at the top of each assignment, but it hasn't changed: You are free to discuss homework assignments with everyone, but you must write and submit your own work.

Week of February 17

Reading: Handout excerpted from Distributed Algorithms, by Nancy Lynch. This was distributed in class on Feb 18.

Reading: "Elections in a Distributed Computing System", by Hector Garcia-Molina. This was distributed in class on Feb 18.

Week of February 10

Homework: Brief Solutions to Homework 3.

Homework: Homework 4 is due on Friday, Feb 21.

Reading: Chapter 4 of Mullender contains (almost) all of the material that will be discussed this week, so if you decided to purchase Mullender, I suggest reading it. Chapter 10 of CDK discusses a small part of the material; a significant fraction of the rest is discussed in CDK's draft material on System Models.

Week of February 3

Homework: Solution to Homework 2.

Homework: Solution to Homework 1. There were "typos" in the definitions of initialize and run_handler. They were fixed at 5:30PM on Feb 4. If you printed the solution before that, please re-print it.

Homework: Homework 3 is due on Thursday, Feb 13. Homework 3 was revised on Sunday, Feb 2, at 12:23PM. If you printed it before that, please re-print it.

Reading: Chapter 8 of CDK, or Chapter 14 of the book edited by Mullender. You might find Chapter 7 of CDK helpful as an introduction.

Reminders: Homework 1 is due on Monday, Feb 3. Homework 2 is due on Friday, Feb 7.

Week of January 27

Erratum to lecture of Jan 30: In DNS, when the local name server does not have a record for the domain name dn that needs to be translated, it looks in its database for records of type "NS" that contain domain names that are prefixes of dn (I say "prefixes" assuming the domain names are processed right-to-left; otherwise, I would say "suffixes"). It chooses among these "NS" records one of those containing the longest prefix of dn and then queries the name server specified in that record. I suggested in lecture that the local name server would query one of its ancestors, but that is not necessarily so. Sorry for the confusion.

Reading: Chapter 9; Section 6.3. I recommend that you read Section 6.3, although (or because!) I won't talk about it in class. The other sections in Chapter 6 are either fluffy or irrelevant to the other topics in this course.

Clarification for Homework 2: There is no distinction between client processes and server processes; in other words, each process can act as both a client and a server. Thus, for every process, initialize is called at start-up, and threads in the process can call sendreq at any time.

Homework: Homework 2 is due on Friday, Feb 7.

Homework: Homework 1 is due on Monday, Feb 3.

Week of January 20

Reading: Chapters 4 and 5.

Reading: Chapters 1 and 2. (Recall that Chapter 3 was assigned reading last week.)

A tentative syllabus, info on grading, and more info on projects have been added above.

Week of January 13

Reading: Chapter 3.

I will be out of town this week, so the first lecture will be on Jan 21.