CSE 40771 - Distributed Systems

CSE 40771 - Distributed Systems - Spring 2023

View the Project on GitHub

Reading List for Graduate Students

The following is a selection of "classic" papers that describe the foundational principles of distributed systems, along with a selection of more recent systems that make use of these principles. Papers marked (REQ) are required readings for all graduate students this semester. The remainder of the paper are deeper references for concepts discussed in class, and are strongly suggested for any student that intends to specialize in distributed systems.

Graduate students will form reading groups of 2 students which will meet weekly to digest each paper. First, each student should read the paper once individually, take notes, and work out some examples to understand the paper. Then, gather with your partner later in the week to discuss the paper, work out an example, and write up a summary. Then, work together to proofread and edit your document so that it is logically sound, internally consistent, and flows well from one though to the next. This is an opportunity for you to flex and develop your writing muscles, which you will need to write technical papers, and eventually your dissertation.

A good summary should be one to two pages of clearly written, well organized text. I am interested in the quality of your writing, not fancy formatting, so please use 12-point Times New Roman with standard one-inch margins. If you find it helpful to include an equation or code snippet to make your point, that's perfectly fine, but then extend the prose a corresponding amount.

If the paper presents some opinion or advice (e.g. "A Note on Distributed Computing") then you should summarize the arguments made by the paper in your own words, and offer some reflection on the completeness or applicability of the advice.

Most of the other papers will instead be presenting a system or an algorithm. For those papers, you should address the following questions:

  • What is the problem that this paper is proposing to solve?
  • What is the key observation or insight into the problem?
  • What is the overall structure of the system or algorithmic approach presented as a solution?
  • How is the solution evaluated, and what are the results?
  • How does this work relate to concepts discussed in class?

Summaries are due at 11:59PM on Sundays and should be submitted as a single PDF document via the "Assignments" tab in Sakai. Both partners must contribute to the discussion, writing, and editing of the document, and should be named as authors on the submission. One student should submit on behalf of the group. All authors will receive the same grade.

Advice and Lessons Learned

  • (REQ) Jim Waldo, Geoff Wyant, Ann Wollrath, and Sam Kendall. "A Note on Distributed Computing", Technical Report. PDF
  • Butler Lampson, "Hints for computer system design", Proceedings of SOSP, 1983. DOI
  • Saltzer, Reed, and Clark, "End-to-End Arguments in System Design", ACM Transactions on Computer Systems, 1984. PDF
  • Eric A. Brewer, Lessons from Giant-Scale Services, IEEE Internet Computing. Vol. 5, No. 4. pp. 46-55. July/August 2001. DOI
  • (REQ) McSherry, Frank, Michael Isard, and Derek G. Murray. "Scalability! But at what COST?" 15th Workshop on Hot Topics in Operating Systems (HotOS XV), 2015. PDF

Remote Procedure Call

  • Birrel and Nelson, "Implementing Remote Procedure Calls", ACM Transactions on Computer Systems, 1984. PDF
  • Andrew Tannenbaum and Robert van Renesse, "A Critique of the Remote Procedure Call Paradigm", (unclear: possibly a technical report?), PDF
  • Philip Bogle and Barbara Liskov, "Reducing Cross Domain Call Overhead Using Batched Futures", Proceedings of OOPSALA 1994, DOI
  • Sivathanu, Muthian, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. "Evolving RPC for active storage." Proceedings of the 10th international conference on Architectural support for programming languages and operating systems. 2002. DOI
  • Soumagne, Jerome, et al. "Mercury: Enabling remote procedure call for high-performance computing." 2013 IEEE International Conference on Cluster Computing (CLUSTER). IEEE, 2013. DOI
  • W. Zhang, V Fang, A Panda, S. Shenker, "Kappa: A programming framework for serverless computing." Proceedings of the 11th ACM Symposium on Cloud Computing. 2020. DOI

Robustness and Correctness

  • Butler Lampson and Howard Sturgis, "Crash Recovery in a Distributed Data Storage System", Technical Report, Xerox PARC, 1979. PDF
  • Richard Schlichting and Fred Schneider, "Fail-Stop Processors: an approach to designing fault tolerant computing systems", ACM Transactions on Computer Systems 1;3, 1983. DOI
  • Fred Schneider, "Byzantine Generals in Action: Implement Fail-Stop Processors", ACM Transactions on Computer Systems, 2;2, May 1984. DOI
  • P. J. Leu, "Concurrent robust checkpointing and recovery in distributed systems", International Conference on Data Engineering, 1988. DOI
  • Gerard J. Holzmann, "The Model Checker SPIN", IEEE Transactions on Software Engineering, 23:5, 1997. DOI
  • Yang, Junfeng, et al. "MODIST: Transparent model checking of unmodified distributed systems." NSDI'09. 2009.

Time and Ordering

  • Leslie Lamport, "Time, Clocks, and Ordering of Events in a Distributed System", Communications of the ACM 12:7, 1978. PDF
  • (REQ) K. Mani Chandy and Leslie Lamport, "Distributed Snapshots: Determining Global States of Distributed Systems", ACM Transactions on Computer Systems, 3:1, 1985. PDF
  • Rob Strom and Shaula Yemini, "Optimistic recovery in distributed systems", ACM Transactions on Computer Systems, Aug 1985. DOI
  • D. Jefferson, B. Beckman, F. Wieland, L. Blume, and M. DiLoreto. "Time warp operating system." In Proceedings of the eleventh ACM Symposium on Operating systems principles, pp. 77-93. 1987. PDF
  • D. L. Mills, "Internet time synchronization: The network time protocol", IEEE Transactions on Communication, 39:10. DOI
  • Jones, Terry, et al. "An evaluation of the state of time synchronization on leadership class supercomputers." Concurrency and Computation: Practice and Experience 30.4 (2018). DOI

Consistency and Replication

  • (REQ) C. Gray and D. Cheriton, "Leases: An Efficient Fault Tolerant Mechanism for Distributed File Cache Consistency", ACM Symposium on Operating Systems Principles, 1989. DOI
  • Ladin, Liskov, Shrira, and Ghemaway, "Providing High Availability with Lazy Replication", ACM TOCS 10:4, 1992. DOI
  • Renesse and Schneider, "Chain Replication for Supporting High Throughput and Availability", USENIX Symposium on Operating System Design and Implementation, 2004. PDF
  • (REQ) M. Balakrishnan et al., "Tango: Distributed Data Structures over a Shared Log", SOSP 2013. DOI.
  • (REQ) S. Gilbert and N. Lynch, "Perspectives on the CAP Theorem," in Computer, vol. 45, no. 2, pp. 30-36, Feb. 2012. PDF
  • (REQ) Joseph Hellerstein and Peter Alvaro, "Keeping CALM: when distributed consistency is easy", Communications of the ACM, September 2020, Vol. 63 No. 9. DOI.

Consensus

  • Hector Garcia-Molina, "Elections in a Distributed Computing System", IEEE Transactions on Computers, volume C-31, number 1, January 1982. DOI
  • Glenn Ricart and Ashok Agrawala, "An Optimal Algorithm Election for Mutual Exclusion in Computer Networks", Communications of the ACM, volume 24, number 1, January 1981. DOI
  • K. Mani Chandi, Jayadev Misra, Laura Hass, "Distributed Deadlock Detection", ACM Transactions on Computer Systems 1:2, 1983, DOI
  • (REQ) Michael Fischer, Nancy Lynch, Michael Paterson, "Impossibility of Distributed Consensus with One Faulty Process", Journal of the ACM, volume 32, issue 2, April 1985. DOI
  • (REQ) Leslie Lamport, Robert Shostak, and Marshall Pease, "The Byzantine Generals Problem", ACM Transactions on Programming Languages and Systems, volume 4, number 3, July 1982. PDF
  • Miguel Castro and Barbara Liskov, "Practical Byzantine Fault Tolerance", Proceedings of OSDI, February 1999. PDF
  • Leslie Lamport, "Paxos Made Simple", ACM SIGACT News, November 2001. PDF
  • (REQ) Diego Ongaro and John Osterhout, "(RAFT) In Search of an Understandable Consensus Algorithm", USENIX Annual Technical Conference, June 2014. PDF

Security and Trust

  • Roger Needham and Michael Schroeder,"Using encryption for authentication in large networks of computers", Communications of the ACM, 1978. DOI
  • Steiner, Neuman, and Schiller, Kerberos: An authentication service for open network systems, USENIX Winter Conference, 1988. PDF
  • (REQ) Satoshi Nakamoto, Bitcoin: A Peer-to-Peer Electronic Cash System, 2009. PDF

Case Studies in Distributed Systems

  • (REQ) Ion Stoica, Robert Morris, David Karger, M. Frans Kasshoek, Hari Balakrishnan, "Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications", ACM SIGCOMM Review, volume 31, issue 4, October 2001. DOI
  • S. Ghemawat, H. Gobioff, S. Leung, "The Google File System", SOSP 2003. PDF
  • Douglas Thain, Todd Tannenbaum, and Miron Livny, "Distributed Computing in Practice: The Condor Experience", Concurrency: Practice and Experience, 2004. DOI
  • Petros Maniatis, Mema Roussopoulos, T. J. Giuli, David S. H. Rosenthal, and Mary Baker, "The LOCKSS peer-to-peer digital preservation system", ACM Transactions on Computer Systems, 23:1, 2005. DOI
  • Sage Weil, Scott Brandy, Ethan Miller, Darrell Long, and Carlos Maltzahn, "Ceph: A scalable, high-performance distributed file system.", Proceedings of USENIX OSDI, 2006. HTML
  • Jeffrey Dean and Sanjay Ghemawat, "MapReduce: Simplified Data Processing on Large Cluster", Communications of the ACM, volume 51, issue 1, January 2008. DOI
  • Christopher Moretti, Hoang Bui, Karen Hollingsworth, Brandon Rich, Patrick Flynn, and Douglas Thain, "All-Pairs: An Abstraction for Data Intensive Computing on Campus Grids", IEEE Transactions on Parallel and Distributed Systems, 21(1), pages 33-46, January, 2010. DOI.
  • Hunt, Patrick, et al. "ZooKeeper: Wait-free Coordination for Internet-scale Systems." USENIX annual technical conference. Vol. 8. No. 9. 2010.
  • Matei Zaharia, Mosharaf Chowdhury, Michael Franklin, Scott Shenker, and Ion Stoica, "Spark: Cluster Computing with Working Sets", Proceedings of USENIX HotCloud, 2010. PDF
  • A. Lakshman and P. Malik, "Cassandra: A Decentralized Structured Storage System", ACM SIGOPS Operating Systems Review, 2010. PDF
  • Kay Osterhout, Patrick Wendell, Matei Zaharia, Ion Stoica, "Sparrow: Distributed, Low Latency Scheduling", Proceedings of SOSP 2013. DOI
  • (REQ) Philippe Dobbelaere and Kyumars Sheykh Esmaili, "Kafka versus RabbitMQ: A comparative study of two industry reference publish/subscribe implementations: Industry Paper", Proceedings of ACM Distributed and Event-based Systems (DEBS), 2017. DOI
  • M. Rocklin, "Dask: Parallel Computing with Blocked Algorithms and Task Scheduling", Proceedings of Python in Science Conference (SciPy), 2015.
  • Yadu Babuji, Anna Woodard, Zhuozhao Li, Daniel S. Katz, Ben Clifford, Rohan Kumar, Lukasz Lacinski, Ryan Chard, Justin M. Wozniak, Ian Foster, Michael Wilde, and Kyle Chard. 2019. "Parsl: Pervasive Parallel Programming in Python". In Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing (HPDC '19). Association for Computing Machinery, New York, NY, USA, 25–36. DOI.
  • M. Elhemali et al, "Amazon DynamoDB: A Scalable, Predictable Performance, and Fully Managed NoSQL Database Service", USENIX ATC 2022. PDF
  • C.Tang et al, "Twine: A Unified Cluster Management System for Shared Infrastructure", OSDI 2020. PDF