I'm putting this list together as a reading plan for myself, to learn more about general cluster scheduling and utilization and the various ways of programming generically against clusters. The entries are direct links to PDFs, in an order that I think makes some sense from skimming the reference sections.
Happy to hear of any additions that might be sensible.
- Google File System since everything references it and data locality is a thing.
- Google MapReduce because it's one of the earlier well-known functional approaches to programming against a cluster.
- Dryad for a more general programming model (jobs are arbitrary acyclic dataflow graphs rather than a fixed map/reduce shape).
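- Quincy for a different take on scheduling: fairness and data locality cast as a min-cost flow problem.
- Delay Scheduling for another approach to scheduling: a job briefly waits for a slot near its data instead of running somewhere non-local.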
- DryadLINQ for a higher-level approach.
- Pregel for graph processing.
- MapReduce Online for a pipelined MapReduce that supports online aggregation and continuous queries.
- Distributed GraphLab for an approach that apparently embraces asynchrony.