@heathermiller
Created September 25, 2016 13:05
CS4240
CS 6240: Parallel Data Processing
This course covers techniques for managing and analyzing very large data sets,
with an emphasis on approaches that scale out effectively as more compute nodes
are added. Principles of distributed data management and strategies for
problem-driven data partitioning are introduced through a selection of design
patterns from various application domains, including graph analysis, databases,
text processing, and data mining. Coursework includes hands-on programming
experience with modern big-data processing technologies such as MapReduce, Spark,
HBase, and cloud computing. (This selection is subject to change as technology
evolves.)
Pre-reqs: CS 5800 (Algorithms) or CS 7800 (Advanced Algorithms)