Created
March 25, 2018 19:35
Open Data engineering
| <html> | |
| <head> | |
| <style type="text/css"> | |
| @import url('https://fonts.googleapis.com/css?family=Noto+Sans:400,700'); | |
| body{ | |
| background: #f0f0f0; | |
| font-family: 'Noto Sans', sans-serif; | |
| } | |
| /* The ribbons */ | |
| .corner-ribbon{ | |
| width: 200px; | |
| background: #e43; | |
| position: absolute; | |
| top: 25px; | |
| left: -50px; | |
| text-align: center; | |
| line-height: 50px; | |
| letter-spacing: 1px; | |
| color: #f0f0f0; | |
| transform: rotate(-45deg); | |
| -webkit-transform: rotate(-45deg); | |
| } | |
| /* Custom styles */ | |
| .corner-ribbon.sticky{ | |
| position: fixed; | |
| } | |
| .corner-ribbon.shadow{ | |
| box-shadow: 0 0 3px rgba(0,0,0,.3); | |
| } | |
| /* Different positions */ | |
| .corner-ribbon.top-left{ | |
| top: 25px; | |
| left: -50px; | |
| transform: rotate(-45deg); | |
| -webkit-transform: rotate(-45deg); | |
| } | |
| .corner-ribbon.top-right{ | |
| top: 25px; | |
| right: -50px; | |
| left: auto; | |
| transform: rotate(45deg); | |
| -webkit-transform: rotate(45deg); | |
| } | |
| .corner-ribbon.bottom-left{ | |
| top: auto; | |
| bottom: 25px; | |
| left: -50px; | |
| transform: rotate(45deg); | |
| -webkit-transform: rotate(45deg); | |
| } | |
| .corner-ribbon.bottom-right{ | |
| top: auto; | |
| right: -50px; | |
| bottom: 25px; | |
| left: auto; | |
| transform: rotate(-45deg); | |
| -webkit-transform: rotate(-45deg); | |
| } | |
| /* Colors */ | |
| .corner-ribbon.white{background: #f0f0f0; color: #555;} | |
| .corner-ribbon.black{background: #333;} | |
| .corner-ribbon.grey{background: #999;} | |
| .corner-ribbon.blue{background: #39d;} | |
| .corner-ribbon.green{background: #2c7;} | |
| .corner-ribbon.turquoise{background: #1b9;} | |
| .corner-ribbon.purple{background: #95b;} | |
| .corner-ribbon.red{background: #e43;} | |
| .corner-ribbon.orange{background: #e82;} | |
| .corner-ribbon.yellow{background: #ec0;} | |
| div.ui-menu li { | |
| list-style:none; | |
| background-image:none; | |
| background-repeat:no-repeat; | |
| background-position:0; | |
| } | |
| ul | |
| { | |
| list-style-type:none; | |
| padding:0px; | |
| margin:0px; | |
| } | |
| li | |
| { | |
| background-image:url(sqpurple.gif); | |
| background-repeat:no-repeat; | |
| background-position:0px 5px; | |
| padding-left:14px; | |
| } | |
| ul li.main{ | |
| margin-left: 20px; | |
| } | |
| ul li.sub{ | |
| margin-left: 35px; | |
| } | |
| </style> | |
| </head> | |
| <body> | |
| <div class="corner-ribbon top-left red shadow">Under Review</div> | |
| <center> <h1> Open Data Engineering </h1> </center> | |
| <center> <h2> An endeavour to train industry-ready data engineers - <i> by Prithiviraj Damodaran </i> </h2> </center> | |
| <center> Except where otherwise noted, this website is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by/3.0/deed.en_US">Creative Commons Attribution 3.0 Unported License</a>. </center> | |
| <h2> <font color =" blue"> Table of contents </font> </h2> | |
| <div class="ui-menu"> | |
| <ul> | |
| <li class="main"> <b> Part I - Introduction: The Big data story </b> </li> | |
| <li class="sub"> A brief timeline of events </li> | |
| <li class="sub"> Distributed systems: Basics </li> | |
| <li class="main"> <b> Part II - Hadoop 1.0 </b> </li> | |
| <li class="sub"> Doug Cutting’s brainchild</li> | |
| <li class="sub"> Hands-on: MR and HDFS Basics </li> | |
| <li class="main"> <b> Part III - Hadoop 2.0 </b> </li> | |
| <li class="sub"> YARN and the current state of Hadoop </li> | |
| <li class="sub"> Hands-on: MR and HDFS Advanced </li> | |
| <li class="sub"> Distributed systems: A Deep dive </li> | |
| <li class="main"> <b> Part IV - Data Integration </b></li> | |
| <li class="sub"> Principles of Data Integration</li> | |
| <li class="sub"> Fundamentals of Kafka SDP</li> | |
| <li class="sub"> Data Integration with Kafka SDP - Hands-on </li> | |
| <li class="sub"> Case study: Data integration</li> | |
| <li class="main"><b> Part V - Data storage and NoSQL </b> </li> | |
| <li class="sub"> NoSQL and storage fundamentals </li> | |
| <li class="sub"> NoSQL hands-on</li> | |
| <li class="sub"> Case study: Modelling for real-time data</li> | |
| <li class="main"> <b> Part VI - Data Processing </b> </li> | |
| <li class="sub"> Principles of Data processing and Spark basics</li> | |
| <li class="sub"> Spark basics hands-on </li> | |
| <li class="sub"> Advanced Spark </li> | |
| <li class="sub"> Advanced Spark: hands-on</li> | |
| <li class="sub"> Spark Streaming</li> | |
| <li class="sub"> Case study: Streaming and Batch processing</li> | |
| <li class="main"> <b> Part VII - Data Access and querying </b> </li> | |
| <li class="sub"> Enabling analytics </li> | |
| <li class="sub"> Hive on Spark </li> | |
| <li class="sub"> Apache Drill </li> | |
| <li class="sub"> Case study: BI and Viz </li> | |
| <li class="main"> <b> Part VIII - Capstone project </b> </li> | |
| </ul> | |
| </div> | |
| <h2> <font color =" blue"> Part I </font> </h2> | |
| <div> | |
| <ol> | |
| <li> <b> What led to the rise of “Big data”: converging technology trends </b></li> | |
| <ul> | |
| <li class="sub"> Web 2.0 </li> | |
| <li class="sub"> Commoditization of Hardware</li> | |
| <li class="sub"> Mobile revolution </li> | |
| <li class="sub"> Open-source movement </li> | |
| <li class="sub"> IoT </li> | |
| </ul> | |
| <li> <b> How Google set the “Big data” ball rolling </b> </li> | |
| <ul> | |
| <li class="sub"> Why Google wanted to overhaul their web crawlers with new processing techniques </li> | |
| <li class="sub"> Enter MapReduce: how it contrasts with older data-parallel paradigms like MPPs, and why it is a giant leap in parallel | |
| processing </li> | |
| <li class="sub"> 2004: release of the Google MapReduce paper, the first big data milestone, and the chain of events it set in motion </li> | |
| <li class="sub"> Project Nutch: the second milestone. Doug Cutting and Mike Cafarella built the first open-source implementation of MapReduce and GFS, | |
| which was later spun out as Hadoop with backing from Yahoo </li> | |
| </ul> | |
| <li> <b> The idea of quasi-structured and unstructured data vs structured data </b> </li> | |
| <li> <b> Change in status quo: Why distributed systems and parallel processing are the new normal </b> </li> | |
| <li> <b> The open-source wave: complex distributed data frameworks become accessible to all </b> </li> | |
| <ul> | |
| <li class="sub"> The other frameworks that shaped the big data world as we know it today </li> | |
| <li class="sub"> Companies that made big data technology accessible </li> | |
| </ul> | |
| </ol> | |
| </div> | |
| </body> | |
| </html> | |