@PrithivirajDamodaran
Created March 25, 2018 19:35
<html>
<head>
<style type="text/css">
@import url('https://fonts.googleapis.com/css?family=Noto+Sans:400,700');
body{
background: #f0f0f0;
font-family: 'Noto Sans', sans-serif;
}
/* The ribbons */
.corner-ribbon{
width: 200px;
background: #e43;
position: absolute;
top: 25px;
left: -50px;
text-align: center;
line-height: 50px;
letter-spacing: 1px;
color: #f0f0f0;
transform: rotate(-45deg);
-webkit-transform: rotate(-45deg);
}
/* Custom styles */
.corner-ribbon.sticky{
position: fixed;
}
.corner-ribbon.shadow{
box-shadow: 0 0 3px rgba(0,0,0,.3);
}
/* Different positions */
.corner-ribbon.top-left{
top: 25px;
left: -50px;
transform: rotate(-45deg);
-webkit-transform: rotate(-45deg);
}
.corner-ribbon.top-right{
top: 25px;
right: -50px;
left: auto;
transform: rotate(45deg);
-webkit-transform: rotate(45deg);
}
.corner-ribbon.bottom-left{
top: auto;
bottom: 25px;
left: -50px;
transform: rotate(45deg);
-webkit-transform: rotate(45deg);
}
.corner-ribbon.bottom-right{
top: auto;
right: -50px;
bottom: 25px;
left: auto;
transform: rotate(-45deg);
-webkit-transform: rotate(-45deg);
}
/* Colors */
.corner-ribbon.white{background: #f0f0f0; color: #555;}
.corner-ribbon.black{background: #333;}
.corner-ribbon.grey{background: #999;}
.corner-ribbon.blue{background: #39d;}
.corner-ribbon.green{background: #2c7;}
.corner-ribbon.turquoise{background: #1b9;}
.corner-ribbon.purple{background: #95b;}
.corner-ribbon.red{background: #e43;}
.corner-ribbon.orange{background: #e82;}
.corner-ribbon.yellow{background: #ec0;}
div.ui-menu li {
list-style:none;
background-image:none;
background-repeat:no-repeat;
background-position:0;
}
ul
{
list-style-type:none;
padding:0px;
margin:0px;
}
li
{
background-image:url(sqpurple.gif);
background-repeat:no-repeat;
background-position:0px 5px;
padding-left:14px;
}
ul li.main{
margin-left: 20px;
}
ul li.sub{
margin-left: 35px;
}
</style>
</head>
<body>
<div class="corner-ribbon top-left red shadow">Under Review</div>
<center> <h1> Open Data Engineering </h1> </center>
<center> <h2> An endeavour to train industry-ready data engineers - <i> by Prithiviraj Damodaran </i> </h2> </center>
<center> Except where otherwise noted, this website is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by/3.0/deed.en_US">Creative Commons Attribution 3.0 Unported License</a>. </center>
<h2> <font color="blue"> Table of contents </font> </h2>
<div class="ui-menu">
<ul>
<li class="main"> <b> Part I - Introduction: The Big data story </b> </li>
<li class="sub"> A brief timeline of events </li>
<li class="sub"> Distributed systems: Basics </li>
<li class="main"> <b> Part II - Hadoop 1.0 </b> </li>
<li class="sub"> Doug Cutting’s brainchild </li>
<li class="sub"> Hands-on: MR and HDFS Basics </li>
<li class="main"> <b> Part III - Hadoop 2.0 </b> </li>
<li class="sub"> YARN and the current state of Hadoop </li>
<li class="sub"> Hands-on: MR and HDFS Advanced </li>
<li class="sub"> Distributed systems: A Deep dive </li>
<li class="main"> <b> Part IV - Data Integration </b></li>
<li class="sub"> Principles of Data Integration</li>
<li class="sub"> Fundamentals of Kafka SDP</li>
<li class="sub"> Data Integration with Kafka SDP - Hands-on </li>
<li class="sub"> Case study: Data integration</li>
<li class="main"><b> Part V - Data storage and NoSQL </b> </li>
<li class="sub"> NoSQL and storage fundamentals </li>
<li class="sub"> NoSQL hands-on</li>
<li class="sub"> Case study: Modelling for real-time data</li>
<li class="main"> <b> Part VI - Data Processing </b> </li>
<li class="sub"> Principles of Data processing </li>
<li class="sub"> Spark basics </li>
<li class="sub"> Spark basics hands-on </li>
<li class="sub"> Advanced Spark </li>
<li class="sub"> Advanced Spark: hands-on</li>
<li class="sub"> Spark Streaming</li>
<li class="sub"> Case study: Streaming and Batch processing</li>
<li class="main"> <b> Part VII - Data Access and querying </b> </li>
<li class="sub"> Enabling analytics </li>
<li class="sub"> Hive on Spark </li>
<li class="sub"> Apache Drill </li>
<li class="sub"> Case study: BI and Viz </li>
<li class="main"> <b> Part VIII - Capstone project </b> </li>
</ul>
</div>
<h2> <font color="blue"> Part I </font> </h2>
<div>
<ol>
<li> <b> What led to the rise of “Big data”: Converging technology trends </b></li>
<ul>
<li class="sub"> Web 2.0 </li>
<li class="sub"> Commoditization of Hardware</li>
<li class="sub"> Mobile revolution </li>
<li class="sub"> Open-source movement </li>
<li class="sub"> IoT </li>
</ul>
<li> <b> How Google set the “Big data” ball rolling </b> </li>
<ul>
<li class="sub"> Why Google wanted to overhaul its web crawlers with new processing techniques </li>
<li class="sub"> Enter MapReduce: how it contrasts with older data-parallel paradigms such as MPP systems, and why it is a giant leap in parallel
processing </li>
<li class="sub"> 2004: release of Google's MapReduce paper, the first milestone in big data, and the chain of events it set in motion </li>
<li class="sub"> Project Nutch: the second milestone. Doug Cutting and Mike Cafarella built an open-source implementation of MapReduce and GFS,
which was later spun out as Hadoop with Yahoo's backing </li>
</ul>
<li> <b> The idea of quasi-structured and unstructured data vs structured data </b> </li>
<li> <b> Change in status quo: Why distributed systems and parallel processing are the new normal </b> </li>
<li> <b> Open-source wave: complex distributed data frameworks are accessible to all </b> </li>
<ul>
<li class="sub"> The other frameworks that shaped the big data world as we know it today </li>
<li class="sub"> Companies that made big data technology accessible </li>
</ul>
</ol>
</div>
</body>
</html>