- Best practice: write implicit conversions to types that you own
- Implicits can always be explicitly provided
Coursera courses
- https://www.coursera.org/learn/progfun1/home/welcome
The cheatsheet for this course is available at https://github.com/lampepfl/progfun-wiki/blob/gh-pages/CheatSheet.md
- https://www.coursera.org/learn/progfun2/home/welcome
The cheatsheet for this course is available at https://github.com/sjuvekar/reactive-programming-scala/blob/master/ReactiveCheatSheet.md
1) Look at the thumbs of your raised hands at your sides ... 10 times
2) Roll your eyes and blink - 5 times clockwise and 5 times anticlockwise
3) Write your name with your eyes
4) Ciliary muscle exercise - switch focus between a close object and a distant object
5) Open (inhale) and close (exhale) your eyes - 5 times
6) Massage your eyes
7) Rub your hands together and place them over your eyes
Dynamo DB notes
Dynamo uses a synthesis of well known techniques to achieve scalability and availability:
a) Data is partitioned and replicated using consistent hashing [10]
b) Consistency is facilitated by object versioning [12].
c) The consistency among replicas during updates is maintained by a quorum-like technique and a decentralized replica synchronization protocol.
d) Dynamo employs a gossip based distributed failure detection and membership protocol.
Dynamo is a completely decentralized system with minimal need for manual administration. Storage nodes can be added and removed from Dynamo without requiring any manual partitioning or redistribution.
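Point a) and the claim that nodes can join or leave without manual repartitioning can be illustrated with a small consistent-hashing ring. This is only a sketch under assumptions of my own, not Dynamo's actual implementation: the class name HashRing, the vnodes count, the use of MD5 to place keys, and the node and key names are all made up for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a ring position (MD5 is used only to spread keys evenly)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hashing ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=8):
        self.vnodes = vnodes
        self._positions = []  # sorted ring positions
        self._owner = {}      # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each physical node claims several positions ("virtual nodes") on the ring.
        for i in range(self.vnodes):
            pos = _hash(f"{node}#{i}")
            self._owner[pos] = node
            bisect.insort(self._positions, pos)

    def remove(self, node):
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def preference_list(self, key, n=3):
        """First n distinct nodes met while walking clockwise from the key."""
        if not self._positions:
            return []
        start = bisect.bisect_right(self._positions, _hash(key))
        found = []
        for step in range(len(self._positions)):
            pos = self._positions[(start + step) % len(self._positions)]
            node = self._owner[pos]
            if node not in found:
                found.append(node)
                if len(found) == n:
                    break
        return found


if __name__ == "__main__":
    ring = HashRing(["A", "B", "C", "D"])
    print(ring.preference_list("cart:42"))  # replicas that would hold this key
    ring.add("E")                           # only keys next to E's positions move
    print(ring.preference_list("cart:42"))
```

The property worth noticing: when a node is added, a key either keeps its first owner or moves to the new node. No other keys are reshuffled, which is what makes the "no manual redistribution" claim plausible.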
-------------------------------------------------------------------------
Training by Sameer Farooqui (https://www.youtube.com/watch?v=7ooZ4S7Ay6Y)
-------------------------------------------------------------------------
Schedulers
- Yarn/Mesos - you get dynamic partitioning (scaling)
- Local/Standalone - you get static partitioning (work is being done to get dynamic partitioning in at least standalone mode)
Hadoop MR vs Spark
- Spark is essentially a replacement for MR and not HDFS or Yarn.
//Install on Ubuntu 14.04
https://www.digitalocean.com/community/tutorials/how-to-install-apache-kafka-on-ubuntu-14-04
cd ~/kafka
//Start the server
nohup bin/kafka-server-start.sh config/server.properties > kafka.log 2>&1 &
//Write to a topic TutorialTopic
echo "Hello, World" | bin/kafka-console-producer.sh --broker-list localhost:9092 --topic TutorialTopic > /dev/null
@neerajgoel82
neerajgoel82 / CS--quantum-computing
Created March 5, 2017 07:56
This will capture my thoughts around quantum computing
- Searching an unordered list of n items in about sqrt(n) steps: Grover's algorithm
- Cryptography: Shor's algorithm
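To get a feel for the square-root claim for Grover's algorithm, here is a small classical simulation of the amplitudes. It is a sketch only: the function name grover_search and the chosen sizes are my own, and because it runs on an ordinary computer each iteration still touches every item. What it shows is how few oracle calls (roughly pi/4 * sqrt(n)) are needed before the marked item dominates the measurement probabilities.

```python
import math


def grover_search(n_items, marked):
    """Classically simulate Grover's algorithm on a list of real amplitudes.

    Returns the number of oracle calls made and the final probabilities.
    """
    # Start in the uniform superposition over all indices.
    amps = [1.0 / math.sqrt(n_items)] * n_items
    iterations = math.floor(math.pi / 4 * math.sqrt(n_items))
    for _ in range(iterations):
        # Oracle: flip the sign of the marked item's amplitude.
        amps[marked] = -amps[marked]
        # Diffusion: reflect every amplitude about the mean amplitude.
        mean = sum(amps) / n_items
        amps = [2 * mean - a for a in amps]
    probabilities = [a * a for a in amps]
    return iterations, probabilities


if __name__ == "__main__":
    for n in (16, 1024):
        calls, probs = grover_search(n, marked=n - 5)
        best = max(range(n), key=probs.__getitem__)
        print(n, calls, best, round(probs[best], 3))
```

A classical scan of an unordered list needs about n/2 lookups on average, so the gap between the two counts grows as the list gets larger.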
neerajgoel82 / books--data-lake-development-with-big-data
Created February 18, 2017 03:50
Summary of Data Lake development with big data
This gist is to provide a summary of the book titled "Data Lake development with big data" by Pradeep Pasupuleti
neerajgoel82 / books--the-inevitable
Last active June 16, 2021 15:02
This is the gist of the book "The Inevitable" by Kevin Kelly
Technology has evolved at a rapid pace over the last three decades. Computers went from being accessible to a few to being everywhere
and connected, and the Internet went from rarity to ubiquity. This book by Kevin Kelly describes a dozen inevitable
technological forces that have governed these changes and will continue to shape the next 30 years. He captures them as 12
verbs, such as accessing, tracking, and sharing. To be more accurate, these are not just verbs but present participles, the grammatical
form that conveys continuous action. These forces are accelerating actions: they are amplified as we change
as a society. The forces are Becoming, Cognifying, Flowing, Screening, Accessing, Sharing, Filtering, Remixing, Interacting, Tracking,
Questioning, and then Beginning.
Before we move to the actual forces, a note about change itself is in order. Here it is.
----------------------------------------------------------
neerajgoel82 / whitepaper--big-data-trends-2017
Created February 14, 2017 19:26
Properties of Data Platform
Relational databases have been a successful technology for twenty years, providing:
- persistence
- concurrency control (Multiple apps and multiple users access the DB at the same time)
- integration mechanism (This is what prevented object-oriented DBs from flourishing)
Drawbacks of Relational DBs
- Impedance mismatch (the in-memory (object) model of an application differs from the relational model on disk).
That is why ORM frameworks exist, and they cost performance
- They are not designed to run efficiently on clusters
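The impedance mismatch above is easier to see in code. In this sketch one in-memory Order object, which holds its line items as a nested list, has to be split across two tables on save and stitched back together on load. The Order and LineItem classes, the table layout, and the save/load helpers are all hypothetical, written by hand with Python's built-in sqlite3 module rather than any ORM.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class LineItem:
    sku: str
    quantity: int


@dataclass
class Order:
    order_id: int
    customer: str
    items: list = field(default_factory=list)  # nested, unlike a flat table row


def make_db():
    """Create an in-memory database with one table per level of nesting."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)")
    conn.execute("CREATE TABLE line_items (order_id INTEGER, sku TEXT, quantity INTEGER)")
    return conn


def save(conn, order):
    # One object becomes one row in orders plus one row per item in line_items.
    conn.execute("INSERT INTO orders VALUES (?, ?)", (order.order_id, order.customer))
    conn.executemany(
        "INSERT INTO line_items VALUES (?, ?, ?)",
        [(order.order_id, item.sku, item.quantity) for item in order.items],
    )


def load(conn, order_id):
    # Two queries are needed to rebuild the single object.
    (customer,) = conn.execute(
        "SELECT customer FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()
    rows = conn.execute(
        "SELECT sku, quantity FROM line_items WHERE order_id = ? ORDER BY rowid",
        (order_id,),
    ).fetchall()
    return Order(order_id, customer, [LineItem(sku, qty) for sku, qty in rows])


if __name__ == "__main__":
    conn = make_db()
    order = Order(1, "alice", [LineItem("A-1", 2), LineItem("B-7", 1)])
    save(conn, order)
    print(load(conn, 1) == order)
```

An ORM automates exactly this translation, which is where the extra queries and the performance cost mentioned above come from.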