neerajgoel82’s gists

neerajgoel82 / CS--scala

Last active May 23, 2017 14:30

	- Best practice: write implicit conversions to types that you own
	- Implicits can always be explicitly provided

	Coursera courses
	- https://www.coursera.org/learn/progfun1/home/welcome
	The cheatsheet for this course is at https://github.com/lampepfl/progfun-wiki/blob/gh-pages/CheatSheet.md

	- https://www.coursera.org/learn/progfun2/home/welcome
	The cheatsheet for this course is at https://github.com/sjuvekar/reactive-programming-scala/blob/master/ReactiveCheatSheet.md

neerajgoel82 / English--EyeExercises

Created May 17, 2017 19:31

	1) Look at the thumbs of your raised hands on either side ... 10 times
	2) Roll your eyes and blink - 5 times clockwise and anticlockwise
	3) Write your name with your eyes
	4) Ciliary muscle exercise - switch focus between a close object and a distant object
	5) Open (inhale) and close (exhale) your eyes - 5 times
	6) Massage your eyes
	7) Rub your hands together and place them over your eyes

neerajgoel82 / CS--DynamoDB

Created May 17, 2017 03:23

	Dynamo DB notes

	Dynamo uses a synthesis of well-known techniques to achieve scalability and availability:
	a) Data is partitioned and replicated using consistent hashing [10].
	b) Consistency is facilitated by object versioning [12].
	c) Consistency among replicas during updates is maintained by a quorum-like technique and a decentralized replica synchronization protocol.
	d) Dynamo employs a gossip-based distributed failure detection and membership protocol.
	Dynamo is a completely decentralized system with minimal need for manual administration. Storage nodes can be added and removed from Dynamo without requiring any manual partitioning or redistribution.
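
A minimal sketch of point (a), consistent hashing with replication, may help make the idea concrete. It is not Dynamo's implementation: the class name `HashRing`, the `vnodes` count, the use of MD5 to place keys, and the replica count `n` are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a position on the ring (MD5 is used only to spread keys)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes      # assumed number of ring positions per node
        self._positions = []      # sorted ring positions
        self._owner = {}          # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            pos = ring_position(f"{node}#{i}")
            if pos in self._owner:    # extremely unlikely collision: skip it
                continue
            bisect.insort(self._positions, pos)
            self._owner[pos] = node

    def remove_node(self, node):
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def preference_list(self, key, n=3):
        """Return up to n distinct nodes found walking clockwise from the key."""
        if not self._positions:
            return []
        start = bisect.bisect_right(self._positions, ring_position(key))
        result = []
        for step in range(len(self._positions)):
            pos = self._positions[(start + step) % len(self._positions)]
            node = self._owner[pos]
            if node not in result:
                result.append(node)
                if len(result) == n:
                    break
        return result
```

With this layout, adding a node only takes over keys that now fall just before one of its ring positions; every other key keeps its previous owner, which is why nodes can join or leave without a full redistribution.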

neerajgoel82 / CS--spark-notes

Last active February 10, 2018 22:16

	-------------------------------------------------------------------------
	Training by Sameer Farooqui (https://www.youtube.com/watch?v=7ooZ4S7Ay6Y)
	-------------------------------------------------------------------------

	Schedulers
	- Yarn/Mesos - you get dynamic partitioning (scaling)
	- Local/Standalone - you get static partitioning (work is being done to get dynamic partitioning into at least standalone mode)

	Hadoop MR vs Spark
	- Spark is essentially a replacement for MR and not HDFS or Yarn.

neerajgoel82 / CS--Kafka Useful commands

Last active April 4, 2022 17:54

	//Install on Ubuntu 14.04
	https://www.digitalocean.com/community/tutorials/how-to-install-apache-kafka-on-ubuntu-14-04

	cd ~/kafka

	//Start the server
	nohup bin/kafka-server-start.sh config/server.properties > kafka.log 2>&1 &

	//Write to a topic TutorialTopic
	echo "Hello, World" | bin/kafka-console-producer.sh --broker-list localhost:9092 --topic TutorialTopic > /dev/null

neerajgoel82 / CS--quantum-computing

Created March 5, 2017 07:56

This will capture my thoughts around quantum computing

	- Searching an unordered list of n items in O(√n) time: Grover's algorithm

	- Cryptography: Shor's algorithm
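
To see where the square-root cost of Grover's algorithm comes from, here is a small classical simulation of its amplitude updates. It is only a sketch: the function name `grover_search` is hypothetical, the simulation itself takes time proportional to n per step (so it gives no speedup), and it assumes exactly one marked item.

```python
import math


def grover_search(n_items, marked):
    """Classically simulate Grover's algorithm with one marked index.

    Returns the number of Grover iterations used and the probability of
    measuring the marked index afterwards.
    """
    # Start in the uniform superposition: every index has amplitude 1/sqrt(n).
    amps = [1.0 / math.sqrt(n_items)] * n_items
    # About (pi/4) * sqrt(n) iterations bring the marked amplitude close to 1.
    iterations = math.floor(math.pi / 4 * math.sqrt(n_items))
    for _ in range(iterations):
        amps[marked] = -amps[marked]           # oracle: flip the marked sign
        mean = sum(amps) / n_items
        amps = [2 * mean - a for a in amps]    # diffusion: invert about the mean
    return iterations, amps[marked] ** 2


if __name__ == "__main__":
    steps, prob = grover_search(1024, marked=7)
    print(steps, round(prob, 4))
```

For 1024 items the loop runs 25 times, against an expected 512 or so probes for a classical linear scan.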

neerajgoel82 / books--data-lake-development-with-big-data

Created February 18, 2017 03:50

Summary of Data Lake development with big data

This gist summarizes the book "Data Lake Development with Big Data" by Pradeep Pasupuleti

neerajgoel82 / books--the-inevitable

Last active June 16, 2021 15:02

This is the gist of the book "The Inevitable" by Kevin Kelly

	Technology has evolved at a rapid pace over the last three decades: computers went from being accessible to a few to being
	everywhere and connected, and the Internet moved from rarity to ubiquity. This book by Kevin Kelly describes a dozen inevitable
	technological forces that have governed these changes and will continue to shape the next 30 years. He captures them as 12
	verbs, such as accessing, tracking, and sharing. To be more accurate, these are not just verbs but present participles, the grammatical
	form that conveys continuous action. These forces are accelerating actions; they are being amplified as we change
	as a society. The forces are Becoming, Cognifying, Flowing, Screening, Accessing, Sharing, Filtering, Remixing, Interacting, Tracking,
	Questioning, and Beginning.

	Before moving on to the actual forces, a note on change itself is in order. So, here it is.
	----------------------------------------------------------

neerajgoel82 / whitepaper--big-data-trends-2017

Created February 14, 2017 19:26

Properties of Data Platform

	Properties of Data Platform:
	- Data should be consolidated (different sources brought together)
	- It should be fast and efficient
	- It should be approachable (discoverable, explorable, self-serve, viewable)
	- It should be secure (governance, ACLs, provenance)
	- It should support ML on top of the data, using Spark
	- Last and most important, it should be relevant and driven by business needs

	These are based on the following document
	https://drive.google.com/open?id=0B8eAsKPWNEi6M3d5Mm1qaFNPY3c

neerajgoel82 / books--nosql-distilled

Last active August 25, 2020 05:00

	Relational databases have been a successful technology for twenty years, providing
	- persistence
	- concurrency control (multiple apps and multiple users access the DB at the same time)
	- an integration mechanism (this is what prevented object-oriented DBs from flourishing)

	Drawbacks of Relational DBs
	- Impedance mismatch: the in-memory (object) model of an application differs from the (relational) model on disk.
	That's why ORM frameworks exist, and they come with a loss of performance
	- They are not designed to run efficiently on clusters
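
The impedance mismatch above can be sketched with a small example. Everything in it is hypothetical and chosen for illustration (the `Order` and `LineItem` classes, the two-table schema, the `save` and `load` helpers); it uses only Python's built-in `sqlite3` module. One in-memory object has to be split across two tables on write and stitched back together on read, which is the translation work an ORM automates.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class LineItem:
    product: str
    quantity: int


@dataclass
class Order:
    """In-memory model: one object that nests its line items."""
    order_id: int
    customer: str
    items: list = field(default_factory=list)


def make_db():
    """Relational model: the same data needs two flat tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
    conn.execute(
        "CREATE TABLE line_items ("
        "order_id INTEGER REFERENCES orders(id), product TEXT, quantity INTEGER)"
    )
    return conn


def save(conn, order):
    # One object becomes one row in orders plus one row per line item.
    conn.execute(
        "INSERT INTO orders (id, customer) VALUES (?, ?)",
        (order.order_id, order.customer),
    )
    conn.executemany(
        "INSERT INTO line_items (order_id, product, quantity) VALUES (?, ?, ?)",
        [(order.order_id, item.product, item.quantity) for item in order.items],
    )


def load(conn, order_id):
    # Reading it back takes two queries and manual reassembly.
    row = conn.execute(
        "SELECT id, customer FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    rows = conn.execute(
        "SELECT product, quantity FROM line_items WHERE order_id = ? ORDER BY rowid",
        (order_id,),
    ).fetchall()
    return Order(row[0], row[1], [LineItem(p, q) for p, q in rows])
```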

Neeraj Goel neerajgoel82