@matyix
Last active August 29, 2015 13:56
Recently I was working on a Hadoop 2.x/YARN-based Application Master and came across Apache Helix (a generic cluster management framework by LinkedIn).
YARN gives you the framework to allocate containers among nodes based on utilization (CPU, memory) and to monitor, start, and restart those containers. It leaves state management, fault tolerance, cluster expansion, throttling, replication, and partitioning to the discretion of the Application Master. This is where Helix complements YARN: it handles these concerns in a declarative way, using a finite-state machine (http://en.wikipedia.org/wiki/Finite-state_machine).
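To make "declarative, using a finite-state machine" a bit more concrete, here is a minimal sketch in plain Python. It does not use the Helix API; the state names, `Replica`, and `reconcile` are hypothetical names chosen for illustration. The idea it shows is that you declare the legal transitions and the desired state, and a controller works out which transitions to issue.

```python
from dataclasses import dataclass

# Hypothetical two-state model for illustration only (not a Helix API).
# Only these transitions are legal; anything else is rejected.
LEGAL_TRANSITIONS = {
    ("OFFLINE", "ONLINE"),
    ("ONLINE", "OFFLINE"),
}


@dataclass
class Replica:
    name: str
    state: str = "OFFLINE"

    def transition(self, target: str) -> None:
        # Refuse any move that the declared state model does not allow.
        if (self.state, target) not in LEGAL_TRANSITIONS:
            raise ValueError(f"illegal transition {self.state} -> {target}")
        self.state = target


def reconcile(replicas, desired):
    """Move each replica to its desired state; return the transitions issued.

    Replicas missing from `desired` are driven to OFFLINE.
    """
    issued = []
    for replica in replicas:
        target = desired.get(replica.name, "OFFLINE")
        if replica.state != target:
            issued.append((replica.name, replica.state, target))
            replica.transition(target)
    return issued
```

Calling `reconcile` a second time with the same desired state issues no transitions, which is the point of describing the target state rather than scripting the steps.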
I will let you know how this 'mini-project' goes, but back to the topic ...
I had a conversation with a good friend about how they built a distributed system on Apache ZooKeeper and how they struggled to model multiple distributed locks at scale. While there may be a few frameworks that achieve this, I wanted to highlight the differences between the two and the advantages of using Helix over ZooKeeper (regarding distributed locks).
https://github.com/matyix/helix/blob/master/recipes/distributed-lock-manager/README.md
For reference, check the ZooKeeper lock recipe as well: https://github.com/apache/zookeeper/tree/trunk/src/recipes/lock
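As a rough illustration of what "modelling multiple distributed locks" can look like when a controller owns the assignment, here is a toy sketch in plain Python. It is an assumption-laden simplification, not the algorithm used by Helix or by the ZooKeeper recipe: `assign_locks` is a hypothetical helper that gives every lock exactly one owner among the live nodes using naive round-robin, and it is simply recomputed when a node disappears.

```python
def assign_locks(locks, live_nodes):
    """Assign each lock to exactly one live node (naive round-robin).

    Hypothetical helper for illustration; a real system would also try
    to minimise how many locks move when membership changes.
    """
    if not live_nodes:
        return {}
    nodes = sorted(live_nodes)
    return {
        lock: nodes[i % len(nodes)]
        for i, lock in enumerate(sorted(locks))
    }


locks = [f"lock_{i}" for i in range(6)]

# Three live nodes: each ends up holding two locks.
before = assign_locks(locks, {"node-a", "node-b", "node-c"})

# node-b fails: its locks are handed to the surviving nodes.
after = assign_locks(locks, {"node-a", "node-c"})
```

The application code never competes for a lock here; it only reacts to being told which locks it holds, which is the contrast with per-lock contention that I wanted to point out.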