Memcached

Deployment

  1. Memcached Per Node, Non-Clustered
    Pilot is only aware of the memcached instance running on localhost. Caching happens per node, so the cache hit rate is the lowest of all the deployment strategies. Per-node deployments are (IMO) largely an artifact of dedicated servers with spare RAM that would otherwise go to waste, e.g. a 4GB workload running on an 8GB server.

  2. Memcached Per Node, Clustered
    Pilot is aware of each memcached instance running on each web server. The memcached client uses a hashing algorithm to decide which keys are stored on which memcached instance. The failure scenario for this strategy is interesting: when a web node goes down, you see performance degradation on the remaining web servers because of increased load (2/3 nodes available) and a drop in cache hits. See the note below about consistent hashing, and the sketch after this list for how clients map keys to servers.

  3. Dedicated Cluster
    Pilot is aware of each memcached instance running in a dedicated cluster. The biggest advantage is that web node failures don't affect the cache hit rate on the other web nodes.

  4. Dedicated Cluster behind a Proxy
    Pilot is only aware of a single proxy that handles distribution of requests to a cluster of memcached servers. This adds another layer of complexity, but it solves some of the issues with consistent hashing and the introduction of new nodes. See twemproxy for more information.
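
To make strategies 2 and 3 concrete, here's a minimal sketch of how a memcached client picks the server that owns a key. The server addresses are placeholders, and the naive modulo distribution shown here is exactly what the consistent hashing gotcha below warns against; real clients do this mapping internally.

```python
import hashlib

# Hypothetical pool: per-node instances (strategy 2) or a dedicated
# cluster (strategy 3). Addresses are placeholders.
SERVERS = ["web1:11211", "web2:11211", "web3:11211"]

def server_for(key):
    # Naive modulo distribution: hash the key, index into the server list.
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return SERVERS[digest % len(SERVERS)]

for key in ("user:42", "session:abc", "flag:new-ui"):
    print(key, "->", server_for(key))
```

Note that `server_for` depends on `len(SERVERS)`, which is why adding or removing a node reshuffles almost every key under this scheme.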

Gotchas

  • Persistence
    Memcached does not persist data to disk. When a node goes down, all of its data is lost, so it is completely unsuitable for anything that requires persistence (feature flags, sessions).
  • Consistent Hashing
    Picking a memcached client that uses consistent hashing is EXTREMELY important. Otherwise, the cache is completely wrecked as nodes become unavailable or the size of the cluster changes. Poor hashing can also lead to weird issues where one machine writes a key that can't be retrieved by other nodes, or where queries against the cluster return stale data. The sketch after this list shows the difference consistent hashing makes.
  • Adding/Removing Nodes
    Scaling a cluster up or down is kind of weird when every client knows about every server (deployment strategies 2 and 3). Chef needs to run on every client node, and there is a small window of inconsistency where some clients are aware of the node changes but others are not. It isn't a huge deal; you just end up with some inconsistent distribution of data and a higher rate of cache misses during deploys.
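
To show why this matters, here's a toy consistent hash ring with virtual nodes. It's a sketch, not any particular client's implementation, and the node addresses are the same placeholders as above. With three nodes, losing one remaps roughly a third of the keys, versus about two-thirds under the modulo scheme from the earlier sketch.

```python
import bisect
import hashlib

def h(value):
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class Ring:
    """Toy consistent hash ring with virtual nodes (replicas)."""
    def __init__(self, nodes, replicas=100):
        self.ring = sorted((h(f"{node}:{i}"), node)
                           for node in nodes for i in range(replicas))
        self.points = [point for point, _ in self.ring]

    def node_for(self, key):
        # Walk clockwise to the first virtual node at or past the key's hash.
        idx = bisect.bisect(self.points, h(key)) % len(self.ring)
        return self.ring[idx][1]

nodes = ["web1:11211", "web2:11211", "web3:11211"]
keys = [f"key:{i}" for i in range(10000)]

before = Ring(nodes)
after = Ring(nodes[:-1])  # web3 goes down

moved = sum(before.node_for(k) != after.node_for(k) for k in keys)
print(f"{moved / len(keys):.0%} of keys remapped")  # ~33%, vs ~67% with modulo
```

Keys that hashed to the surviving nodes stay put, which is what keeps the cache warm through a node failure or a deploy.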

Alternatives

Redis is another high-performance key-value store that is popular for caching. If we're looking at running logstash with multiple workers, we'll likely be using Redis already. The big differences are:

  1. It isn't distributed by default, so you avoid all the concerns about consistent hashing and adding/removing nodes. Sharding the cache can be set up if we care about it, but I don't think we need that complexity.
  2. It supports master/slave replication and the slave can be promoted to master with keepalived.
  3. It supports persistence, so it is more suitable for storing feature flags (see the sketch after this list).
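
As a sketch of differences 1 and 3, here's what a feature flag could look like against a single, non-sharded Redis instance. redis-py is just one client choice, the connection details are placeholders, and durability depends on enabling RDB or AOF persistence in redis.conf.

```python
import redis

# One instance, no sharding (difference 1). Host/port are placeholders.
r = redis.Redis(host="localhost", port=6379)

# Unlike memcached, this survives a restart if RDB/AOF persistence is
# enabled in redis.conf (difference 3).
r.set("flag:new-ui", 1)
enabled = r.get("flag:new-ui") == b"1"
print("new-ui enabled:", enabled)
```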

Cassandra is more widely used for data storage than for caching. However, it supports cross-DC replication, and we already know how to run it.
