
@mrflip
Created April 4, 2013 23:32

Performance Qualification

Identify all the reasons why a service (e.g. Elasticsearch) cannot provide acceptable performance for standard requests and the Qualifying load. The "Qualifying load" for each performance bound is roughly double the worst-case scenario for that bound across all our current clients.
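The definition of the Qualifying load can be written down as a small calculation. The sketch below is only an illustration: the client names, the peak figures, and the `safety_factor` parameter are all hypothetical, and the factor of 2 stands in for "double (or so)".

```python
def qualifying_load(worst_case_by_client, safety_factor=2.0):
    """Return safety_factor times the largest worst-case value across clients.

    worst_case_by_client maps a client name to that client's worst-case
    measurement for ONE performance bound (e.g. peak write rate in rec/s).
    """
    if not worst_case_by_client:
        raise ValueError("need at least one client measurement")
    return safety_factor * max(worst_case_by_client.values())


# Hypothetical peak write rates (records/s) per client, for illustration only.
peak_write_rps = {"client_a": 1200, "client_b": 4500, "client_c": 800}

qualifying_write_rps = qualifying_load(peak_write_rps)
print(qualifying_write_rps)
```

Each performance bound (read rate, write rate, concurrency, record size) would get its own table of per-client worst cases and its own Qualifying load.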

  • Performance
    • bandwidth (rec/s, MB/s) and latency (s) for 100B, 1kB, 100kB records
    • under read, write, read/write
    • in degraded state: a) loss of one/two servers and recovering; b) elevated packet latency + drop rate between "regions"
    • High concurrency
  • keepalive
  • bad input flood
  • restart of service; reboot of machine; stop/start of machine
  • Utilization, Saturation, Errors
    • commonly observed errors and their meaning
  • exemplars and mountweasels
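The bandwidth and latency measurements in the list above could be collected with a harness along these lines. This is a minimal sketch, not a finished benchmark: `fake_write` is a hypothetical stand-in for a real client call (an index request, a tuple emit, and so on), and the record counts are arbitrary.

```python
import statistics
import time

# Record sizes named in the checklist, in bytes.
RECORD_SIZES = {"100B": 100, "1kB": 1_000, "100kB": 100_000}


def measure(operation, record_size, n_records=1000):
    """Call operation(payload) n_records times; report throughput and latency."""
    payload = b"x" * record_size
    latencies = []
    start = time.perf_counter()
    for _ in range(n_records):
        t0 = time.perf_counter()
        operation(payload)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "rec_per_s": n_records / elapsed,
        "mb_per_s": n_records * record_size / elapsed / 1e6,
        "latency_median_s": statistics.median(latencies),
        "latency_max_s": max(latencies),
    }


def fake_write(payload):
    """Hypothetical stand-in for a real write; replace with an actual client call."""
    return len(payload)


results = {
    label: measure(fake_write, size, n_records=200)
    for label, size in RECORD_SIZES.items()
}
for label, stats in results.items():
    print(label, stats)
```

Running the same harness with a read operation, a mixed read/write operation, and again while the cluster is in a degraded state would cover the remaining rows of the list.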

Elasticsearch

  • Five queries everyone should know
    • their performance at baseline
  • Field Cache usage vs number of records
  • Write throughput in a) full-weight records; b) Cache map use case (lots of deletes on compaction)
  • Version upgrade
  • Recovery
    • plugin for recovery strategy
  • Shard assignment
    • Separate Read/write/transport boxes
    • probably only one of these node types should be masters
    • Cross-geo replication?
  • Machine sizes: m1.x vs m3.x; EBS-optimized vs not; for write nodes, c1.xl?
  • Failover and backup

Storm

  • CacheMap metrics, tuning
  • In-stream database calls
  • Can I "push"/"flush" DRPC calls?
  • What happens when I fail a tuple?
    • fail-forever / fail-retriably
    • "failure" streams
  • Tracing
    • "tracing" stream
  • Wukong shim
    • failure/error handling
    • tuple vs record
    • serialization
  • Batch size tradeoffs
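One way to think about the fail-forever / fail-retriably distinction and the "failure" stream listed above is as a routing decision made per record. The sketch below is plain Python and does not use the Storm API; the exception classes, the `handle` function, the outcome labels, and the retry limit are all hypothetical names chosen for illustration.

```python
class RetriableError(Exception):
    """A failure that may succeed on a later attempt (e.g. a timeout)."""


class PermanentError(Exception):
    """A failure that will never succeed (e.g. malformed input)."""


def handle(record, process, attempts_so_far, max_attempts=3):
    """Return (outcome, detail), where outcome is 'ack', 'retry' or 'failure_stream'."""
    try:
        return ("ack", process(record))
    except RetriableError as err:
        # Fail-retriably: replay until the attempt budget is used up.
        if attempts_so_far + 1 < max_attempts:
            return ("retry", str(err))
        return ("failure_stream", f"gave up after {max_attempts} attempts: {err}")
    except PermanentError as err:
        # Fail-forever: replaying cannot help, so route to the failure stream.
        return ("failure_stream", str(err))


def flaky(record):
    raise RetriableError("timeout")


def broken(record):
    raise PermanentError("unparseable record")


print(handle("r1", str.upper, attempts_so_far=0))
print(handle("r2", flaky, attempts_so_far=0))
print(handle("r3", flaky, attempts_so_far=2))
print(handle("r4", broken, attempts_so_far=0))
```

Whether a real topology can express this split, and what a failed tuple actually triggers, is exactly what the questions in the list are meant to establish.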