@rewinfrey
Last active August 29, 2015 14:04
Rails Large Application Tuning

Notes from Tuning Legacy Rails App: How to Make an Elephant Sprint

### Measuring performance

  • Monitor the values of specific code paths and graph them to see performance over time (response times as one example metric)
  • Automated tests that measure performance can fail based on a set threshold
    • If a given code path's response time exceeds the existing baseline by more than 20%, the automated test fails, alerting ops and devs that a recent code change has degraded performance beyond a pre-defined SLA or threshold
  • Need a production-like environment
  • Make that performance test environment exclusive to performance testing (don't let regular usage or QA usage affect the test results)
  • Using NewRelic to compare boxes against each other
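The threshold-style performance test described above can be sketched in plain Ruby (the `within_threshold?` helper, the baseline value, and the 20% margin are illustrative, not taken from the talk's actual suite):

```ruby
require "benchmark"

# Fail the build when a code path regresses more than `threshold`
# (default 20%) past its recorded baseline. In a real suite the
# baseline would come from previous runs, not a hard-coded value.
def within_threshold?(baseline_ms, threshold = 0.20)
  elapsed_ms = Benchmark.realtime { yield } * 1000.0
  elapsed_ms <= baseline_ms * (1.0 + threshold)
end

# A code path taking ~10ms against a generous 500ms baseline passes;
# the same idea, wired into CI, is what alerts ops and devs.
passed = within_threshold?(500.0) { sleep 0.01 }
```

A failing result from this check is what would page the team before the regression ships.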

### Performance Test Suite

  • JMeter, Gatling, OpenSTA, Tsung (the speaker used JMeter)
  • Recording functionality to maintain the tests
  • Validations on page access to avoid false results
  • Parameterize tests to use different data (via different users so cached queries don't throw off test results)
  • Tests can run as a distributed test suite to simulate actual user access
  • Tests run headless against the nightly build
  • Ideal: create an "ultimate" test suite based on production logs (replaying the production logs)
  • Biggest take home point: Use NewRelic
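The "replay the production logs" idea starts with extracting requests from the access logs; a minimal sketch, assuming Apache-style log lines (the `requests_from_log` helper is made up for illustration):

```ruby
# Extract the HTTP method and path from each access-log line so the
# requests can be replayed against a performance environment.
LOG_REQUEST = /"(?<method>[A-Z]+) (?<path>\S+) HTTP/

def requests_from_log(lines)
  lines.filter_map do |line|
    m = LOG_REQUEST.match(line)
    { method: m[:method], path: m[:path] } if m
  end
end
```

Each extracted `{ method:, path: }` pair can then be fed to a load tool such as JMeter to approximate real production traffic.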

### Fixing Your Legacy Application

  • Akamai (using ISPs as a CDN to cache images and common views)
  • Running Apache as the web server, with several instances of the application behind it
  • Each process connects to Moxi as a proxy layer for Memcached
  • Additionally used an in-process cache for frequently requested objects (like user data)

### Out of Band GC

  • Trigger GC only out of band (i.e., not during a normal HTTP request or during the execution of application code)
  • Available in Passenger 4
  • Increase the GC malloc limit so that GC is skipped during a single request
  • Fine-tune the max number of Passenger processes to handle concurrent requests
  • Using Ruby GC parameters to delay GC so that it triggers every 5th request rather than every request gave a significant application performance improvement
  • With out-of-band GC the memory footprint grows (they reduced the number of processes running on the application server from 30 to 20)
  • Had to find a sweet spot: allow many objects to be created during a request, but few enough that Ruby's GC isn't triggered mid-request, then collect those objects immediately after the request (done by delaying GC to every 5th request with a higher-than-normal GC_MALLOC_LIMIT)
	export RUBY_HEAP_MIN_SLOTS=3000000
	export RUBY_GC_MALLOC_LIMIT=120000000
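The "GC every 5th request, between requests" behavior can be approximated with a Rack-style middleware; this is a simplified sketch, not Passenger 4's actual implementation (which coordinates GC across processes while no request is in flight):

```ruby
# Rack-style middleware: let the request complete first, then run GC
# explicitly after every Nth response instead of mid-request.
class OutOfBandGC
  attr_reader :gc_runs

  def initialize(app, every: 5)
    @app, @every = app, every
    @requests, @gc_runs = 0, 0
  end

  def call(env)
    response = @app.call(env)
    @requests += 1
    if (@requests % @every).zero?
      GC.start        # collect between requests, not during them
      @gc_runs += 1
    end
    response
  end
end
```

In production the collection should happen after the response is flushed to the client, which is exactly what Passenger's out-of-band hook provides.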

### Fragment Caching

  • Caching of a data row in search results
  • Caching of user menu items
  • Caching of non-user-specific display snippets (mostly static drop-down elements)
  • Force the cache key to include all changeable element ids (e.g. app_user_123, where 123 is the user id)
  • May inject user-specific data to improve cache reuse
  • Use AJAX calls for links that contain user-specific data
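The cache-key rule above — embed every id the fragment depends on — can be sketched as a tiny key builder (the helper name is made up; in Rails this is what `cache` with a key array does for you):

```ruby
# Build a fragment-cache key that embeds every id the fragment depends
# on, so a change in any of them naturally produces a fresh key.
def fragment_cache_key(prefix, *ids)
  ([prefix] + ids).join("_")
end

fragment_cache_key("app_user", 123)  # => "app_user_123"
```

Because the user id is part of the key, user 124's fragment can never be served to user 123 by accident.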

### Optimize Active Record

  • In process caching of frequently used objects
  • Memcached (via Moxi proxy)
  • Preloading of associations (:includes)
  • Use id-based queries rather than object-based queries
  • Use raw SQL to optimize queries
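The win from preloading associations can be illustrated with a toy repository that counts queries (the repository and data are invented for the example; in Rails this is `Post.includes(:comments)` versus loading each post's comments in a loop):

```ruby
# Toy "database" that counts how many queries it receives, to contrast
# the N+1 pattern with a single preloading query.
class CommentRepo
  attr_reader :queries

  def initialize(comments_by_post)
    @comments_by_post, @queries = comments_by_post, 0
  end

  # N+1 pattern: one query per post id.
  def for_post(post_id)
    @queries += 1
    @comments_by_post.fetch(post_id, [])
  end

  # Preloading pattern: one query for all post ids at once.
  def for_posts(post_ids)
    @queries += 1
    @comments_by_post.slice(*post_ids)
  end
end
```

With 100 posts, the first style issues 100 queries while the second issues one — the same trade `:includes` makes for you.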

### Caching Is Not Free

  • Have to create policies for how and when to expire caches
  • Get a little automatic cache invalidation when updating an ActiveRecord object
  • Cache infrastructure (Memcached, the Moxi proxy layer, etc.) adds extra overhead in environment complexity, deployments, debugging, infrastructure cost, etc.
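A simple time-based expiration policy — like the 5-minute fragment-cache TTL mentioned in the Q&A below — can be sketched as follows (the class and its injected clock are illustrative, not the talk's code):

```ruby
# Minimal TTL cache: entries silently expire after `ttl_seconds`.
# The clock is injected as a lambda so expiry is easy to test.
class TtlCache
  def initialize(ttl_seconds, clock: -> { Time.now.to_f })
    @ttl, @clock, @store = ttl_seconds, clock, {}
  end

  def write(key, value)
    @store[key] = [value, @clock.call + @ttl]
  end

  def read(key)
    value, expires_at = @store[key]
    return nil unless value && @clock.call < expires_at
    value
  end
end
```

A blanket TTL is the cheapest expiration policy to reason about: stale data is bounded by the TTL, at the cost of some unnecessary recomputation.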

### Deployments

  • Happened every three weeks
  • Cold deployments
  • Could theoretically use blue-green deployments (one set of live production servers, one set of non-live production servers)
  • With several servers (12 in total), they monitor 4 production servers and their performance-testing server

### My Questions

  1. Why the use of Moxi proxy for talking with Memcached?

    • 2 Memcached servers serving two application servers, each running about 20 processes, resulted in a lot of network traffic
    • This network traffic caused delays in retrieving things from Memcached, which introduced latency
    • Moxi had its own cache for specific commonly requested resources (like a list of the 50 states)
    • Moxi also had policies for determining which Memcached server to hit (structuring the data in Memcached so that requests could be directed via policy through the Moxi proxy layer)
  2. How did you determine when to cache things in process?

    • Using NewRelic to determine what the most common transactions are, and identifying which of those common transactions were the most expensive
    • Using a simple hash: the requested object is first looked up in an in-process hash; on a "cache miss", the Moxi layer is queried for the object; and if the object doesn't exist in Memcached, the DB is queried
  3. What kind of expiration policy did you use for your caches?

    • Had a simple 5-minute expiration policy for everything in the fragment cache. Relied on ActiveRecord's built-in cache support when updating AR objects.
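The lookup chain described in question 2 (in-process hash, then Memcached via Moxi, then the database) can be sketched as a read-through helper; the `memcached` and `db` arguments are stand-in lambdas, not real clients:

```ruby
# Read-through lookup: check the in-process hash first, then the
# Memcached layer, and only hit the database as a last resort.
# Whatever is found is promoted into the in-process hash.
def read_through(key, local:, memcached:, db:)
  return local[key] if local.key?(key)
  value = memcached.call(key) || db.call(key)
  local[key] = value
end
```

Each layer trades freshness for latency: the in-process hash is fastest but per-process, Memcached is shared, and the DB is authoritative.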