Anivia

Anivia is Walmart's mobile analytics platform. It collects user-interaction metrics from mobile devices -- iPhone, iPad, Android, and mWeb. It also processes logging and other metrics from a number of mobile services. Anivia gives the business real-time insight and reporting into what's going on in mobile, and gives developers and ops folks vital capabilities for monitoring the health of their services.

Anivia is built on Node.js, Hapi, RabbitMQ, and a multitude of downstream systems including Splunk and Omniture. Anivia takes in 7,000 events per second on average (as of this writing), which after some fan-out and demuxing comes out to around 20,000 messages per second in flight. These rates are expected to soar leading up to and through Black Friday. The platform has grown in recent months to over 1,000 Node.js processes spanning multiple data centers, gaining features such as link resiliency along the way.

A few of Anivia's functionalities

  • Timestamp correction for misconfigured client devices (see the sketch after this list)
  • Demuxing to allow clients to send batched payloads
  • Transformation/mutation/decoration of events to make them digestible and useful to downstream systems
  • Forwarding to fan data out to any number of downstream systems that can use it
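As an illustration of the timestamp-correction step, here is a minimal sketch that shifts event timestamps by the skew between the device's reported send time and the server's receive time. The field names (`sentAt`, `events[].ts`) are hypothetical and not Anivia's actual schema.

```js
// Hypothetical sketch of client-clock skew correction.
// Field names (sentAt, events[].ts) are illustrative only.
function correctTimestamps (batch, receivedAt = Date.now()) {
  // Skew = difference between when the device says it sent the batch
  // and when the server actually received it.
  const skew = receivedAt - batch.sentAt
  return batch.events.map(event => ({
    ...event,
    ts: event.ts + skew // shift each event by the same skew
  }))
}
```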

With a few exceptions, Anivia is data agnostic and does not perform data aggregation. It relies on downstream systems (such as Splunk, Omniture, and many more) to crunch the numbers.

Anivia components

  • Elmer - Hapi web server responsible for collecting events as HTTP requests (see the sketch after this list)
  • RabbitMQ - The message bus and safe zone
  • Prospector - Responsible for consuming events, transforming them, and sending them downstream
  • Splunk - The system of record for all captured events
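A minimal sketch of what an Elmer-style collector might look like. It uses the current @hapi/hapi and amqplib APIs rather than whatever versions Anivia actually runs, and the route, queue name, and port are illustrative assumptions. The handler does no processing: it drops the raw payload onto a durable queue and immediately acknowledges the client.

```js
'use strict'
const Hapi = require('@hapi/hapi')
const amqp = require('amqplib')

const start = async () => {
  // Queue name and broker URL are assumptions for this sketch.
  const conn = await amqp.connect('amqp://localhost')
  const ch = await conn.createChannel()
  await ch.assertQueue('raw-events', { durable: true })

  const server = Hapi.server({ port: 8080 })
  server.route({
    method: 'POST',
    path: '/events',
    handler: (request, h) => {
      // Publish the raw payload untouched; all processing happens downstream.
      ch.sendToQueue('raw-events',
        Buffer.from(JSON.stringify(request.payload)),
        { persistent: true })
      return h.response().code(202) // accepted, nothing else to say
    }
  })
  await server.start()
}

start()
```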

The flow

  1. Anivia primarily receives analytics messages via Elmer. Elmer performs absolutely no processing on the messages. Its sole purpose is to get the message into RabbitMQ as quickly as possible, where the message will remain safe until delivered to its final destination(s).
  2. Once in RabbitMQ, the message waits (typically less than a few milliseconds) to be picked up by Prospector. Prospector inspects the message and breaks it apart into multiple messages, each containing a single event destined for a single downstream system (many events are destined for multiple downstream systems, so those events are duplicated). These messages are re-queued into RabbitMQ, where they wait to be picked up again (a sketch of this step follows the list).
  3. Prospector will pick up the re-queued, demuxed messages and deliver them downstream, to their final destination.
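A rough sketch of the demux/re-queue step, again with hypothetical queue names, field names, and destinations. It assumes batched payloads carry an `events` array and that every event should be copied to each configured downstream system.

```js
'use strict'
const amqp = require('amqplib')

// Hypothetical destinations; real routing would be driven by the event itself.
const DESTINATIONS = ['splunk', 'omniture']

async function run () {
  const conn = await amqp.connect('amqp://localhost')
  const ch = await conn.createChannel()
  await ch.assertQueue('raw-events', { durable: true })
  for (const dest of DESTINATIONS) {
    await ch.assertQueue(`out-${dest}`, { durable: true })
  }

  ch.consume('raw-events', msg => {
    const batch = JSON.parse(msg.content.toString())
    // Break the batched payload into one message per event per destination,
    // so each copy can be delivered (and retried) independently.
    for (const event of batch.events) {
      for (const dest of DESTINATIONS) {
        ch.sendToQueue(`out-${dest}`,
          Buffer.from(JSON.stringify(event)),
          { persistent: true })
      }
    }
    ch.ack(msg) // drop the original only after every demuxed copy is queued
  })
}

run()
```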

By demuxing and re-queueing events individually, we protect ourselves against backpressure from downstream systems.
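One common way to get that protection with RabbitMQ is to bound each delivery worker's unacknowledged messages with a channel prefetch, so a slow downstream system only backs up its own queue rather than the whole pipeline. The sketch below assumes a hypothetical sendToSplunk helper and the out-splunk queue name from the previous sketch.

```js
'use strict'
const amqp = require('amqplib')

// Hypothetical delivery helper -- in practice this would be an HTTP forwarder,
// a Splunk input, etc.
async function sendToSplunk (event) { /* deliver the event downstream */ }

async function run () {
  const conn = await amqp.connect('amqp://localhost')
  const ch = await conn.createChannel()
  await ch.assertQueue('out-splunk', { durable: true })

  // At most 50 unacknowledged messages in flight; if the downstream slows
  // down, messages accumulate safely in RabbitMQ instead of in this process.
  await ch.prefetch(50)

  ch.consume('out-splunk', async msg => {
    try {
      await sendToSplunk(JSON.parse(msg.content.toString()))
      ch.ack(msg)               // remove from the queue only after delivery
    } catch (err) {
      ch.nack(msg, false, true) // leave it queued for retry
    }
  })
}

run()
```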
