OAE Loader

A list of tasks that need to happen so that each dataload produces consistent results that can easily be compared with one another.

Clean the system

We should try to bring the system into the same state for each dataload/benchmark. The simplest way is to just wipe everything and start fresh.

Drop the oae keyspace.
Shut down each Cassandra node.
Clear data from
- /var/lib/cassandra/data/oae
- /var/lib/cassandra/commitlogs (assuming only the oae keyspace is used on these nodes)
- /var/log/cassandra/system.log
Run nodetool cleanup to wipe unnecessary files (nodetool talks to a running node, so this has to happen while Cassandra is up).
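
The file-clearing step could be scripted roughly as follows. This is a minimal sketch, not an existing tool: the function name and the `root` parameter are hypothetical (`root` only exists so the function can be tried against a scratch directory first), and it assumes Cassandra has already been stopped on the node.

```python
import shutil
from pathlib import Path

# Paths from the list above, written relative to the filesystem root.
DATA_DIR = "var/lib/cassandra/data/oae"
COMMITLOG_DIR = "var/lib/cassandra/commitlogs"
SYSTEM_LOG = "var/log/cassandra/system.log"


def clear_cassandra_state(root="/"):
    """Remove the oae data, the commit logs and system.log under `root`.

    Returns the list of paths that were actually removed, so a run can be
    logged. Missing paths are skipped rather than treated as errors.
    """
    root = Path(root)
    removed = []

    # The keyspace directory is removed as a whole.
    data_dir = root / DATA_DIR
    if data_dir.is_dir():
        shutil.rmtree(data_dir)
        removed.append(data_dir)

    # The commit log directory itself is kept; only its contents go.
    commitlog_dir = root / COMMITLOG_DIR
    if commitlog_dir.is_dir():
        for entry in sorted(commitlog_dir.iterdir()):
            if entry.is_dir():
                shutil.rmtree(entry)
            else:
                entry.unlink()
            removed.append(entry)

    system_log = root / SYSTEM_LOG
    if system_log.is_file():
        system_log.unlink()
        removed.append(system_log)

    return removed
```

Calling `clear_cassandra_state()` with no argument acts on the real paths, so it needs the right permissions. Running it a second time is harmless and returns an empty list.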

Shut down each app server.
Pull latest master (potentially with a configurable branch, e.g. to run tests against simong/rediscache).
Remove server.log
Restart app server
Create a tenant (only needs to happen once)
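
One way to make the branch configurable is sketched below. The function names are hypothetical, and the sketch assumes that the fork to test (for example `simong`) is already configured as a git remote on the app server. It checks out the fetched commit in detached mode, which is a design choice made here so that the benchmark always runs exactly what was fetched and never a stale local branch.

```python
import subprocess


def update_commands(branch="master", remote="origin"):
    """Return the git commands that move a checkout to the tip of remote/branch."""
    return [
        # Fetch only the requested branch; git records its tip as FETCH_HEAD.
        ["git", "fetch", remote, branch],
        # Check out exactly that commit, without touching local branches.
        ["git", "checkout", "--detach", "FETCH_HEAD"],
    ]


def update_checkout(repo_dir, branch="master", remote="origin"):
    """Run the commands inside `repo_dir`, stopping at the first failure."""
    for command in update_commands(branch, remote):
        subprocess.run(command, cwd=repo_dir, check=True)
```

With these assumptions, `update_checkout(repo_dir)` pulls latest master, and `update_checkout(repo_dir, "rediscache", remote="simong")` would test the rediscache work instead.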

This task will perform the data load.

Generate a data set

With the following configurable options (alternatively, we could generate a dataset once and re-use it for every dataload, which might make the results more comparable):
- number of batches
- users per batch
- groups per batch
- content per batch
Push load start annotation to Circonus
Load the dataset with config
- url
- number of concurrent batches (probably shouldn't change between dataloads)
Push load end annotation to Circonus
Copy the generated HTML statistics to /var/www/html/<date>/dataload
Package the dataset into CSV and .format files
- The package.js script has to provide the correct .format files along with the CSV.
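
The dataload steps above could be tied together along these lines. This is only a sketch: all names are hypothetical, `load` and `annotate` are placeholders for the real loader and for whatever pushes an annotation to Circonus (no Circonus API is assumed here), and the ISO date format for `<date>` is an assumption since the format is not specified above.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import PurePosixPath


@dataclass(frozen=True)
class DatasetConfig:
    """The configurable options for generating a dataset."""
    batches: int
    users_per_batch: int
    groups_per_batch: int
    content_per_batch: int

    def totals(self):
        """Total users, groups and content items across all batches."""
        return {
            "users": self.batches * self.users_per_batch,
            "groups": self.batches * self.groups_per_batch,
            "content": self.batches * self.content_per_batch,
        }


def stats_dir(day, kind="dataload"):
    """Destination for the HTML statistics: /var/www/html/<date>/<kind>."""
    # Assumption: <date> is written as YYYY-MM-DD.
    return PurePosixPath("/var/www/html") / day.isoformat() / kind


def run_dataload(load, annotate, url, concurrent_batches):
    """Wrap `load` in a start and an end annotation.

    The end annotation is sent even if the load raises, so the graphs
    always show where the run stopped.
    """
    annotate("dataload start")
    try:
        return load(url, concurrent_batches)
    finally:
        annotate("dataload end")
```

Keeping the options in one frozen object makes it easy to record them next to the results, and `totals()` gives a quick sanity check that two dataloads were the same size before their numbers are compared.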

This task will run the Tsung suite.