@simong
Created October 9, 2012 15:22
Dataload/benchmark todo list

OAE Loader

A list of tasks that need to happen so that every dataload produces consistent results that can easily be compared with each other.

Clean the system

We should try to bring the system into the same state for each dataload/benchmark. The simplest way is to just wipe everything and start fresh.

Cassandra nodes.

  1. Drop the oae keyspace.
  2. Shut down each Cassandra node.
  3. Clear data from:
    • /var/lib/cassandra/data/oae
    • /var/lib/cassandra/commitlogs (assuming only the oae keyspace is used on these nodes)
    • /var/log/cassandra/system.log
  4. Run nodetool cleanup to wipe unnecessary files.
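
A minimal sketch of step 3, assuming the cleanup is scripted in Python and run on each node after Cassandra has stopped. The three paths come from the list above; the `clear_paths` helper and its `root` parameter are hypothetical and only exist so the function can be tried against a scratch directory first. Directories are emptied rather than deleted so that ownership and permissions stay as they were.

```python
import shutil
from pathlib import Path

# Paths taken from the list above; everything else is illustrative.
CASSANDRA_PATHS = [
    "/var/lib/cassandra/data/oae",
    "/var/lib/cassandra/commitlogs",
    "/var/log/cassandra/system.log",
]


def clear_paths(paths, root="/"):
    """Empty each directory and delete each file in `paths`, relative to `root`.

    `root` is a hypothetical parameter for testing against a scratch tree.
    Returns the paths that existed and were cleared.
    """
    cleared = []
    for p in paths:
        target = Path(root) / p.lstrip("/")
        if target.is_dir():
            # Keep the directory itself, remove everything inside it.
            for child in target.iterdir():
                if child.is_dir() and not child.is_symlink():
                    shutil.rmtree(child)
                else:
                    child.unlink()
        elif target.exists():
            target.unlink()
        else:
            continue  # already clean, nothing to do
        cleared.append(str(target))
    return cleared
```

Calling `clear_paths(CASSANDRA_PATHS)` on a node would act on the real paths; passing `root` set to a temporary directory is a safer way to check the behaviour.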

App server nodes.

  1. Shut down each app server.
  2. Pull the latest master (possibly make the branch configurable, e.g. to run tests against simong/rediscache).
  3. Remove server.log
  4. Restart app server
  5. Create a tenant (only needs to happen once)

Load balancer

  1. Remove the nginx logs.

Data loader/Tsung driver

  1. Remove old scripts
  2. Pull latest master (configurable branch?)

Dataload

This task will perform the data load.

Data loader

  1. Generate a data set

    The options below should be configurable. Alternatively, we could generate a dataset once and reuse it for every dataload, which might make the results more comparable.

    • number of batches
    • users per batch
    • groups per batch
    • content per batch
  2. Push load start annotation to Circonus

  3. Load the dataset with config

    • url
    • number of concurrent batches (probably shouldn't change between dataloads)
  4. Push load end annotation to Circonus

  5. Copy the generated HTML statistics to /var/www/html/<date>/dataload

  6. Package the dataset into CSV and .format files

    • The package.js script has to provide the correct .format files along with the CSV.
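
The configurable options from steps 1 and 3 could be captured in two small records, which makes it easy to check that two dataloads used the same settings before comparing them. This is only a sketch: the class and field names are assumptions, not names used by the OAE loader.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetConfig:
    """Generation options from step 1 (field names are illustrative)."""
    batches: int
    users_per_batch: int
    groups_per_batch: int
    content_per_batch: int

    def totals(self):
        """Total users, groups and content items across all batches."""
        return {
            "users": self.batches * self.users_per_batch,
            "groups": self.batches * self.groups_per_batch,
            "content": self.batches * self.content_per_batch,
        }


@dataclass(frozen=True)
class LoadConfig:
    """Load options from step 3 (field names are illustrative)."""
    url: str
    concurrent_batches: int


def comparable(a, b):
    """Two runs are comparable when their configs are field-for-field equal."""
    return a == b
```

Because the dataclasses are frozen and compare by value, `comparable(run1_config, run2_config)` is enough to catch a run that was accidentally generated with different batch sizes.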

Benchmark

This task will run the Tsung suite.

  1. Generate a Tsung suite (probably just use the standard one)
  2. Push Tsung start annotation to Circonus
  3. Run the Tsung tests
  4. Push Tsung end annotation to Circonus
  5. Copy the generated Tsung statistics to /var/www/html/<date>/tsung
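
Both the dataload and the benchmark bracket their run with a start and an end annotation and then publish results under a dated directory. One way to sketch that pattern is shown below. `push_annotation` is a hypothetical callable standing in for whatever actually talks to Circonus, and formatting `<date>` as an ISO date is an assumption.

```python
from contextlib import contextmanager
from datetime import date
from pathlib import PurePosixPath


@contextmanager
def annotated(name, push_annotation):
    """Push '<name> start' before the wrapped step and '<name> end' after it.

    `push_annotation` is a hypothetical callable taking a title string;
    the real Circonus call is not shown here.
    """
    push_annotation(f"{name} start")
    try:
        yield
    finally:
        # Runs even if the wrapped step raises, so every start gets an end.
        push_annotation(f"{name} end")


def results_dir(run_date, kind):
    """Return /var/www/html/<date>/<kind>; the ISO date format is an assumption."""
    return PurePosixPath("/var/www/html") / run_date.isoformat() / kind
```

A run would then look like `with annotated("tsung", push): run_tsung()`, where `run_tsung` is again a placeholder, followed by a copy into `results_dir(date.today(), "tsung")`.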
@mrvisser

mrvisser commented Oct 9, 2012

Sounds about right. For the app server nodes you could narrow this down to:

  1. Run svcadm disable node-sakai-oae
  2. rm -rf /opt/oae
  3. Run /home/admin/puppet-hilary/bin/apply.sh (CWD should be /home/admin/puppet-hilary)
    Step 3 will check out git master again, regenerate config.js, and start up the node service.
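
The three steps above could be wrapped in a small script along these lines. The commands, paths and working directory are the ones given in this comment; the `reset_app_server` function and its `dry_run` flag are hypothetical, and the sketch defaults to a dry run so that nothing is deleted by accident.

```python
import subprocess

# (argv, working directory) pairs, taken from the steps above.
RESET_STEPS = [
    (["svcadm", "disable", "node-sakai-oae"], None),
    (["rm", "-rf", "/opt/oae"], None),
    (["/home/admin/puppet-hilary/bin/apply.sh"], "/home/admin/puppet-hilary"),
]


def reset_app_server(steps=RESET_STEPS, dry_run=True):
    """Run each step in order and stop at the first failing command.

    With dry_run=True (the default) nothing is executed; the commands
    are only collected and returned for inspection.
    """
    done = []
    for argv, cwd in steps:
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(argv, cwd=cwd, check=True)
        done.append(" ".join(argv))
    return done
```

Calling `reset_app_server()` only lists the commands; `reset_app_server(dry_run=False)` would actually run them on the app server.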
