Converting the `user_satisfaction_graph` to use transforms

Raw datum that we store:

{
  "_day_start_at": "2014-03-10T00:00:00+00:00",
  "_hour_start_at": "2014-03-10T00:00:00+00:00",
  "_id": "20140310_apply-carers-allowance",
  "_month_start_at": "2014-03-01T00:00:00+00:00",
  "_quarter_start_at": "2014-01-01T00:00:00+00:00",
  "_timestamp": "2014-03-10T00:00:00+00:00",
  "_updated_at": "2014-03-11T00:30:07.403000+00:00",
  "_week_start_at": "2014-03-10T00:00:00+00:00",
  "comments": 19,
  "period": "day",
  "rating_1": 0,
  "rating_2": 1,
  "rating_3": 3,
  "rating_4": 5,
  "rating_5": 10,
  "slug": "apply-carers-allowance",
  "total": 19
}

What Spotlight gets back when it queries a week of this data:

{
  "_count": 7.0,
  "_end_at": "2014-11-24T00:00:00+00:00",
  "_start_at": "2014-11-17T00:00:00+00:00",
  "rating_1:sum": 14.0,
  "rating_2:sum": 2.0,
  "rating_3:sum": 62.0,
  "rating_4:sum": 466.0,
  "rating_5:sum": 1176.0,
  "total:sum": 1720.0
}

Spotlight then calculates a rating from those sums and adds it:

{
  "rating": 0.4736586
}

So we should store a data set that looks like this:

{
  "_end_at": "2014-11-24T00:00:00+00:00",
  "_start_at": "2014-11-17T00:00:00+00:00",
  "rating": 0.4736586,
  "number_of_responses": 1720.0,
  "days_with_responses": 7.0
}
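
A minimal sketch of how the stored summary relates to the week Spotlight currently gets back. The field mapping (`total:sum` to `number_of_responses`, `_count` to `days_with_responses`) is inferred from the matching values in the examples above, and `to_weekly_summary` is a hypothetical name, not an existing function.

```python
def to_weekly_summary(aggregated, rating):
    """Reshape one aggregated week into the summary we want to store.

    `aggregated` is a week as returned to Spotlight; `rating` is the
    value Spotlight would otherwise calculate on the fly.
    """
    return {
        "_start_at": aggregated["_start_at"],
        "_end_at": aggregated["_end_at"],
        "rating": rating,
        # Assumed mapping: the summed totals are the number of responses.
        "number_of_responses": aggregated["total:sum"],
        # Assumed mapping: one raw data point per day with responses.
        "days_with_responses": aggregated["_count"],
    }


week = {
    "_count": 7.0,
    "_end_at": "2014-11-24T00:00:00+00:00",
    "_start_at": "2014-11-17T00:00:00+00:00",
    "total:sum": 1720.0,
}
summary = to_weekly_summary(week, rating=0.4736586)
```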

Whenever the raw data set receives a POST:

  1. Find out what week the data point belongs to
  2. Request all the raw data for that week
  3. Summarise and store it, overwriting the summarised entry if it exists
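
The raw datum already carries `_week_start_at`, so step 1 can probably just read that field. If it ever has to be derived from `_timestamp`, something like the sketch below would do it, assuming weeks start at midnight on Monday (which matches the example data). `week_start` and `week_bounds` are hypothetical helper names.

```python
from datetime import datetime, timedelta


def week_start(timestamp):
    """Return the Monday 00:00 that starts the week containing `timestamp`."""
    moment = datetime.fromisoformat(timestamp)
    # weekday() is 0 for Monday, so this steps back to the start of the week.
    monday = moment - timedelta(days=moment.weekday())
    return monday.replace(hour=0, minute=0, second=0, microsecond=0)


def week_bounds(timestamp):
    """Return (_start_at, _end_at) for the week as ISO 8601 strings."""
    start = week_start(timestamp)
    return start.isoformat(), (start + timedelta(days=7)).isoformat()


start_at, end_at = week_bounds("2014-03-12T09:15:00+00:00")
```

The end of the week is exclusive here, in the same way that `_end_at` in the examples is the following Monday.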

TransformType

POST to http://stagecraft.development.performance.service.gov.uk/transform-type:

Authorization: Bearer development-oauth-access-token
Content-Type: application/json
{
  "name": "user_satisfaction_weekly",
  "function": "backdrop.write.tasks.user_satisfaction_weekly",
  "schema": {}
}

Transform

POST to http://stagecraft.development.performance.service.gov.uk/transform:

Authorization: Bearer development-oauth-access-token
Content-Type: application/json
{
  "type_id": "dd1e1e1d-4273-47cc-85c0-89638ec8d2eb",
  "input": {
    "data-group": "apply-carers-allowance",
    "data-type": "customer-satisfaction"
  },
  "options": {},
  "output": {
    "data-group": "apply-carers-allowance",
    "data-type": "customer-satisfaction-weekly-summary"
  }
}
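
For reference, the two requests above could be built with the Python standard library roughly as follows. This is only a sketch: `build_post` is a hypothetical helper, and the requests are constructed but not sent, since sending needs a running development Stagecraft.

```python
import json
from urllib.request import Request

STAGECRAFT = "http://stagecraft.development.performance.service.gov.uk"
TOKEN = "development-oauth-access-token"


def build_post(path, payload):
    """Build (but do not send) an authenticated JSON POST to Stagecraft."""
    return Request(
        STAGECRAFT + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + TOKEN,
            "Content-Type": "application/json",
        },
        method="POST",
    )


transform_type_request = build_post("/transform-type", {
    "name": "user_satisfaction_weekly",
    "function": "backdrop.write.tasks.user_satisfaction_weekly",
    "schema": {},
})

transform_request = build_post("/transform", {
    # The ID of the transform type created by the first request.
    "type_id": "dd1e1e1d-4273-47cc-85c0-89638ec8d2eb",
    "input": {
        "data-group": "apply-carers-allowance",
        "data-type": "customer-satisfaction",
    },
    "options": {},
    "output": {
        "data-group": "apply-carers-allowance",
        "data-type": "customer-satisfaction-weekly-summary",
    },
})

# To send one of them: urllib.request.urlopen(transform_type_request)
```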

Pseudocode

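import base64
from datetime import datetime, timedelta

def calculate_rating(data):
    # Same calculation as Spotlight:
    # https://github.com/alphagov/spotlight/blob/ca291ffcc86a5397003be340ec263a2466b72cfe/app/common/collections/user-satisfaction.js#L24-35
    min_score = 1
    max_score = 5
    score = 0
    for i in range(min_score, max_score + 1):
        # Weight each rating count by its score (1 to 5).
        score += sum(datum['rating_{}'.format(i)] * i for datum in data)
    # Raw data points store the response count under 'total'.
    mean = score / float(sum(datum['total'] for datum in data))
    # Rescale the mean from the 1..5 range to 0..1.
    rating = (mean - min_score) / (max_score - min_score)
    return rating

def user_satisfaction_weekly(latest_datum):
    # data_set stands in for the raw data set; find() is pseudocode for querying it.
    this_week = latest_datum['_week_start_at']
    data_from_this_week = data_set.find(week_start_at=this_week)
    week_end = datetime.fromisoformat(this_week) + timedelta(days=7)
    summary_datum = {
        # A stable _id per week means a re-run overwrites the existing summary.
        "_id": base64.b64encode(this_week.encode('utf-8')).decode('ascii'),
        "_end_at": week_end.isoformat(),
        "_start_at": this_week,
        "rating": calculate_rating(data_from_this_week),
        "number_of_responses": sum(datum["total"] for datum in data_from_this_week),
        "days_with_responses": len(data_from_this_week)
    }
    return summary_datum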