Converting the `user_satisfaction_graph` to use transforms

Raw datum that we store:

{
  "_day_start_at": "2014-03-10T00:00:00+00:00",
  "_hour_start_at": "2014-03-10T00:00:00+00:00",
  "_id": "20140310_apply-carers-allowance",
  "_month_start_at": "2014-03-01T00:00:00+00:00",
  "_quarter_start_at": "2014-01-01T00:00:00+00:00",
  "_timestamp": "2014-03-10T00:00:00+00:00",
  "_updated_at": "2014-03-11T00:30:07.403000+00:00",
  "_week_start_at": "2014-03-10T00:00:00+00:00",
  "comments": 19,
  "period": "day",
  "rating_1": 0,
  "rating_2": 1,
  "rating_3": 3,
  "rating_4": 5,
  "rating_5": 10,
  "slug": "apply-carers-allowance",
  "total": 19
}

What Spotlight gets back when it queries that data by week:

{
  "_count": 7.0,
  "_end_at": "2014-11-24T00:00:00+00:00",
  "_start_at": "2014-11-17T00:00:00+00:00",
  "rating_1:sum": 14.0,
  "rating_2:sum": 2.0,
  "rating_3:sum": 62.0,
  "rating_4:sum": 466.0,
  "rating_5:sum": 1176.0,
  "total:sum": 1720.0
}

Spotlight also adds a rating:

{
  "rating": 0.4736586
}

So we should store a data set that looks like this:

{
  "_end_at": "2014-11-24T00:00:00+00:00",
  "_start_at": "2014-11-17T00:00:00+00:00",
  "rating": 0.4736586,
  "number_of_responses": 1720.0,
  "days_with_responses": 7.0
}
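The mapping from Spotlight's weekly query result to the stored shape is mostly a rename. A minimal sketch of that mapping follows; `to_summary` is a hypothetical helper name, and the rating is passed in rather than calculated here.

```python
def to_summary(weekly_result, rating):
    """Reshape a weekly query result into the summary datum we want to store.

    weekly_result is a dict shaped like the Spotlight query result above;
    rating is the value Spotlight would otherwise add client-side.
    """
    return {
        "_start_at": weekly_result["_start_at"],
        "_end_at": weekly_result["_end_at"],
        "rating": rating,
        # total:sum is the number of responses across the week
        "number_of_responses": weekly_result["total:sum"],
        # _count is the number of daily data points found in the week
        "days_with_responses": weekly_result["_count"],
    }
```

The individual `rating_N:sum` fields are dropped because Spotlight only needs them to work out the rating.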

Whenever the raw data set receives a POST:

1. Find out what week the data point belongs to.
2. Request all the raw data for that week.
3. Summarise and store it, overwriting the summarised entry if it exists.
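Those steps can be sketched with in-memory stand-ins for the two data sets. Everything here is an assumption for illustration: `raw_store` is a plain list, `summary_store` is a dict keyed by week start, weeks are taken to start on Monday at midnight UTC (which matches the `_week_start_at` values above), and the rating is left out to keep the flow visible.

```python
from datetime import datetime, timedelta, timezone


def week_start(timestamp):
    """Return midnight UTC on the Monday of the week containing timestamp."""
    midnight = timestamp.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=midnight.weekday())


def on_post(datum, raw_store, summary_store):
    """Store a raw datum, then rebuild the summary for its week."""
    raw_store.append(datum)
    # 1. Find out what week the data point belongs to
    start = week_start(datum["_timestamp"])
    # 2. Collect all the raw data for that week
    week_data = [d for d in raw_store if week_start(d["_timestamp"]) == start]
    # 3. Summarise and store, overwriting any existing entry for the week
    summary_store[start] = {
        "_start_at": start,
        "_end_at": start + timedelta(days=7),
        "number_of_responses": sum(d["total"] for d in week_data),
        "days_with_responses": len(week_data),
    }
    return summary_store[start]
```

Keying the summary on the week start is what makes the write an overwrite: a second POST in the same week replaces the entry instead of adding a new one.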

TransformType

POST to http://stagecraft.development.performance.service.gov.uk/transform-type:

Authorization: Bearer development-oauth-access-token
Content-Type: application/json

{
  "name": "user_satisfaction_weekly",
  "function": "backdrop.write.tasks.user_satisfaction_weekly",
  "schema": {}
}
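For scripting this, the same request could be built with Python's standard library. This is only a sketch: `build_transform_type_request` is a hypothetical helper, and it builds the request without sending it.

```python
import json
from urllib.request import Request

STAGECRAFT = "http://stagecraft.development.performance.service.gov.uk"


def build_transform_type_request(token):
    """Build (but do not send) the POST that registers the transform type."""
    body = {
        "name": "user_satisfaction_weekly",
        "function": "backdrop.write.tasks.user_satisfaction_weekly",
        "schema": {},
    }
    return Request(
        STAGECRAFT + "/transform-type",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it. The response should contain the new transform type's ID, which the transform below needs as its `type_id`.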

Transform

POST to http://stagecraft.development.performance.service.gov.uk/transform:

Authorization: Bearer development-oauth-access-token
Content-Type: application/json

{
  "type_id": "dd1e1e1d-4273-47cc-85c0-89638ec8d2eb",
  "input": {
    "data-group": "apply-carers-allowance",
    "data-type": "customer-satisfaction"
  },
  "options": {},
  "output": {
    "data-group": "apply-carers-allowance",
    "data-type": "customer-satisfaction-weekly-summary"
  }
}
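Other services would presumably need the same transform with a different data group, so the payload could be generated. A small sketch, assuming the input and output data types stay the same for every service; `build_transform` is a hypothetical helper.

```python
def build_transform(type_id, data_group,
                    input_type="customer-satisfaction",
                    output_type="customer-satisfaction-weekly-summary"):
    """Return the transform payload for one data group.

    Input and output share a data group; only the data type differs.
    """
    return {
        "type_id": type_id,
        "input": {"data-group": data_group, "data-type": input_type},
        "options": {},
        "output": {"data-group": data_group, "data-type": output_type},
    }
```

Calling `build_transform("dd1e1e1d-4273-47cc-85c0-89638ec8d2eb", "apply-carers-allowance")` gives the payload shown above.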

Pseudocode

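import base64
from datetime import timedelta


def calculate_rating(data):
    # Same calculation as Spotlight:
    # https://github.com/alphagov/spotlight/blob/ca291ffcc86a5397003be340ec263a2466b72cfe/app/common/collections/user-satisfaction.js#L24-35
    min_score = 1
    max_score = 5
    score = 0
    for i in range(min_score, max_score + 1):
        # Weight each rating count by its score, so rating_4 counts 4 points each
        score += sum(datum['rating_{}'.format(i)] * i for datum in data)
    # Raw data points carry 'total'; 'total:sum' only exists on query results
    total = sum(datum['total'] for datum in data)
    if total == 0:
        return None  # no responses, so there is no rating to report
    mean = score / float(total)
    # Scale the mean from the 1-5 range onto 0-1
    rating = (mean - min_score) / (max_score - min_score)
    return rating


def user_satisfaction_weekly(latest_datum):
    # Assumes _week_start_at has been parsed to a datetime, and that data_set
    # is the raw data set with some way to find data points by week.
    this_week = latest_datum['_week_start_at']
    data_from_this_week = data_set.find(_week_start_at=this_week)
    summary_datum = {
        # One summary per week, so the week start gives a stable ID to overwrite
        "_id": base64.b64encode(this_week.isoformat().encode('utf-8')).decode('ascii'),
        "_end_at": this_week + timedelta(days=7),
        "_start_at": this_week,
        "rating": calculate_rating(data_from_this_week),
        "number_of_responses": sum(datum["total"] for datum in data_from_this_week),
        "days_with_responses": len(data_from_this_week),
    }
    return summary_datum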
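As a quick check of the arithmetic, here is the rating calculation applied to the single raw datum from the top of this post. The block repeats a compact version of the calculation so that it runs on its own; `rating_for` is a name made up for this example.

```python
def rating_for(data, min_score=1, max_score=5):
    """Mean score across all responses, scaled from 1-5 onto 0-1."""
    score = sum(
        datum["rating_{}".format(i)] * i
        for datum in data
        for i in range(min_score, max_score + 1)
    )
    total = sum(datum["total"] for datum in data)
    mean = score / float(total)
    return (mean - min_score) / (max_score - min_score)


raw_datum = {
    "rating_1": 0,
    "rating_2": 1,
    "rating_3": 3,
    "rating_4": 5,
    "rating_5": 10,
    "total": 19,
}

# score = 0*1 + 1*2 + 3*3 + 5*4 + 10*5 = 81
# mean  = 81 / 19, roughly 4.26
# rating = (mean - 1) / 4, roughly 0.816
rating = rating_for([raw_datum])
```

A rating of 0 would mean every response was a 1, and a rating of 1 would mean every response was a 5.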
