To produce the statistics that we're currently interested in, here are the core data points I've identified for each category.
- account_id (uid)
- account_created_at
- device_id
- device_created_at
- device_deleted_at
- device_type
- device_last_active_at
`device_last_active_at` can be derived from the Sync log timestamps.
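As a minimal sketch of that derivation, the latest request timestamp seen for each device can be taken as its `device_last_active_at`. The record shape (dicts with `device_id` and `timestamp` keys), the sample data, and the helper name `last_active_by_device` are assumptions for illustration, not the actual log format.

```python
from datetime import datetime, timezone


def last_active_by_device(requests):
    """Return {device_id: latest request timestamp} for an iterable of request records."""
    last_active = {}
    for req in requests:
        device_id = req["device_id"]
        ts = req["timestamp"]
        # Keep the most recent timestamp seen for each device.
        if device_id not in last_active or ts > last_active[device_id]:
            last_active[device_id] = ts
    return last_active


# Hypothetical sample records; real values would come from the Sync logs.
sample_requests = [
    {"uid": "u1", "device_id": "d1", "timestamp": datetime(2016, 1, 5, tzinfo=timezone.utc)},
    {"uid": "u1", "device_id": "d1", "timestamp": datetime(2016, 1, 9, tzinfo=timezone.utc)},
    {"uid": "u1", "device_id": "d2", "timestamp": datetime(2016, 1, 7, tzinfo=timezone.utc)},
]

device_last_active_at = last_active_by_device(sample_requests)
```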
Example Stats
- Sync MAU by device count
- Sync retention
- % of Sync users by active device count
- % of Sync users by device types
- % of lapsed users by device count
- number of users whose device count changed
- number of users by device count
- time to create second device
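
Two of the stats above, "number of users by device count" and "time to create second device", could be computed roughly as follows. This is a sketch under assumptions: device records are plain dicts keyed by the field names listed earlier, the sample data is invented, and both function names are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical device records using the field names listed above.
devices = [
    {"uid": "u1", "device_id": "d1", "device_created_at": datetime(2016, 1, 1), "device_deleted_at": None},
    {"uid": "u1", "device_id": "d2", "device_created_at": datetime(2016, 1, 4), "device_deleted_at": None},
    {"uid": "u2", "device_id": "d3", "device_created_at": datetime(2016, 1, 2), "device_deleted_at": None},
    {"uid": "u3", "device_id": "d4", "device_created_at": datetime(2016, 1, 3), "device_deleted_at": datetime(2016, 1, 6)},
]


def users_by_device_count(devices):
    """Map device count -> number of users, counting only devices not yet deleted."""
    per_user = Counter(d["uid"] for d in devices if d["device_deleted_at"] is None)
    return Counter(per_user.values())


def time_to_second_device(devices):
    """Map uid -> time between the first and second device creation."""
    created = defaultdict(list)
    for d in devices:
        created[d["uid"]].append(d["device_created_at"])
    result = {}
    for uid, timestamps in created.items():
        if len(timestamps) >= 2:
            first, second = sorted(timestamps)[:2]
            result[uid] = second - first
    return result


device_count_histogram = users_by_device_count(devices)
second_device_delay = time_to_second_device(devices)
```

Note that users whose devices have all been deleted do not appear in the histogram; whether they should be reported as a "0 devices" bucket is a design choice this sketch leaves open.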
- uid
- device_id
- device_type
- sync_api_method
- sync_api_status
- sync_api_errno
Example Stats
- % of users who sync each category
- time until second device fetches updates
- number of users affected by critical errors
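
For instance, "number of users affected by critical errors" reduces to counting distinct `uid` values among failing requests. The sketch below assumes that "critical" means an HTTP status of 500 or above, which is only a placeholder definition; the record shape, sample data, and function name are likewise hypothetical.

```python
# Hypothetical request records; only the fields needed for this stat are shown.
sync_requests = [
    {"uid": "u1", "device_id": "d1", "sync_api_status": 200},
    {"uid": "u1", "device_id": "d2", "sync_api_status": 503},
    {"uid": "u2", "device_id": "d3", "sync_api_status": 500},
    {"uid": "u2", "device_id": "d3", "sync_api_status": 503},
    {"uid": "u3", "device_id": "d4", "sync_api_status": 404},
]


def is_critical(request):
    """Assumed definition: any 5xx response counts as a critical error."""
    return request["sync_api_status"] >= 500


def users_affected_by_critical_errors(requests, predicate=is_critical):
    """Return the set of distinct uids with at least one critical error."""
    return {r["uid"] for r in requests if predicate(r)}


affected_users = users_affected_by_critical_errors(sync_requests)
affected_user_count = len(affected_users)
```

Passing a different `predicate` (for example one keyed on `sync_api_errno`) would let the definition of "critical" change without touching the counting logic.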
The relevant log lines for sourcing this data come from the FxA auth-server and Sync storage-server.
- Account created event
  - uid
  - created_at
- Device created event
  - uid
  - device_id
  - created_at
- Device deleted event
  - uid
  - device_id
  - deleted_at
All sync app requests signal activity
- uid
- device_id
- device_type
- path
- status
- timestamp
- errno
- ...
All the proposed metrics could be produced from a relational model of the log data and standard SQL aggregate functions. I'm not saying we should do it this way, but it's the model I'm most comfortable with. The table structure could be something like this:
- uid
- createdAt
- device_id
- uid
- type
- added_at
- removed_at
- uid
- device_id
- request_path
- request_started_at
- request_completed_at
- request_size
- response_started_at
- response_completed_at
- response_size
- response_status
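
To make the relational idea concrete, here is a small sketch using Python's built-in `sqlite3` module. It reads the field list above as three tables; the table names (`accounts`, `devices`, `requests`), the column types, and the sample rows are assumptions for illustration only. The query computes "number of users by device count" with ordinary SQL aggregates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table and type choices are hypothetical; column names follow the list above.
conn.executescript("""
CREATE TABLE accounts (
    uid TEXT PRIMARY KEY,
    createdAt INTEGER
);
CREATE TABLE devices (
    device_id TEXT PRIMARY KEY,
    uid TEXT REFERENCES accounts(uid),
    type TEXT,
    added_at INTEGER,
    removed_at INTEGER          -- NULL while the device is still registered
);
CREATE TABLE requests (
    uid TEXT,
    device_id TEXT,
    request_path TEXT,
    request_started_at INTEGER,
    request_completed_at INTEGER,
    request_size INTEGER,
    response_started_at INTEGER,
    response_completed_at INTEGER,
    response_size INTEGER,
    response_status INTEGER
);
""")

# Invented sample data: u1 has two devices, u2 has one, u3 removed its only device.
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("u1", 100), ("u2", 110), ("u3", 120)])
conn.executemany("INSERT INTO devices VALUES (?, ?, ?, ?, ?)", [
    ("d1", "u1", "desktop", 101, None),
    ("d2", "u1", "mobile", 150, None),
    ("d3", "u2", "desktop", 111, None),
    ("d4", "u3", "mobile", 121, 130),
])

# Inner query: active devices per user. Outer query: users per device count.
users_by_device_count = conn.execute("""
    SELECT device_count, COUNT(*) AS users
    FROM (
        SELECT uid, COUNT(*) AS device_count
        FROM devices
        WHERE removed_at IS NULL
        GROUP BY uid
    ) AS per_user
    GROUP BY device_count
    ORDER BY device_count
""").fetchall()
```

Most of the other example stats would follow the same pattern: a `GROUP BY` over `devices` or `requests`, optionally joined to `accounts` on `uid`.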