@dbreunig
Last active January 22, 2021 16:07
A description of the data written to the Reporter App Dropbox save folder.

#Reporter Save File Schema

##The Reporter Export File

Reporter saves plaintext JSON files to your Dropbox account, one for each day. When a Report is entered in the app, a file is created for that day if it does not exist. Otherwise, the report is appended to the existing file. The save folder is located in 'Dropbox/Apps/Reporter-App/'.

Reporter save files are named according to the following convention:

YYYY-MM-DD-reporter-export.json

So, a Reporter file for February 28th, 2014 could be found at the following path:

~/Dropbox/Apps/Reporter-App/2014-02-28-reporter-export.json

Provided, of course, your Dropbox folder is contained within your home directory.
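
As a minimal sketch, the path for a given day could be built like this in Python. The function name `export_path` and the `dropbox_root` parameter are hypothetical helpers for illustration, and the default assumes your Dropbox folder sits in your home directory.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def export_path(day: date, dropbox_root: Optional[Path] = None) -> Path:
    """Return the expected Reporter export file path for a given day."""
    if dropbox_root is None:
        # Assumption: Dropbox lives directly inside the home directory.
        dropbox_root = Path.home() / "Dropbox"
    # date.isoformat() yields YYYY-MM-DD, matching the naming convention.
    filename = f"{day.isoformat()}-reporter-export.json"
    return dropbox_root / "Apps" / "Reporter-App" / filename
```

Calling `export_path(date(2014, 2, 28))` should produce the path shown above when Dropbox is in its default location.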

###The Snapshots Array

The root object of the document is a dictionary containing a single array, "snapshots". The "snapshots" array is a collection of all the reports from that day, in order of their entry.
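
A rough sketch of reading that structure in Python follows. The helper name `load_snapshots` and the sample text are made up for illustration; only the "snapshots" key and the "battery" key come from this document.

```python
import json


def load_snapshots(text: str) -> list:
    """Parse the contents of an export file and return its snapshots array."""
    document = json.loads(text)
    # A missing key is treated as an empty day rather than an error.
    return document.get("snapshots", [])


# Hypothetical, heavily trimmed file contents.
sample = '{"snapshots": [{"battery": 0.5}, {"battery": 0.25}]}'
snapshots = load_snapshots(sample)
```

Because the array is written in order of entry, `snapshots[0]` would be the first report of the day.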

Why "snapshots" and not "reports"? In the original version of Reporter that Felton and I built we differentiated between the survey "report" and the data captured passively, like battery, location, and weather. The "snapshot" object was created to unite these two objects. The name stuck.

####A Snapshot

Each snapshot contains a set of passive metrics gathered by the device without the user's input, any survey responses entered, and a bit of metadata.

When we save a snapshot to Dropbox we write out a JSON version of the entire object. As a result there are a handful of unused properties written. We'll cover these in the Metadata section.

###Passive Metrics

#####Battery

The battery key refers to a double numerical value, between 0 and 1, reflecting the power stored in the iPhone's battery at the time of report.

#####Location

The location dictionary is essentially a CoreLocation CLLocation object, with a CLPlacemark embedded. Refer to the linked documentation for each class for details on their properties.

The placemark object is the result of reverse geocoding the latitude and longitude derived from iOS's location services. It will often get addresses wrong, but will usually be accurate with ZIP, county, neighborhood, city, and state attributes.

For example, see this output:

{
  "location" : {
    "verticalAccuracy" : 10,
    "timestamp" : 413471577.349213,
    "longitude" : -73.94920271473413,
    "latitude" : 40.71048222608454,
    "course" : 0,
    "placemark" : {
      "thoroughfare" : "Lorimer St",
      "postalCode" : "11206",
      "subAdministrativeArea" : "Kings",
      "subLocality" : "Williamsburg",
      "subThoroughfare" : "451",
      "region" : "<+40.71058351,-73.94912062> radius 23.22",
      "locality" : "New York",
      "name" : "451 Lorimer St",
      "country" : "United States",
      "administrativeArea" : "NY"
    },
    "horizontalAccuracy" : 65,
    "speed" : -1
  }
}

At the time, I was in Williamsburg and very close to 451 Lorimer St, but not actually within said building.

#####Steps

The steps property provides a single numerical value reflecting the number of steps taken between the last report filed and the current report. It is only captured if the user is using an iPhone 5S, which features the M7 motion coprocessor. We at Reporter Inc have decided not to capture steps on other devices because implementing background step-counting without the M7 is non-trivial and tends to burn battery if you're not careful (read: invest many hours in fine-tuning your code for a variety of situations).

#####Audio

Audio is measured in decibels, a unit described as "a logarithmic unit used to express the ratio between two values of a physical quantity, often power or intensity." Because it is easier to define a reference sound at the upper limit (where the microphone is overloaded and "clips"), decibels are often expressed as negative values. This is true for the iPhone, so the values that are delivered in this property are the raw output from the iOS CoreAudio API, reflecting the average and peak volume recorded over a single second.

The lower the number, the quieter the noise. The closer the number is to zero (where the audio would clip), the louder the ambient noise.
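
To get a feel for the scale, here is a small sketch that converts a decibel reading into a linear ratio against the clipping point. It assumes the values are amplitude decibels relative to full scale, so the conversion is 10^(dB/20); if the values were power ratios the divisor would be 10 instead. The function name is hypothetical.

```python
def db_to_amplitude_ratio(decibels: float) -> float:
    """Convert a (usually negative) decibel value to a linear amplitude ratio.

    Assumption: 0 dB is the clipping point, so a result of 1.0 means
    full scale and smaller results mean quieter audio.
    """
    return 10 ** (decibels / 20)
```

Under that assumption, a reading of -20 is one tenth of the clipping amplitude and -40 is one hundredth.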

#####PhotoSet

If the user has taken photos between reports, there will be a photoSet dictionary with a single array of photos written to the snapshot. Each photo object contains the EXIF metadata of the photo. Additionally, the photo object contains a link to the photo asset within iOS. Currently, this information is unused within the Reporter application and is not of much use outside the iOS system.

#####Connection

The connection attribute indicates the current network connection of the device. Its value corresponds to the following states:

  • 0: Device is connected via cellular network
  • 1: Device is connected via WiFi
  • 2: Device is not connected
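
If you are reading these files in Python, one way to make the codes readable is an `IntEnum`. The class and member names below are my own labels, not identifiers from the app, and the key name `connection` is taken from the description above.

```python
from enum import IntEnum


class Connection(IntEnum):
    """Network state codes as listed above (member names are illustrative)."""
    CELLULAR = 0
    WIFI = 1
    NOT_CONNECTED = 2


# Hypothetical snapshot fragment.
snapshot = {"connection": 1}
state = Connection(snapshot["connection"])
```

An unexpected code raises `ValueError`, which is a reasonable way to notice if a future version of the app adds new states.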

#####Weather

The weather dictionary is perhaps the most self-explanatory of the data captured. Dictionary keys are descriptive, detailing the metric and the units used.

For example:

"weather" : {
  "windMPH" : 4,
  "windDirection" : "NW",
  "tempF" : 24.8,
  "precipTodayIn" : 0,
  "windGustKPH" : 20.9,
  "feelslikeC" : -7,
  "visibilityMi" : 10,
  "feelslikeF" : 20,
  "stationID" : "KNYBROOK49",
  "latitude" : 40.69474,
  "windGustMPH" : 13,
  "pressureIn" : 30.31,
  "pressureMb" : 1026,
  "relativeHumidity" : "65%",
  "longitude" : -73.928444,
  "precipTodayMetric" : 0,
  "windKPH" : 6.4,
  "windDegrees" : 318,
  "tempC" : -4,
  "weather" : "Clear",
  "uv" : 0,
  "dewpointC" : -9,
  "visibilityKM" : 16.1
}

Currently, weather data is being captured via the Weather Underground API, whose reference can be found here.

#####Date

In the latest version of Reporter, the timestamp for the report is written out in the following format:

"date" : "2014-03-02T10:18:38-0500"

In previous versions of Reporter, the timestamp was written out as such:

"date" : 412523198.401702

The ethos of Reporter App is to provide you with as much data as possible, in its raw state, if you desire it. However, this is an instance where we took this a bit too far.

The old value of date is the number of seconds which have elapsed since January 1st, 2001 GMT (Obvious, no?). This is the reference date chosen by Apple in their implementation of NSDate.
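
Since files from different versions may mix the two formats, a parser has to handle both. The sketch below is one possible approach in Python; `parse_report_date` and `APPLE_REFERENCE_DATE` are hypothetical names, and it assumes the old format is always a JSON number and the new one is always a string.

```python
from datetime import datetime, timedelta, timezone

# NSDate's reference date: January 1st, 2001 GMT.
APPLE_REFERENCE_DATE = datetime(2001, 1, 1, tzinfo=timezone.utc)


def parse_report_date(value):
    """Return a timezone-aware datetime for either "date" format."""
    if isinstance(value, (int, float)):
        # Old format: seconds elapsed since the reference date.
        return APPLE_REFERENCE_DATE + timedelta(seconds=value)
    # New format: ISO 8601 style with a numeric offset such as -0500.
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S%z")
```

Note that the old format carries no time zone, so the result is in UTC, while the new format keeps the offset that was in effect when the report was filed.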

#####ReportImpetus

The attribute reportImpetus indicates how the report was triggered. The value for this attribute corresponds to the following events:

  • 0: Report button tapped
  • 1: Report button tapped while Reporter is asleep
  • 2: Report triggered by notification
  • 3: Report triggered by setting app to sleep
  • 4: Report triggered by waking up app

###Responses

Any information entered by the user in Reporter survey questions will be contained within the responses array. Each question answered is captured as a single dictionary within the array, containing the questionPrompt and the user input or selected responses.

The locationResponse type is the sole exception to this pattern, as it includes the current location data from the iOS location services API and a foursquareVenueId, which is provided by the Foursquare Venues Platform API.

If a question is not answered, it will not be written to the array.

Below are examples of the data stored within the responses array for each question type.

######Token Response

{
  "questionPrompt" : "What are you doing?",
  "tokens" : [
    "Talking",
    "Drinking"
  ]
}

######Multiple Choice Response

{
  "questionPrompt" : "What is your energy?",
  "answeredOptions" : [
    "Neutral"
  ]
}

######Yes/No Response

{
  "questionPrompt" : "Are you working?",
  "answeredOptions" : [
    "No"
  ]
}

######Location Response

{
	"questionPrompt" : "Where are you?",
	"locationResponse" : {
	  "text" : "Welcome to the Johnsons",
	  "location" : {
	    "verticalAccuracy" : -1,
	    "timestamp" : 413520066.367849,
	    "longitude" : -73.98724124700989,
	    "latitude" : 40.71978727764392,
	    "course" : 0,
	    "horizontalAccuracy" : 0,
	    "speed" : -1
	  },
	  "foursquareVenueId" : "3fd66200f964a520dfe31ee3"
	}
}

######People Response

{
  "questionPrompt" : "Who are you with?",
  "tokens" : [
    "Megan Schwartz",
    "Chris Reagan",
    "Josh Behr"
  ]
}

######Number Response

{
  "numericResponse" : "0",
  "questionPrompt" : "How many coffees have you had?"
}

######Note Response

{
  "questionPrompt" : "What are you thinking about?",
  "textResponse" : "I'm likely too old for this bar."
}
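
Because each question type stores its answer under a different key, a reader has to check for each in turn. The following sketch collapses a snapshot's responses into a single prompt-to-answer dictionary. The helper names are hypothetical, the key names come from the examples above, and converting numericResponse to a float and reducing a location to its text are my own choices.

```python
def answer_for(response: dict):
    """Return the answer portion of a response, whatever its question type."""
    if "tokens" in response:
        return response["tokens"]
    if "answeredOptions" in response:
        return response["answeredOptions"]
    if "locationResponse" in response:
        # Keep only the place name; the coordinates are dropped here.
        return response["locationResponse"].get("text")
    if "numericResponse" in response:
        # The value is written as a string, e.g. "0".
        return float(response["numericResponse"])
    return response.get("textResponse")


def answers_by_prompt(snapshot: dict) -> dict:
    """Map each questionPrompt in a snapshot to its answer."""
    return {
        response["questionPrompt"]: answer_for(response)
        for response in snapshot.get("responses", [])
    }
```

Unanswered questions are never written to the array, so a prompt missing from the result simply means it was skipped in that report.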

###Metadata

The entirety of the data contained within a given report is written to the snapshots array. We have done this for completeness and in case new features are added. Rather than having a complex set of heuristics for writing out data, all attributes come along for the ride. This means some attributes are meaningless when written out, specifically:

  • Sync: This is a state variable to ensure each report is saved to Dropbox. It will always be 0 because once it is 1 (or true) the app will not attempt to write it to Dropbox.
  • Background: A state variable indicating the report was captured in the background. We are not capturing reports in the background. Therefore, this attribute is not in use.
  • DwellStatus: Debug variable. Not in use.
  • Draft: A state variable indicating the report is being edited. If it is, it won't be saved. Therefore, this will always be 0.
  • SectionIdentifier: A convenience variable used by the application when displaying reports in a UITableView.
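
If these attributes get in the way of your analysis, they can be dropped after loading. The sketch below assumes the JSON keys are the lower camel case forms of the names above (for example `dwellStatus`); check the spelling in your own files, since the exact keys are not shown in this document.

```python
# Assumed key spellings; verify against a real export file.
UNUSED_KEYS = {"sync", "background", "dwellStatus", "draft", "sectionIdentifier"}


def strip_unused(snapshot: dict) -> dict:
    """Return a copy of the snapshot without the unused metadata keys."""
    return {key: value for key, value in snapshot.items() if key not in UNUSED_KEYS}
```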
@benmathes

Has anyone built any tools to visualize your own data, not individually but perhaps chart relationships and regress for correlates? I might enjoy this as a side project of my own, but don't want to duplicate work.

For example, graph correlating answers. Or: Detect that you're only sad when you haven't slept or had coffee.
