Skip to content

Instantly share code, notes, and snippets.

@thoughtful-solutions
Last active December 10, 2020 11:15
Show Gist options
  • Select an option

  • Save thoughtful-solutions/e154098f65b94e17c7d6165af0d4f3f1 to your computer and use it in GitHub Desktop.

Select an option

Save thoughtful-solutions/e154098f65b94e17c7d6165af0d4f3f1 to your computer and use it in GitHub Desktop.
JSON to CSV

JSON to CSV

jq provides a tool for processeding JSON files, in a similar manner to sed

Proccessing JSON files to CSV

 jq '.daily.calls | input_filename + "," + @csv' *.json  \
    |  sed -e 's/^"\([a-z7]*\)\-\([0-9]*\).json/\"\2,\1/'  \
    | sed -e 's/\"//g' > test.csv  

This takes all keys below .daily.calls within a JSON file.
it puts the name of the file being processed input_filename and append a ',' comma onto the CSV output
Using sed, we break up the filename, cccccccc-yyyymm.json, so that we get yyyymm,ccccccccc
then re remove the '"' then we output a CSV file

Merging Two arrays with JQ

We have a JSON data Structure

{"monthly_calls":{"labels":["total","auth/introspection","auth/authorization","auth/authorization/issue","auth/token","service/get/list",
  "service/jwks/get","service/get","auth/authorization/fail","client/get","serviceowner/authenticate","service/get/admin",
  "developer/authenticate","client/get/list","client/authorization/get/list","developer/logout","client/update",
  "serviceowner/logout","service/update","auth/revocation","client/authorization/delete","client/create","service/delete",
   "client/delete","service/create","serviceowner/get/self"],
"counts":[2157150,2136329,19333,505,313,185,132,111,69,36,26,25,20,20,10,9,6,5,4,3,3,2,1,1,1,1]}
}

We want to create a data structure where, values from each of the arrays and use one for the key and the other for the value in a new array so that it looks like this

{
  "total": 2157150,
  "auth/introspection": 2136329,
  "auth/authorization": 19333,
  "auth/authorization/issue": 505,
  "auth/token": 313,
  "service/get/list": 185,
  "service/jwks/get": 132,
  "service/get": 111,
  "auth/authorization/fail": 69,
  "client/get": 36,
  "serviceowner/authenticate": 26,
  "service/get/admin": 25,
  "developer/authenticate": 20,
  "client/get/list": 20,
  "client/authorization/get/list": 10,
  "developer/logout": 9,
  "client/update": 6,
  "serviceowner/logout": 5,
  "service/update": 4,
  "auth/revocation": 3,
  "client/authorization/delete": 3,
  "client/create": 2,
  "service/delete": 1,
  "client/delete": 1,
  "service/create": 1,
  "serviceowner/get/self": 1,
  "filname": "/tmp/data.json"
}
jq  '[.monthly_calls.labels, .monthly_calls.counts] |\
       transpose | map( {(.[0]): .[1]} ) |\
       add | . + {"filname": input_filename}' *.json

Select Common "Keys" from the data set and produce a CSV

jq -S sorts the keys into alphabetic order

jq -S '[.monthly_calls.labels, .monthly_calls.counts] | transpose | map( {(.[0]): .[1]} ) \
        | add | . + {"AAAAfilename": input_filename} | \
        [. "AAAAfilename" , . "auth/authorization", . "auth/revocation", . "auth/token" , ."total"] |\
        @csv' *.json \
 | sed -e 's/\"//g'  \
 | sed -e 's/\\//g'  \
 | sed -e 's/^\([a-z7]*\)\-\([0-9]*\).json/\2,\1/'                                                                                                                                                                                

So using JQ, we take the arrayes .monthly_calls.labels and .monthly_calls.counts,
we transpose them into a map,
to produce a key value hash.
We add a new key/value pair storing the name of the file we got the data from.
We then extract a series of values associated with some keys.
We then turn these values into a CSV file.
We then use sed to remove " and
We then we break up the filename into customer and date

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment