- Simple filter
- Access objects
- Access lists/arrays
- Combine filters with pipe
- Raw output
- Transform
- Feed into multiple filters
jq is an excellent command-line tool for working with JSON data. I have been using it to process, filter, and transform JSON objects so that the data is easier to interpret. I am noting down some commonly used operations here for later reference.
- Syntax: jq [options] <filter>. jq reads from stdin by default.
- The filter specifies the expression to apply to the JSON data.
- . is the identity filter: the output is the same as the input.
~ % curl -s --compressed "https://api.stackexchange.com/2.2/tags?site=stackoverflow&pagesize=2" | jq '.'
{
  "items": [
    {
      "has_synonyms": true,
      "is_moderator_only": false,
      "is_required": false,
      "count": 2204785,
      "name": "javascript"
    },
    {
      "has_synonyms": true,
      "is_moderator_only": false,
      "is_required": false,
      "count": 1770006,
      "name": "java"
    }
  ],
  "has_more": true,
  "quota_max": 300,
  "quota_remaining": 219
}
- .object accesses the value stored under the key object in the current stream.
- .object1,.object2 accesses multiple objects at once.
~ % curl -s --compressed "https://api.stackexchange.com/2.2/sites" | jq '.quota_max,.quota_remaining'
300
216
- .parent.child accesses a child of a parent JSON value. Equivalent to the .parent["child"] syntax.
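To make the equivalence concrete, here is a minimal sketch that runs both forms against a small inline document. The sample JSON and the variable names are made up for illustration, and it assumes jq is installed and on the PATH.

```shell
# Hypothetical sample document; the keys are chosen only for illustration.
json='{"site": {"name": "Stack Overflow", "api_site_parameter": "stackoverflow"}}'

# Dot path and bracket syntax select the same nested value.
dot_result=$(printf '%s' "$json" | jq '.site.name')
bracket_result=$(printf '%s' "$json" | jq '.site["name"]')

echo "$dot_result"
echo "$bracket_result"
```

Both commands print the same quoted string. The bracket form is mainly useful when a key contains characters that the dot form cannot express.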
Arrays are accessed using the [] operator:
- .[] accesses all items in the array (e.g. input | jq '.items[]').
- .[i] accesses the item at index i, counting from 0 (e.g. input | jq '.items[1].name').
- .[i:j] slices the array from index i (inclusive) to index j (exclusive).
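The index and slice forms can be tried on a small inline array. This is only a sketch: the sample data and variable names are invented, and jq is assumed to be installed.

```shell
# Hypothetical sample data with four items.
json='{"items": [{"name": "a"}, {"name": "b"}, {"name": "c"}, {"name": "d"}]}'

# Index 0 is the first item.
first_name=$(printf '%s' "$json" | jq '.items[0].name')

# The slice [1:3] keeps the items at index 1 and 2; index 3 is excluded.
slice_names=$(printf '%s' "$json" | jq '.items[1:3] | .[].name')

echo "$first_name"
echo "$slice_names"
```

The first command prints "a"; the second prints "b" and "c" on separate lines.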
~ % cat stackexchange_sites | jq '.items[1].name'
"Server Fault"
Filters can be combined using the pipe operator |: the output of the filter on the left becomes the input of the filter on the right. Whitespace around | is optional but helps readability.
e.g. list the api_site_parameter of every site (this is the value to pass as the "site" parameter in Stack Exchange API requests):
~ % cat stackexchange_sites | jq '.items[] | .api_site_parameter'
"stackoverflow"
"serverfault"
"superuser"
"meta"
"webapps"
"webapps.meta"
"gaming"
"gaming.meta"
"webmasters"
"webmasters.meta"
"cooking"
"cooking.meta"
"gamedev"
"gamedev.meta"
"photo"
"photo.meta"
"stats"
"stats.meta"
"math"
"math.meta"
"diy"
"diy.meta"
"meta.superuser"
"meta.serverfault"
"gis"
"gis.meta"
"tex"
"tex.meta"
"askubuntu"
"meta.askubuntu"
The --raw-output / -r option writes string results as raw text, without JSON quoting or escaping. This comes in handy for applying further operations on the data using shell commands.
e.g. list the Stack Exchange sites whose names start with S, in sorted order.
~ % cat stackexchange_sites | jq --raw-output '.items[] | .name' | sort | grep "^S"
Seasoned Advice
Seasoned Advice Meta
Server Fault
Stack Overflow
Super User
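The effect of -r is easiest to see side by side. A minimal sketch, assuming jq is installed; the inline JSON and variable names are illustrative only.

```shell
# Hypothetical one-key document.
json='{"name": "Server Fault"}'

# Default output keeps the JSON string quotes.
quoted=$(printf '%s' "$json" | jq '.name')

# With -r the string is written as plain text, ready for grep, sort, etc.
raw=$(printf '%s' "$json" | jq -r '.name')

echo "$quoted"
echo "$raw"
```

Only string results change under -r; numbers, objects, and arrays are printed as JSON either way.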
We can also transform one JSON stream into another by specifying a filter of the form { key : value }, where value is the filter that extracts the desired data from the stream.
e.g. extracting the name and site_url of the first five sites from the Stack Exchange sites list:
~ % cat stackexchange_sites | jq '.items[0:5] | .[] | { "name" : .name, "site" : .site_url}'
{
  "name": "Stack Overflow",
  "site": "https://stackoverflow.com"
}
{
  "name": "Server Fault",
  "site": "https://serverfault.com"
}
{
  "name": "Super User",
  "site": "https://superuser.com"
}
{
  "name": "Meta Stack Exchange",
  "site": "https://meta.stackexchange.com"
}
{
  "name": "Web Applications",
  "site": "https://webapps.stackexchange.com"
}
The , operator can be used to feed the same input into multiple filters (similar to the tee utility in Linux). The outputs of the filters are emitted one after another, in order. This comes in handy in sequential processing.
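Where the , is placed decides how the results are ordered. The sketch below compares two placements on a tiny inline array; the data, the example.com-style URLs, and the variable names are all made up, and jq is assumed to be installed.

```shell
# Hypothetical two-item array.
json='[{"name": "A", "site_url": "https://a.example"}, {"name": "B", "site_url": "https://b.example"}]'

# Each side of the comma iterates the whole array: all names, then all URLs.
grouped=$(printf '%s' "$json" | jq -r '.[].name, .[].site_url')

# Comma binds tighter than pipe, so this emits name and URL item by item.
per_item=$(printf '%s' "$json" | jq -r '.[] | .name, .site_url')

echo "$grouped"
echo "---"
echo "$per_item"
```

The first form prints A, B, then the two URLs; the second interleaves each name with its URL.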
e.g. list the names of the sites at index 1 to 4, followed by their URLs:
~ % cat stackexchange_sites | jq '.items[1:5] | .[].name,.[].site_url'
"Server Fault"
"Super User"
"Meta Stack Exchange"
"Web Applications"
"https://serverfault.com"
"https://superuser.com"
"https://meta.stackexchange.com"
"https://webapps.stackexchange.com"