A common analysis is counting values, grouped on common attribute. e.g. What are the counts of passing, failing, missing tests.
IN SQL, we can do this:
SELECT name, status, COUNT(*)
FROM tests
WHERE date > NOW() - INTERVAL 30 DAY
AND name IN ('test1', 'test2')
GROUP BY name, status
And you might expect to see:
name | status | COUNT(*) |
---|---|---|
test1 | passing | 100 |
test1 | failed | 10 |
test1 | aborted | 30 |
test2 | passing | 100 |
To perform aggregations in elasticsearch:
curl https://example.com:1337/tests_index/tset/_search -d @blahblah.json
where blahblah.json
looks like:
{
"size": 0,
"aggs": {
"test_name" : {
"terms" : { "field" : "name", "size": 100 },
"aggs" : {
"status" : {
"terms" : { "field": "status"}
}
}
}
},
"query": {
"bool" : {
"filter" : [
{
"range" : {
"time.start" : {
"gte": "now-1M"
}
}
},
{
"terms": {
"name": [
"test1",
"test2"
]
}
}
]
}
}
}
The query has a bool, which is where we put our filters. In my case, I don't need scoring.
filter
takes a list of AND
conditions.
I've done a range check to look for results in the last month and also limited results to those matching my test name.
The aggregation is a terms
which is a value count.
The top-level aggregration is done on the test name, equivalent to the first column in the SQL group by.
Then sub-aggregation is on the status term.
Sub-aggregations can be nested to continue the "select" + "group by".