The new Cloudant query API is pretty awesome. It basically looks like Mongo, which makes for much more readable and user-friendly queries than what you can get with standard map/reduce.
Index the field foo
:
cinnabar:~ nolan$ acurl -X POST https://pouch.cloudant.com/mydb/_index -d '{
> "index": {
> "fields": ["foo"]
> },
> "name" : "foo-index",
> "type" : "json"
> }'
{"result":"created"}
It creates a design document (sigh), but notice that the structure is different from usual:
cinnabar:~ nolan$ acurl -X GET https://pouch.cloudant.com/mydb/_all_docs?include_docs=true | json
{
"total_rows": 1,
"offset": 0,
"rows": [
{
"id": "_design/a5f4711fc9448864a13c81dc71e660b524d7410c",
"key": "_design/a5f4711fc9448864a13c81dc71e660b524d7410c",
"value": {
"rev": "1-17ec75b65cb1ca7d012cd3fcad323848"
},
"doc": {
"_id": "_design/a5f4711fc9448864a13c81dc71e660b524d7410c",
"_rev": "1-17ec75b65cb1ca7d012cd3fcad323848",
"language": "query",
"views": {
"foo-index": {
"map": {
"fields": {
"foo": "asc"
}
},
"reduce": "_count",
"options": {
"def": {
"fields": [
"foo"
]
},
"w": 2
}
}
}
}
}
]
}
Instead of building up a map function, "map" instead maps to an object describing the index structure. Which is great, because as far as I can tell, it'd be impossible to build a map function that covers all the possible selectors ($in
, $nin
, $gt
, $lt
, etc.). What options
does, I'm not sure.
Now, let's put a doc with a foo
field:
cinnabar:~ nolan$ acurl -X PUT https://pouch.cloudant.com/mydb/doc -d '{"foo": "bar"}'
{"ok":true,"id":"doc","rev":"1-4c6114c65e295552ab1019e2b046b10e"}
Check the list of indexes to make sure it's still there:
cinnabar:~ nolan$ acurl -X GET https://pouch.cloudant.com/mydb/_index | json
{
"indexes": [
{
"ddoc": null,
"name": "_all_docs",
"type": "special",
"def": {
"fields": [
{
"_id": "asc"
}
]
}
},
{
"ddoc": "_design/a5f4711fc9448864a13c81dc71e660b524d7410c",
"name": "foo-index",
"type": "json",
"def": {
"fields": [
{
"foo": "asc"
}
]
}
}
]
}
Find all docs where doc.foo
is'bar'
:
cinnabar:~ nolan$ acurl -X POST https://pouch.cloudant.com/mydb/_find -d '{"selector": {"foo": "bar"}}'| json
{
"docs": [
{
"_id": "doc",
"_rev": "1-4c6114c65e295552ab1019e2b046b10e",
"foo": "bar"
}
]
}
Edge case time! What happens if I ask for a field that's not indexed?
cinnabar:~ nolan$ acurl -X POST https://pouch.cloudant.com/mydb/_find -d '{"selector": {"baz": "bar"}}'| json
{
"error": "no_usable_index",
"reason": "No index exists for this selector, try indexing one of: baz"
}
Good choice, Cloudant. By explicitly returning an error when a field isn't indexed, you protect the developer from accidentally writing unperformant queries. No need to do an EXPLAIN
or EXPLAIN QUERY PLAN
; the database will tell you if you're being a dumbass.
How about the _id
field, though? It is technically indexed:
cinnabar:~ nolan$ acurl -X POST https://pouch.cloudant.com/mydb/_find -d '{"selector": {"_id": "bar"}}'| json
{
"docs": []
}
cinnabar:~ nolan$ acurl -X POST https://pouch.cloudant.com/mydb/_find -d '{"selector": {"_id": "doc"}}'| json
{
"docs": [
{
"_id": "doc",
"_rev": "1-4c6114c65e295552ab1019e2b046b10e",
"foo": "bar"
}
]
}
Awesome. Developers don't want to have to learn a separate API to query by _id
, and it's great that they don't have to. Well done.
Other thoughts:
- Kinda bizarre that we don't get
total_rows
oroffset
when using_find
.offset
isn't a big loss IMO (we don't even fully implement it for Pouch), buttotal_rows
can be really convenient when doing pagination (e.g. "2 pages of results remaining"). - Adam Kocoloski said on IRC that they're open to porting this feature to CouchDB, but it won't be in until post-2.0.
- I'm interested in writing a Pouch plugin for this even before it lands in Couch, but it'll require some modifications to core map/reduce. I've already had to fork map/reduce for the quick-search plugin; not too keen to do it again for this.
- That being said, this is going to be a beast to implement in PouchDB. Queries like
{ "selector": {"Actor_name": "Robert De Niro"}, "sort": [{"Movie_year": "asc", "Movie_runtime": "desc"}]}
have a lot going on at once. OurtoIndexableString
idea was pretty clever, but it's not enough to cover every possible combination. In-memory sorting would be unperformant on larger databases, but it may be the only quick and dirty adapter-agnostic solution. - I am not a big fan of design documents, but the fact that this allows you to query by
_id
obviates my biggest complaint about design docs, which is that they muddy the_all_docs
results. - Cloudant released this in the nick of time. I was worried that Lucene would become the de-facto query language, since it was the only one that allowed you to write complex queries that human beings could actually understand.
- One view per design doc is a good default.
Ah, good point. In that case, it might be useful to invent something new for the case where you do a query with
limit
and want to know how many results are left. It'd be a shame to have to depend on Lucene for that.