@dwnoble
Last active December 21, 2015 11:06
Elasticsearch: unexpected behavior when a document's _id is set to an object
# When indexing a document, we can specify an _id in the source body, and ES accepts and stores the document
curl -XDELETE 'http://localhost:9200/testindex/'
curl -XPOST 'http://localhost:9200/testindex/people/50f626ce173183f5f2d9dec0?refresh=true' -d '{
  "_id" : "50f626ce173183f5f2d9dec0",
  "person" : {
    "age" : 25,
    "name" : "Bob"
  }
}'
# We can now find this document with, say, a range query:
curl -XGET 'http://localhost:9200/testindex/_search?size=1' -d '
{
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "person.age": {
            "from": 20,
            "to" : 30
          }
        }
      },
      "query" : {
        "match_all" : { }
      }
    }
  }
}'
# If we specify an _id that doesn't match the id in the URL, ES (correctly) returns an error
curl -XDELETE 'http://localhost:9200/testindex/'
curl -XPOST 'http://localhost:9200/testindex/people/50f626ce173183f5f2d9dec0?refresh=true' -d '{
  "_id" : "some_other_id",
  "person" : {
    "age" : 25,
    "name" : "Bob"
  }
}'
# Returns:
# {"error":"MapperParsingException[Failed to parse [_id]]; nested: MapperParsingException[Provided id [50f626ce173183f5f2d9dec0] does not match the content one [some_other_id]]; ","status":400}
# However, if we set the document's _id to an object, ES accepts the request without an error
curl -XDELETE 'http://localhost:9200/testindex/'
curl -XPOST 'http://localhost:9200/testindex/people/50f626ce173183f5f2d9dec0?refresh=true' -d '{
  "_id" : {
    "$oid" : "50f626ce173183f5f2d9dec0"
  },
  "person" : {
    "age" : 25,
    "name" : "Bob"
  }
}'
# With the _id set to an object, the same range query no longer finds the document.
curl -XGET 'http://localhost:9200/testindex/_search?size=1' -d '
{
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "person.age": {
            "from": 20,
            "to" : 30
          }
        }
      },
      "query" : {
        "match_all" : { }
      }
    }
  }
}'
# Returns:
# {"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}
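# To script this reproduction instead of reading the responses by eye, the hit
# count can be pulled out of the search response. This is only a sketch:
# hits_total is a hypothetical helper, not part of ES, and it assumes the compact
# response shape shown above, where "total" is a plain number directly after "hits":{.

```shell
# Hypothetical helper: reads a search response on stdin and prints hits.total.
# Text matching only (no JSON parser); assumes the compact response shape above.
hits_total() {
  sed -n 's/.*"hits":{"total":\([0-9]*\).*/\1/p'
}

# Example usage (requires a running ES node, so it is left commented out):
# curl -s -XGET 'http://localhost:9200/testindex/_search?size=1' -d '{"query":{"match_all":{}}}' | hits_total
```

# Piping the range query through hits_total would then print 0 for the object-_id
# case above, which makes the failure easy to assert in a script.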
# But if we drop the range filter, we can see that the document still exists
curl -XGET 'http://localhost:9200/testindex/_search?size=1' -d '
{
  "query" : {
    "match_all" : { }
  }
}'
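# One way to dig further is to look at what ES actually stored and how it mapped
# the fields. A minimal sketch, assuming the same local node as above: show_mapping,
# show_doc and the ES_URL variable are hypothetical names introduced here, and the
# two URLs follow the /index/type/_mapping and /index/type/id layout used by the
# ES version in this gist.

```shell
# Base URL of the node; ES_URL is an assumed variable, defaulting to the host used above.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the mapping ES derived for a type: show_mapping <index> <type>
show_mapping() {
  curl -s -XGET "$ES_URL/$1/$2/_mapping?pretty=true"
}

# Fetch a single document by the id given in the URL: show_doc <index> <type> <id>
show_doc() {
  curl -s -XGET "$ES_URL/$1/$2/$3?pretty=true"
}

# Example usage (requires a running ES node, so it is left commented out):
# show_mapping testindex people
# show_doc testindex people 50f626ce173183f5f2d9dec0
```

# Comparing the mapping after the string-_id run with the mapping after the
# object-_id run may show where the two cases diverge.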
# It looks like the document isn't indexed correctly when its _id is set to an object.
# I think ES should reject a document with a non-string _id and return an error instead.
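# Until ES rejects such documents itself, a client-side check can catch them before
# they are sent. This is a rough sketch, not an ES feature: has_non_scalar_id is a
# hypothetical helper that does a text match rather than parsing JSON, so it also
# fires on an "_id" key nested deeper in the document.

```shell
# Hypothetical guard: reads a JSON document on stdin and returns 0 (true) if any
# "_id" key is followed by an object or an array instead of a scalar value.
# Newlines are flattened first so the match works on multi-line documents.
has_non_scalar_id() {
  tr '\n' ' ' | grep -Eq '"_id"[[:space:]]*:[[:space:]]*(\{|\[)'
}

# Example usage: refuse to index the document from the third request above.
doc='{
  "_id" : { "$oid" : "50f626ce173183f5f2d9dec0" },
  "person" : { "age" : 25, "name" : "Bob" }
}'
if printf '%s' "$doc" | has_non_scalar_id; then
  echo "refusing to index: _id is not a scalar" >&2
fi
```

# A document that comes out of MongoDB with an extended-JSON _id like the one above
# would be stopped here; flattening the _id to its string value before indexing
# avoids the problem altogether.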