@bobpoekert
Last active August 29, 2015 14:05
Elasticsearch errors
  • Doing a query with a has_parent filter when a parent-child relation references a mapping that doesn't exist returns a NullPointerException (instead of a more informative error)
  • Adding a port number to a unicast host in elasticsearch.yml causes that node to receive invalid (i.e. unparseable) HTTP requests
  • Missing a newline in a bulk insert request caused subsequent queries on that index to return invalid JSON
  • Doing a delete by query on an index that removed a significant number of documents caused refresh requests on that index to return NullPointerExceptions
  • Shards moving between nodes for no apparent reason
  • Shards becoming unassigned for no apparent reason
  • Shards becoming unassigned even when all of the shards in the cluster had been routed manually and shard allocation had been disabled
  • Shards losing all of their documents if a write is performed while the shard is unavailable
@polyfractal

Opened this ticket for the bulk problem (it seems to apply to regular indexing too): elastic/elasticsearch#7299

Will look into the delete-by-query and refresh situation and see if I can reproduce it.

Will look into the "initializing"-delete-all-docs situation too.

No idea about the port situation... I've never heard of that before (and we routinely change/configure ports for ourselves and various customers). Were you using the transport port or the HTTP port?

This might be intended behavior but it makes the db difficult to work with operationally. You want to be able to predict when expensive operations are going to happen so you can make sure that they don't interfere with the work your db is supposed to be doing.

If you don't want shards moving around at all, you can set:

curl -XPUT "http://localhost:9200/_cluster/settings" -d'
{
  "persistent": {
    "cluster.routing.allocation.enable" : "none"
  }
}'

This prevents any rebalancing or allocation at all. Or you could set the same setting to new_primaries, which is likely the better option: it will allocate new primaries but nothing else.
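
A minimal sketch of the new_primaries variant, assuming the same local node on port 9200 as in the command above. The ES_URL and ALLOCATION_SETTINGS variable names are made up for this example. The Content-Type header is an addition: newer Elasticsearch releases require it, and older ones should simply ignore it.

```shell
# Assumed: a node reachable at localhost:9200; set ES_URL to point elsewhere.
ES_URL="${ES_URL:-http://localhost:9200}"

# Same setting as the command above, with "new_primaries" instead of "none".
ALLOCATION_SETTINGS='{
  "persistent": {
    "cluster.routing.allocation.enable": "new_primaries"
  }
}'

# Send the settings update; report a failure instead of aborting silently.
curl -s -XPUT "$ES_URL/_cluster/settings" \
  -H 'Content-Type: application/json' \
  -d "$ALLOCATION_SETTINGS" \
  || echo "request failed (is a node listening at $ES_URL?)" >&2
```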

Ultimately, ES is designed to perform these maintenance operations in the background. Rather than preventing it, it's better to just throttle the operations until they don't affect your cluster anymore.

  • You can throttle recovery with indices.recovery.max_bytes_per_sec, set to something reasonable that doesn't overwhelm your network/disk IO.
  • You could also set cluster.routing.allocation.cluster_concurrent_rebalance: 1, which allows only one shard rebalance in the cluster at a given time and so limits how much background activity is happening.
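
As a rough sketch, both throttles can be applied in one settings update. The 50mb figure is only an illustrative value, not a recommendation, and ES_URL and THROTTLE_SETTINGS are names made up for this example. The Content-Type header is included because newer Elasticsearch releases require it.

```shell
# Assumed: a node reachable at localhost:9200; set ES_URL to point elsewhere.
ES_URL="${ES_URL:-http://localhost:9200}"

# Illustrative values: tune max_bytes_per_sec to your own network/disk budget.
THROTTLE_SETTINGS='{
  "persistent": {
    "indices.recovery.max_bytes_per_sec": "50mb",
    "cluster.routing.allocation.cluster_concurrent_rebalance": 1
  }
}'

# Send the settings update; report a failure instead of aborting silently.
curl -s -XPUT "$ES_URL/_cluster/settings" \
  -H 'Content-Type: application/json' \
  -d "$THROTTLE_SETTINGS" \
  || echo "request failed (is a node listening at $ES_URL?)" >&2
```

Use "transient" instead of "persistent" if the change should not survive a full cluster restart.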
