Last active
April 20, 2019 19:22
Find the best number for parameter number_of_routing_shards in Elasticsearch
# https://www.elastic.co/guide/en/elasticsearch/reference/7.0/indices-split-index.html
# The index setting number_of_routing_shards determines how an index can be split in Elasticsearch.
# Since ES 7.0 it has a default value, designed to allow splitting by factors of 2 up to a maximum of 1024 shards.
# However, depending on the original number of primary shards, the default value might not be the best choice,
# since it might not offer the largest number of shard counts the index could be split to.
# For example, if the current number of primary shards is 5, ES sets number_of_routing_shards to 640 by default,
# which allows the index to be split to 10, 20, 40, 80, 160, 320 or 640 shards.
# However, still assuming a maximum of 1024 shards, setting number_of_routing_shards to 900 gives the split
# more options (listed here including the current 5): 5, 10, 15, 20, 25, 30, 45, 50, 60, 75, 90, 100, 150, 180, 225, 300, 450, 900
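def find_best_number_of_routing_shards(current_shard: int, upper_limit: int = 1024) -> int:
    """Return the value in [current_shard, upper_limit] that allows the most shard counts."""
    max_count = 0
    result = current_shard
    for num in range(current_shard, upper_limit + 1):
        # Count the divisors of num that are multiples of current_shard;
        # each one is a shard count the index could have.
        count = 0
        for i in range(1, num + 1):
            if num % i == 0 and i % current_shard == 0:
                count += 1
        # The strict comparison keeps the smallest num among ties.
        if count > max_count:
            max_count = count
            result = num
    return result


print(find_best_number_of_routing_shards(current_shard=5))  # 900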
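
To check the comparison made in the comments above, the shard counts allowed by a given number_of_routing_shards can be listed directly. This is a minimal sketch, not part of the original gist: the helper name split_targets is hypothetical, and it assumes the rule used by the search above, namely that a valid shard count is a multiple of the current shard count and a divisor of number_of_routing_shards.

```python
from typing import List


def split_targets(current_shard: int, routing_shards: int) -> List[int]:
    """Hypothetical helper: shard counts that are multiples of
    current_shard and divide routing_shards (current count included)."""
    return [
        n
        for n in range(current_shard, routing_shards + 1, current_shard)
        if routing_shards % n == 0
    ]


# Default for 5 primary shards (640) versus the value found by the search (900).
print(split_targets(5, 640))
print(split_targets(5, 900))
```

With 5 primary shards, 640 yields 8 shard counts and 900 yields 18, which is the count that find_best_number_of_routing_shards maximizes.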