@cheerfulstoic
Created September 22, 2014 12:12
Script to find a good batch size for neo4j find_in_batches
require 'benchmark'

# Assumes a loaded neo4j app with a House model. Runs until interrupted (Ctrl-C).
batch_size = 10_000
previous_nodes_per_second = nil

while true
  total_found = 0
  puts "Testing batch size of #{batch_size}..."

  # Time at most 10 batches at the current batch size.
  time_taken = Benchmark.realtime do
    i = 0
    House.as(:h).find_in_batches(batch_size: batch_size) do |batch|
      total_found += batch.size
      break if i > 8
      i += 1
    end
  end
  puts

  nodes_per_second = total_found / time_taken
  # Scale the batch size by the change in throughput since the last pass.
  # The first pass has no baseline, so it simply grows the size by 20%.
  nodes_per_second_factor = (previous_nodes_per_second ? nodes_per_second / previous_nodes_per_second : 1.2)
  batch_size = (batch_size * nodes_per_second_factor).round

  puts "Nodes per second: ~#{nodes_per_second.round}" + (previous_nodes_per_second ? " (#{(nodes_per_second_factor * 100).round}% as fast)" : '')
  puts "Changing to batch size: #{batch_size}"

  previous_nodes_per_second = nodes_per_second
end
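
The script above loops until it is interrupted. As a rough sketch of one way to make it stop on its own, the same scaling step can be wrapped in a helper that returns once throughput changes by less than a tolerance. The name `tune_batch_size` and its `initial`, `tolerance`, and `max_rounds` parameters are hypothetical and not part of the neo4j gem; the block is expected to return nodes per second for the batch size it is given.

```ruby
# Hypothetical helper, not part of the neo4j gem. It repeats the scaling step
# from the script (multiply the batch size by the throughput ratio) and stops
# when the ratio is within `tolerance` of 1.0 or after `max_rounds` passes.
def tune_batch_size(initial: 10_000, tolerance: 0.05, max_rounds: 20)
  batch_size = initial
  previous_rate = nil

  max_rounds.times do
    rate = yield(batch_size).to_f
    # No baseline on the first pass, so grow by 20% as the script does.
    factor = previous_rate ? rate / previous_rate : 1.2
    return batch_size if previous_rate && (factor - 1.0).abs < tolerance

    # Guard against the size collapsing to zero after a very slow pass.
    batch_size = [(batch_size * factor).round, 1].max
    previous_rate = rate
  end

  batch_size
end

# Possible usage with the model from the script (assumes `require 'benchmark'`
# and a loaded House model):
#
#   best = tune_batch_size do |size|
#     found = 0
#     seen = 0
#     secs = Benchmark.realtime do
#       House.as(:h).find_in_batches(batch_size: size) do |batch|
#         found += batch.size
#         seen += 1
#         break if seen >= 10
#       end
#     end
#     found / secs
#   end
```

Taking the measurement as a block keeps the stopping rule separate from the query, so the helper can be tried against a synthetic throughput function before pointing it at a real database.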