@damon
Created August 19, 2008 13:48
#!/usr/bin/env ruby
# TableSplitter splits record ids up into ranges to use in SQL
require File.dirname(__FILE__) + '/../config/environment'
# $ ./splitter.rb 5 2
# break into 5 chunks.
# requesting chunk 2: id >= 74 and id < 144
# For demo purposes, here are all of them:
# requesting chunk 1: id >= 1 and id < 74
# requesting chunk 2: id >= 74 and id < 144
# requesting chunk 3: id >= 144 and id < 215
# requesting chunk 4: id >= 215 and id < 285
# requesting chunk 5: id >= 285
class TableSplitter
  def initialize(table, num_chunks)
    @table = table
    @num_chunks = num_chunks
    @c = [] # @c[i] holds the lowest id of chunk i + 1
    split
  end

  # Load every id, sort, and record the first id of each chunk.
  def split
    sql = "select id from #{@table}"
    ids = ActiveRecord::Base.connection.select_values(sql).collect { |x| x.to_i }
    ids.sort!
    chunk_size = ids.size / @num_chunks # integer division already rounds down
    marker = 0
    @num_chunks.times do |t|
      @c[t] = ids[marker]
      marker += chunk_size
    end
  end

  # Returns the SQL condition for chunk num (1-based).
  def chunk(num)
    if @num_chunks == num
      # Leave the end range open; use >= so the boundary id itself is not skipped.
      "id >= #{@c[-1]}"
    else
      "id >= #{@c[num - 1]} and id < #{@c[num]}"
    end
  end
end
chunks = ARGV[0].to_i # the number of pieces to break it up into
chunk = ARGV[1].to_i # which chunk do you want?
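# Reject a zero chunk count (division by zero) and chunk numbers outside 1..chunks.
if ARGV.size != 2 || chunks < 1 || chunk < 1 || chunk > chunks
  raise "usage: splitter.rb [numchunks] [chunknumber]"
end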
puts "break into #{chunks} chunks."
splitter = TableSplitter.new("users", chunks)
puts "requesting chunk #{chunk}: #{splitter.chunk(chunk)}"
puts "For demo purposes, here are all of them:"
chunks.times do |t|
  puts "requesting chunk #{t+1}: #{splitter.chunk(t+1)}"
end
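
The boundary logic can be tried without a Rails environment. The following is a minimal sketch, not part of the original script: `chunk_conditions` is a hypothetical helper that takes a plain array of ids instead of querying a table, and it assumes there are at least as many ids as chunks. It closes the last range with `>=` so that every id falls into exactly one chunk.

```ruby
# Hypothetical helper (not in the original gist): builds one SQL condition
# per chunk from an in-memory list of ids.
def chunk_conditions(ids, num_chunks)
  raise ArgumentError, "num_chunks must be positive" unless num_chunks > 0
  sorted = ids.sort
  raise ArgumentError, "need at least num_chunks ids" if sorted.size < num_chunks

  chunk_size = sorted.size / num_chunks # integer division rounds down
  # The first id of each chunk sits at a multiple of chunk_size.
  starts = Array.new(num_chunks) { |i| sorted[i * chunk_size] }

  starts.each_with_index.map do |lower, i|
    upper = starts[i + 1]
    # The last chunk has no upper bound, so it absorbs any remainder.
    upper ? "id >= #{lower} and id < #{upper}" : "id >= #{lower}"
  end
end

puts chunk_conditions((1..10).to_a, 3)
# id >= 1 and id < 4
# id >= 4 and id < 7
# id >= 7
```

Because the chunk size is rounded down, the final chunk can be larger than the others (four ids against three in the example above).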