@kshahkshah
Last active September 11, 2015 00:55
# ./config/database.yml is standard except that it specifies reaping_frequency
# adapter: postgresql
# database: all_products
# username: postgres
# host: localhost
# pool: 2
# reaping_frequency: 30
require 'rubygems'
require 'bundler/setup'
require 'active_record'
require 'models/product'
ActiveRecord::Base.establish_connection(
  YAML.load(File.open('./config/database.yml'))
)
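# Hedged diagnostic (my addition, not part of the original gist): Thread.list counts
# the live threads inside this Ruby process, which is a more direct measure than
# grepping ps output (ps aux lists processes, not threads). With reaping_frequency
# set, the connection pool may run one extra reaper thread alongside the main thread.
puts "live threads after establish_connection: #{Thread.list.size}"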
class Importer
  def run
    products = [
      { name: "unique 1", prop_a: 'val1', prop_b: 'val2' },
      { name: "unique 2", prop_a: 'val3', prop_b: 'val4' },
      ... # remaining product hashes elided
    ]

    products.each do |product|
      ActiveRecord::Base.connection_pool.with_connection do |connection|
        np = Product.new(product)
        np.save
      end
    end
  end
end

Importer.new.run
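# Hedged variant (my addition, not part of the original gist): the same loop, but
# logging the current thread on every save and re-raising any exception explicitly,
# to check whether saves really run on more than one thread and to surface errors
# that otherwise seem to vanish. DebuggingImporter and its argument list are
# illustrative names, not part of the original script.
class DebuggingImporter
  def run(products)
    products.each do |attrs|
      ActiveRecord::Base.connection_pool.with_connection do
        begin
          saved = Product.new(attrs).save
          warn "[thread #{Thread.current.object_id}] saved=#{saved} name=#{attrs[:name]}"
        rescue => e
          warn "[thread #{Thread.current.object_id}] #{e.class}: #{e.message}"
          raise
        end
      end
    end
  end
end
# e.g. DebuggingImporter.new.run([{ name: "unique 1", prop_a: 'val1', prop_b: 'val2' }])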
# What ends up happening: far too many threads stay open.
# ps aux | grep importer.rb | wc -l => 16+, and it grows over time until my computer crashes.
# I also see many unique key violations, even though I can manually verify that this shouldn't be occurring.
# If I let the process run to the end, the number of new records created equals the number of products
# in the products array, so there are no duplicate rows; each product just appears to be saved multiple times in multiple threads.
# I have tried disconnecting from ActiveRecord and connecting only inside the block.
# I have tried not using the connection pool at all.
# What is interesting is that I have no rescue statements, yet
# errors do NOT terminate the script; they appear to somehow get caught silently,
# and I don't know why.
# Rails is NOT loaded or required; this is NOT a Rails app.
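# Hedged experiment (my addition, not part of the original gist): re-establish the
# connection with the same settings minus reaping_frequency and run the import again.
# The reaper is the only piece of this script expected to start a background thread,
# so if the thread count and the repeated saves settle down without it, that narrows
# the search. Gated behind an env var so it only runs when asked for explicitly.
if ENV['RETRY_WITHOUT_REAPER']
  config = YAML.load_file('./config/database.yml')
  config.delete('reaping_frequency')
  ActiveRecord::Base.establish_connection(config)
  Importer.new.run
end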