# It appears that when I perform a query with AR via multiple threads,
# the instantiated objects do not get released when a GC is performed.
threads = Array.new(5) { Thread.new { Foo.where(:status => 2).all.first(100).each { |f| f.owner.first_name } } }
threads.each(&:join)
threads = nil
GC.start
ObjectSpace.each_object(Foo).count # => instances still exist
# ----------------
Foo.where(:status => 2).all.first(100).each { |f| f.owner.first_name }
GC.start
ObjectSpace.each_object(Foo).count # => 0
My guess would be that 'threads' gets captured by the closure created for each thread (and for Array.new) and you end up having a circular, orphaned dependency -- the Foos don't get GC'd until the threads are, and the thread closures hold references to themselves.
Absolute conjecture, though. Try wrapping the thread creation in a method so there are no locals for the blocks to close over.
I'm guessing you might need to run clear_stale_cached_connections!
Yikes! I didn't get e-mailed by GitHub at all for these comments. Thanks guys! Let me try this out.
Okay, unfortunately that didn't help. I tried both #clear_stale_cached_connections! and wrapping the thread creation in a separate method, like so:
irb(main):001:0> def go
irb(main):002:1> 10.times.map do
irb(main):003:2* Thread.new do
irb(main):004:3* Foo.where(:status => 2).all.first(50).each { |e| e.owner.first_name }
irb(main):005:3> end
irb(main):006:2> end
irb(main):007:1> end
=> nil
irb(main):008:0> threads = go
=> [#<Thread:0x000000042ce710 run>, #<Thread:0x000000042ce580 run>, #<Thread:0x000000042ce1c0 run>, #<Thread:0x000000042ce008 run>, #<Thread:0x000000042cde00 run>, #<Thread:0x000000042d5bc8 run>, #<Thread:0x000000042d5b28 run>, #<Thread:0x00000003ac02a8 run>, #<Thread:0x00000003ac0438 run>, #<Thread:0x000000042c7f78 run>]
irb(main):009:0> threads.each(&:join)
=> [#<Thread:0x000000042ce710 dead>, #<Thread:0x000000042ce580 dead>, #<Thread:0x000000042ce1c0 dead>, #<Thread:0x000000042ce008 dead>, #<Thread:0x000000042cde00 dead>, #<Thread:0x000000042d5bc8 dead>, #<Thread:0x000000042d5b28 dead>, #<Thread:0x00000003ac02a8 dead>, #<Thread:0x00000003ac0438 dead>, #<Thread:0x000000042c7f78 dead>]
irb(main):010:0> GC.start
=> nil
irb(main):011:0> ObjectSpace.each_object(Foo).count
=> 132890
irb(main):012:0>
irb(main):012:0> ActiveRecord::Base.connection_pool.clear_stale_cached_connections!
=> [35025800, 35025120, 35024900, 35024640, 35040740, 35040660, 30802260, 35012540, 30802460, 35025600]
irb(main):013:0> ObjectSpace.each_object(Foo).count
=> 132890
irb(main):014:0>
This is using Rails 3.0.10 on MRI 1.9.3p194.
Can anyone else try this locally and see if they see the same results?
Threads do retain a return value, which is whatever value the thread's proc returns. Try returning nil from the threads.
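A minimal sketch of that point, using a hypothetical Record struct and load_records helper as stand-ins for the ActiveRecord query (neither is from the original code). The block's last expression is what Thread#value hands back, so a block ending in #each keeps the whole array reachable through the thread object, while a block ending in nil does not:

```ruby
# Hypothetical stand-in for an ActiveRecord model such as Foo.
Record = Struct.new(:id)

# Hypothetical stand-in for Foo.where(...).all.first(100).
def load_records(count = 100)
  Array.new(count) { |i| Record.new(i) }
end

# The block's last expression is the array returned by #each,
# so the thread object holds on to every record via Thread#value.
def run_retaining
  Thread.new { load_records.each { |r| r.id } }
end

# Same work, but the block ends in nil, so the thread keeps
# no reference to the records once it has finished.
def run_releasing
  Thread.new do
    load_records.each { |r| r.id }
    nil
  end
end

retaining = run_retaining.join
releasing = run_releasing.join

puts retaining.value.class   # the array of records is still reachable
puts releasing.value.inspect # nothing retained by the thread
```

This only shows what the thread object references. Whether the records are actually collected after GC.start still depends on nothing else pointing at them.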
It wouldn't hurt to try to disable ActiveRecord's query cache with Foo.uncached, too.
@mboeh, you win! I just tried returning nil as the last value of the Thread block, and that worked! Interesting! I thought that not having any references to the threads would cause them to get garbage collected (and their associated values).
Well, your second example seems to keep references to the threads. And IRB does keep a lot of stuff. On the other hand, if Ruby is keeping references to dead threads indefinitely, that's a problem for sure.
I did my own testing and determined that setting threads to nil is insufficient to get the threads GC'd. You need to do threads.clear.
This is true even if you wrap the code creating the threads in a method, which is surprising -- I'd expect the local variable reference to be lost outside that method's scope, and the threads to be available for GC.
It seems that threads stored in an array in a local variable might not be available to GC when expected. I have a demonstration at https://gist.github.com/3287930 .
Thank you @mboeh. That's very interesting and very good to know!
Things in ObjectSpace are not necessarily live instances. There are shortcut tricks you can use to force memory to be freed or overwritten (in MRI only, where it's full of hacks at the C level "for speed"). Assuming that GC.start behaves deterministically in any Ruby GC is unlikely to be productive.
See if there's a difference if you do threads.replace [] instead of threads = nil.
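To make the difference between those suggestions concrete, here is a small sketch. The helper names (spawn_threads, size_after_rebinding, and so on) are made up for illustration. Assigning nil only rebinds one local variable, so any other reference to the same array still sees every thread. #clear and #replace empty the array itself, so every holder of that array loses the threads:

```ruby
# Hypothetical helper: spawn a few short-lived threads and wait for them.
def spawn_threads(count = 3)
  Array.new(count) { Thread.new { nil } }.each(&:join)
end

# threads = nil rebinds the local name only; the array that
# `holder` also points at is untouched and still holds the threads.
def size_after_rebinding(holder)
  threads = holder
  threads = nil
  holder.size
end

# threads.clear mutates the shared array in place.
def size_after_clear(holder)
  threads = holder
  threads.clear
  holder.size
end

# threads.replace([]) also mutates the shared array in place.
def size_after_replace(holder)
  threads = holder
  threads.replace([])
  holder.size
end

puts size_after_rebinding(spawn_threads)
puts size_after_clear(spawn_threads)
puts size_after_replace(spawn_threads)
```

This says nothing about when MRI actually collects the dead threads. It only shows which operations remove the threads from an array that something else may still be holding.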