Created
March 1, 2015 20:07
-
-
Save anonymous/1309ff2fa6b9778780b6 to your computer and use it in GitHub Desktop.
Experiment: number of recursive method calls before running out of stack.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# ===== Environment (MRI 2.1.1) ===== | |
RUBY_VERSION # => "2.1.1" | |
RUBY_PLATFORM # => "x86_64-darwin13.0" | |
RbConfig.ruby # => "/Users/josh/.rubies/ruby-2.1.1/bin/ruby" | |
`#{RbConfig.ruby} -v` # => "ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-darwin13.0]\n" | |
# ===== The Hypothesis ===== | |
# A recursive method can call itself about 5000 times before it overflows. | |
# ===== Test #1 ===== | |
# Procedure: | |
# Count how many times we can recursively invoke a function before it explodes. | |
# Repeat this test with different numbers of local variables to see if locals make a difference. | |
# | |
# Resuls: | |
# stack_size_0_locals # => 10078 | |
# stack_size_1_local # => 9358 | |
# stack_size_10_locals # => 5696 | |
# stack_size_20_locals # => 3970 | |
# | |
# Conclusion: | |
# The hypothesis is false as the number of calls is shown to vary. | |
# | |
# Implication: | |
# A method with 1 long variable name can be called fewer times than a method with 1 short variable name | |
# I test this in Test #2, it is incorrect, see that test for analysis | |
# | |
# Interpretation: | |
# The number of recursive calls we can make is probably the stack size divided by the stack frame size. | |
# Since local variables use the same memory as the stack, they change the stack frame size, | |
# thus changing the number of recursive calls. | |
# | |
# Possible flaw: | |
# Seeing Is Believing could be affecting it: | |
# This file is loaded before our program runs | |
# https://github.com/JoshCheek/seeing_is_believing/blob/267b9e8f5a8cdff268658376bb10670de102c58f/lib/seeing_is_believing/the_matrix.rb | |
# It creates a SeeingIsBelieving::EventStream::Producer, which has a thread reporting events back to the main process | |
# https://github.com/JoshCheek/seeing_is_believing/blob/267b9e8f5a8cdff268658376bb10670de102c58f/lib/seeing_is_believing/event_stream/producer.rb | |
# We can check this by running without SiB. | |
# I did this and got the exact same numbers. | |
# | |
# Possible flaw: | |
# This may depend on implementation, we can check this by running on other versions. | |
# | |
# MRI: ruby 2.2.0p0 (2014-12-25 revision 49005) [x86_64-darwin13] | |
# Congruent with our results | |
# stack_size_0_locals # => 10078 | |
# stack_size_1_local # => 9358 | |
# stack_size_10_locals # => 5696 | |
# stack_size_20_locals # => 3970 | |
# | |
# MRI: ruby 1.9.3p545 (2014-02-24) [x86_64-darwin13.2.0] | |
# Congruent with our results. At 20 variables, it locked up. I repeated the experiment, same thing. I'm not sure why | |
# stack_size_0_locals # => 9357 | |
# stack_size_1_local # => 8733 | |
# stack_size_10_locals # => 5458 | |
# stack_size_20_locals # => | |
# | |
# RBX: rubinius 2.5.0 (2.1.0 50777f41 2015-01-17 3.5.0 JI) [x86_64-darwin14.1.0] | |
# Congruent with our results, though less pronounced | |
# stack_size_0_locals # => 3613 | |
# stack_size_1_local # => 3585 | |
# stack_size_10_locals # => 3347 | |
# stack_size_20_locals # => 3118 | |
# | |
# JRUBY: jruby 1.7.16 (1.9.3p392) 2014-09-25 575b395 on Java HotSpot(TM) 64-Bit Server VM 1.7.0_51-b13 +jit [darwin-x86_64] | |
# Congruent with our results. Note that I had to rescue Exception instead of SystemStackError. | |
# Its original error message: "Error: Your application used more stack memory than the safety cap of 2048K." | |
# stack_size_0_locals # => 3964 | |
# stack_size_1_local # => 3673 | |
# stack_size_10_locals # => 3244 | |
# stack_size_20_locals # => 2974 | |
# | |
# Possible flaw: | |
# It could store the stack in the same memory as the heap. | |
# This would imply that the size of the program memory determines the callstack size. | |
# We could check this by allocating a lot of objects, counting callstack size, freeing them, and counting again. | |
# If the numbers are the same, then the stack's memory is independent from the state of the heap. | |
# I do this in Test #3 | |
# 0 locals | |
def recursive_0_locals | |
@count += 1 | |
recursive_0_locals | |
end | |
def stack_size_0_locals | |
@count = 0 | |
recursive_0_locals | |
rescue SystemStackError | |
return @count | |
end | |
# 1 local | |
def recursive_1_local | |
@count += 1 | |
a = nil | |
recursive_1_local | |
end | |
def stack_size_1_local | |
@count = 0 | |
recursive_1_local | |
rescue SystemStackError | |
return @count | |
end | |
# 10 locals | |
def recursive_10_locals | |
a = b = c = d = e = | |
f = g = h = i = j = nil | |
@count += 1 | |
recursive_10_locals | |
end | |
def stack_size_10_locals | |
@count = 0 | |
recursive_10_locals | |
rescue SystemStackError | |
return @count | |
end | |
# 20 locals | |
def recursive_20_locals | |
a = b = c = d = e = | |
f = g = h = i = j = | |
k = l = m = n = o = | |
p = q = r = s = t = nil | |
@count += 1 | |
recursive_20_locals | |
end | |
def stack_size_20_locals | |
@count = 0 | |
recursive_20_locals | |
rescue SystemStackError | |
return @count | |
end | |
# repeat 2x, in case order matters | |
stack_size_0_locals # => 10078 | |
stack_size_1_local # => 9358 | |
stack_size_10_locals # => 5696 | |
stack_size_20_locals # => 3970 | |
stack_size_0_locals # => 10078 | |
stack_size_1_local # => 9358 | |
stack_size_10_locals # => 5696 | |
stack_size_20_locals # => 3970 | |
# ===== Test #2 ===== | |
# Hypothesis: | |
# "A method with 1 long variable name can be called fewer times than a method with 1 short variable name" | |
# This is implied by Test #1, above | |
# | |
# Procedure: | |
# Count how many times we can recursively invoke a function with a 1-letter variable name. | |
# Repeat this for a 100-letter variable name. | |
# The one with the 100-letter variable name should take more stack memory, and thus overflow after fewer calls. | |
# | |
# Resuls: | |
# stack_size_1_letter # => 9358 | |
# stack_size_100_letters # => 9358 | |
# | |
# Conclusion: | |
# The hypothesis is false as the number of calls did not vary. | |
# | |
# Implication: | |
# The local variable is stored internally (most likely as a symbol -- verified in Test #4), | |
# The local variables are then stored in a data structure (probably a hash) | |
# that references the symbol by address. Since the address of a 1-letter variable | |
# and the address of a 100-letter variable will be the same size, the stackframes | |
# are also the same size. | |
# | |
# Possible flaw: | |
# Ruby could track locals as offsets from the stackframe, as C does, | |
# then the variable names are irrelevant, and we would still see these results. | |
# We could verify this by searching object space to see if the local variables were turned into symbols. | |
# See Test #4 for this | |
# 1 letter | |
def recursive_1_letter | |
a = nil | |
@count += 1 | |
recursive_1_letter | |
end | |
def stack_size_1_letter | |
@count = 0 | |
recursive_1_letter | |
rescue SystemStackError | |
return @count | |
end | |
# 100 letters | |
def recursive_100_letters | |
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa \ | |
= nil | |
@count += 1 | |
recursive_100_letters | |
end | |
def stack_size_100_letters | |
@count = 0 | |
recursive_100_letters | |
rescue SystemStackError | |
return @count | |
end | |
stack_size_1_letter # => 9358 | |
stack_size_100_letters # => 9358 | |
stack_size_1_letter # => 9358 | |
stack_size_100_letters # => 9358 | |
# ===== Test #3 ===== | |
# Hypothesis: | |
# The stack is stored in the same meory as the heap, | |
# so its size depends on how much heap memory is available. | |
# | |
# Procedure: | |
# Count how many recursive calls we have before it overflows. | |
# Allocate 100,000 objects, free the objects. | |
# Our program should have more memory available, but not more objects consuming it. | |
# If we count again, the count should increase | |
# as we will have more memory and thus recurse further before overflow. | |
# | |
# Resuls: | |
# The program memory grew by 6,144kb | |
# The stack size was 10,078 calls regardless of the amount of memory available to the program. | |
# | |
# Conclusion: | |
# The hypothesis is false. | |
# | |
# Interpretation: | |
# The stack size is independent of the heap size, thus the above experiments are valid. | |
def program_size_in_kb | |
# `man ps` says that `rss` is "the real memory (resident set) size of the process (in 1024 byte units)" | |
# experimentally, `rss` is the 6th column when I run `ps aux`, so calling that to find memory usage. | |
%x[ps aux].lines.find { |line| line.include? Process.pid.to_s }.split[5].to_i | |
end | |
def recursive | |
@count += 1 | |
recursive | |
end | |
def stack_size | |
@count = 0 | |
recursive | |
rescue SystemStackError | |
return @count | |
end | |
# Initial count and program size and stack count | |
initial_size = program_size_in_kb # => 9092 | |
initial_count = stack_size # => 10078 | |
# Allocate 100k objects, verify they are freed | |
objects = Array.new(100_000) { Object.new } | |
num_objs1 = GC.stat[:total_allocated_object] - GC.stat[:total_freed_object] | |
objects = nil | |
GC.start | |
num_objs2 = GC.stat[:total_allocated_object] - GC.stat[:total_freed_object] | |
num_objs1 # => 108307 | |
num_objs2 # => 8307 | |
num_objs1 - num_objs2 # => 100000 | |
# Our program should now have more memory available, but the count should not change | |
# note that these numbers may vary slightly with each run | |
program_size_in_kb - initial_size # => 6500 | |
initial_count # => 10078 | |
stack_size # => 10078 | |
# ===== Test #4 ===== | |
# Hypothesis: | |
# Local variables are stored as references to a symbol. | |
# | |
# Procedure: | |
# Look through object space to see if there is a symbol name that matches the local variable name. | |
# | |
# Resuls: | |
# We found the symbol. | |
# | |
# Conclusion: | |
# There is such a symbol, the hypothesis is not disproven. | |
# So it could be that locals are stored as a hash, | |
# this means the hashes themselves would be the same size. | |
# For example, say that the symbol :a is stored at memory address 0x100, | |
# and :aa is stored at 0x200, and :value is stored at 0x300. | |
# Then the hashes {a: :value} and {aa: :value} | |
# would internally be {0x100 => 0x300}, and {0x200 => 0x300} | |
# so they are the same size even though their keys are different sizes. | |
def has_a_local | |
i_am_a_local_variable = nil | |
end | |
Symbol.all_symbols.grep(/^i_am/) # => [:i_am_a_local_variable] |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment