Created
July 12, 2011 07:39
-
-
Save mwylde/1077565 to your computer and use it in GitHub Desktop.
Hash.new
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# Oftentimes, we want to create lists of things that have something in | |
# common. For example we may have a list of objects, each with a | |
# name. We want to count how many times each name occurs. We could do | |
# so with a hash, like this: | |
h = {} | |
a.each{|o| | |
# make sure there's a value here | |
h[o.name] ||= 0 | |
h[o.name] += 1 | |
} | |
# But it turns out there's an easier way, using the less common | |
# Hash.new method: | |
h = Hash.new(0) | |
a.each{|o| h[o.name] += 1} | |
# (or using a reduce to get it down to one line) | |
h = a.reduce(Hash.new(0)){|h, o| h[o.name] += 1; h} | |
# What's going on here? When we supply an argument to Hash.new, we | |
# provide a default value that's used for any key that's not already | |
# in the hash. When we omit the argument (or use the short-hand form | |
# {}) the default value is set to nil. | |
g = Hash.new(5) | |
h = {} | |
g["nancy"] = 10 | |
g["nancy"] # => 10 | |
g["sean"] # => 5 | |
h["sean"] # => nil | |
# However, care should be taken with using normal objects as default | |
# values, as the following example shows: | |
h = Hash.new([]) | |
h["mary"] << 5 | |
h["abigail"] << 10 | |
h["mary"] # => [5, 10] | |
h["abigail"] # => [5, 10] | |
# What's going on here? We're setting a particular object--an instance | |
# of the Array class--to the default value. However, this means that | |
# the exact same array is used for every index of the Hash. Needless | |
# to say, this probably isn't what you want. But there is another, | |
# more powerful form of Hash.new that can solve this exact problem: | |
# Hash.new {|hash, key| ... }. Using the block form, we can fix our | |
# above example: | |
h = Hash.new{|hash, key| hash[key] = []} | |
h["mary"] << 5 | |
h["abigail"] << 10 | |
h["mary"] # => [5] | |
h["abigail"] # => [10] | |
# Now we're producing a new array element for every key. This sort of | |
# thing can be useful if we have a list of objects each with names and | |
# values, and we want a list of the values corresponding to each | |
# name. We can now write: | |
h = Hash.new{|h, k| h[k] = []} | |
a.each{|o| h[o.name] << h.value} | |
# The possibilities are endless. Effectively the block form makes | |
# hashes into lightweight methods much like proc and lambda, with the | |
# added benefit that we get memoization for free. For example, lets | |
# say we want a function to calculate the fibonacci numbers. We all | |
# know the formula: f(n) = f(n-1) + f(n-2). We can compute it | |
# naievely like this: | |
def f(n) | |
case n | |
when 0 then 1 | |
when 1 then 1 | |
else | |
f(n-1) + f(n-2) | |
end | |
end | |
# This works fine for small values of n. f(20) takes less than three | |
# hundreths of a second on my MacBook Pro. But it quickly becomes | |
# useless: f(30) takes 0.34 seconds, but I got bored waiting for f(40) | |
# to finish. Is calculating the Fibbonaci sequence fundamently a hard | |
# problem, where we can't expect to find a good solution? Or do we | |
# just need to be smarter about it? Consider the following expansion, | |
# of the n = 5 case: | |
f(5) = f(4) + f(3) | |
= f(3) + f(2) + f(2) + f(1) | |
= f(2) + f(1) + f(1) + f(0) + f(1) + f(0) + f(1) | |
= f(1) + f(0) + f(1) + f(1) + f(0) + f(1) + f(0) + f(1) | |
= 8 | |
# While we got the right answer, notice how we kept repeating the same | |
# calculations. Even in the trivial case of n = 5, we calculated f(3) | |
# twice, and f(2) three times. Since we know that larger values of the | |
# problem depend on the values of smaller values, we can save | |
# ourselves an enormous amount of time my memoizing the function, | |
# which simply means saving the results of our earlier runs for | |
# later. The Ruby hash is a natural at this. Here's a much more | |
# effecient and yet concise implementation of the above: | |
f = Hash.new do |h, k| | |
case k | |
when 0 then h[0] = 1 | |
when 1 then h[1] = 1 | |
else | |
h[k] = h[k-2] + h[k-1] | |
end | |
end | |
# With our new and improved Fib, we can easily compute f(5000): | |
def time; t = Time.now; yield ; Time.now - t; end | |
time{f[5000]} # => 0.033939 | |
# I should take this time to note that there is a very simple | |
# iterative algorithm for fibbonaci, and you should never actually use | |
# a recursive algorithm to compute it. However, this technique can be | |
# very useful for the many recursive algorithms which can not be | |
# trivially made iterative. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment