mwylde · July 12, 2011 07:39
diff --git a/hash.rb b/hash.rb
 # Oftentimes, we want to create lists of things that have something in
 # common. For example we may have a list of objects, each with a
 # name. We want to count how many times each name occurs. We could do
 # so with a hash, like this:

 h = {}
 a.each{|o|
  # make sure there's a value here
  h[o.name] ||= 0
  h[o.name] += 1
 }

 # But it turns out there's an easier way, using the less common
 # Hash.new method:

 h = Hash.new(0)
 a.each{|o| h[o.name] += 1}

 # (or using a reduce to get it down to one line)

 h = a.reduce(Hash.new(0)){|h, o| h[o.name] += 1; h}

 # What's going on here? When we supply an argument to Hash.new, we
 # provide a default value that's used for any key that's not already
 # in the hash. When we omit the argument (or use the short-hand form
 # {}) the default value is set to nil.

 g = Hash.new(5)
 h = {} 
 g["nancy"] = 10

 g["nancy"] # => 10
 g["sean"]  # => 5
 h["sean"]  # => nil
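
 # A small sketch of one consequence (counts is only an illustrative
 # name): reading a missing key returns the default but does not store
 # it, so the key still does not exist afterwards.

 counts = Hash.new(0)
 counts["sean"]        # => 0
 counts.key?("sean")   # => false
 counts.size           # => 0
 counts.default        # => 0
 counts["sean"] += 1   # assignment is what actually creates the key
 counts.key?("sean")   # => true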

 # However, care should be taken when using mutable objects as default
 # values, as the following example shows:

 h = Hash.new([])
 h["mary"] << 5
 h["abigail"] << 10

 h["mary"]    # => [5, 10]
 h["abigail"] # => [5, 10]

 # What's going on here? We're setting a particular object--an instance
 # of the Array class--as the default value. This means that the exact
 # same array is returned for every missing key of the Hash, and <<
 # mutates that one array without ever storing a key. Needless to say,
 # this probably isn't what you want. But there is another, more
 # powerful form of Hash.new that solves this exact problem:
 # Hash.new {|hash, key| ... }. Using the block form, we can fix our
 # above example:

 h = Hash.new{|hash, key| hash[key] = []}

 h["mary"] << 5
 h["abigail"] << 10

 h["mary"]    # => [5]
 h["abigail"] # => [10]
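
 # To see the difference concretely, here is a sketch comparing the two
 # forms (shared and fresh are illustrative names, not part of the
 # example above). equal? tests object identity.

 shared = Hash.new([])
 shared["mary"] << 5
 shared["mary"].equal?(shared["abigail"])  # => true, one array for all keys
 shared.keys                               # => [], nothing was ever stored

 fresh = Hash.new{|hash, key| hash[key] = []}
 fresh["mary"] << 5
 fresh["mary"].equal?(fresh["abigail"])    # => false, one array per key
 fresh.keys                                # => ["mary", "abigail"]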

 # Now we're producing a new array for every key. This sort of
 # thing can be useful if we have a list of objects each with names and
 # values, and we want a list of the values corresponding to each
 # name. We can now write:

 h = Hash.new{|h, k| h[k] = []}
 a.each{|o| h[o.name] << o.value}
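
 # A runnable sketch of the same idea. Item and items are hypothetical
 # stand-ins for the objects in a, assumed to respond to name and value.

 Item = Struct.new(:name, :value)
 items = [Item.new("mary", 5), Item.new("abigail", 10), Item.new("mary", 7)]

 by_name = Hash.new{|h, k| h[k] = []}
 items.each{|o| by_name[o.name] << o.value}

 by_name  # => {"mary" => [5, 7], "abigail" => [10]}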

 # The possibilities are endless. Effectively, the block form makes
 # hashes into lightweight methods, much like proc and lambda, with the
 # added benefit that we get memoization for free. For example, let's
 # say we want a function to calculate the Fibonacci numbers. We all
 # know the formula: f(n) = f(n-1) + f(n-2), with f(0) = f(1) = 1. We
 # can compute it naively like this:

 def f(n)
  case n
  when 0 then 1
  when 1 then 1
  else
    f(n-1) + f(n-2)
  end
 end

 # This works fine for small values of n. f(20) takes less than three
 # hundredths of a second on my MacBook Pro. But it quickly becomes
 # useless: f(30) takes 0.34 seconds, but I got bored waiting for f(40)
 # to finish. Is calculating the Fibonacci sequence fundamentally a hard
 # problem, where we can't expect to find a good solution? Or do we
 # just need to be smarter about it? Consider the following expansion
 # of the n = 5 case:

 # f(5) = f(4) + f(3)
 #      = f(3) + f(2) + f(2) + f(1)
 #      = f(2) + f(1) + f(1) + f(0) + f(1) + f(0) + f(1)
 #      = f(1) + f(0) + f(1) + f(1) + f(0) + f(1) + f(0) + f(1)
 #      = 8

 # While we got the right answer, notice how we kept repeating the same
 # calculations. Even in the trivial case of n = 5, we calculated f(3)
 # twice, and f(2) three times. Since larger instances of the problem
 # depend on the results of smaller ones, we can save ourselves an
 # enormous amount of time by memoizing the function, which simply
 # means saving the results of our earlier runs for later. The Ruby
 # hash is a natural at this. Here's a much more efficient and yet
 # concise implementation of the above:

 f = Hash.new do |h, k|
  case k
  when 0 then h[0] = 1
  when 1 then h[1] = 1
  else
    h[k] = h[k-2] + h[k-1]
  end
 end

 # With our new and improved Fib, we can easily compute f[5000]:
 def time; t = Time.now; yield ; Time.now - t; end
 time{f[5000]} # => 0.033939
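
 # As a rough check on where the savings come from, this sketch counts
 # how often each version does work for n = 20. The names slow_fib,
 # fast_fib and the two counters are illustrative only.

 $slow_calls = 0
 def slow_fib(n)
   $slow_calls += 1
   n < 2 ? 1 : slow_fib(n-1) + slow_fib(n-2)
 end

 block_runs = 0
 fast_fib = Hash.new do |h, k|
   block_runs += 1
   h[k] = k < 2 ? 1 : h[k-2] + h[k-1]
 end

 slow_fib(20)  # => 10946
 $slow_calls   # => 21891
 fast_fib[20]  # => 10946
 block_runs    # => 21, one block call per key from 0 to 20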

 # I should take this time to note that there is a very simple
 # iterative algorithm for Fibonacci, and you should never actually use
 # a recursive algorithm to compute it. However, this technique can be
 # very useful for the many recursive algorithms which cannot be
 # trivially made iterative.
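
 # For reference, a minimal sketch of that iterative version, keeping
 # the f(0) = f(1) = 1 convention used above (fib_iter is an
 # illustrative name):

 def fib_iter(n)
   a, b = 1, 1                  # f(0) and f(1)
   n.times{ a, b = b, a + b }   # slide the pair forward one step
   a
 end

 fib_iter(5)   # => 8
 fib_iter(20)  # => 10946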