@subratrout
Forked from JoshCheek/modules.md
Created March 3, 2016 23:19
Modules (lesson from 29 July 2015) covers object model, namespaces, mixins, and functions.

Modules

To understand modules, we first have to understand a little bit about how Ruby works. So, let's define the things that exist in Ruby and see how they work together. Then, we will be able to understand how modules fit into this, and what they are here for.

Definitions

Any time you have a name, you have a hash

Any time you refer to something by name, you can think of the name as a "key", and the thing you are referring to as a "value". Then you can represent this with anything which can represent key/value pairs. By far the most common way to do this, especially in Ruby, is to use a hash. And we find that this is what makes up the things we see in Ruby.

Objects -- the nouns of Ruby

An object is a hash with symbols for keys. It has the keys :class and :instance_variables.

The purpose of an object is to store data, via its instance variables.

If we think of the object as a noun, we need some verbs to act on it! The verbs are methods, and those are stored in classes, not objects, so we give each object the :class key, in order to locate the methods we can call on it.

{ class:             nil, # in reality, not nil, but one of the hashes defined below
  instance_variables: {
    :@year  => 1994,      # some example instance variables.
    :@make  => "Toyota",  # in reality, the values are other objects (other hashes like this one)
    :@model => "Camry",   # but for simplicity, I'll write them like this.
  }
}
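
You can peek at both of these "keys" on a real object with Ruby's reflection methods. Here is a minimal sketch, assuming a throwaway Car class that I'm making up just for this illustration:

```ruby
# Hypothetical Car class, used only to illustrate the two "keys" of an object.
class Car
  def initialize(year, make, model)
    @year  = year
    @make  = make
    @model = model
  end
end

camry = Car.new(1994, "Toyota", "Camry")

# The :class "key"
camry.class              # => Car

# The :instance_variables "key"
camry.instance_variables # => [:@year, :@make, :@model]

# Rebuild something shaped like the hash above from the live object
as_hash = {
  class:              camry.class,
  instance_variables: camry.instance_variables.map { |name|
    [name, camry.instance_variable_get(name)]
  }.to_h,
}

as_hash[:class]                      # => Car
as_hash[:instance_variables][:@make] # => "Toyota"
```

The hash is only a way of thinking about the object. Ruby does not literally hand you one, which is why the sketch has to build it by asking the object questions.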

Classes -- the verbs of Ruby

A class is a hash with symbols for keys. It has the keys :superclass, :methods, and :constants.

A class is also an object, which means that it also has the keys that objects have: :class and :instance_variables.

The purpose of the class is to store the instructions for operating on an object.

Each car's year may be different, which is why we store it on the object. But each car's set of steps for incrementing the odometer is the same: @odometer += 1. These are methods... functions... steps... instructions... verbs.

If we had put them on the car itself, there would be frivolously redundant sets of instructions on every object. So we put the methods in one common place where all cars can go to find them: the class.

{ # class stuff
  superclass: nil, # Look here if I don't have the method you want.
  methods:    {},  # keys are method names, as symbols, values are the method bodies
  constants:  {},  # keys are constant names, as symbols (eg :Object, :String),
                   # values are any Ruby object (ie any hash with a class and instance variables)

  # object stuff
  class:              nil,
  instance_variables: {},
}
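
Each of those keys has a real method you can ask a class about. A small sketch, again with a made-up Car class (the WHEELS constant and the drive method are assumptions for the example):

```ruby
# Hypothetical class for illustration.
class Car
  WHEELS = 4 # lands in the class's constants

  def drive  # lands in the class's methods
    @odometer = (@odometer || 0) + 1
  end
end

# class stuff
Car.superclass              # => Object
Car.instance_methods(false) # => [:drive]
Car.constants               # => [:WHEELS]

# object stuff: a class is itself an object
Car.class                   # => Class
Car.instance_variables      # => []

# Two cars share one set of instructions but keep separate data
a = Car.new
b = Car.new
a.drive
a.drive
b.drive
a.instance_variable_get(:@odometer) # => 2
b.instance_variable_get(:@odometer) # => 1
```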

Note that method bodies are instructions to Ruby to do things like "get an instance variable", "set a local variable", "call a method", etc. If you want to see what they look like, here is some code to do it: https://gist.github.com/JoshCheek/a8e9dbb6f54fd1a69216

Modules

A module is a class without the superclass.

The purpose of a module is to store constants for namespacing, and hold methods for "mixing into" classes.

We'll see why we want these, what they are, and how they work down below.

{ # class stuff
  methods:    {},  # keys are method names, as symbols, values are the method bodies
  constants:  {},  # keys are constant names, as symbols (eg :Object, :String),
                   # values are any Ruby object (ie any hash with a class and instance variables)

  # object stuff
  class:              nil,
  instance_variables: {},
}
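
Ruby itself agrees with the "class without the superclass" description, as far as I can tell from poking at it. A quick sketch, using a made-up Greeting module:

```ruby
# Hypothetical module for illustration.
module Greeting
  DEFAULT = "hello" # constants: yes

  def greet         # methods: yes
    DEFAULT
  end
end

Greeting.constants                # => [:DEFAULT]
Greeting.instance_methods(false)  # => [:greet]
Greeting.class                    # => Module

# No superclass, and no way to instantiate it
Greeting.respond_to?(:superclass) # => false
Greeting.respond_to?(:new)        # => false

# Ruby's own hierarchy reflects the relationship:
# Class inherits from Module, then adds the superclass and the ability to make instances
Class.superclass                  # => Module
```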

Bindings -- sentences, maybe, lol

A binding is a hash with symbols for keys. It has the keys :self, :local_variables, :return_value, :next_binding.

The purpose of a binding is to store the information we need to actually execute the code.

If you think of objects as nouns and classes as verbs, the binding might be your sentence. If you think of objects as ingredients and classes as recipes, the binding might be your kitchen, your counter, a mixing bowl, your oven, a cutting board... anything which facilitates the preparation of the food according to the recipe.

  • If a method says to set an instance variable, we need to know which object to set it on. So the binding has a :self.
  • A method might need a variable that nothing else needs. So the binding has a hash of :local_variables.
  • When the method is finished, it wants to send the result of its calculations back to the code that called it, so the binding has a :return_value.
  • When the method is done being executed, we want the code that called it to resume execution, so the binding has a :next_binding -- the one that called it.
{ self:            nil,      # where to set/get instance variables, find `self`, and call "implicit" methods
  local_variables: {num: 1}, # keys are variable names, values are objects (things with classes and instance variables)
  return_value:    nil,
  next_binding:    nil,
}
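
Ruby exposes part of this through the Binding class. As far as I know, it only lets you see the :self and :local_variables pieces; the return value and the link to the calling binding stay internal. A sketch, with a made-up Counter class and peek method:

```ruby
# Hypothetical class and method names, for illustration only.
class Counter
  def peek(num)
    doubled = num * 2
    binding # hand the current binding back to the caller
  end
end

counter = Counter.new
b = counter.peek(1)

b.receiver.equal?(counter)     # => true (the binding's self)
b.local_variable_get(:num)     # => 1
b.local_variable_get(:doubled) # => 2
b.local_variables.sort         # => [:doubled, :num]
```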

The stack is a linked list of bindings

The stack points at the "head" of a linked list. This is where we are currently executing code.

When we call a method, we put a new binding on with:

  • :self set to the object we called the method on
  • :local_variables set to a hash with the argument names as keys and whatever we passed in as values
  • :return_value set to nil (this is why empty methods return nil)
  • :next_binding set to the binding that called it

When we leave a method, we remove the binding at the head of the list, causing us to resume executing code at the old binding.

We call it a stack because it nicely fits the metaphor of... stacks of things. If you have a stack of pancakes, and you put another one on (often called pushing), then you can't get to the one that used to be on top until you take the new one back off (often called popping). Anything with this behaviour of "the last thing I put in is the first thing I get out" is a stack. In our case, it's done with a linked list of hashes.
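
Pushing and popping can be modelled directly with the binding hashes from above. This is a toy sketch of the idea, not how Ruby implements it; push_binding and pop_binding are names I made up:

```ruby
# Toy model: each binding is a hash, and :next_binding links it
# to the binding that called it. Helper names are hypothetical.
def push_binding(stack_head, receiver, locals)
  { self:            receiver,
    local_variables: locals,
    return_value:    nil,        # why empty methods return nil
    next_binding:    stack_head, # the caller's binding
  }
end

def pop_binding(stack_head)
  stack_head[:next_binding]
end

main  = push_binding(nil, :main, {})
stack = main                                         # executing at the toplevel

stack = push_binding(stack, :camry, { key: "mine" }) # calling a method on camry
stack[:self]                                         # => :camry
stack[:next_binding].equal?(main)                    # => true

stack = pop_binding(stack)                           # leaving the method
stack.equal?(main)                                   # => true
```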

Constant Namespacing

Okay, now that we know what the basic structures of Ruby are, we can finally talk about modules :P

A namespace is a place you store things that have names. As we learned earlier, any time we are tracking things by their name, we have a hash and use the name as the key.

The purpose of a namespace is to differentiate multiple things that have the same name.

  • Each object is a namespace for its instance variables (this is why each car has its own @mileage)
  • Each binding is a namespace for its local variables.
  • Each class is a namespace for its methods.
  • Each class is a namespace for its constants.

It's this last one that modules are here to help with. A constant is any bareword that begins with an uppercase letter. A bareword is any sequence of characters in your code that begins with a letter and contains only letters, numbers, and underscores (ie not a "string", :symbol, or @instance_variable, but a bare_word or a BareWord).

The problem: collisions

What happens when two different classes have the same name? Well, when we define each of them, the class gets stored in some class or module's :constants. By default, that class will be Object.
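
You can check where a toplevel class ends up by asking Object for its constants. A small sketch, with an empty made-up Car class:

```ruby
# Hypothetical toplevel class.
class Car
end

# The name :Car is now a key in Object's constants
Object.constants.include?(:Car)    # => true
Object.const_get(:Car).equal?(Car) # => true

# Builtin classes live there too
Object.const_get(:String)          # => String
```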

So in the example below, the vehicle car and the train car are both adding the start method like this: Object[:constants][:Car][:methods][:start] = <code>. That means they are unintentionally modifying the same class, when they each expect to be modifying a unique class.

# This is my vehicle car
class Car
  def initialize(year, make, model)
    @year  = year
    @make  = make
    @model = model
  end

  def start
    "turn the key"
  end
end

# it starts like I expect
camry = Car.new(1994, 'Toyota', 'Camry')
camry.start # => "turn the key"


# This is my train car
class Car
  # notice it already has a start method!
  # that's because it is accidentally editing
  # the vehicular car class instead of making a new one.
  instance_methods(false) # => [:start]

  # some train methods
  def load_cargo
  end

  def start
    "Shovel in the coal!"
  end
end

# and now, when our camry goes to start, it starts like a train >.<
camry.start # => "Shovel in the coal!"

The solution: put them in different :constants hashes

Instead of putting them both in Object, we can make a class or a module whose job is to hold all the names of all the things related to that topic.

So now, our vehicular car and our train car are separated.

module Vehicle
  # This car will be stored in Vehicle's constants
  class Car
    def start
      self # => #<Vehicle::Car:0x007f91fc010398>
      'Turn the key'
    end
  end
end

module Train
  # This car will be stored in Train's constants
  class Car
    def start
      self # => #<Train::Car:0x007f91fc00b208>
      'Shovel in the coal!'
    end
  end
end

# We can access the Car constant through the namespacing module by using 2 colons
Vehicle::Car.new.start # => "Turn the key"
Train::Car.new.start   # => "Shovel in the coal!"

We could make Train and Vehicle classes, since they have constants, too, but that would imply you were supposed to say Vehicle.new. But we are only interested in the Vehicle and Train as a namespace for our constants.
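
To see that the two Car classes really do live in different :constants hashes, you can ask each namespace directly. A sketch, with the class bodies left empty to keep it short:

```ruby
module Vehicle
  class Car; end
end

module Train
  class Car; end
end

# Each module has its own :Car key
Vehicle.constants # => [:Car]
Train.constants   # => [:Car]

# Same name, different classes
Vehicle::Car.equal?(Train::Car)              # => false
Vehicle.const_get(:Car).equal?(Vehicle::Car) # => true
Vehicle::Car.name                            # => "Vehicle::Car"

# And because Vehicle is a module, nobody is invited to instantiate it
Vehicle.respond_to?(:new)                    # => false
```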

Other examples

This is a very common thing to do. Think of it as etiquette: you don't want other code to pollute Object's constants with every little class it happens to make, so you should be kind and avoid doing this as well :)

For example, one of my gems, Seeing Is Believing (the one that updates the values in my editor), has a file that defines all of its errors, and they are namespaced inside of SeeingIsBelieving.

Look at all these constants Minitest has in its namespace!

require 'minitest'  # => true
Minitest.constants  # => [:Parallel, :VERSION, :ENCS, :Runnable, :AbstractReporter, :Reporter, :ProgressReporter, :StatisticsReporter, :SummaryReporter, :CompositeReporter, :Assertion, :Skip, :UnexpectedError, :Guard, :BacktraceFilter, :Test, :Assertions, :Unit]

And look how many different parsing libraries ship with Ruby itself!

# Parsing is the act of figuring out what a string means.
# For example, your Ruby code is a string, there is a class that parses it to find documentation
require 'rdoc'  # => true
RDoc::Parser    # => RDoc::Parser

# and a class that parses RSS feeds
require 'rss'  # => true
RSS::Parser    # => RSS::Parser

# and one that parses your parser defining code (O.o)
require 'racc/parser'  # => true
Racc::Parser           # => Racc::Parser

# Those are all in Ruby, but there's even a gem named parser, for parsing Ruby code
# This one gets to define itself on Object,
# b/c the convention is that the gem gets the toplevel constant with its name
require 'parser'  # => true
Parser            # => Parser

Mixins

To understand this one, you need to understand what happens when you call a method. When you have something like camry.start, you can look at that and know that you are calling a method (because it has a single dot in it). The object we're calling the method on is whatever camry evaluates to. If it's a local variable, Ruby will look it up; if it's a method, it will be whatever that method returns. It can't be anything else, because it is "bare", meaning it has no fancy things on it like quotes for a String or an @ for an instance variable. So camry has the data (in its :instance_variables), and there is a method out there named start, which has the instructions.

What happens when you call a method

When you call a method... every time, for ever, always... you call it on an object. Objects don't have methods, so Ruby goes and looks at the object's class. If it finds the method there, cool. If not, it looks at the superclass. If it finds the method there, cool. If not, it looks at that class's superclass. It continues this process until it finds the method or there are no more superclasses. If it did not find the method, you'll see a NoMethodError (there's a little bit more that happens first, but it's not really relevant). If it did find the method, it makes a new binding and puts that on the stack, with the argument names as keys in the binding's :local_variables, the arguments as the values, the object that the method was called on as :self, and the :return_value as nil. Then it proceeds to execute the code.
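
The lookup walk can be sketched on the hashes from the Definitions section. This is a toy model of the idea rather than Ruby's implementation, and find_method is a helper name I made up:

```ruby
# Toy model: follow :class, then keep following :superclass
# until some :methods hash has the name. Not a real Ruby API.
def find_method(object, name)
  klass = object[:class]
  while klass
    body = klass[:methods][name]
    return body if body
    klass = klass[:superclass]
  end
  nil # real Ruby would end up raising NoMethodError
end

vehicle = { superclass: nil,     methods: { start: "turn the key" } }
truck   = { superclass: vehicle, methods: { haul:  "load the bed" } }
f150    = { class: truck, instance_variables: {} }

find_method(f150, :haul)  # => "load the bed" (found on the class)
find_method(f150, :start) # => "turn the key" (found on the superclass)
find_method(f150, :fly)   # => nil            (ran out of superclasses)
```

The method bodies here are just strings standing in for instructions, which is enough to show the order in which the hashes get searched.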

The problem: I want methods from something that isn't my superclass

Using the definition for Vehicle::Car that we had above, we might make a subclass named Vehicle::Truck. It starts like a car, so maybe we "subclass" Vehicle::Car (aka set its :superclass pointer to Vehicle::Car).

Now, there's another class called Mudding, and it has the go_mudding method, which any respectable truck should surely be able to do. But the only way we can use that method is if we set our superclass to it, and ours is already set to Vehicle::Car. We could change Vehicle::Car's superclass, but cars don't go mudding. So what, then? We can't go mudding?!

The solution: The "Included Class"

The thing to realize here is that a class's methods are not inherently a part of the class. Instead, they are values for the :methods key. So, if we could operate at the level of these hashes, as Ruby does, then we could make a new class that no one is using yet, set its :superclass to Vehicle::Car, and set Vehicle::Truck's superclass to it (if you like thinking about these as linked lists, like I do, then that means "insert a node at the head"). Then, because we keep looking in the chain of superclasses for the method, we can still do all the things we used to do (call whatever methods we used to be able to call). BUT! we can also do whatever this new class can do. And if we go set its :methods to point at Mudding's methods... well, then when we search its :methods hash, we are actually searching the methods defined by Mudding!

This is actually what happens! Ruby calls that class an "included class", and that is what a "mixin" is.

It calls it an included class, because the way we tell it to do this is by going to our class, and saying include TheModuleName. Let's try it out!

# put methods into a module
module Mudding
  def go_mudding
    'get dirty'
  end
end

module Vehicle
  class Car
    def start
      'Turn the key'
    end
  end

  # ...and include them into a class. Now, like magic...
  # you have those available! (NOTE: NOT FUCKING MAGIC! http://www.infinitelooper.com/?v=Iq-FV97GRCw#/795;808)
  class Truck < Car
    include Mudding
  end
end

truck = Vehicle::Truck.new
truck.start      # => "Turn the key"
truck.go_mudding # => "get dirty"
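
You can see where the included class landed by asking for the lookup order. A sketch that repeats the definitions so it runs on its own:

```ruby
module Mudding
  def go_mudding
    'get dirty'
  end
end

module Vehicle
  class Car
    def start
      'Turn the key'
    end
  end

  class Truck < Car
    include Mudding
  end
end

# The lookup order: Mudding sits between Truck and Car
Vehicle::Truck.ancestors.take(3) # => [Vehicle::Truck, Mudding, Vehicle::Car]

# The superclass method skips over the included class
Vehicle::Truck.superclass        # => Vehicle::Car

# Where each method was actually found
Vehicle::Truck.instance_method(:go_mudding).owner # => Mudding
Vehicle::Truck.instance_method(:start).owner      # => Vehicle::Car
```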

So now, we inherited from Vehicle::Car, but we added another class that gives us access to Mudding's methods. And we can even take those methods and share them with other things that go mudding, observe:

# another class of mudders!
module ThePrivileged
  class OldRichWhiteLadies
    include Mudding
  end
end

# brilliant!
lady = ThePrivileged::OldRichWhiteLadies.new
lady.go_mudding # => "get dirty"

# But, note how sad our lady must be, that despite her reprobatic behaviour,
# her parents are not disappointed!
begin
  lady.disappoint_your_parents # =>
rescue NoMethodError => e
  e # => #<NoMethodError: undefined method `disappoint_your_parents' for #<ThePrivileged::OldRichWhiteLadies:0x007fe7f20fb810>>
end

# Since the included class and the module share the same hash table of methods,
# we can aid her iconoclastic quest:
module Mudding
  def disappoint_your_parents
    'Yeah, what now, mom?!' # actual fact: a question-mark followed by an exclamation mark
  end                       # is called an "interro-bang"
end

lady.disappoint_your_parents # => "Yeah, what now, mom?!"

Functions

A function is a method that doesn't use instance variables (in Ruby, anyway).

For example, puts is going to do something useful without modifying your current object.

puts 'hello world'

# >> hello world

There are certain properties that emerge from this. I've found that code is often much more maintainable when written in this style, because it becomes much easier to see the flow of data through the program.

If we held ourselves to this constraint, we would only use local variables, and as a consequence, we wouldn't need an object (because we wouldn't need instance variables), we would only need a method (because we would need instructions). But methods are stored in classes and must be called on objects, so we'll still have to instantiate some class, right?

Well, without going into all the details, you can place a method on any object through what is called a "singleton class". Since a module is an object (it has :instance_variables and a :class pointer), we can put the method on it. Then, the module becomes a convenient place to stick some useful piece of code that doesn't belong on any particular object.

Here, we'll create a "singleton method" named i_has_self?, and place it on the module Denial. This allows us to call it directly on Denial, without instantiating any object:

module Denial
  def self.i_has_self?
     false
  end
end

Denial.i_has_self? # => false
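
If you're curious where that method went, you can ask the module about its singleton class. A sketch, repeating the definition so it runs on its own:

```ruby
module Denial
  def self.i_has_self?
    false
  end
end

# The method lives on Denial's singleton class...
Denial.singleton_methods                       # => [:i_has_self?]
Denial.singleton_class.instance_methods(false) # => [:i_has_self?]

# ...not in the methods that Denial would mix into a class
Denial.instance_methods(false)                 # => []
```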

Here are some places that Ruby has done this:

Math.sin(0)                            # => 0.0
Math.cos(0)                            # => 1.0
RbConfig.ruby                          # => "/Users/josh/.rubies/ruby-2.2.2/bin/ruby"
Kernel.rand                            # => 0.656340249215726
Kernel.require('seeing_is_believing')  # => true

Here is my Enigma. You'll notice that I wrote most of the functionality in this style. And notice how easy it becomes to test them: I just give the function the input, and it gives me back the output, which I assert against.
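
To give a feel for that input/output style of testing, here is a small sketch. It is a made-up Caesar cipher, not code from the Enigma project, and the names Caesar, rotate, and encrypt are my own assumptions:

```ruby
# Hypothetical example, not taken from the Enigma project.
module Caesar
  ALPHABET = ('a'..'z').to_a

  # Output depends only on the arguments, no instance variables.
  def self.rotate(char, offset)
    index = ALPHABET.index(char)
    return char unless index # leave non-letters alone
    ALPHABET[(index + offset) % ALPHABET.length]
  end

  def self.encrypt(message, offset)
    message.chars.map { |char| rotate(char, offset) }.join
  end
end

# "Testing" is just: give it input, compare the output
Caesar.rotate('a', 1)                          # => "b"
Caesar.rotate('z', 1)                          # => "a"
Caesar.encrypt('hello', 3)                     # => "khoor"
Caesar.encrypt(Caesar.encrypt('hello', 3), -3) # => "hello"
```

There is nothing to set up and nothing to reset between calls, which is most of why functions like these are pleasant to test.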

Summary

There is a finite set of things in Ruby, with a finite set of rules. All the patterns you see will emerge from them, and if you understand them, then you can make sense of anything new that you see in Ruby.

One of these things is modules. We can use them as namespaces, avoiding collisions of classes with the same name by placing the classes into different :constants hashes.

Another thing is "mixins", where we can include SomeModule which places a new class into our inheritance hierarchy.

And the last thing is "functions", where we can have methods that do useful things, but without modifying any instance variables. We can put these on the module itself with def self.method_name; end.
