July 9, 2014

Introduction

The primary lesson I want to go over today is separation of concerns.

When coding, there are often many different tasks or components that make up the entire program. For example, gathering user input, displaying data, running algorithms, and storing data are all examples of different "tasks" or "concerns".

One problem many beginning coders have is a tendency to "blur" these concerns together, and end up writing functions that try and do too many things at once.

For example, this project had two main "concerns" -- input/output, and manipulating the events list. In many cases, you ended up including both IO and event list manipulation in the same function.

A better solution would be to separate the IO portions of the code from the event list manipulation portions. There are several benefits of doing so:

It's easier to test and debug your program.
It helps prevent spaghetti code.
It allows you to easily change or modify various components in your code without effecting the rest of the program.

Now, this is all a bit abstract. Let's look at how the concept of "separation of concerns" applies to your code.

For reference, here's your original code:

#!/usr/bin/env python

import re

class Event:
    def __init__(self, name, time, description):
        self.name = name
        self.time = time
        self.description = description

def add_event(event_list):
    name = raw_input("Event name? ")
    description = raw_input("Event description? ")
    while True:
        time = raw_input("Event time? Please input like hour:minute. For example, 1:45 am would be 0145 and 11:30 pm would be 2330: ")
        if re.match('[0-1][0-9][0-5][0-9]', time) is None and re.match('[2][0-3][0-5][0-9]', time) is None:
            print "Invalid input. Try again... "
        else:
            break
    event = Event(name, time, description)
    event_list.append(event)
    return sorted(event_list, key=lambda event: int(event.time))
    
def view_events(event_list):
    for event in event_list:
        print "\n--------------------------------\n"
        print "Name: {0}".format(event.name)
        print "Time: {0}".format(event.time)
        print "Description: {0}".format(event.description)
    print "\n--------------------------------\n"

def delete_event(event_list):
    for i in range(len(event_list)):
        print "{0}. {1}".format(i, event_list[i].name)
    while True:
        deleted_event = raw_input("To delete an event, enter the number that the event name appears next to: ")
        try:
            deleted_event = int(deleted_event)
        except ValueError:
            print "Invalid input. Try again..."
            continue
        if int(deleted_event) not in range(len(event_list)):
            print "Invalid input. Try again..."
        else:
            break
    del event_list[deleted_event]
    print "Event deleted."
    
def main():
    print "Welcome to the event organizer!\n"
    event_list = []
    while True:
        option = raw_input("To add an event, enter 'a'. To delete an event, enter 'd'. To view your event list, enter 'v'. To quit, enter 'q': ")
        option = option.lower()
        if option == 'a':
            event_list = add_event(event_list)
        elif option == 'd':
            if event_list == []:
                print "There are no events to delete!"
            else:
                delete_event(event_list)
        elif option == 'v':
            view_events(event_list)
        elif option == 'q':
            break
        else:
            print "Invalid input! Try again... "
            
if __name__ == '__main__':
    main()

This time around, rather then evolving the code in pieces, what I'll do is give you my final version and discuss the differences between yours and mine. Here it is. Take a moment to read through it and understand what it's doing.

#!/usr/bin/env python

import re
import sys
            
## Models ## 

class Event(object):
    def __init__(self, name, description, time):
        self.name = name
        self.description = description
        self.time = time 
        
class EventList(object):
    def __init__(self):
        self.events = []
        
    def add(self, name, description, time):
        event = Event(name, description, time)
        self.events.append(event)
        self.events = sorted(self.events, key=lambda event: int(event.time))
        
    def delete(self, index):
        del self.events[index]
        
    def __len__(self):
        return len(self.events)
        
    def __iter__(self):
        return self.events.__iter__()
        

## Utility methods ## 

def clean_input(prompt, is_valid):
    while True:
        string = raw_input(prompt)
        if is_valid(string):
            return string
        else:
            print 'Invalid input. Try again...'

def is_valid_time(string):
    before_2000 = re.match('[0-1][0-9][0-5][0-9]', string) is not None
    before_2400 = re.match('2[0-3][0-5][0-9]', string) is not None 
    return before_2000 or before_2400
    
def is_int(string):
    try:
        int(string)
        return True 
    except ValueError:
        return False 
    
def is_valid_index(string, min_val, max_val):
    return is_int(string) and (min_val <= int(string) <= max_val)
    

## Input/output ##
        
class Menu(object):
    def __init__(self, events):
        self.events = events
        self.options = {
            'a': ['Add an event', self.add_event],
            'd': ['Delete an event', self.delete_event],
            'v': ['View event list', self.view_detailed],
            'q': ['Quit', sys.exit]
        }
        
    def add_event(self):
        name = raw_input('Event name? ')
        description = raw_input('Event description? ')
        time = clean_input('Event time? Please input in form `hhmm`. ', is_valid_time)
        
        self.events.add(name, description, time)
        
    def delete_event(self):
        if len(self.events) == 0:
            print 'There are no events to delete!'
            return 
            
        self.view_simple()
        
        def partial(string):
            return is_valid_index(string, 0, len(self.events))
        
        index = clean_input('To delete an event, enter the number: ', partial)
        self.events.delete(int(index))
        
        print 'Event deleted.'
        
    def view_detailed(self):
        separator = '\n--------------------------------\n'        
        for event in self.events:
            print separator
            print 'Name: ' + event.name
            print 'Time: ' + event.time 
            print 'Description: ' + event.description
        print separator
        
    def view_simple(self):
        for index, event in enumerate(self.events):
            print '{0}, {1}'.format(index, event.name)
            
    def list_options(self):
        print 'Options:'
        for option, (description, cmd) in self.options.items():
            print '{0}: {1}'.format(option, description)
        print ''
            
    def main_loop(self):
        is_valid_option = lambda string: string in self.options
        
        print 'Welcome to the event organizer!'
        print ''
        while True:
            self.list_options()
            option = clean_input('Choice: ', is_valid_option)
            self.options[option][1]()
            print ''

def main():
    events = EventList()
    menu = Menu(events)
    menu.main_loop()

if __name__ == '__main__':
    main()

Ok, great. Now what I'll do is provide a commented version of the code that provides explanation of the changes I made. This is the exact same code from above, except with comments.

Read from the top down -- I've tried to lay out the code so it makes logical sense to read straight downward without jumping around, like a book.

#!/usr/bin/env python

import re
import sys
       
            
## Models ## 

# The first part of the code comprises of 'models', which comprises mostly of the 
# event list manipulation code. 
#
# A 'model' is in fact a generic term for objects or things that represent the 
# underlying "concept" that the program deals with. 
#
# Models typically contain only data, and methods that manipulate said data. 
# In more complex programs, models often (though not always) correspond to 
# entries in a database and provide a bridge between the database and the code.
#
# Here, they simply contain data.

class Event(object):
    '''
    Previously, you correctly identified that an event should be presented 
    by an `Event` object. The only change I made was to make the class 
    explicitly inherit from `object.
    
    This is another Python idiom. Unfortunately, there are two kinds of 
    "classes" in Python 2 -- old-style classes, and new-style classes. You 
    create an old-style class by doing `class Event:`. You create a new-style 
    class by doing `class Event(object):`. 

    You should always use new-style classes -- old-style classes were retained 
    in Python 2.x solely for the purposes of backwards compatibility. (They were 
    finally permanently removed in Python 3).
    '''
    def __init__(self, name, description, time):
        self.name = name
        self.description = description
        self.time = time 
        
class EventList(object):
    '''
    I also added an `EventList` object in lieu of using a plain list so that we 
    now have a convenient wrapper for any event list manipulations we might want 
    to do.
    
    Internally, it consists of only a list, but we wrap some extra behavior around 
    it.
    '''
    def __init__(self):
        self.events = []
        
    def add(self, name, description, time):
        '''
        For example, whenever we want to add an entry, we need to create an Entry 
        object, insert it into a list, and re-sort the list. Rather then having to 
        remember to do so each time, we can wrap it into a method so that we no 
        longer need to remember these details whenever adding a new event.
        '''
        event = Event(name, description, time)
        self.events.append(event)
        self.events = sorted(self.events, key=lambda event: int(event.time))
        
    def delete(self, index):
        del self.events[index]
        
    def __len__(self):
        '''
        However, by creating a custom object, we can no longer obtain the length 
        of the list by simply doing `len(events)`. 
        
        Of course, we could provide our own `length` method, but it seems 
        inconsistent to have to do `obj.length()` for custom objects but `len(obj)`
        for builtins (lists, strings, dicts, sets, etc).
        
        Python has the concept of 'magic methods' to help deal with this 
        inconsistency. Under the covers, the `len` function is essentially defined 
        to be:
        
            def len(obj):
                return obj.__length__()
                
        The `len` method calls the `obj.__length__()` method! Lists, strings, etc. 
        already have a __length__ method -- we just need to add our own to our 
        custom object.
        '''
        return len(self.events)
        
    def __iter__(self):
        '''
        Similarly, by creating a custom object, we lose the ability to easily iterate
        over the list. The solution to that is to make the class have an `__iter__`
        method that returns an interator.
        
        Whenever we do:
        
            for item in obj:
                print item
                
        ...Python will automatically call `obj.__iter__()` and use the iterator to 
        determine what the next item is and when to stop. Lists, dicts, sets, strings, 
        and other sequences in Python already have an `__iter__` method, but we need 
        to define a custom one for our custom class.
        
        However, because our data is internally a list, I'm simply just returning the 
        list's iterator instead of making my own. 
        '''
        return self.events.__iter__()
        
        
## Utility methods ## 

# The next part of the program consists of various utility methods, primarily 
# dealing with validation. 
#
# While reading through your code, I saw that you had a tendency to repeat this 
# same snippit of code:
#
#   while True:
#       user_input = raw_input(some_prompt)
#       if user_input is broken:
#           print 'Invalid input. Try again... '
#       else:
#           break
#   use user_input
#
# This is somewhat difficult to abstract into a separate method. We can easily 
# create a function that accepts a prompt and performs this check, but how do you 
# turn the check to see if the user input is broken into a parameter?

def clean_input(prompt, is_valid):
    '''
    Functional programming comes to the rescue! We're used to the concept of 
    passing values and objects as parameters, but did you know that in Python,
    we can pass in functions?
    
    For example, this `clean_input` method has two parameters -- one for the 
    prompt, and another that accepts any arbitrary function. Every time the user 
    types in something, we use this arbitrary function to determine whether or 
    not the input is valid or not. 
    
    That way, we can easily do something like this:
    
        time = clean_input("What time were you born?", is_valid_time)
        age = clean_input("How old are you?", is_int)
        
    ...and our code will automatically keep looping until the user types in a 
    valid time and number without us having to constantly rewrite the loop!
    '''
    while True:
        string = raw_input(prompt)
        if is_valid(string):
            return string
        else:
            print 'Invalid input. Try again...'
            
def is_valid_time(string):
    '''
    This also makes it very easy for us to test our code. If we hand-wrote the 
    loop, we'd need to keep re-running the program, and manually enter in 
    values to try. Now, since we've separated out the "loop until the user stops 
    derping" and "is the user input broken?" pieces, we can easily test each in
    isolation.
    
    For example, I could write the following method (note: assert is a keywork in 
    Python. If the value provided does not evaluate to True, it throws an 
    exception. It's a really useful tool for testing/debugging):
    
        def test_is_valid_time():
            assert is_valid_time('1324')
            assert is_valid_time('2333')
            assert is_valid_time('0123')
            
            assert is_valid_time('5432') == False 
            assert is_valid_time('kittens') == False 
            assert is_valid_time('-1214') == False 
            assert is_valid_time('') == False
            
    ...and run it to easily verify that our check is valid. We can also now 
    confidently completely change and modify our method to make it more efficient
    or whatever. If we run the test, and everything passes, then we know that 
    our rewritten method is working perfectly.
    
    Again, separation of concerns!
    '''
    before_2000 = re.match('[0-1][0-9][0-5][0-9]', string) is not None
    before_2400 = re.match('2[0-3][0-5][0-9]', string) is not None 
    return before_2000 or before_2400
    
def is_int(string):
    '''Note: we could probably use a regex here, but exceptions in Python are 
    actually very cheap, so it's quicker to just try parsing it.'''
    try:
        int(string)
        return True 
    except ValueError:
        return False 
    
def is_valid_index(string, min_val, max_val):
    '''
    Now, this function is a little different -- it needs to accept three 
    parameters since we don't know ahead of time how many items are 
    in the event list. However, if we try doing:
    
        foo = clean_input('enter stuff: ', is_valid_index)
        
    ...Python will throw an exception, since `clean_input` will only give 
    our function one argument!
    
    Oh no! What can we do?
    
    The answer is something called "partial functions", which I'll discuss
    later below.
    '''
    return is_int(string) and (min_val <= int(string) <= max_val)
    

## Input/output ##

# Now that we've written code to manipulate the events list and to perform 
# validation, we can get on the business of handling user input and output.
#
# In a sense, everything before was setup. This is where everything gets 
# pulled together and finally used.
#
# Notice that now, we can completely change the implementation for Event and 
# EventList without needing to change a single line of input/output code 
# (perhaps we want a more efficient EventList?), or completely change how we 
# do our input/output code without having to change the models (perhaps we 
# want a GUI?).
# 
# Because we separated our concerns, we now have the power to do this sort of 
# dramatic refactoring -- we can now easily extend our code support both a 
# console interface and a GUI interface, if we so desired.

class Menu(object):
    '''
    Note that there's no particular reason why all this code needs to be in 
    an object. I just chose to do so because it's somewhat similar to how 
    GUIs are typically coded, and I liked the symmetry.
    '''
    def __init__(self, events):
        '''
        There are two main things here: I'm accepting an event list object, 
        rather then constructing one, and am specifying all the different 
        commands I might issue in one place.
        
        By having the Menu accept an event list, I'm allowing the caller to 
        pass in whatever EventList implementation they want. This concept is 
        called "dependency injection" -- we "inject" anything we depend on, 
        rather then making it ourselves. 
        
        I'm specifying the commands in one place mostly so that I don't have 
        to type out a long if-else chain later, and so that I can easily customize
        which function each command maps to.
        
        The `sys.exit` function is a built-in that automatically ends the program.
        '''
        self.events = events
        self.options = {
            'a': ['Add an event', self.add_event],
            'd': ['Delete an event', self.delete_event],
            'v': ['View event list', self.view_detailed],
            'q': ['Quit', sys.exit]
        }
        
    def add_event(self):
        '''
        Here, you can how I'm taking advantage of the abstractions I defined 
        earlier. 
        
        I no longer need to care about looping or munging or worrying about details --
        those were all taken care for me. The resulting code is now comparatively
        simple.
        '''
        name = raw_input('Event name? ')
        description = raw_input('Event description? ')
        time = clean_input('Event time? Please input in form `hhmm`. ', is_valid_time)
        
        self.events.add(name, description, time)
        
    def delete_event(self):
        '''
        This method is a little more messy, but it can't be helped. The main thing 
        that I'm doing here is finally doing that 'partial function' thing I 
        talked about earlier. 
        
        I'm creating a temporary function, called `partial`, which calls 
        `is_valid_index`, but provides the two missing values. In a sense, we're 
        **partially** calling the function, leaving behind that one last parameter.
        
        Now, `clean_input` will no longer complain, since the function has only one 
        parameter, and everything is now good.
        
        The only last irritation is that it feels a bit clunky to have to create 
        an entirely new function just do this. We can use something called a 
        `lambda` instead. The following code is exactly identical:
        
            index = clean_input(
                    'To delete an event, enter the number: ',
                    lambda string: is_valid_index(string, 0, len(self.events)))
                    
        By using `lambda`, we can create a temporary inline function in only a single 
        line of code. It's a nifty trick.
        '''
        if len(self.events) == 0:
            print 'There are no events to delete!'
            return 
            
        self.view_simple()
        
        def partial(string):
            return is_valid_index(string, 0, len(self.events))
        
        index = clean_input('To delete an event, enter the number: ', partial)
        self.events.delete(int(index))
        
        print 'Event deleted.'
        
    def view_detailed(self):
        '''
        Notice here, I'm providing two separate ways to view an event list. 
        Again, by having a separate 'model' object that does not contain any 
        opinions on how it should be displayed, we gain the freedom to display the 
        event list however we want.
        '''
        separator = '\n--------------------------------\n'        
        for event in self.events:
            print separator
            print 'Name: ' + event.name
            print 'Time: ' + event.time 
            print 'Description: ' + event.description
        print separator
        
    def view_simple(self):
        for index, event in enumerate(self.events):
            print '{0}, {1}'.format(index, event.name)
            
    def list_options(self):
        '''
        Calling `dict.items()` returns a sequence of tuples where the first 
        element is the key, and the second is the value. As an example:
        
            >>> foobar = {'a': 'axe', 'b': 'bat', 'c': 'cat'}
            >>> print foobar.items()
            [('a', 'axe'), ('b', 'bat'), ('c', 'cat')]
            
        I'm then unpacking it, so that I don't have to muck around with indices 
        to get the description.
        '''
        print 'Options:'
        for option, (description, cmd) in self.options.items():
            print '{0}: {1}'.format(option, description)
        print ''
            
    def main_loop(self):
        '''
        And finally, the main loop. Everything comes together.
        '''
        is_valid_option = lambda string: string in self.options
        
        print 'Welcome to the event organizer!'
        print ''
        while True:
            self.list_options()
            option = clean_input('Choice: ', is_valid_option)
            self.options[option][1]()
            print ''

def main():
    events = EventList()
    menu = Menu(events)
    menu.main_loop()

if __name__ == '__main__':
    main()

Whew, that was a lot. But we covered a lot of good stuff, starting from separation of concerns and using elements of both OOP and functional-style programming to help abstract out the code.

Before I finish, I have one last improvement to make -- this time, a data structures and algorithms improvement.

Currently, the EventList object is somewhat inefficient. Every time we add an item, we need to re-sort the list. The more items we have in the list, the longer the sort will take. The sort is also inefficient in another way -- the vast majority of the list is already sorted, so it's wasteful to re-sort the entire thing just for a single element.

It would be ideal if we had some sort of data structure or algorithm that will correctly insert the element into the correct location so that the list stays sorted.

There's actually a data structure for this -- the heap, also known as a priority queue. In Python, the module is called heapq. Under the covers, it maintains a tree, instead of a list, so that inserting elements in sorted order is a fairly efficient operation.

An interesting exercise might be to try and write a new version of EventList that uses a heap instead of a list. The heapq module, unfortunately, is of procedural design rather then object-oriented, so it does turn out to be a bit roundabout.

Michael0x2a/gist:da6a731a8da375b75077

July 9, 2014

Introduction