how did I get this code shrunk down to
that last little bit how did that
process work my goal was to make these
functions small and I call this the
first rule of functions the first rule
of functions first rule is that they
should be small the second rule is they
should be smaller than that I want to
turn the knob up on this really high I
want the function small really small how
small should a function be let's see
what's the proper size for a function
how many lines should it be one four
three should it fit on your screen that was the
old rule by the way back in
the 80s when we first got screens does
anybody remember when we first got
screens but there was a time we didn't
have screens we wrote our code on paper
in the early days of programming
programmers did not know how to type we
had no keyboard skills we wrote the code
in pencil and we had other people enter
it into punch cards for us but then
eventually we got screens and we started
typing ourselves that was in the 80s and
those screens a typical screen was 24
lines by 72 columns
why 72 columns yes the holes on the
punch card there were 80 columns on a
punch card the last eight were sequence
numbers so we only needed 72 on the
screen that's the reason they were 72
you didn't even know they were 72 did
you well now you know and the rule came
about your function should fit on a
screen well that meant that the
functions had to be about 20 lines of
code is that the right size I have a
better rule a function should do one
thing a function should do one thing
that's the rule for how big a function
should be it should do one thing but now
we need to define what one thing is
what's one thing what IDE are you using
I hear people going IntelliJ IntelliJ
who's using IntelliJ oh yeah IntelliJ
it's the best IDE it really is the best
IDE out there there's just nothing
better is anybody using anything else if
you are I'm sorry what is it
Eclipse and well VI you know everybody
has to use VI from time to time cuz
you've got to edit some text file and
there's just no easier way to go VI
j j k k k it's fine right fine
but if you're if you're editing code you
don't want to use VI for that cuz that's
a pain so you use something like
IntelliJ that's nice
okay IntelliJ it's got a refactoring
menu one of the menu items in there is
called extract method who's used extract
method okay everybody's used extract
method that's good okay so now you know
how extract method works I'm going to
define one thing a function does one
thing if you cannot meaningfully extract
another function from it if a function
contains code and you can extract
another function from it then very
clearly that original function did more
than one thing because you could extract
something from it
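To make that definition concrete, here is a minimal sketch of what "extract until you cannot extract any more" might look like on a small function. The class, the method names, and the discount rule are hypothetical, invented only for illustration; they do not come from the talk.

```java
import java.util.List;

// Hypothetical example: none of these names come from the talk.
final class InvoiceReport {
    // Before: one function that sums, applies a discount, and formats.
    static String describeBefore(List<Double> prices) {
        double total = 0;
        for (double price : prices) {
            total += price;
        }
        if (total > 100) {
            total = total * 0.9;
        }
        return "Total: " + Math.round(total);
    }

    // After extraction: each function has nothing meaningful left to extract.
    static String describe(List<Double> prices) {
        return format(discounted(sum(prices)));
    }

    static double sum(List<Double> prices) {
        double total = 0;
        for (double price : prices) {
            total += price;
        }
        return total;
    }

    static double discounted(double total) {
        return qualifiesForDiscount(total) ? total * 0.9 : total;
    }

    static boolean qualifiesForDiscount(double total) {
        return total > 100;
    }

    static String format(double total) {
        return "Total: " + Math.round(total);
    }
}
```

Both versions compute the same result; the second one simply gives a name to every step that the first one did inline.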
so I want all my functions to do one
thing and that means I must extract and
extract and extract and extract until I
cannot extract any more I'm going to
take all the functions in the system and
explode them down into a tree of tiny
little functions optimally extracted
maximally extracted at this point there
are people in the room going this guy's
nuts
he's insane I'm not going to do that if
I did that I'd have thousands of little
tiny functions I would drown in a sea of
tiny little functions no you will not
drown in a sea of tiny little functions
and there's a simple reason why you're
not going to drown in a sea of little
functions and that's you're going to
have to give those functions names
you'll have to name those functions and
as you name them you'll have to move
them into appropriately named classes
and appropriately named packages and
appropriately named source files and
modules and you will create a tree a
semantic tree of functions that you can
follow by name now you may not believe
me but let me make a few points I'm
going to draw for you a function that I
wrote in 1988 probably the name of this
function was gi gi stood for graphic
interpreter it was 3,000 lines long I'm
going to draw it for you not all the
code just the shape of the code you
recognize that don't you and that was
the shape of the code well it was
actually more complicated than that but
look at that shape and see if you can
tell if there's some part of your brain
that relaxes because you see that shape
there's some part of your brain that
goes oh why why does that happen
because when people get very used
to a large function or a large module
you've been working in it for months you
know it you know it really well you know
it geographically you know it by its
landmarks you know the shape intimately
and look at that shape rotate that shape
90 degrees it looks like the horizon
humans evolved to know where they are on
the planet by staring at the horizon
they know over there by those Peaks
that's where the watering hole is and
the saber-tooth wanders over there so
you look at that shape and some part of
your brain goes yeah that looks like
home now this was written in C by the
way if someone had come to me in 1988
and said you know Bob this 3000 line
function does a lot more than one thing
I would have said no it can't it
interprets graphics because the whole
notion of one thing was so horribly
subjective at the time but now I know
how to make it objective extract extract
extract until you cannot extract anymore
now let's assume that this function
written in C has some variables where
were the variables in C
the local variables of a C function
where did you put them at the top
that's right C programmers in the room
know right you still have your copies of
Kernighan and Ritchie sitting on the shelf
don't you can't be far away from it so
okay let's say that there's a couple of
variables here int i and j good now
let's say that this indent right here
manipulates i and j and it does this it
says i equals 0 and j equals 2 okay
semicolon good now you want to practice
extract till you drop
do you want to do what I've just told
you to do extract extract extract
extract so you highlight that highlight
that indent with your mouse and you
invoke the extract method function of
the IDE and the IDE will come back with
an error message and say I can't extract
it because it changes two variables and
I can't extract code that changes two
variables well what are you going to do
now you want to extract it but you can't
because it changes two local variables
what are you gonna do now make them
global
right that works perfectly you think I'm
joking don't you haha but now look you
can extract that and you can extract
this one into another function so now
I've got two functions I've extracted
them out I can take that one and extract
it yeah I can take that one and extract
it and I've got something very
interesting here now I have a set of
functions all of which manipulate a set
of variables what's it called when you
have a set of functions that manipulate
a set of variables that's called the
class there was a class hiding inside
this big function and if you think about
it of course there are classes that hide
inside big functions because big
functions have a whole bunch of
variables and a whole bunch of indents
that manipulate those variables so of
course every large function is really a
class with a bunch of little tiny
functions inside it and if you start
extracting and extracting and extracting
you will begin to identify these classes
that you would otherwise not have
identified and you'll be able to put
them in appropriate namespaces and
allocate them nicely and partition your
code well this is what happens when you
start to extract and extract and extract
you find the true object-oriented
structure of the system that you're
trying to design we're all
object-oriented designers we all use
[inaudible] and Sun Microsystems
said hey you can't do that and Microsoft
said oh ok we'll change the colon how
much indenting do you think you would do
if you extract and extract and extract
how deep will your indenting be if you
extract and extract and extract one or
two yeah one or two is about it usually
one
well actually usually none right just
open brace 1 indent couple of lines of
code close your functions will really be
about four lines long 3 4 or 5 lines
long something like that every once in a
while you'll get a 6 liner switch
statements tend to get longer
you don't like switch statements so you
don't have too many of those I hope
because they're evil switch statements are
bad things to have around really the
same is true with if-else statements we
don't want those either
so we tend to get these little tiny
functions now think of what happens to
normal code take an if statement if you
are extracting and extracting and
extracting what is the body of the if
statement it's a function call yes and
it has a nice name a name that tells you
what the function is going to do so you
say if something then do this it reads
like well-written prose
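As a rough sketch of an if statement that reads like prose, here is one where both the test and the body are single, well-named calls. The `Employee` and `Payroll` classes and the retirement age are assumptions made up for this illustration.

```java
// Hypothetical classes; the retirement age of 65 is an arbitrary assumption.
final class Employee {
    static final int RETIREMENT_AGE = 65;

    private final int age;
    private boolean employed = true;

    Employee(int age) {
        this.age = age;
    }

    // The name says what is being tested.
    boolean isTooOld() {
        return age > RETIREMENT_AGE;
    }

    // The name says what is going to happen.
    void fire() {
        employed = false;
    }

    boolean isEmployed() {
        return employed;
    }
}

final class Payroll {
    // Reads as: if employee is too old, fire employee.
    static void review(Employee employee) {
        if (employee.isTooOld())
            employee.fire();
    }
}
```

Nothing in `review` needs a comment, because the names carry the meaning.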
what's in the parentheses of the if
statement a function call with a nice
name that tells you what you're testing
if employee is too old fire employee it
works perfectly right it reads like
well-written prose how many arguments
should a function have
well zero's a good number right cuz it's
really easy to understand a function
that takes zero arguments you don't
have to worry about any if statements in
there that are checking the arguments for
anything there's no arguments one is not
too hard to understand one argument
going into a function is pretty easy
that's you know mathematical f of X we
kind of get that two arguments going
into a function yeah it's okay you know
the human brain is pretty good at
keeping two things in order there's only
two different ways to order them so not
too hard three arguments it's a little
hard how many ways are there to order
three arguments six yes it's n factorial
so ooh six is hard now humans can
probably do that sort of what about four
how many ways are there to order four
arguments
twenty-four twenty-four different
orderings how about five one hundred and
twenty ways it's n factorial it grows
enormously fast so probably don't want
any more than about three arguments in a
function that's the way I like to limit
it I don't like to have functions that
take more than three arguments and by
the way there's another there's another
debate here I almost use the word
argument um if you have a function and
you want to pass six things into it
those six things are so cohesive that
they can be passed into a function
together why aren't they already an
object this is an interesting debate to
have here you probably never need to
pass more than three things into a
function so I like to use that as a
soft rule I don't want to see a long
comma separated list of arguments that
seems to me to be rude I would like it
to be polite so keep the number of
arguments down to two or three create
objects if you have to use other
strategies to get things into
functions by creating objects and data
structures things like that
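One way to picture the "why aren't they already an object" question is the sketch below, where six loose integers turn into two small objects. `Point`, `Rectangle`, and `Geometry` are hypothetical types chosen for the illustration, not anything shown in the talk.

```java
// Hypothetical types for illustration only.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

final class Rectangle {
    final Point topLeft;
    final int width;
    final int height;

    Rectangle(Point topLeft, int width, int height) {
        this.topLeft = topLeft;
        this.width = width;
        this.height = height;
    }
}

final class Geometry {
    // Before: six arguments, and the caller has to remember their order.
    static boolean contains(int left, int top, int width, int height, int px, int py) {
        return px >= left && px < left + width
            && py >= top && py < top + height;
    }

    // After: the cohesive values travel together as two objects.
    static boolean contains(Rectangle r, Point p) {
        return p.x >= r.topLeft.x && p.x < r.topLeft.x + r.width
            && p.y >= r.topLeft.y && p.y < r.topLeft.y + r.height;
    }
}
```

A call such as `Geometry.contains(box, click)` leaves far fewer ways to get the order wrong than `Geometry.contains(0, 0, 10, 5, 3, 4)`.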
what types of arguments should you never
pass into a function
booleans booleans now by the way I use
the word never never is the wrong word
mostly never is probably a better way to
put that don't pass booleans into
functions why not well because if you
pass a boolean into a function there
must be an if statement in that function
and that if statement has two branches
the normal branch and the else branch
why not just separate them into two
functions call the one in the true case
call the other in the false case have
you ever read code that has boolean
arguments it's rude now do this comma 5
comma 6 comma true what does the true
mean I don't know it must be true though
and here's how you read this code when
you're reading this code you stare at
that boolean and go oh wow that the
author probably knew what he was talking
about and you walk away
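A small sketch of the split described above: one function with a boolean flag becomes two functions, one for the true case and one for the false case. The `Greeter` class and its greetings are invented for this example.

```java
// Hypothetical example; the names and strings are made up.
final class Greeter {
    // Before: at the call site, greet("Ada", true) does not say what true means.
    static String greet(String name, boolean formal) {
        if (formal) {
            return "Good day, " + name + ".";
        } else {
            return "Hi " + name + "!";
        }
    }

    // After: each branch of the if becomes its own function with its own name.
    static String greetFormally(String name) {
        return "Good day, " + name + ".";
    }

    static String greetCasually(String name) {
        return "Hi " + name + "!";
    }
}
```

The caller now writes `Greeter.greetFormally("Ada")`, and nobody has to go read the body to find out what a bare `true` was for.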
you're gonna go read what the boolean
does it's probably some stupid if
statement in the middle of the function
don't pass booleans around they're just
annoying now that can't be a hard and
fast rule because there are times when
you want to pass a boolean around for
example you are setting the state of a
switch you know setSwitch bool ok fine
but don't use it as a little testing
argument into functions that's just rude
it's annoying another thing that's rude
is output arguments arguments that are
passed into a function for the purpose
of collecting the output nobody
understands that right you're reading
along and you've probably all had this
experience where you're reading along
and you read this line and there's an
argument at the end of the function call
and you're not quite sure why it's there
it seems out of context it's just a
bizarre argument but you've got this
vertical momentum as you're reading
who's had this experience right you're
reading down and there's something about
this line that puzzles you but you've
got this nice vertical momentum so you
keep reading but a little process is
started in your brain and this little
process starts yelling at you louder and
louder you didn't understand that last
line you didn't understand that
line and your eyes are torn back up to
look at that line this process in your
head takes your head and
moves it back to stare at that line this
is a double take you know a double take
it's I think it's American slang double
take a double take is like this you're
out on the road you're out on the
sidewalk you're walking down the street
out of the corner of your eye you see an
attractive individual you turn away and
then a little process in your brain goes
wait that was interesting and you go
back that's a double take that code that
makes you do a double take is rude it's
rude code it forces you to stop your
reading and go back so you don't want to
have these double take moments in the
code there's another author who says
this is the principle of least surprise
make sure that your code is not
surprising do one thing yes I did that
little bit later I've talked about
arguments already good and flag
arguments and output arguments no side
effects what's a side effect so the
classical definition of a side effect is
a change to the state of the system if
you call a function and that function
causes the system to change state then
that function had a side effect the
function open has a side effect because
it leaves a file open the function new
has a side effect because it leaves a
block of memory allocated side effect
functions change the state of the system
side effect functions come in pairs
there's open and close new and free or
new and delete in C++ in Java we fixed
that problem semaphores seize and release
side effect functions come in pairs
they're like the Sith always two there are
now how good are we at managing pairs of
functions like this for example how good
are we at managing malloc and free the
answer to that is that we're terrible at
managing that and the obvious
evidence of our terrible ability to
manage pairs of functions is that in
modern languages we've invented a
horrible hack to allow us to forget
about managing pairs of functions that
horrible hack is called garbage
collection now I would not want to
program without garbage collection
because garbage collection makes it much
easier to write safe code but you must
admit that garbage collection is a
crutch it is not reasoned you did not
write the code in a reasoned way you did
not free everything you allocated
instead what you did is said the system
will take care of it and ok fine
we have written this horrible hack
we've declared that we're going to
depend upon it we've acknowledged that
we need the crutch but
allocation and freeing are not the only
side effect functions there are many
other side effect functions that you and
I have to deal with like open and close
has anybody seen a system crash because
too many people forgot to close files
yes and you leave a bunch of file
descriptors open in the operating system
and eventually you run out of file
descriptors that's called a file
descriptor leak has anybody seen a
system crash because all the graphics
contexts got leaked or the semaphores
didn't get closed anything that comes in
a pair like that will suffer the same
fate that memory used to suffer when we
had memory leaks and by the way you can
still have memory leaks in Java you just
have to work really hard at it
so what do we do about this what do we
do about this problem of controlling
pairs of functions
pairs of functions must be called in the
right order you cannot close a file
before you open it you cannot free
memory before you allocate it if you
call them in the wrong order it's a
logical inconsistency how many of you
have debugged a system spent days and
days debugging a system only to find
that you can make it work if you change
the order of two function calls and you
don't know why but somewhere in there
there's some side-effect and if you just
change the order oh that makes
everything work what can we do to manage
that so what version of Java are you
using you're working with lambdas now in Java
what is that eight Java eight
everybody doing Java eight everybody
familiar with lambdas now you know how
to do lambdas and you can pass lambdas
here and there and everywhere good I
don't know why they put that feature in
the language so what can you do about
this well if you've got lambdas in the
language it makes things a lot easier
well it makes things a little bit easier
remember that the lambda is just a class
and it's a class with one function in it
called execute you could have written
that yourself but they decided to put it
in the language so okay fine
let's say that we want to make open safe
and I think in the Java library now they
actually do this so okay here's our open
function our open function is going to
be a void it doesn't return anything and
it's going to take a filename so that's
a string that would be the file that
we're going to open and then we're going
to pass into it a lambda and this will
be we'll call this lambda process of
course I'm not using Java syntax but you
get the point okay there's the open
brace now what do we do well first thing
we're going to want to do is we're going
to really open that file so we're going
to call the low-level open function so
here we say file
F equals file dot open of FN good so now
we've got the open file the next thing
we do is we say process that file so now
we're calling the lambda and now we say
file dot close of f and return this is a
simple function that deals with the side
effect this function does not have a
side effect because it leaves the system
with the file closed so the side effect
is dealt with inside of this function
you don't have to remember to close it
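The function sketched on the board might look roughly like this in real Java syntax. This is only one possible rendering: the `SafeFile` class and the `FileProcessor` interface are hypothetical names, and the `finally` block is an addition beyond what was described, so that the file is also closed when the lambda throws.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical wrapper; not a class from the Java library.
final class SafeFile {
    // Like a Consumer, but allowed to throw IOException (an assumption of this sketch).
    @FunctionalInterface
    interface FileProcessor {
        void process(BufferedReader file) throws IOException;
    }

    // Opens the file, hands it to the lambda, and closes it before returning,
    // so the caller never has to remember the second half of the pair.
    static void open(String fileName, FileProcessor processor) throws IOException {
        BufferedReader file = Files.newBufferedReader(Paths.get(fileName));
        try {
            processor.process(file);
        } finally {
            file.close(); // added here: runs even if the lambda throws
        }
    }
}
```

A caller would write something like `SafeFile.open("notes.txt", file -> System.out.println(file.readLine()));`. Java's own try-with-resources statement addresses the same problem with a language construct rather than a lambda.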
what you do have to remember to do is
pass in the lambda that processes the
open file this is a very common
procedure for dealing with side effects
so you can try to get your side effects
under as much control as possible by
passing a lambda into your system if you
don't have lambdas in your language then
you could use a command object a simple
a simple class that has one function in
it called execute and then you pass in
the appropriate derivative data side
effects doop-doop do yep good command
query separation a function that returns
void must have a side-effect if it
doesn't have a side-effect there's no
point in calling it so a function that
returns a value should not have a
side-effect this is a convention that we
like to follow called command and query
separation commands change the state of
the system therefore they return void
anything that returns a value by
convention will not change the state of
the system and that way when you see a
function that returns a value you know
it's safe to call it it will leave the
system in the same state it was found in
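A tiny sketch of that convention, using a hypothetical `Counter` class: the command changes state and returns void, and the query returns a value and changes nothing.

```java
// Hypothetical class illustrating the convention; the language does not enforce it.
final class Counter {
    private int count;

    // Command: changes the state of the system, returns void.
    void increment() {
        count++;
    }

    // Query: returns a value, leaves the state exactly as it found it.
    int current() {
        return count;
    }
}
```

Because `current()` is a query, calling it once or ten times makes no difference to the system, which is what makes it safe to call.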
this is a convention that I like to
follow the language doesn't enforce it
of course but I like to follow it
because it allows me to keep track of
side-effects prefer exceptions to
returning error codes do I even need to
go over this do you use exceptions a
lot in Java everybody using exceptions
good good good
does anybody remember how awful
exceptions were in C++ okay so we didn't
use them I don't use them in C++ in
Java they're fairly safe it is better to
use an exception than to return an error
code I'm not going to belabor that point
but I will make another point when I
write a try block the only thing in that
function that has the try block is the
try block I don't want to have a lot of
code before the try block if I've got a
function that throws an exception the
first executable statement in that
function is going to be try and then
there will be a single function in the
try block that's the function that
actually throws the exception and then
there will be a close brace and then
the catch blocks and a finally block if
necessary and then no other code in the
function I don't want anything in the
function except the try block I don't
want a whole bunch of code in the try
block I want that to be a different
function I don't want any prefix code in
that function I don't want any suffix
code in that function just the try block
because error processing is one thing so
I want the I want the try block
completely contained by a function and I
never ever ever want nested try catch
blocks I will find you if you do that
[Music]
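Here is a minimal sketch of the shape described for a function with a try block: `try` is the first executable statement, it wraps a single call, and nothing follows the catch. The `PortParser` class, its method names, and the default port are assumptions invented for the example.

```java
// Hypothetical example; the default port is an arbitrary assumption.
final class PortParser {
    static final int DEFAULT_PORT = 8080;

    // Error processing is the one thing this function does:
    // no code before the try, one call inside it, no code after the catch.
    static int portOrDefault(String text) {
        try {
            return parsePort(text);
        } catch (NumberFormatException e) {
            return DEFAULT_PORT;
        }
    }

    // The function that can actually throw lives on its own.
    static int parsePort(String text) {
        return Integer.parseInt(text.trim());
    }
}
```

The parsing logic can grow inside `parsePort` without the error handling in `portOrDefault` ever needing to change.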
there's a rule in software called the
DRY principle don't repeat yourself this
has to do with duplicate code you saw
some duplicate code in that original
code that I threw up on the screen a
little bit earlier that code was
obviously copy and pasted we'd like to
avoid duplication as much as possible
because it's sloppy if you copy
and paste a bunch of code
it's just sloppy to leave it in that
state what you'd like to do is move the
copied code into some function and call
the function of course sometimes you
copy the code and then change the code
you've copied and so that means you're
probably going to have to put it into a
function that has some arguments but
that's all fairly easy to do so we don't
like duplicated code we don't like code
that's kind of duplicated we ought to be
able to move those into functions but
what do you do when it's not the code
that's duplicated it is the loops that
are duplicated how many of you have seen
this problem where you've got a complex
configuration data structure to walk
this complex configuration data
structure you have to have a big nested
bunch of loops a bunch of while loops
and if statements that allow you to walk
through the complex
configuration data structure and then
you finally get to the end nodes in
there and you've got a bunch of
processing code that processes the end
nodes and then you see that same loop
repeated over and over and over again
inside different parts of the system as
the different parts of the system walk
different parts of the configuration
database how can you get rid of that
duplication one of the answers to that
is well if you've got lambdas you can
put that nice looping structure into a
function that takes a lambda argument
and then you pass the processing code
into the lambda argument so you can get
all of those duplicated loops down into
one and then just pass a lambda in or
pass an object that takes a
single parameter which is a function
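As a sketch of that idea, the traversal below lives in exactly one function and the per-node processing arrives as a lambda. The `Configuration` class and its two-level structure are hypothetical; a real configuration structure would likely be deeper, but the shape of the solution would be the same.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical configuration structure: section name -> list of settings.
final class Configuration {
    private final Map<String, List<String>> sections;

    Configuration(Map<String, List<String>> sections) {
        this.sections = sections;
    }

    // The only place that knows how to walk the structure.
    // Callers pass in what to do with each end node.
    void forEachSetting(Consumer<String> process) {
        for (List<String> settings : sections.values()) {
            for (String setting : settings) {
                if (!setting.isEmpty()) {
                    process.accept(setting);
                }
            }
        }
    }
}
```

Different parts of the system can then write `config.forEachSetting(System.out::println)` or `config.forEachSetting(names::add)` instead of repeating the nested loops.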