- Extract Function (106)
- Inline Function (115)
- Extract Variable (119)
- Inline Variable (123)
- Change Function Declaration (124)
- Encapsulate Variable (132)
- Rename Variable (137)
- Introduce Parameter Object (140)
- Combine Functions into Class (144)
- Combine Functions into Transform (149)
- Split Phase (154)
- Encapsulate Record (162)
- Encapsulate Collection (170)
- Replace Primitive with Object (174)
- Replace Temp with Query (178)
- Extract Class (182)
- Inline Class (186)
- Hide Delegate (189)
- Remove Middle Man (192)
- Substitute Algorithm (195)
- Move Function (198)
- Move Field (207)
- Move Statements into Function (213)
- Move Statements to Callers (217)
- Replace Inline Code with Function Call (222)
- Slide Statements (223)
- Split Loop (227)
- Replace Loop with Pipeline (231)
- Remove Dead Code (237)
- Split Variable (240)
- Rename Field (244)
- Replace Derived Variable with Query (248)
- Change Reference to Value (252)
- Change Value to Reference (256)
- Decompose Conditional (260)
- Consolidate Conditional Expression (263)
- Replace Nested Conditional with Guard Clauses (266)
- Replace Conditional with Polymorphism (272)
- Introduce Special Case (289)
- Introduce Assertion (302)
- Separate Query from Modifier (306)
- Parameterize Function (310)
- Remove Flag Argument (314)
- Preserve Whole Object (319)
- Replace Parameter with Query (324)
- Replace Query with Parameter (327)
- Remove Setting Method (331)
- Replace Constructor with Factory Function (334)
- Replace Function with Command (337)
- Replace Command with Function (344)
- Pull Up Method (350)
- Pull Up Field (353)
- Pull Up Constructor Body (355)
- Push Down Method (359)
- Push Down Field (361)
- Replace Type Code with Subclasses (362)
- Remove Subclass (369)
- Extract Superclass (375)
- Collapse Hierarchy (380)
- Replace Subclass with Delegate (381)
- Replace Superclass with Delegate (399)
Use when functions are too long. Code used more than once should be put in its own function. If you have to spend effort looking at a fragment of code to figure out what it is doing, turn it into a function whose name describes the "what".
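A minimal sketch of the idea, using hypothetical names (`outstanding_total`, `format_owing`) and a made-up order shape that are not from the original notes: the fragment that computes the total is pulled out and named after what it computes.

```python
def outstanding_total(orders):
    """Extracted fragment: sum the amounts of all orders (hypothetical data shape)."""
    return sum(order["amount"] for order in orders)


def format_owing(customer, orders):
    # The caller now reads as a description of *what* happens,
    # while the "how" lives behind a descriptive name.
    return f"{customer} owes {outstanding_total(orders)}"
```

The extracted function can now be reused anywhere else the same total is needed.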
Use when the function body is as self-descriptive as the name. Also useful as an intermediate step before a later Extract Function (106), to redo a previous refactor the way you want it. Great for when code uses too much indirection and you get lost in all of the delegation from one function to the next.
Code expressions can become very complex and hard to read. Similar to extracting a function, Extract Variable (119) gives an expression a self-descriptive name. Think about the context for the variable. If it is only meaningful within the function, then Extract Variable (119) is a good choice, but if it's useful more broadly, you can use Extract Function (106) instead to make it more widely available.
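As an illustration, here is a small sketch of a pricing expression broken into named parts. The function name, the discount threshold and the rates are assumptions chosen only for the example.

```python
def price(quantity, item_price):
    # Each intermediate name explains one piece of the original expression.
    # All numbers below are illustrative assumptions.
    base_price = quantity * item_price
    quantity_discount = max(0, quantity - 500) * item_price * 0.05
    shipping = min(base_price * 0.1, 100)
    return base_price - quantity_discount + shipping
```

Without the three variables, the same logic would be a single expression that a reader has to decode term by term.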
Use when the name of the variable does not communicate more than the expression itself. Most times, this refactor is used as an intermediate step of a larger refactor.
The most important element of a function for maintainability is its name. If you come across a name that is unclear, it's imperative that you rename it as soon as you understand what the function does. The same applies to function parameters, since the parameters dictate the function's interface to the outside world. Changing the parameters can also reduce coupling between code that shares the data, but be careful! Sometimes coupling is a Good Thing over time as the parameter and the surrounding code develop.
Data is easier to manage when all mutations go through functions. This adds a clear point at which to monitor changes to and uses of the data. This is the basis for encapsulating data into an object and then writing getters and setters.
Use when you get the name wrong initially and a rename can help make the code more self-descriptive. In tiny functions where the variable is self-evident by the context, a terse name can be acceptable, but persistent fields that last more than one function invocation require more careful naming.
Groups of data often travel together, appearing in function after function as a data clump. It's often a great idea to replace it with a single data structure. This process can greatly simplify the picture of the code.
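A sketch of what this can look like, assuming a hypothetical pair of `low`/`high` arguments that used to be passed around separately. `NumberRange` and `readings_outside_range` are names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumberRange:
    """Hypothetical parameter object replacing a (low, high) data clump."""
    low: float
    high: float

    def contains(self, value):
        # Behavior can migrate onto the new structure over time.
        return self.low <= value <= self.high


def readings_outside_range(readings, allowed):
    # One parameter instead of two loose ones that always travel together.
    return [r for r in readings if not allowed.contains(r)]
```

Once the clump has a name, related behavior such as `contains` has an obvious home.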
Use when a group of functions operate closely together on a common body of data. An alternative to this refactor is Combine Functions into Transform (149). Use a class if mutability is acceptable or important. A second alternative is to nest the functions within each other, but this may make testing the functions harder to do.
The point of this refactor is to take a bunch of functions that operate on a group of data and write a pure transform function that takes source data, clones it, operates on it and adds new attributes or changes the cloned object, and returns it. The benefit is that the data transform happens in one single place.
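The clone-then-enrich flow described above might be sketched like this. The reading fields (`quantity`, `rate`) and the derived attributes are assumptions made up for the example.

```python
import copy


def enrich_reading(reading):
    """Pure transform: clone the source, add derived fields, return the clone."""
    result = copy.deepcopy(reading)  # never mutate the caller's data
    # Derived values are computed in this one place (hypothetical formulas).
    result["base_charge"] = result["quantity"] * result["rate"]
    result["taxable_charge"] = max(0, result["base_charge"] - 50)
    return result
```

Callers that need a derived value call `enrich_reading` once and read the attribute, rather than recomputing it in several places.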
Use when code is dealing with two different things. For example, suppose you receive JSON and a body of code operates on the JSON object with complex key access or even string manipulations spread across different expressions. It can be beneficial to split the computation into phases. In this example, a first phase could parse the input into an intermediate data structure with easily accessible attributes for a second phase to use.
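The JSON example could be sketched as two phases. The payload layout and the names `parse_order` and `order_total` are hypothetical.

```python
import json


def parse_order(raw_json):
    """Phase 1: turn the raw payload into a flat intermediate structure."""
    data = json.loads(raw_json)
    item = data["order"]["item"]  # hypothetical payload layout
    return {
        "product": item["name"],
        "quantity": int(data["order"]["qty"]),
        "unit_price": float(item["price"]),
    }


def order_total(order):
    """Phase 2: compute on the intermediate structure only."""
    return order["quantity"] * order["unit_price"]
```

The second phase never sees the nested keys or string-typed numbers, so a change in the payload format only touches `parse_order`.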
Fowler likes to encapsulate records into an object when the data itself is mutable. With an object, data accesses and mutations are more easily abstracted and kept track of. The underlying data structure can be refactored without worry of side effects. If the data is immutable, then you have two choices: use a structure that has all of its fields explicitly defined, or use a structure with dynamic fields.
This is a special case of Encapsulate Record (162) that takes special care over collections. If a list is encapsulated in an object and some getter method returns the list itself, the list can be inadvertently mutated, which removes all of the benefit of encapsulation. Instead, you can provide add/remove/access methods that operate on the collection, and if a list needs to be returned, you can return a copy or, if the language supports it, a read-only view of it.
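One possible shape for this, assuming a hypothetical `Person` class that owns a list of courses. The getter hands back an immutable copy rather than the list itself.

```python
class Person:
    """Hypothetical owner of an encapsulated collection."""

    def __init__(self, name):
        self.name = name
        self._courses = []

    @property
    def courses(self):
        # Return an immutable snapshot so callers cannot mutate internal state.
        return tuple(self._courses)

    def add_course(self, course):
        self._courses.append(course)

    def remove_course(self, course):
        self._courses.remove(course)
```

All modifications now go through `add_course` and `remove_course`, which gives one place to add validation or logging later.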
Eventually the needs of some variable stored as a primitive grow beyond its usefulness. For example, a phone number may have sufficed at first as a string, but eventually you may need more detailed functionality like handling area codes, country codes, handling different display formats, being self-checking, etc. Another example is a date range.
Use when you want to provide even more self-explaining code for a variable that is instantiated with a complex expression. This is also useful when breaking up a large function. In Python classes, the equivalent is basically turning temporary variables into @property methods or computed fields.
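A small sketch of the `@property` approach mentioned above. The `Order` class, its fields and the discount rule are assumptions for illustration.

```python
class Order:
    """Hypothetical order whose former temp variables became properties."""

    def __init__(self, quantity, item_price):
        self.quantity = quantity
        self.item_price = item_price

    @property
    def base_price(self):
        # Was a temp inside the price calculation.
        return self.quantity * self.item_price

    @property
    def discount_factor(self):
        # Illustrative rule: bigger orders get a bigger discount.
        return 0.95 if self.base_price > 1000 else 0.98

    @property
    def price(self):
        return self.base_price * self.discount_factor
```

Because `base_price` is now a query, any other method on the class can use it without recomputing or passing it around.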
Classes should be a crisp abstraction that handles a few clear responsibilities. In practice, classes grow and grow and take on responsibilities that at first did not seem to deserve their own class. Eventually, once a class becomes too bloated, it's time to split the class into two (or more).
The inverse of Extract Class (182). If a class is doing too little, then we can reduce the indirection by folding the class into another. This could also be a stepping stone to a larger refactor. For example, if you have two classes that you want to refactor into another two classes with a different allocation of features, you may want to first Inline Class (186) them into a single combined class, and subsequently Extract Class (182).
The inverse of Remove Middle Man (192). Use when you have a situation where A calls a method defined on C, but C is held inside of B. In this example, C is the delegate with the actual information that A cares about. B can hide the delegate by implementing functions that basically wrap the calls to C. That way, if a change needs to occur on C, the impact will only reach B, which has to make sure the wrappers remain in sync with any interface changes from C. A does not need to care about the change.
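The A/B/C arrangement above could look roughly like this, with hypothetical classes `Employee` (B) wrapping `Department` (C). The caller (A) only ever talks to `Employee`.

```python
class Department:
    """C: the delegate that actually holds the information."""

    def __init__(self, manager):
        self.manager = manager


class Employee:
    """B: hides the delegate behind a wrapper."""

    def __init__(self, name, department):
        self.name = name
        self._department = department  # not exposed to callers

    @property
    def manager(self):
        # Wrapper call; if Department changes, only this line needs updating.
        return self._department.manager
```

A caller writes `employee.manager` instead of `employee.department.manager`, so it no longer depends on `Department` at all.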
The inverse of Hide Delegate (189). Hiding delegates can get annoying the more the caller needs to access the delegate. For every new interaction point, a new method needs to be defined just to expose a way for the caller to reach the delegate. Eventually the coupling between the caller and the delegate gets big enough that it's time to remove the middle man and just have the caller call the delegate directly.
Use when you discover new, better ways to implement the same thing. For example, if you start using a library that supplies features that duplicate your code, start using the library features (e.g., lodash). Also use if the new and better way to implement the code is clearer than the original implementation. Simpler is better than complex.
Use this to ensure that related software elements are grouped together and the links between them are easy to find and understand. Also use this when a function references elements in other contexts more than the one it currently resides in. Or use it when you need to make some function more easily callable.
Use as soon as you realize a certain data structure is not right. Leaving data structures with problems will continue to confuse readers and users of the data structure far into the future. Use also when you find you always need to pass a field from one record whenever you pass another record to a function.
Inverse of Move Statements to Callers (217). Use if doing so will reduce duplication of code. Look for the opportunity to use this by looking at the call sites of a function. If certain statements are always co-located with the call site, then it's time to move the statements into the function. When doing so, the statements moving into the function should only make sense within the context of the function.
Inverse of Move Statements into Function (213). Functions are meant to be an atomic unit of some action. When common behavior used in several places needs to vary slightly at each of its call sites, it's time to move the varying code out of the function and into the call sites.
Quite simply, this is taking statements and placing them into a new function. This is very similar in concept to Move Statements into Function (213), just without the relationship between an already existing function and its call sites. With this refactor, you are just repackaging statements into a brand new function, mostly to give a name to the operation so that the call site can focus on what to do and the packaged function can focus on how it's done.
Use to co-locate code statements together that share similar concerns or access the same data structures. Use also as an intermediate step to a larger refactor.
Use when you find a loop that is doing multiple things at the same time. This refactor ultimately makes you repeat the loop in the pursuit of clearer code. This might be uncomfortable, but if repeating the loop turns out to be a performance bottleneck, the clearer code makes it easy to combine the loops again afterward.
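A sketch of the end state, assuming a hypothetical list of people with `age` and `salary` keys. A single loop that tracked both values has been split so each loop does one thing.

```python
def youngest_and_total_salary(people):
    # Loop 1: only concerned with finding the youngest age.
    youngest = float("inf")
    for person in people:
        youngest = min(youngest, person["age"])

    # Loop 2: only concerned with summing salaries.
    total_salary = 0
    for person in people:
        total_salary += person["salary"]

    return youngest, total_salary
```

Each loop is now a candidate for its own Extract Function (106), which was much harder while the two calculations were interleaved.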
Chained pipelines like `filter`, `map`, `reduce` are often significantly more readable than complex for-loops that execute multiple statements. When appropriate, convert the for-loop into a pipeline.
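For instance, a loop that skips odd numbers, squares the rest and accumulates a sum could be expressed as the following pipeline. The function name `total_even_squares` is made up for this sketch.

```python
from functools import reduce


def total_even_squares(numbers):
    # Each stage does exactly one thing and feeds the next.
    evens = filter(lambda n: n % 2 == 0, numbers)
    squares = map(lambda n: n * n, evens)
    return reduce(lambda acc, n: acc + n, squares, 0)
```

In idiomatic Python the same pipeline is often written as a generator expression, `sum(n * n for n in numbers if n % 2 == 0)`.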
Use always. Especially commented out code. You can always get it back with git.
Use when you see a variable identifier being reused again and again in separate statements. Ideally an identifier should be declared and assigned one time. There will be notable exceptions to this, but for the most part, try to achieve a one-time usage.
Use when a new name would better convey the purpose and usage of a field.
Use when the usage of a variable is located far from the places where that variable can be mutated. If possible, compute the derived variable lazily, or on demand. This makes it easier to reason about what the value of the variable can be at a given time.
Inverse of Change Value to Reference (256). Storing variables as values instead of references is advantageous because values can be treated as immutable: when one needs to change, a completely new value is instantiated and substituted in. Values are especially helpful when passing them between different contexts, but only if you don't need the actual value to be shared. For that, you'd need a reference.
Inverse of Change Reference to Value (252). Use when you want the value for some variable to be dictated by a single source of truth that all accessors must reference.
In a function, conditionals describe what happens but obscure why it happens. Decomposing a conditional means taking the condition itself and rewriting it as a function that returns a boolean. The function name describes the intention of the condition. This is really just a hyper-specific case of Extract Function (106), applied to conditional logic.
Use when you find a series of conditional checks, where each check is different but the outcome of each check is the same. Using `and`, `or` and `not` logic helps to convey that you are really only performing one check, and also sets up the code for a future Decompose Conditional (260). This is not one-size-fits-all, though. If the checks in the series are meant to be thought of as independent checks, then don't do the refactor.
Functions with a conditional come in two flavors. The first is where the branch results in two paths that can both be considered "normal" and maybe even more-or-less equally likely. The second is where there is one "true" happy path and the conditional checks for some unusual state before continuing. A guard clause (as Fowler defines it here) is an if-check that returns early if found to be true.
Returning early from guard clauses helps counter the lateral drift of code: as code becomes more and more nested, it has to be indented further and further to make visual sense.
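A sketch of the second flavor, where the unusual states are dispatched up front. The employee fields (`is_separated`, `is_retired`, `salary`) are hypothetical.

```python
def pay_amount(employee):
    # Guard clauses: unusual states return early, with no nesting.
    if employee["is_separated"]:
        return 0
    if employee["is_retired"]:
        return 0
    # Happy path stays at the left margin.
    return employee["salary"]
```

The nested equivalent would wrap the happy path in two levels of `else`, which is exactly the lateral drift described above.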
Use when you come across some kind of `switch` statement that operates on a set of variables differently based on the type of the variable. You can create a class for each case and give each class its own implementation of a polymorphic function, so that every case handles itself. This is especially helpful when there is some kind of base case that should be defined, with some cases being small variations on it.
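A minimal sketch with a base case and one variation. The class names and the plumage rule are assumptions chosen for illustration.

```python
class Bird:
    """Base case: hypothetical default behavior shared by most birds."""

    def __init__(self, name):
        self.name = name

    def plumage(self):
        return "average"


class AfricanSwallow(Bird):
    """One case that varies slightly from the base."""

    def __init__(self, name, coconuts):
        super().__init__(name)
        self.coconuts = coconuts

    def plumage(self):
        return "tired" if self.coconuts > 2 else "average"


def plumages(birds):
    # No switch on type: each object picks its own implementation.
    return {bird.name: bird.plumage() for bird in birds}
```

Adding a new kind of bird means adding a class, not editing every `switch` that inspects the type.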
Use when you discover some code where many users of a data structure check for a specific value and then do the same thing with it every time. One common example is when code is written to handle nulls. The idea is that instead of returning null, you return a literal object that represents null and has its own methods for default behavior. This is actually a larger coding pattern called the Null Object Pattern, which Fowler calls a special case of "Special Case".
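The null-handling case might be sketched as follows. `Customer`, `UnknownCustomer`, `customer_for` and the `"occupant"` default are all hypothetical names and values.

```python
class Customer:
    is_unknown = False

    def __init__(self, name):
        self.name = name


class UnknownCustomer:
    """Special-case object standing in for a missing customer."""
    is_unknown = True
    name = "occupant"  # default behavior lives here, not at every call site


def customer_for(site):
    # Return the special case instead of None (site is a hypothetical dict).
    return site.get("customer") or UnknownCustomer()
```

Call sites can now use `customer_for(site).name` directly instead of repeating a null check and a fallback value.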
Use when a section of code should only work if certain conditions are true. The important part is that this should only be done where a failure would be caused by a programmer error. The program should be equally correct if all assertion statements were removed.
Assertions are a valuable mode of communication. They indicate the expected state at some certain point of the code.
In general and for most cases, making a clear distinction between functions with observable side effects and those without is a great idea.
Use when there are two very similar functions that do nearly the same thing but differ by some hardcoded constant. The constant can be made into a parameter to a common function.
Use when a parameter for a function serves as a flag argument that leads to some branching path within the function. There should be two separate functions for the different cases of the flag.
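A sketch of the result, assuming a former `delivery_days(state, is_rush)` function whose `is_rush` flag selected a branch. The function names, state codes and day counts are invented for the example.

```python
# Hypothetical set of states with faster delivery.
NEARBY_STATES = ("MA", "CT")


def rush_delivery_days(state):
    # Formerly the is_rush=True branch.
    return 1 if state in NEARBY_STATES else 2


def regular_delivery_days(state):
    # Formerly the is_rush=False branch.
    return 2 if state in NEARBY_STATES else 4
```

A call such as `rush_delivery_days("MA")` states its intent directly, where `delivery_days("MA", True)` leaves the reader guessing what `True` means.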
Use when a callsite deconstructs an object or record into individual values and then passes those deconstructed values into another function. The function should just accept the entire record and deconstruct what it needs out of it.
"Pulling several values from an object to do some logic on them alone is a smell, and usually a signal that this logic should be moved into the whole itself."
Inverse of Replace Query with Parameter (327). The parameter list to a function should summarize the points of variability of that function, indicating the primary ways in which that function may behave differently.
If a call passes in a value that the function can just as easily determine for itself, that's a form of duplication, one that unnecessarily complicates the caller, which has to determine the value of a parameter when it could have been freed of that work.
Do not use this if removing the parameter and moving its derivation into the function would add an unwanted dependency to the function body.
Inverse of Replace Parameter with Query (324). Use when the query inside the function relies on some unwanted dependency that you no longer want the function to care or be aware about. For example, in JavaScript, if the function has closure over some variable it is using and you want to decouple it from its declaration site, you may want to turn the query into a parameter so that it is easier to move the function around.
In the context of getters and setters, removing the setting method conveys that updating the values of some object makes no sense after initialization. This reduces the areas in the code that can mutate the object. If the caller wants a different object, they can instantiate a new one.
Use when you want code to conditionally create different objects based on some parameter. Many languages have limitations to what you can do with a constructor, but a function that returns an instantiated object will have no such limitations.
Inverse of Replace Command with Function (344). At times it can be a good idea to take a function and move it into a class that must first be instantiated and subsequently have the function called on it. Fowler calls this containing object a `Command`. `Command`s give the function a much richer lifecycle. For example, since it's in a class that can be stateful, you could potentially have an `undo` function. You can also take advantage of the `class` nature by using inheritance, granting better flexibility.
The downside to the above is added complexity, which, based on the circumstance, you may want to avoid.
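To make the trade-off concrete, here is a small sketch of a command with an `undo`. The `AddItemCommand` class and the list-based cart are hypothetical.

```python
class AddItemCommand:
    """Hypothetical command wrapping what used to be add_item(cart, item)."""

    def __init__(self, cart, item):
        # State captured at construction is what makes undo possible.
        self._cart = cart
        self._item = item

    def execute(self):
        self._cart.append(self._item)

    def undo(self):
        self._cart.remove(self._item)
```

Usage is `cmd = AddItemCommand(cart, "book")`, then `cmd.execute()` and later `cmd.undo()`. Compared with a plain function call, that is noticeably more ceremony, which is the added complexity mentioned above.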
Inverse of Replace Function with Command (337). Use when a `Command` would be overkill for what you're trying to do or want the function to do.
Inverse of Push Down Method (359). Use to reduce any duplication of code between class "siblings". Raising the method up to a common parent will by definition remove the duplication. Usually in order to do this refactor you'll have to do other refactors as well.
As a word of warning, Fowler notes that the most awkward version of this is when the body of one of the methods refers to attributes that are on a subclass but not on the superclass. When that happens, he'll reach for Pull Up Field (353) first.
Inverse of Push Down Field (361). Use for the same reasons as Pull Up Method (350): to reduce duplication of code between class "siblings".
This is basically a special case of Pull Up Method (350), specific to constructor methods, since in a lot of languages the constructor is treated specially. What this looks like is moving common behavior to a superclass and then having the subclasses call `super()`, or some language equivalent.
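In Python, the shape described above might look like this sketch. `Party`, `Employee` and `Department` are hypothetical classes whose constructors both used to assign `name` themselves.

```python
class Party:
    def __init__(self, name):
        # Common constructor behavior pulled up from the subclasses.
        self.name = name


class Employee(Party):
    def __init__(self, name, employee_id):
        super().__init__(name)  # shared part
        self.employee_id = employee_id  # subclass-specific part


class Department(Party):
    def __init__(self, name, staff):
        super().__init__(name)
        self.staff = list(staff)
```

Each subclass constructor keeps only what is unique to it.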
Inverse of Pull Up Method (350). Use this when a particular behavior is specific to one of the subclasses and not to all subclasses. As a caveat, this refactor only applies if the caller knows it's working with the relevant subclass. If it doesn't know that or can't know that, then you should instead reach for Replace Conditional with Polymorphism (272).
Inverse of Pull Up Field (353). Use for the same reason as Push Down Method (359). If a particular subclass cares about a field and most of its "siblings" do not, then move the field to that subclass.
Inverse of Remove Subclass (369). Use when you have a class that has a special attribute that indicates some kind of enumerated "type" and a lot of its other behavior is dependent on the value of this type variable.
Inverse of Replace Type Code with Subclasses (362). Subclasses lose their value as the variations they support are moved to other places or removed altogether. Sometimes, the subclasses were created in anticipation of features that never end up being built, or that get built in a way that doesn't need the subclass.
A subclass that does too little is costly, and it is therefore best to remove the subclass, replacing it with a field on the superclass.
Use when you notice two classes doing similar things. Pull the common behavior into a superclass.
Use when you determine that a class and its parent are no longer different enough to keep them separate.
Sometimes you'll want to stop using inheritance altogether. Inheritance allows only for a single axis of variation. If you want to vary a People class based on whether they're young or old and also by whether they're poor or rich, you cannot do so with just inheritance. Inheritance also introduces a very close relationship between classes. Any change to a parent class can easily break its children.
This refactor closely follows the mantra "Favor object composition over class inheritance." The "object composition" in that mantra is the same thing Fowler refers to as using a delegate.
Use when some but not all of the methods on the superclass don't make sense on the subclass. For example, early in Computer Science history, they used to mis-inherit List into a Stack object. The problem was that the Stack object had all the methods that a List would, and many of these methods are not applicable to a Stack.
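The Stack example could be sketched like this, with the list held as a delegate instead of inherited from. This `Stack` class is a hypothetical illustration, not a real library type.

```python
class Stack:
    """Holds a list as a delegate and exposes only stack operations."""

    def __init__(self):
        self._items = []  # the delegate; list methods are not inherited

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self):
        return len(self._items)
```

Had `Stack` subclassed `list`, callers could also call `insert`, `sort` or `reverse` on it, none of which make sense for a stack.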