# Some old unittesting notes I consult once in a while when I get stuck writing a test
In the broadest sense of the term, unit testing simply means testing a single unit of code, isolated from other code that it might be integrated with. Traditionally, unit testing was an activity that was primarily performed by test engineers. These engineers would take code given by the developers and run it through a suite of tests to verify that the code worked. Since this code was tested before integration, the process fits into the definition of a unit test. Traditional unit testing was typically a manual affair, with test engineers walking through the test cases by hand, although some teams would go a step further and automate the tests.
An integration test is a test that involves exercising more than one unit of the system. The goal is to check whether these units have been integrated correctly. A typical integration test might be to go to a web page, fill in a form, and check whether the right message is displayed on the screen. In order for this test to pass, the UI must show the form correctly, the input must be captured correctly, and that input must be passed on to any logic processing. The steps might involve reading from and writing to a database before a message is generated and the UI has to display it correctly. Only if all these interactions succeed will the integration test pass. If any one step should fail, the integration test will fail.
At this point, a valid question would be to ask why we need unit testing at all. Why not write only integration tests, where a single test could check so many parts of the application at once? The reason is that integration tests do not pinpoint the location of failure. A failing integration test could have an error in the UI, or in the logic, or somewhere in the way data is read or written. It will take a lot of investigation to see where the error is and fix it. By contrast, with well-written unit tests, a failing unit test will pinpoint exactly what is failing. Developers can go right to the point and fix the error.
Along the way, teams started moving to a process where developers themselves wrote tests for the code that they had implemented. These tests would be written after the developer had finished the implementation, and helped verify that the code worked as expected. These tests were usually automated. Such a process is generally called developer testing or developer unit testing.
TDD takes developer tests one step further, by writing the test before starting the implementation.
Developer tests: Any kind of automated unit tests written by the developer, either before or after functionality is implemented.
Unit testing: Any kind of testing of a particular unit of an application, either by a developer or a tester. These tests might be automated, or run manually.
Integration testing: Any kind of testing that involves two or more units working together. These tests are typically performed by a tester, but they could be done by a developer as well. These tests might be manual or automated.
As we can see, unit testing is a general term, whereas developer testing is a specific subset of unit testing, and TDD is a specific form of developer testing.
On the surface, traditional unit testing, developer testing and TDD look similar. They all appear to be about writing tests for a single unit of code, with only minor variations based on who writes the test and whether the tests are written before the code or after.
However, dig deeper and differences appear. First, the intent is vastly different. Traditional unit testing and developer testing are all about writing tests to verify that the code works as it is supposed to. On the other hand, the main focus of TDD is not really about testing. The simple act of writing a test before the implementation changes the way we think when we implement the corresponding functionality. The resulting code is more testable, usually has a simple and elegant design, and is more maintainable and readable. This is because making a class easy to test also encourages good design practices, such as decoupling dependencies and writing small, modular classes.
Thus, one can say that TDD is all about writing better code, and it is just a happy side effect that we end up with a fully automated test suite as an outcome.
This difference in intent manifests itself in the type of tests. Developer testing usually results in large test cases, with a hefty part of the test code involved in test setup. By contrast, tests written using TDD are very small and numerous. Some people like to call them micro tests to differentiate them from other developer tests or traditional unit tests. TDD-style unit tests also try to be very fast to run because they are executed every few minutes during the development process.
Finally, the tests that are written in TDD are those that drive the development forward, and not necessarily those that cover all imaginable scenarios. For example, a function that is supposed to process a file might have tests to handle cases when the file exists or it doesn't exist, but probably won't have tests to see what happens if the file is 1 terabyte in size. The latter is something that a tester might conceivably test for, but would be an unusual test in TDD unless the function is clearly expected to work with such a file.
In TDD, tests are nothing but requirements.
A test is brittle when a change in the implementation details requires a change in the test cases. Ideally, a test should be testing the interface and not the implementation directly. After all, it is the interface that other units will be using to interact with this unit. When we test through the interface, it allows us the freedom to change the implementation of code without worrying about breaking the tests.
If at all possible, avoid using implementation details in tests, and only use the publicly exposed interface. This includes using only the interface methods in setup code and assertions (see the sketch below).
That said, there are exceptions:
If the test needs to check functionality that is internal to the unit being tested, and it is an important functionality, then it might make sense to check for specific implementation actions.
If it is cumbersome to use the external interface to set up the exact state that we want, or there is no interface method that retrieves the specific value we want to assert, then we may need to peek into the implementation in our tests.
If we are fairly confident that the implementation details are very unlikely to change in the future, then we might go ahead and use implementation-specific details in the test.
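A minimal sketch of the difference, using the Stock class that appears later in these notes (treating its history attribute as an internal detail):

def test_price_via_public_interface(self):
    goog = Stock("GOOG")
    goog.update(datetime(2014, 2, 12), price=10)
    # asserts through the public interface; survives changes to how history is stored
    self.assertEqual(10, goog.price)

def test_price_via_implementation_detail(self):
    goog = Stock("GOOG")
    goog.update(datetime(2014, 2, 12), price=10)
    # brittle: breaks as soon as the internal history representation changes
    self.assertEqual(10, goog.history[-1].value)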
Mocking notes:
We've been talking about mocks so far, but we've been a little loose on the terminology. Technically, everything we've talked about falls under the category of a test double. A test double is some sort of fake object that we use to stand in for a real object in a test case.
Mocks are a specific kind of test double that record information about calls that have been made to them, so that we can assert on them later.
Stubs are just an empty do-nothing kind of object or method. They are used when we don't care about some functionality in the test. For example, imagine we have a method that performs a calculation and then sends an e-mail. If we are testing the calculation logic, we might just replace the e-mail sending method with an empty do-nothing method in the test case so that no e-mails are sent out while the test is running.
Fakes are a replacement of one object or system with a simpler one that facilitates easier testing. Using an in-memory database instead of the real one, or the way we created a dummy TestAction earlier in this chapter would be examples of fakes.
Finally, spies are objects that are like middlemen. Like mocks, they record the calls so that we can assert on them later, but after recording, they continue execution to the original code. Spies are different from the other three in the sense that they do not replace any functionality. After recording the call, the real code is still executed. Spies sit in the middle and do not cause any change in execution pattern.
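A rough sketch of how these doubles map onto unittest.mock (the names here are made up for illustration):

from unittest import mock

# mock: records calls so that we can assert on them later
notifier = mock.Mock()
notifier.send("price alert")
notifier.send.assert_called_with("price alert")

# stub: a do-nothing stand-in for functionality we don't care about in this test
send_email = lambda message: None

# fake: a simpler working replacement, for example an in-memory "database"
fake_db = {}

# spy: wraps the real object, records the call, then lets the real code run
def real_calculation(x):
    return x * 2

calculation_spy = mock.MagicMock(wraps=real_calculation)
assert calculation_spy(3) == 6
calculation_spy.assert_called_with(3)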
Legacy code:
Characterization tests are tests that describe the current behavior of the code. We aren't writing the tests against a predefined expectation. Instead, we write the tests against the actual behavior. You may ask what this accomplishes since the test can't fail if we are going to look at the current behavior and write a test that looks for the same thing. However, the thing to remember is that we aren't trying to find bugs. Instead, by writing tests against the current behavior, we are building up a safety net of tests. If we break something during the process of refactoring, the test will fail and we will know that we have to undo our changes.
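A tiny illustration of the idea, with a made-up legacy function:

import unittest

# legacy code whose exact behavior we are not sure about
def describe_trend(prices):
    if not prices or prices[-1] >= prices[0]:
        return "up"
    return "down"

class DescribeTrendCharacterizationTest(unittest.TestCase):
    def test_empty_price_list_is_reported_as_up(self):
        # asserts what the code does today (maybe a bug), so refactoring is protected
        self.assertEqual("up", describe_trend([]))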
Maintaining Your Test Suite:
Unit tests serve a number of different purposes:
Serve as a test suite: This is the most obvious goal of unit tests. A comprehensive test suite reduces the number of bugs that can escape into production.
Serve as documentation: When we are trying to understand what a class or method is trying to do, it is useful to take a look at the test suite. A well-written test suite will illustrate how the piece of code is supposed to behave.
Serve as a safety net: This frees us from pressure when we refactor or clean up the code.
Illustrate the design: Using mocks, we can depict the interactions between different classes or modules.
The goal of a well-written suite of unit tests is to serve these purposes as well as possible.
For example, if we want to understand what a method does, then we need to work out the following:
Can we locate its unit test suite easily?
Once located, is it easy to understand the tests and what they are testing for?
Can we understand how this method interacts with other classes?
Once we understand what the method does, then is it easy to refactor it? Does making small refactorings break all the tests?
Is it easy to identify which tests have broken due to our refactoring and which failures are bugs in the refactoring?
If we need to make changes in the test case due to the refactoring, then how difficult is it to understand the test and make the required change?
To avoid this situation, the unittest module provides the addCleanup method. This method takes a callback function, which is called no matter whether the setup passed or failed, as shown in the setUp example further below.
When would we use addCleanup versus tearDown? In general, when accessing resources that must be closed, addCleanup is the way to go. tearDown is a good place to put other types of cleanup, or it is fine in cases where setUp cannot throw an exception.
@mock.patch.object(smtplib, "SMTP")
class EmailActionTest(unittest.TestCase):
    def setUp(self):
        self.action = EmailAction(to="[email protected]")

    def test_connection_closed_after_sending_mail(self, mock_smtp_class):
        mock_smtp = mock_smtp_class.return_value
        self.action.execute("MSFT has crossed $10 price level")
        mock_smtp.send_message.assert_called_with(mock.ANY)
        self.assertTrue(mock_smtp.quit.called)
        mock_smtp.assert_has_calls([
            mock.call.send_message(mock.ANY),
            mock.call.quit()])
Instead of doing the above, move the patching into setUp and register the cleanup there:

def setUp(self):
    patcher = mock.patch("smtplib.SMTP")
    self.addCleanup(patcher.stop)
    self.mock_smtp_class = patcher.start()
    self.mock_smtp = self.mock_smtp_class.return_value
    self.action = EmailAction(to="[email protected]")
Writing better asserts:

class TimeSeriesEqualityTest(unittest.TestCase):
    def test_timeseries_price_history(self):
        series = TimeSeries()
        series.update(datetime(2014, 3, 10), 5)
        series.update(datetime(2014, 3, 11), 15)
        self.assertEqual(5, series[0].value)
        self.assertEqual(15, series[1].value)

Rewritten:
class TimeSeriesTestCase(unittest.TestCase):
    def assert_has_price_history(self, price_list, series):
        for index, expected_price in enumerate(price_list):
            actual_price = series[index].value
            if actual_price != expected_price:
                raise self.failureException(
                    "[{0}]: {1} != {2}".format(index, expected_price, actual_price))

class TimeSeriesEqualityTest(TimeSeriesTestCase):
    def test_timeseries_price_history(self):
        series = TimeSeries()
        series.update(datetime(2014, 3, 10), 5)
        series.update(datetime(2014, 3, 11), 15)
        self.assert_has_price_history([5, 15], series)
When the same sequence of asserts is made again and again in multiple tests, we should see whether there is an opportunity to replace the calls to the built-in asserts with a higher level assert that more clearly expresses the intent that we want to convey.
Adding custom equality checkers for assertEqual:
>>> def assert_stocks_equal(stock_1, stock_2, msg=None):
...     if stock_1.symbol != stock_2.symbol:
...         raise test_case.failureException(msg or "stocks differ")
...
>>> test_case.addTypeEqualityFunc(Stock, assert_stocks_equal)
>>> test_case.assertEqual(stock_1, stock_2)
The equality checker takes three parameters: the first object, the second object, and the optional message that the user passed to assertEqual. It signals a mismatch by raising failureException; its return value is ignored, which is why a plain boolean-returning lambda would never fail. Once we register the function like this, we can call assertEqual passing in two Stock objects and assertEqual will delegate the comparison to the function that we registered.
Two limitations:
We have to use the same comparison function for a given type. There is no way to use one comparison function in some tests and another comparison function in other tests.
Both parameters to assertEqual have to be objects of that type. There is no way we can pass in two objects of differing types.
A third way to make asserts more readable is to create custom matcher objects to make comparisons more readable during assertions.
Matchers:
A matcher can be any class that implements the __eq__ method.
The method takes the actual object as a parameter and can implement any comparison logic required.
class MessageMatcher:
    def __init__(self, expected):
        self.expected = expected

    def __eq__(self, other):
        return self.expected["Subject"] == other["Subject"] and \
            self.expected["From"] == other["From"] and \
            self.expected["To"] == other["To"] and \
            self.expected["Message"] == other._payload
The matcher does not need to compare complete domain objects; we can compare only the attributes that we are interested in:
class AlertMessageMatcher:
    def __init__(self, expected):
        self.expected = expected

    def __eq__(self, other):
        return self.expected["Subject"] == "New Stock Alert" and \
            self.expected["From"] == other["From"] and \
            self.expected["To"] == other["To"] and \
            self.expected["Message"] == other._payload
Doctests:

def price(self):
    """Returns the current price of the Stock

    >>> from datetime import datetime
    >>> stock = Stock("GOOG")
    >>> stock.update(datetime(2011, 10, 3), 10)
    >>> stock.price
    10
    """
    try:
        return self.history[-1].value
    except IndexError:
        return None
The >>> lines in the docstring can be executed with:

if __name__ == '__main__':
    import doctest
    doctest.testmod()
Running a big doctest:

def price(self):
    """Returns the current price of the Stock

    >>> from datetime import datetime
    >>> stock = Stock("GOOG")
    >>> stock.update(datetime(2011, 10, 3), 10)
    >>> stock.price
    10

    The method will return the latest price by timestamp, so even if
    updates are out of order, it will return the latest one

    >>> stock = Stock("GOOG")
    >>> stock.update(datetime(2011, 10, 3), 10)

    Now, let us do an update with a date that is earlier than the
    previous one

    >>> stock.update(datetime(2011, 10, 2), 5)

    And the method still returns the latest price

    >>> stock.price
    10

    If there are no updates, then the method returns None

    >>> stock = Stock("GOOG")
    >>> print(stock.price)
    None
    """
    try:
        return self.history[-1].value
    except IndexError:
        return None
When run with doctest, the above produces:

Trying:
    stock = Stock("GOOG")
Expecting nothing
ok
Trying:
    stock.update(datetime(2011, 10, 3), 10)
Expecting nothing
ok
Trying:
    stock.update(datetime(2011, 10, 2), 5)
Expecting nothing
ok
Trying:
    stock.price
Expecting:
    10
ok
Trying:
    stock = Stock("GOOG")
Expecting nothing
ok
Trying:
    print(stock.price)
Expecting:
    None
ok

As we can see, doctest goes through the documentation and identifies the exact lines that need to be executed.
This allows us to put explanations and documentation in between the code snippets. The result? Well-explained documentation plus testable code. A great combination!
Testing for exceptions:

def update(self, timestamp, price):
    """Updates the stock with the price at the given timestamp

    >>> from datetime import datetime
    >>> stock = Stock("GOOG")
    >>> stock.update(datetime(2014, 10, 2), 10)
    >>> stock.price
    10

    The method raises a ValueError exception if the price is negative

    >>> stock.update(datetime(2014, 10, 2), -1)
    Traceback (most recent call last):
        ...
    ValueError: price should not be negative
    """
    if price < 0:
        raise ValueError("price should not be negative")
    self.history.update(timestamp, price)
    self.updated.fire(self)
doctest allows you to put three dots in place of the traceback stack; only the exception type and message are matched.
Package-level doctests:
Go in the __init__.py of the package.

r"""
The stock_alerter module allows you to set up rules and get alerted when those rules are met.

>>> from datetime import datetime

First, we need to setup an exchange that contains all the stocks that are going to be processed. A simple dictionary will do.

>>> from stock_alerter.stock import Stock
>>> exchange = {"GOOG": Stock("GOOG"), "AAPL": Stock("AAPL")}

Next, we configure the reader. The reader is the source from where the stock updates are coming. The module provides two readers out of the box: A FileReader for reading updates from a comma separated file, and a ListReader to get updates from a list. You can create other readers, such as an HTTPReader, to get updates from a remote server.

Here we create a simple ListReader by passing in a list of 3-tuples containing the stock symbol, timestamp and price.

>>> from stock_alerter.reader import ListReader
>>> reader = ListReader([("GOOG", datetime(2014, 2, 8), 5)])

Next, we set up an Alert. We give it a rule, and an action to be taken when the rule is fired.

>>> from stock_alerter.alert import Alert
>>> from stock_alerter.rule import PriceRule
>>> from stock_alerter.action import PrintAction
>>> alert = Alert("GOOG > $3", PriceRule("GOOG", lambda s: s.price > 3),\
...               PrintAction())

Connect the alert to the exchange

>>> alert.connect(exchange)

Now that everything is setup, we can start processing the updates

>>> from stock_alerter.processor import Processor
>>> processor = Processor(reader, exchange)
>>> processor.process()
GOOG > $3
"""

if __name__ == "__main__":
    import doctest
    doctest.testmod()
You can also move the doctests out of __init__.py and into a separate file (since they are just a raw string anyway):

if __name__ == "__main__":
    import doctest
    doctest.testfile("readme.txt")
One of the missing features of the doctest module is an effective autodiscovery mechanism. Sucks for large projects.
There are some ways to accomplish this, though:

import doctest
import unittest

from stock_alerter import stock

class PackageDocTest(unittest.TestCase):
    def test_stock_module(self):
        doctest.testmod(stock)

    def test_doc(self):
        doctest.testfile(r"..\readme.txt")

This works, but the problem is that the test doesn't fail if there is a failure in the doctest. The error is printed out, but it doesn't record a failure. This is okay if the tests are run manually, but causes a problem when the tests are run in an automated fashion, for example, as a part of a build or deploy process.
doctest has another feature by which it can be wrapped inside a unittest:

import doctest

from stock_alerter import stock

def load_tests(loader, tests, pattern):
    tests.addTests(doctest.DocTestSuite(stock))
    tests.addTests(doctest.DocFileSuite("../readme.txt"))
    return tests

Since doctests aren't part of unittest.TestCase, they are not run by default when we execute the unit tests. What we do instead is implement the load_tests function and add the doctests to the test suite in that function. We use the doctest.DocTestSuite and doctest.DocFileSuite methods to create unittest-compatible test suites from the doctests. We then append these test suites to the overall tests to be executed in the load_tests function.
The biggest limitation of doctest is that it only compares printed output. This means that any output that could be variable will lead to test failures. The following is an example:

>>> exchange
{'GOOG': <stock_alerter.stock.Stock object at 0x00000000031F8550>, 'AAPL': <stock_alerter.stock.Stock object at 0x00000000031F8588>}

The order in which a dictionary object is printed out is not guaranteed by Python, which means it could be printed out in the opposite order, sometimes leading to failure
The Stock object might be at a different address each time, so that part will fail to match the next time the test is run
The solution to the first problem is to make the output deterministic (for example, by iterating over sorted keys); the varying addresses are handled with the ELLIPSIS directive. The following approach will work:

>>> for key in sorted(exchange.keys()): #doctest: +ELLIPSIS
...     print(key, exchange[key])
...
AAPL <stock_alerter.stock.Stock object at 0x0...>
GOOG <stock_alerter.stock.Stock object at 0x0...>
Now that we have a pretty good idea of doctests, the next question is: how does this fit into the TDD process? Remember, in the TDD process, we write the test first, and then the implementation later. Do doctests fit in with this process?
How do doctests fit into the TDD process?:
In a way, yes. Doctests are not a particularly good fit for doing TDD for single methods. The unittest module is a better choice for those. Where doctest shines is at package-level interaction. Explanations interspersed with examples really bring out interaction between different modules and classes within the package. Such doctests can be written out at the beginning, giving a high-level overview of how we want the package as a whole to work. These tests will fail. As individual classes and methods are written, the tests will start to pass.
nose2:
Tests in nose2 are cleaner. No _need_ for classes like in unittest:
class StockTest(unittest.TestCase):
    def setUp(self):
        self.goog = Stock("GOOG")

    def test_price_of_a_new_stock_class_should_be_None(self):
        self.assertIsNone(self.goog.price)

becomes

def test_price_of_a_new_stock_class_should_be_None():
    goog = Stock("GOOG")
    assert goog.price is None  # uses Python's built-in assert
We no longer have to create a class to hold the tests in
We no longer have to inherit from any base class
We don't even have to import the unittest module
(nose2 will still run unittest-style tests, though)
assert also takes a message, like the unittest asserts do:
assert goog.price is None, "Price of a new stock should be None"
nose2 also supports setup and teardown for function-style tests:

def setup_test():
    global goog
    goog = Stock("GOOG")

def teardown_test():
    global goog
    goog = None

def test_price_of_a_new_stock_class_should_be_None():
    assert goog.price is None, "Price of a new stock should be None"

test_price_of_a_new_stock_class_should_be_None.setup = setup_test
test_price_of_a_new_stock_class_should_be_None.teardown = teardown_test
Parameterizing tests:

from nose2.tools.params import params

def given_a_series_of_prices(stock, prices):
    timestamps = [datetime(2014, 2, 10), datetime(2014, 2, 11),
                  datetime(2014, 2, 12), datetime(2014, 2, 13)]
    for timestamp, price in zip(timestamps, prices):
        stock.update(timestamp, price)

@params(
    ([8, 10, 12], True),
    ([8, 12, 10], False),
    ([8, 10, 10], False)
)
def test_stock_trends(prices, expected_output):
    goog = Stock("GOOG")
    given_a_series_of_prices(goog, prices)
    assert goog.is_increasing_trend() == expected_output
If we make a package for nose2 (we currently have one for nose, which is dead), we need to list six as a required module.
For Python 2.7, Python 3.2 and pypy, nose2 requires six version 1.1.
nose2 supports generated tests:

def test_trend_with_all_consecutive_values_upto_100():
    for i in range(100):
        yield stock_trends_with_consecutive_prices, [i, i+1, i+2]

def stock_trends_with_consecutive_prices(prices):
    goog = Stock("GOOG")
    given_a_series_of_prices(goog, prices)
    assert goog.is_increasing_trend()

The example above generates 100 tests.
Uses a generator (yield).
When a test fails, it shows you the number of the test that failed:

FAIL: stock_alerter.tests.test_stock.test_trend_with_all_consecutive_values_upto_100:100
[99, 100, 100]
Layers:
Looking at the tests that have been written in the package already, you see there's a lot of repetition.

class StockTest(unittest.TestCase):
    def setUp(self):
        self.goog = Stock("GOOG")

class StockCrossOverSignalTest(unittest.TestCase):
    def setUp(self):
        self.goog = Stock("GOOG")
nose2 allows you to share part of the setup between tests.
nose2 has another way to write tests called Layers. Layers allow us to organize our tests hierarchically. The following is an example of some of the Stock tests rewritten using Layers:
from nose2.tools import such

with such.A("Stock class") as it:
    @it.has_setup
    def setup():
        it.goog = Stock("GOOG")

    with it.having("a price method"):
        @it.has_setup
        def setup():
            it.goog.update(datetime(2014, 2, 12), price=10)

        @it.should("return the price")
        def test(case):
            assert it.goog.price == 10

        @it.should("return the latest price")
        def test(case):
            it.goog.update(datetime(2014, 2, 11), price=15)
            assert it.goog.price == 10

    with it.having("a trend method"):
        @it.should("return True if last three updates were increasing")
        def test(case):
            it.goog.update(datetime(2014, 2, 11), price=12)
            it.goog.update(datetime(2014, 2, 12), price=13)
            it.goog.update(datetime(2014, 2, 13), price=14)
            assert it.goog.is_increasing_trend()

it.createTests(globals())
This feels very BDD, but I like it.
Test cases are marked using the should decorator, which takes a description string which explains the test.
The createTests method should be called at the end of the topmost layer. It takes a single parameter of the current globals.
If the createTests method is not called, none of the tests will be executed.
This test suite gives the following output:

A Stock class
  having a price method
    should return the price ... ok
    should return the latest price ... ok
  having a trend method
    should return True if last three updates were increasing ... ok
nose2 has plugins, like the following:
coverage, doctest, debugging, writing test results to an XML file (boo :))
Can have a nose2.cfg in the src directory that looks like this:

[unittest]
test-file-pattern=test_*.py
test-method-prefix=test
plugins = nose2.plugins.coverage
          nose2.plugins.junitxml
          nose2.plugins.layers
exclude-plugins = nose2.plugins.doctest

[layer-reporter]
always-on = False
colors = True

[junit-xml]
always-on = True
path = nose2.xml

[coverage]
always-on = False
coverage-report = ["html", "xml"]
Also supports multiple configuration files with the --config option.
=============
Unit Testing patterns:

fast-tests:
Disable unwanted external services
Mock out external services
Use fast variants of services
Externalize configuration - What does configuration have to do with unit testing? Simple: if we need to enable or disable services, or replace services with fake services, then we need to have different configurations for the regular application and for when running unit tests. This requires us to design the application in a way that allows us to easily switch between multiple configurations (see the sketch after this list).
Run tests for the current module only.
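A minimal sketch of externalizing configuration; the class names and the environment variable are assumptions, not part of the notes:

import os

class EmailAction:
    """The real service (assumed name): sends mail via SMTP."""
    def __init__(self, to):
        self.to = to

class FakeEmailAction:
    """A do-nothing stand-in used while the tests run."""
    def __init__(self, to):
        self.to = to
    def execute(self, content):
        pass

def make_email_action(to):
    # the environment variable is the externalized configuration:
    # the unit-test run sets it so no real SMTP server is ever touched
    if os.environ.get("STOCK_ALERTER_ENV") == "test":
        return FakeEmailAction(to)
    return EmailAction(to)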
Running a subset of tests:
Can create our own test suites

---------------
test_something.py

def suite():
    test_suite = unittest.TestSuite()
    test_suite.addTest(StockTest("test_stock_update"))
    return test_suite

---------------
test_suites.py

import unittest

from stock_alerter.tests import test_stock

if __name__ == "__main__":
    runner = unittest.TextTestRunner()
    runner.run(test_stock.suite())

---------------
One of the problems with the suite function is that we have to add each test individually into the suite. Can simplify by using a unittest.TestLoader:

def suite():
    loader = unittest.TestLoader()
    test_suite = unittest.TestSuite()
    test_suite.addTest(StockTest("test_stock_update"))
    test_suite.addTest(
        loader.loadTestsFromTestCase(StockCrossOverSignalTest))
    return test_suite
A simpler way to create test suites is with the load_tests function:

import sys

def load_tests(loader, tests, pattern):
    suite = unittest.TestSuite()
    suite.addTest(loader.loadTestsFromTestCase(StockTest))
    if not sys.platform.startswith("win"):
        suite.addTest(
            loader.loadTestsFromTestCase(StockCrossOverSignalTest))
    return suite
--------------
Skipping tests:

@unittest.skip("skip this test for now")
def test_stock_update(self):
    self.goog.update(datetime(2014, 2, 12), price=10)
    assert_that(self.goog.price, equal_to(10))
There's also skipIf and skipUnless.
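A quick sketch of both decorators (the conditions are just illustrative):

import sys
import unittest

class StockSkipTest(unittest.TestCase):
    @unittest.skipIf(sys.platform.startswith("win"), "not supported on Windows")
    def test_stock_update(self):
        ...

    @unittest.skipUnless(hasattr(sys, "getwindowsversion"), "requires Windows")
    def test_windows_only_behaviour(self):
        ...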
Pattern: using attributes
nose2 has an attrib plugin that allows us to set attrs on test cases and select tests that match particular attributes

def test_stock_update(self):
    self.goog.update(datetime(2014, 2, 12), price=10)
    self.assertEqual(10, self.goog.price)

test_stock_update.slow = True
test_stock_update.integration = True
test_stock_update.python = ["2.6", "3.4"]
--------------
Using attributes with standard unittests :\

import unittest

class AttribLoader(unittest.TestLoader):
    def __init__(self, attrib):
        super().__init__()
        self.attrib = attrib

    def loadTestsFromModule(self, module, use_load_tests=False):
        return super().loadTestsFromModule(module, use_load_tests=False)

    def getTestCaseNames(self, testCaseClass):
        test_names = super().getTestCaseNames(testCaseClass)
        filtered_test_names = [test
                               for test in test_names
                               if hasattr(getattr(testCaseClass, test), self.attrib)]
        return filtered_test_names

if __name__ == "__main__":
    loader = AttribLoader("slow")
    test_suite = loader.discover(".")
    runner = unittest.TextTestRunner()
    runner.run(test_suite)

This little piece of code will only run those tests that have the requested attribute (here, slow) set on the test function.
--------------
Pattern: Expected Failure
Put this above the test: @unittest.expectedFailure
This should obviously never go into a shipped package.
But for development or temporarily, it's fine. If you're working on something and know it's going to fail, then fine.
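A quick sketch; the is_crossover method is made up for illustration:

@unittest.expectedFailure
def test_crossover_signal(self):
    # known to fail until the crossover calculation is implemented
    self.goog.update(datetime(2014, 2, 12), price=10)
    self.assertTrue(self.goog.is_crossover())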
--------------
Pattern: data-driven tests
Data-driven tests reduce the amount of boilerplate test code by allowing us to write a single test execution flow and run it with different combinations of data.

from nose2.tools.params import params

def given_a_series_of_prices(stock, prices):
    timestamps = [datetime(2014, 2, 10), datetime(2014, 2, 11),
                  datetime(2014, 2, 12), datetime(2014, 2, 13)]
    for timestamp, price in zip(timestamps, prices):
        stock.update(timestamp, price)

@params(
    ([8, 10, 12], True),
    ([8, 12, 10], False),
    ([8, 10, 10], False)
)
def test_stock_trends(prices, expected_output):
    goog = Stock("GOOG")
    given_a_series_of_prices(goog, prices)
    assert goog.is_increasing_trend() == expected_output
--------------
The following cannot be done in Python 2; it's only in Python 3.4 or higher (or you can write metaclasses. Boo to that.)
It's a subTest context manager - all code within the context manager block is treated as a separate test, and any failures are reported independently.
class StockTrendTest(unittest.TestCase):
    def given_a_series_of_prices(self, stock, prices):
        timestamps = [datetime(2014, 2, 10), datetime(2014, 2, 11),
                      datetime(2014, 2, 12), datetime(2014, 2, 13)]
        for timestamp, price in zip(timestamps, prices):
            stock.update(timestamp, price)

    def test_stock_trends(self):
        dataset = [
            ([8, 10, 12], True),
            ([8, 12, 10], False),
            ([8, 10, 10], False)
        ]
        for data in dataset:
            prices, output = data
            with self.subTest(prices=prices, output=output):
                goog = Stock("GOOG")
                self.given_a_series_of_prices(goog, prices)
                self.assertEqual(output, goog.is_increasing_trend())
================
Pattern: integration and system tests
Unit tests are not integration tests. Integration tests have a different purpose: validating that the system works when integrated.
Integration tests are also important and shouldn't be ignored. Integration tests can be written using the same unittest framework that we use for writing unit tests. The key points to keep in mind when writing integration tests are as follows:
Still disable non-core services
Enable all core services
Use attributes to tag integration tests: By doing this, we can easily select only the unit tests to run during development, while enabling integration tests to be run during continuous integration or before deployment.
Try to minimize setup and teardown time
================
Pattern - spies
sometimes we might want to just record that the call was made to an object, but allow the execution flow to continue on to the real object and return
This object is known as a spy.
A spy retains the functionality of recording calls and being able to assert on the calls afterwards, but it does not replace a real object like a regular mock does.
The wraps parameter when creating a mock.Mock object allows us to create spy behavior in our code

def test_action_doesnt_fire_if_rule_doesnt_match(self):
    goog = Stock("GOOG")
    exchange = {"GOOG": goog}
    rule = PriceRule("GOOG", lambda stock: stock.price > 10)
    # the spy wraps the real rule: calls are recorded, then forwarded to it
    rule_spy = mock.MagicMock(wraps=rule)
    action = mock.MagicMock()
    alert = Alert("sample alert", rule_spy, action)
    alert.connect(exchange)
    alert.check_rule(goog)
    rule_spy.matches.assert_called_with(exchange)
    self.assertFalse(action.execute.called)
================
Pattern - asserting a sequence of calls
we want to assert that a particular sequence of calls occurred across multiple objects

def test_action_fires_when_rule_matches(self):
    goog = Stock("GOOG")
    exchange = {"GOOG": goog}
    rule = mock.MagicMock()
    rule.matches.return_value = True
    rule.depends_on.return_value = {"GOOG"}
    action = mock.MagicMock()
    alert = Alert("sample alert", rule, action)
    alert.connect(exchange)
    goog.update(datetime(2014, 5, 14), 11)
    rule.matches.assert_called_with(exchange)
    self.assertTrue(action.execute.called)

In this test, we are asserting that a call was made to the rule.matches method as well as a call being made to the action.execute method. The way we have written the assertions does not check the order of these two calls. This test will still pass even if the matches method is called after the execute method.
Interactive:

>>> from unittest import mock
>>> obj = mock.Mock()
>>> child_obj1 = obj.child1
>>> child_obj2 = obj.child2
>>> child_obj1.method1()
<Mock name='mock.child1.method1()' id='56161448'>
>>> child_obj2.method1()
<Mock name='mock.child2.method1()' id='56161672'>
>>> child_obj2.method2()
<Mock name='mock.child2.method2()' id='56162008'>
>>> obj.method()
<Mock name='mock.method()' id='56162232'>
Mock objects by default return new mocks whenever an attribute is accessed that doesn't have a return_value configured. So child_obj1 and child_obj2 will also be mock objects.
Now, let us look at the mock_calls attribute for the child objects. This attribute contains a list of all the recorded calls on the mock object, as shown in the following:

>>> child_obj1.mock_calls
[call.method1()]
>>> child_obj2.mock_calls
[call.method1(), call.method2()]

The mock objects have the appropriate method calls recorded, as expected. Let us now take a look at the attribute on the main obj mock object, as follows:

>>> obj.mock_calls
[call.child1.method1(),
 call.child2.method1(),
 call.child2.method2(),
 call.method()]

The main mock object seems to not only have details of its own calls, but also all the calls made by the child mocks!
So, how can we use this feature in our test to assert the order of the calls made across different mocks?
def test_action_fires_when_rule_matches(self):
    goog = Stock("GOOG")
    exchange = {"GOOG": goog}
    main_mock = mock.MagicMock()
    rule = main_mock.rule
    rule.matches.return_value = True
    rule.depends_on.return_value = {"GOOG"}
    action = main_mock.action
    alert = Alert("sample alert", rule, action)
    alert.connect(exchange)
    goog.update(datetime(2014, 5, 14), 11)
    self.assertEqual([mock.call.rule.depends_on(),
                      mock.call.rule.matches(exchange),
                      mock.call.action.execute("sample alert")],
                     main_mock.mock_calls)
Usually, matching every call will lead to verbose tests, as all calls that are tangential to the functionality being tested also need to be put in the expected output.
It also leads to brittle tests, as even a slight change in calls elsewhere, which might not lead to a change in behavior in this particular test, will still cause the test to fail.
This is why the default behavior for assert_has_calls is to only determine whether the expected calls are present, instead of checking for an exact match of all calls.
In the rare cases where an exact match is required, we can always assert on the mock_calls attribute directly.
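A quick sketch of the difference, with made-up method names:

from unittest import mock

m = mock.Mock()
m.first(1)
m.second(2)
m.third(3)

# passes: the expected calls appear consecutively; extra calls before/after are allowed
m.assert_has_calls([mock.call.first(1), mock.call.second(2)])

# passes: with any_order=True only presence is checked, not ordering
m.assert_has_calls([mock.call.first(1), mock.call.third(3)], any_order=True)

# an exact match has to go through mock_calls directly
assert m.mock_calls == [mock.call.first(1), mock.call.second(2), mock.call.third(3)]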
=================
Pattern – patching the open function
mock has a built-in mock_open helper for patching the open function, which is useful.

class FileReaderTest(unittest.TestCase):
    @mock.patch("builtins.open",
                mock.mock_open(read_data="""\
GOOG,2014-02-11T14:10:22.13,10"""))
    def test_FileReader_returns_the_file_contents(self):
        reader = FileReader("stocks.txt")
        updater = reader.get_updates()  # get_updates is a generator function
        update = next(updater)
        self.assertEqual(("GOOG",
                          datetime(2014, 2, 11, 14, 10, 22, 130000),
                          10), update)
===================
Pattern – mocking with mutable args
Mock only stores a reference to the call arguments.

>>> from unittest import mock
>>> param = ["abc"]
>>> obj = mock.Mock()
>>> _ = obj(param)
>>> param[0] = "123"
>>> obj.assert_called_with(["abc"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python34\lib\unittest\mock.py", line 760, in assert_called_with
    raise AssertionError(_error_message()) from cause
AssertionError: Expected call: mock(['abc'])
Actual call: mock(['123'])

So, when the line param[0] = "123" was executed, it affected the value that was saved as the call argument in the mock.
In the assertion, it looks at the saved call argument, and sees that the call was made with the data ["123"], so the assertion fails.
Simple Solution: we just inherit from Mock or MagicMock and change the behavior to make a copy of the arguments, as shown in the following:

>>> from copy import deepcopy
>>>
>>> class CopyingMock(mock.MagicMock):
...     def __call__(self, *args, **kwargs):
...         args = deepcopy(args)
...         kwargs = deepcopy(kwargs)
...         return super().__call__(*args, **kwargs)
...
>>> param = ["abc"]
>>> obj = CopyingMock()
>>> _ = obj(param)
>>> param[0] = "123"
>>> obj.assert_called_with(["abc"])
Keep in mind that when we use CopyingMock, we cannot use any object identity comparisons with the arguments, as they will now fail, as the following sketch shows.
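A small sketch of why identity checks break (building on the CopyingMock defined above):

param = ["abc"]
obj = CopyingMock()
obj(param)

# the recorded argument is a deep copy, not the original object
assert obj.call_args[0][0] == param        # equal by value
assert obj.call_args[0][0] is not param    # a different object, so identity checks fail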
==============
Tools to improve

py.test
features:
Writing tests as ordinary functions.
Using Python's assert statement to perform asserts.
Ability to skip tests or mark tests as expected failures.
Fixtures with setup and teardown support.
Extensible plugin framework, with plugins available to do popular functionality such as XML output, coverage reporting, and running tests in parallel across multiple processors or cores.
Tag tests with attributes.
Integration with popular tools.
funcargs with pytest:

import pytest

@pytest.fixture
def goog():
    return Stock("GOOG")

def test_stock_update(goog):
    assert goog.price is None

In this code, the test_stock_update function takes a parameter called goog. Additionally, we have a function called goog, which is marked with the pytest.fixture decorator. PyTest will match these two, call the appropriate fixture, and pass in the return value as a parameter to the test.
This solves the following two problems:
It enables us to pass fixture values to function style test cases without having to resort to globals.
Instead of writing a large fixture, we can create many small ones, and the test cases only use the fixtures that they need. This makes it easier to read test cases as we don't have to look at a large fixture setup that has different lines of setup meant for different tests.
====================
Can integrate the test_suite into setup.py files

from setuptools import setup, find_packages

setup(
    name="StockAlerter",
    version="0.1",
    packages=find_packages(),
    test_suite="stock_alerter.tests",
)
Can also use nose2 for the testsuite in setup.py

from setuptools import setup, find_packages

setup(
    name="StockAlerter",
    version="0.1",
    packages=find_packages(),
    tests_require=["nose2"],
    test_suite="nose2.collector.collector",
)
====================
Can use tox if we want to set up maintenance of a Python package (and testing) across multiple versions of Python.
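A minimal tox.ini sketch; the environment list and dependencies are assumptions to adapt:

[tox]
envlist = py27, py34

[testenv]
deps = nose2
commands = nose2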
====================
Can use sphinx.ext.doctest if we decide to use the doctests.
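The gist of the setup, assuming a standard Sphinx project:

# conf.py
extensions = ['sphinx.ext.doctest']

# examples then go into the docs via the doctest::/testcode::/testoutput:: directives,
# and are run with `sphinx-build -b doctest` (or `make doctest`)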
====================
If we stick with unittest, all the features described above have been backported via unittest2.
We'll need to make a package for it, but that all depends on whether we want to go with nose2 or unittest2.