Level Up 3: Knows why flapping tests are destructive
1. First, tell me what a flapping test is. Then, explain some ways that flapping tests are more destructive than even failing or non-existent tests.
Flapping tests are tests that pass or fail non-deterministically: a test may fail on one run and pass on the next with no change to the code, hence "flapping".
Flapping tests are more destructive than failing or non-existent tests because they erode productivity: every time the suite runs, we have to re-investigate failures that may have nothing to do with our changes. A normal failing test is expected to fail at first, and making it pass is just a matter of writing working code. A flapping test, by contrast, destroys confidence that the test is doing its job: verifying that a specific part of the application works properly.
2. Name some gotchas when writing tests that can cause them to become fragile, and how to fix those problems.
- The browser environment: Never assume that the page has loaded, that the markup you are asserting against exists, that your AJAX requests have finished, or that browser actions occur at a predictable speed. Capybara tests are especially prone to these issues; you may have to be more patient by increasing Capybara's wait time so that browser rendering finishes before assertions run (see the Capybara sketch after this list).
- Race conditions: These often occur with AJAX. An AJAX call may return to the browser after the test's assertion has already run, making the test fail. Capybara mitigates this by retrying its matchers until they pass or a timeout elapses; you can also wait explicitly for the AJAX response to finish before asserting.
- Creating database transactions from within the test thread is another source of race conditions: you can't trust the test thread and the server thread to read the same database state. To fix this, use an immutable database state built from fixtures (e.g. with fixture_builder) for Cucumber tests; you can also use a mutex to control the race between exercising your application and making assertions. A common alternative is to drop transactional isolation for browser-driven specs, as sketched after this list.
- Fake data: Using poorly chosen fake data, for instance from Faker, can inject random values your application isn't built to handle (such as foreign zip codes). To counteract this, understand the fake data you are generating and constrain it so it exercises your application appropriately (see the Faker sketch below).
- Time-dependent data: Database timestamps may be stored in a different timezone than your machine's local time, leading to failures. Don't build assertions on the current day and time; pin the clock to fixed, timezone-aware dates and times (see the time-helper sketch below).
- External dependencies: When you rely on external services, you inject potential random failures into your test suite via that service's reliability and data. To handle a failure you need to reproduce it, which you can do by capturing the database state at the time of failure and replaying the third-party responses using libraries such as webmock and vcr (sketched below). The main purpose of your test suite is to tell you when your code is broken, not when the third party's is; to detect a broken or changed service, test those interactions in a separate external build, apart from your normal build.
- Test pollution: This results when a test alters state that persists and influences subsequent tests. RSpec counteracts order-dependence by running tests in a random order. To reproduce pollution, re-run the suite with the seed value RSpec printed, which forces the tests to run in the same order, and reuse the same seed data. If the failure repeats, that confirms test pollution, and you can isolate the offending tests by tightening the feedback loop with binary-search debugging (see the seed sketch below).
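
For the browser-environment and AJAX bullets, here is a minimal sketch of letting Capybara wait out asynchronous rendering. `default_max_wait_time`, the retrying matchers, and `using_wait_time` are Capybara's real API; the page path, button label, and expected text are hypothetical details of the app under test.

```ruby
require "capybara/rspec"

# Give retrying matchers up to 5 seconds before declaring failure.
Capybara.default_max_wait_time = 5

feature "adding a comment", js: true do
  scenario "the comment appears after the AJAX round trip" do
    visit "/posts/1"              # hypothetical page in the app under test
    click_button "Add comment"    # triggers an AJAX request

    # have_content retries until the text appears or the wait time elapses,
    # instead of asserting against markup that may not exist yet.
    expect(page).to have_content("Comment added")

    # For an unusually slow interaction, widen the window locally:
    using_wait_time(15) do
      expect(page).to have_css(".comment", count: 1)
    end
  end
end
```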
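
For the database-state race, one widely used alternative to fixtures is to disable transactional isolation for browser-driven specs, so the test thread and the server thread see the same committed rows. This sketch uses the database_cleaner gem in an rspec-rails setup; the `:js` metadata convention is an assumption about how the suite tags browser-driven examples.

```ruby
require "database_cleaner"

RSpec.configure do |config|
  config.use_transactional_fixtures = false # let DatabaseCleaner manage state

  config.before(:suite) { DatabaseCleaner.clean_with(:truncation) }

  config.before(:each) do |example|
    # JS-driven examples run the app in a separate thread, which can't see
    # uncommitted transactions, so fall back to truncation for those.
    DatabaseCleaner.strategy = example.metadata[:js] ? :truncation : :transaction
    DatabaseCleaner.start
  end

  config.after(:each) { DatabaseCleaner.clean }
end
```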
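
For the fake-data bullet, a small sketch of pinning Faker to data the app can actually handle. `Faker::Config.locale` and `Faker::Address.zip_code` are Faker's real API; the five-digit check encodes the assumption that this particular app only accepts US ZIP codes.

```ruby
require "faker"

# Keep generated addresses in the locale the application supports.
Faker::Config.locale = "en-US"

zip = Faker::Address.zip_code

# Guard the assumption that the app only handles US-style ZIP codes,
# so a surprising value fails loudly here rather than flapping elsewhere.
raise "unexpected ZIP format: #{zip}" unless zip =~ /\A\d{5}(-\d{4})?\z/
```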
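
For time-dependent data, a sketch of pinning "now" with ActiveSupport's time helpers (the Timecop gem offers an equivalent freeze/travel API). The `Invoice` model and its `overdue?` predicate are hypothetical app code.

```ruby
require "active_support/testing/time_helpers"

RSpec.configure do |config|
  config.include ActiveSupport::Testing::TimeHelpers
end

RSpec.describe Invoice do
  it "is overdue the day after its due date, wherever the suite runs" do
    # Pin the clock so the assertion doesn't depend on the machine's
    # current time or timezone.
    travel_to Time.utc(2015, 1, 15, 12, 0) do
      invoice = Invoice.new(due_on: Date.new(2015, 1, 14))
      expect(invoice).to be_overdue
    end
  end
end
```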
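
For external dependencies, a sketch of recording and replaying a third-party response with vcr on top of webmock, so the build fails when our code breaks rather than when the service does. `GeocodingClient` and its `lookup` method are a hypothetical wrapper around the external API.

```ruby
require "vcr"

VCR.configure do |c|
  c.cassette_library_dir = "spec/cassettes"
  c.hook_into :webmock # intercept HTTP at the webmock layer
end

RSpec.describe GeocodingClient do
  it "parses coordinates out of the provider's response" do
    # The first run records the real HTTP exchange to a cassette file;
    # later runs replay it, with no network access at all.
    VCR.use_cassette("geocode_boston") do
      result = GeocodingClient.new.lookup("Boston, MA")
      expect(result.latitude).to be_within(0.1).of(42.36)
    end
  end
end
```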
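
Finally, for test pollution, reproducing the failure means replaying the order. RSpec prints "Randomized with seed NNNN" on every run; feeding that number back, either on the command line as `--seed` or in configuration as below, reruns the suite in the identical order so you can binary-search for the polluting test.

```ruby
RSpec.configure do |config|
  config.order = :random

  # Reuse the seed printed by the failing run (e.g. "Randomized with
  # seed 1234") so the examples execute in exactly the same order.
  # Equivalent to `rspec --seed 1234` on the command line.
  config.seed = 1234
end
```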