Brooklyn Acceptance Testing

I think we have problems with our testing processes for Brooklyn. The integration tests are very brittle, and we only have weak acceptance criteria. We are trying to improve this with things like the Jenkins cloud setup and incremental fixes to the test suites, but there is still a lot more that can be done.

Test Environments

Brooklyn cannot be tested in isolation; it requires deployment to a test environment, ranging from a single machine to multiple cloud providers. This complexity is one of the challenges in reliably and repeatably testing complex software. One way of reducing the complexity is to restrict the number or scope of the environments used, but this does not adequately reflect the real world, where we cannot control the configuration used by a customer. So, to achieve a suitable level of confidence in the software, we must test against many different types and configurations of target system. The more permutations we test against, the more confidence we can have in the result.

For systems like Brooklyn running on a single machine, the confidence we have in our tests can be ranked by the properties of the target system. In increasing order of confidence, it would be something like this:

  1. Chosen machine (preconfigured)
  2. Chosen distribution, freshly installed
  3. Arbitrary distribution, freshly installed
  4. Arbitrary machine of chosen distribution
  5. Arbitrary machine with arbitrary distribution

This ranking is made more complex when deploying to multiple VMs, although there are fewer choices:

  1. Chosen VM image (preconfigured)
  2. VM image with chosen distribution
  3. VM image with arbitrary distribution

I think we are only at stage two for localhost-type testing, and struggling to reach stage three. For the multiple-VM scenario, we are at stage one, perhaps two.
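
One way to make this ranking operational would be to drive an acceptance suite from an explicit environment matrix, so that each permutation of machine and distribution becomes a test case. Here is a minimal TestNG sketch of the idea; the class, the helper calls and the matrix entries are hypothetical, not existing Brooklyn code:

```java
import org.testng.annotations.DataProvider;
import org.testng.annotations.Test;

public class EnvironmentMatrixTest {

    // Each row is one target-environment permutation from the ranking above;
    // later rows correspond to higher confidence in the result.
    @DataProvider(name = "environments")
    public Object[][] environments() {
        return new Object[][] {
            { "chosen-machine",    "preconfigured" },
            { "fresh-install",     "ubuntu-12.10" },
            { "fresh-install",     "arbitrary" },
            { "arbitrary-machine", "ubuntu-12.10" },
            { "arbitrary-machine", "arbitrary" },
        };
    }

    @Test(dataProvider = "environments", groups = "Acceptance")
    public void deployAndVerify(String machine, String distribution) {
        // provisionTarget and assertBrooklynDeploys are placeholders for
        // whatever helpers the acceptance suite would provide:
        // Target target = provisionTarget(machine, distribution);
        // assertBrooklynDeploys(target);
    }
}
```

How far down the matrix the suite stays green would then give a direct measure of which stage we have actually reached.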

Current Status

I would like to try and formalise this description, so that we can build a better (new!) acceptance test suite for Brooklyn, to demonstrate and gain confidence in the product. Our integration and live tests are getting better, but we also need to automate things like the example applications, as these are the real use cases that show up the issues that will appear in production.

We do have a green integration test suite running now, at least on a chosen machine: the AMI that is running Jenkins. This is an Ubuntu 12.10 installation with a carefully configured shell environment, including SSH keys and sudo setup, as well as some operating-system tweaks. This is a good thing, but we have to build from this to a green live test suite, and also gain the ability to run the software/* integration tests on an arbitrary VM image (in practice, a Jenkins slave VM) and then the live tests on arbitrary cloud providers with whatever VM images happen to be available.
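
For context, the split between these suites can be expressed with TestNG groups, which is how Brooklyn's tests are organised; a sketch of the pattern, where the class and method bodies are illustrative rather than actual Brooklyn tests:

```java
import org.testng.annotations.Test;

public class ExampleNodeTest {

    // Integration tests run against localhost, so they depend on a correctly
    // configured shell environment (SSH keys, passwordless sudo, and so on).
    @Test(groups = "Integration")
    public void testStartsOnLocalhost() throws Exception {
        // ... deploy to localhost and assert the service comes up ...
    }

    // Live tests provision real VMs from a cloud provider; they are slower
    // and cost money, so they run as a separate suite.
    @Test(groups = "Live")
    public void testStartsInCloud() throws Exception {
        // ... provision via jclouds and assert service health ...
    }
}
```

Groups can then be selected per run, for example with TestNG's -groups command-line option or the equivalent Surefire configuration, so the Jenkins jobs for integration and live runs stay independent.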

Suggested Improvements

These are not simple tasks, and Brooklyn is going to require some improvements before this is possible. For instance:

  • Changing the install/customize/start sequence for entities to use shell scripts.
  • Making more use of tools like Chef, Puppet and Whirr.
  • Better integration with jclouds (see the sketch after this list).
  • Building better test support infrastructure.
  • Developing code to support the integration and live testing of Brooklyn applications.

And overall, less reliance on one-time hacks to get things running, and more focus on general cases rather than specifics.
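
On the jclouds point, what we want from better integration is the ability to provision an arbitrary VM with only loose constraints, rather than a fixed image. A minimal, self-contained sketch of that using the jclouds compute API, where the provider, credentials and group name are placeholders:

```java
import com.google.common.collect.Iterables;

import org.jclouds.ContextBuilder;
import org.jclouds.compute.ComputeService;
import org.jclouds.compute.ComputeServiceContext;
import org.jclouds.compute.domain.NodeMetadata;
import org.jclouds.compute.domain.OsFamily;
import org.jclouds.compute.domain.Template;

public class ProvisionArbitraryVm {
    public static void main(String[] args) throws Exception {
        ComputeServiceContext context = ContextBuilder.newBuilder("aws-ec2")
                .credentials("identity", "credential") // placeholder credentials
                .buildView(ComputeServiceContext.class);
        ComputeService compute = context.getComputeService();

        // Ask for any image matching loose constraints, instead of pinning a
        // specific AMI; this is what testing against arbitrary images needs.
        Template template = compute.templateBuilder()
                .osFamily(OsFamily.UBUNTU)
                .minRam(1024)
                .build();

        NodeMetadata node = Iterables.getOnlyElement(
                compute.createNodesInGroup("brooklyn-test", 1, template));
        try {
            System.out.println("Provisioned " + node.getPublicAddresses());
            // ... run the install/customize/start scripts against the node ...
        } finally {
            compute.destroyNode(node.getId());
            context.close();
        }
    }
}
```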

We also need more rigorous test plans when evaluating release candidates, with a concrete mechanism for making go/no-go decisions. The jclouds test planning, with the large number of different cloud providers and APIs that must be checked, is probably a good example to follow here. During normal development, our continuous integration tests should always pass, and we should put effort into producing a live test suite that can be executed automatically, if not continuously, to give us confidence in multiple VM and cloud scenarios.
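
As one possible shape for that automation, the live suite could be kicked off nightly by a small programmatic TestNG runner whose exit code gives Jenkins a go/no-go signal; a sketch, with a hypothetical test class standing in for the real suite:

```java
import org.testng.TestNG;

public class NightlyLiveRun {
    public static void main(String[] args) {
        TestNG testng = new TestNG();
        // Run only the tests tagged as Live, across the classes that matter
        // for a release go/no-go decision.
        testng.setGroups("Live");
        testng.setTestClasses(new Class[] { ExampleNodeTest.class }); // hypothetical suite
        testng.run();
        // A non-zero exit code fails the Jenkins job, giving a clear no-go.
        System.exit(testng.hasFailure() ? 1 : 0);
    }
}
```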

I really think this is something worth investing time in. What do other people think?
