@FlorianHeigl
Last active June 29, 2017 21:14
Testing requirements in Ops environments, on top of "normal" tests
--------------------------------------
||| Test return to stable state
--------------------------------------
||| External Fault injection tests
--------------------------------------
||| System health
--------------------------------------
\========/
Below: \ /
"insufficient tests for OS code
(nice to have, no strong meaning)"
\/
--------------------------------------
|| System testing
--------------------------------------
|| Data tests / Fault injection tests
--------------------------------------
\======/
Below: \ /
"can't trust this to run safely"
\/
--------------------------------------
| Unit tests
--------------------------------------
| Coding standards
--------------------------------------
| "it compiles"
--------------------------------------
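One way to read the stack above is as a gate: the highest tier that is fully covered decides how much the code can be trusted. The following is a minimal sketch under that reading. The tier numbers (taken from the count of "|" bars), the function names, and the wording of the top verdict are assumptions for illustration, not part of the original notes.

```python
# Levels listed bottom-up, as in the diagram. The tier is the number of "|" bars.
LEVELS = [
    ("it compiles", 1),
    ("coding standards", 1),
    ("unit tests", 1),
    ("data tests / fault injection tests", 2),
    ("system testing", 2),
    ("system health", 3),
    ("external fault injection tests", 3),
    ("test return to stable state", 3),
]


def highest_complete_tier(passed):
    """Return the highest tier whose levels, and all tiers below it, passed."""
    complete = 0
    for tier in (1, 2, 3):
        names = [name for name, t in LEVELS if t == tier]
        if all(name in passed for name in names):
            complete = tier
        else:
            break  # a gap in a lower tier invalidates everything above it
    return complete


def verdict(passed):
    """Map the set of passed levels to the wording used at the diagram's markers."""
    tier = highest_complete_tier(passed)
    if tier < 2:
        return "can't trust this to run safely"
    if tier < 3:
        return "insufficient tests for OS code"
    # Hypothetical wording: the diagram does not name the top state.
    return "meets the Ops testing requirements"


if __name__ == "__main__":
    print(verdict({"it compiles", "coding standards", "unit tests"}))
```

Stopping at the first incomplete tier is a deliberate choice in this sketch: passing system health checks says little if the unit tests below them are missing.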
Design and error handling (fix, continue, abort)
How we act has to match the critical code paths and the managed objects.
Single transaction? Or not? Step by step?
=> do stop a SAN rescan on a single host if the first path doesn't come back
=> don't stop a core router reconfiguration on error if the rest of the config might fix the issue!
What's the desired outcome?
Is partially reaching the desired outcome possible?
=> don't stop starting VMs because the first one failed!
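The VM example can be sketched as a loop that collects failures instead of stopping at the first one, so a partial outcome is still reached and reported. `start_all` and the `start` callable are hypothetical names for illustration. Real code would call whatever the hypervisor tooling provides.

```python
def start_all(vms, start):
    """Try to start every VM and collect failures instead of aborting early."""
    started = []
    failed = {}
    for vm in vms:
        try:
            start(vm)
            started.append(vm)
        except Exception as exc:  # keep going: the other VMs may still start
            failed[vm] = exc
    return started, failed


if __name__ == "__main__":
    def fake_start(vm):
        # Stand-in for a real start call. Only the first VM fails here.
        if vm == "vm1":
            raise RuntimeError("no free memory on host")

    started, failed = start_all(["vm1", "vm2", "vm3"], fake_start)
    print("started:", started)
    print("failed:", {vm: str(exc) for vm, exc in failed.items()})
```

Returning both lists lets the caller decide afterwards whether the partial result is good enough or whether the failed VMs need a retry.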
On error while adding supporting elements:
abort what costs redundancy
retry what could gain redundancy
continue on error if it could limit the scope of an issue
continue on error if it could solve the issue
don't block on retrying a redundant component if its consumers are unavailable
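The rules above can be written down as a small decision function. This is only a sketch: the `FailedStep` fields, the `Action` names, the order in which the rules are checked, and the default of aborting when no rule applies are all assumptions, since the notes do not rank the rules against each other.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ABORT = "abort"
    RETRY = "retry"
    CONTINUE = "continue"


@dataclass
class FailedStep:
    """Hypothetical description of a step that just failed."""
    costs_redundancy: bool = False     # going on would reduce redundancy
    gains_redundancy: bool = False     # success would add redundancy
    may_limit_scope: bool = False      # going on could contain the issue
    may_solve_issue: bool = False      # going on could fix the issue
    consumers_available: bool = True   # something can actually use the component


def decide(step):
    """Pick an action for a failed step (rule order is an assumption)."""
    if step.costs_redundancy:
        return Action.ABORT
    if step.may_limit_scope or step.may_solve_issue:
        return Action.CONTINUE
    if step.gains_redundancy:
        # Don't block on retries when no consumer could use the component anyway.
        return Action.RETRY if step.consumers_available else Action.CONTINUE
    return Action.ABORT  # assumed safe default when no rule applies


if __name__ == "__main__":
    print(decide(FailedStep(gains_redundancy=True)))
    print(decide(FailedStep(gains_redundancy=True, consumers_available=False)))
```

Checking `costs_redundancy` first reflects the idea that losing redundancy is the most expensive mistake. A different environment might reasonably order the checks differently.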