Created
December 23, 2018 07:08
-
-
Save hanokhaloni/0add5853d98517a06f4a242851f755ec to your computer and use it in GitHub Desktop.
Resiliency checklist
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
1. Define your availability requirements, based on business needs. | |
2. Design the application for resiliency. Start with an architecture that follows proven practices, and then identify the possible failure points in that architecture. | |
3. Implement strategies to detect and recover from failures. | |
4. Test the implementation by simulating faults and triggering forced failovers. | |
5. Deploy the application into production using a reliable, repeatable process. | |
6. Monitor the application to detect failures. By monitoring the system, you can gauge the health of the application and respond to incidents if necessary. | |
7. Respond if there are failure that require manual interventions. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment