Scott Sanders
So we're all here at Monitorama, and it's awesome to see so many incredible people in one place focusing on such an important topic. I'd like to talk a bit about outage lifecycles and how the monitoring and alerting tools we're familiar with can be woven into processes that enable confidence.
When an incident occurs, we typically have an increased risk of an outage. How we structure our initial response, our decision-making process, and our communication directly affects the impact that this incident will have. We need to think critically about our ability to quickly resolve any problems and reduce the risk of future incidents.