Postmortems are a ritual for examining serious incidents and outages. Google’s book on Site Reliability Engineering defines the document at the center of that ritual:
“A postmortem is a written record of an incident, its impact, the actions taken to mitigate or resolve it, the root cause(s), and the follow-up actions to prevent the incident from recurring.”
We hold postmortems to ensure we understand and address the root causes of severe incidents such as outages, data loss, or serious production bugs.