| author | status | updated | version | tag | type | parent |
|---|---|---|---|---|---|---|
| Joseph OBrien | unpublished | 2025-12-23 | 1.0.1 | skill | reference | debugging |
Incident Postmortem: {{INCIDENT_TITLE}}
Incident ID: {{INC-XXXX}}
Date: {{YYYY-MM-DD}}
Duration: {{START_TIME}} - {{END_TIME}} ({{DURATION}})
Severity: {{SEV1|SEV2|SEV3|SEV4}}
Status: {{RESOLVED|MONITORING}}
Summary
{{ONE_PARAGRAPH_SUMMARY}}
Impact
| Metric | Value |
|---|---|
| Users Affected | {{N}} |
| Revenue Impact | ${{N}} |
| Requests Failed | {{N}} |
| Error Rate | {{N}}% |
| Downtime | {{DURATION}} |
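The Error Rate row can be derived from the Requests Failed figure when a total request count for the incident window is available. The sketch below assumes such a total exists; the function name `error_rate_percent` and the `total` parameter are hypothetical and not part of this template.

```python
def error_rate_percent(failed: int, total: int) -> float:
    """Return failed requests as a percentage of all requests in the window.

    `total` is an assumed input: the template itself only records the
    failed-request count, so the total must come from your own metrics.
    """
    if total <= 0:
        raise ValueError("total must be a positive request count")
    if failed < 0 or failed > total:
        raise ValueError("failed must be between 0 and total")
    # Two decimal places is an arbitrary choice for readability in the table.
    return round(100.0 * failed / total, 2)
```

For example, 50 failed requests out of 1,000 gives an Error Rate of 5.0%.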
Timeline
| Time (UTC) | Event |
|---|---|
| {{HH:MM}} | {{TRIGGER_EVENT}} |
| {{HH:MM}} | Alert fired: {{ALERT_NAME}} |
| {{HH:MM}} | On-call paged |
| {{HH:MM}} | Investigation started |
| {{HH:MM}} | Root cause identified |
| {{HH:MM}} | Mitigation applied |
| {{HH:MM}} | Service recovered |
| {{HH:MM}} | Incident closed |
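Once the timeline is filled in, the gaps between its rows give the usual response intervals (time to detect, time to mitigate, time to recover). The following is a minimal sketch of that arithmetic. It assumes `HH:MM` UTC timestamps as in the table above and an incident shorter than 24 hours; the dictionary keys and function names are hypothetical.

```python
from datetime import datetime, timedelta


def minutes_between(start: str, end: str) -> int:
    """Minutes from start to end, both 'HH:MM' in UTC.

    If end is earlier on the clock than start, the incident is assumed
    to have crossed midnight once (so it must be shorter than 24 hours).
    """
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        delta += timedelta(days=1)
    return int(delta.total_seconds() // 60)


def response_intervals(timeline: dict[str, str]) -> dict[str, int]:
    """Derive response intervals, in minutes, measured from the trigger event."""
    trigger = timeline["trigger"]
    return {
        "time_to_detect": minutes_between(trigger, timeline["alert_fired"]),
        "time_to_mitigate": minutes_between(trigger, timeline["mitigation_applied"]),
        "time_to_recover": minutes_between(trigger, timeline["service_recovered"]),
    }
```

Measuring every interval from the trigger event, rather than from the previous row, keeps the numbers comparable across incidents even when some timeline rows are missing.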
Root Cause
{{DETAILED_ROOT_CAUSE_ANALYSIS}}
Contributing Factors
- {{FACTOR_1}}
- {{FACTOR_2}}
- {{FACTOR_3}}
What Failed
- Detection: {{HOW_WAS_IT_DETECTED}}
- Prevention: {{WHY_WASNT_IT_PREVENTED}}
- Response: {{RESPONSE_GAPS}}
Resolution
Immediate Actions
- {{ACTION_1}}
- {{ACTION_2}}
Mitigation Steps
Verification
Lessons Learned
What Went Well
- {{POSITIVE_1}}
- {{POSITIVE_2}}
What Went Wrong
- {{NEGATIVE_1}}
- {{NEGATIVE_2}}
Where We Got Lucky
Action Items
| ID | Action | Owner | Priority | Due Date | Status |
|---|---|---|---|---|---|
| 1 | {{ACTION}} | {{OWNER}} | {{P1-4}} | {{DATE}} | {{STATUS}} |
| 2 | {{ACTION}} | {{OWNER}} | {{P1-4}} | {{DATE}} | {{STATUS}} |
| 3 | {{ACTION}} | {{OWNER}} | {{P1-4}} | {{DATE}} | {{STATUS}} |
Prevention
Detection
Response
Technical Details
Affected Systems
| System | Impact | Recovery |
|---|---|---|
| {{SYSTEM}} | {{DESCRIPTION}} | {{TIME}} |
Metrics During Incident
| Metric | Normal | During Incident | Peak |
|---|---|---|---|
| Latency (p99) | {{MS}} | {{MS}} | {{MS}} |
| Error Rate | {{N}}% | {{N}}% | {{N}}% |
| CPU Usage | {{N}}% | {{N}}% | {{N}}% |
| Memory | {{N}}GB | {{N}}GB | {{N}}GB |
Logs
Communication
Internal
| Time | Channel | Message |
|---|---|---|
| {{TIME}} | {{SLACK/EMAIL}} | {{SUMMARY}} |
External
| Time | Channel | Audience | Message |
|---|---|---|---|
| {{TIME}} | Status Page | Customers | {{MESSAGE}} |
Related Incidents
| ID | Date | Similarity |
|---|---|---|
| {{INC-XXXX}} | {{DATE}} | {{DESCRIPTION}} |
Appendix
A. Alert Configuration
B. Runbook Updates Needed
- {{RUNBOOK_UPDATE_1}}
- {{RUNBOOK_UPDATE_2}}
Quality Checklist
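One check that lends itself to automation before a postmortem is marked complete is confirming that no `{{PLACEHOLDER}}` tokens from this template remain. The sketch below is one possible way to do that; the function name `unfilled_placeholders` is hypothetical, and it assumes placeholders always use the double-brace form shown throughout this template.

```python
import re

# Matches any {{...}} token, including option lists such as {{SEV1|SEV2|SEV3|SEV4}}.
PLACEHOLDER = re.compile(r"\{\{[^{}]+\}\}")


def unfilled_placeholders(text: str) -> list[str]:
    """Return each distinct {{...}} token still present, in order of first appearance."""
    seen: list[str] = []
    for token in PLACEHOLDER.findall(text):
        if token not in seen:
            seen.append(token)
    return seen
```

An empty list means every placeholder has been replaced; anything else names the fields that still need attention.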