Outages fall into two types: short term and long term.
A short-term outage is an interruption or temporary loss of power or signal flow that a systems operator can rectify in a matter of minutes.
A long-term outage is a stoppage in the functioning of a machine or mechanism caused by a component failure, where one or more integral parts must be replaced manually to correct the problem.
In both instances, an After Action Report (AAR) should be filed to document the time the outage occurred, the time it was rectified, the nature of the problem, and the steps taken to correct it. AARs can then be used to track problems in the system so that steps can be taken to mitigate future occurrences.
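As a rough illustration of how filed AARs could support this kind of tracking, the sketch below computes the downtime for each report and counts how often each kind of problem recurs. The record layout, the helper name `downtime_minutes`, and all sample values are hypothetical and are not taken from any actual report.

```python
from collections import Counter
from datetime import datetime

# Hypothetical AAR entries: (time occurred, time rectified, nature of problem).
# The sample values are invented purely for illustration.
reports = [
    (datetime(2023, 1, 5, 10, 0), datetime(2023, 1, 5, 10, 12), "power loss"),
    (datetime(2023, 1, 9, 14, 30), datetime(2023, 1, 10, 9, 0), "component failure"),
    (datetime(2023, 2, 2, 8, 15), datetime(2023, 2, 2, 8, 20), "power loss"),
]


def downtime_minutes(occurred, rectified):
    """Return the outage duration in whole minutes."""
    return int((rectified - occurred).total_seconds() // 60)


# Duration of each outage, in the same order as the reports.
durations = [downtime_minutes(start, end) for start, end, _ in reports]

# How many times each kind of problem has been reported.
problem_counts = Counter(nature for _, _, nature in reports)

for (start, _, nature), minutes in zip(reports, durations):
    print(f"{start:%Y-%m-%d %H:%M}  {nature}: {minutes} min")
print("Most frequent problem:", problem_counts.most_common(1)[0][0])
```

A problem that keeps appearing in the counts is a natural candidate for mitigation work.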
The purpose of this AAR is to review the effectiveness of the monitoring program for the RHOVISION project. The AAR will identify areas where the monitoring program is performing well and areas where it could be improved.
An After Action Report is a process for reviewing and reflecting on an event or activity, and a valuable tool for improving performance. An AAR for monitoring can be used to review the effectiveness of a monitoring program and identify areas where the program can be improved.
The benefits of an AAR include:
*Improved performance
*Identification of areas for improvement
*Increased understanding of the monitoring program
These are some of the key fields in the AARs that the monitoring team files in the GitHub "mainnet outages" repository, along with their purpose:
TITLE - Issue or problem encountered during monitoring
DATE - The particular day and time
OUTAGE TYPE - Long-term or Short-term
CURRENT STATUS - Status as of the time the report is filed, e.g. Fixed
ACTION TAKEN - The action taken to fix the issue
SITE - The affected site, e.g. Mainnet
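The fields above could be modelled as a small record type. The following is a minimal sketch, not the repository's actual template: the class and attribute names are assumptions chosen to mirror the field labels, and the sample values are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class OutageType(Enum):
    """The two outage types described in this document."""
    SHORT_TERM = "Short-term"
    LONG_TERM = "Long-term"


@dataclass
class AfterActionReport:
    """Hypothetical container for one AAR; names mirror the field labels."""
    title: str
    date: datetime
    outage_type: OutageType
    current_status: str
    action_taken: str
    site: str

    def render(self) -> str:
        """Format the report as 'FIELD - value' lines."""
        return "\n".join([
            f"TITLE - {self.title}",
            f"DATE - {self.date:%Y-%m-%d %H:%M}",
            f"OUTAGE TYPE - {self.outage_type.value}",
            f"CURRENT STATUS - {self.current_status}",
            f"ACTION TAKEN - {self.action_taken}",
            f"SITE - {self.site}",
        ])


# Invented sample values, for illustration only.
report = AfterActionReport(
    title="Signal loss during monitoring",
    date=datetime(2023, 1, 5, 10, 0),
    outage_type=OutageType.SHORT_TERM,
    current_status="Fixed",
    action_taken="Operator restarted the affected service",
    site="Mainnet",
)
print(report.render())
```

Using an enumeration for the outage type limits the field to the two values defined earlier, so a misspelled type is caught when the report is created rather than when it is read.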