An alert is a specialized log entry, emitted by a software component in a computing system, that indicates a problem. Tools like JSM (Jira Service Management) and Opsgenie help us manage alerts and avert incidents. If an alert does turn into an incident, they help us mitigate it as well.
An on-call is an engineer who is supposed to Keep The Lights On. They stand watch in the lighthouse, looking out for alerts and preventing issues that might impact the customer. Our teams follow a very similar process.
Our alert process has been refined considerably over the years. We have closed down multiple alerts and reduced the occurrence of many others. We know the common alerts we see partly by intuition, but even more so through run-books.
The Allthethings Prioritization Matrix, also known as the "Eisenhower Matrix", is a tabular system for sorting tasks, issues, alerts, and anything else that carries information into an order of attack, based on the urgency and importance of each item.
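Based on that information, we can rate each item as Important or Not Important, and as Urgent or Not Urgent. Combining the two ratings gives us 4 kinds of tasks: Important and Urgent, Important and Not Urgent, Not Important and Urgent, and Not Important and Not Urgent.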
Source: Picture from WikiMedia Commons by Davidjcmorris
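To make the four categories concrete, here is a minimal Python sketch of how alerts could be sorted into the matrix. It is only an illustration: the `Alert` class, its fields, and the example alert names are hypothetical, and deciding whether an alert is important or urgent is still a human judgment call that the code simply takes as input.

```python
from dataclasses import dataclass
from enum import Enum


class Quadrant(Enum):
    """The four cells of the matrix."""
    IMPORTANT_URGENT = "Important and Urgent"
    IMPORTANT_NOT_URGENT = "Important and Not Urgent"
    NOT_IMPORTANT_URGENT = "Not Important and Urgent"
    NOT_IMPORTANT_NOT_URGENT = "Not Important and Not Urgent"


@dataclass
class Alert:
    # Hypothetical shape of an alert; real tools carry many more fields.
    name: str
    important: bool  # assumed to be decided by the on-call or a run-book
    urgent: bool


def classify(alert: Alert) -> Quadrant:
    """Map the two yes/no ratings of an alert to one quadrant."""
    if alert.important and alert.urgent:
        return Quadrant.IMPORTANT_URGENT
    if alert.important:
        return Quadrant.IMPORTANT_NOT_URGENT
    if alert.urgent:
        return Quadrant.NOT_IMPORTANT_URGENT
    return Quadrant.NOT_IMPORTANT_NOT_URGENT


def group_by_quadrant(alerts: list[Alert]) -> dict[Quadrant, list[str]]:
    """Bucket alert names by quadrant, keeping every quadrant in the result."""
    groups: dict[Quadrant, list[str]] = {q: [] for q in Quadrant}
    for alert in alerts:
        groups[classify(alert)].append(alert.name)
    return groups


if __name__ == "__main__":
    # Made-up alerts, purely for illustration.
    sample = [
        Alert("checkout-5xx-spike", important=True, urgent=True),
        Alert("cert-expires-in-30-days", important=True, urgent=False),
        Alert("noisy-staging-disk-warning", important=False, urgent=True),
        Alert("deprecated-metric-missing", important=False, urgent=False),
    ]
    for quadrant, names in group_by_quadrant(sample).items():
        print(f"{quadrant.value}: {names}")
```

The sketch only does the bucketing. What to do with each bucket is left to the questions that follow.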
Well, in case you encounter stuff in these categories, you gotta ask yourself the following questions:
Nipun Aggarwal