We have recently replaced Splunk On-Call (formerly known as VictorOps) with a new setup of OpsGenie.
I am a user/admin of OpsGenie and we have a really annoying problem with notification fatigue.
The scenario is as follows: we receive a repeated alert about a problem in the system, the message floods our Slack channel, and the person on call is bombarded with notifications.
We would like to have the following happen:
1. The person on call receives the alert (and a message is sent to the Slack channel)
2. They acknowledge the alert
3. Unless they close the alert, they do not receive the same notification again.
I have taken a look at Current Tier->Policies->Notification Policies, but I cannot see this feature.
Also, this is exactly how Splunk On-Call worked.
Does anyone have any ideas on how to fix this, or perhaps a workaround?
(btw, I have looked at the video/document called 'Three Ways to Reduce Alert Noise and Fatigue with Opsgenie')
The alias alert field is your best friend here. Make sure to set that to a value that uniquely identifies that series of alerts, or that problem. You'll get a notification for the first alert, but then each subsequent alert will just increase the counter on the first alert, without notifying. If you close the alert in Opsgenie and the problem is still happening, it will open a new alert, starting the counter again.
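To make the behaviour described above concrete, here is a toy Python model of alias-based deduplication. It is only a sketch of the idea, not Opsgenie's actual implementation; the `Alert` and `AlertStore` names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    alias: str
    message: str
    count: int = 1
    acknowledged: bool = False
    closed: bool = False


class AlertStore:
    """Toy model of alias-based deduplication (hypothetical, for illustration)."""

    def __init__(self):
        self.alerts = []
        self.notifications = 0  # how many times the on-call person was notified

    def _find_open(self, alias):
        # An acknowledged alert still counts as open; only closing ends it.
        for alert in self.alerts:
            if alert.alias == alias and not alert.closed:
                return alert
        return None

    def create(self, alias, message):
        existing = self._find_open(alias)
        if existing is not None:
            # Same alias, alert still open: bump the counter, send nothing.
            existing.count += 1
            return existing
        # No open alert with this alias: open a new one and notify once.
        alert = Alert(alias=alias, message=message)
        self.alerts.append(alert)
        self.notifications += 1
        return alert

    def acknowledge(self, alias):
        alert = self._find_open(alias)
        if alert is not None:
            alert.acknowledged = True

    def close(self, alias):
        alert = self._find_open(alias)
        if alert is not None:
            alert.closed = True
```

In this model, sending the same alias three times produces one notification and a count of 3; closing the alert and sending the alias again opens a second alert with a fresh counter.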
@Tom Russell This sounds like exactly what I need!!
Are there any instructions, documentation, blog, tutorial or youtube that covers this feature in-depth?
Thanks!
As a follow-up, this did exactly what I needed -- thank you very much for the advice!
I found that the Alias was coming in as a UID which was always different, so I created a policy under Settings->Global Policies that sets the Alias equal to the message.
It works now 👍
Here are a few images that others may find useful.
@colin.schofield, one thing to be careful of: we usually embed the date somewhere in the alias. That way, if the alert gets acknowledged and then forgotten about, a new one will open the next day. Otherwise, your problem could go on for weeks without opening a new alert.
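As a rough sketch of what that can look like when the alias is built on the sending side, assuming Python and a made-up helper name (`daily_alias`) and problem key:

```python
from datetime import date, datetime, timezone


def daily_alias(problem_key, today=None):
    """Build an alias that stays the same all day and changes at midnight UTC.

    `problem_key` is a hypothetical stable identifier for the problem,
    e.g. "disk-full-db01". `today` can be passed in for testing.
    """
    if today is None:
        today = datetime.now(timezone.utc).date()
    return f"{problem_key}-{today.isoformat()}"


def build_alert_body(message, problem_key, today=None):
    # `message` and `alias` are the fields discussed in this thread;
    # how the body is actually sent depends on your integration.
    return {"message": message, "alias": daily_alias(problem_key, today)}
```

For example, `daily_alias("disk-full-db01", date(2024, 1, 2))` gives `disk-full-db01-2024-01-02`, so repeats on the same day share one alert and the first occurrence on the next day opens a new one.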
@Tom Russell That is a great idea!
How do I specify the date -- the word date surrounded in braces, maybe?
{{ date }}
@colin.schofield We're setting it in the application sending the alert. If that's not an option for you, see if there's anything in the payload being sent to Opsgenie that will work. You can find the payload by going to Settings -> Logs and searching for "Processed incomingData".
If there's a specific field (date or otherwise) that will work, you can include it with {{_payload.fieldname}} (https://support.atlassian.com/opsgenie/docs/dynamic-fields-in-opsgenie-integrations/). If you can find something in the description to use, you can pull it out with string processing (https://support.atlassian.com/opsgenie/docs/string-processing-methods-in-opsgenie-integrations/)
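If it helps to prototype the extraction before configuring it in Opsgenie, the same idea can be tried locally in Python. This is only an illustration: the payload shape, the `host=` pattern, and the function name are assumptions, and Opsgenie's own string processing syntax is described in the docs linked above.

```python
import re


def alias_from_payload(payload, field="description", pattern=r"host=(\S+)"):
    """Pull a stable piece of text out of one payload field to use as an alias.

    Falls back to the whole field value when the pattern does not match.
    The field name and pattern here are hypothetical examples.
    """
    text = str(payload.get(field, ""))
    match = re.search(pattern, text)
    return match.group(1) if match else text


# Example with a made-up payload copied from the logs:
sample = {"description": "CPU high host=db01 at 12:03:55"}
alias = alias_from_payload(sample)
```

The point is to pick the part of the payload that stays constant across repeats of the same problem (here the host name) and leave out anything that changes on every alert, such as the timestamp.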