With the recent Cloudflare network problem, our sites went down. Of course, Opsgenie received alerts from several systems and kept beeping and showing notifications for all of them.
So the question is how to silence all notifications for X minutes when we already know about the problem or a service is flapping.
The alerts should still show up in the site/app, just without triggering notifications. And of course, after the X minutes, any new alert should send notifications again.
Thanks for the help
Thanks for reaching out. I think your best bet for this scenario would be a Maintenance policy, which allows you to enable or disable multiple policies/integrations with a single entry during a specified window.
If you would like the alerts created in your Opsgenie account but the notifications delayed/suppressed, you'll want this Maintenance window to enable a Notification policy. If you would like the alerts NOT to be created, you will want to disable an integration. Obviously, if an alert isn't created, no users will be notified.
Both the Notification and Maintenance policies can be set on the team level under the Policies tab. If you lean towards enabling/disabling a Notification policy, you will first need to create that policy.
Within the filters of the Notification policy, you can select ALL alerts to be delayed/suppressed, or get more granular with conditions that only look for Cloudflare alerts.
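For anyone who prefers to script this instead of clicking through the UI, here is a rough sketch of what the request body for such a Notification policy might look like. The endpoint URL and the field names (`filter`, `conditions`, `suppress`, and so on) are my assumptions about the Opsgenie Policy API, and `build_notification_policy` is a hypothetical helper, so please check everything against the current API reference before using it.

```python
import json

# Assumed Policy API endpoint; verify against the current Opsgenie API reference.
POLICY_URL = "https://api.opsgenie.com/v2/policies"


def build_notification_policy(name, keyword=None):
    """Build a request body for a policy that suppresses notifications.

    Hypothetical helper. With keyword=None the policy matches every alert;
    otherwise it only matches alerts whose message contains the keyword.
    """
    if keyword is None:
        alert_filter = {"type": "match-all"}
    else:
        alert_filter = {
            "type": "match-all-conditions",
            "conditions": [
                {
                    "field": "message",
                    "operation": "contains",
                    "expectedValue": keyword,
                }
            ],
        }
    return {
        "type": "notification",
        "name": name,
        # Created disabled, so a Maintenance policy can enable it for a window.
        "enabled": False,
        "filter": alert_filter,
        # Assumed field: alerts are still created, notifications are not sent.
        "suppress": True,
    }


if __name__ == "__main__":
    body = build_notification_policy("Suppress Cloudflare", "cloudflare")
    print(json.dumps(body, indent=2))
```

The policy is built with `enabled` set to `False` on purpose: the idea is that it stays dormant until a Maintenance window switches it on.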
Once this Notification policy is configured, you will now be able to search for and enable it during the Maintenance window. Within a Maintenance policy, specify the timeframe and the Notification policy.
If you'd rather go the other route and disable an integration to avoid any alert creation during this window, you'd simply adjust the filter of the Maintenance policy.
You can also select the + Add rule button in the Maintenance policy to enable or disable multiple policies/integrations.
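If you need to open such a window quickly during an incident, the same setup can probably be scripted. Below is a minimal sketch that builds a Maintenance request for the next X minutes. I am assuming the Maintenance API endpoint and body shape (`time`, `rules`, `entity`) shown here, the `GenieKey` authorization header, and that the key has enough permissions. `build_maintenance` and `post_json` are hypothetical helpers, and the IDs in the example are placeholders.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

# Assumed Maintenance API endpoint; verify against the current API reference.
MAINTENANCE_URL = "https://api.opsgenie.com/v1/maintenance"


def build_maintenance(description, minutes, rules, now=None):
    """Build a maintenance body covering the next `minutes` minutes.

    `rules` is a list of (entity_type, entity_id, state) tuples, for example
    ("policy", "<policy-id>", "enabled") or
    ("integration", "<integration-id>", "disabled").
    `now` must be timezone-aware when given; it defaults to the current UTC time.
    """
    start = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    end = start + timedelta(minutes=minutes)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return {
        "description": description,
        "time": {
            "type": "schedule",
            "startDate": start.strftime(fmt),
            "endDate": end.strftime(fmt),
        },
        "rules": [
            {"state": state, "entity": {"id": entity_id, "type": entity_type}}
            for entity_type, entity_id, state in rules
        ],
    }


def post_json(url, api_key, body):
    """Send `body` as JSON using Opsgenie's GenieKey header (not called here)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"GenieKey {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder ID: enable the suppression policy for the next 30 minutes.
    body = build_maintenance(
        "Known Cloudflare outage",
        30,
        [("policy", "<notification-policy-id>", "enabled")],
    )
    print(json.dumps(body, indent=2))
```

Passing several tuples in `rules` mirrors the + Add rule button: one window can enable a Notification policy and disable an integration at the same time.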
While it's not a very friendly way to do this, it does work!
It is not very friendly because it is too complex and requires multiple steps; most of our team do not know Opsgenie well enough to even know how this setting works. But worse, it requires a user with enough permissions to build and enable those settings, so at the very least it requires an escalation to the next level to create those emergency maintenance settings.
A better option would be a user-level "silence/auto-ack all for X minutes" action, next to "ack all" and "close all".
It would not be used normally, but it would be very useful for silencing an alert spike, giving the team time to work on the problem and not on the never-ending alert flood.
Maybe I need to create a feature request... (but how do I do that?) :)
Unfortunately not at this time, but we do have an open feature request for this that I have added you to. Our tickets are not public, so I'll reach out directly with any updates/developments on the ticket. And if you'd like to contact support for an update yourself, the ticket for reference is ALX-818; Add recurring time windows as an option for Maintenance.