Hi there,
we're currently using PagerDuty, but I've started to look over the fence at Opsgenie.
It looks feature-rich, but I can't find any resources on time-based alert grouping.
It's a very useful feature in PagerDuty that groups alerts arriving within a predefined time window into one incident, on the assumption that alerts triggered within the same window most likely originate from the same problem.
This works well for us since we've got quite a lot of monitoring going on in our Fibre Network, and time-based alert grouping reduces the noise during bigger events.
Is there a way to achieve the same outcome with Opsgenie?
Thanks,
Andreas
Hi Andreas,
Opsgenie does have alert correlation that acts in a similar fashion:
https://docs.opsgenie.com/v1.0/docs/correlate-alerts-with-incident-1
Essentially, if you have an ongoing service disruption (with your Fibre Network, for example), alerts can be associated with an incident while it is open, and so be grouped under the 'same problem' you mentioned.
Additionally, incident updates, the timeline, postmortems and reports can be created or exported to incorporate all the data tied to the service disruption:
https://docs.opsgenie.com/v1.0/docs/incident-timeline
https://docs.opsgenie.com/v1.0/docs/postmortems
https://docs.opsgenie.com/v1.0/docs/post-incident-analysis-report
I know that's a lot of information and documentation links, so please let us know if you have any follow-up questions.
Best,
Nick
Thanks for your reply, Nick.
I did see that there's an alert correlation feature, but I'm not sure it works in the same way as PagerDuty's time-based grouping feature.
In PagerDuty, no matching rules are involved when alerts are grouped into an incident; the only common denominator is that they arrive in the same time window and hence most likely originate from the same event.
Has anyone implemented this behavior in Opsgenie?
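For illustration, grouping purely by arrival time could be sketched in Python as below. This is only a sketch of the idea described above, not PagerDuty's or Opsgenie's actual implementation; the names `Incident` and `group_by_time_window`, the 300-second window, and the sample alerts are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    opened_at: float                       # timestamp of the first alert
    alerts: list = field(default_factory=list)


def group_by_time_window(alerts, window_seconds=300):
    """Group (timestamp, message) tuples into incidents by arrival time only.

    No matching rules are applied: an alert joins the latest incident if it
    arrives within `window_seconds` of that incident's first alert.
    """
    incidents = []
    for ts, message in sorted(alerts):
        current = incidents[-1] if incidents else None
        if current is not None and ts - current.opened_at <= window_seconds:
            current.alerts.append((ts, message))
        else:
            incidents.append(Incident(opened_at=ts, alerts=[(ts, message)]))
    return incidents


# Hypothetical alerts: (seconds since start, message)
alerts = [
    (0, "OSPF neighbour down"),
    (20, "BGP session dropped"),
    (45, "Interface down"),
    (900, "High CPU on core router"),
]
incidents = group_by_time_window(alerts, window_seconds=300)
```

With these sample values, the first three alerts fall within the same window and end up in one incident, while the late CPU alert opens a second one.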
I'm personally not familiar with PD's time-based grouping feature, but within an Opsgenie service's incident rule, multiple matching rules (and/or conditions) can be defined to associate alerts with one incident.
This does not have time-based functionality, since it all depends on an ongoing incident, so the filter does need to be defined to match the alerts in order to associate them.
Welcome! Please let us know if you run into any other questions or issues. And if you decide to move forward with trialing Opsgenie, we offer an in-app live chat where my team and I can help.
Hi Nick,
if I understand correctly, the Opsgenie feature groups multiple alerts into one incident, but we need to group various alerts into one notification so that our on-call staff are not flooded with notifications. For example, if we have a fibre break, various alerts are triggered (OSPF, BGP, interface, ...) that are obviously all caused by the same thing: the fibre break.
So we are looking to configure Opsgenie so that if more than, for example, 3 alerts are generated within 5 minutes, Opsgenie groups them into ONE SINGLE ALERT NOTIFICATION and notifies our engineer. The engineer can then open that one notification and see all the other alerts that were grouped.
Is this something that is currently built in, or are you planning to add it as a feature?
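As a rough illustration of the behaviour being asked for, the threshold rule could be sketched in Python as follows. This is a hypothetical, offline sketch and does not use or represent any Opsgenie API; `build_notifications`, `flush_batch`, the dictionary layout and the sample alerts are assumptions made for the example. A live system would also have to hold alerts back until the window closes before deciding whether to group them.

```python
def flush_batch(batch, threshold):
    """Turn one time window of alerts into notifications."""
    if len(batch) > threshold:
        # More than `threshold` alerts in the window: one grouped notification.
        return [{
            "summary": f"{len(batch)} alerts grouped",
            "alerts": [message for _, message in batch],
        }]
    # Otherwise notify for each alert individually.
    return [{"summary": message, "alerts": [message]} for _, message in batch]


def build_notifications(alerts, window_seconds=300, threshold=3):
    """Batch (timestamp, message) tuples by window, then apply the threshold."""
    notifications = []
    batch = []
    for ts, message in sorted(alerts):
        # Close the current batch once an alert falls outside its window.
        if batch and ts - batch[0][0] > window_seconds:
            notifications.extend(flush_batch(batch, threshold))
            batch = []
        batch.append((ts, message))
    if batch:
        notifications.extend(flush_batch(batch, threshold))
    return notifications


# Hypothetical fibre-break scenario: (seconds since start, message)
fibre_alerts = [
    (0, "Fibre link loss"),
    (10, "OSPF adjacency down"),
    (15, "BGP peer down"),
    (30, "Interface down"),
    (2000, "Disk usage warning"),
]
notifications = build_notifications(fibre_alerts, window_seconds=300, threshold=3)
```

Here the four alerts caused by the fibre break collapse into a single notification that lists them all, while the unrelated later alert is sent on its own.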