Is it possible to create an alert based on alerts?

I currently have a working setup in which Dynatrace runs a set of synthetic monitors that are tied into Opsgenie, which fires alerts to a Slack channel when the monitors fail.

However, because this relates to an IAM solution, we have a lot of different Dynatrace (DT) monitors for the different applications that we protect. At the moment, if ANY application is down (whether because of a problem or for scheduled maintenance), an alert is triggered. That is fine, but it means that on-call staff might get called out for one app's scheduled maintenance that nobody told us was happening.

Is it possible to create a hierarchical/composite sort of alert that says "if you get alerts from more than 2 different DT monitors, then create an alert"? The aim is to call the on-call staff only when multiple apps are having problems (i.e. a much more likely indication that the IAM infrastructure has a problem, rather than just a particular app).

Ideally we would leave the existing alerts in place but send them to Slack for information only, whereas the overarching alert would trigger a call to the on-call phone to alert the on-call person immediately (who could then see the Slack alerts for more detail on which apps are experiencing issues).

1 answer

Tom Russell
Oct 14, 2022

@Darren Sunley we're in a similar situation with one of our application clusters. We get alerted on a node-by-node basis, but want to escalate to a higher priority alert if multiple nodes are having problems. We're just starting to work on a solution, but the two strategies we plan to look at are:

  • Leverage the alias field and notification policies to suppress alerts unless several come in with the same alias within a set period of time.
  • Suppress the alerts for n minutes and have an OEC (Opsgenie Edge Connector) script running that:
    • watches for that alert and monitors for more to come in
    • closes the individual alerts and creates a new, higher-priority alert that links to the closed individual alerts
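To make the second idea a bit more concrete, here is a rough Python sketch of just the decision logic: count how many distinct monitors have alerted inside a time window and, if that exceeds a threshold, build one higher-priority alert. Everything in it is an assumption for illustration: the alert dictionaries, the field names (`monitor`, `created_at`), the 10-minute window and the function names are hypothetical, and it makes no Opsgenie or Dynatrace API calls. Fetching the open alerts, closing them and creating the new one would still have to be wired up separately.

```python
from datetime import datetime, timedelta

# Hypothetical values for illustration; these are not Opsgenie settings.
MONITOR_THRESHOLD = 2            # escalate when MORE than this many monitors alert
WINDOW = timedelta(minutes=10)   # how far back to look for related alerts


def monitors_to_escalate(alerts, now, threshold=MONITOR_THRESHOLD, window=WINDOW):
    """Return the sorted names of monitors that alerted within `window` of `now`
    if more than `threshold` distinct monitors did so, otherwise an empty list."""
    recent = {
        alert["monitor"]
        for alert in alerts
        if timedelta(0) <= now - alert["created_at"] <= window
    }
    return sorted(recent) if len(recent) > threshold else []


def build_composite_alert(monitors):
    """Describe the single higher-priority alert as a plain dict (assumed shape)."""
    return {
        "message": f"{len(monitors)} monitors failing: possible IAM infrastructure problem",
        "priority": "P1",
        "details": {"monitors": ", ".join(monitors)},
    }


if __name__ == "__main__":
    now = datetime(2022, 10, 14, 12, 0)
    # Made-up sample data: three different apps failing within ten minutes.
    sample = [
        {"monitor": "app-a", "created_at": now - timedelta(minutes=1)},
        {"monitor": "app-b", "created_at": now - timedelta(minutes=4)},
        {"monitor": "app-c", "created_at": now - timedelta(minutes=7)},
    ]
    failing = monitors_to_escalate(sample, now)
    if failing:
        print(build_composite_alert(failing))
```

Counting distinct monitor names rather than raw alerts means repeated failures from one app in maintenance never reach the threshold on their own, which is the behaviour the question asks for.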

I'm not sure which we'll go with, and I don't know what level of Opsgenie you're running (or what it supports), but those are a couple of ideas.

Hi Tom - thanks for that!

 

I've got a support ticket open too and they've suggested Deduplication and Notification Policies, which sounds like your first suggestion (relating to the alias value).

 

I'm busy trying to work that through with them, but hopefully it'll get me there. If I figure it out I'll come back and give you an update!
