
Direct PRTG alerts to Opsgenie team based on tags?

Nathan Ell April 26, 2023

I'll preface this by saying that I'm struggling to understand how PRTG is meant to be integrated into Opsgenie.

I have PRTG notifications going to Opsgenie. I want to have Opsgenie escalation work as follows:

  • If the notification has a tag of Tier1, and...
    • ... has any tag in a list (DBA, DatabaseServices, etc) then send to the DBA escalation
    • ... has any tag in a list (Application, Software) then send to the Application escalation
    • ... has any tag in a list (Networking, Servers) then send to the Infrastructure escalation
  • If the notification has a tag of Tier2, then send to the Daytime Coverage escalation
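For reference, the desired routing can be written down as a small decision function. This is only a sketch of the rules listed above, not anything PRTG or Opsgenie provides: the function name `pick_escalation` and the `TEAM_TAGS` table are hypothetical, and the tag lists are just the examples from the question.

```python
from typing import Optional, Set

# Hypothetical tag groups, copied from the examples in the question.
TEAM_TAGS = {
    "DBA": {"DBA", "DatabaseServices"},
    "Application": {"Application", "Software"},
    "Infrastructure": {"Networking", "Servers"},
}


def pick_escalation(tags: Set[str]) -> Optional[str]:
    """Return the escalation for a set of notification tags, or None if no rule applies."""
    if "Tier1" in tags:
        # First team whose tag list overlaps the notification's tags wins.
        for team, team_tags in TEAM_TAGS.items():
            if tags & team_tags:
                return f"{team} escalation"
        return None
    if "Tier2" in tags:
        return "Daytime Coverage escalation"
    return None


print(pick_escalation({"Tier1", "DatabaseServices"}))
print(pick_escalation({"Tier2", "Servers"}))
```

If a Tier1 notification carries tags from more than one team, this sketch picks the first match in dictionary order; the question does not say what should happen in that case.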

 

How do I implement a PRTG integration into Opsgenie with a complex set of tag-based routing rules?

2 answers

Answer accepted
Nathan Ell May 3, 2023

I think I've figured out a pattern that works.

Use Case

Our DBA, Infrastructure, and Application teams provide after-hours on-call services. Each team needs to be notified when a service it is responsible for goes down, but not when another team's services go down. Our monitoring solution is configured with "tiered alerts": "Tier1" alerts indicate client-impacting downtime, "Tier2" alerts indicate issues that are not yet client-impacting but could soon become so, and "Tier3" alerts indicate service issues that do not impact clients but are important to the company. After-hours on-call rotations respond only to Tier1 (immediately client-impacting) alerts.

During business hours, a separate Daytime Coverage team responds to Tier1, Tier2, and Tier3 alerts to ensure the company and clients are well-served. The Daytime Coverage team is not domain-specific and responds to all DBA, Infrastructure, and Application alerts.

PRTG

  1. Configure sensors with appropriate tags, for example DBA, Infrastructure, and Application.
  2. Configure three libraries: Tier1, Tier2, Tier3. Add sensors to the appropriate libraries.
  3. Add five notification templates:
    1. DBA Tier1, with tags listed for sensors that need to go to the DBA team.
    2. Infrastructure Tier1, with tags listed for sensors that need to go to the Infrastructure team.
    3. Application Tier1, with tags listed for sensors that need to go to the Application team.
    4. Tier2, with no tags listed.
    5. Tier3, with no tags listed.
  4. Modify the notification triggers for the three libraries created in step 2:
    1. Tier1 - add six state triggers (one per state and notification template):
      1. State trigger Up
        1. Notification template DBA Tier1
        2. Notification template Application Tier1
        3. Notification template Infrastructure Tier1
      2. State trigger Down
        1. Notification template DBA Tier1
        2. Notification template Application Tier1
        3. Notification template Infrastructure Tier1
    2. Tier2 - add two state triggers:
      1. State trigger Down - notification template Tier2
      2. State trigger Up - notification template Tier2
    3. Tier3 - add two state triggers:
      1. State trigger Down - notification template Tier3
      2. State trigger Up - notification template Tier3
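The trigger layout above can be checked by modelling it as plain data. The sketch below is an illustration only and does not call PRTG: `LIBRARY_TRIGGERS`, `TEMPLATE_TAG_FILTER`, and `templates_fired` are hypothetical names, and it assumes that a notification template with tags listed only fires for sensors carrying one of those tags, while a template with no tags fires for every sensor.

```python
from itertools import product
from typing import List, Set

STATES = ["Up", "Down"]
TIER1_TEMPLATES = ["DBA Tier1", "Application Tier1", "Infrastructure Tier1"]

# Library name -> list of (state, notification template) triggers.
LIBRARY_TRIGGERS = {
    "Tier1": list(product(STATES, TIER1_TEMPLATES)),
    "Tier2": [(state, "Tier2") for state in STATES],
    "Tier3": [(state, "Tier3") for state in STATES],
}

# Assumed tag filter per template; an empty set means "no tags listed".
TEMPLATE_TAG_FILTER = {
    "DBA Tier1": {"DBA"},
    "Application Tier1": {"Application"},
    "Infrastructure Tier1": {"Infrastructure"},
    "Tier2": set(),
    "Tier3": set(),
}


def templates_fired(library: str, state: str, sensor_tags: Set[str]) -> List[str]:
    """List the templates that would fire for a sensor in a library on a state change."""
    fired = []
    for trigger_state, template in LIBRARY_TRIGGERS[library]:
        if trigger_state != state:
            continue
        wanted = TEMPLATE_TAG_FILTER[template]
        if not wanted or wanted & sensor_tags:
            fired.append(template)
    return fired


for library, triggers in LIBRARY_TRIGGERS.items():
    print(f"{library}: {len(triggers)} state triggers")
print(templates_fired("Tier1", "Down", {"DBA"}))
```

Under these assumptions, a Tier1 sensor tagged DBA reaches only the DBA Tier1 template even though all six triggers are attached to the library.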

Opsgenie

  1. To power the five notification templates created in PRTG, create five PRTG integrations in Opsgenie:
    1. DBA Tier1
      1. Alert fields: set Priority to P1-Critical
      2. Tags: DBA
    2. Application Tier1
      1. Alert fields: set Priority to P1-Critical
      2. Tags: Application
    3. Infrastructure Tier1
      1. Alert fields: set Priority to P1-Critical
      2. Tags: Infrastructure
    4. Tier2
      1. Alert fields: set Priority to P2-High
    5. Tier3
      1. Alert fields: set Priority to P3-Moderate
  2. Create four on-call schedules with rotations as desired:
    1. DBA On-Call, containing the DBA team, restricted to after-hours.
    2. Application On-Call, containing the Application team, restricted to after-hours.
    3. Infrastructure On-Call, containing the Infrastructure team, restricted to after-hours.
    4. Daytime Coverage, containing the Daytime Coverage team, restricted to business hours.
  3. Create four escalation policies, each with notify every x minutes and notify other teams/admins/etc configured to meet your business needs:
    1. DBA On-Call Escalation: notify the DBA On-Call Schedule
    2. Application On-Call Escalation: notify the Application On-Call Schedule
    3. Infrastructure On-Call Escalation: notify the Infrastructure On-Call Schedule
    4. Daytime Coverage Escalation: notify the Daytime Coverage Schedule
  4. Create four alert policies:
    1. DBA On-Call
      1. Match all the following conditions:
        1. Priority equals P1-Critical
        2. Tags matches(regex) DBA
      2. Responders: DBA On-Call Escalation
    2. Application On-Call
      1. Match all the following conditions:
        1. Priority equals P1-Critical
        2. Tags matches(regex) Application
      2. Responders: Application On-Call Escalation
    3. Infrastructure On-Call
      1. Match all the following conditions:
        1. Priority equals P1-Critical
        2. Tags matches(regex) Infrastructure
      2. Responders: Infrastructure On-Call Escalation
    4. Daytime Coverage
      1. Match any of the following conditions:
        1. Priority equals P1-Critical
        2. Priority equals P2-High
        3. Priority equals P3-Moderate
      2. Responders: Daytime Coverage Escalation 
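To sanity-check the four alert policies, their conditions can be simulated locally. This is a minimal model, not the Opsgenie API: the `Alert` and `Policy` classes and the helper functions are hypothetical, "Tags matches(regex)" is modelled as a regex search against each tag, and every matching policy is assumed to add its responder (how Opsgenie combines several matching policies depends on your policy settings).

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Alert:
    priority: str
    tags: List[str]


Condition = Callable[[Alert], bool]


def priority_is(value: str) -> Condition:
    return lambda alert: alert.priority == value


def tags_match(pattern: str) -> Condition:
    # Assumption: a regex condition on tags passes if any single tag matches.
    return lambda alert: any(re.search(pattern, tag) for tag in alert.tags)


@dataclass
class Policy:
    name: str
    conditions: List[Condition]
    match_all: bool  # True = "match all conditions", False = "match any"
    responder: str

    def matches(self, alert: Alert) -> bool:
        results = [condition(alert) for condition in self.conditions]
        return all(results) if self.match_all else any(results)


POLICIES = [
    Policy(f"{team} On-Call",
           [priority_is("P1-Critical"), tags_match(team)],
           True,
           f"{team} On-Call Escalation")
    for team in ("DBA", "Application", "Infrastructure")
]
POLICIES.append(
    Policy("Daytime Coverage",
           [priority_is(p) for p in ("P1-Critical", "P2-High", "P3-Moderate")],
           False,
           "Daytime Coverage Escalation")
)


def responders_for(alert: Alert) -> List[str]:
    """Collect the responder of every policy whose conditions match the alert."""
    return [policy.responder for policy in POLICIES if policy.matches(alert)]


print(responders_for(Alert("P1-Critical", ["DBA"])))
print(responders_for(Alert("P2-High", [])))
```

In this model a P1-Critical alert tagged DBA matches both the DBA On-Call and Daytime Coverage policies, which fits the design above: the time restrictions on the schedules then decide who is actually notified.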

Summary

Because PRTG tags are not easy to use in Opsgenie when creating routing rules, we apply the tag logic inside PRTG instead. Libraries in PRTG group sensors by tier, so that the sensors in each tier notify a specific integration in Opsgenie. That integration sets an appropriate priority on the Opsgenie alert and, for Tier1, adds a team tag to it. Alert policies, rather than routing rules, then set the appropriate responder based on the alert's priority and tags. The Opsgenie schedules are restricted to specific hours of the day so that on-call notifications are received only after hours, and Daytime Coverage notifications only during business hours.
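The time restriction on the schedules is what keeps the on-call and daytime notifications apart. As a rough illustration, the sketch below decides which schedules would be active at a given moment. The business hours used here (Monday to Friday, 08:00 to 17:00) are an assumption, since the actual hours are not stated, and `is_business_hours` and `active_schedules` are hypothetical helpers rather than Opsgenie features.

```python
from datetime import datetime, time
from typing import List

# Assumed business hours; adjust to match the restrictions on your schedules.
BUSINESS_START = time(8, 0)
BUSINESS_END = time(17, 0)


def is_business_hours(moment: datetime) -> bool:
    """True on Monday through Friday between the assumed start and end times."""
    is_weekday = moment.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and BUSINESS_START <= moment.time() < BUSINESS_END


def active_schedules(moment: datetime) -> List[str]:
    """Return the schedules whose time restriction covers the given moment."""
    if is_business_hours(moment):
        return ["Daytime Coverage"]
    return ["DBA On-Call", "Application On-Call", "Infrastructure On-Call"]


print(active_schedules(datetime(2023, 5, 3, 10, 0)))
print(active_schedules(datetime(2023, 5, 3, 22, 0)))
```

Time zones are ignored here; in practice each schedule's restriction is evaluated in the time zone configured for that schedule.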

Thanks to @John M for inspiring the use of alert policies!

John M
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
May 3, 2023

Thanks for posting the detailed solution, @Nathan Ell! This is very helpful for other community members who may have a similar question.

John M
April 27, 2023

Hi @Nathan Ell ,

If the tags are being added at the time of alert creation via the advanced integration settings, you can create 4 different Global policies to satisfy your requirements. Here is what the first one would look like:

[Screenshot: 2023-04-27_17-26-22.png]

If you want to route based on fields that are in the PRTG payload, you will need to choose from the list of available options in the drop-down in the advanced settings. However, "tags" is not one of those options.

Nathan Ell May 1, 2023

Thanks for the info. Unfortunately, the tags are coming from PRTG, and this seems to be the source of the struggle: somehow I need to get tags from the PRTG payload into Opsgenie so I can start making decisions in Opsgenie on how to route the alert.
