SCOM alerts going from Warning to Critical don't get into OpsGenie

Michael Stefansen September 19, 2021

Hi Community

I'm new to OpsGenie and have been handed the responsibility of integrating it with our System Center Operations Manager (SCOM) monitoring tool.

I've set up all alerts to be sent to OpsGenie, but I only notify my on-call rotation about SCOM alerts with Severity "Critical".

This means that alerts of severity "Warning" should be sent to OpsGenie but should not go out to the on-call rotation, for example a disk hitting a SCOM threshold that we have defined as a warning threshold. But when that alert reaches severity "Critical" because no one reacted to the warning, I need OpsGenie to send it to our on-call rotation.

I would expect SCOM to close the warning alert once the severity has changed to Critical and then send a new alert, with a new ID, to OpsGenie. But that doesn't happen today, and I'm struggling to figure out why. Has anyone here experienced the same?

Thanks in advance.

2 answers

1 accepted

0 votes
Answer accepted
martin.simonsen
I'm New Here
September 22, 2021

Hi, I'm just adding to this.

SCOM changes the state of the existing alert from Warning to Critical; it does not change the initial event ID. How should this be handled in OpsGenie?

Skyler Ataide
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
October 14, 2021

Hi Martin, 

 

It sounds like you will need to adjust the filter(s) in your SCOM Create Alert actions to match the desired data received in Opsgenie from your SCOM Integration. If the payload received from SCOM matches a Create Alert action, and Opsgenie notices that there is already an open alert with the same alias, then that alert will be de-duplicated in order to prevent alert notification fatigue. Here you can find our documentation that covers Advanced integration settings: https://docs.opsgenie.com/docs/advanced-integration-settings and here our documentation covers more details on configuring your action filters: https://docs.opsgenie.com/docs/filters
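
To make the de-duplication behaviour described above a bit more concrete, here is a minimal Python sketch. It is only a toy model, not Opsgenie's actual implementation: the `Alert` class and the `create_alert` function are hypothetical names, and the only rule it encodes is "an incoming create request whose alias matches an open alert does not produce a new alert".

```python
from dataclasses import dataclass


@dataclass
class Alert:
    alias: str
    message: str
    count: int = 1  # how many matching create requests have arrived


def create_alert(open_alerts: dict, alias: str, message: str) -> bool:
    """Return True if a new alert was created, False if it was de-duplicated."""
    existing = open_alerts.get(alias)
    if existing is not None:
        # Same alias as an open alert: only the counter goes up.
        existing.count += 1
        return False
    open_alerts[alias] = Alert(alias=alias, message=message)
    return True


open_alerts = {}
# SCOM keeps the same alert ID while the monitor moves from Warning to Critical.
scom_alert_id = "{2ba87d56-a7af-4b42-bdcc-eb18486bd8cd}"
created_on_warning = create_alert(open_alerts, scom_alert_id, "Disk space low (Warning)")
created_on_critical = create_alert(open_alerts, scom_alert_id, "Disk space low (Critical)")
print(created_on_warning, created_on_critical, open_alerts[scom_alert_id].count)
```

In this model the second request never becomes its own alert, which would match the symptom of the Critical state not reaching the on-call rotation.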

 

Best regards,

Skyler

Michael Stefansen October 14, 2021

hi Skyler,

We solved the issue by removing the "AlertID" alias on the alerts that can have multiple alert states, for example "Logical Disk Space is low".

By doing that we get a new alert in Opsgenie every time the Logical Disk alert changes state, and that is what we want.

The downside is that the alerts won't close automatically, because OpsGenie no longer has a unique identifier to close the alert by when it changes state. So we add another manual step to the alert handling, but we can live with that for now :-)

Thanks again for all your input, it guided us in the right direction :-)

1 vote
Skyler Ataide
Atlassian Team
September 21, 2021

Hi there Michael! 

 

I'll be happy to see if I am able to provide assistance with this issue! The SCOM>>Opsgenie integration works in a way such that when an alert is created on SCOM, an alert is created in Opsgenie automatically through the integration. If SCOM does in fact send a new alert with a new ID/Alias to Opsgenie when the severity gets changed from Warning to Critical, then you would be able to use routing rules to send these new alerts with Severity Critical to the appropriate On-Call Team. 

 

However, if I understand correctly, this is where you are having trouble since new alerts are not being created when the severity changes in SCOM? An initial suggestion that I have here would be to see if you are able to set up separate Create Alert actions that filter for different severity level alerts that come into Opsgenie from your SCOM integration. For example, the first Create Alert action in Opsgenie can filter to match all conditions where Severity equals Warning, and then you can set those alerts to have a lower priority of P3/P4. Then, the second Create Alert action can filter to match all conditions where Severity equals Critical, and set those alerts to a higher priority (P1/P2), which in turn can be routed to your On-Call Team based on the higher priority. This all can be configured from the Advanced Settings page for your SCOM integration. 
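
As a rough illustration of the idea above (one Create Alert action per severity, each assigning its own priority), here is a small Python sketch. The rule table and the `pick_priority` function are hypothetical and only mimic "first matching filter wins"; the severity strings are placeholders and have to match whatever value actually arrives in the payload.

```python
from typing import Optional

# Hypothetical rule table: (severity value to match, priority to assign).
RULES = [
    ("Critical", "P1"),
    ("Warning", "P3"),
]


def pick_priority(payload: dict, rules=RULES) -> Optional[str]:
    """Return the priority of the first rule whose severity matches, else None."""
    for severity, priority in rules:
        if payload.get("severity") == severity:
            return priority
    # No matching action: in this model no alert would be created at all.
    return None


print(pick_priority({"severity": "Warning"}))
print(pick_priority({"severity": "Critical"}))
print(pick_priority({"severity": "SomethingElse"}))
```

The last call shows why the exact value matters: a payload whose severity matches none of the filters falls through without creating an alert.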

 

Hope this helps!

 

Best,

Skyler

Michael Stefansen September 22, 2021

Hi Skyler

Thanks a lot for your willingness to try to help out here, it's highly appreciated :-)

 

I've made a test environment and connected it to our SCOM test environment, where I would like to test the use of OpsGenie prioritization. I suppose I'll need to send everything from SCOM (Informational, Warning & Critical alerts), then create three "Create Alert" actions covering the three severities being sent over from SCOM, as in the screenshot below:

OpsGenie-Create-Alert.PNG

Then for each of these I have the opportunity to set the required OpsGenie priority (P1, P2, P3, P4 or P5)

OpsGenie-Create-Alert2.PNG

This one is for the Critical alerts, so I would assume that all SCOM alerts of severity Critical will get the priority "P3 - Moderate" when they are moved to OpsGenie.

To test this I just set my routing to send out everything, to verify that critical alerts have the P3 priority, but I never get any critical alerts :-(

I do get some Warning alerts though, which are prioritized as P4 as set under "Warning alerts", and that part is working just fine, but it doesn't cover all warnings from SCOM :-(

Can you guide me a bit more, there is definitely something which isn't configured correctly :-)

Thanks in advance for your time.

Skyler Ataide
Atlassian Team
September 22, 2021

Hi Michael,

 

Happy to help, and thank you for providing the additional information/screenshots! The three Create Alert actions that you have set up to match different priority levels in Opsgenie look good based on the provided screenshots. In order to investigate why no critical alerts and only some warning alerts are coming into Opsgenie, you can check the logs under the Settings>>Debug>>Logs page in Opsgenie. You can check the Logs to ensure that Opsgenie is receiving the integration requests from SCOM, and to make sure matching actions are found to create the corresponding alerts in Opsgenie. 

 

From the Logs page in Opsgenie, you will want to identify the Received Integration Request from SCOM, and then expand the incoming data to get a full view of the payload that is being sent from SCOM to Opsgenie. Here is the sample payload that Opsgenie recognizes officially from SCOM:

{
  "owner": "np",
  "lastModified": "12/24/2015 11:47:16 AM",
  "resolutionState": "New",
  "timeRaised": "12/24/2015 11:47:16 AM",
  "resolutionStateLastModified": "np",
  "workflowId": "{7eba60fd-b179-69a7-3897-47b6753601f2}",
  "category": "Custom",
  "alertId": "{2ba87d56-a7af-4b42-bdcc-eb18486bd8cd}",
  "alertName": "Alert for event 999",
  "priority": "1",
  "severity": "2",
  "createdByMonitor": "false",
  "repeatCount": "0",
  "alertDescription": "np",
  "managedEntitySource": "WIN-RQTU8UB5TU5.opsgeniescom.com"
}

 

Based on the sample payload that Opsgenie expects to receive through your SCOM integration, my prior response suggestion to match all conditions where Severity equals Critical/Warning/Informational may need to be changed to equal 2, 1, or 0, since these are the values that Opsgenie expects to receive in the payload. These numbers correspond to the different SCOM Severities; where 2 = Critical, 1 = Warning, and 0 = Informational. Additionally, based on the conditions of your Create Alert action, in the integration payload there will need to be "resolutionState": "New" so the integration can be processed.
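
If it helps when comparing what shows up in the Logs page against your filters, here is a small Python sketch that reads a trimmed copy of the sample payload above and translates the numeric severity using the 2/1/0 mapping. The `describe` function and its output keys are hypothetical names. Unknown severity values are passed through unchanged, on the assumption that your SCOM setup may send something other than these three codes.

```python
import json

# Mapping described above: 2 = Critical, 1 = Warning, 0 = Informational.
SEVERITY_NAMES = {"0": "Informational", "1": "Warning", "2": "Critical"}

# Trimmed copy of the sample payload; only the fields used here are kept.
SAMPLE = """
{
  "resolutionState": "New",
  "alertId": "{2ba87d56-a7af-4b42-bdcc-eb18486bd8cd}",
  "alertName": "Alert for event 999",
  "severity": "2"
}
"""


def describe(payload_text: str) -> dict:
    """Summarise the fields that the Create Alert filters depend on."""
    payload = json.loads(payload_text)
    severity = payload["severity"]
    return {
        "alert_id": payload["alertId"],
        # Fall back to the raw value if it is not one of the known codes.
        "severity": SEVERITY_NAMES.get(severity, severity),
        "is_new": payload["resolutionState"] == "New",
    }


summary = describe(SAMPLE)
print(summary["severity"], summary["is_new"])
```

Running the same kind of check against a real payload copied from the logs should show which severity value your filters actually need to match.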

 

If you are still unable to identify why these alerts are not being created in Opsgenie after troubleshooting the logs for your SCOM integration, may I ask if you are able to raise a support ticket with Opsgenie support using this link here: https://support.atlassian.com/contact/#/ 

 

If this request is raised as a support ticket, you may be asked to grant temporary access to your Opsgenie instance to our support team, so that we can take a closer look into this issue to help make sure that you get this integration all set up correctly. :)

 

Best,

Skyler

Michael Stefansen September 22, 2021

Hi Skyler

Thanks for pointing me in the right direction again :-)

The issue was severities and what actual severity OpsGenie expects. I thought SCOM was sending Critical as the severity, but it doesn't, it sends "Error" for critical alerts. Changing my "Critical alerts" to Severity equals "Error" made it all work like a charm :-)

Again, a huge thank you for your guidance and suggestions, my issue has been solved - YAY :-)

Have a great day.

Skyler Ataide
Atlassian Team
September 23, 2021

You're welcome, and thank you for providing this update! Great to hear that the integration is now working appropriately!

 

Best regards,

Skyler

Michael Stefansen October 4, 2021

Hi Skyler

It seems that the above integration only works when the state of the SCOM monitor goes from Healthy to Error.

When it hits "Warning" first, which is the most natural thing, no alert is generated in OpsGenie when it later changes from "Warning" to "Error" :-( This is not very promising, as we seem to miss quite a few alerts because of this behavior :-(

A possible solution could be to close all "Warning" alerts, when they are sent from SCOM, but that prevents us from actually routing these to other groups.

Is there any way to get OpsGenie to create a new Alert ID for Error alerts, even though they actually come from the same alert in SCOM?

Skyler Ataide
Atlassian Team
October 14, 2021

Hi again,

 

Apologies for the delay, as I must have missed the initial notification for this response. By default, the SCOM integration is set up to create Opsgenie alerts with an Alias that is equivalent to the {{alertId}} in SCOM. From the advanced settings page for your SCOM integration, you can try configuring a unique alias for each different Create Alert action that you have set up. This way, when alerts are created based on the three different levels of severity, these alerts will be created as new alerts and not as de-duplicated alerts that share the same alias as pre-existing open alerts. Please let me know if this makes sense. If you are still running into trouble setting up your SCOM integration to handle different severity alerts, please feel free to open a support ticket so we can take a closer look at the payload that Opsgenie is receiving through your current SCOM integration.
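
One way to picture a unique alias per severity is an alias built from the SCOM alert ID plus the severity. The Python sketch below is only a model of that idea: `build_alias` and `process` are hypothetical names, the alias format is an assumption rather than a documented Opsgenie setting, and it does not cover how close requests from SCOM would be matched afterwards.

```python
def build_alias(payload: dict) -> str:
    """Combine the SCOM alert ID with the severity so each severity gets its own alias."""
    return f"{payload['alertId']}-{payload['severity']}"


def process(payloads: list) -> list:
    """Return the aliases of alerts that would be newly created (not de-duplicated)."""
    open_aliases = set()
    created = []
    for payload in payloads:
        alias = build_alias(payload)
        if alias not in open_aliases:
            open_aliases.add(alias)
            created.append(alias)
    return created


alert_id = "{2ba87d56-a7af-4b42-bdcc-eb18486bd8cd}"
created = process([
    {"alertId": alert_id, "severity": "Warning"},
    {"alertId": alert_id, "severity": "Error"},
    {"alertId": alert_id, "severity": "Error"},  # repeat of the same state
])
print(len(created))
```

With this alias the Warning and Error states of the same SCOM alert become two separate alerts, while a repeat of the same state is still de-duplicated.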

 

Thanks and best! 

Skyler

Michael Stefansen October 14, 2021

No apologies needed Skyler, you already made a huge contribution to the solution.

As I replied below, in response to your answer to Martin, removing the "AlertID" alias created the alerts that we need.

I'll be looking into whether I can set my own unique identifier, to get the auto-close mechanism back ;-)

Thanks again for your assistance :-)

Skyler Ataide
Atlassian Team
October 15, 2021

Thank you very much for sharing the details of how you were able to resolve this issue! :) I am happy to hear that you were able to adjust the integration settings to better handle incoming alerts from SCOM that can have multiple alert states. The default SCOM integration settings are not the best suited for these types of alerts, so adjusting the default Alias for alerts created in Opsgenie does seem to be the best configuration at this time.

 

Cheers,

Skyler
