Using the heartbeat feature for a complex use case

Hello Community,

I have a pretty special use case which I'd love to use OpsGenie for.

Two distinct systems (hosted in-house with us) should send heartbeats to OpsGenie (using the plain built-in heartbeat functionality).

If one system stops sending, nothing should happen (though a short email would be okay). If the second system also stops sending, an incident should be raised and the whole team alerted (other techniques within OpsGenie would also be fine, if they suit better).

So far I have not been able to get this working: each of the two heartbeats may fail on its own, but an incident should be generated only if BOTH have failed.

Is this doable at all?

Thanks a lot!

Best regards,
Birgit

1 answer

Hi @bschmi !

Not sure I totally understand the use-case, but there isn't a way to have 2 heartbeats be dependent or related to each other.

The way heartbeats work in Opsgenie is that you add a heartbeat and specify an interval for it. That interval determines how often the heartbeat should receive a ping (either via email or via a Ping Heartbeat API request). If that interval goes by with no ping, the heartbeat expires and creates an alert.

For example, if you created a heartbeat with an interval of 10 minutes, then that heartbeat is expecting a ping once every 10 minutes. If 10 minutes go by with no ping, the heartbeat will expire and create an alert.
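To make the ping side concrete, here is a minimal Python sketch of how each in-house system might send its ping through the Ping Heartbeat API. It assumes the documented `/v2/heartbeats/{name}/ping` endpoint with `GenieKey` authorization; the heartbeat name and API key shown in the usage note are placeholders, and the helper names `build_ping_request` and `ping` are made up for illustration.

```python
import urllib.parse
import urllib.request

# Assumed base URL of the Opsgenie REST API; adjust it for your account region.
OPSGENIE_API = "https://api.opsgenie.com"


def build_ping_request(heartbeat_name: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a Ping Heartbeat request for one heartbeat."""
    # Percent-encode the name so spaces or slashes cannot break the URL path.
    name = urllib.parse.quote(heartbeat_name, safe="")
    url = f"{OPSGENIE_API}/v2/heartbeats/{name}/ping"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"GenieKey {api_key}"},
        method="GET",
    )


def ping(heartbeat_name: str, api_key: str, timeout: float = 10.0) -> int:
    """Send the ping and return the HTTP status code of the response."""
    request = build_ping_request(heartbeat_name, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Each system would call something like `ping("system-a", "<your-api-key>")` from a scheduled job that runs more often than the heartbeat interval, so that one delayed run does not expire the heartbeat.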

It sounds like in your use-case, you only want an alert if both heartbeats expire, right? If so, you wouldn't be able to do that natively using heartbeats. Each heartbeat is independent of the others, so if 1 heartbeat expires, it would create an alert. If another heartbeat expires, it would create another alert.

One workaround I could think of would be to set up an Alert policy to modify the alias of these heartbeat alerts, so that the alerts created from expired heartbeats would have the same alias. That way if both heartbeats expire, there would only be 1 alert created, but the count of that alert would be 2. This is because the 2nd alert would be created with the same alias as the 1st alert, so instead of a new alert being created, it would de-duplicate and increase the count of the 1st expired heartbeat alert.

Then on the team these expired heartbeat alerts are going to, you could set up a team notification policy to delay notifications for these alerts until the count reaches 2. So if the 1st heartbeat expires, the count of the alert would be 1, so no notifications would be sent. Then if the 2nd heartbeat expires, it would de-duplicate into that 1st alert and raise the count to 2, at which point the notification policy would send notifications for the alert.
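The logic of this workaround can be sketched as a small Python model. This is only an illustration of the de-duplication and count-threshold idea, not how Opsgenie is implemented; the class and function names (`Alert`, `AlertStore`, `should_notify`) and the alias `shared-heartbeat-alias` are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical threshold mirroring "delay notifications until the count = 2".
NOTIFY_AT_COUNT = 2


@dataclass
class Alert:
    alias: str
    count: int = 1
    sources: list = field(default_factory=list)


class AlertStore:
    """Toy model of alias-based de-duplication of open alerts."""

    def __init__(self) -> None:
        self.open_alerts = {}

    def create(self, alias: str, source: str) -> Alert:
        alert = self.open_alerts.get(alias)
        if alert is None:
            # First alert with this alias: open a new alert with count 1.
            alert = Alert(alias=alias, sources=[source])
            self.open_alerts[alias] = alert
        else:
            # Same alias already open: de-duplicate and bump the count.
            alert.count += 1
            alert.sources.append(source)
        return alert


def should_notify(alert: Alert) -> bool:
    """Notify the team only once the alert count reaches the threshold."""
    return alert.count >= NOTIFY_AT_COUNT


store = AlertStore()
first = store.create("shared-heartbeat-alias", "heartbeat-A")
print(should_notify(first))   # count is 1, so no notification yet
second = store.create("shared-heartbeat-alias", "heartbeat-B")
print(should_notify(second))  # count is 2, so the team is notified
```

Note that this model only looks at the count, so it cannot tell which heartbeat contributed each increment. It is worth testing in a sandbox that the count only reaches 2 when both heartbeats have actually expired.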

Hope that helps! Let me know if you have any more questions about implementing that workaround.

Thanks,

Samir
