I have a pretty special use case which I'd love to use OpsGenie for.
Two distinct systems (hosted in-house with us) should send heartbeats to OpsGenie (using the plain built-in heartbeat functionality).
If one system stops sending, nothing should happen (though a short mail would be okay). If the second system also stops sending, it should raise an incident (other techniques within OpsGenie would be fine too, if they suit better) and alarm the whole team.
So far I could not get this working: two heartbeats that may each fail independently without consequence, but that should generate an incident if BOTH have failed.
Is this doable at all?
Thanks a lot!
Hi @bschmi!
Not sure I totally understand the use-case, but there isn't a way to have 2 heartbeats be dependent or related to each other.
The way heartbeats work in Opsgenie is that you add a heartbeat and specify an interval for it. That interval determines how often the heartbeat should receive a ping (either via email or a Ping Heartbeat API request). If that interval goes by with no ping, the heartbeat expires and creates an alert.
For example, if you created a heartbeat with an interval of 10 minutes, then that heartbeat is expecting a ping once every 10 minutes. If 10 minutes go by with no ping, the heartbeat will expire and create an alert.
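To make the expiry rule concrete, here is a minimal Python sketch of it. This only models the behaviour described above and is not Opsgenie's implementation; the function name `is_expired` and the timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def is_expired(last_ping: datetime, interval: timedelta, now: datetime) -> bool:
    """Hypothetical model: a heartbeat expires once more than one
    full interval has passed since its last ping."""
    return now - last_ping > interval


# Illustrative values only: a heartbeat with a 10-minute interval.
last_ping = datetime(2024, 1, 1, 12, 0)
interval = timedelta(minutes=10)

print(is_expired(last_ping, interval, datetime(2024, 1, 1, 12, 9)))   # False
print(is_expired(last_ping, interval, datetime(2024, 1, 1, 12, 11)))  # True
```

Each heartbeat is checked on its own in this model, which is why two heartbeats cannot see each other's state.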
It sounds like in your use-case, you only want an alert if both heartbeats expire, right? If so, there isn't a way to do that natively using heartbeats. Each heartbeat is independent of the others, so if one heartbeat expires, it creates an alert. If another heartbeat expires, it creates another alert.
One workaround I could think of would be to set up an Alert policy that modifies the alias of these heartbeat alerts, so that the alerts created from expired heartbeats all have the same alias. That way, if both heartbeats expire, only 1 alert is created, but the count of that alert is 2. This is because the 2nd alert would be created with the same alias as the 1st alert, so instead of a new alert being created, it would de-duplicate and increase the count of the 1st expired heartbeat alert.
Then, on the team these expired heartbeat alerts are routed to, you could set up a team notification policy to delay notifications for these alerts until the count = 2. So if the 1st heartbeat expires, the count of the alert would be 1, and no notifications would be sent. Then if the 2nd heartbeat expires, it would de-duplicate into that 1st alert and raise the count to 2, at which point the notification policy would send notifications for the alert.
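If it helps to see the logic of the workaround, the following Python sketch simulates it locally. It does not call Opsgenie and is not how Opsgenie is implemented; `AlertStore`, `notify_at_count`, the alias `in-house-heartbeats`, and the heartbeat names are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    alias: str
    count: int = 1


class AlertStore:
    """Hypothetical model of alias-based de-duplication plus a
    notification that is held back until the count reaches a threshold."""

    def __init__(self, notify_at_count: int = 2) -> None:
        self.open_alerts: dict[str, Alert] = {}
        self.notify_at_count = notify_at_count
        self.notifications: list[str] = []

    def create(self, alias: str, source: str) -> Alert:
        alert = self.open_alerts.get(alias)
        if alert is None:
            # First alert with this alias: open it with count 1.
            alert = Alert(alias)
            self.open_alerts[alias] = alert
        else:
            # Same alias already open: de-duplicate and bump the count.
            alert.count += 1
        if alert.count == self.notify_at_count:
            self.notifications.append(f"notify team: {alias} (last source: {source})")
        return alert


store = AlertStore(notify_at_count=2)
# Both expired-heartbeat alerts are assumed to be rewritten to one shared alias.
store.create("in-house-heartbeats", source="heartbeat-system-a")
print(store.notifications)  # []
store.create("in-house-heartbeats", source="heartbeat-system-b")
print(len(store.notifications))  # 1
```

In this model the first expiry opens the alert silently, and only the second expiry under the same alias triggers the team notification.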
Hope that helps! Let me know if you have any more questions about implementing that workaround.