At our company we use the Prometheus/Grafana stack to measure the quality and uptime of our landscape. Opsgenie is our alerting mechanism. So far so good!
But... as a final check we want a totally independent "last resort" check from external locations. What are the best practices for this?
What we want:
Curious to hear what external solutions are available with built-in Opsgenie support.
Hi @rogierl ,
The closest thing Opsgenie has to that use case is Heartbeats. These essentially act like a dead man's switch and can monitor that periodic tasks are running as expected.
Say for example you schedule a heartbeat to receive a ping from your external system every 12 hours. If Opsgenie does not receive that ping after the expected time, an alert can be created to notify your users to investigate the issue.
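To make the ping side concrete, here is a minimal Python sketch of what the external system could run on its schedule. It assumes the Opsgenie Heartbeat API ping endpoint (`/v2/heartbeats/{name}/ping`) with `GenieKey` authorization on the `api.opsgenie.com` host; the heartbeat name and API key below are placeholders, so please verify the details against the current Heartbeat API docs for your account and region.

```python
import urllib.request
from urllib.parse import quote

# Assumption: default (US) API host. EU-hosted accounts use a different host.
OPSGENIE_API = "https://api.opsgenie.com"


def build_ping_request(heartbeat_name: str, api_key: str) -> urllib.request.Request:
    """Build the ping request for an existing heartbeat (no network access here)."""
    # URL-encode the name so spaces and slashes do not break the path.
    url = f"{OPSGENIE_API}/v2/heartbeats/{quote(heartbeat_name, safe='')}/ping"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"GenieKey {api_key}"},
        method="GET",
    )


def ping_heartbeat(heartbeat_name: str, api_key: str, timeout: float = 10.0) -> int:
    """Send the ping and return the HTTP status code from Opsgenie."""
    request = build_ping_request(heartbeat_name, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Placeholder values: replace with your own heartbeat name and API key.
    print(ping_heartbeat("external-check", "YOUR_API_KEY"))
```

Run from cron or any scheduler at an interval shorter than the one configured on the heartbeat, so a single delayed run does not trigger an alert.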
Hey @Nick H , that would still mean we need something, a service, that makes the actual request to our website. I was hoping such a thing was built into the Opsgenie stack, but it seems it isn't.
And I understand that, because Opsgenie is an alerting system, not a monitoring system. So basically I'm looking for a (simple) monitoring system that can make the actual requests and that has an Opsgenie integration for the alerting part.
Yeah seems like you'd want a website monitoring tool such as Pingdom, Uptime Robot, etc.
But just to be clear, the heartbeat would be a ping sent from your external system to Opsgenie, not Opsgenie reaching out to the external system. So if your website can send a periodic ping to Opsgenie and that ping isn't received within the specified interval, Opsgenie would then create an alert.
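One way to turn that into a "last resort" external check is a small script on a host outside your own landscape that requests the website and only pings the heartbeat when the request succeeds. The sketch below is just an illustration in Python: `site_is_healthy` and `run_check` are hypothetical helper names, the URL is a placeholder, and the actual ping call is passed in as a function so you can plug in whatever client you use for Opsgenie.

```python
import urllib.error
import urllib.request
from typing import Callable


def site_is_healthy(url: str, timeout: float = 10.0) -> bool:
    """Return True only if the site answers with a 2xx status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # HTTP errors, DNS failures, refused connections and timeouts
        # all count as "unhealthy".
        return False


def run_check(check: Callable[[], bool], send_ping: Callable[[], None]) -> bool:
    """Ping the heartbeat only when the check passes.

    When the check fails, nothing is sent, the heartbeat expires on the
    Opsgenie side, and Opsgenie creates the alert.
    """
    healthy = check()
    if healthy:
        send_ping()
    return healthy


if __name__ == "__main__":
    # Placeholder URL and a stand-in ping function for illustration.
    ok = run_check(
        check=lambda: site_is_healthy("https://www.example.com/"),
        send_ping=lambda: print("would ping the Opsgenie heartbeat here"),
    )
    print("site healthy:", ok)
```

A side effect of this design is that the alert also fires when the checking host itself goes down, since no ping arrives in that case either.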
Hi @Joseph Matan ,
That should be possible. There are a few things you'd need to configure within the heartbeat, as well as within the Statuspage integration.
The heartbeat would need to add the affected component tag when the heartbeat-alert is created:
And the Statuspage integration would need to create an incident AND update the components'/incidents' statuses according to tags when an alert is created:
Our Statuspage document should help determine the format of the tag; see the Changing Component/Incident Statuses via Alert Tags section. Here are a few examples shown there as well:
And a test to prove this use case is possible through an expired heartbeat:
@Nick H thanks for your fast response and detailed answer!
Do you know if there is a future plan to add the heartbeat feature directly to Statuspage?
Because what was suggested here with Opsgenie is more of a workaround.
(I know that UptimeRobot offers the heartbeat solution directly on their statuspage)