External monitoring of URLs including Opsgenie support

At our company we use the Prometheus/Grafana stack to measure the quality and uptime of our landscape. Opsgenie is our alerting mechanism. So far so good!

But... as a final check we want a totally independent "last resort" check from external locations. What are the best practices for this?

What we want:

  • Being able to validate uptime (read: HTTP response codes) from multiple locations
  • If something is down (response code outside the 200 range), alert the corresponding innovation team using Opsgenie
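
For illustration, the two requirements above could be sketched roughly as follows in Python. This is a minimal sketch, not something proposed in the thread: it assumes the checker has already collected one status code per URL (`None` meaning no response at all) and that alerts are raised through the Opsgenie Alert API with a `GenieKey` API key. The function names, the `external-check` tag, and the alias format are hypothetical.

```python
import json
import urllib.request
from typing import Dict, Optional

# Assumption: the default (US) Opsgenie API host; EU accounts use api.eu.opsgenie.com.
ALERTS_URL = "https://api.opsgenie.com/v2/alerts"


def failing_checks(results: Dict[str, Optional[int]]) -> Dict[str, Optional[int]]:
    """Keep only the URLs whose status is missing or outside the 200 range."""
    return {
        url: status
        for url, status in results.items()
        if status is None or not 200 <= status < 300
    }


def build_alert_request(url: str, status: Optional[int], team: str,
                        api_key: str) -> urllib.request.Request:
    """Build (but do not send) a create-alert request routed to one team."""
    payload = {
        # Opsgenie limits the alert message length, so truncate defensively.
        "message": f"External check failed for {url} (status: {status})"[:130],
        # Hypothetical alias format, used so repeated failures deduplicate.
        "alias": f"external-check:{url}",
        "responders": [{"name": team, "type": "team"}],
        "tags": ["external-check"],
    }
    return urllib.request.Request(
        ALERTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"GenieKey {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` would send the alert. Running the same check from several locations simply means running this script on several independent hosts.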

Curious to hear what external solutions are available with built-in Opsgenie support.

2 answers

1 accepted

0 votes
Answer accepted
Nick H Atlassian Team Jul 22, 2022

Hi @rogierl ,

Closest thing Opsgenie has to that use case is Heartbeats. These essentially act like a dead man's switch, and can monitor that periodic tasks are running as expected.

Say for example you schedule a heartbeat to receive a ping from your external system every 12 hours. If Opsgenie does not receive that ping after the expected time, an alert can be created to notify your users to investigate the issue.

Hey @Nick H , that would still mean that we need something, a service, that makes the actual request to our website. I was hoping such a thing was built into the Opsgenie stack, but it seems like it isn't.

And I understand that, because Opsgenie is an alerting system, not a monitoring system. So basically I'm looking for a (simple) monitoring system that can make the actual requests and that has an integration with Opsgenie for the alerting part.

Nick H Atlassian Team Jul 22, 2022

Yeah seems like you'd want a website monitoring tool such as Pingdom, Uptime Robot, etc.

But just to be clear, the heartbeat would be a ping sent from your external system to Opsgenie, not Opsgenie reaching out to the external system. So if your website can send some sort of job to Opsgenie, and the ping is not received within the specified interval, Opsgenie would then create an alert.
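
As a rough illustration of that direction of traffic, the sketch below runs on an external machine, requests the website, and pings the Opsgenie heartbeat only when the response is in the 200 range. If the site is down (or the checker itself dies), the pings stop and the heartbeat expires. It assumes the Heartbeat API ping endpoint on the default US host and a `GenieKey` API key; the function names and the heartbeat name are hypothetical.

```python
import urllib.error
import urllib.parse
import urllib.request
from typing import Optional

# Assumption: default (US) API host; EU accounts use api.eu.opsgenie.com.
HEARTBEAT_PING_URL = "https://api.opsgenie.com/v2/heartbeats/{name}/ping"


def is_healthy(status_code: int) -> bool:
    """Treat only 2xx responses as 'up'."""
    return 200 <= status_code < 300


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code, or None if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with a 4xx/5xx status
    except OSError:
        return None  # DNS failure, refused connection, timeout, ...


def build_ping_request(heartbeat_name: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the heartbeat ping request."""
    name = urllib.parse.quote(heartbeat_name, safe="")
    return urllib.request.Request(
        HEARTBEAT_PING_URL.format(name=name),
        headers={"Authorization": f"GenieKey {api_key}"},
        method="GET",
    )


def check_and_ping(site_url: str, heartbeat_name: str, api_key: str) -> bool:
    """Ping the heartbeat only if the site responded with a 2xx status."""
    status = fetch_status(site_url)
    if status is None or not is_healthy(status):
        return False  # skip the ping so the heartbeat expires and alerts
    with urllib.request.urlopen(build_ping_request(heartbeat_name, api_key), timeout=10.0):
        pass
    return True
```

Scheduling `check_and_ping(...)` (for example from cron) more often than the heartbeat interval configured in Opsgenie keeps the heartbeat alive while the site is up.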

Yeah I understand. Thanks @Nick H , for your help. The heartbeat feature indeed looks promising. Next step is to look for a way to actually do the request.

If Opsgenie doesn't receive the heartbeat from my service/website within the specified time, can the resulting Opsgenie alert be easily integrated with Statuspage to update one of its components?

Nick H Atlassian Team Nov 09, 2022

Hi @Joseph Matan ,

That should be possible. There are a few things you'd need to configure within the heartbeat, as well as within the Statuspage integration.

The heartbeat would need to add the affected component tag when the heartbeat-alert is created:

hbsp1.jpg

 

And the Statuspage integration would need to create an incident AND update the components'/incidents' statuses according to tags when an alert is created:

hbsp2.jpg

 

Our Statuspage integration document describes the format of the tag in its "Changing Component/Incident Statuses via Alert Tags" section. Here are a few examples from it:

  • cmp_API:degraded_performance

  • cmp_Database Server:partial_outage

  • cmp_Management Portal:operational
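
Judging from these examples, a tag has the shape `cmp_<component name>:<status>`. A small helper for building and parsing such tags might look like the sketch below. The function names are hypothetical, and the status set is an assumption: the first three values appear in the examples above, while `major_outage` and `under_maintenance` are added on the assumption that the integration accepts the usual Statuspage component statuses.

```python
from typing import Optional, Tuple

# Assumption: statuses accepted in tags; only the first three appear in the examples.
COMPONENT_STATUSES = {
    "operational",
    "degraded_performance",
    "partial_outage",
    "major_outage",       # assumed
    "under_maintenance",  # assumed
}

TAG_PREFIX = "cmp_"


def component_tag(component: str, status: str) -> str:
    """Build a tag such as 'cmp_API:degraded_performance'."""
    if status not in COMPONENT_STATUSES:
        raise ValueError(f"unknown component status: {status!r}")
    return f"{TAG_PREFIX}{component}:{status}"


def parse_component_tag(tag: str) -> Optional[Tuple[str, str]]:
    """Return (component, status) for a component tag, or None for any other tag."""
    if not tag.startswith(TAG_PREFIX):
        return None
    # Split on the last colon so component names may contain spaces or colons.
    component, sep, status = tag[len(TAG_PREFIX):].rpartition(":")
    if not sep or not component or status not in COMPONENT_STATUSES:
        return None
    return component, status
```

For example, `component_tag("Database Server", "partial_outage")` produces the second tag in the list above.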

 

And a test to prove this use case is possible through an expired heartbeat:

hbsp3.jpg

hbsp4.jpg

@Nick H thanks for your fast response and detailed answer!

Do you know if there is a plan to add the heartbeat feature directly to Statuspage?

What we've set up here with Opsgenie is more of a workaround.

(I know that UptimeRobot offers a heartbeat solution directly on their status page.)

 

Thanks,

Joseph
