Recheck Failures before Automated Calls

Our company uses Control-M to schedule all of our system's processes. Currently, when a failure occurs in Control-M, a "first-response" team is notified and manually calls someone responsible for that failure, based on who is on call within "Teams" in Opsgenie. We're looking to automate this process through Opsgenie so there's less manual intervention. Cool, great!

My question/concern is this: our system runs 24/7, with Control-M processes cycling through multiple times a day. Some processes within Control-M are set up to automatically re-run 1-3 times. I'm looking for a way to set up a process in Opsgenie that does not trigger automated calls when a process fails once but then automatically restarts (within Control-M, and within 10 seconds) and completes successfully.

I've reviewed the flowchart for "Alert Notifications" and the ways to Suppress or Delay alerts, but I'm not sure if either of these options is what I'm looking for. I'm thinking more about how a human sees a failure, gets ready to manually call someone, then "rechecks" whether the failed process is still failed before calling. In this case, the failed process would automatically rerun and finish successfully. Thus, no call to on-call teams!
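
To make the "recheck" idea concrete, here is a rough Python sketch of the logic I have in mind. It is not Opsgenie or Control-M configuration, only an illustration: `should_page`, `get_status`, the status strings, and the timing values are all hypothetical names and numbers I made up for the example.

```python
import time
from typing import Callable


def should_page(
    get_status: Callable[[], str],
    grace_seconds: float = 30.0,
    poll_interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Decide whether to call the on-call person after a failure.

    get_status is a hypothetical callback that returns the job's current
    state, e.g. "failed", "running", or "succeeded".
    """
    waited = 0.0
    while waited < grace_seconds:
        # Wait a little, the way a human would before picking up the phone.
        sleep(poll_interval)
        waited += poll_interval
        # Recheck: if the automatic rerun completed, no call is needed.
        if get_status() == "succeeded":
            return False
    # The job never recovered within the grace period, so page someone.
    return True
```

For example, `should_page(lambda: "failed", grace_seconds=10, poll_interval=5)` would wait 10 seconds, recheck twice, and then return `True`. The grace period would need to be long enough to cover the 1-3 automatic reruns.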

 
