Automation behaviour of rule triggers after incidents which have caused trigger failure

Neven Panchev December 8, 2021

Hi everyone,

I'm here because of the recent incident of 7-8 Dec. 2021, which caused problems for automation rules.

In my case specifically, it caused one of my rules to not trigger on time. The rule was scheduled to create a task on 08.12.2021 at 12:00AM (midnight). The task was not created, and when I checked the next expected rule execution, it was scheduled for the following week, 15.12.2021 (it's a weekly Wednesday rule). In addition, there was no audit log entry showing whether the rule ran or failed.

I checked all of my automation rules because of this, panicking that this might in the future cause us to miss tasks. While I was doing that, the problematic rule triggered at 11AM and created the task...

So here's my question: was this trigger run manually by Jira as a correction for the problems caused by the outage, or is there a cron-like service, already in place before this outage, that makes sure all rules are run after an outage?

Basically, I want to know whether there is a contingency in place to run these failed rules in case another outage like this happens, so I can sleep at night :D

Thanks in advance!

-Nev

 

Answer accepted
John Funk
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
December 8, 2021

Hi Nev,

My experience is that when Automation comes back up, it does indeed catch up on all of the automations that should have fired. So, in effect, they are just delayed but not lost. 

Neven Panchev December 8, 2021

Hi John,

Thanks, that sounds great! Can we be certain it will work every time, though?

The reason I'm asking is that a single missed task creation would be business critical for us, as we rely heavily on automated quarterly/annual/biennial task creation to keep things like business continuity, backups etc. going.

Are there any cases where this catching up of Automation has failed before? What's the logic behind it, and is there anything we can do to decrease the chances of failure (for example, setting the trigger time to something within business hours rather than midnight)?

Appreciate the swift reply!

John Funk
Community Leader
December 8, 2021

Same here - they are critical for us. There have been multiple outages over the months and they have always ended up running when things got cleared up. 

I can’t speak to that as a guarantee but that’s my experience.

Aron Gombas _Midori_
Community Leader
December 8, 2021

As far as I know, Automation puts every triggered execution into a queue even if it cannot start it right away. For example, if it can execute at most N automations in parallel and there is a spike, all automations above N are inserted into the queue and wait.

When a "worker" becomes available to execute a rule, it picks an item from the queue and executes it. And so on.

This is a standard scalability and resiliency pattern.
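
To illustrate the queue-and-worker idea described above, here is a minimal Python sketch. It is not Atlassian's implementation, and I have no knowledge of how Automation is built internally; the names `run_all`, `MAX_WORKERS` and the rule names are made up for the example. It only shows that work queued during a spike is still executed once a worker is free.

```python
import queue
import threading

MAX_WORKERS = 2  # hypothetical "N": how many rules may run at the same time


def run_all(rule_names, max_workers=MAX_WORKERS):
    """Queue every triggered rule, then let a fixed pool of workers drain the queue."""
    pending = queue.Queue()
    for name in rule_names:
        pending.put(name)  # queued even though no worker has started yet

    executed = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                name = pending.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker stops
            with lock:
                executed.append(name)  # stand-in for actually running the rule

    threads = [threading.Thread(target=worker) for _ in range(max_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return executed


if __name__ == "__main__":
    # Five triggers arrive at once, but only two workers exist.
    done = run_all(["rule-%d" % i for i in range(5)])
    print(sorted(done))
```

Nothing is dropped when there are more triggers than workers: the extra items simply wait. What the sketch does not cover is the case Nev asked about, where the trigger itself never reaches the queue because the service is down.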

Bill Sheboy
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
December 8, 2021

Greetings all!

An FYI on what I am reading in this thread:

The Atlassian support team told our company (for a prior outage ticket) that "catching up" on scheduled and triggered rules is subject to the severity and specifics of the outage.  The expectation is that rules may eventually run (as Aron notes for queued events) or miss a schedule/trigger, and there is no expectation of when those triggers will happen after the outage.

Kind regards,
Bill


Deployment type: Cloud
Product plan: Free
Permissions level: Site Admin
Tags: AUG Leaders
