I'm here due to the recent incident from 08/07 Dec. 2021 which caused problems for automation rules.
In my case specifically, it caused one of my rules to not trigger on time. The rule was scheduled to create a task on 08.12.2021 12:00AM (midnight). The task was not created, and when I checked when the next rule execution was to be expected - it was for next week - 15.12.2021 (it's a weekly Wednesday rule). In addition, there was no audit log showing whether the rule ran or failed.
I checked all of my automation rules because of this, panicking that this might in the future cause us to miss tasks. While I was doing that, the problematic rule triggered at 11AM and created the task...
So here's my question - Was this trigger run manually by JIRA as a correction for all the problems caused by the outage, or is it a cron service which existed before this outage and which makes sure that all rules are run after an outage?
Basically, I want to know whether there is a contingency in place to run these failed rules in case another outage like this happens, so I can sleep at night :D
Thanks in advance!
Thanks, that sounds great! Can we be certain it will work every time, though?
The reason I'm asking is that a single missed task creation would be business critical for us, as we rely heavily on automated quarterly/annual/biennial task creation to keep things like business continuity, backups etc. going.
Are there any cases where this catching up of Automation has failed before? What's the logic behind it, and is there anything we can do to decrease the chances of failure (for example, set the trigger time to something within business hours instead of midnight)?
Appreciate the swift reply!
As far as I know, Automation puts every triggered execution into a queue even if it cannot start it right away. For example, if it can execute at most N automations in parallel and there is a spike, all automations above N are inserted into the queue and wait.
When there is a new "worker" that is available to execute a rule, it picks out an item from queue and executes it. And so on.
This is a standard scalability and resiliency pattern.
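To illustrate the general pattern (not Atlassian's actual implementation, which I have not seen), here is a minimal Python sketch. The function name `run_rules`, the rule names, and the `max_workers` parameter are all made up for the example. Every triggered execution goes into a queue, and at most N workers pick items out of it until it is empty:

```python
import queue
import threading


def run_rules(rule_names, max_workers):
    """Run every queued rule using at most max_workers parallel workers.

    Hypothetical illustration of the queue/worker pattern only.
    """
    pending = queue.Queue()
    for name in rule_names:
        # Every triggered execution is queued, even if no worker is free yet.
        pending.put(name)

    executed = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                name = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to pick up, worker stops
            with lock:
                # Stand-in for actually executing the rule.
                executed.append(name)

    threads = [threading.Thread(target=worker) for _ in range(max_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return executed
```

Calling `run_rules(["rule-a", "rule-b", "rule-c", "rule-d"], 2)` runs all four rules even though only two workers exist; the extra ones simply wait in the queue. Note that this only covers executions that made it into the queue in the first place.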
An FYI for anyone reading this thread:
The Atlassian support team told our company (for a prior outage ticket) that "catching up" on scheduled and triggered rules is subject to the severity and specifics of the outage. The expectation is that rules may eventually run (as Aron notes for queued events) or miss a schedule/trigger, and there is no expectation of when those triggers will happen after the outage.