Scheduled scaling with Jira Software Data Center now available with recent index improvements

Hello! I'm Olga – a product manager for Jira Data Center at Atlassian! 

On behalf of the team, I would like to share more details about recent improvements we’ve made to reduce the administrative work related to node startup and indexing. These improvements are available in the latest Jira Software long-term support (LTS) release.

Let me share with you the challenges we detected and how we solved them. This post also includes additional documentation that will help you set up smart traffic distribution with Atlassian Data Center.

Challenge #1: Jira admins had to perform a periodic full reindex to achieve index consistency between nodes

Prior to Jira 8.13, there were a number of issues that could cause index inconsistencies between a node and the database, as well as between any two nodes.

This was due to the lack of a conflict resolution algorithm during the index update. Any concurrent reindexing operations on a single issue could leave that issue in an unpredictable state in the index, making it inconsistent with the state of the issue in the database or on other nodes.
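To make the problem concrete, here is a minimal sketch of one common way to resolve conflicting updates: compare a version number before writing. This is an illustration only and not Jira's actual algorithm, which this post does not describe. The `IndexEntry` type, the `version` field, and the `apply_update` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IndexEntry:
    issue_key: str
    version: int   # hypothetical counter that grows with every issue change
    summary: str


def apply_update(index: dict, incoming: IndexEntry) -> bool:
    """Write `incoming` only if it is newer than the entry already indexed."""
    current = index.get(incoming.issue_key)
    if current is not None and current.version >= incoming.version:
        return False  # stale update: keep the newer entry
    index[incoming.issue_key] = incoming
    return True


index = {}
newer = IndexEntry("DEMO-1", version=2, summary="after edit")
older = IndexEntry("DEMO-1", version=1, summary="before edit")

# Two reindex operations for the same issue arrive out of order.
apply_update(index, newer)
apply_update(index, older)  # rejected, so the index keeps version 2
```

Without the version check, whichever write happened to land last would win, which is the kind of unpredictable state described above.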

To address the inconsistency, Jira admins scheduled a regular full reindexing step in their administration process.

Since version 8.13, Jira Software guarantees index consistency, so admins no longer need to perform a regular full reindex.

Learn more in "Periodic full reindex in Jira Data Center is not required" in the Atlassian documentation for Jira.

Challenge #2: Unreliable index distribution on node start

While the 8.13 update meant that admins no longer needed to perform regular full reindexes, they still couldn't be sure that a starting node would receive a correct index. Because of that, they had to either restart the node until the index was copied correctly or distribute the index manually by copying it from another node to the new node before starting it. On smaller instances, admins performed a full reindex every time a node was started.

Addressing the challenges

With the release of Jira Software 9.4, admins can schedule nodes to be added to or removed from a cluster in advance, without worrying about whether index acquisition has worked. All nodes rely on an index snapshot already being present in the shared home.

We also ensure that the index snapshot will be produced and sent to the shared home on the following occasions:

  • Regularly – every 24 hours by default

  • After every major change to the index – background reindex, foreground reindex, or project import

  • On start – each start after the index is obtained makes sure a fresh index snapshot is available in the shared home

Jira snapshots are now only created and copied to the shared home by nodes with a consistent index.
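The snapshot rules above can be summarized as a small decision function. This is a sketch of the rules as stated in this post, not Jira's implementation: the function name, the event names, and the handling of a node that has never produced a snapshot are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional

SNAPSHOT_INTERVAL = timedelta(hours=24)  # default interval mentioned in the post

# Hypothetical event names for the "major change" and "on start" triggers.
MAJOR_INDEX_CHANGES = {"background_reindex", "foreground_reindex", "project_import"}
NODE_START = "node_start"


def should_create_snapshot(
    index_consistent: bool,
    last_snapshot_at: Optional[datetime],
    now: datetime,
    event: Optional[str] = None,
) -> bool:
    """Decide whether a node should write a fresh snapshot to the shared home."""
    if not index_consistent:
        return False  # only nodes with a consistent index create snapshots
    if event == NODE_START or event in MAJOR_INDEX_CHANGES:
        return True
    if last_snapshot_at is None:
        return True  # assumption: no earlier snapshot means one is due
    return now - last_snapshot_at >= SNAPSHOT_INTERVAL
```

For example, a consistent node whose last snapshot is 25 hours old is due for a new one, while a node with an inconsistent index never produces a snapshot, even after a full reindex event.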

The benefits of this improvement are:

  • Jira Software Data Center can now scale automatically based on demand and your preferences, thus saving time and resources while cutting back on stress for admins.

  • The process to establish node scaling is now stable and doesn’t need admin supervision.

Customer use case

Consider a company that requires four nodes to operate during business hours. On the weekend, traffic is minimal and one node is more than enough to keep the instance running.

With Jira Software 9.4, the admin can set autoscaling rules in AWS to shut down three of those nodes on Friday evening and turn them back on Monday morning. This allows admins to enjoy their weekends without managing nodes, and the company saves money by not paying for three servers two days per week.
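The schedule in this use case can be expressed as a simple rule that maps a point in time to a node count. The sketch below is plain Python rather than an AWS configuration. The cut-over hours (18:00 on Friday, 06:00 on Monday) and all names are assumptions chosen for illustration; in practice you would encode the same schedule in your autoscaling group's scheduled actions.

```python
from datetime import datetime

BUSINESS_NODES = 4    # nodes needed during the working week (from the use case)
WEEKEND_NODES = 1     # nodes needed over the weekend (from the use case)
SCALE_DOWN_HOUR = 18  # assumed Friday evening cut-over
SCALE_UP_HOUR = 6     # assumed Monday morning cut-over


def desired_node_count(now: datetime) -> int:
    """Return how many nodes the cluster should run at `now`."""
    weekday = now.weekday()  # Monday is 0, Sunday is 6
    if weekday in (5, 6):
        return WEEKEND_NODES
    if weekday == 4 and now.hour >= SCALE_DOWN_HOUR:
        return WEEKEND_NODES
    if weekday == 0 and now.hour < SCALE_UP_HOUR:
        return WEEKEND_NODES
    return BUSINESS_NODES
```

Because nodes starting on Monday morning pick up the index snapshot from the shared home, a rule like this can run unattended.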

Useful resources

Bugs fixed as part of this initiative

Addressing the challenges described above also resolved the following bugs:

  • JRASERVER-72125 - Index replication service is paused indefinitely after failing to obtain an index snapshot from another node

  • JRASERVER-67261 - As a JIRA Datacenter Administrator I want to do an automated cold recovery from index a snapshot

  • JRASERVER-72944 - Restoring an index snapshot after a full re-index might trigger the index fixer, delaying the node start-up

  • JRASERVER-74321 - Upgrade from 8.x to 9.1 triggers full reindex twice

  • JRASERVER-66635 - Index Recovery is very slow

  • JRASERVER-74270 - Unable to calculate missing data in the index when getting the last issue update time returns null

  • JRASERVER-74271 - During startup when Jira tries to index missing data after getting a snapshot and fails it's not switching to full-reindex but continues with broken index.

  • JRASERVER-74266 - Full foreground reindex is not replicated to an offline node

  • JRASERVER-74248 - Jira shows unnecessarily alarming stack trace when reindexing thread is expectedly disabled

  • JRASERVER-74787 - Full reindex operation slow down in Jira 9.0.0

As we continue to improve indexing further, we are also addressing:

I hope this helps your admins save time and your organizations avoid the headache of running unnecessary nodes! You can learn more about the improvements in Jira Software 9.4 here. Thank you!
