Create
cancel
Showing results for 
Search instead for 
Did you mean: 
Sign up Log in

It's not the same without you

Join the community to find out what other Atlassian users are discussing, debating and creating.

Atlassian Community Hero Image Collage

The story about how to improve the background indexing Lucene job for Jira in one shot

Holidays are the time to move your notes into small stories and small finalized articles. 

Based on that I will share a series of articles improving the performance of indexing Lucene of Jira. (such as testing the disks, checking the health of VM, filesystem, ramdisk vs SSD, RDBMS tuning and connectivity,  and of course, addons, custom fields, calculated fields as well)

Let’s start from the file system level because it was non-standard improvement. 

 

So who works with on-prem Jira, still meet on large instances with the long background reindexing job. 

Before the experiment, we had the next result of background reindexing for the ~3 mln tickets + daily ~2k tickets, 1.2k custom fields, ~20k+ users.  

image.png

 

System information before improvements: 

  1. VMWare cluster
  2. NetApp
  3. SSD read cache + SAS for application, full SSD for RDBMS
  4. Oracle Linux 7
  5. Filesystem: XFS
  6. Java 11
  7. Jira 8.5.2
  8. GC strategy as usual Shenandoah GC
  9. MySQL 5.6.x cluster (master-slave)
  10. Monitoring prometheus + grafana 

 

Access for the first step I had only SSH access to the system and no access to the Jira as application (GUI). Because waiting for the approvals and additional test env for the change on the application level. 

 

So I started from that tool  https://github.com/AUGSpb/atlassian-support-benchmark , because it includes percentiles :) Results were acceptable. 

So the next step is to check /var/log/messages and other levels. 

Because I had more than 60,000ms than Brendon Gregg, but I used his technique to check. 

One interesting option was iostat.

# iostat -xz 1

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
                    8.73    0.00    0.28    0.05    0.00   90.94
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              74.84       181.73      1133.16 3530350052 22013754947
sdb              11.68       105.15       211.81 2042647649 4114862996
sdc               0.42         6.46         6.54  125413861  127003521
dm-0              0.14         0.18         0.38    3513884    7325080
dm-1             75.26       181.54      1132.77 3526794066 22006130096
dm-2             12.06       105.15       211.81 2042646465 4114862996
dm-3              0.44         6.46         6.54  125409921  127003521

The problem was interesting, the logical volume was slow, but the physical volume faster, of course, we need to pay attention to resource consumption. And during indexing, the results were the same. 

So the problem was with XFS fragmentation. Of course, many administrators will say we met with that on Windows with NTFS file system, and on Linux impossible. So no worry you can meet with that :) 

 

Detected via next command:

# xfs_db -c frag -r /dev/mapper/vg_jira-jira

actual 167812, ideal 116946, fragmentation factor 30.31%

Note, this number is largely meaningless.

Files on this filesystem average 1.43 extents per file

 

Online fix on prod env:

# xfs_fsr /dev/mapper/vg_jira-jira

/jira start inode=0

 

Double check after a few days:

# xfs_db -c frag -r /dev/mapper/vg_jira-jira

actual 34173, ideal 33338, fragmentation factor 2.44%

Note, this number is largely meaningless.

Files on this filesystem average 1.03 extents per file

 

References:

I hope that links will help you to indicate your problem proactively.

 

Conclusion

As a result, improved latency till 17%, background reindexing in one thread of that instance ~73h vs ~49h.

 

Do you have a similar situation?

 

 

 

3 comments

That's a long reindex! With Jira Data Center 8.5.4, 6.5 million issues, 1200 custom fields (mostly global contexts) we see a foreground reindex take under 4 hours. We do have jira-config.properties with

jira.index.issue.threads=26

@Matt Doar__ LinkedIn  It's start of improvements :) 

Parameter to set the 26 thread is not every time helpful (foreground != background ), because it's related to thread to read, but writer use 1 thread.

Could you share the  number of  calculated fields, how many custom search implementation, and apps list , please ? :) 

 

@Matt Doar__ LinkedIn  Meanwhile, please, could you share the timeline of background reindex time for your installation? 

  • Jira Data Center 8.5.4,
  • 6.5 million issues,
  • 1200 custom fields (mostly global contexts) 

Just want to compare. 

Comment

Log in or Sign up to comment
TAGS

Community Events

Connect with like-minded Atlassian users at free events near you!

Find an event

Connect with like-minded Atlassian users at free events near you!

Unfortunately there are no Community Events near you at the moment.

Host an event

You're one step closer to meeting fellow Atlassian users at your local event. Learn more about Community Events

Events near you