We have a 4-node Jira Data Center instance that has been in use for over 6 months.
Two nodes became unresponsive. The server log files contained no error messages, and we could see that Jira was still running because scheduled-job log messages were still being written.
Running netstat showed that Jira was no longer listening on port 8080.
After rebooting the 2 servers, both instances returned to the same hung state, even though the server log showed a normal startup.
It wasn't until all 4 nodes were rebooted that the 2 hung nodes became responsive again.
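For anyone wanting to script the port check described above rather than running netstat by hand on each node, here is a minimal sketch. It assumes the nodes are reachable over TCP from wherever the script runs; the function names `is_port_listening` and `check_nodes` are hypothetical, and port 8080 is simply the port mentioned above. Note that this tests from the outside whether a connection is accepted, which is not quite the same thing as netstat's view of the listening sockets on the node itself.

```python
import socket
from typing import Dict, Iterable


def is_port_listening(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


def check_nodes(hosts: Iterable[str], port: int = 8080) -> Dict[str, bool]:
    """Check every host and report which ones accept connections on the port."""
    return {host: is_port_listening(host, port) for host in hosts}
```

Calling `check_nodes(["node1", "node2"])` with your own host names (the ones here are placeholders) returns a dictionary mapping each host to True or False.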
Has anybody else encountered this issue?
Hi Paul,
Could you please let us know what version of Jira Data Center you are using? There have been a significant number of fixes specifically for Data Center stability in the past few months.
The one documented bug I am aware of that might produce the symptoms you described is https://jira.atlassian.com/browse/JRASERVER-65197. However, if this is the cause, that bug was specifically fixed in the Jira 7.4.2 release (and the 7.2.10 release). I would not expect Jira 7.3.x, 7.4.1, 7.4.0, or 7.2.9 and earlier versions to have this fix.
Even if that is not the specific bug for this case, I am still interested to learn more about what version you are using.
Regards,
Andy
Hi Andy,
Many thanks for your reply.
We are running Jira Data Centre Version 7.5.2.
The day after I reported this problem, the 2 remaining nodes hung. Again, there was nothing in the server log files to help identify what the problem might be.
We rebooted all 4 nodes and all 4 nodes came back ok.
We are very concerned that this unexplained problem could reoccur.
Any help or suggestions on how to diagnose this the next time it happens would be gratefully received.
many thanks
Paul.
Hi Paul,
Sorry to hear that my initial hunch seems incorrect. There still have been a number of improvements and fixes for data center, even from your more recent version. You can find a brief list of these in my JAC search for data center bugs and features since 7.5.2.
But that said, I would still be interested to see if we can better troubleshoot your specific problem rather than just encourage an upgrade. When this event happens, I would recommend first generating an immediate thread dump from the affected nodes. Steps on how to do this are in Troubleshooting Performance Issues with thread dumps.
These thread dumps can usually help with diagnosing what the application is doing when it becomes unresponsive on that node. That information, paired with a support zip from that node can be helpful.
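As a rough illustration of the thread dump step, the sketch below captures several dumps in a row so that they can be compared afterwards. It assumes the JDK's `jstack` tool is on the PATH and that the script runs as the same OS user as the Jira process; the function name `capture_thread_dumps`, the file naming, and the default count and interval are hypothetical choices, not taken from the linked guide.

```python
import subprocess
import time
from pathlib import Path
from typing import List, Sequence


def capture_thread_dumps(
    pid: int,
    out_dir: str,
    count: int = 6,
    interval: float = 10.0,
    cmd: Sequence[str] = ("jstack", "-l"),
) -> List[Path]:
    """Run the dump command `count` times against `pid`, one file per dump."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for i in range(count):
        stamp = time.strftime("%Y%m%d-%H%M%S")
        target = out / f"threaddump-{pid}-{i:02d}-{stamp}.txt"
        # check=False: keep whatever output we get, even if the command fails.
        result = subprocess.run(
            [*cmd, str(pid)], capture_output=True, text=True, check=False
        )
        target.write_text(result.stdout + result.stderr)
        written.append(target)
        if i < count - 1:
            time.sleep(interval)  # space the dumps out so they can be compared
    return written
```

Taking several dumps a few seconds apart makes it easier to tell a thread that is genuinely stuck from one that merely happened to be busy at the moment of a single dump.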
In addition to those thread dumps, I would also be interested to take a look at the GC logs created by the application. You are running a version that should have GC logging enabled by default, so those logs should be included in the support zip when it is generated. We in support would follow the guide Using Garbage Collection Logs to Analyze JIRA Application Performance to better understand whether the way Java is performing garbage collection could be adversely affecting these nodes. Personally, I am a fan of using Samurai to analyze the GC logs. You could use many other tools, or even a text editor, to examine these logs, but I find Samurai relatively easy to use.
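If you want a quick first pass over the GC logs before opening them in a dedicated tool, something like the sketch below can flag long pauses. It assumes the logs are in the Java 8 `-XX:+PrintGCDetails` style, where each collection line ends with a `[Times: user=... sys=..., real=... secs]` section; other JVM versions or logging flags produce different formats that this will not match. The function name `find_long_pauses` and the 1-second threshold are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# Matches the wall-clock time in a "[Times: user=... sys=..., real=... secs]" section.
_REAL_TIME = re.compile(r"real=(\d+(?:\.\d+)?) secs")


def find_long_pauses(
    lines: Iterable[str], threshold_secs: float = 1.0
) -> List[Tuple[int, float, bool]]:
    """Return (line number, pause in seconds, is Full GC) for each long pause."""
    hits: List[Tuple[int, float, bool]] = []
    for line_no, line in enumerate(lines, start=1):
        match = _REAL_TIME.search(line)
        if match is None:
            continue  # not a line carrying a timing summary
        pause = float(match.group(1))
        if pause >= threshold_secs:
            hits.append((line_no, pause, "Full GC" in line))
    return hits
```

To use it, open a GC log file from the support zip and pass the file object straight in, for example `find_long_pauses(open(path))`, where `path` is wherever your log lives. Frequent or multi-second Full GC entries around the time a node hung would be worth a closer look.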
Let me know if this helps. If we need to gather these logs from you to take a look ourselves, just let me know and I should be able to create a support request where we can do that.
Andy
Hi Andy,
Many thanks for your detailed and very helpful reply. I will keep your guide to hand should this problem reoccur.
kind regards
Paul