I've got JIRA 5.2.10 installed on a Windows 2008 R2 server.
Very often, when I try to stop the "Atlassian JIRA" service, it takes a very long time to stop. It can take up to an hour from stopping the service until the tomcat7.exe process finally exits.
This results in a very long JIRA service outage.
The instance is connected to a local MSSQL database, and when I check the database processes during this time, no JIRA processes are active.
My basic questions are:
Can anyone shed some light on this, please?
This should go in as a support request. Shutdown should take something like a minute or so.
My guess is that a set of shutdown requests is not being dealt with properly and is going through a long timeout. I've had a similar problem, but on startup: a set of plugins did not respond to start requests, and the start requests then took minutes to time out. You should check the timestamps of events in the log during shutdown, looking for long breaks. Make sure all your plugins are up to date. You could also check Windows taskmgr.exe for memory usage, CPU activity and time consumed by the tomcat process. A memory leak could conceivably build up a large chunk of virtual memory that might take a long time to clear if there is some kind of repeated timeout. (Is the time to shut down related to the uptime?) If the activity is flat, then the system will be waiting for some kind of callback or function return. This timeout would probably be logged.
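To make the "look for long breaks in the log" step less tedious, here is a minimal sketch in Python. It assumes each log entry starts with a timestamp in the form `2013-08-14 20:37:49,605` (as in the log excerpts in this thread); the function name `find_gaps` and the 30-second default threshold are my own choices for illustration, not anything JIRA provides.

```python
import re
from datetime import datetime, timedelta

# Timestamp prefix as seen in the log excerpts, e.g. "2013-08-14 20:37:49,605"
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})")

def find_gaps(lines, threshold=timedelta(seconds=30)):
    """Return (gap, line_before, line_after) for each pause >= threshold.

    Lines without a leading timestamp (stack traces, wrapped text) are skipped.
    """
    gaps = []
    prev_ts = None
    prev_line = None
    for line in lines:
        m = TS.match(line)
        if not m:
            continue
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        ts += timedelta(milliseconds=int(m.group(2)))
        if prev_ts is not None and ts - prev_ts >= threshold:
            gaps.append((ts - prev_ts, prev_line.rstrip(), line.rstrip()))
        prev_ts, prev_line = ts, line
    return gaps
```

You would feed it the lines of whichever log file covers the shutdown (for example `find_gaps(open(path_to_log))`, where `path_to_log` is a placeholder) and read off the entries on either side of each long pause.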
My experience is that plugins can definitely be the cause of long (or improper) shutdown. This is what my logs used to show...
2013-08-14 20:37:49,605 localhost-startStop-2 INFO [atlassian.plugin.manager.DefaultPluginManager] Shutting down the plugin system
2013-08-14 20:37:54,530 Timer-1 INFO [jira.plugins.monitor.MonitoringScheduler] Unscheduling metrics collector...
2013-08-14 20:37:54,531 Timer-1 INFO [jira.plugins.monitor.MonitorLauncher] Stopped JIRA monitoring
2013-08-14 20:38:05,883 localhost-startStop-2 ERROR [internal.util.concurrent.RunnableTimedExecution] Closing runnable for context NonValidatingOsgiBundleXmlApplicationContext(bundle=com.atlassian.upm.atlassian-universal-plugin-manager-plugin, config=osgibundle:/META-INF/spring/*.xml) did not finish in 10000ms; consider taking a snapshot and then shutdown the VM in case the thread still hangs
2013-08-14 20:38:15,895 localhost-startStop-2 ERROR [internal.util.concurrent.RunnableTimedExecution] Closing runnable for context NonValidatingOsgiBundleXmlApplicationContext(bundle=another-plugin1, config=osgibundle:/META-INF/spring/*.xml) did not finish in 10000ms; consider taking a snapshot and then shutdown the VM in case the thread still hangs
..with a whole bunch of these errors. This was then followed by multiple exceptions:
[osgi.container.felix.FelixOsgiContainerManager] JarContent: Unable to read bytes. java.lang.IllegalStateException: zip file closed
..that repeated until the JIRA process was killed.
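If your log contains a lot of these "did not finish in 10000ms" errors, a short script can list which plugin bundles they name. This is only a sketch: it assumes the error text has the same shape as the entries quoted above, and `timed_out_bundles` is a name I made up for the example.

```python
import re

# Shape of the RunnableTimedExecution error quoted above:
#   ...(bundle=<name>, config=<...>) did not finish in <N>ms; ...
TIMEOUT = re.compile(
    r"bundle=([^,\s]+),\s*config=[^)]*\)\s+did not finish in (\d+)ms"
)

def timed_out_bundles(log_text):
    """Return (bundle_name, timeout_ms) for each shutdown timeout, in log order."""
    return [(m.group(1), int(m.group(2))) for m in TIMEOUT.finditer(log_text)]
```

The resulting list is a starting set of suspects rather than a verdict: a bundle may time out only because something else it depends on is hanging.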
What I did was this (using my test JIRA):
I eventually identified the culprit plugin. Naturally, Murphy's Law dictated that it was the last possible candidate. I then:
Understand that it does not really matter if you are not getting an exception in your logs... if it IS a plugin that is causing the problem, then the above method is a good way of finding out which one it is. Or of establishing that none of your non-system plugins seems to be causing the problem.
Jacques, did you come right with your slow shutdown problem?
I did not previously answer your 2nd question "How can I monitor what the service is actually doing?" but have had some thoughts on it...
Jim recommended Windows taskmgr.exe. That's a good start. I've also had some results using Process Explorer from Microsoft's Sysinternals site. This site has many incredibly useful tools to help you (to quote) "manage, troubleshoot and diagnose your Windows systems and applications." All free. And accompanied by lots of helpful blog postings that are full of advice.
I've also found the JavaMelody plugin to be excellent for ongoing JIRA monitoring, and even a certain amount of management. I've configured my setup to send me nightly and weekly PDF reports. Over time, these build up into a nice archive... so I can go back and see what things looked like a couple of months ago in a way that is not so easy with (say) Taskmgr.
I referred to management above. Just this past Friday, I had users complaining of slow response times. I was advised to restart JIRA. But, instead, I went into JavaMelody, which showed me that JIRA was consuming 50% of CPU and that HTTP response times had gone crazy bad. More importantly, the graphs in JavaMelody showed exactly when the problem had started. I was able to identify an open HTTP session that had been initiated at the time that the problem started. I killed the session using JavaMelody... and CPU usage dropped down to 0-2% and HTTP response times were sorted out as well. No need to restart JIRA!
OK, so I still need to work out where that rogue HTTP session came from... but JavaMelody will help with that as well.