Java GC blocks Jira

With the usage of Jira increasing, we notice that Jira is blocking out users for typically 120 to 300 seconds. The timing of this blocking seems to coincide with the garbage collector runs of the Java VM.

This is increasingly critical as users complain that when they have entered lengthy issue descriptions, they loose all their work (sometime in excess of half an hour). The number of user complaints is getting significant.

Is there a method to

a) determine the root cause of these short outages

b) fix it

Martin, Advantest

4 answers

1 accepted

2 votes

Answer accepted

a) Options: <kbd>-verbose:gc -XX:+PrintGCDetails</kbd>

b) You have to tune your GC.

1. Keep in mind that most such problems arise because Java heap is too big.

2. You can always use -XX:+UseParallelGC (minor pauses acceptable) or even -XX:+UseConcMarkSweepGC (no pauses)

If you decide for parallel, cosider using -XX:ParallelGCThreads=<N procs - 1> (or around) and use -XX:MaxGCPauseMillis=2000 to indicate to the server that pauses greater than 2s are not desired.

There are plenty of articles describing how to minimize the pauses

[Edit] Try these to cut as much as possible the pauses:

-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

Actually I was searching for that in both "Answers @ Atlassian" and Google. The majority of articles was just mentioning the GC logging parameters but I have those already.

Are there any specific pointers?

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

What's in gc.log ?

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

A good place to start: http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

The one that caused a problem today:

2012-07-27T08:39:15.594+0200: 175993.979: [GC [PSYoungGen: 805630K-&gt;191877K(798656K)] 3554934K-&gt;2941630K(3594880K), 0.0779530 secs] [Times: user=1.12 sys=0.00, real=0.07 secs]
2012-07-27T08:39:58.266+0200: 176036.651: [GC [PSYoungGen: 798597K-&gt;162119K(830976K)] 3548350K-&gt;2912229K(3627200K), 0.0638500 secs] [Times: user=0.98 sys=0.01, real=0.06 secs]
2012-07-27T08:40:00.939+0200: 176039.324: [GC [PSYoungGen: 761095K-&gt;143249K(844288K)] 3511205K-&gt;2914494K(3640512K), 0.0581810 secs] [Times: user=0.94 sys=0.01, real=0.06 secs]
2012-07-27T08:40:04.790+0200: 176043.175: [GC [PSYoungGen: 742225K-&gt;156741K(908160K)] 3513470K-&gt;2928745K(3704384K), 0.0519910 secs] [Times: user=0.79 sys=0.00, real=0.06 secs]
2012-07-27T08:40:05.888+0200: 176044.274: [GC [PSYoungGen: 825669K-&gt;156447K(907136K)] 3597673K-&gt;2929053K(3703360K), 0.0541860 secs] [Times: user=0.86 sys=0.00, real=0.06 secs]
2012-07-27T08:40:20.023+0200: 176058.408: [GC [PSYoungGen: 825375K-&gt;179413K(990016K)] 3597981K-&gt;2952710K(3786240K), 0.0881010 secs] [Times: user=1.05 sys=0.07, real=0.09 secs]
2012-07-27T08:40:28.082+0200: 176066.467: [GC-- [PSYoungGen: 945173K-&gt;945173K(990016K)] 3718470K-&gt;3741389K(3786240K), 2.4579610 secs] [Times: user=1.93 sys=6.64, real=2.46 secs]
2012-07-27T08:40:30.540+0200: 176068.925: [Full GC [PSYoungGen: 945173K-&gt;0K(990016K)] [PSOldGen: 2796216K-&gt;762306K(2541568K)] 3741389K-&gt;762306K(3531584K) [PSPermGen: 240935K-&gt;235637K(249600K)], 73.4757870 secs] [Times: user=4.55 sys=0.76, real=73.46 secs]
2012-07-27T08:41:44.618+0200: 176143.004: [GC [PSYoungGen: 765760K-&gt;34425K(996928K)] 1528066K-&gt;796732K(3538496K), 0.0297050 secs] [Times: user=0.38 sys=0.00, real=0.03 secs]

So this Full GC took 73 seconds which was noticable

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

This is more of the typical one (takes between 0.5 and 4 sec for the full GC)

2012-07-27T13:34:21.545+0200: 193699.930: [GC [PSYoungGen: 539200K-&gt;33856K(609408K)] 3076719K-&gt;2586239K(3399424K), 0.0915780 secs] [Times: user=0.35 sys=0.02, real=0.09 secs]
2012-07-27T13:34:33.013+0200: 193711.398: [GC [PSYoungGen: 545152K-&gt;20362K(679104K)] 3097535K-&gt;2589388K(3469120K), 0.1007560 secs] [Times: user=0.41 sys=0.01, real=0.10 secs]
2012-07-27T13:34:37.009+0200: 193715.395: [GC [PSYoungGen: 613066K-&gt;30201K(682560K)] 3182092K-&gt;2599976K(3472576K), 0.0562450 secs] [Times: user=0.34 sys=0.04, real=0.06 secs]
2012-07-27T13:34:40.994+0200: 193719.379: [GC [PSYoungGen: 622905K-&gt;21104K(771776K)] 3192680K-&gt;2611651K(3561792K), 0.0600670 secs] [Times: user=0.34 sys=0.02, real=0.06 secs]
2012-07-27T13:34:46.649+0200: 193725.034: [GC [PSYoungGen: 711472K-&gt;39351K(772928K)] 3302019K-&gt;2641011K(3562944K), 0.0705590 secs] [Times: user=0.36 sys=0.02, real=0.07 secs]
2012-07-27T13:34:50.140+0200: 193728.525: [GC [PSYoungGen: 729719K-&gt;28886K(877504K)] 3331379K-&gt;2657883K(3667520K), 0.0467470 secs] [Times: user=0.34 sys=0.03, real=0.05 secs]
2012-07-27T13:34:51.458+0200: 193729.843: [GC [PSYoungGen: 833558K-&gt;2816K(880768K)] 3462555K-&gt;2654100K(3670784K), 0.0295270 secs] [Times: user=0.33 sys=0.00, real=0.03 secs]
2012-07-27T13:34:58.463+0200: 193736.848: [GC [PSYoungGen: 807488K-&gt;77042K(968384K)] 3458772K-&gt;2830436K(3758400K), 0.1496060 secs] [Times: user=0.94 sys=0.13, real=0.15 secs]
2012-07-27T13:34:58.613+0200: 193736.998: [Full GC [PSYoungGen: 77042K-&gt;0K(968384K)] [PSOldGen: 2753394K-&gt;1802531K(2796224K)] 2830436K-&gt;1802531K(3764608K) [PSPermGen: 255608K-&gt;243959K(256704K)], 4.8672610 secs] [Times: user=4.35 sys=0.01, real=4.87 secs]
2012-07-27T13:35:09.438+0200: 193747.823: [GC [PSYoungGen: 891328K-&gt;47021K(1018752K)] 2693859K-&gt;1849553K(3814976K), 0.1264450 secs] [Times: user=0.36 sys=0.02, real=0.13 secs]

As mentioned above: most of the full GCs take less than 2 seconds.

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

1) I would advise to set -Xms & -Xmx to the same value

2) Can you try adding -XX:NewRatio=4 (or even 5) and see what's happening ? Play also with -XX:MaxGCPauseMillis

3) Revert, and try:

-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

Any update on this ?

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

1 vote

Here is the proposal from Atlassian support. It' slightly different:

Based on the Database statistic, it seems that this is a medium size instance. Based on the GCViewer image as per attached in the screenshot GClog, it seems that the timespent for the GC is really high for on the 27th July 2012. We also notice that the heap usage was high during the day. Since the time taken for the GC full GC is high, this might be cause by the size of the heap that Full GC need to clear is very large.

Therefore, we would like to recommend to adjust the current JVM input parameter to the following. In Linux, please edit the <tt>JIRA_Install/bin/setenv.sh</tt>:

Use a parallel collector: <tt>-XX:+UseParallelOldGC</tt>. This will ensure that GC will perform the Garbage Collection more frequently. But, every collection should be small thus the lag time should be reduce.
Disable remote clients from triggering a full GC event <tt>-XX:+DisableExplicitGC</tt>
Set the Young space up to 30-40% of the overall heap. Eg: <tt>-XX:NewSize=1229m</tt>
optionally, reduce the size of the -Xmx to 3GB because as a benchmark we allocate 1 GB of heap to instance with 100,000 of issues. Reducing the -Xmx will reduce the amount of heap that GC need to clear for each Full GC.

Do let us know how it goes.

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

At the moment we have implemented the solution from Radu (using the concurrent GC). As we are talking about a production instance here, we can only change things like this during a downtime window. This has happened Sunday night. So we are collecting data since about 10 hours.

So far no lengthy GC outages have been observed and the gc.log indicates the longest time that the concurrent GC did spend was about 1 sec.

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

1) Java heap is too big, agree. 2) Smaller young generation, agree 3) ok for <tt>DisableExplicitGC and </tt>UseParallelOldGC.

Please keep us informed with your case.

TIA,

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

Intellectual curiosity: which solution did you choose ?

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

Ultimately we are running the solution as proposed by Atlassian support:

<tt>-XX:+UseParallelOldGC </tt><tt>-XX:NewSize=1229m</tt>

and adjusted Min and Max Heap to 1.5 GB / 3 GB, so that the delta is not that big.

Thanks a lot for all the help

maba

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

Glad you solve it.

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

Hello,

sorry for re-opening this thread but can anyone assist me with a guide on how to implement this solution;

Been struggling for over a week and running into GC Allocation errors as above.

Here is the proposal from Atlassian support. It' slightly different:

Based on the Database statistic, it seems that this is a medium size instance. Based on the GCViewer image as per attached in the screenshot GClog, it seems that the timespent for the GC is really high for on the 27th July 2012. We also notice that the heap usage was high during the day. Since the time taken for the GC full GC is high, this might be cause by the size of the heap that Full GC need to clear is very large.
Therefore, we would like to recommend to adjust the current JVM input parameter to the following. In Linux, please edit the <tt>JIRA_Install/bin/setenv.sh</tt>:
Use a parallel collector: <tt>-XX:+UseParallelOldGC</tt>. This will ensure that GC will perform the Garbage Collection more frequently. But, every collection should be small thus the lag time should be reduce.
Disable remote clients from triggering a full GC event <tt>-XX:+DisableExplicitGC</tt>
Set the Young space up to 30-40% of the overall heap. Eg: <tt>-XX:NewSize=1229m</tt>
optionally, reduce the size of the -Xmx to 3GB because as a benchmark we allocate 1 GB of heap to instance with 100,000 of issues. Reducing the -Xmx will reduce the amount of heap that GC need to clear for each Full GC.
Do let us know how it goes.

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

0 votes

-bash-3.2$ ps -ef |grep jira
jira       956 14625  0 11:33 pts/0    00:00:00 ps -ef
jira       957 14625  0 11:33 pts/0    00:00:00 grep jira
jira     12370     1 13 Jul25 ?        07:14:17 /usr/java/latest/bin/java -Djava.util.logging.config.file=/opt/apps/atlassian/jira-enterprise/conf/logging.properties -XX:MaxPermSize=512m -Xms1024m -Xmx4096m -Djava.awt.headless=true -Datlassian.standalone=JIRA -Dorg.apache.jasper.runtime.BodyContentImpl.LIMIT_BUFFER=true -Dmail.mime.decodeparameters=true -Dhttp.proxyHost=webproxy.verigy.net -Dhttp.proxyPort=80 -Dhttp.nonProxyHosts=*.verigy.net|localhost -Djira.jelly.on=true -Dmail.mime.decodeparameters=true -XX:+PrintGC -XX:+PrintGCDetails -Xloggc:/opt/apps/atlassian/jira-enterprise/logs/gc.log -XX:+PrintGCDateStamps -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.endorsed.dirs=/opt/apps/atlassian/jira-enterprise/endorsed -classpath /opt/apps/atlassian/jira-enterprise/scripts/production/workflow/groovy/common:/opt/apps/atlassian/jira-enterprise/scripts/production/workflow/groovy/postactions:/opt/apps/atlassian/jira-enterprise/scripts/production/workflow/groovy/validators:/opt/atlassian/jira-data/scripts:/opt/apps/atlassian/jira-enterprise/bin/bootstrap.jar -Dcatalina.base=/opt/apps/atlassian/jira-enterprise -Dcatalina.home=/opt/apps/atlassian/jira-enterprise -Djava.io.tmpdir=/opt/apps/atlassian/jira-enterprise/temp org.apache.catalina.startup.Bootstrap start
root     14623 13614  0 09:49 pts/0    00:00:00 su - jira
jira     14625 14623  0 09:49 pts/0    00:00:00 -bash
postgres 31055 23969  0 Jul26 ?        00:03:09 postgres: jirausr jiradb 127.0.0.1(21248) idle
postgres 31072 23969  0 Jul26 ?        00:02:26 postgres: jirausr jiradb 127.0.0.1(21254) idle
postgres 31077 23969  0 Jul26 ?        00:03:54 postgres: jirausr jiradb 127.0.0.1(21258) idle
-bash-3.2$

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

0 votes

Could you please share the java arguments that you're using for running JIRA?

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Forums

Product Q&A

Community resources

Support

Top groups

Community resources

Support

Learn

Community resources

Support

Events

Community resources

Support

Java GC blocks Jira

4 answers

1 accepted

Suggest an answer

Was this helpful?

Thanks!

TAGS

Community showcase

Atlassian Community Events