Recommended settings for our JVM

Ubisoft
Rising Star
July 15, 2013

Hello,

We are experiencing problems with our most heavily loaded instance. Our connection pools hit the maximum (we doubled it, up to 80), and we also see slowdowns across the instances. Any JVM optimization would help.

We are wondering whether the Groovy scripts running on our instance could be the cause. We have scripts that:

Run on issue creation (to set a custom field automatically from two other custom fields) and on workflow transitions

Create issues automatically and link them to each other

Set the component and the assignee automatically, depending on field values

All of these are wrapped in try/catch, naturally, so we would see any errors.

What JVM settings would you suggest?

Right now we have this (Tomcat 6):

-Djava.util.logging.config.file=/mnt/jira01_vol1/tomcat-81/conf/logging.properties -Duser.timezone=EST5EDT -Dfile.encoding=UTF-8 -Dmail.debug=false -Xms6144m -Xmx6144m -Xmn2048m -XX:MaxPermSize=512m -XX:+UseCompressedOops -XX:+UseTLAB -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseParallelGC -XX:+DisableExplicitGC -XX:+OptimizeStringConcat -XX:+DoEscapeAnalysis -XX:ParallelGCThreads=12 -XX:+UseParallelOldGC -Xloggc:/mnt/jira01_vol1/tomcat-81/logs/gc.log -Djira.jelly.on=true -Djira.home=/mnt/jira01_vol1/home/jira81 -Dsun.rmi.dgc.client.gcInterval=900000 -Dsun.rmi.dgc.server.gcInterval=900000 -Dcom.sun.management.jmxremote.port=10081 
-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.password.file=/etc/tomcat/defaults/jmx.password -Dcom.sun.management.jmxremote.access.file=/etc/tomcat/defaults/jmx.access -Dcom.sun.management.jmxremote.registry.ssl=false -Dhttp.proxyHost=proxy.mdc.ubisoft.org -Dhttp.proxyPort=3128 -Dhttps.proxyHost=proxy.mdc.ubisoft.org -Dhttps.proxyPort=3128 -Dhttp.nonProxyHosts=localhost\|*.ubisoft.org\|X.X.X.* -Dhttps.nonProxyHosts=localhost\|*.ubisoft.org\|X.X.X.* -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.util.logging.config.file=/opt/tomcat/conf/logging.properties -Dorg.apache.jasper.runtime.BodyContentImpl.LIMIT_BUFFER=true 
-Dmail.mime.decodeparameters=true -Djava.endorsed.dirs=/opt/tomcat/endorsed -Dcatalina.base=/mnt/jira01_vol1/tomcat-81 -Dcatalina.home=/opt/tomcat -Djava.io.tmpdir=/mnt/jira01_vol1/tomcat-81/temp

We were thinking of adding these:

-XX:+UseCodeCacheFlushing

-XX:ReservedCodeCacheSize=512m

-XX:+CMSClassUnloadingEnabled

-XX:+UseConcMarkSweepGC

Groovy creates classes dynamically, but the default Java VM does not GC the PermGen. If you are using Java 6 or later, add -XX:+CMSClassUnloadingEnabled -XX:+UseConcMarkSweepGC. UseConcMarkSweepGC is needed to enable CMSClassUnloadingEnabled.

How much should we increase the Code Cache, and do you have any ideas about other settings?


Full GC :

Memory

Code Cache: 2% free (used: 47 MB, total: 48 MB)
PS Eden Space: 37% free (used: 1248 MB, total: 1978 MB)
PS Survivor Space: 66% free (used: 12 MB, total: 35 MB)
PS Old Gen: 59% free (used: 1677 MB, total: 4096 MB)
PS Perm Gen: 51% free (used: 250 MB, total: 512 MB)

2013-07-16T07:29:21.198+0000: 23101.591: [Full GC [PSYoungGen: 14034K->0K(2078016K)] [ParOldGen: 4186927K->1450226K(4194304K)] 4200962K->1450226K(6272320K) [PSPermGen: 254379K->248703K(503360K)], 40.8788140 secs] [Times: user=35.17 sys=222.72, real=40.87 secs] 
2013-07-16T09:22:28.999+0000: 29889.392: [Full GC [PSYoungGen: 29890K->0K(2047744K)] [ParOldGen: 4178539K->1338711K(4194304K)] 4208430K->1338711K(6242048K) [PSPermGen: 259038K->252730K(503360K)], 25.0574570 secs] [Times: user=38.01 sys=0.02, real=25.05 secs] 
2013-07-16T12:04:51.273+0000: 39631.666: [Full GC [PSYoungGen: 23508K->0K(2062848K)] [ParOldGen: 4182444K->1347254K(4194304K)] 4205953K->1347254K(6257152K) [PSPermGen: 260362K->255592K(503360K)], 20.1977170 secs] [Times: user=36.62 sys=0.02, real=20.20 secs]
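
To read the three Full GC entries above at a glance, a small parser can pull out the old-generation occupancy before and after each collection and the wall-clock pause. This is a minimal sketch, assuming the ParallelGC `-XX:+PrintGCDetails` line layout shown in this post; the regular expression and the names `FULL_GC_RE`, `parse_full_gc` and `LOG_LINES` are illustrative and not part of any JVM tooling.

```python
import re

# Captures the ParOldGen "before->after(capacity)" triple (in KB)
# and the real (wall-clock) pause time in seconds.
FULL_GC_RE = re.compile(
    r"\[Full GC .*?\[ParOldGen: (\d+)K->(\d+)K\((\d+)K\)\].*?real=([\d.]+) secs\]"
)


def parse_full_gc(line):
    """Return a summary dict for a Full GC log line, or None if it does not match."""
    m = FULL_GC_RE.search(line)
    if m is None:
        return None
    before_kb, after_kb, capacity_kb = (int(g) for g in m.groups()[:3])
    return {
        "old_before_pct": round(100 * before_kb / capacity_kb, 1),
        "old_after_pct": round(100 * after_kb / capacity_kb, 1),
        "reclaimed_mb": (before_kb - after_kb) // 1024,
        "pause_secs": float(m.group(4)),
    }


# The three Full GC lines quoted in the post above.
LOG_LINES = [
    ("2013-07-16T07:29:21.198+0000: 23101.591: [Full GC "
     "[PSYoungGen: 14034K->0K(2078016K)] "
     "[ParOldGen: 4186927K->1450226K(4194304K)] 4200962K->1450226K(6272320K) "
     "[PSPermGen: 254379K->248703K(503360K)], 40.8788140 secs] "
     "[Times: user=35.17 sys=222.72, real=40.87 secs]"),
    ("2013-07-16T09:22:28.999+0000: 29889.392: [Full GC "
     "[PSYoungGen: 29890K->0K(2047744K)] "
     "[ParOldGen: 4178539K->1338711K(4194304K)] 4208430K->1338711K(6242048K) "
     "[PSPermGen: 259038K->252730K(503360K)], 25.0574570 secs] "
     "[Times: user=38.01 sys=0.02, real=25.05 secs]"),
    ("2013-07-16T12:04:51.273+0000: 39631.666: [Full GC "
     "[PSYoungGen: 23508K->0K(2062848K)] "
     "[ParOldGen: 4182444K->1347254K(4194304K)] 4205953K->1347254K(6257152K) "
     "[PSPermGen: 260362K->255592K(503360K)], 20.1977170 secs] "
     "[Times: user=36.62 sys=0.02, real=20.20 secs]"),
]

if __name__ == "__main__":
    for line in LOG_LINES:
        print(parse_full_gc(line))
```

For these three lines, the old generation is above 99% of its 4 GB capacity before each full collection and drops to roughly a third afterwards, with pauses between 20 and 41 seconds.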

Instance statistics

Issues 126953
Projects 12
Custom Fields 174
Workflows 88
Attachments 96570
Comments 261330
Users 18522
Groups 508

Over 1 GB of access logs per day (5,000 users log in), plus a lot of SOAP/REST scripts (which we limit through QoS to 5 actions per second per IP).

Any help would be greatly appreciated.

Thanks

Martin

Ubisoft Montreal

2 answers

1 vote
Radu Dumitriu
Rising Star
July 15, 2013

I hope you realize you're on a forum and not on the official Atlassian support. That being my disclaimer:

1. You have long GC times, and those should be minimized. The idea is to downsize your heap a bit, for instance to 4 GB, since the GC will then have less to clear. Also, try setting the young generation size to 20-25% of the heap (for 4 GB total => 1 GB).

Also, you may want to use -XX:MaxGCPauseMillis=1000 to set a target pause time for your GC.

-XX:+UseParallelOldGC implies, AFAIK, -XX:+UseParallelGC, so there is no need to specify both; it's just confusing.

The rule for setting -XX:ParallelGCThreads is usually ncores - 1.

2. If you want to avoid long pauses in GC, use:

-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing

(overall performance may be a bit lower, but users should no longer notice pauses)

3. It's OK to increase the code cache size (-XX:ReservedCodeCacheSize), but start with 128m.

4. I do not know what to say about the SQL connection pools. They should be carefully monitored, so please take a look there as well: https://confluence.atlassian.com/display/JIRA/Tuning+Database+Connections
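
The sizing rules of thumb in point 1 can be worked through as a short calculation. This is only a sketch: the function name `suggest_flags` and its defaults are hypothetical, it assumes a 4 GB heap with the young generation at 25%, and it uses `os.cpu_count()` on the machine where the script runs as a stand-in for the server's core count.

```python
import os


def suggest_flags(heap_mb=4096, young_fraction=0.25, cores=None):
    """Build JVM flags from the rules of thumb in point 1 (illustrative only)."""
    if cores is None:
        cores = os.cpu_count() or 1  # cpu_count() may return None
    young_mb = int(heap_mb * young_fraction)  # e.g. 25% of 4096 MB = 1024 MB
    gc_threads = max(1, cores - 1)  # "ncores - 1", but never below one thread
    return [
        f"-Xms{heap_mb}m",
        f"-Xmx{heap_mb}m",
        f"-Xmn{young_mb}m",
        "-XX:+UseParallelOldGC",
        "-XX:MaxGCPauseMillis=1000",
        f"-XX:ParallelGCThreads={gc_threads}",
    ]


if __name__ == "__main__":
    print(" ".join(suggest_flags()))
```

Passing `young_fraction=0.20` gives the lower end of the suggested range, and `cores` can be set explicitly when the script is not run on the JIRA server itself.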


My 2 cents. Please search this forum; I have seen some nice tuning examples there.
0 votes
David Grierson February 8, 2017

Nothing particularly bad about your suggested parameters - here's what we're running our production environment with:

So, long story short: your parameters don't seem mental, and combining the two GCs is a strategy I've used over the years. It's interesting that Atlassian now appears to have standardised on G1GC, which I've also had decent experience with recently.
