GC overhead limit exceeded

Andrew DeFaria September 30, 2015

We've been experiencing the dreaded "GC overhead limit exceeded" error. I turned on GC logging and attempted to analyze the logs with GCViewer, but the version pointed to by https://confluence.atlassian.com/display/JIRAKB/Using+Garbage+Collection+Logs+to+Analyze+JIRA+Performance is old and fails to parse any of the log files I've collected (yes, I tried removing the date timestamps - it didn't help). And this page - https://github.com/chewiebug/GCViewer - has only source versions available, and the README instructions merely say to run "java -jar gcviewer-1.3x.jar", yet there are no .jar files in the package and no instructions are given for how to build this .jar file...
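My best guess - untested, and purely an assumption on my part since the README doesn't say - is that the GitHub project is a standard Maven build, so something like this ought to produce the jar (the exact jar name under target/ is also a guess):

    git clone https://github.com/chewiebug/GCViewer.git
    cd GCViewer
    mvn clean package
    java -jar target/gcviewer-1.3x.jar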

Suffice it to say, I could not analyze my log files with either of these tools!

Looking at https://confluence.atlassian.com/display/ENTERPRISE/Garbage+Collection+%28GC%29+Tuning+Guide, it seems to say that more memory is generally considered better. The machine we're using has 8 GB of memory, yet our memory settings for JIRA are JVM_MINIMUM_MEMORY="1g" and JVM_MAXIMUM_MEMORY="1g". Considering that the only thing this machine does aside from the OS is run JIRA, would it make more sense to use something like JVM_MINIMUM_MEMORY="1g" and JVM_MAXIMUM_MEMORY="7g", leaving a gig for the OS to use?
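For reference, here is what we have now and what I'm considering. On our install these settings live in JIRA's bin/setenv.sh (that path is from memory, so it may differ):

    # current
    JVM_MINIMUM_MEMORY="1g"
    JVM_MAXIMUM_MEMORY="1g"

    # what I'm considering - leave roughly a gig for the OS
    JVM_MINIMUM_MEMORY="1g"
    JVM_MAXIMUM_MEMORY="7g"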

I've tried reading https://confluence.atlassian.com/display/ENTERPRISE/Garbage+Collection+%28GC%29+Tuning+Guide, but it's very confusing to me because it uses terms I have no idea about, e.g.

The less short-lived allocations that are promoted or tenured from the new to the old generation and conversely the more long-lived allocations that are tenured from new to old, the more efficient the overall garbage collection of the JVM. This leads to higher throughput.

I don't know what long-lived, tenured, new, or old objects are.
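If it helps anyone explain this to me, here is the kind of thing I think the guide is describing - a rough sketch of my own (possibly wrong) understanding, not anything taken from the guide itself:

    import java.util.ArrayList;
    import java.util.List;

    public class GcGenerationsSketch {
        // Long-lived: referenced for the life of the program, so after surviving
        // a few minor (young-generation) collections these buffers get promoted,
        // or "tenured", into the old generation.
        private static final List<byte[]> retained = new ArrayList<>();

        public static void main(String[] args) {
            for (int i = 0; i < 1_000_000; i++) {
                // Short-lived: no reference survives the iteration, so this is
                // allocated in the new (young) generation and normally dies there
                // in a cheap minor collection.
                byte[] scratch = new byte[256];
                scratch[0] = (byte) i;

                // Occasionally keep one: these are the allocations that live long
                // enough to be tenured from the new generation into the old one.
                if (i % 10_000 == 0) {
                    retained.add(scratch);
                }
            }
            System.out.println("retained " + retained.size() + " long-lived buffers");
        }
    }

As I understand it, the scratch buffers are the short-lived objects that the new generation cleans up cheaply, while the retained ones survive long enough to be promoted into the old generation - but that understanding is exactly what I'm unsure about.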

GC overhead limit exceeded

I'm adding my comment here because when I tried to add a comment the normal way I got:

Your activity is currently limited because you've commented, answered or asked a question 1 times in the past 24 hours. These limits apply until you earn 3 points, then you can ask, answer and comment as much as you like.

That's just odd!

Here's the comment I wanted to attach to Nic's answer:

I don't understand why GC has to start and not stop until it's completed all of the GC. If memory is short then fine, start GC in its own task and let others continue. Give it a little time, say a few seconds, then stop it and continue onward. This is not unlike the multiprocessing that modern OSes have been doing for decades. Why does everything have to be halted, and why, if it is unable to reclaim enough unused objects, do you hang the whole JIRA process, requiring a restart? I do not have to configure memory pools for any language other than Java. But what do I know...

There were supposed to be tools for this, but the tools don't work (bad link). Incrementing the memory gradually means that I have to take the service down periodically, because you cannot change these amounts without a restart. Each restart represents a service interruption. Not a good way to do this.

While I consciously thought to leave a gig for the OS, I would think that even if I allocated all 8 GB to JIRA, the OS should intervene and say "No, I'm taking X meg/gig of memory for myself - after all, I'm the boss here".

And to address Rodrigo:

The OS mechanism to keep processes from hogging 100% of the memory should be to not honor the request that says "I want 100%", not to kill the process that asked.

2 answers

1 vote
Nic Brough -Adaptavist-
Community Leader
October 1, 2015

Massive jumps in memory allocation aren't a good idea. This is vastly oversimplified, but a garbage collection is "stop everything while I sweep up blobs of memory that were used but have been dropped and can't be reused until I've put them back in the pool". If you increase the pool of memory hugely, then you can make things worse, because garbage collection won't happen as often, but when it does, it'll take ages. That's just one problem you can cause; there are others.

I would try to find someone who understands GC logs (I don't know them well enough), but to fix the current error, try increasing the memory gradually - I usually go up in 128Mb increments, but with a 1Gb JIRA, I'd increase that to 1.5Gb initially and then increase in 128Mb steps if the problem persists. 
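Concretely, something along these lines, using the variable names from the question (illustrative values only):

    JVM_MINIMUM_MEMORY="1g"
    JVM_MAXIMUM_MEMORY="1536m"

    # if "GC overhead limit exceeded" comes back, step the maximum up
    # 128Mb at a time: 1664m, 1792m, 1920m, ...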

(Your instinct to leave the OS a gig to play with is excellent - too many times I've seen "we've allocated all the system memory to JIRA and it's got worse" - it's bound to when the OS has no space!)

0 votes
Rodrigo Girardi Adami
Atlassian Team
October 1, 2015

Awesome answer from Nic Brough.

My only extra advice would be to set the minimum and maximum memory to the same size. This has a damping effect on GC, so that collection only kicks in when the memory is really needed, and it will increase the performance of the instance.
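For example, building on Nic's 1.5 GB suggestion (illustrative values only):

    JVM_MINIMUM_MEMORY="1536m"
    JVM_MAXIMUM_MEMORY="1536m"

As far as I know these map to the JVM's -Xms and -Xmx options, so this is the same as setting -Xms equal to -Xmx.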

Gradually increasing memory is what we recommend, as Nic stated. Also, I've seen customer instances that mysteriously crashed, and it turned out they had set the instance memory to the limit of the physical RAM of the server. Most OSes have mechanisms to protect the system's normal functions when some application tries to use all the physical RAM. Killing the application is one of them.

Cheers,

Rodrigo 
