Database spikes & locks - can't find the root cause.

Błażej O_
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
May 8, 2014

Hello.

Lately we are experiencing problems with database connection spikes. There seem to be locks on the database.

We will be very thankful for any hints where to look for the cause.

We have tried to find a root cause to the problem, but after days of cleaning logs, and digging for hints we are left only with countless scattered symptoms, which we are even unsure if those are connected (damn, feels like in House episode).

Problems started approximately when we have connected one of our departments to Jira, thus doubling issue count and active users count. We are also strongly dependant on Kepler Romino's Database and SIL custom fields. SIL and database fields are HEAVILY used by the new department. No more than week later we have upgraded our instance from 5.2.5 to 6.1.7 as well as all (~40) plugins to the newest available versions. Now we can't tell what problems are caused by the heavy load, which by the upgrade of Jira, and which by some plugins.

We looked in atlassian-jira.log but there is nothing that would appear during or before each spike.

We have enabled SQL logging, but the amount of data logged to the file is so insane (20MB each 3-4 minutes) that it is simply impossible to find any hint out there...

We have sat and waited for the spikes to appear to peek the database connections. We have found out that spikes are caused by the locks appearing on the database, but there is a complete randomness in the command that begins the lock avalanche. Sometimes selects, sometimes updates, sometimes inserts. When we manually kill the locking connection on the database on time, spike is ended immediately. But we still don't know what sparked the locks.

We will be glad for any help or directions where too look for the cause of the problem.

4 answers

1 accepted

1 vote
Answer accepted
Nic Brough -Adaptavist-
Community Leader
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
May 8, 2014

Three things I'd look at:

1. Your spikes of activity are mostly suspiciously evenly spaced. I'd do a bit more monitoring and see if the spikes are forming patterns of usage. Like 11:20am Mon-Fri.

2. You say "the amount of data logged to the file is so insane (20MB each 3-4 minutes) that it is simply impossible to find any hint out there" - have you tried some basic analysis? Simple numbers can tell you a lot - for example "95% of our SQL log is Select X from Y where Z" (i.e. always the same query), that could tell you a lot on its own.

3. One quick trick to rule something out - I worked at a place where the users were effectively DoS attacking Jira with their "clever" scripts (i.e. hideously inefficient and they could have done almost all of it inside Jira if they'd bothered to use their brain). Try Admin -> General settings -> Allow remote calls = off. If there's no effect on the database and it still spikes, you've ruled out accidental DoS. If the datbase looks a lot happier, you can then go digging through the remote call logging...

4. Oops. And what Florin said too. :-)

shambhu sanjay kumar February 13, 2018

I too facing similar issue. Any clue on this? Is this issue resolved?

Nic Brough -Adaptavist-
Community Leader
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
February 13, 2018

It's the same answer - you need to work out what your system is doing.  Work through the points Florin and I made four years ago.

Błażej O_
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
February 13, 2018

@shambhu sanjay kumar Exactly what Nic said, there may be many possible reasons for the DB spikes.

In our case there where at least two independent sources of the database load spikes. The major one was some kind of "firewall update mechanism" on the machine that hosted the database. I have been told that it periodically tried to perform some upgrades causing a slow down and "traffic jam" on all databases hosted on the machine. Since then we have moved Jira database to the separate machine and database spikes ended. So the cause was misconfiguration of the database server.

 

When you're troubleshooting Jira load be sure to try out the really useful tool JavaMelody for Jira mentioned by @Rp Subhub. Also google up how to enable garbage collector logging and get a tool to analyze these logs (for example GCViewer).

1 vote
Florin Manaila
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
May 8, 2014

Hello Blazej,

First off, may I suggest you take a few thread dumps on the JVM when you notice this happening. It may help identify who is waiting on the results.

Then, unless you are using the Database Custom field plugin to access the JIRA database and not an external one, I doubt it is to blame here.

I've seen similar behavior before (actually even worse) but could never find a common root cause. The point here is that it can really come from anywhere, especially plugins that do stuff in the DB inside a JIRA action (I'm thinking post-functions and listeners). If you have complex SIL scripts for your SIL Scripted Custom Fields, I would try and comment out, if possible, the scripts that rely on other issues to calculate their values.

Anyway, I believe the thread dumps should tell you more about what's happening in there.

0 votes
Norman Abramovitz
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
May 9, 2014

I did not see which database you are using. Maybe using the actual database tools would give you a different perspective. You should be able to see which statements are actually locking. If you then search for the remaining statements using those tables, it might give you a better idea what is going on.

0 votes
Rp Subhub
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
May 9, 2014

Do you have JavaMelody Monitoring Plugin? It shows current requests (you may take a look at stack traces of long-running) and stack traces for threads using jdbc connections.

Suggest an answer

Log in or Sign up to answer