Every Day Confluence dies. PID file exists with no process

Every day without fail I have to log into my Confluence box and run the start-confluence.sh script.

It is running on top of a CentOS 6.3 (64-bit) install. Confluence is running with a MySQL (5.1.67) database on a dedicated box. The VM has been provisioned with 2 CPUs @ 2.0 GHz, 4 GB of RAM, and 2 GB of swap.

The hard drive partitioning is as follows:

Filesystem                            Size  Used Avail Use% Mounted on
/dev/mapper/vg_confluence-root        9.9G  2.6G  6.8G  28% /
tmpfs                                 1.9G   88K  1.9G   1% /dev/shm
/dev/sda1                             194M   54M  131M  30% /boot
/dev/mapper/vg_confluence-confluence   82G  622M   77G   1% /confluence
/dev/mapper/vg_confluence-home        5.0G  139M  4.6G   3% /home

After the restart the application runs as expected, but during long periods of idleness Confluence stops responding.

5 answers

Some obvious questions:

1. What error(s) do you see in your log files?
2. What errors do people see in their browser when Confluence stops?
3. What version of Confluence?
4. When did this start occurring?
5. Have you installed or updated Confluence and/or any plugins recently?

2013-02-24 17:09:21,038 WARN [http-8080-1] [jersey.spi.inject.Errors] processErrorMessages The following warnings have been detected with resource and/or provider classes:
2013-02-24 17:09:30,584 ERROR [http-8080-4] [atlassian.event.internal.EventPublisherImpl] invokeListeners There was an exception thrown trying to dispatch event 'com.atlassian.confluence.event.events.content.mail.notification.ContentNotificationAddedEvent[source=com.atlassian.confluence.mail.notification.DefaultNotificationManager@234bb715]' from the invoker 'com.atlassian.event.internal.SingleParameterMethodListenerInvoker@6c911796'.
at com.atlassian.confluence.util.ConfluenceErrorFilter.doFilter(ConfluenceErrorFilter.java:22)
at com.atlassian.confluence.servlet.FourOhFourErrorLoggingFilter.doFilter(FourOhFourErrorLoggingFilter.java:65)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)

Feb 24, 2013 5:09:21 PM com.sun.jersey.spi.inject.Errors processErrorMessages
WARNING: The following warnings have been detected with resource and/or provider classes:
WARNING: A HTTP GET method, public javax.ws.rs.core.Response com.atlassian.confluence.tinymceplugin.rest.DraftsResource.getDrafts(int,int), should not consume any entity.
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x0000003f1629c188, pid=2098, tid=140462108604160
#
# JRE version: 6.0_26-b03
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libc.so.6+0x9c188] __gettimeofday+0x18
#

It does reference a file, hs_err_pid2098.log, but the file looks to be almost all memory/core dumps.

So let me pick off a few of these. First I should state that I attached the logs when submitting the ticket to Atlassian. I assumed they would show up here, but they obviously did not.

2) The browsers hang and eventually time out

3) Confluence 4.3.7

4) Since I first completed the install 2-3 weeks ago

5) I have installed no plugins and have not updated Confluence, assuming that the binary I downloaded from the website was the most current.

1) I see a few Java exceptions similar to this:

2013-02-23 22:08:28,011 ERROR [scheduler_Worker-5] [sf.hibernate.util.JDBCExceptionReporter] logExceptions Communications link failure
2013-02-23 22:08:28,015 ERROR [scheduler_Worker-6] [sf.hibernate.util.JDBCExceptionReporter] logExceptions No operations allowed after connection closed.Connection was implicitly closed by the driver.
2013-02-23 22:08:28,016 ERROR [scheduler_Worker-5] [sf.hibernate.util.JDBCExceptionReporter] logExceptions Communications link failure
2013-02-23 22:08:28,018 ERROR [scheduler_Worker-5] [confluence.schedule.quartz.ConfluenceQuartzThreadPool] run Error while executing the Runnable:
at net.sf.hibernate.exception.ErrorCodeConverter.handledNonSpecificException(ErrorCodeConverter.java:90)
at net.sf.hibernate.exception.ErrorCodeConverter.convert(ErrorCodeConverter.java:79)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1119)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3486)



There are issues between Confluence and MySQL, given that you have JDBC errors.

We don't run MySQL, so someone else may be able to help, but I'd be re-reading the documentation about database connectivity. Also have a look at your MySQL logs, and possibly your server logs.

https://confluence.atlassian.com/display/CONF43/Database+Setup+For+MySQL

I presume your VM is still running when Confluence hangs.

Do you get the same error when trying to run locally on the server (i.e. ssh in, then use 'links' to log in)?

Stuart, thanks for the link. That is precisely the one I followed to set up Confluence in the first place. I did look at the server logs, and I will poke through the MySQL logs when I get a chance.

As to connecting locally, yes, it's the same response from links (or forwarding Firefox over X). Because the process goes away but the PID file is still there, it seems to be a hard crash.

My apologies, this is the first time I have used the Answers website. I put that information in the ticket as well, but it did not transfer here either.

When you run the start-confluence.sh script you get a message similar to "PID exists but no process found." The exact error escapes me, though when (not if) it happens again I will post it.
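
A quick way to confirm the stale-PID state before restarting might look like the sketch below. The helper name check_pid_file is made up for illustration, and the path in the usage note assumes the PID file lives where Tomcat reports CATALINA_PID at startup. Note that kill -0 sends no signal but can wrongly report "stale" when run as a user who is not allowed to signal the Confluence process, so run it as the Confluence user or root.

```shell
# Hypothetical helper: report whether the PID recorded in a PID file
# still belongs to a live process.
check_pid_file() {
    pid_file="$1"
    if [ ! -f "$pid_file" ]; then
        echo "no pid file"
        return 0
    fi
    pid=$(cat "$pid_file")
    # kill -0 sends no signal; it only checks that the process exists
    # and that we are allowed to signal it.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "stale"
    fi
}
```

For example, check_pid_file /opt/atlassian/confluence/work/catalina.pid printing "stale" would suggest the JVM died without cleaning up after itself, which fits a hard crash rather than a normal shutdown.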

Thanks for your time on this

I have had another crash:

2013-02-26 15:41:42,270 ERROR [http-8080-1] [atlassian.event.internal.EventPublisherImpl] invokeListeners There was an exception thrown trying to dispatch event 'com.atlassian.confluence.event.events.content.mail.notification.ContentNotificationAddedEvent[source=com.atlassian.confluence.mail.notification.DefaultNotificationManager@7dfeb118]' from the invoker 'com.atlassian.event.internal.SingleParameterMethodListenerInvoker@3f897240'.

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (safepoint.cpp:308), pid=5868, tid=140267451959040
#  guarantee(PageArmed == 0) failed: invariant
#

Steve,

I think that java version (which I think is 1.6.0_26?) may be problematic.

I have nearly the same setup as you but have been running Java(TM) SE Runtime Environment (build 1.6.0_32-b05) for quite some time without problems.

Are you using the .tar.gz or the installer?

I am using the binary installer for Confluence and the repo packages for Java.

Are you recommending that I build java from source?

I'd be surprised if that was the problem, but it's always good to go with the latest Oracle Java: the Java 1.6 update 41 JDK.

http://www.oracle.com/technetwork/java/javase/downloads/jdk6downloads-1902814.html

Not build from source - just unpack the .tar.gz and shove it in a folder.

Actually the latest JDK is 1.7.0_15 but you do not want to be going there. The one stuartu recommends is good.

You can be certain that at least one component of your install is bad - just keep replacing bits till it works like I do ;)

Check your MySQL install from top to bottom, check the JDBC driver, reinstall everything, etc.

If still no luck then you need to get your crashdumps which I have found to be not that straightforward depending on the nature of the crash.

http://middlewaremagic.com/weblogic/?p=4482

http://weblogs.java.net/blog/kohsuke/archive/2009/02/crash_course_on.html

I have upgraded to the version of Java linked here, verified the version, and started Confluence. I will give an update in a few days when stability can be determined.

Mar 11, 2013 6:03:51 AM com.sun.jersey.spi.container.servlet.WebComponent filterFormParameters
WARNING: A servlet POST request, to the URI http://confluence/rest/dashboardmacros/1.0/updates, contains form parameters in the request body but the request body has been consumed by the servlet or a servlet filter accessing the request parameters. Only resource methods using @FormParam will work as expected. Resource methods consuming the request body by other means will not work as expected.
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x0000003f1629c188, pid=2087, tid=139882862630656
#
# JRE version: 6.0_26-b03
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libc.so.6+0x9c188] __gettimeofday+0x18
#
# An error report file with more information is saved as:
# /opt/atlassian/confluence/bin/hs_err_pid2087.log
#
# If you would like to submit a bug report, please visit:
# http://java.sun.com/webapps/bugreport/crash.jsp

# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x0000003f1629c188, pid=2087, tid=139882862630656
#
# JRE version: 6.0_26-b03
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libc.so.6+0x9c188] __gettimeofday+0x18
#
# If you would like to submit a bug report, please visit:
# http://java.sun.com/webapps/bugreport/crash.jsp
#

--------------- T H R E A D ---------------

Current thread (0x00007f3908cfb000): JavaThread "com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#1" daemon [_thread_blocked, id=2243, stack(0x00007f3904465000,0x00007f3904566000)]

> C [libc.so.6+0x9c188] __gettimeofday+0x18

Is something up with that libc package? Is it an old version? Are all your packages up to date?

The latest OS update was Feb 26, 2013. So I may be a couple of weeks behind but on the whole I would say I am well patched.

Even if something is wrong with libc, there is not much I can do about it until an update comes down the pipes. There is no way I am going to compile libc from source.

> # JRE version: 6.0_26-b03

If you installed the latest (Oracle) Java 1.6, then you aren't using it. The version should be 6.0_43.

You would also want the JDK, not the JRE.

http://www.oracle.com/technetwork/java/javase/downloads/jdk6downloads-1902814.html

On linux, I have the JAVA_HOME environment variable set.

I presume you run Confluence as a different user from root. Double-check that the java -version command gives the correct version there too.

Java versions change quite quickly. I wouldn't be too concerned with the minor difference, but 43 would obviously be preferred over 41.

That's interesting. When I do this:

[root@confluence ~]# java -version
java version "1.6.0_41"
Java(TM) SE Runtime Environment (build 1.6.0_41-b02)
Java HotSpot(TM) 64-Bit Server VM (build 20.14-b01, mixed mode)

I also tried this:

[root@confluence ~]# /usr/java/jdk1.6.0_41/bin/java -version
java version "1.6.0_41"
Java(TM) SE Runtime Environment (build 1.6.0_41-b02)
Java HotSpot(TM) 64-Bit Server VM (build 20.14-b01, mixed mode)

I do note that /usr/java/jdk1.6.0_41/bin is not in my PATH. Should it be?

Also, these are the packages I downloaded at the time it was recommended:

jdk-6u41-linux-amd64.rpm
jdk-6u41-linux-x64-rpm.bin

[root@confluence ~]# which java
/usr/bin/java

Just for reference

[confluence1@confluence ~]$ java -version
java version "1.6.0_41"
Java(TM) SE Runtime Environment (build 1.6.0_41-b02)
Java HotSpot(TM) 64-Bit Server VM (build 20.14-b01, mixed mode)

I will try setting the JAVA_HOME variable when I get a chance to bring it down again (or it crashes).

I have set JAVA_HOME in the .bashrc of confluence1.

We will see if it makes a difference...

It's still crashing, making this service horribly unreliable.

I noticed the following this morning, which had escaped me before:

Server startup logs are located in /opt/atlassian/confluence/logs/catalina.out
Using CATALINA_BASE:   /opt/atlassian/confluence
Using CATALINA_HOME:   /opt/atlassian/confluence
Using CATALINA_TMPDIR: /opt/atlassian/confluence/temp
Using JRE_HOME:        /opt/atlassian/confluence/jre/
Using CLASSPATH:       /opt/atlassian/confluence/bin/bootstrap.jar
Using CATALINA_PID:    /opt/atlassian/confluence/work/catalina.pid

Do I need to set JRE_HOME? This output is generated by running /opt/atlassian/confluence/bin/start-confluence.sh
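
One thing worth checking here: the crash headers earlier in the thread still report "JRE version: 6.0_26-b03", while java -version in the shell reports 1.6.0_41, so the JVM that crashes is probably not the one on the PATH. The sketch below pulls the JRE version out of an hs_err file so it can be compared with the Java you believe you are running. The helper name crash_jre_version is hypothetical, and it assumes the header line has the same "# JRE version: ..." form as the excerpts above.

```shell
# Hypothetical helper: print the JRE version recorded in a HotSpot
# hs_err_pid<pid>.log crash report.
crash_jre_version() {
    # Header line looks like: "# JRE version: 6.0_26-b03"
    sed -n 's/^# JRE version: *//p' "$1" | head -n 1
}
```

For example, crash_jre_version /opt/atlassian/confluence/bin/hs_err_pid2087.log (the path from the crash output above). If it prints a version other than the one you installed, the startup script is likely picking up a different JRE, such as the bundled one shown under JRE_HOME.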

I have upgraded to CentOS 6.4, which seemed to make a slight difference, but Confluence is still crashing.

I have tried running the start-confluence script as confluence1 and also running with -fg. Neither seems to make any difference to the longevity of the process. Almost every morning Confluence is unavailable with a stale PID file.

I changed JRE_HOME in setenv.sh so that it now points to the Java 1.6.0_41 JRE. We will see whether this makes a difference.

I also increased the amount of RAM for Java from 256/512/256 to 1024/2048/512.
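
If the three numbers are initial heap, maximum heap, and PermGen size (an assumption; the post does not say), the change in setenv.sh would look roughly like the fragment below. The flags are standard HotSpot options for Java 6, but the exact line in a given setenv.sh may be laid out differently.

```shell
# Sketch of the JVM memory settings in bin/setenv.sh.
# Assumed mapping of 1024/2048/512: -Xms / -Xmx / -XX:MaxPermSize (in MB).
JAVA_OPTS="-Xms1024m -Xmx2048m -XX:MaxPermSize=512m ${JAVA_OPTS:-}"
export JAVA_OPTS
```

On a VM with 4G of RAM, a 2048 MB heap plus 512 MB of PermGen leaves roughly 1.5G for the OS and the JVM's native memory, so it may be worth watching swap usage after the change.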
