Confluence Unavailable all of a sudden, no errors in logs, restart not helping

Keith Leo-Smith
I'm New Here
April 18, 2019

Hi,

 

A couple of days ago my Confluence server stopped responding and I cannot understand why. Usually there is an error in the logs, but when I restart there is no error, and netstat shows nothing bound to the port.

 

It is running on an 8 GB RAM VPS and I am the only user on the server. It has been working for 2 years without issue. There are plenty of resources available. I have noticed that the CPU has been maxing out since the Confluence server stopped responding. Funnily enough, the service using the CPU is a Confluence service, but I'm not sure which one is consuming 100% CPU, or whether this is even the reason my server is not responding.

 

The startup logs (catalina.out) show the following after a restart: (I have restarted several times)

18-Apr-2019 20:30:42.841 WARNING [main] org.apache.tomcat.util.digester.SetPropertiesRule.begin [SetPropertiesRule]{Server} Setting property 'debug' to '0' did not find a matching property.
18-Apr-2019 20:30:42.957 WARNING [main] org.apache.catalina.startup.SetAllPropertiesRule.begin [SetAllPropertiesRule]{Server/Service/Connector} Setting property 'debug' to '0' did not find a matching property.
18-Apr-2019 20:30:42.974 WARNING [main] org.apache.tomcat.util.digester.SetPropertiesRule.begin [SetPropertiesRule]{Server/Service/Engine} Setting property 'debug' to '0' did not find a matching property.
18-Apr-2019 20:30:42.988 WARNING [main] org.apache.tomcat.util.digester.SetPropertiesRule.begin [SetPropertiesRule]{Server/Service/Engine/Host} Setting property 'debug' to '0' did not find a matching property.
18-Apr-2019 20:30:43.045 WARNING [main] org.apache.tomcat.util.digester.SetPropertiesRule.begin [SetPropertiesRule]{Server/Service/Engine/Host/Context} Setting property 'debug' to '0' did not find a matching property.
18-Apr-2019 20:30:43.083 WARNING [main] org.apache.tomcat.util.digester.SetPropertiesRule.begin [SetPropertiesRule]{Server/Service/Engine/Host/Context} Setting property 'debug' to '0' did not find a matching property.
18-Apr-2019 20:30:43.626 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-nio-8090"]
18-Apr-2019 20:30:43.665 INFO [main] org.apache.tomcat.util.net.NioSelectorPool.getSharedSelector Using a shared selector for servlet write/read
18-Apr-2019 20:30:43.676 INFO [main] org.apache.catalina.startup.Catalina.load Initialization processed in 975 ms
18-Apr-2019 20:30:43.685 INFO [main] org.apache.catalina.core.StandardService.startInternal Starting service Tomcat-Standalone
18-Apr-2019 20:30:43.685 INFO [main] org.apache.catalina.core.StandardEngine.startInternal Starting Servlet Engine: Apache Tomcat/8.0.51

It never seems to get past this stage.

 

My server.xml shows:

<Server port="8000" shutdown="SHUTDOWN" debug="0">
<Service name="Tomcat-Standalone">
<!--
==============================================================================================================
DEFAULT - Direct connector with no proxy, for unproxied HTTP access to Confluence.

If using a http/https proxy, comment out this connector.
==============================================================================================================
-->
<Connector port="8090" connectionTimeout="20000" redirectPort="8443"
maxThreads="48" minSpareThreads="10"
enableLookups="false" acceptCount="10" debug="0" URIEncoding="UTF-8"
protocol="org.apache.coyote.http11.Http11NioProtocol"/>

 

 

A netstat shows the following: [netstat -tulpn | grep LISTEN]

root@wiki:/opt/atlassian/confluence/bin# netstat -tulpn | grep LISTEN
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1465/sshd
tcp 0 0 127.0.0.1:5432 0.0.0.0:* LISTEN 1619/postgres
tcp6 0 0 :::8091 :::* LISTEN 2180/java
tcp6 0 0 127.0.0.1:8005 :::* LISTEN 1657/java
tcp6 0 0 :::8080 :::* LISTEN 1657/java
tcp6 0 0 :::22 :::* LISTEN 1465/sshd

No Confluence? That would explain why curl http://localhost:8090 or curl http://localhost:80 says connection refused.

 

The CPU usage shows the following:

PID   USER     PR NI VIRT    RES   SHR S  %CPU   %MEM  TIME+     COMMAND
3108 conflue+ 20 0 253040 6384 0 S 196.0 0.2 11693:13

What is this service that is consuming all my CPU?

 

I'm hoping somebody can assist me.

 

Kind regards,

keith

1 answer

Daniel Eads
Atlassian Team
April 18, 2019

Hey Keith, welcome to the Atlassian Community!

First of all, thank you for providing so much detail in your question. This is the kind of information that is extremely helpful in troubleshooting, and I really appreciate that you took the time to include it all.

Unfortunately, from the CPU load you're seeing and the timeframe you mentioned, I believe your instance was affected by an opportunistic attack against the CVE-2019-3396 Widget Connector vulnerability from March 20th (see Confluence Security Advisory - 2019-03-20). We've seen an infection going around that injects malware, and the bitcoin miner it tries to run uses all the CPU available on the box. Initially the Kerberods malware was being deployed as the payload, but other attacks might be trying to inject different payloads.

I'd recommend tackling things in this order:

  1. Kill malicious processes
  2. Clean up your crontab
  3. Upgrade Confluence
  4. Use a malware scanner to find remaining malware traces

Malicious processes

The top command will help you find processes (probably running under the confluence user account) that are consuming a large amount of CPU. If Confluence is currently stopped, you can probably plan on killing any processes running as the confluence user. Note the process ID (PID) from the top output and then kill the process using kill -9 followed by the PID. Example:

sudo kill -9 12395
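
If you'd rather get a list of candidate PIDs than read them off top by eye, something like the following might help. It is only a rough sketch: the function name high_cpu_pids and the 50% threshold are my own illustrative choices, and it assumes the procps version of ps found on most Linux distributions. It only prints PIDs; it does not kill anything.

```shell
# Hypothetical helper (not an Atlassian tool): reads ps output on stdin and
# prints the PIDs owned by the given user whose %CPU is above the threshold.
# Expected columns: USER PID %CPU COMMAND..., with one header line.
high_cpu_pids() {
  user="$1"
  threshold="$2"
  awk -v u="$user" -v t="$threshold" '
    NR > 1 && $1 == u && ($3 + 0) > (t + 0) { print $2 }
  '
}

# Example usage (commented out so that sourcing this file runs nothing).
# user:32 widens the column so "confluence" is not cut to "conflue+".
# ps -eo user:32,pid,pcpu,args | high_cpu_pids confluence 50
```

Look at each PID it prints (for example with ps -fp followed by the PID) before passing it to kill, since a legitimately busy Confluence JVM would also show up here.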

Clean up your crontab

Since most malware adds a cronjob that relaunches the malware every few minutes, you'll also need to check the crontab file and remove any suspicious-looking entries. For Ubuntu, this is stored in the /var/spool/cron/crontabs/ directory. Normally you should use the crontab command to edit the crontab, but for cleanup purposes we'll be inspecting the file for any pre-existing entries.

Using vim (or whichever text editor you're comfortable with), you'll open the file and remove suspicious-looking jobs.

sudo vim /var/spool/cron/crontabs/confluence

Confluence comes up on system startup through the SysV/systemd daemons, so we would expect the confluence user's crontab to not exist under normal circumstances. It's most likely the case that any entries in this file are malicious, but make sure you check them before deleting them entirely.
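
Before deleting anything, it can help to print only the active entries and flag the ones that look like downloaders. The sketch below is just one way to do that: review_cron is a made-up name, and the patterns it flags (curl, wget, base64, paths under /tmp) are my assumption about what typical miner droppers look like, not a complete signature list.

```shell
# Hypothetical helper: prints the non-comment, non-blank lines of a crontab
# file, prefixing lines that match common downloader patterns with SUSPECT.
# The pattern list is an illustrative assumption and is far from exhaustive.
review_cron() {
  awk '
    /^[ \t]*(#|$)/             { next }                        # skip comments and blank lines
    /curl|wget|base64|\/tmp\// { print "SUSPECT  " $0; next }  # likely dropper entry
                               { print "review   " $0 }        # anything else still needs a look
  ' "$1"
}

# Example usage (commented out so that sourcing this file runs nothing):
# sudo crontab -u confluence -l > /tmp/confluence.cron && review_cron /tmp/confluence.cron
```

Entries marked "review" are not necessarily safe; the prefix only means they didn't match the simple patterns above.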

Upgrade Confluence

Once your CPU is under control and new malicious processes aren't spawning, you need to upgrade Confluence to a version that isn't affected by the vulnerability. I'd recommend looking at one of these versions (latest releases as of this post):

Use a malware scanner

Finally, you need to clean up any remaining traces of malware on your system. The LSD malware cleanup tool will be useful for removing the Kerberods malware. Other malware payloads might need different cleanup tools depending on which attack and payload were used. A good starting place for detecting other types of infections is the set of scanners linked here. Once a particular infection is identified, googling for "____ removal tool" is a good next step if the scanner was unable to remove the malware automatically.

Please let me know if you have more questions!
Daniel | Atlassian Support
