We recently experienced a period of extremely high (90%+) CPU usage on our 16-core server, which abated after about an hour. During this time no users reported any issues accessing Confluence, and monitoring information from Wily indicated that response times were normal. For all intents and purposes the system was functional and within normal tolerances. However, we would like to find out what caused the CPU spike and were wondering if there is any way to isolate which process caused it.
I don't have an answer, but we have exactly the same problem.
Twice in the last three weeks, the CPU went from 10% to about 30% for a couple of hours, and then spiked to 100% and never came down.
During this time, there were a lot of connections to the DB that were all idle, and usage in the application seemed unaffected. Even collaborative editing was behaving correctly.
Rebooting brought the CPU back to normal, but I don't have a resolution or an explanation.
We are suddenly seeing the same issue. Running top, the "khugepageds" process owned by confluence is pegged, reporting 198-200% CPU, load average of 2.3, and User CPU over 30%.
This is running on an AWS EC2 instance. I doubled its CPU, only to see it run fine for less than an hour and then max out again. (That kicks in AWS throttling, so it would probably consume all the CPU if it could.)
Before I doubled the CPU, the application wouldn't respond. After that, it works with a slight delay, but is still maxed out. Throwing more CPU at it will soon become unacceptable.
Kill the khugepageds processes, delete the rogue crontab entry for "confluence" that is running curl/wget processes, delete /tmp/kerberods (which has probably opened reverse SSH tunnels on your system), and reboot. Then patch your system.
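To check for the rogue crontab entry before deleting anything, a minimal sketch like the following may help. It assumes you have already captured the crontab text, for example with `crontab -l -u confluence` run as root. The function name `suspicious_cron_lines` is hypothetical, and matching on curl/wget is only a rough heuristic based on the description above, not a complete malware check.

```python
import re

# Rough heuristic: flag cron entries that invoke curl or wget.
DOWNLOADER = re.compile(r"\b(curl|wget)\b")

def suspicious_cron_lines(crontab_text):
    """Return non-comment crontab lines that call curl or wget."""
    hits = []
    for line in crontab_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        if DOWNLOADER.search(stripped):
            hits.append(stripped)
    return hits
```

Any line it returns still needs a manual look, since a legitimate job could also use curl or wget.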
Do you know if the spike is coming from Confluence, or some other process on your system?
For Confluence, the best thing to do is to start taking thread dumps as soon as the spike begins. Use the built-in admin tool, space the dumps 5-10 seconds apart, and then submit them to https://support.atlassian.com for analysis.
If the spike is happening outside of Confluence, you'll need to use other methods that depend on your operating system. For Windows, you could use the built-in performance monitoring tools to plot CPU usage by process over time and then see which process is spiking.
For *nix systems, you can use the top command.
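If you want to log which process is busiest rather than watch top interactively, here is a minimal sketch that ranks processes from `ps` output. It assumes a Linux-style `ps` that accepts `-eo pid,user,pcpu,comm`; the function names `top_cpu_processes` and `snapshot` are hypothetical. Note that on Linux `ps` reports %CPU averaged over the lifetime of the process, so top remains the better tool for catching a spike as it happens.

```python
import subprocess

def top_cpu_processes(ps_output, limit=5):
    """Parse `ps -eo pid,user,pcpu,comm` text and return the busiest processes.

    Returns a list of (pcpu, pid, user, command) tuples, highest CPU first.
    """
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue
        pid, user, pcpu, comm = parts
        try:
            rows.append((float(pcpu), int(pid), user, comm.strip()))
        except ValueError:
            # Ignore rows whose numbers cannot be parsed.
            continue
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]

def snapshot(limit=5):
    """Run ps once and return the top CPU consumers (assumes ps is on PATH)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,user,pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    return top_cpu_processes(result.stdout, limit)
```

Calling `snapshot()` every few seconds during a spike and writing the results to a file would give a simple record of which process and which user owned the CPU.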