JEMH is causing 100% CPU utilization on our JIRA server. We believe JEMH auditing is the cause. When JIRA queries the PostgreSQL database for auditing events, some of those queries take more than 10 minutes to complete. During those 10 minutes, JIRA is pegged at 100% CPU until it hangs.
Right now I can't even open the auditing page to clean up events via the web interface.
How can I clean up all the events directly in the database? This is urgent. Our production JIRA server has been unstable for the last four days.
Any prompt help would be highly appreciated.
I've identified the root cause of the problem.
JEMH retained too many email log events. The following query against the events table took more than 10 minutes to return a result, if it returned at all:
SELECT * FROM public."AO_78C957_AUDITEVENTS";
This caused the JIRA/Java process to run continuously at 100% CPU until it hung. Deleting all the entries in table "AO_78C957_AUDITEVENTS" brought JIRA back to life. Since then, I have reduced the event retention time from 3 months to one day. Overnight, I saw that 205413 email messages were processed by JEMH (at least that's what JEMH reported).
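For anyone in the same situation, here is a rough sketch of the database-side clean-up described above. It assumes PostgreSQL, the default "public" schema, and that JIRA is stopped and the database is backed up first. Only the table name comes from this thread; any other JEMH audit tables would need the same treatment and are not listed here.

```sql
-- Estimate the row count from planner statistics instead of running
-- SELECT *, which has to read and return every row.
-- The figure is approximate and depends on when the table was last analyzed.
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'AO_78C957_AUDITEVENTS';

-- Remove every audit event inside a transaction, so the change can be
-- rolled back if something looks wrong before the COMMIT.
BEGIN;
DELETE FROM public."AO_78C957_AUDITEVENTS";
COMMIT;

-- Confirm the table is now empty.
SELECT COUNT(*) FROM public."AO_78C957_AUDITEVENTS";
```

The quoted identifier matters: PostgreSQL folds unquoted names to lower case, so the upper-case table name has to stay in double quotes.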
So for a high email volume site like ours, the 3-month default setting for event retention is far too high.
One question I still have is why and how JIRA was brought to its knees by JEMH.
Please help find out why.
Hi Simon, the AUDIT data is scanned for expired content on every email. If you specify 3 months, it is going to retain and scan all email received in that period. It is only doing a timestamp check, but the volume is high. Setting the retention period to something much shorter (a day) will of course reduce the volume of records that need to be scanned on email receipt.
I have generated large volumes (hundreds of thousands of records) before. I will do some more testing at that volume and see what falls out. I would think that a better solution would be a nightly job to remove older content, and I'll certainly work to resolve this sooner rather than later.
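As a rough illustration of what a scheduled clean-up could do at the database level, the sketch below deletes rows older than one day. It is an assumption-laden example, not how JEMH implements retention: the column name "CREATED" is a hypothetical placeholder, so list the real columns first and substitute the actual timestamp column.

```sql
-- List the real columns of the audit table to find its timestamp column.
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'public'
  AND table_name = 'AO_78C957_AUDITEVENTS';

-- Hypothetical: "CREATED" stands in for whatever timestamp column the
-- query above reveals. Deletes audit events older than one day.
DELETE FROM public."AO_78C957_AUDITEVENTS"
WHERE "CREATED" < now() - interval '1 day';
```

Running a statement like this once a night keeps the table small without putting the expiry scan on the email-processing path.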
There isn't a way to stop auditing at the moment. As you referred to on the issue below, I will be putting measures in place to remove this impact.
The history needs to be purged. Setting a day as the retention period is exactly what needs to be done, but the historic data still has to be removed (documented here). Please verify that the audit tables are empty; the impact of this should be near instant.
I'm tracking this at https://studio.plugins.atlassian.com/browse/JEMH-1067 , and it's my top priority right now. Please give feedback on the JIRA issue about how the purge works for you.