"Failed to peek." and "failuresCount: 200/200" Errors in JIRA DATA CENTER and low performance

We run JIRA Data Center on three nodes with 200,000 issues. We also use several add-ons, such as Structure, ScriptRunner, and eazyBI.
During some peak hours, CPU usage climbs abnormally, and the number of established sessions on port 8443 increases for no apparent reason and does not come back down. JIRA becomes very slow and stops responding, so we have to restart the application. This happens repeatedly.

In the JIRA log file we found two errors.
Can anyone help me with this?

1 answer

0 votes

Hi @Hamid Gholami,

Based on the inputs, I would suggest checking the logs for errors and identifying the plugin or script that causes the CPU usage to increase.
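
In case it helps with the log check, here is a minimal sketch that tallies ERROR lines per logger so the noisiest component stands out. It assumes the line layout `<date> <time> <thread> ERROR [<logger>] <message>`; the function names `count_errors_by_logger` and `top_loggers` are made up for illustration, and the layout may differ on your installation.

```python
import re
from collections import Counter

# Assumed line layout (adjust if your log pattern differs):
#   <date> <time> <thread> ERROR [<abbreviated.logger.Name>] <message>
# The first [...] group after the ERROR level is taken as the logger name.
ERROR_LINE = re.compile(r"^\d{4}-\d{2}-\d{2} \S+ .*?\bERROR\b.*?\[([\w.$]+)\]")


def count_errors_by_logger(lines):
    """Return a Counter mapping logger name -> number of ERROR lines."""
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.match(line)
        if match:  # stack-trace lines have no timestamp and are skipped
            counts[match.group(1)] += 1
    return counts


def top_loggers(lines, limit=10):
    """Return the `limit` loggers with the most ERROR lines, busiest first."""
    return count_errors_by_logger(lines).most_common(limit)
```

To use it, open the log file on each node (the exact path depends on your JIRA home directory) and pass the file object to `top_loggers`, since a file object iterates line by line.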

If possible, increase the memory allocated to the JVM on each node and verify whether the problem persists.

Additionally, go through the Atlassian KB article below, which covers performance-related issues such as the distribution of requests across nodes:

https://confluence.atlassian.com/enterprise/jira-data-center-common-problems-939513809.html

Thanks,

Kiran.

Hello @Kiran Panduga {Appfire}

Thank you for your response.

We have 32 GB of memory in total, and I allocated 6 GB to the JVM for the JIRA application.

Also, when the problem occurred, we found two errors in the JIRA log file:

ERROR NUM1:

2019-01-27 15:30:51,033 localq-reader-3 ERROR [c.a.j.c.distribution.localq.LocalQCacheOpReader] Critical state of local cache replication queue - cannot peek from queue: [queueId=queue_NODE1_1_4b1b4dc8cf38b3c64b1d657da8f5ac8c, queuePath=/opt/atlassian/application-data/jira/localq/queue_NODE1_1_4b1b4dc8cf38b3c64b1d657da8f5ac8c], error: Failed to peek.
com.squareup.tape.FileException: Failed to peek.
at com.squareup.tape.FileObjectQueue.peek(FileObjectQueue.java:59)
at com.atlassian.jira.cluster.distribution.localq.tape.TapeLocalQCacheOpQueue.peek(TapeLocalQCacheOpQueue.java:198)
at com.atlassian.jira.cluster.distribution.localq.tape.TapeLocalQCacheOpQueue.peekOrBlock(TapeLocalQCacheOpQueue.java:216)
at com.atlassian.jira.cluster.distribution.localq.LocalQCacheOpQueueWithStats.peekOrBlock(LocalQCacheOpQueueWithStats.java:198)
at com.atlassian.jira.cluster.distribution.localq.LocalQCacheOpReader.peekOrBlock(LocalQCacheOpReader.java:157)
at com.atlassian.jira.cluster.distribution.localq.LocalQCacheOpReader.run(LocalQCacheOpReader.java:71)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: java.lang.ClassNotFoundException: com.codebarrel.jira.plugin.automation.store.CachingAutomationConfigStore$TenantRuleId
at com.atlassian.jira.cluster.distribution.localq.tape.TapeLocalQCacheOpConverter.from(TapeLocalQCacheOpConverter.java:25)
at com.atlassian.jira.cluster.distribution.localq.tape.TapeLocalQCacheOpConverter.from(TapeLocalQCacheOpConverter.java:16)
at com.squareup.tape.FileObjectQueue.peek(FileObjectQueue.java:57)
... 10 more
Caused by: java.lang.ClassNotFoundException: com.codebarrel.jira.plugin.automation.store.CachingAutomationConfigStore$TenantRuleId
... 1 filtered
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:628)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1620)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1781)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
at com.atlassian.jira.cluster.distribution.localq.tape.TapeLocalQCacheOpConverter.from(TapeLocalQCacheOpConverter.java:23)
... 12 more

ERROR NUM2:

2019-01-27 15:23:30,039 localq-reader-11 ERROR [c.a.j.c.distribution.localq.LocalQCacheOpReader] Abandoning sending: LocalQCacheOp{cacheName='org.marvelution.jji.releasereport.CiBuildReleaseReportColumn', action=REMOVE, key=ERP-2285, value=null, creationTimeInMillis=1548602413120} from cache replication queue: [queueId=queue_NODE2_0_31fe71b00865ae60db401068d5159de9, queuePath=/opt/atlassian/application-data/jira/localq/queue_NODE2_0_31fe71b00865ae60db401068d5159de9], failuresCount: 200/200. Removing from queue. Error: java.rmi.NotBoundException: org.marvelution.jji.releasereport.CiBuildReleaseReportColumn
com.atlassian.jira.cluster.distribution.localq.LocalQCacheOpSender$UnrecoverableFailure: java.rmi.NotBoundException: org.marvelution.jji.releasereport.CiBuildReleaseReportColumn
at com.atlassian.jira.cluster.distribution.localq.rmi.LocalQCacheOpRMISender.send(LocalQCacheOpRMISender.java:88)
at com.atlassian.jira.cluster.distribution.localq.LocalQCacheOpReader.run(LocalQCacheOpReader.java:83)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.rmi.NotBoundException: org.marvelution.jji.releasereport.CiBuildReleaseReportColumn
at sun.rmi.registry.RegistryImpl.lookup(RegistryImpl.java:166)
at sun.rmi.registry.RegistryImpl_Skel.dispatch(Unknown Source)
at sun.rmi.server.UnicastServerRef.oldDispatch(UnicastServerRef.java:411)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:272)
at sun.rmi.transport.Transport$1.run(Transport.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:276)
at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:253)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:379)
at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source)
at com.atlassian.jira.cluster.distribution.localq.rmi.BasicRMICachePeerProvider.lookupRemoteCachePeer(BasicRMICachePeerProvider.java:64)
at com.atlassian.jira.cluster.distribution.localq.rmi.BasicRMICachePeerProvider.create(BasicRMICachePeerProvider.java:39)
at com.atlassian.jira.cluster.distribution.localq.rmi.CachingRMICachePeerManager.getCachePeerFor(CachingRMICachePeerManager.java:58)
at com.atlassian.jira.cluster.distribution.localq.rmi.CachingRMICachePeerManager.withCachePeer(CachingRMICachePeerManager.java:91)
at com.atlassian.jira.cluster.distribution.localq.rmi.LocalQCacheOpRMISender.send(LocalQCacheOpRMISender.java:63)
... 6 more
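
Both traces end in a root cause that names a class (`ClassNotFoundException` in the first, `NotBoundException` in the second). A rough sketch like the one below could tally those root causes by package prefix, which may help show which add-on's classes are involved most often. The names `package_prefix` and `summarize_root_causes` are hypothetical, and treating the first three package segments as the add-on's prefix is only an assumption.

```python
import re
from collections import Counter

# Matches "Caused by:" lines where the named exception comes directly after
# the prefix, e.g. "Caused by: java.rmi.NotBoundException: <class name>".
# Wrapped variants such as "java.io.IOException: java.lang.ClassNotFound..."
# are skipped on purpose so the same failure is not counted twice.
ROOT_CAUSE = re.compile(
    r"^\s*Caused by: [\w.]+\.(ClassNotFoundException|NotBoundException): ([\w.$]+)"
)


def package_prefix(class_name, depth=3):
    """Return the first `depth` dot-separated segments of a class name."""
    return ".".join(class_name.split(".")[:depth])


def summarize_root_causes(lines):
    """Count (exception name, package prefix) pairs found in the log lines."""
    counts = Counter()
    for line in lines:
        match = ROOT_CAUSE.match(line)
        if match:
            counts[(match.group(1), package_prefix(match.group(2)))] += 1
    return counts
```

Run against the two traces above, this would group the first error under `com.codebarrel.jira` and the second under `org.marvelution.jji`.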

Hi @Hamid Gholami,

Based on the error messages, the issue appears to be with the cache replication configuration.

I would suggest going through the KB article below and increasing the queue max size and other options:

https://confluence.atlassian.com/enterprise/jira-data-center-cache-replication-954262828.html

Additionally, I would suggest checking for cluster cache replication timeout issues and following the recommendations Atlassian provides at the link below:

https://confluence.atlassian.com/jirakb/cluster-cache-replication-healthcheck-fails-due-to-unable-to-complete-within-the-timeout-945545127.html

Thanks,

Kiran.
