Jira goes down after running for some time

Jose Carrizo January 4, 2013

I have been experiencing this problem over the last few days, since I installed Jira for evaluation.

What happens is that Jira just goes down after about a day of running. I don't know exactly how long it takes, but one day it is running and the next day it's not working any more.

For now the only solution I have is to start Jira's service again, and it runs fine... until the error happens again.
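
In case it helps, the restart is nothing more than the service script; the script name below is the one the Linux installer creates and is an assumption about this particular setup:

sudo /etc/init.d/jira stop     # stop the JIRA service (installer-created init script; adjust if yours differs)
sudo /etc/init.d/jira start    # start it again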

Update:

Well, it crashed again. The kernel killed the java process; this is what I found in the log:

Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189655] postgres invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189666] postgres cpuset=/ mems_allowed=0
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189674] Pid: 28127, comm: postgres Not tainted 3.2.0-31-virtual #50-Ubuntu
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189676] Call Trace:
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189691]  [<c01f95e5>] dump_header.isra.6+0x85/0xc0
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189694]  [<c01f981c>] oom_kill_process+0x5c/0x80
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189697]  [<c01f9c35>] out_of_memory+0xc5/0x1c0
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189700]  [<c01fdacc>] __alloc_pages_nodemask+0x72c/0x740
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189703]  [<c01f8948>] filemap_fault+0x1f8/0x370
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189707]  [<c0108504>] ? xen_set_pte_at+0x34/0xa0
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189873]  [<c010597a>] ? pte_pfn_to_mfn.part.7+0x9a/0xb0
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189878]  [<c021396e>] __do_fault+0x6e/0x550
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189880]  [<c0217375>] handle_pte_fault+0x95/0x2c0
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189883]  [<c021778e>] handle_mm_fault+0x15e/0x2c0
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189891]  [<c06a930b>] do_page_fault+0x15b/0x490
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189894]  [<c01bee14>] ? handle_irq_event+0x44/0x60
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189897]  [<c0104a62>] ? xen_clts+0x72/0x150
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189899]  [<c06a6c80>] ? do_debug+0x180/0x180
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189903]  [<c0112484>] ? math_state_restore+0x44/0x60
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189905]  [<c06a91b0>] ? vmalloc_fault+0x190/0x190
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189908]  [<c06a6577>] error_code+0x67/0x6c
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189910] Mem-Info:
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189911] DMA per-cpu:
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189913] CPU    0: hi:    0, btch:   1 usd:   0
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189914] Normal per-cpu:
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189916] CPU    0: hi:  186, btch:  31 usd: 148
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189920] active_anon:144448 inactive_anon:1379 isolated_anon:0
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189921]  active_file:71 inactive_file:115 isolated_file:0
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189921]  unevictable:0 dirty:2 writeback:13 unstable:0
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189922]  free:1392 slab_reclaimable:1767 slab_unreclaimable:1305
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189923]  mapped:2845 shmem:3220 pagetables:587 bounce:0
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189929] DMA free:2492kB min:76kB low:92kB high:112kB active_anon:3696kB inactive_anon:16kB active_file:12kB inactive_file:60kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15808kB mlocked:0kB dirty:4kB writeback:0kB mapped:24kB shmem:24kB slab_reclaimable:8kB slab_unreclaimable:0kB kernel_stack:16kB pagetables:8kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:60 all_unreclaimable? yes
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189932] lowmem_reserve[]: 0 602 602 602
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189939] Normal free:3076kB min:3100kB low:3872kB high:4648kB active_anon:574096kB inactive_anon:5500kB active_file:272kB inactive_file:400kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:616712kB mlocked:0kB dirty:4kB writeback:52kB mapped:11356kB shmem:12856kB slab_reclaimable:7060kB slab_unreclaimable:5220kB kernel_stack:1224kB pagetables:2340kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:1137 all_unreclaimable? yes
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189943] lowmem_reserve[]: 0 0 0 0
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189946] DMA: 25*4kB 23*8kB 10*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 2492kB
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189953] Normal: 283*4kB 7*8kB 0*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3076kB
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.189961] 3420 total pagecache pages
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.190017] 0 pages in swap cache
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.190021] Swap cache stats: add 0, delete 0, find 0/0
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.190023] Free swap  = 0kB
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.190024] Total swap = 0kB
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192095] 159472 pages RAM
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192097] 0 pages HighMem
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192098] 7102 pages reserved
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192099] 7296 pages shared
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192100] 147521 pages non-shared
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192102] [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192105] [  247]     0   247      706       39   0       0             0 upstart-udev-br
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192108] [  251]     0   251      740      106   0     -17         -1000 udevd
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192111] [  296]     0   296      706       74   0     -17         -1000 udevd
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192114] [  335]     0   335      739      105   0     -17         -1000 udevd
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192116] [  378]     0   378      709       38   0       0             0 upstart-socket-
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192119] [  433]     0   433      729       69   0       0             0 dhclient3
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192122] [  596]     0   596     1668      111   0     -17         -1000 sshd
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192124] [  608]   102   608      812       61   0       0             0 dbus-daemon
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192127] [  613]   101   613     7569      151   0       0             0 rsyslogd
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192129] [  709]     0   709      643       31   0       0             0 getty
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192132] [  715]     0   715      643       31   0       0             0 getty
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192134] [  720]     0   720      643       30   0       0             0 getty
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192137] [  722]     0   722      643       32   0       0             0 getty
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192139] [  726]     0   726      643       30   0       0             0 getty
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192142] [  738]     0   738      541       30   0       0             0 acpid
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192144] [  739]     0   739      652       42   0       0             0 cron
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192146] [  740]     0   740      615       34   0       0             0 atd
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192149] [  771]     0   771      643       31   0       0             0 getty
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192151] [  773]   103   773     6113      214   0       0             0 whoopsie
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192154] [ 6518]   106  6518    12431     1064   0     -13          -900 postgres
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192157] [ 6520]   106  6520    12459     2030   0       0             0 postgres
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192159] [ 6521]   106  6521    12431      373   0       0             0 postgres
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192162] [ 6522]   106  6522    12628      388   0       0             0 postgres
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192164] [ 6523]   106  6523     5050      280   0       0             0 postgres
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192167] [28037]  1001 28037   217227   131934   0       0             0 java
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192169] [28053]   106 28053    13324     2542   0       0             0 postgres
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192172] [28127]   106 28127    13374     2407   0       0             0 postgres
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192174] [  743]     0   743      749       61   0       0             0 cron
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192177] [  744]     0   744      556       19   0       0             0 sh
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192179] [  745]     0   745      533       21   0       0             0 run-parts
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192182] [  748]     0   748      556       27   0       0             0 apt
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192184] [  860]     0   860     9716     6241   0       0             0 update-apt-xapi
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192187] [  861]   106   861    12505      314   0       0             0 postgres
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192189] Out of memory: Kill process 28037 (java) score 867 or sacrifice child
Jan  6 06:35:09 domU-12-31-39-0E-14-B7 kernel: [261534.192203] Killed process 28037 (java) total-vm:868908kB, anon-rss:527736kB, file-rss:0kB

6 answers

1 accepted

0 votes
Answer accepted
Jose Carrizo January 11, 2013

Finally resolved :]

It was definitely a memory problem. First, Jira had more memory configured than was available on the system. I assumed that Java would automatically account for that, but no: Java will actually try to use all the memory you allow with Xmx, so I lowered that.
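
For reference, the heap is capped in JIRA's bin/setenv.sh. The variable names below are the ones a stock setenv.sh of this era uses; treat the exact names and values as an assumption for your version:

JVM_MINIMUM_MEMORY="128m"     # becomes -Xms, the initial heap
JVM_MAXIMUM_MEMORY="300m"     # becomes -Xmx, the maximum heap; it has to fit in RAM alongside the OS and PostgreSQL
JIRA_MAX_PERM_SIZE="256m"     # becomes -XX:MaxPermSize (PermGen, pre-Java 8)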

The error happened again, which didn't surprise me since I know the server has very little memory, but this time it took longer to happen.

Then I realized that there was no swap partition... that's why the OOM killer was stepping in. After adding a swap file the problem was resolved; no more memory errors.
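
For anyone else hitting this, a minimal sketch of adding a swap file on Ubuntu (the 1 GB size and the /swapfile path are assumptions; adjust to what fits your disk):

sudo dd if=/dev/zero of=/swapfile bs=1M count=1024              # create a 1 GB file
sudo chmod 600 /swapfile                                        # restrict permissions
sudo mkswap /swapfile                                           # format it as swap
sudo swapon /swapfile                                           # enable it now
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab      # keep it enabled after reboots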

I'm running Jira with an Xmx of 300 MB and 256 MB of max PermGen. The postgres database is running on the same server using very little memory, and those are the only two applications running on it.

The server has 600 MB of memory available. I know that's very little, but it is working very well for now.

0 votes
Rahul Aich [Nagra]
January 5, 2013

If you are certain that it is due to a memory issue then I recommend the following:

1. Increase the heap space memory. See this link on how to do it:

https://confluence.atlassian.com/display/JIRA/Increasing+JIRA+Memory

2. Make sure you have not installed any plugins that are causing memory leaks. If you have installed a lot of plugins, check the logs to confirm which one is causing problems. Also consider disabling them for a few days and monitoring performance.

3. Make sure that the drive JIRA is installed on has enough disk space to support the daily XML backup.

Also have a look at this URL:

https://confluence.atlassian.com/pages/viewpage.action?pageId=191069

4. Consider installing the JavaMelody plugin to monitor the memory usage of the JIRA instance over a period of time:

https://marketplace.atlassian.com/plugins/net.bull.javamelody

Again, unless you have access to stdout.log it will be difficult to pinpoint the root cause of the issue.

Rahul

Rahul Aich [Nagra]
January 6, 2013

It is really difficult to comment unless we see the logs showing why the system goes down every few days. The most likely cause is a memory leak from one of the installed plugins.

Please access stdout.log and paste the final few lines here, and then we can work from that.
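
Something like the following will grab the tail of the usual logs; the paths assume a default standalone install on Linux and may differ on yours:

tail -n 100 /opt/atlassian/jira/logs/catalina.out                          # Tomcat stdout/stderr
tail -n 100 /var/atlassian/application-data/jira/log/atlassian-jira.log    # JIRA application log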

Rahul

Jose Carrizo January 6, 2013

The server has very little memory, around 600 MB. Do you think that if I increase it to 1 GB the problem will be resolved?

Jose Carrizo January 8, 2013

Sorry, I didn't say it is running on Linux: Ubuntu 12.04, 32-bit. You can see the log files here: https://docs.google.com/open?id=0B9ZxuZO5BbchTHk1dElFdWJ3SGM

I didn't find any memory-related error there, which is very strange. And yes, I installed the GreenHopper plugin; apart from that it is a default installation with two very small projects.

Renjith Pillai
January 8, 2013

So what does the last part of the log say before you restarted JIRA?

And the Linux OOM kill from earlier is not happening any more?

Jose Carrizo January 8, 2013

These are the last lines; it is all I can fit in a comment:

2013-01-05 07:19:40,296 http-bio-8080-exec-24 ERROR carrizo 439x547x2 1m8rls 190.219.228.180 /secure/Search!Jql.jspa [velocity] Left side ($totalSize) of '<' operation has null value at templates/plugins/jira/macros.vm[line 8, column 18]
2013-01-05 17:45:54,028 QuartzWorker-1 INFO ServiceRunner    Backup Service [jira.bc.dataimport.DefaultExportService] Data export completed in 487ms. Wrote 1240 entities to export in memory.
2013-01-05 17:45:54,072 QuartzWorker-1 INFO ServiceRunner    Backup Service [jira.bc.dataimport.DefaultExportService] Attempting to save the Active Objects Backup
2013-01-05 17:45:55,554 QuartzWorker-1 INFO ServiceRunner    Backup Service [jira.bc.dataimport.DefaultExportService] Finished saving the Active Objects Backup
2013-01-06 01:10:38,169 http-bio-8080-exec-2 WARN carrizo 70x670x1 1xe42n7 190.219.228.180 /rest/api/1.0/menus/greenhopper_menu [service.rapid.view.RapidViewDao] could not find entity of type interface com.atlassian.greenhopper.service.rapid.view.RapidViewAO with key 1
2013-01-06 05:46:55,172 QuartzWorker-1 INFO ServiceRunner    Backup Service [jira.bc.dataimport.DefaultExportService] Data export completed in 369ms. Wrote 1301 entities to export in memory.
2013-01-06 05:46:55,194 QuartzWorker-1 INFO ServiceRunner    Backup Service [jira.bc.dataimport.DefaultExportService] Attempting to save the Active Objects Backup
2013-01-06 05:46:56,610 QuartzWorker-1 INFO ServiceRunner    Backup Service [jira.bc.dataimport.DefaultExportService] Finished saving the Active Objects Backup

And yes, the OOM killer killed the java process again; I edited the original post above with the latest kernel log.

Is it possible that the process gets killed without any out-of-memory exception in the Jira log?

Renjith Pillai
January 8, 2013

Yep, Linux evaluates which processes are using the most memory and just kills one of them. The process does not get any chance to do anything. A few questions (a quick way to check the first two is sketched right after the list):

  • What is the Xmx value set for JIRA in setenv.sh?
  • What is the total RAM of the server?
  • Where is the database for JIRA running?
  • Any other application servers?
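
A quick way to check the first two on the server itself (plain Linux commands, nothing JIRA-specific):

free -m                   # total RAM and swap, in MB
ps -ef | grep '[j]ava'    # the running JVM, with its -Xmx / -XX:MaxPermSize flags visible in the command line
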
0 votes
Jose Carrizo January 5, 2013

Well, it looks like everything is going OK. I'll wait one more day before marking this as resolved.

Rahul, right now I don't have access to my server, but I'll check it before closing the post.

Anyway, I think it is confirmed, since I found a message in /var/log/kern.log saying that the kernel killed the java process due to memory usage.
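
Something like this finds the message (assuming Ubuntu's default kernel log location):

grep -i 'killed process' /var/log/kern.log    # the one-line verdict from the OOM killer
grep -i 'oom-killer' /var/log/kern.log        # the start of the full OOM report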

0 votes
Rahul Aich [Nagra]
January 4, 2013

This error is not related to Jira going down. Can you check stdout.log and paste the final few lines from that log here?

0 votes
Andy Brook [Plugin People]
January 4, 2013

Yep, memory issues are more likely; this looks like a request processing error, which isn't likely to take down JIRA.

0 votes
Jose Carrizo January 4, 2013

I think "Hercules" boot nailed it, it could be a memory configuration problem. I'll see if changing it resolve the problem.
