Bamboo server automatically paused

Guillaume Lucazeau January 31, 2013

Hello,

It seems that since our recent upgrade from 4.3 to 4.4, one of our Bamboo servers is automatically paused during the night. It happens on the instance used to run many test plans; we have another one just for builds where we don't have the problem.

Do you know what could cause the server to pause like this? Here is the audit log:

10:48 AM, Fri, Feb 1 glucazeau Server state changed to 'RUNNING' from 'PAUSED'
06:14 AM, Fri, Feb 1 SYSTEM Server state changed to 'PAUSED' from 'RUNNING'
12:01 AM, Fri, Feb 1 SYSTEM scheduleBackupConfiguration.lastRanDate 1359586866368 1359673277956
12:01 AM, Fri, Feb 1 SYSTEM Server state changed to 'RUNNING' from 'PAUSING'
11:55 PM, Thu, Jan 31 SYSTEM Server state changed to 'PAUSING' from 'RUNNING'
12:01 PM, Thu, Jan 31 glucazeau Server state changed to 'RUNNING' from 'PAUSED'
06:04 AM, Thu, Jan 31 SYSTEM Server state changed to 'PAUSED' from 'RUNNING'
12:01 AM, Thu, Jan 31 SYSTEM scheduleBackupConfiguration.lastRanDate 1359500467395 1359586866368
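
The lastRanDate values in the audit log look like Unix epoch milliseconds. Here is a minimal sketch for decoding them, assuming that interpretation and assuming the server clock is at UTC+1 (inferred from the French locale, not stated in the thread). The function name `decode_last_ran` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the audit values are Unix epoch milliseconds.
# Assumption: the server runs at UTC+1 (CET in winter); the thread does not say so.
SERVER_TZ = timezone(timedelta(hours=1))


def decode_last_ran(millis: int) -> str:
    """Render an epoch-millisecond value as server-local time."""
    moment = datetime.fromtimestamp(millis // 1000, tz=SERVER_TZ)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    for value in (1359500467395, 1359586866368, 1359673277956):
        print(value, "->", decode_last_ran(value))
```

Under these assumptions the three values decode to shortly after midnight on 30 January, 31 January and 1 February, which lines up with the 12:01 AM lastRanDate entries above.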

Thank you

Guillaume

7 answers

0 votes
Guillaume Lucazeau January 31, 2013

Thanks for your help and your explanations.

I checked this setting and moved the automated backup to noon. We run fewer plans during the day, and I will be able to unpause the server before leaving the office if it happens again. So it should work; I'm just waiting for the fix now :-)

Best regards

Guillaume

0 votes
PiotrA
Rising Star
January 31, 2013

Got it!

The wait-for-all-plans-to-finish-before-export logic introduced in 4.4 has a flaw. If you do an "instant" export (that is, you don't tick 'wait until all running jobs have completed', or you run a manual export) and at least one plan is running both before and after the export, then your Bamboo instance will be paused as soon as no plans are running at all!
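
To make the described behaviour concrete, here is a toy model in Python. It is not Bamboo's code: the class, the method names and the "stale pause request" mechanism are all assumptions chosen only because they reproduce the symptom described above (the server resumes after an instant export, then pauses once the last running plan finishes).

```python
class ToyServer:
    """Hypothetical model of the pause logic; not Bamboo's implementation."""

    def __init__(self, running_plans: int, clear_on_resume: bool):
        self.state = "RUNNING"
        self.running_plans = running_plans
        self.pause_pending = False
        # Assumed fix: forget the pause request when the export resumes the server.
        self.clear_on_resume = clear_on_resume

    def start_instant_export(self) -> None:
        # Instant export: request a pause but do not wait for running plans.
        self.state = "PAUSING"
        self.pause_pending = True

    def finish_export(self) -> None:
        self.state = "RUNNING"
        if self.clear_on_resume:
            self.pause_pending = False

    def plan_finished(self) -> None:
        self.running_plans -= 1
        # A pause request that is still pending takes effect once nothing is running.
        if self.running_plans == 0 and self.pause_pending:
            self.state = "PAUSED"
            self.pause_pending = False


def simulate(clear_on_resume: bool) -> str:
    """One plan runs before, during and after the export."""
    server = ToyServer(running_plans=1, clear_on_resume=clear_on_resume)
    server.start_instant_export()
    server.finish_export()
    server.plan_finished()
    return server.state


if __name__ == "__main__":
    print("stale request kept:   ", simulate(clear_on_resume=False))
    print("stale request cleared:", simulate(clear_on_resume=True))
```

In this sketch the server ends up PAUSED only when the leftover pause request survives the export, which mirrors the sequence in the audit log; the real cause is whatever BAM-12759 documents.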

I've raised https://jira.atlassian.com/browse/BAM-12759 for that bug. I'd like to ask you to vote on it and add yourself as a watcher. As a workaround, you could try enabling the "wait until all running jobs have completed" option; I think this should stop Bamboo from pausing after exports.

Thanks for your cooperation on this. Is there anything else we can do for you?

0 votes
Guillaume Lucazeau January 31, 2013

> How do I read these audit logs? From the bottom to the top, right? That's the order of execution?

Yes, exactly.

The process you've described makes sense and is the expected behaviour, but indeed it doesn't match my logs. I looked in the bamboo.log file and didn't see any error: it does the backup ("Writing XML to file...") while the plan is running, then switches to RUNNING from PAUSING, then to PAUSED.

2013-02-01 12:00:00,002 INFO [QuartzScheduler_Worker-6] [ServerLifecycleManagerImpl] Server state changed to 'PAUSING' from 'RUNNING'
2013-02-01 12:00:00,002 INFO [4-BAM::PlanExec:pool-7-thread-3] [ExecutionLimitsServiceImpl] AUTOMATICDEPLOYMENT-TRUNKHOURLY cannot be run because the system is PAUSING
2013-02-01 12:00:00,004 INFO [QuartzScheduler_Worker-6] [XmlMigrator] Creating zip:/home/bamboo/qa-data/bamboohome/backups/bamboo_backup_2013_02_01.zip
[...] Writing XML to file
2013-02-01 12:05:52,216 INFO [QuartzScheduler_Worker-6] [XmlMigrator] Export completed. 0:05:52.212
2013-02-01 12:05:52,400 INFO [QuartzScheduler_Worker-6] [ServerLifecycleManagerImpl] Server state changed to 'RUNNING' from 'PAUSING'
2013-02-01 12:16:52,416 INFO [2-Server Lifecycle Manager:pool-5-thread-1] [ServerLifecycleManagerImpl] Server state changed to 'PAUSED' from 'RUNNING'
2013-02-01 12:31:30,629 INFO [qtp1027818036-131] [ServerLifecycleManagerImpl] Server state changed to 'RUNNING' from 'PAUSED' by 'glucazeau'
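
The timing between these transitions can be pulled out of the log mechanically. Below is a minimal sketch that parses the state-change lines quoted above and prints the time elapsed since the previous transition. The regular expression and the function name `transitions` are illustrative and only assume the line layout shown in this excerpt.

```python
import re
from datetime import datetime

# State-change lines copied from the bamboo.log excerpt above.
LOG = """\
2013-02-01 12:00:00,002 INFO [QuartzScheduler_Worker-6] [ServerLifecycleManagerImpl] Server state changed to 'PAUSING' from 'RUNNING'
2013-02-01 12:05:52,400 INFO [QuartzScheduler_Worker-6] [ServerLifecycleManagerImpl] Server state changed to 'RUNNING' from 'PAUSING'
2013-02-01 12:16:52,416 INFO [2-Server Lifecycle Manager:pool-5-thread-1] [ServerLifecycleManagerImpl] Server state changed to 'PAUSED' from 'RUNNING'
2013-02-01 12:31:30,629 INFO [qtp1027818036-131] [ServerLifecycleManagerImpl] Server state changed to 'RUNNING' from 'PAUSED' by 'glucazeau'
"""

PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"Server state changed to '(\w+)' from '(\w+)'"
)


def transitions(text):
    """Return (timestamp, old_state, new_state) for each state-change line."""
    found = []
    for line in text.splitlines():
        match = PATTERN.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            found.append((stamp, match.group(3), match.group(2)))
    return found


if __name__ == "__main__":
    previous = None
    for stamp, old, new in transitions(LOG):
        elapsed = stamp - previous if previous else None
        print(f"{stamp}  {old} -> {new}  (since previous: {elapsed})")
        previous = stamp
```

On this excerpt the gap between the export finishing (RUNNING from PAUSING) and the unexpected PAUSED state is exactly 11 minutes and 16 milliseconds.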

Our builds have different execution times, up to 2 hours but not 6 hours. But there are a lot of them, so they're scheduled with little time between them during the night.

I was going to run this scenario again when I noticed I hadn't checked this parameter in the backup schedule:

"Should Bamboo backup wait until all running jobs have completed."

I never paid attention to this feature so I guess that's the cause of my problem?

0 votes
PiotrA
Rising Star
January 31, 2013

How do I read these audit logs? From the bottom to the top, right? That's the order of execution?

Actually I've looked at the Bamboo source code and it seems to me that the backup/pausing is a three-step process:

1) start the backup, try to pause the server
1.1) audit log will show "Server state changed to 'PAUSING' from 'RUNNING'"
1.2) in the "PAUSING" state no *new* plan will be executed; Bamboo will wait for all plans to finish
2) when all plans finish execution, change state to "PAUSED" (log -> Server state changed to 'PAUSED' from 'PAUSING')
2.1) do the export
2.2) at the end of the export, unpause Bamboo (log -> Server state changed to 'RUNNING' from 'PAUSED')
3) update the last backup run date (log -> scheduleBackupConfiguration.lastRanDate 123123123)
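
The expected sequence above can be written down as a small set of allowed transitions and used to check an observed audit trail. This is a sketch of the process as described in this post, not code taken from Bamboo; the names `EXPECTED` and `unexpected_transitions` are illustrative.

```python
# Allowed (old_state, new_state) pairs, taken from the three steps above.
EXPECTED = {
    ("RUNNING", "PAUSING"),  # step 1: backup starts, server asked to pause
    ("PAUSING", "PAUSED"),   # step 2: all plans finished
    ("PAUSED", "RUNNING"),   # step 2.2: export done, server unpaused
}


def unexpected_transitions(observed):
    """Return the observed transitions that the described process does not allow."""
    return [pair for pair in observed if pair not in EXPECTED]


# Transitions from the audit log in the question, in chronological order.
OBSERVED = [
    ("RUNNING", "PAUSING"),  # 11:55 PM, SYSTEM
    ("PAUSING", "RUNNING"),  # 12:01 AM, SYSTEM
    ("RUNNING", "PAUSED"),   # 06:14 AM, SYSTEM
    ("PAUSED", "RUNNING"),   # 10:48 AM, glucazeau
]

if __name__ == "__main__":
    for old, new in unexpected_transitions(OBSERVED):
        print(f"unexpected: {old} -> {new}")
```

Run against the audit log from the question, this flags PAUSING -> RUNNING and RUNNING -> PAUSED, the two entries that do not fit the described process.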

To be honest, that process doesn't really match your logs, does it? Or am I blind? Your last logs are missing a "changed to PAUSED from PAUSING" entry, which would explain why you still saw a plan running. What doesn't make sense to me is that you got a new, complete zip archive in the backup directory while the server changed state to RUNNING from PAUSING (it should go PAUSING->PAUSED->RUNNING).

Are you sure that your exports complete properly, without any exception? (Check atlassian-bamboo.log and skim the contents of export.zip.) Maybe some problem occurs during the backup procedure and Bamboo leaves the normal execution path, which would leave the server in the PAUSING/PAUSED state?

How long do your builds run? Could it be that you have a long-running build scheduled at, for example, 11:55 PM (a few minutes before the scheduled backup) that runs for 6 hours?

0 votes
Guillaume Lucazeau January 31, 2013

Now the plan has completed and the server has been paused:

12:16 PM, Fri, Feb 1 SYSTEM Server state changed to 'PAUSED' from 'RUNNING'

0 votes
Guillaume Lucazeau January 31, 2013

Hello Piotr,

thanks for your answer.

I noticed that too, but every plan between 12 AM and 6 AM ran fine (and there are around 10), so I didn't understand why the Bamboo server suddenly paused 6 hours after the backup.
I also upgraded Bamboo on 30/01, and I had this issue on the two following nights. I never had it before, and I didn't make any significant change after the upgrade (no new plans, etc.).

I didn't know how long this backup task takes, so I scheduled it at noon, and it took around 7 minutes for the zip archive to appear in my backup directory. I hadn't noticed before scheduling the backup, but a test plan was running (and is still running). Here is the audit log:

12:05 PM, Fri, Feb 1 SYSTEM scheduleBackupConfiguration.lastRanDate 1359716752400
12:05 PM, Fri, Feb 1 SYSTEM Server state changed to 'RUNNING' from 'PAUSING'
12:00 PM, Fri, Feb 1 SYSTEM Server state changed to 'PAUSING' from 'RUNNING'
11:56 AM, Fri, Feb 1 glucazeau scheduleBackupConfiguration.backupCronExpression 0 55 23 ? * * 0 0 12 ? * *

So I might be wrong but it seems that the server is not completely paused, the plan is still running and the backup was still executed.
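
For readers unfamiliar with the schedule strings in the backupCronExpression entry above: they appear to be six-field Quartz cron expressions (second, minute, hour, day of month, month, day of week). The sketch below assumes the audit line shows the old value followed by the new one, which matches how the lastRanDate entries in the question are displayed. The helper names are made up for illustration.

```python
# Quartz cron field order for a six-field expression.
FIELDS = ("second", "minute", "hour", "day_of_month", "month", "day_of_week")


def split_audit_value(raw: str):
    """Split 'old new' into two six-field expressions (assumed audit layout)."""
    parts = raw.split()
    if len(parts) != 2 * len(FIELDS):
        raise ValueError(f"expected {2 * len(FIELDS)} fields, got {len(parts)}")
    return " ".join(parts[:len(FIELDS)]), " ".join(parts[len(FIELDS):])


def describe(expression: str) -> dict:
    """Label each field of a six-field Quartz cron expression."""
    return dict(zip(FIELDS, expression.split()))


if __name__ == "__main__":
    old, new = split_audit_value("0 55 23 ? * * 0 0 12 ? * *")
    print("old:", describe(old))
    print("new:", describe(new))
```

Read this way, the backup moved from 23:55:00 every day to 12:00:00 every day, consistent with the 11:55 PM and 12:00 PM 'PAUSING' entries in the two audit logs.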

Thanks for your help.

0 votes
PiotrA
Rising Star
January 31, 2013

Hi Guillaume,

I find it interesting that the logs show a scheduled backup being triggered close to the pause. What I find more interesting is this issue: https://jira.atlassian.com/browse/BAM-8233, which is marked as fixed in 4.4. The issue is about adding auto-pause during scheduled backups. So... could it be that your backup is taking hours to complete (you mention that this particular Bamboo instance is used to run *many* plans), the nightly scheduled backup job just doesn't finish before the morning, and as a result you see the server still paused?
