Does anyone have some best practices for a scheme that automatically reboots remote agents? I'm thinking of adding a cron job on the main Bamboo server that loops over all agents, disables them, waits for the current build to finish (if there is one), reboots, waits for the machine to come back, and enables it.
Is there a better way that escapes me?
We manage hundreds of remote agents for Atlassian's internal build system, most of them Linux. We hardly ever need to reboot them, though that's probably because we try to keep builds 'well-behaved' in the sense that they tear down any transient resources they create while they run.
The agent processes themselves we manage with daemontools so we could easily loop over all agents and do:
$ svc -d /service/bamboo-agent
followed by
$ shutdown -r now
Although if there are no builds running, just calling 'shutdown' is sufficient, since daemontools will ensure the 'service' comes back up on restart. You would just have to make sure daemontools is always started from your /etc/rc.local configuration.
We also use Puppet to manage the state of all those agents, so adding the cron job definition uniformly across all agents would require 1) writing the Puppet code to do it and 2) updating the puppetmaster.
Since Puppet ensures that daemontools is always running, calling 'shutdown' with Puppet present is also sufficient.
Other ways to loop over your agents include cluster-SSH and Fabric. We use both for ad hoc orchestration of maintenance tasks.
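For anyone without cluster-SSH or Fabric, a plain SSH loop is one way to sketch the same idea. This is only a rough illustration, not what we run: the function name `reboot_agents`, the `SSH_CMD` override, and the assumption that the SSH login is allowed to run `svc` and `shutdown` are all mine. Note that it does not wait for a running build to finish.

```shell
#!/bin/sh
# Sketch only. SSH_CMD and reboot_agents are hypothetical names; the remote
# commands are the two daemontools/shutdown commands quoted in this thread.
SSH_CMD="${SSH_CMD:-ssh}"

reboot_agents() {
    # Each argument is the hostname of one agent machine.
    for host in "$@"; do
        echo "Stopping agent and rebooting $host"
        # svc -d brings the supervised agent down; shutdown -r reboots the box.
        # stdin is redirected so the remote command cannot swallow input.
        # A non-zero status may simply mean the reboot dropped the connection.
        $SSH_CMD "$host" 'svc -d /service/bamboo-agent && shutdown -r now' </dev/null \
            || echo "Warning: non-zero exit status from $host" >&2
    done
}
```

You would call it as `reboot_agents agent1 agent2`, with your own hostnames. Setting `SSH_CMD=echo` first gives a dry run that only prints what would be sent.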
Dealing with the agents directly is fairly straightforward. The tricky part, even for us, is negotiating with Bamboo for a 'window of opportunity' to shut the agents down. Przemek can correct me, but since Bamboo 3.3 there should be a way to PAUSE the server. I think there might even be a REST endpoint to do that.
What command(s) do you use for your daemontools bamboo-agent run script?
To manage agents automatically, I would suggest a free and open plugin I recently released that includes both JSON and bash-like output for easy automation.
Use of security tokens keeps sensitive credentials out of scripts.
https://eddiewebb.atlassian.net/wiki/display/AAFB/Remote+Agent+Management+with+monit
This page was deleted.
See https://eddiewebb.atlassian.net/wiki/display/AAFB/User+Guide for various examples including calls from monit and cron
An alternative approach (although it requires a bit of work if you have a lot of agents; you'd need a bored intern for that if you have a 100-agent license):
- Create a plan with a job that executes a script task (a file-based one, so it is easier to modify later) that disables the agent remotely and reboots in the background after ~2-3 seconds. A disabled agent will finish its current job but will not pick up a new one. If you don't fork off the reboot, the build will fail.
- Clone this job within your plan so that you have as many jobs as you have agents.
- Assign a unique capability to each of your agents.
- Add a requirement for one agent's unique capability to each job.
It should be simpler to set up, but if you often set up new agents or have lots of agents to begin with, I'd rather go with the approach you've described. Also, the remote disabling of agents may be a security risk if not secured properly (just like the remote reboot in your approach).
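The script task from the first bullet could look roughly like the sketch below. Treat it as an outline under assumptions: `disable_agent` is a placeholder because the actual disable call depends on your Bamboo setup, and the names `schedule_reboot`, `REBOOT_CMD`, and `REBOOT_DELAY` are made up for illustration. The point it shows is the detached, delayed reboot, so the task itself returns and the build is not failed.

```shell
#!/bin/sh
# Sketch only. All names here are hypothetical, not Bamboo-provided.
REBOOT_CMD="${REBOOT_CMD:-shutdown -r now}"
REBOOT_DELAY="${REBOOT_DELAY:-3}"   # seconds to wait before rebooting

disable_agent() {
    # Placeholder (assumption): put whatever you use to disable this agent
    # on the Bamboo server here. Nothing is actually disabled by this stub.
    echo "disable the agent here"
}

schedule_reboot() {
    # Fork the reboot into a detached background shell so this function
    # returns immediately; the reboot fires after REBOOT_DELAY seconds.
    nohup sh -c "sleep $REBOOT_DELAY; $REBOOT_CMD" >/dev/null 2>&1 &
}
```

Inside the task you would call `disable_agent` followed by `schedule_reboot` and then let the script exit normally.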
Since I don't have a bored intern at my disposal, I'm going with the cron job. One thing I needed that wasn't in my initial question: before rebooting, the process writes a small temp file on the machine recording the agent's current state (enabled or disabled). When the machine starts up again, the cron job runs and looks for the temp file, so it won't inadvertently enable a machine that was already disabled.
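The state-file idea could be sketched like this. It is not the actual script: the file path, the function names, and the echoed placeholder where the real re-enable call would go are all assumptions for illustration.

```shell
#!/bin/sh
# Sketch only. STATE_FILE and the function names are hypothetical.
STATE_FILE="${STATE_FILE:-/var/tmp/bamboo-agent-state}"

save_state() {
    # Call before rebooting; $1 is "enabled" or "disabled".
    printf '%s\n' "$1" > "$STATE_FILE"
}

should_reenable() {
    # True only if a state file exists and says the agent was enabled.
    [ -f "$STATE_FILE" ] && [ "$(cat "$STATE_FILE")" = "enabled" ]
}

after_boot() {
    # Call once the machine is back up.
    if should_reenable; then
        echo "re-enable agent"      # placeholder for the real enable call
    else
        echo "leave agent as is"    # it was disabled before, or no reboot by us
    fi
    # Remove the file so a later run does not act on stale state.
    rm -f "$STATE_FILE"
}
```

With no state file present, `after_boot` leaves the agent alone, which covers reboots that were not started by the cron job.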
@przemek-
"that disables the agent remotely"
Do you have any advice on that step? I don't see any APIs or agent commands to disable agents.
There's no API, you have to POST to the action that disables the agents.
until now!
https://marketplace.atlassian.com/plugins/com.edwardawebb.bamboo-agent-apis
Open source and free until Atlassian provides a more elegant solution. Uses security tokens instead of standard credentials to allow scripted interactions without exposing sensitive accounts.
Even has handy bash-like output