Preventing renaming of agents when they are restarting

kettch March 1, 2019

Hello,

 

I've been running into trouble recently with Bamboo after changing the way it is started on the agents. Previously, all OS X agents were started by a script set up as a login item. Now that the agents are being migrated to an Ansible-based setup, we can't do that anymore (thanks to security on Mojave), so we moved to a user Launch Agent instead, as recommended in the Bamboo documentation.

It brings some nice stuff, like the launch agent restarting Bamboo automatically when it stops for some reason, but it also causes trouble: whenever we reboot the machine and Bamboo starts again, the server gives it a new unique name (the initially configured name with '(2)' appended) and a new ID, as the server apparently still sees the 'old' agent. It's still the same machine, with the same domain name and IP.

On the server side, this also means that the "new" agent has no capabilities, and thus won't build much. The only solution then is to request a shutdown, rename the agent on both the agent machine and the Bamboo server, and re-enter the right ID on the agent. Quite cumbersome...

I tried looking at the logs on both the agent and the server, but there's not much to be seen there so far...

Any idea what could cause this and how to avoid it?

Answer accepted
Daniel Santos
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
March 4, 2019

Hi @kettch

Thank you for sharing a detailed description of your problem.
If I understood you correctly, you are now using Ansible to start your agents.

  • Would you mind sharing the configuration used for that?
  • What command does Ansible run to start the agent?
  • Can you also share which Bamboo version you are using?
kettch March 4, 2019

Hmm, not quite, sorry for the confusion. Ansible is actually used to configure the agents initially, and then to keep them in shape.

Before that, Bamboo was started by a shell script that was added as a Login Item. Unfortunately, doing that fully automatically through Ansible is not possible, because adding login items requires a manual confirmation on the agent itself on the latest OS versions (Mojave, most notably).

What we are now trying to do is use a user launch agent, which can be installed by Ansible, as described here: https://confluence.atlassian.com/bamkb/configuring-bamboo-to-start-automatically-on-startup-on-mac-os-x-302812729.html

The only difference on our side is the parameter passed to bamboo-agent.sh, which we changed to 'console' instead of '-fg'.

We're currently using Bamboo 6.2.8 build 60214, soon to be upgraded to a later version. Also note that I tried reverting to a login item (manually), and that fixes the issue.

Daniel Santos
Atlassian Team
March 4, 2019

Hi @kettch

I think I got it now. So you use Ansible to install the agent and set up the startup script that loads it by creating an <Agent>.plist file in /Library/LaunchAgents.

I just tested the script creation as you described and restarted multiple times (both agent and server), and I didn't get the same result as you. The agent kept its name on every restart.

This is the script I used:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.atlassian.bamboo_agent</string>

<key>UserName</key>
<string>dsantos</string>

<key>ProgramArguments</key>
<array>
<string>/Users/dsantos/atlassian/bamboo/remoteAgents/agent-6.2.8/bin/bamboo-agent.sh</string>
<string>console</string>
</array>

<key>StandardErrorPath</key>
<string>/var/tmp/agent.err</string>

<key>StandardOutPath</key>
<string>/var/tmp/agent.out</string>

<key>SessionCreate</key>
<true/>

<key>RunAtLoad</key>
<true/>
</dict>
</plist>

 

Some questions that may help us:

  • What differences do you see between your script and mine?
  • When exactly is the problem triggered? (Only when rebooting the entire machine?)
  • Does Ansible make any changes to the agent files?

 

"Also note that I tried reverting to a login item (manually), and that fixes the issue."

Can you explain in more detail what you were reverting here? I'm not following this sentence.

kettch March 4, 2019

Yup, that's it. There are only 2 differences I can see between your configuration and ours:

  • The agent is installed as a user-space launch agent (in ~/Library/LaunchAgents)
  • We also have the "KeepAlive" key set to true

The problem was seen mainly when rebooting the entire machine, but also when we would unload/load the launch agent manually (using launchctl). Also, Ansible does not change any files from the agent.
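
For anyone comparing setups, here is a rough sketch of the user launch agent job described above: the same keys as the plist posted earlier in the thread, plus "KeepAlive". It is written in Python with the standard plistlib module purely for illustration. The function name and the example path and user are assumptions, not something taken from our playbook.

```python
import plistlib


def build_launch_agent(agent_script: str, user_name: str,
                       label: str = "com.atlassian.bamboo_agent") -> bytes:
    """Return the XML plist bytes for a launchd job that runs the Bamboo agent.

    agent_script and user_name are placeholders supplied by the caller.
    """
    job = {
        "Label": label,
        "UserName": user_name,
        # 'console' is the argument mentioned in the thread (instead of '-fg').
        "ProgramArguments": [agent_script, "console"],
        "StandardErrorPath": "/var/tmp/agent.err",
        "StandardOutPath": "/var/tmp/agent.out",
        "SessionCreate": True,
        "RunAtLoad": True,
        # The extra key in our setup: ask launchd to restart the job if it exits.
        "KeepAlive": True,
    }
    return plistlib.dumps(job)
```

The resulting bytes would be written to a .plist file under ~/Library/LaunchAgents and then loaded with launchctl.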

"Can you explain better what you were reverting here? I'm not following you in this sentence."

Sure, what I mean is that I tried manually (i.e. not through Ansible) disabling the launch agent and going back to the method that we were using before (a script triggered by a Login Item), and now the agent doesn't get renamed anymore. The script in question just launches "bamboo-agent.sh -fg", and that's it. The problem with this method is that the Login Item can't be installed automatically with Ansible.

Daniel Santos
Atlassian Team
March 5, 2019

Hi @kettch

Reading your message, I'm led to think that this problem is not consistent. I mean, it does not happen every time. We will need to better isolate what is causing this issue.

I was able to reproduce it only by deleting the file bamboo-agent.cfg.xml, which is the file that stores the agentUuid. This forces the agentUuid to be recreated, so the agent comes back with the same name and IP and ends up duplicated (Bamboo is not able to identify that it is the same agent coming back online).
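
A quick way to check whether a given bamboo-agent.cfg.xml still carries an agentUuid is sketched below in Python. It only assumes that the value lives in an element named agentUuid somewhere in the file; the exact layout of the file is not taken from this thread, and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET


def has_agent_uuid(xml_text: str) -> bool:
    """Return True if the config text contains a non-empty <agentUuid> element."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so the element may sit at any depth.
    return any((el.text or "").strip() for el in root.iter("agentUuid"))
```

To use it, read the contents of the agent's bamboo-agent.cfg.xml and pass the text to has_agent_uuid(). A False result after a restart would suggest that the file was replaced or regenerated.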

Can you please check your agent logs to verify whether you have any log entries mentioning changes to bamboo-agent.cfg.xml?

Also, could you confirm whether you are using Docker or any other agent image to start your agents?

kettch March 6, 2019

Your last message gave me a hint: when I look at bamboo-agent.cfg.xml, there is no agentUuid!

After a bit of investigation, what I found is that, since we were still working on the Ansible setup, we kept replaying the Ansible playbook, and thus replacing the agent config with a fresh file that didn't contain any agentUuid, which caused the agent to be reset. In fact, the same thing happens with the Login Item, as long as I run Ansible after starting the agent.

So I'm fixing that, thanks a lot for putting me on the right track!!
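
One possible way to avoid the reset, sketched in Python, is to carry the existing agentUuid over into the freshly generated config before it replaces the old file. This is only an illustration under assumptions: the function name is hypothetical, the element name agentUuid comes from this thread, and placing it directly under the root element of the new file is a guess about the layout. Not overwriting an existing config at all would be another option.

```python
import xml.etree.ElementTree as ET


def carry_over_agent_uuid(existing_xml: str, fresh_xml: str) -> str:
    """Copy a non-empty agentUuid from the existing config into the fresh one.

    Returns fresh_xml unchanged if the existing config has no agentUuid.
    """
    existing_root = ET.fromstring(existing_xml)
    fresh_root = ET.fromstring(fresh_xml)

    old = next(existing_root.iter("agentUuid"), None)
    if old is None or not (old.text or "").strip():
        return fresh_xml  # nothing to preserve

    target = next(fresh_root.iter("agentUuid"), None)
    if target is None:
        # Assumption: agentUuid is a direct child of the root element.
        target = ET.SubElement(fresh_root, "agentUuid")
    target.text = old.text.strip()
    return ET.tostring(fresh_root, encoding="unicode")
```

The agent ID mentioned earlier in the thread may need the same treatment. This sketch only handles the agentUuid.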

Daniel Santos
Atlassian Team
March 6, 2019

I'm really glad that we could nail this one together.
I hope things go smoothly now.

Thank you for sharing your findings here. This will certainly be helpful to other users.

Have a good one!
See you in the next thread.
