Dockerized (Kubernetes) Remote Agent OOM Java Heap Space - can not reconfigure

zoltan_zvara February 26, 2021

We are running Remote Agents on Kubernetes, from the patched Docker image of:

FROM atlassian/bamboo-agent-base:7.2.2

Most probably one of our jobs changed the state of the Agent (or we hit an Agent bug), so that it now throws FATAL OOM exceptions because the Java heap space is insufficient.

The `wrapper.conf` needs to be updated, and we do update it, but the Bamboo Agent was not designed to be cloud-native. Because of the problem described below, updating `wrapper.conf` while the container is running is not possible:

In Kubernetes, a Pod loses its state when it restarts, so the agent is reinstalled and `wrapper.conf` is overwritten with default values. AFAIK there is no way to pass wrapper configuration parameters to the installer. The wrapper, on the other hand, reloads its configuration once the Pod has restarted, but by then the edited configuration is already lost.

bamboo@atlassian-bamboo-agent-master-atlassian-bamboo-agent-57fd8z2hd9:~$ ps aux | more
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
bamboo 1 0.1 0.0 33570360 56560 ? Ssl 08:58 0:01 java -jar /home/bamboo/atlassian-bamboo-agent-installer.jar https://#redacted#/agentServer/
bamboo 19 0.0 0.0 2916 1980 ? S 08:58 0:00 /bin/sh /home/bamboo/bamboo-agent-home/bin/bamboo-agent.sh console
bamboo 87 0.0 0.0 18504 4564 ? Sl 08:58 0:00 /home/bamboo/bamboo-agent-home/bin/./wrapper /home/bamboo/bamboo-agent-home/bin/../conf/wrapper.conf wrapper.syslog.ident=bamboo-agent wrapper.pidfile=/home/bamboo/bamboo-agent-home/bin/./bamboo-agent.pid wrapper.name=bamboo-agent wrapper.displayname=Bamboo Agent wrapper.statusfile=/home/bamboo/bamboo-agent-home/bin/./bamboo-agent.status wrapper.java.statusfile=/home/bamboo/bamboo-agent-home/bin/./bamboo-agent.java.status wrapper.script.version=3.5.41 --
bamboo 1852 17.7 0.5 5807728 738116 ? Sl 09:08 1:19 /opt/java/openjdk/bin/java -Dbamboo.home=/home/bamboo/bamboo-agent-home -Dbamboo.agent.ignoreServerCertName=false -Dbamboo.allow.empty.artifacts=false -Xms256m -Xmx512m -Djava.library.path=../lib -classpath ../lib/wrapper.jar:../lib/bamboo-agent-bootstrap.jar -Dwrapper.key=D286D9Qw8Tn1eZ9sUniswYU_NS73BzXz -Dwrapper.port=32000 -Dwrapper.jvm.port.min=31000 -Dwrapper.jvm.port.max=31999 -Dwrapper.pid=87 -Dwrapper.version=3.5.41-st -Dwrapper.native_library=wrapper -Dwrapper.arch=x86 -Dwrapper.disable_shutdown_hook=TRUE -Dwrapper.cpu.timeout=10 -Dwrapper.jvmid=10 -Dwrapper.lang.domain=wrapper -Dwrapper.lang.folder=../lang org.tanukisoftware.wrapper.WrapperSimpleApp com.atlassian.bamboo.agent.bootstrap.AgentBootstrap https://#redacted#/agentServer/
bamboo 2041 0.5 0.0 8188 4960 pts/0 Ss 09:15 0:00 bash
bamboo 2081 0.0 0.0 8924 3332 pts/0 R+ 09:15 0:00 ps aux
bamboo 2082 0.0 0.0 5720 908 pts/0 S+ 09:15 0:00 more
bamboo@atlassian-bamboo-agent-master-atlassian-bamboo-agent-57fd8z2hd9:~$
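The heap limits currently in effect can be read from the agent JVM's command line, as the `-Xms256m -Xmx512m` flags in the listing above show. A minimal sketch for pulling just those flags out, which may help when checking whether an edited `wrapper.conf` was actually picked up; the function name `heap_flags` is illustrative and not part of Bamboo or the wrapper:

```shell
#!/bin/sh
# heap_flags: read a JVM command line on stdin and print any
# -Xms / -Xmx flags it contains, one per line.
# (Hypothetical helper; the name is made up for this example.)
heap_flags() {
    # Split the command line into one argument per line, then keep
    # only arguments that look like -Xms<size> or -Xmx<size>.
    tr ' ' '\n' | grep -E '^-Xm[sx][0-9]+[kKmMgG]?$'
}

# Possible usage inside the agent container, where 1852 is the PID
# of the agent JVM taken from the listing above:
#   ps -o args= -p 1852 | heap_flags
```

The function exits with a non-zero status when no heap flags are found, because `grep` does so when nothing matches.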

1 answer

1 vote
Max Malygin February 26, 2021

Hi

We use https://bitbucket.org/atlassian/per-build-container to run on-demand agents in Kubernetes. And it works!

One significant disadvantage is that it takes about 1 minute to initialize an agent, so even the shortest tasks add to the pipeline processing time. That is, elastic agents of this kind are best used for long-running or heavy tasks.

Also, if a task executed in PBC ends with an error, the agent does not terminate correctly and hangs until a 10-minute timeout expires.

It is also unfortunate that the issues at https://bitbucket.org/atlassian/per-build-container/issues?status=new&status=open&sort=-updated_on have gone unaddressed for a long time.

zoltan_zvara March 8, 2021

Hi Max, thanks for your notes!

We solved the issue in part by converting the agent to a `StatefulSet` and then patching the `wrapper.conf` manually. I'll look into PBCs and will report back with issues discovered.
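The manual patch could be scripted roughly as follows. This is a sketch under assumptions, not the exact steps we ran: it assumes the generated `wrapper.conf` sets the heap through the Java Service Wrapper properties `wrapper.java.initmemory` and `wrapper.java.maxmemory` (values in MB), which is consistent with the `-Xms256m -Xmx512m` flags in the process listing but is not shown in this thread. The helper name `set_wrapper_prop` and the value `2048` are made up for illustration.

```shell
#!/bin/sh
# set_wrapper_prop CONF KEY VALUE
# Set KEY=VALUE in a properties-style file: rewrite the existing
# uncommented assignment if there is one, otherwise append a new line.
# (Hypothetical helper; VALUE is expected to be a plain number or word
# without '|', '&' or backslashes, since it is passed through sed.)
set_wrapper_prop() {
    conf=$1
    key=$2
    value=$3
    # Escape dots so the key is matched literally, not as "any character".
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^${key_re}=" "$conf"; then
        # Write through a temp file instead of relying on sed -i,
        # whose syntax differs between GNU and BSD sed.
        sed "s|^${key_re}=.*|${key}=${value}|" "$conf" > "${conf}.tmp" &&
            mv "${conf}.tmp" "$conf"
    else
        printf '%s=%s\n' "$key" "$value" >> "$conf"
    fi
}

# Possible usage (path taken from the process listing, value assumed):
#   set_wrapper_prop /home/bamboo/bamboo-agent-home/conf/wrapper.conf \
#       wrapper.java.maxmemory 2048
```

With the agent home on the `StatefulSet`'s persistent volume, an edit like this would survive the Pod restart that the wrapper needs in order to reload its configuration.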
