Posted to Stack Overflow as well, but they aren't much help with broad, open-ended questions no matter how desperate the user is. Here is what I posted, as it contains all the relevant information and saves me rewriting everything.
Sorry for the somewhat vague question, but I am under a huge time crunch (about a week) to fully understand and design a Bamboo build system that uses Docker-based remote agents, so please don't downvote this. I really need help and am turning to the Stack Overflow community for its vast wealth of knowledge.
We're currently using Bamboo Server 6.6 (hoping to upgrade soon to a newer version) with about 30 configured standard/dedicated remote agents (Linux in our case). I've created a Docker container that holds our CentOS 7 based build and test environment, and I'm currently using it in script tasks via the docker run command. I've now been tasked with delivering a Bamboo-based solution where everything (including the remote agent) runs in a container. I should say up front that I am not a Docker expert, I am not in IT, I don't have access to the server, and I'm not familiar with Docker Swarm, Kubernetes, or AWS clusters, yet I've been tasked with designing this build environment. We do have RH OpenShift installed but have been told (by RH) that it is not suitable for our needs when used directly (and we want to continue with Bamboo, as we also use the Atlassian Bitbucket and Jira products). I'm thinking OpenShift could be used in some fashion as a Kubernetes cluster, but I'm not sure, nor how this would be implemented/configured.
Our environment... We are stretching the boundaries of Docker, but it is required for our needs. We do C++ compiles for several projects using make and bitbake with custom compilers and packages. We run several static code analysis tools as well as Python-based verification and regression tests. We produce a code base that is used by other teams, so there is no real end product that gets built, but we do create a QEMU KVM image that we run inside the container for testing purposes. Doing so requires us to use the --privileged option when running the container.
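For context, our current build step boils down to a docker run invocation along these lines (the image name, mount paths, and make target are placeholders, not our real values):

```shell
# Illustrative sketch only: image name, mount paths, and build target
# are placeholders. --privileged is needed so the QEMU/KVM guest can
# access /dev/kvm from inside the container.
docker run --rm --privileged \
    -v "$PWD/src:/build/src" \
    -v "$PWD/artifacts:/build/out" \
    centos7-build-env:latest \
    make -C /build/src all
```

The source is cloned on the host agent and shared into the container via the volume mounts, which is exactly the part I'm unsure how to translate to a cluster.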
Our IT department has no knowledge of this, so I will be working to present a model they can look to implement. I do not believe AWS elastic remote agents are an option, as they don't seem agreeable to moving to the cloud, and if we do, it appears our choice would be Azure. I'm in the initial investigation stage, so my knowledge of the various available methods and options is limited, but I want to reach out sooner rather than later because of the limited time. I've found little so far regarding working models and how everything interacts. For example, we currently use git to clone our source to the host remote agent, along with various other files from things like Artifactory, and we build/test in the container via a shared directory; how does this work in a cluster? For the containerized remote agent, do I start with my current CentOS 7 based Docker container, or do I create a new one based on a Bamboo remote agent container? What do I do about OS requirements in that case? Will this even work with the --privileged requirement (I read somewhere that Docker Runner cannot pass options like this)? It shouldn't matter, but this solution will probably be rolled out to replace Windows remote agents with Windows-based containerized agents used by other groups.
I could really use the community's help with how all this works, so please share any knowledge, websites, info, etc. that you may have. I'm even open to a more direct conversation with someone who has this implemented, if you are willing.
Feeling the pressure, Jason
I can partly answer your question, but will leave it for others to answer the rest.
I've now been tasked to have a Bamboo based solution where everything (including the remote agent) is running in a container.
This is very much possible with the Docker Runner. All you need to do is enable the Docker Runner by navigating to the respective job, heading to the Docker tab, choosing Docker container, and providing the Docker image. This is also detailed on this page - Docker runner.
We produce a code base that is used by other teams so there is no real end product that gets built but we do create a QEMU KVM image that we run in the container for testing purposes. Doing so does require us to use the --privileged option when running the container.
If your build involves running a QEMU KVM image, I think it is best to run your KVM images on a dedicated remote agent, as you are doing currently. Running a virtualised environment within a container might not work well, or you may have challenges setting it up.
However, if you prefer to still try the Docker Runner, please note that until Bamboo 6.7.0, Docker was expected to run as the root user. With more recent versions, you can tell the Docker image which user to use via a USER clause in the Dockerfile. However, there is an existing bug, BAM-19827, that prevents a different user from actually being used. I just wanted to make you aware of the current limitation with this approach.
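As a sketch of the USER clause mentioned above (the user name and UID/GID here are placeholders, not values Bamboo requires):

```dockerfile
FROM centos:7

# Create a non-root build user. The name and UID/GID are placeholders
# and would need to match whatever the agent side expects.
RUN groupadd -g 1000 build && useradd -u 1000 -g 1000 -m build

# From Bamboo 6.7.0 onward the image can declare its own user;
# note that BAM-19827 may still force the build to run as root.
USER build
```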
You may also explore this open-source plugin - Per-build Container (owned by our Build Engineering team) - to run Bamboo agents in a Kubernetes cluster. This is typically helpful when you want to spin up Docker-based agents on demand. Each agent builds the single job it was triggered for and is removed afterwards.
The directive I've been given is to remove all dependence on static machines, putting everything in containers (including the remote agent). We are also looking for a solution that is extensible, so running in a cluster seems to be our best bet.
That being said, I've looked at the Docker Runner, but it still runs on our dedicated agents, which we are trying to eliminate. We also have a fairly complex environment in which we run. As stated, we are running QEMU KVMs, which require the --privileged option, but that cannot be passed when using the Docker Runner (from all that I have read to date).
For my own clarity I want to stop and discuss the workflow for a Docker Runner build, as I expect to need to present this as part of my solution presentation.
The plan is configured to run as an isolated build in a Docker container, using our custom build container and any required volumes, as configured on the Docker tab for that job.
Tasks are created as normal for pulling source as needed and for running builds (I assume these would be Docker tasks?)
When the plan is triggered, the local agent starts the container, performs all the tasks (within the container?), and closes the container when the job is complete.
What I am not sure of is where tasks run. Does a Source Code Checkout task check out to the local working directory (I assume it does, since there is the shared volume and it is not using the container)? What other tasks don't use the Docker container? Is the container only used by Docker tasks? I would really like to know what runs where.
Back to the question at hand. It is good to know about the change in 6.7. I know we have a plan to upgrade but am not sure to what version; I expect we'll jump to 6.10 at this point. I have had a lot of permissions issues along the way. I have a user in the container, and it has to match our remote agent user exactly - I mean the UID and GID have to match.
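For what it's worth, one common workaround for the UID/GID matching problem (a sketch of what we do with plain docker run, not something the Docker Runner necessarily supports) is to pass the host agent's IDs explicitly so files written to the shared volume stay owned by the agent user:

```shell
# Sketch only: image name and paths are placeholders.
# Run the build container as the same UID/GID as the host agent user
# so files written to the shared volume remain readable by the agent.
AGENT_UID=$(id -u)
AGENT_GID=$(id -g)
docker run --rm \
    --user "${AGENT_UID}:${AGENT_GID}" \
    -v "$PWD:/build" \
    centos7-build-env:latest \
    make -C /build all
```

The --user flag overrides whatever USER the image declares, which is why the container-side user's UID/GID has to line up with the host user's for anything touching the shared directory.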
I think the per-build container solution is what we want, but I'm not sure how it all works. I know nothing of the plugin (as of yet) but expect that the Bamboo server is configured to use a cluster of our choice, and I assume it spins up the container, runs the commands, and the container gets destroyed. I'm not sure if the plans look the same, how/where source code is updated, etc. I really need to get a good working knowledge of the workflow for this solution for a presentation I am putting together.
Is this different than using AWS elastic remote agents?
I've done some reading on the PBC plug-in and want to know if I'm getting closer to understanding the correct workflow.
The server will be configured to use a Kubernetes cluster. The plug-in provides a Misc config option to state which remote agent container to use for the specific job. When the plan runs, the job starts the remote agent (in the cluster), which in turn starts performing the given tasks. Just like the standard host-installed remote agent, the containerized remote agent contains the build directory, which is shared with the build environment container. All source code clones, intermediate build files, etc. are kept in the container's build directory, as with the host-installed remote agent, until the job is complete and the remote agent is destroyed. Is this accurate?
I still have a few questions.
As there are several ways to start the build environment container, are they all available in this configuration? We are using script tasks that run the container via the docker run command. Is this supported?
How does the containerized remote agent know to start the build container within the Kube cluster? Is this some configuration setting somewhere?
I see where the remote agent can be configured with CPU and RAM but what about storage space? How much storage can be dedicated to the remote agent?
I would appreciate any additional information that may be missing regarding the workflow and how things are orchestrated.
Trying to answer your questions, based on my personal understanding of PBC (no guarantees).
The build container (and thus the agent) is started as soon as the plugin detects that a job with the 'per-build container agent' option enabled (the checkmark shown in the screenshot on this page: https://bitbucket.org/atlassian/per-build-container/src/7714261b9bcda3f870d4ccacf1d47c951ad4c779/bamboo-isolated-docker-plugin/README.md) is scheduled in the build queue.
So "How does the containerized remote agent know to start the build container" is moot, as the remote agent does not exist at that point; it follows the lifecycle of the build as instrumented by the plugin. The Kubernetes pod that gets created includes two containers: one sidecar hosting the Bamboo agent, and the other running the Docker image specified for the job.
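Conceptually, the pod the plugin creates looks something like the following (an illustrative sketch, not the exact manifest the plugin generates; all names and images are placeholders):

```yaml
# Illustrative sketch only - the PBC plugin generates the real manifest.
apiVersion: v1
kind: Pod
metadata:
  name: pbc-build-job-1          # placeholder; one pod per queued job
spec:
  restartPolicy: Never           # the pod lives only as long as the build
  containers:
  - name: bamboo-agent           # sidecar running the Bamboo remote agent
    image: registry.example.com/bamboo-agent-sidekick:latest  # placeholder
    volumeMounts:
    - name: build-dir
      mountPath: /buildeng
  - name: build                  # the image configured for the job
    image: registry.example.com/centos7-build-env:latest      # placeholder
    volumeMounts:
    - name: build-dir
      mountPath: /buildeng
  volumes:
  - name: build-dir
    emptyDir: {}                 # shared, ephemeral working directory
```

The shared emptyDir volume is what lets the agent sidecar hand the checked-out source and working directory to the build container; when the pod is deleted at the end of the build, that storage is discarded with it.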
What about storage space? As these pods/containers are ephemeral, I assume there is no option for defining external, lasting storage (through volume mounts/PVCs...). Everything is written to the container filesystem and discarded as soon as the build ends. In practice, this means the files are temporarily stored on the local hard disk of the machine the pod happens to run on.
Hope this helps.
Thanks @Boris Van Hardeveld for the response. Still not clear on the containers.
Are you saying that on the page where I specify the sidekick container for that job I also have to specify my build environment container?
We are currently running our remote agent normally but using a container for our builds. We pull our source code, then run a script task where we build it using docker run... How would this look in the new paradigm?
@Jason Templeman the use of the word 'sidekick' is overloaded here, as there are multiple categories of sidekicks in the PBC world... I'll try to explain :-)
When using PBC on Kubernetes, you need to prepare an image which will run the Bamboo agent. You only need to do this once, in the administrative section of Bamboo, and the details of the process are available on this page: https://bitbucket.org/atlassian/per-build-container/src/master/bamboo-kubernetes-backend-plugin/. Let's call this container the 'agent-container'.
For each Job, you (or a random developer) can define a Docker image which will be used to execute the job. So you will potentially have lots of job-specific containers. Let's call these 'job-1-container', 'job-2-container'... etc.
For each Job, you (or a random developer) can, in addition to the main Docker image executing the build, provide supplementary images, which result in additional containers being created in the Pod during the build. Specifying supplementary images is completely optional and will normally only be useful when doing some sort of integration testing during the build. Let's call these 'job-1-supp-1-container', 'job-1-supp-2-container'... etc.
So, in the base case where no supplementary images are provided, the Pod constructed by the plugin to execute the build will contain the following containers: agent-container and job-1-container.
When supplementary images are provided, the Pod will for example contain the following containers: agent-container, job-1-container, job-1-supp-1-container and job-1-supp-2-container.
In this case: agent-container, job-1-supp-1-container and job-1-supp-2-container can all be seen as 'sidekicks' to the main job-1-container executing the build, hence the potential confusion.
You can safely ignore everything about supplementary containers at this point, as they will not be applicable to your use-case I guess.
Now, what you would like to do is to invoke Docker from within 'job-1-container'. This looks like inception, because 'job-1-container' is itself a container, but it is a valid use-case and can be supported in several ways. The term you are looking for is 'Docker-in-Docker' (abbreviated DinD). Please have a look at the following article as a starting point: https://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-docker-for-ci/, and https://medium.com/hootsuite-engineering/building-docker-images-inside-kubernetes-42c6af855f25 for Kubernetes specifics.
Given your cluster is running on-premises and there are no security issues, you could adjust the pod specification (again, see: https://bitbucket.org/atlassian/per-build-container/src/master/bamboo-kubernetes-backend-plugin/) of the plugin to mount the docker socket of the host system in your build container. I could not find an explicit example of how to do this, but from the following commit (https://bitbucket.org/atlassian/per-build-container/pull-requests/208/relaxed-constraints-for-detecting/diff) I assume it is somehow supported. When you are stuck with this your best option might be to open a ticket on the PBC repository, or try to contact one of the developers directly.
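To make the socket-mounting idea concrete, here is roughly what that part of a pod spec would look like (an illustrative fragment under the assumption that the plugin's spec template can be extended this way; the image name is a placeholder):

```yaml
# Illustrative fragment: expose the host's Docker daemon to the build
# container. Only acceptable on a trusted, on-premises cluster - anything
# running in the container can then fully control the node's Docker daemon.
containers:
- name: build
  image: registry.example.com/centos7-build-env:latest   # placeholder
  volumeMounts:
  - name: docker-sock
    mountPath: /var/run/docker.sock
volumes:
- name: docker-sock
  hostPath:
    path: /var/run/docker.sock
    type: Socket
```

With the socket mounted, a plain docker run inside the build container talks to the node's daemon, so "nested" containers are actually siblings started on the host, which is why the security caveat matters.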
@Jason Templeman could you please shoot me an email at email@example.com (once you find the time)? ;-) I'm currently in the process of building a 'kubernetes agents' plugin for Bamboo and would really like to learn more about your requirements so it can help shape its feature set. More than happy to return the favor by supporting you with anything I know about this stuff (in private or on this thread). Thanks a lot.