Yesterday, pipelines started failing for one of our repositories. Every run got stuck in the "Build setup" phase and eventually the pipeline would fail because the time limit was exceeded. This happened even for past commits that previously had passed successfully.
Inspecting the pipeline run log, it looks like it actually completes the build setup phase and is about to move on to the next step (i.e. it successfully pulls down the Docker image, checks out the source code, configures the environment, etc.).
After lots of investigating I discovered the problem seems to be connected to the docker "USER" instruction.
Our repo uses a custom docker image based on node:
If I comment out the "USER" line the pipeline runs and passes. If I leave it in the pipelines get stuck at "Build setup" and eventually time out.
Posting this in case another team has run into the same problem. Anybody have some insight into this?
Hello @parogers, and thanks for reaching out to the Atlassian Community!
You should indeed be able to run the pipeline with different users as long as:
1. the image contains this user (or it was created during the docker build)
2. this user has a home directory inside the image
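The two requirements above can be checked from a shell inside the image. This is only a minimal sketch, assuming a POSIX shell and a passwd-format file; the function name `user_home_ok` is hypothetical and is not part of Bitbucket Pipelines or Docker:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if USER has an entry in a passwd-format
# file and the home directory listed in that entry exists.
# Usage: user_home_ok USER [PASSWD_FILE]
user_home_ok() {
    user=$1
    passwd_file=${2:-/etc/passwd}
    # Field 1 is the login name, field 6 is the home directory.
    home=$(awk -F: -v u="$user" '$1 == u { print $6; exit }' "$passwd_file")
    if [ -z "$home" ]; then
        echo "user '$user' not found in $passwd_file" >&2
        return 1
    fi
    if [ ! -d "$home" ]; then
        echo "home directory '$home' does not exist" >&2
        return 1
    fi
    # Print the home directory so the caller can see what was checked.
    echo "$home"
}
```

Run inside the container, `user_home_ok node` would print the home directory of the node user if both conditions hold, and return a non-zero status otherwise.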
For the node user in the node:18.16.0 image, I confirmed that both requirements are met. I tried reproducing the error on my own pipeline, but the build completed successfully, so I wonder whether you might be building the custom image for an architecture other than AMD64, which could cause the pipeline to silently fail after the build setup.
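One way to rule out the architecture question is to look at the architecture recorded in the image metadata. A small sketch, assuming the image is available locally; `is_amd64` is a hypothetical helper and `IMAGE` is a placeholder for your own image name:

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when the given architecture string
# is amd64, the architecture mentioned in the reply above.
is_amd64() {
    [ "$1" = "amd64" ]
}

# Example use against a locally available image (IMAGE is a placeholder):
#   arch=$(docker image inspect --format '{{.Architecture}}' IMAGE)
#   is_amd64 "$arch" || echo "image was built for $arch, not amd64"
```

If the reported architecture is something else (for example arm64 from a build on an ARM machine), rebuilding with `--platform linux/amd64` as in the steps below would be the next thing to try.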
These are the steps I used to run a pipeline successfully with the node user:
RUN echo "this is a test"
docker buildx build --platform linux/amd64 --push -t mydockerhubrepo/testnodeuser .
name: Test step
By following those steps the correct node user was printed by the id command, and the pipeline was completed without errors.
Would it be possible for you to rebuild your custom image based on the instructions above and let us know if the error persists?
If you have any questions, feel free to ask!
Thank you, @parogers !
Thanks Patrik, I think I've been able to narrow down the problem further.
Here's my Dockerfile:
RUN echo "testing"
My docker build command: (tag removed)
docker buildx build --platform linux/amd64 --tag=XYZ -f docker/pipelines/Dockerfile .
Here's my yml file: (image name removed)
name: "Build and test"
With the above setup the pipeline hangs at the "Build setup" phase and eventually times out.
Now if I remove the "caches: node" config from the yml file the pipeline runs fine and passes. If I leave in the caches config but remove "USER node" it also runs fine.
It looks like we don't need the node caching anyways, so I'm removing it from the project. But maybe the problem is an interaction between caching and non-root users?
Hello @parogers ,
Thanks for sharing additional context.
Unfortunately, I was not able to reproduce the issue using the same Dockerfile and YML file you shared. As you mentioned, the cache is indeed extracted during the Build Setup phase as the root user, so a conflict with the non-root user might be involved, although this usually only affects file permissions.
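If you want to test the permissions theory yourself, a step script could list anything in the restored cache that the current user does not own. This is only a sketch under assumptions: `list_foreign_files` is a hypothetical helper, and `node_modules` is assumed to be the directory the node cache restores.

```shell
#!/bin/sh
# Hypothetical helper: prints every path under DIR that is not owned by
# the user running the script. No output means no ownership mismatch.
# Usage: list_foreign_files DIR
list_foreign_files() {
    find "$1" ! -user "$(id -u)"
}

# Example use in a step script (node_modules is an assumption):
#   list_foreign_files node_modules | head
```

Any paths printed would be files the node user did not create, which would be consistent with the cache having been extracted as root. Since the hang happens before the script runs, this can only be checked on a run that gets past Build setup.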
Since I'm not able to reproduce on my end, I would need to access your build in order to investigate if that is the case. I understand you found a workaround by disabling the cache for this particular build, but if you would like to proceed with the investigation of what is causing this issue, please let me know, so I can open an internal ticket for you.
Thank you, @parogers !