I have a self-hosted runner configured on a Kubernetes cluster (EKS), and I'm having issues with the underlying runner process pulling images from ECR at runtime.
I've uploaded the auxiliary images from "Use your Docker images in self-hosted runners | Bitbucket Cloud | Atlassian Support" into a private repo.
There are no issues pulling images from the cluster, or even from within the containers (I've manually logged into the appropriate ECR repo). However, the runner has a Java process that is triggered when a step is executed and pulls additional images at runtime, and that is where I get the following error:
Status 500: {"message":"Head https://XXX.dkr.ecr.eu-west-1.amazonaws.com/v2/XXX/tools/manifests/prod-stable: no basic auth credentials"}
It seems odd that the Java process doesn't use the Docker config that is already present in the container.
Below is the redacted manifest I'm using to deploy the runner (note that the runner deploys successfully; the issue only appears at job execution):
apiVersion: v1
kind: List
items:
  - apiVersion: v1
    kind: Secret
    metadata:
      name: runner-oauth-credentials
      labels:
        accountUuid: XXX
        repositoryUuid: XXX
        runnerUuid: XXXX
    data:
      oauthClientId: XXX=
      oauthClientSecret: XXX==
  - apiVersion: batch/v1
    kind: Job
    metadata:
      name: runner
    spec:
      template:
        metadata:
          labels:
            accountUuid: XXX
            repositoryUuid: XXX
            runnerUuid: XXX
        spec:
          containers:
            - name: runner
              image: XXX.dkr.ecr.eu-west-1.amazonaws.com/XXX/bitbucket/pipeline-runner:1.435
              resources:
                limits:
                  memory: "8Gi"
                  cpu: "2"
              env:
                - name: ACCOUNT_UUID
                  value: "{XXX}"
                - name: REPOSITORY_UUID
                  value: "{XXXX}"
                - name: RUNNER_UUID
                  value: "{XXX}"
                - name: OAUTH_CLIENT_ID
                  valueFrom:
                    secretKeyRef:
                      name: runner-oauth-credentials
                      key: oauthClientId
                - name: OAUTH_CLIENT_SECRET
                  valueFrom:
                    secretKeyRef:
                      name: runner-oauth-credentials
                      key: oauthClientSecret
                - name: WORKING_DIRECTORY
                  value: "/tmp"
                - name: PAUSE_IMAGE
                  value: "XXX.dkr.ecr.eu-west-1.amazonaws.com/XXX/bitbucket/pipeline-pause:latest"
                - name: AUTH_PROXY_IMAGE
                  value: "XXX.dkr.ecr.eu-west-1.amazonaws.com/pipeline/XXX/pipeline-auth-proxy:prod-stable"
                - name: CLONE_IMAGE
                  value: "XXX.dkr.ecr.eu-west-1.amazonaws.com/XXX/pipeline-tools:prod-stable"
              volumeMounts:
                - name: tmp
                  mountPath: /tmp
                - name: docker-containers
                  mountPath: /var/lib/docker/containers
                  readOnly: true # the runner only needs to read these files, never write to them
                - name: var-run
                  mountPath: /var/run
            - name: docker-in-docker
              image: XXX.dkr.ecr.eu-west-1.amazonaws.com/XXX/docker:20.10.7-dind
              resources:
                limits:
                  memory: "8Gi"
                  cpu: "2"
              securityContext:
                privileged: true # required to allow docker in docker to run; assumes the namespace you're applying this to has a pod security policy that allows privilege escalation
              volumeMounts:
                - name: tmp
                  mountPath: /tmp
                - name: docker-containers
                  mountPath: /var/lib/docker/containers
                - name: var-run
                  mountPath: /var/run
          imagePullSecrets:
            - name: ecr-credentials
          restartPolicy: OnFailure # this allows the runner to restart locally if it crashes
          volumes:
            - name: tmp # required to share a working directory between docker in docker and the runner
            - name: docker-containers # required to share the containers directory between docker in docker and the runner
            - name: var-run # required to share the docker socket between docker in docker and the runner
      # backoffLimit: 6 # this is the default; the job will retry up to 6 times, with exponential backoff, before it considers itself a failure
      # completions: 1 # this is the default; the job should ideally never complete as the runner never shuts down successfully
      # parallelism: 1 # this is the default; there should only be one instance of this particular runner
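As far as I understand, the imagePullSecrets above only cover the kubelet pulling the runner and docker-in-docker container images themselves; they don't apply to the pulls the runner's Java process triggers through the docker-in-docker daemon at job time. One workaround I'm considering is to mount a pre-generated Docker credentials file into the runner container. This is only a sketch: it assumes the Java process behaves like a regular Docker client and reads /root/.docker/config.json, and the ecr-docker-config Secret name and its payload are placeholders I made up.

  # Hypothetical Secret holding a docker config.json generated from an ECR token
  # (e.g. via `aws ecr get-login-password`); ECR tokens expire after roughly 12 hours,
  # so the secret would need to be refreshed periodically.
  - apiVersion: v1
    kind: Secret
    metadata:
      name: ecr-docker-config # placeholder name
    type: kubernetes.io/dockerconfigjson
    data:
      .dockerconfigjson: XXX # base64-encoded docker config.json with an auths entry for the ECR registry

with the following added to the runner container and pod spec of the Job above:

              volumeMounts:
                - name: docker-config # assumed mount so the Java Docker client can find credentials
                  mountPath: /root/.docker
                  readOnly: true
          volumes:
            - name: docker-config
              secret:
                secretName: ecr-docker-config
                items:
                  - key: .dockerconfigjson
                    path: config.json # exposes the secret as /root/.docker/config.json

I haven't verified whether the runner's Java client actually reads that path, so this is only a guess. Has anyone solved ECR authentication for the images the runner pulls at job time?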