Self-hosted Linux Docker runner failing with "No such container: ..._system_auth-proxy"

Chris Flora February 22, 2024

When running my pipeline against my freshly spun-up Docker installation, I'm getting this error before any of the scripts run:

Status 404: {"message":"No such container: 67d5e5b0-717b-56f7-bcb9-ffdc1e351cf2_f6ff8605-0455-443a-b606-394a84893155_system_auth-proxy"}
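For anyone who wants to search their own logs for the same container, here is a minimal Python sketch that pulls the container name out of an error line like the one above. It assumes the error always has the form "Status <code>: <JSON body>" with a "message" field, which is only what this one message shows; `missing_container` is a hypothetical helper name, not part of the runner or Docker.

```python
import json
import re

# Assumption: errors look like 'Status <code>: {"message": "..."}',
# as in the single message quoted in this post. Other shapes return None.
ERROR_RE = re.compile(r"Status (\d+): (\{.*\})")
MISSING_RE = re.compile(r"No such container: (\S+)")


def missing_container(error_line):
    """Return the container name from a 'No such container' error, or None."""
    match = ERROR_RE.search(error_line)
    if match is None:
        return None
    try:
        body = json.loads(match.group(2))
    except json.JSONDecodeError:
        return None
    found = MISSING_RE.search(str(body.get("message", "")))
    return found.group(1) if found else None
```

The returned name can then be used as a search string against the Docker daemon logs on the runner host.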

I don't see anything of note in the Docker logs: a "Remove" is logged, immediately followed by a "Creating", and two seconds later it "Inspects" and fails. This happens right after it "Inspects" my primary image.
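One way to line up that Remove/Create/Inspect sequence is to capture Docker's event stream as JSON (`docker events --format '{{json .}}'`) while a step runs and filter it down to the one container. The sketch below is only an illustration: `container_timeline` is a made-up helper, and the field names it reads (`Type`, `Action`, `time`, `Actor.Attributes.name`) are my understanding of Docker's JSON event output and should be checked against real output on your host.

```python
import json


def container_timeline(event_lines, name_fragment):
    """Collect (time, action, name) for container events whose name
    contains name_fragment, ordered by event time."""
    timeline = []
    for raw in event_lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON event line
        if not isinstance(event, dict) or event.get("Type") != "container":
            continue
        # Assumed layout: the container name lives under Actor.Attributes.name
        attrs = (event.get("Actor") or {}).get("Attributes") or {}
        name = attrs.get("name", "")
        if name_fragment in name:
            timeline.append((event.get("time"), event.get("Action"), name))
    return sorted(timeline, key=lambda item: item[0] or 0)
```

Passing the lines of a saved capture together with "system_auth-proxy" shows whether that container was ever created, and whether it was destroyed before the failing inspect.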

I haven't been able to find anything about this "system_auth-proxy" container, and have no idea why it's failing. Any ideas? The exact same config works fine on the Atlassian-hosted runners.

I have Docker running on a clean Ubuntu 22.04 VM, installed from Docker's own repos following their documentation. CPU, memory, and disk space are nowhere near the limits for the VM. I also tried installing Docker on bare metal and saw the same results.

Here's the full bitbucket-pipelines.yml if it helps. This is getting pushed to the develop branch. I've also tried installing Hugo from scratch in the build step instead of using a prebuilt image, which made no difference.

Any help or ideas on where to look are appreciated!!

pipelines:
  default:
    - step:
        name: Default no-op
        script:
          - echo "Only pushing develop and main branches to Firebase Hosting"
  branches:
    main:
      - step:
          name: Build
          runs-on:
            - self.hosted
            - linux
          image: hugomods/hugo:0.122.0
          script:
            - hugo --cleanDestinationDir --minify -e production
          artifacts:
            - public/**
      - step:
          name: Deploy to Firebase Hosting
          runs-on:
            - self.hosted
            - linux
          deployment: production
          script:
            - pipe: atlassian/firebase-deploy:5.1.0
              variables:
                FIREBASE_TOKEN: $FIREBASE_TOKEN
                PROJECT_ID: $FIREBASE_PROJECT
                EXTRA_ARGS: --only hosting:$FIREBASE_SITE
    develop:
      - step:
          name: Build the Hugo site
          runs-on:
            - self.hosted
            - linux
          image: hugomods/hugo:0.122.0
          script:
            - hugo -DEF --cleanDestinationDir --minify -e develop
          artifacts:
            - public/**
      - step:
          name: Deploy to Firebase Hosting
          runs-on:
            - self.hosted
            - linux
          deployment: develop
          image: node:18.16.1
          caches:
            - node
          script:
            # npm pins a package version with "@", not ":"
            - npm install -g firebase-tools@13.1.0
            - firebase deploy --token $FIREBASE_TOKEN --project $FIREBASE_PROJECT --only hosting:$FIREBASE_SITE

 

2 answers

Answer accepted
Chris Flora March 6, 2024

The problem appears to have been something on my Proxmox server. My original runner was set up in a Proxmox VM running Ubuntu 22.04. I then tried a second time with Docker installed directly on the Proxmox host, and got the same error in both cases. My testing had me leaning towards a problem on the Docker side rather than the Bitbucket side, and the only lead I found while searching for similar issues pointed to SELinux. SELinux isn't installed on Proxmox, but it made me wonder whether AppArmor or some root-level Proxmox dependency was involved, so I decided to try spinning up a new bare-metal Ubuntu server on my network. And bam, it worked immediately.

So for anyone digging this thread up in the future, if you're running on Proxmox, try running it outside of Proxmox.  I am not planning on digging any deeper here since I was already planning on moving away from Proxmox for other reasons.  But good luck!

Chris Flora February 23, 2024

Oh, and the really fun part is that the error is intermittent.  Sometimes I get the error on the first step of the pipeline, sometimes on the second step, and sometimes not at all.  And it can vary with the exact same git commit.  In the screenshot below, you can see where some failed in less than a minute, which corresponds to the first step, and some failed after over a minute (second step):

[Screenshot, 2024-02-23: Bitbucket Pipelines run list for ctflora / chore-chore-website]

Chris Flora March 5, 2024

OK, spent some more time on this today.  Trying to eliminate any variables I could think of, so I:

  1. Set up a fresh Docker install on my bare-metal server instead of inside a VM
  2. Deleted the old runner and created a fresh one.  The new runner immediately shows "ONLINE" in the pipelines settings.  This one is at v1.561, while the previous attempt was at v1.559
  3. Replaced my bitbucket-pipelines.yml with a dead-simple "Hello World"

Aaaaaaand I'm still getting the same 404 error as shown in the top comment.

 

bitbucket-pipelines.yml:

pipelines:
  default:
    - step:
        name: Hello World
        runs-on:
          - self.hosted
          - linux
        image: atlassian/default-image:latest
        script:
          - echo "Hello, self-hosted pipeline World!!!"
