
Local Runner Failure (since AWS incident) - Docker builds

Scott Klein December 8, 2021

Once the AWS us-east issue was resolved and our pipelines were back up, I started getting errors in my local runner builds that involve Docker.

My pipelines that do not require Docker run just fine.

When I turn off the use of my local runner, they also run just fine.

This is the container log for my local runner:

[2021-12-08 23:12:46,966] Updating runner state to "ONLINE".
[2021-12-08 23:12:55,994] Setting runner state to executing step.
[2021-12-08 23:12:55,997] Getting step StepId{accountUuid={b9c2e85f-a8f8-489d-a409-c179f889814c}, repositoryUuid={6e0c4a8b-aa66-4d76-badc-64856a6e88f7}, pipelineUuid={1d71607b-f83b-4901-8f8e-70d0eb9af25c}, stepUuid={3bbc11dd-6247-4e22-b1a4-987c8839bae0}}.
[2021-12-08 23:12:56,000] Getting oauth token for step.
[2021-12-08 23:12:56,002] Getting environment variables for step.
[2021-12-08 23:12:56,480] Getting all artifacts for step.
[2021-12-08 23:12:56,486] Getting SSH private key.
[2021-12-08 23:12:56,489] Getting known hosts.
[2021-12-08 23:12:56,751] SSH private key not found
[2021-12-08 23:12:57,414] Setting up directories.
[2021-12-08 23:12:57,416] Starting log uploader.
[2021-12-08 23:12:57,420] Removing container 9a911e0f-947c-5944-b4bd-b9119dae8773_3bbc11dd-6247-4e22-b1a4-987c8839bae0_clone
[2021-12-08 23:13:00,382] Removing container 9a911e0f-947c-5944-b4bd-b9119dae8773_3bbc11dd-6247-4e22-b1a4-987c8839bae0_build
[2021-12-08 23:13:00,431] Setting up step timeout: PT2H
[2021-12-08 23:13:00,431] Starting websocket listening to STEP_COMPLETED events.
[2021-12-08 23:13:00,432] Checking for step completion every PT30S seconds.
[2021-12-08 23:13:00,690] Updating step progress to PULLING_IMAGES.
[2021-12-08 23:13:00,965] Pulling image docker-public.packages.atlassian.com/sox/atlassian/bitbucket-pipelines-dvcs-tools:prod-stable.
[2021-12-08 23:13:01,420] Appending log line to main log.
[2021-12-08 23:13:02,970] Pulling image 570148091578.dkr.ecr.us-west-2.amazonaws.com/epcr-mob-api-runner:latest.
[2021-12-08 23:13:04,750] Pulling image docker-public.packages.atlassian.com/sox/atlassian/bitbucket-pipelines-docker-daemon:v20-prod-stable.
[2021-12-08 23:13:07,653] Pulling image docker-public.packages.atlassian.com/sox/atlassian/bitbucket-pipelines-auth-proxy:prod-stable.
[2021-12-08 23:13:09,243] Pulling image docker-hub.packages.atlassian.com/google/pause:latest.
[2021-12-08 23:13:11,571] Removing container 9a911e0f-947c-5944-b4bd-b9119dae8773_3bbc11dd-6247-4e22-b1a4-987c8839bae0_pause
[2021-12-08 23:13:11,625] Updating step progress to CLONING.
[2021-12-08 23:13:11,626] Creating container 9a911e0f-947c-5944-b4bd-b9119dae8773_3bbc11dd-6247-4e22-b1a4-987c8839bae0_pause.
[2021-12-08 23:13:11,728] Starting container.
[2021-12-08 23:13:11,899] Generating clone script.
[2021-12-08 23:13:11,901] Creating container 9a911e0f-947c-5944-b4bd-b9119dae8773_3bbc11dd-6247-4e22-b1a4-987c8839bae0_clone.
[2021-12-08 23:13:11,901] Executing clone script in clone container.
[2021-12-08 23:13:12,011] Starting container.
[2021-12-08 23:13:12,038] Removing container 9a911e0f-947c-5944-b4bd-b9119dae8773_3bbc11dd-6247-4e22-b1a4-987c8839bae0_system_docker
[2021-12-08 23:13:12,100] Creating container 9a911e0f-947c-5944-b4bd-b9119dae8773_3bbc11dd-6247-4e22-b1a4-987c8839bae0_system_docker.
[2021-12-08 23:13:12,100] Removing container 9a911e0f-947c-5944-b4bd-b9119dae8773_3bbc11dd-6247-4e22-b1a4-987c8839bae0_system_auth-proxy
[2021-12-08 23:13:12,160] Creating container 9a911e0f-947c-5944-b4bd-b9119dae8773_3bbc11dd-6247-4e22-b1a4-987c8839bae0_system_auth-proxy.
[2021-12-08 23:13:12,241] Starting container.
[2021-12-08 23:13:12,251] Starting container.
[2021-12-08 23:13:12,454] Adding container log: /var/lib/docker/containers/c9379b6d8c472d54ab430d46ec380b3768af1dd9327799f86680408b70d69c0f/c9379b6d8c472d54ab430d46ec380b3768af1dd9327799f86680408b70d69c0f-json.log
[2021-12-08 23:13:12,455] Waiting on container to exit.
[2021-12-08 23:13:12,457] Creating exec into container.
[2021-12-08 23:13:12,512] Starting exec into container and waiting for exec to exit.
[2021-12-08 23:13:12,585] Adding container log: /var/lib/docker/containers/7c288ae85ea2833f67bc014c8168751343fd33f4d856e6fd5d3b876422b0b793/7c288ae85ea2833f67bc014c8168751343fd33f4d856e6fd5d3b876422b0b793-json.log
[2021-12-08 23:13:12,585] Waiting on container to exit.
[2021-12-08 23:13:12,749] Adding container log: /var/lib/docker/containers/129c939437feee239939214da5a35c73c3a234bd26ba05cb85cc4f0857cb10ff/129c939437feee239939214da5a35c73c3a234bd26ba05cb85cc4f0857cb10ff-json.log
[2021-12-08 23:13:12,750] Waiting on container to exit.
[2021-12-08 23:13:12,822] Container has state (exitCode: Some(4), OOMKilled Some(false))
[2021-12-08 23:13:12,825] Removing container 9a911e0f-947c-5944-b4bd-b9119dae8773_3bbc11dd-6247-4e22-b1a4-987c8839bae0_build
[2021-12-08 23:13:12,880] Not uploading caches. (numberOfCaches: 1, resultOrError: FAILED)
[2021-12-08 23:13:12,882] Updating step progress to UPLOADING_ARTIFACTS.
[2021-12-08 23:13:13,242] Updating step progress to PARSING_TEST_RESULTS.
[2021-12-08 23:13:13,420] Appending log line to log: {6ae33797-2008-4cb0-ad33-f24de01eb6b6}.
[2021-12-08 23:13:13,423] Appending log line to main log.
[2021-12-08 23:13:13,607] Test report processing complete.
[2021-12-08 23:13:13,607] Removing container 9a911e0f-947c-5944-b4bd-b9119dae8773_3bbc11dd-6247-4e22-b1a4-987c8839bae0_clone
[2021-12-08 23:13:13,802] Removing container 9a911e0f-947c-5944-b4bd-b9119dae8773_3bbc11dd-6247-4e22-b1a4-987c8839bae0_clone
[2021-12-08 23:13:13,823] Appending log line to log: {a275ad9d-9511-4696-b197-59a4210e8be0}.
[2021-12-08 23:13:13,874] Removing container 9a911e0f-947c-5944-b4bd-b9119dae8773_3bbc11dd-6247-4e22-b1a4-987c8839bae0_build
[2021-12-08 23:13:13,929] Removing container 9a911e0f-947c-5944-b4bd-b9119dae8773_3bbc11dd-6247-4e22-b1a4-987c8839bae0_system_docker
[2021-12-08 23:13:14,001] Removing container 9a911e0f-947c-5944-b4bd-b9119dae8773_3bbc11dd-6247-4e22-b1a4-987c8839bae0_system_auth-proxy
[2021-12-08 23:13:14,131] Removing container 9a911e0f-947c-5944-b4bd-b9119dae8773_3bbc11dd-6247-4e22-b1a4-987c8839bae0_pause
[2021-12-08 23:13:14,285] Updating step progress to COMPLETING_LOGS.
[2021-12-08 23:13:14,420] Appending log line to main log.
[2021-12-08 23:13:14,565] Shutting down log uploader.
[2021-12-08 23:13:14,778] Tearing down directories.
[2021-12-08 23:13:14,780] Cancelling timeout
[2021-12-08 23:13:14,780] Completing step with result Result{status=FAILED, error=None}.
[2021-12-08 23:13:15,061] Setting runner state to not executing step.
[2021-12-08 23:13:15,062] Waiting for next step.
[2021-12-08 23:13:16,961] Updating runner state to "ONLINE".

The only error-ish thing I see is this log line:

[2021-12-08 23:13:12,822] Container has state (exitCode: Some(4), OOMKilled Some(false))

 

Why would I suddenly be getting OOM errors on builds that have run for months without issue?

1 answer

Answer accepted
Scott Klein December 9, 2021

The solution was to reset Docker, let it restart, and re-install the runner image. I had some trouble with this as well: I had to hard-kill the Docker service and restart my machine before Docker would start up again.

FYI: this is also likely related to the fact that I lost my internet connection in the middle of a build running on my local runner. That may have left Docker in a bad state that could not be fixed by just deleting containers/images and re-installing. I had to reset Docker back to factory settings and then re-install the runner.
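
A note on the log line quoted in the question: `OOMKilled Some(false)` reads as "the container was not killed for running out of memory", so the line reports a plain non-zero exit (code 4) rather than an OOM condition. For anyone scanning a long runner log for these lines, here is a minimal Python sketch. The regular expression is inferred from that single log line and is an assumption for other runner versions; `parse_container_state` is a hypothetical helper name, not part of the runner.

```python
import re
from typing import Optional, Tuple

# Pattern inferred from the one "Container has state" line in this thread.
# Other runner versions may format the line differently (assumption).
STATE_RE = re.compile(
    r"Container has state \(exitCode: Some\((?P<code>-?\d+)\), "
    r"OOMKilled Some\((?P<oom>true|false)\)\)"
)


def parse_container_state(line: str) -> Optional[Tuple[int, bool]]:
    """Return (exit_code, oom_killed) for a matching log line, else None."""
    match = STATE_RE.search(line)
    if match is None:
        return None
    return int(match.group("code")), match.group("oom") == "true"


line = (
    "[2021-12-08 23:13:12,822] Container has state "
    "(exitCode: Some(4), OOMKilled Some(false))"
)
print(parse_container_state(line))  # (4, False): exit code 4, not OOM-killed
```

Run over every line of the container log, this makes it easy to tell a genuine OOM kill (`True` in the second position) from an ordinary failing exit code like the one here.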
