External Docker image pulled over and over

viteka February 5, 2021

I have an external Docker image hosted in ECR, which I use in every step of my pipeline. This image changes very rarely (say, once a month). When I run a pipeline, the image seems to be pulled over and over again, though not in every step, only in some. Shouldn't Bitbucket Pipelines keep a local copy of the image, check at each step whether the remote image has changed, and pull only if it has? Perhaps there is an option I could specify in the YAML that I am not using? Since the image is 3.7 GB, it takes ~3 minutes to pull, making each step unnecessarily long.

In the screenshot you can see 4 steps, all using the exact same image. It turns out that steps 1 and 2 wait for the image to be downloaded (both are steps that take less than a minute, not counting the image pull); however, steps 3 and 4 seem to already have the image available (step 4 usually takes 4 minutes to run even when the image is already available).

(Screenshot: ECA57E70-8BA6-4A5E-ABF8-54253112DC0C.jpeg)

1 answer

0 votes
ktomk
Rising Star
February 9, 2021

Hey @viteka ,

> When I run a pipeline, it seems that image is pulled over and over again, but not in every step, only in some; shouldn't Bitbucket pipelines keep a local version of the image, and at each step check if the remote image has changed, and only pull if that is the case?

IIRC you get a checkout per step. That means each step can be distributed onto a different node in a network of nodes. If each node has its own local Docker cache, it's perfectly possible that you sometimes stay on the same node and sometimes not.

However, I'm guessing a bit here because I do not have any insight into how this is done on Bitbucket. But I do know that steps are isolated: you need artifacts to transport files between them, and caches are set per step, too.

What you could try is adding the Docker cache to see if it helps. This might come at the price of activating the Docker service (as the service is necessary to activate the cache).
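A minimal sketch of what that could look like in `bitbucket-pipelines.yml`, using the predefined `docker` cache together with the `docker` service. The image name, the credential variable names, and the script are hypothetical placeholders, not taken from the question. Note that this cache is documented for image layers handled by the Docker service inside the step; whether it has any effect on pulling the step's own build image is exactly the uncertain part.

```yaml
# bitbucket-pipelines.yml (sketch; image name, variables and script are placeholders)
image:
  # Hypothetical ECR image; substitute your own registry, repository and tag.
  name: 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-build-image:latest
  aws:
    # Assumed to be defined as secured repository variables.
    access-key: $AWS_ACCESS_KEY_ID
    secret-key: $AWS_SECRET_ACCESS_KEY

pipelines:
  default:
    - step:
        name: Step 1
        services:
          - docker        # the Docker service must be enabled for the cache to apply
        caches:
          - docker        # predefined Docker layer cache
        script:
          - ./step-1.sh   # placeholder for the real commands of this step
```

The `services` and `caches` entries would have to be repeated on every step that should use the cache, since caches are configured per step.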

Please take it with a grain of salt; I would consider this experimental, to see whether or not it runs better.

Apart from these Docker cache technicalities:

Given that you have four steps that run one after another and all use the same image, what pops into my mind is: why not do these four steps in one step? This will for sure download the image only once.

And with that size, it sounds like it's already a build image that has all the tools you need.

If all four steps already use the same image, the only thing you gain from multiple steps is a clean checkout in between, something you can do with the git utility if it's in your build container.
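As a rough sketch of that idea, the four steps could be collapsed into a single step, with git restoring a clean working tree between the former steps. The `stage-*.sh` scripts are hypothetical stand-ins for whatever each original step runs, and this assumes `git` is installed in the build image.

```yaml
# bitbucket-pipelines.yml (sketch; the stage scripts are placeholders)
pipelines:
  default:
    - step:
        name: All four stages in one step
        script:
          - ./stage-1.sh
          # Discard local modifications and delete untracked and ignored files,
          # approximating the fresh checkout a separate step would have had.
          - git reset --hard HEAD && git clean -fdx
          - ./stage-2.sh
          - git reset --hard HEAD && git clean -fdx
          - ./stage-3.sh
          - git reset --hard HEAD && git clean -fdx
          - ./stage-4.sh
```

Keep in mind that `git clean -fdx` also deletes ignored files such as build output, so anything a later stage needs from an earlier one has to be moved outside the working tree first. If no stage depends on a clean tree, the reset lines can simply be left out.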

Does that sound like an option for you?
