Pipeline cache doesn't work as expected

Uliana Andreeva
September 11, 2018

Hi there,

I have a problem with the Maven cache.

In the first step I see that the cache is extracted successfully:

Cache "mvnrepo": Downloading
Cache "mvnrepo": Downloaded 55 MB in 4 seconds
Cache "mvnrepo": Extracting
Cache "mvnrepo": Extracted in 0 seconds

No new dependencies are downloaded during this step, and in the "Build teardown" section I see:

Cache "mvnrepo": Skipping upload for existing cache

In the second step I again see that the cache is extracted successfully, but then new dependencies are downloaded during this step. In the "Build teardown" section I again see that the cache upload is skipped. And of course, when I rerun this build, these dependencies are downloaded again in the second step.

What should I change so that the cache is updated after the second step and saved between builds, to avoid downloading the dependencies again and again?

My bitbucket-pipelines.yml file: 

image: maven:3.3.9

pipelines:
  default:
    - step:
        name: Unit testing
        caches:
          - mvnrepo
        script:
          - mvn -B test
    - step:
        name: Package build
        caches:
          - mvnrepo
        script:
          - mvn -B package -DskipTests
definitions:
  caches:
    mvnrepo: /root/.m2/repository

I also tried using the default 'maven' cache name in the caches section, but the result is the same.

2 answers

0 votes
Nicu
June 5, 2023

Oops, this was meant to be a comment under the other answer. Not seeing how to delete my own answer. Please ignore this.

0 votes
Steven Vaccarella
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
September 13, 2018

Hi Uliana,

The cache is only uploaded if it doesn't already exist. This is by design, to avoid incurring the expense of uploading the cache on every run. The cache is cleared automatically when it is a week old, or it can be cleared manually in the UI. If your two steps have substantially different dependencies, you could create two separate caches, one for each step. For example:

image: maven:3.3.9

pipelines:
  default:
    - step:
        name: Unit testing
        caches:
          - mvnrepo1
        script:
          - mvn -B test
    - step:
        name: Package build
        caches:
          - mvnrepo2
        script:
          - mvn -B package -DskipTests
definitions:
  caches:
    mvnrepo1: /root/.m2/repository
    mvnrepo2: /root/.m2/repository

See "How does caching work?" in the docs for more details: https://confluence.atlassian.com/bitbucket/caching-dependencies-895552876.html

Cheers,
Steven
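
The upload policy described in this answer can be modeled with a small sketch. This is a toy model for illustration only, not Bitbucket's implementation: the `CacheStore` class, its method names, and the exact expiry comparison are assumptions; only the "upload if missing, clear after a week" behavior comes from the answer.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# One week, per the answer above; the exact expiry mechanics are an assumption.
CACHE_TTL = timedelta(days=7)


@dataclass
class CacheStore:
    """Toy model of a single named cache (hypothetical, not Bitbucket's code)."""
    created_at: Optional[datetime] = None

    def exists(self, now: datetime) -> bool:
        # A cache that has reached the TTL is treated as cleared.
        return self.created_at is not None and now - self.created_at < CACHE_TTL

    def teardown(self, now: datetime) -> str:
        # Upload only when no cache currently exists; otherwise skip,
        # even if the step downloaded new dependencies.
        if self.exists(now):
            return "skipped"
        self.created_at = now
        return "uploaded"


cache = CacheStore()
day0 = datetime(2018, 9, 11)
print(cache.teardown(day0))                      # uploaded: nothing stored yet
print(cache.teardown(day0 + timedelta(days=1)))  # skipped: cache already exists
print(cache.teardown(day0 + timedelta(days=8)))  # uploaded: old cache expired
```

In this model the second run corresponds to the "Skipping upload for existing cache" message in the question: whatever the step downloaded is discarded until the cache expires or is cleared manually.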

R
March 5, 2019

Is there any possibility of having this fixed? For many projects, in particular larger ones, the set of cacheable artifacts changes constantly. In that case the cache stays out of date for up to 7 days, and every single execution triggers a cache miss rather than doing that one upload.

Maybe content hashes for the caches could help to check whether there is an update and then trigger the upload. At least for typical builds of the master branch, that seems to be the desired behavior.
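
The content-hash idea suggested here could look roughly like the following. It is only a sketch of the proposal, not an existing Pipelines feature: `tree_digest` and `needs_upload` are hypothetical helper names, and storing the previous digest somewhere between builds is assumed.

```python
import hashlib
import os
from typing import Optional


def tree_digest(root: str) -> str:
    """Hash relative paths and file bytes under root, ignoring timestamps."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            digest.update(rel.encode("utf-8"))
            digest.update(b"\0")
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(65536), b""):
                    digest.update(chunk)
            digest.update(b"\0")
    return digest.hexdigest()


def needs_upload(root: str, previous_digest: Optional[str]) -> bool:
    """True when there is no stored digest or the contents have changed."""
    return previous_digest is None or tree_digest(root) != previous_digest
```

A teardown step could call `needs_upload("/root/.m2/repository", stored_digest)` and upload only when it returns `True`. Hashing every file has its own cost on a large repository, which is part of the trade-off.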

Steven Vaccarella
March 5, 2019

If memory serves, we experimented with updating the cache automatically based on content, but many of the tools we tested would update metadata (such as timestamps) in the cache even though there were no significant changes, so it's actually not trivial to detect a real change generically across a wide variety of tools.
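
The metadata problem mentioned above can be illustrated with a small sketch. It assumes a change detector that fingerprints files by path, size, and modification time; the `metadata_fingerprint` function and the `lib.jar` file are hypothetical, and this is not how Pipelines itself detects changes.

```python
import os
import tempfile


def metadata_fingerprint(root: str) -> tuple:
    """Fingerprint built from (relative path, size, mtime) only."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            entries.append((os.path.relpath(path, root), st.st_size, st.st_mtime_ns))
    return tuple(sorted(entries))


with tempfile.TemporaryDirectory() as repo:
    artifact = os.path.join(repo, "lib.jar")
    with open(artifact, "wb") as fh:
        fh.write(b"same bytes")
    before = metadata_fingerprint(repo)
    # Simulate a tool "touching" the file: new timestamps, identical bytes.
    os.utime(artifact, ns=(1_000_000_000, 1_000_000_000))
    after = metadata_fingerprint(repo)
    with open(artifact, "rb") as fh:
        unchanged = fh.read() == b"same bytes"
    # The fingerprint differs although the file content did not change.
    print(before != after, unchanged)
```

A detector like this would trigger an upload on every run for tools that rewrite timestamps, while a purely content-based check has to read every file, so neither approach is free.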

Automatic updates could also become a problem when there are multiple branches with slightly different dependencies sharing the same cache (which is often the case).

Having said that, we can certainly look at doing something more intelligent with caches if there is enough interest. If you haven't already, you should watch and vote on this ticket: https://bitbucket.org/site/master/issues/16314/refresh-caches-when-dependencies-are

Nicu
June 5, 2023

Hi,

Is it possible to have user-configurable cache update policy?

In my case, I am using the cache to leverage `ccache` when building C++ code. Ccache is already intelligent about what to keep or evict based on a configurable maximum cache size (and other parameters).

With the current policy of a potentially week-old cache, it's quite possible to get a lot of unnecessary cache misses for things that were touched, say, 6 days ago but not since.

Uploading the cache on every pipeline run might be too much, but can we at least tighten the expiration to ~1 day?
