Clear the Bitbucket Pipelines caches only if package.json (or other particular files) have changed

Yogeshwar Chaudhari April 28, 2022

Background: 

I am currently working on a Node application and am considering caching "node_modules" so that it can be reused between builds. We have a considerable number of projects in the org, so saving even 5 seconds per build will have a huge impact across the org over time.

What am I looking for:

Ideally, there would be a way to clear the caches based on changes to specific files.
If those files have not changed, use the data from the cache; otherwise, clear the cache -> download new node_modules -> cache again.

For example

pipe: clear-pipeline-caches
   depends: [package.json, package-lock.json]

2 answers

1 vote
Patrik S
Atlassian Team
May 10, 2022

Hello @Yogeshwar Chaudhari ,

Welcome to Atlassian Community!

I would suggest making use of the changesets condition in your pipeline. The changesets condition is defined at the step level and causes a step to execute only if one of the modified files matches an expression in includePaths, as in the example below:

- step:
    name: step1
    script:
      - echo "my condition step"
    condition:
      changesets:
        includePaths:
          # only xml files directly under path1 directory
          - "path1/*.xml"
          # any changes in deeply nested directories under path2
          - "path2/**"

If the commit has not changed any of those files, the step is skipped and considered successful.

In a pull-request pipeline, all commits are taken into account: if you provide an includePaths list of patterns, the step is executed when at least one change in any commit matches one of the patterns. For other types of pipelines, only the last commit is considered. More details are available in the documentation.
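
As a rough sketch of the pull-request case described above, the same condition can sit on a step inside a `pull-requests` section. The `'**'` branch pattern, the step name, and the script are illustrative assumptions, not part of the original answer:

```yaml
pipelines:
  pull-requests:
    '**':            # assumed pattern: run for pull requests from any branch
      - step:
          name: Reinstall dependencies   # hypothetical step name
          script:
            - npm install
          condition:
            changesets:
              includePaths:
                # evaluated against all commits in the pull request
                - "package.json"
                - "package-lock.json"
```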

So, using the changesets condition along with the pipe atlassian/bitbucket-clear-cache, you could create a step that clears the cache and is only executed when the package.json file is modified:

definitions:
  steps:
    - step: &clear-cache-if-needed
        name: 'Clear cache if dependency changed'
        script:
          - pipe: atlassian/bitbucket-clear-cache:3.1.1
            variables:
              BITBUCKET_USERNAME: $BB_USERNAME
              BITBUCKET_APP_PASSWORD: $BB_APP_PASS
              CACHES: ["node"]
              DEBUG: "true"
        condition:
          changesets:
            includePaths:
              # only if package.json file has changed
              - "package.json"
pipelines:
  default:
    - step: *clear-cache-if-needed
    - step:
        name: 'Normal Step'
        script:
          - echo "This is a normal step"
  branches:
    develop:
      - step: *clear-cache-if-needed
    main:
      - step: *clear-cache-if-needed

You can configure the atlassian/bitbucket-clear-cache to clear all the caches or just the ones listed in CACHES variable. You can refer to that pipe's documentation here for details on how it can be configured.
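
For the "clear all the caches" case, a variant of the step might look like the sketch below. It assumes that leaving out the CACHES variable makes the pipe clear every cache, which should be checked against the pipe's documentation. The step name is hypothetical, and package-lock.json is added to the watched files to match the original question:

```yaml
- step:
    name: 'Clear all caches if dependencies changed'   # hypothetical step name
    script:
      - pipe: atlassian/bitbucket-clear-cache:3.1.1
        variables:
          BITBUCKET_USERNAME: $BB_USERNAME
          BITBUCKET_APP_PASSWORD: $BB_APP_PASS
          # CACHES is omitted on purpose; assumed to mean "clear every cache"
    condition:
      changesets:
        includePaths:
          - "package.json"
          - "package-lock.json"
```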

You're also welcome to read the below blogpost for further details :

Hope that helps! Let me know if you have any questions.

Thank you, @Yogeshwar Chaudhari !

Kind regards,

Patrik S

0 votes
Travis Pennetti
I'm New Here
March 20, 2024

Adding an answer in the hope that other folks can avoid the pain of Bitbucket Pipelines' default cache configurations.

I would recommend avoiding the "Pre-defined caches" because they will only update the cache once a week. According to the docs:

When does a cache get cleared?

Any cache which is older than 1 week will be cleared automatically and repopulated during the next build....

 There's got to be a better way...

Use the cache definition block and specify files that should cause the cache to be reloaded. For example, a cache for `node_modules` might look like this:

definitions:
  caches:
    node-modules-cache:
      key:
        files:
          - yarn.lock
      path: node_modules

When `yarn.lock` changes, the cache for `node_modules` will be reloaded:

pipelines:
  default:
    - step: &dependencies
        trigger: automatic
        name: Install Dependencies
        caches:
          - node-modules-cache
        script:
          - yarn install --pure-lockfile
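
For the npm setup in the original question, a similar definition could key the cache on `package-lock.json` instead. This is only a sketch: the cache name `npm-modules-cache` and the step name are made up for illustration, and it uses `npm install` rather than `npm ci` because `npm ci` removes any existing `node_modules` before installing, which would discard the restored cache.

```yaml
definitions:
  caches:
    npm-modules-cache:          # hypothetical cache name
      key:
        files:
          # a change to this file produces a new cache key
          - package-lock.json
      path: node_modules

pipelines:
  default:
    - step:
        name: Install Dependencies (npm)   # hypothetical step name
        caches:
          - npm-modules-cache
        script:
          - npm install
```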

 

Hope this helps someone else avoid the head-scratching and pain of their default options!
