Background:
I am currently working on a Node application. I am considering caching `node_modules` so it can be reused between builds. We have a considerable number of projects in the org, so saving even 5 seconds per build will have a huge cumulative impact across the org over time.
What I am looking for:
It would be ideal to have a way to clear the caches based on changes to specific files. If those files have not changed, use the data from the cache; otherwise, clear the cache, download fresh `node_modules`, and cache them again.
For example:

```yaml
pipe: clear-pipeline-caches
depends: [package.json, package-lock.json]
```
Adding an answer in hopes that other folks will avoid the pain of Bitbucket Pipelines' default cache configuration.
I would recommend avoiding the "Pre-defined caches" because they are only refreshed once a week. According to the docs:
When does a cache get cleared?
Any cache which is older than 1 week will be cleared automatically and repopulated during the next build....
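For context, this is the pre-defined setup being warned against. The sketch below is illustrative, assuming Bitbucket's built-in `node` cache name; the step details are hypothetical:

```yaml
# Illustrative only: the pre-defined "node" cache stores node_modules,
# but it is only invalidated by the weekly expiry, never by lockfile changes.
pipelines:
  default:
    - step:
        name: Install Dependencies
        caches:
          - node          # Bitbucket's pre-defined node cache
        script:
          - yarn install
```

With this setup, a stale cache can serve outdated dependencies for up to a week after the lockfile changes.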
Use the cache definition block and specify files that should cause the cache to be reloaded. For example, a cache for `node_modules` might look like this:
```yaml
definitions:
  caches:
    node-modules-cache:
      key:
        files:
          - yarn.lock
      path: node_modules
```
When `yarn.lock` changes, the cache for `node_modules` will be reloaded:
```yaml
pipelines:
  default:
    - step: &dependencies
        trigger: automatic
        name: Install Dependencies
        caches:
          - node-modules-cache
        script:
          - yarn install --pure-lockfile
```
Hope this helps someone else avoid the head-scratching and pain of their default options!
Hello @Yogeshwar Chaudhari ,
Welcome to Atlassian Community!
I would suggest making use of the changesets condition in your pipeline. The changesets condition is defined at the step level and is used to execute a step only if one of the modified files matches an expression in includePaths, like the example below:
```yaml
- step:
    name: step1
    script:
      - echo "my condition step"
    condition:
      changesets:
        includePaths:
          # only xml files directly under path1 directory
          - "path1/*.xml"
          # any changes in deeply nested directories under path2
          - "path2/**"
```
If the commit has not changed any of those files, the step will be skipped and considered successful.
In a pull-request pipeline, all commits are taken into account, and if you provide an includePaths list of patterns, a step will be executed when at least one commit change matches one of the patterns. For other types of pipelines, only the last commit is considered. You can find more details in the documentation.
So using the changesets condition, along with the pipe atlassian/bitbucket-clear-cache, you could create a step to clear the cache that is only executed when the package.json file is modified:
```yaml
definitions:
  steps:
    - step: &clear-cache-if-needed
        name: 'Clear cache if dependency changed'
        script:
          - pipe: atlassian/bitbucket-clear-cache:3.1.1
            variables:
              BITBUCKET_USERNAME: $BB_USERNAME
              BITBUCKET_APP_PASSWORD: $BB_APP_PASS
              CACHES: ["node"]
              DEBUG: "true"
        condition:
          changesets:
            includePaths:
              # only if package.json file has changed
              - "package.json"

pipelines:
  default:
    - step: *clear-cache-if-needed
    - step:
        name: 'Normal Step'
        script:
          - echo "This is a normal step"
  branches:
    develop:
      - step: *clear-cache-if-needed
    main:
      - step: *clear-cache-if-needed
```
You can configure atlassian/bitbucket-clear-cache to clear all the caches or just the ones listed in the CACHES variable. You can refer to that pipe's documentation for details on how it can be configured.
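For instance, a sketch of the two modes based on the pipe's variables (the "docker" cache name here is just an example):

```yaml
# Sketch: restrict the pipe to specific caches by listing their names.
# Omit the CACHES variable entirely to clear all caches in the repository.
- pipe: atlassian/bitbucket-clear-cache:3.1.1
  variables:
    BITBUCKET_USERNAME: $BB_USERNAME
    BITBUCKET_APP_PASSWORD: $BB_APP_PASS
    CACHES: ["node", "docker"]   # example cache names; omit to clear everything
```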
You're also welcome to read the blog post below for further details.
Hope that helps! Let me know if you have any questions.
Thank you, @Yogeshwar Chaudhari !
Kind regards,
Patrik S