Can somebody explain what these two config options actually mean and how they work in practice? I mean, how does the user see this in use?
Controls whether ref advertisement operations are cached.
Controls whether clone operations are cached.
I tried reading https://confluence.atlassian.com/display/STASH/Scaling+Stash+for+Continuous+Integration+performance but it doesn't say much about how the user uses this, or when and why you should do it.
I mean, why would you want to cache a clone? When would that be a good idea? In a fast-changing world, won't a cached clone expire on the next commit?
There are really two benefits of caching a clone. The main benefit is for CI servers that perform multiple clones for any given build. Examples are multi-step builds or dependent build plans. In these cases a single push to a repository will trigger multiple clones of the same repository. For instance, in our CI environment, a single change to the repository will trigger our main build, which has been split into 6 parallel build steps. When that main build goes green, a number of specialized builds are triggered (performance, multi-database, etc.). All in all, that single push can trigger up to 25 clones. Given that cloning a repository is a fairly CPU- and memory-heavy operation, caching the clone significantly reduces that cost.
The second benefit is that by streaming the clone to a cache, the git command can terminate as soon as the output has been written to disk (vs streamed over the network). Since the git process holds on to a big chunk of memory until the clone has been fully sent to the client, that can reduce the time that memory on the server is being claimed.
Your point about the cache expiring on the next commit is valid, but the CI-triggered clones typically happen in a fairly short time span (within 1m for parallel builds and, say, within 30m - 1h for dependent builds), so there is a good chance that the cache won't be expired by a push in that time frame.
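To make the hit/miss behaviour concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not how Stash implements its cache: the class `ToyCloneCache`, the repository name `PROJ/repo`, the request key `depth=1`, and the rule "a push drops every entry for that repository" are all made up for illustration.

```python
class ToyCloneCache:
    """Hypothetical in-memory model of a clone cache (not Stash's real code)."""

    def __init__(self):
        self._entries = {}  # (repo, request_key) -> cached pack bytes
        self.hits = 0
        self.misses = 0

    def clone(self, repo, request_key, build_pack):
        """Serve a clone from the cache, building the pack only on a miss."""
        key = (repo, request_key)
        if key in self._entries:
            self.hits += 1
        else:
            self.misses += 1
            self._entries[key] = build_pack()  # the expensive git work
        return self._entries[key]

    def on_push(self, repo):
        """Assumed invalidation rule: a push drops all entries for that repo."""
        for key in [k for k in self._entries if k[0] == repo]:
            del self._entries[key]


cache = ToyCloneCache()

# One push, followed by 25 identical CI clones before the next push arrives.
cache.on_push("PROJ/repo")
for _ in range(25):
    cache.clone("PROJ/repo", "depth=1", lambda: b"pack-data")
after_first_push = (cache.hits, cache.misses)

# A second push invalidates the entry, so the next clone is a miss again.
cache.on_push("PROJ/repo")
cache.clone("PROJ/repo", "depth=1", lambda: b"pack-data")
after_second_push = (cache.hits, cache.misses)
```

In this model only the first clone after each push pays the full cost. The other 24 are served from the cache, which is the saving described above.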
You can check whether you're benefitting from the cache or not by sending a REST request to
The output lists the number of cache hits and misses (overall and per repository).
Hope this helps,
@Michael thank you for the quick answer. Why is it that your build does a clone and not just a fetch?
How does the clone cache actually work? I mean, is the clone cache linked to a commit, so that you have multiple clone caches for different commits?
Do you have any comment on the cache-refs? What are those for?
We have many different build agents that build many different projects. A fetch requires that there is an initial clone of the repository on that agent, which isn't always true. For simplicity the agents always perform a (shallow) clone.
The clone cache inspects the clone request and creates a cache key from the list of refs and other parameters that the git client requests. So, a 'git clone --depth 1 <repo-url>' would result in a different cache key than 'git clone <repo-url>', etc.
The cache-refs cache the output of the 'ref advertisement' phase of the clone/fetch protocol, in which the server sends a list of refs (plus the commit hashes they point to) to the client. This call is typically not expensive, but if you have a build server that's configured to poll for changes frequently, it could still be beneficial to cache these refs as well.
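As a rough illustration of that idea, the sketch below derives a key from the parts of a clone request. The function `clone_cache_key`, its parameters, and the choice of SHA-256 over a JSON dump are assumptions made for this example. The actual fields and hashing used by the server are not described here.

```python
import hashlib
import json


def clone_cache_key(repo_id, wanted_refs, depth=None, capabilities=()):
    """Build a deterministic key from a clone request (hypothetical model)."""
    request = {
        "repo": repo_id,
        "wants": sorted(wanted_refs),   # order of refs must not matter
        "depth": depth,                 # None models a full clone
        "caps": sorted(capabilities),   # other client-requested parameters
    }
    payload = json.dumps(request, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# 'git clone <repo-url>' versus 'git clone --depth 1 <repo-url>'
full_key = clone_cache_key("PROJ/repo", ["refs/heads/master"])
shallow_key = clone_cache_key("PROJ/repo", ["refs/heads/master"], depth=1)

# A second agent issuing the same shallow clone gets the same key.
repeat_key = clone_cache_key("PROJ/repo", ["refs/heads/master"], depth=1)
```

Here `full_key` and `shallow_key` differ, so the two kinds of clone would be cached separately, while `repeat_key` equals `shallow_key`, so identical requests from different build agents could share one cached result.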