Hi all,
We use Bitbucket Server v5.7.
When pushing large Git repositories (20 GB), we sometimes see a git gc process start automatically during the push and slow the server down *extremely*.
I understand git gc can be run manually, and is usually triggered automatically when the repository reaches a certain size or number of objects.
Is there any way we can control this automatic process to only start outside of office hours so it doesn't impact performance?
Or maybe we can cancel it altogether and schedule it to run manually at night?
Thanks,
Yael
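For readers wondering what "certain sizes/number of objects" means in practice: stock Git decides whether `git gc --auto` does any work by comparing the number of loose objects and packs against the `gc.auto` and `gc.autoPackLimit` settings (defaults 6700 and 50). The sketch below only approximates that check, since Git itself estimates the loose-object count rather than counting exactly. The function names are made up for illustration, and this thread does not confirm whether Bitbucket Server 5.7 leaves these settings at their defaults.

```python
import subprocess

# Stock Git defaults; a repository or server may override them.
DEFAULT_GC_AUTO = 6700
DEFAULT_GC_AUTO_PACK_LIMIT = 50


def parse_count_objects(output):
    """Parse the 'key: value' lines printed by `git count-objects -v`."""
    stats = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value.strip())
    return stats


def auto_gc_likely(stats, gc_auto=DEFAULT_GC_AUTO,
                   pack_limit=DEFAULT_GC_AUTO_PACK_LIMIT):
    """Rough guess at whether `git gc --auto` would do any work.

    gc_auto == 0 disables automatic gc entirely, and pack_limit == 0
    disables only the pack-count check, mirroring stock Git.
    """
    if gc_auto <= 0:
        return False
    too_many_loose = stats.get("count", 0) > gc_auto
    too_many_packs = pack_limit > 0 and stats.get("packs", 0) > pack_limit
    return too_many_loose or too_many_packs


def count_objects(repo_path):
    """Run `git count-objects -v` in repo_path and return the parsed numbers."""
    result = subprocess.run(
        ["git", "-C", repo_path, "count-objects", "-v"],
        check=True, capture_output=True, text=True,
    )
    return parse_count_objects(result.stdout)
```

Calling `auto_gc_likely(count_objects("/path/to/repo.git"))` just before a large push gives a rough idea of whether that push is likely to tip the repository over the threshold; the path here is a placeholder.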
You cannot control garbage collection. GC happens when the garbage collector algorithm decides it should happen. You should investigate why garbage collection is happening, and make sure that it is really GC that slows down your system. Usually that is not the case. Try to figure out which classes are hot, look for deadlocks, and make sure that your Bitbucket instance has enough JVM memory.
Hi Alexey,
Thanks for the quick response.
Is the gc process related to the Bitbucket JVM parameters?
If so, I see that the JVM is currently configured with -Xms512m and -Xmx1g.
What are the recommendations for JVM tuning?
Our server's resources are:
8 CPUs
32 GB RAM
Thanks,
Yael
It is difficult to say without analyzing your instance. Try increasing the JVM heap to 8 GB.
Hi Alexey,
I saw this statement in the Atlassian documentation:
"The memory consumption of Git is not managed by the memory settings in _start-webapp.sh or _start-webapp.bat. The Git processes are executed outside of the Java virtual machine, and as a result the JVM memory settings do not apply to Git."
Is JVM tuning really relevant in this case, given that git gc is also a Git process?
Yael
The documentation is correct. Bitbucket uses Git as an external program, so Git runs in a different process from Bitbucket. As far as I understand, you have problems with Bitbucket, not with Git. As I said earlier, it is difficult to say what your problem is without analysis. But you can try increasing the JVM heap.
In the vast majority of cases, increasing the JVM heap size does more harm than good: memory allocated to the JVM is not available to Git processes, which starves them of memory. It is therefore not recommended to increase the heap beyond the default settings unless a specific situation requires it. 8 GB is definitely excessive.
Git garbage collection is completely separate from JVM garbage collection and increasing the heap size will not help in this case at all.
Cheers,
Christian
Premier Support Engineer
Atlassian
Hi Christian,
Thanks for your response.
Our gc problem actually stems from another problem, related to how commits are compressed:
We push many commits that are very similar in content but are created as distinct commit objects (using "git commit-tree", for example).
This somehow "tricks" Git into treating them as totally different commits, causing the repository to double in size on each push. This is why the git gc process matters so much in our case.
Questions:
1. Can we somehow tune the Bitbucket/Git compression algorithm to recognize this pattern better?
2. Can we schedule a nightly "git gc" for all or some Git repositories? Maybe using cron? Is there a suggested solution for this requirement?
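On question 1, stock Git looks for similar objects only when it builds a pack, and how hard it looks is governed by the delta window and depth (`pack.window`, default 10, and `pack.depth`, default 50). A minimal sketch of building a full repack command with a wider window follows. The helper name and the chosen values are illustrative only, and nothing in this thread confirms that this helps with the commit-tree pattern described above or that Bitbucket Server supports running it against its repositories, so it would need to be tried on a copy first.

```python
def build_repack_command(repo_path, window=50, depth=50):
    """Return the argv for a full repack that recomputes deltas.

    -a -d : put everything into one pack and drop the now-redundant packs
    -f    : do not reuse existing deltas, so similar objects are re-compared
    The window and depth defaults here are illustrative, not recommendations.
    """
    if window < 1 or depth < 1:
        raise ValueError("window and depth must be positive")
    return [
        "git", "-C", repo_path, "repack", "-a", "-d", "-f",
        f"--window={window}", f"--depth={depth}",
    ]


# Example: build (but do not run) the command for a hypothetical repository.
command = build_repack_command("/path/to/repo.git", window=100)
```

A larger window makes Git compare each object against more candidates, which costs CPU and memory during the repack, so the trade-off is best measured on a copy of an affected repository.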
Thanks,
Yael
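On question 2, one possible shape for a nightly job is a small script started from cron that walks the repository directories and runs `git gc` in each, optionally after setting `gc.auto` to 0 so that stock Git no longer starts collection on its own. Everything specific below is an assumption rather than something confirmed in this thread: the repository root path, the function names, and above all whether Bitbucket Server 5.7 tolerates external `git gc` runs or manages this setting itself, which should be checked with Atlassian support first.

```python
import subprocess
from pathlib import Path

# Assumed location of the bare repositories; adjust to the real data directory.
REPO_ROOT = Path("/var/atlassian/application-data/bitbucket/shared/data/repositories")


def find_bare_repos(root):
    """Return directories directly under root that look like bare Git repositories."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_dir() and (p / "HEAD").is_file() and (p / "objects").is_dir()
    )


def gc_commands(repo, disable_auto=True):
    """Build the git invocations for one repository, without running them."""
    commands = []
    if disable_auto:
        # gc.auto = 0 turns off automatic collection in stock Git.
        commands.append(["git", "-C", str(repo), "config", "gc.auto", "0"])
    commands.append(["git", "-C", str(repo), "gc"])
    return commands


def run_nightly_gc(root=REPO_ROOT, disable_auto=True):
    """Run the commands for every repository found; return those that failed."""
    failed = []
    for repo in find_bare_repos(root):
        for cmd in gc_commands(repo, disable_auto):
            if subprocess.run(cmd).returncode != 0:
                failed.append(repo)
                break
    return failed


if __name__ == "__main__":
    for repo in run_nightly_gc():
        print(f"git gc failed for {repo}")
```

A crontab entry such as `0 2 * * * /usr/bin/python3 /opt/scripts/nightly_git_gc.py` would start it at 02:00 every night; the script path is hypothetical, and the job should run as the same OS user that owns the repository files.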