Bitbucket garbage collection process slows down server performance

yael May 1, 2018

Hi all,

We use Bitbucket Server v5.7.
When pushing large Git repositories (20 GB), we sometimes see a git gc process start automatically during the push, and it *extremely* slows down the server.

I understand git gc can be run manually, and is usually triggered automatically when the repository reaches a certain size or number of objects.

Is there any way we can control this automatic process to only start outside of office hours so it doesn't impact performance?
Or maybe we can disable the automatic run altogether and schedule git gc ourselves to run at night?

Thanks,
Yael
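For readers following along, here is a minimal sketch of the threshold check that stock Git applies when deciding whether `git gc --auto` should do any work. It is a simplified model, not Git's actual code: real Git estimates the loose-object count by sampling one object directory rather than counting exactly, and Bitbucket Server may manage these settings itself. The function name `would_auto_gc` is hypothetical; the defaults mirror Git's documented `gc.auto` (6700) and `gc.autoPackLimit` (50).

```python
def would_auto_gc(loose_objects: int, packs: int,
                  gc_auto: int = 6700, gc_auto_pack_limit: int = 50) -> bool:
    """Simplified model of the `git gc --auto` trigger (an illustration only).

    gc_auto            -- models the gc.auto setting; 0 disables automatic gc
    gc_auto_pack_limit -- models gc.autoPackLimit; 0 disables the pack check
    """
    # Setting gc.auto to 0 switches automatic gc off entirely.
    if gc_auto <= 0:
        return False
    # Too many loose (unpacked) objects in the repository.
    too_many_loose = loose_objects > gc_auto
    # Too many pack files, which a large push can add to.
    too_many_packs = gc_auto_pack_limit > 0 and packs > gc_auto_pack_limit
    return too_many_loose or too_many_packs
```

Under this model, a repository holding 7000 loose objects would qualify for automatic gc, while the same repository with `gc.auto` set to 0 would not.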

1 answer

0 votes
Alexey Matveev
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
May 1, 2018

You cannot control garbage collection. GC happens when the garbage collector algorithm decides it should happen. You should investigate why garbage collection happens, and make sure that it is really GC that slows down your system. Usually that is not the case. Try to figure out which classes are hot, look for deadlocks, and make sure that your Bitbucket instance has enough JVM memory.

yael May 1, 2018

Hi Alexey,

Thanks for the quick response.

Is the gc process related to the Bitbucket JVM parameters?
If so, I see that the JVM is currently configured with Xms512m and Xmx1g.

What are the recommendations for JVM tuning?

Our server's resources are:
8 CPUs
32 GB RAM

Thanks,
Yael

Alexey Matveev
May 1, 2018

It is difficult to say without analyzing your instance. Try increasing the JVM heap to 8 GB.

yael May 1, 2018

Hi Alexey,

I saw this statement in the Atlassian documentation:

The memory consumption of Git is not managed by the memory settings in _start-webapp.sh or _start-webapp.bat. The Git processes are executed outside of the Java virtual machine, and as a result the JVM memory settings do not apply to Git.

Is JVM tuning really relevant in this case, since git gc is also a Git process?

Yael

Alexey Matveev
May 1, 2018

The documentation is correct. Bitbucket uses Git as an external program, so Git runs as a separate process from Bitbucket. As far as I understand, you have problems with Bitbucket, not with Git. As I said earlier, it is difficult to say what your problem is without analysis. But you can try increasing the JVM heap.

Christian Glockner
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
May 1, 2018

In the vast majority of cases, increasing the JVM heap size does more harm than good, since the memory allocated to the JVM is not available to git processes, starving them of memory. It is therefore not recommended to increase the heap beyond the default settings unless a specific situation requires it. 8 GB is definitely excessive.

Git garbage collection is completely separate from JVM garbage collection and increasing the heap size will not help in this case at all.

Cheers,

Christian

Premier Support Engineer

Atlassian

yael May 3, 2018

Hi Christian,

Thanks for your response.
Our gc problem actually stems from another problem, related to how commits are compressed:
We push many commits that are very similar in content but are created as separate commit objects (using "git commit-tree", for example).
This somehow "tricks" Git into treating them as totally different commits, which causes the repository to double in size on each push. This is why the git gc process matters so much in our case.

Questions:

1. Can we somehow tune the Bitbucket/git compression algorithm to better recognize this behavior?

2. Can we schedule a nightly "git gc" for all or some Git repositories? Maybe using cron? Is there a suggested solution for this requirement?

Thanks,
Yael
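As an illustration of question 2, here is a minimal sketch of a script that cron could invoke to run `git gc` across repositories, but only outside office hours. Everything about it is an assumption rather than an Atlassian recommendation: the function names, the 22:00 to 06:00 window, and the heuristic for spotting bare repositories are all hypothetical. The thread does not establish whether running `git gc` directly against repositories managed by Bitbucket is supported, so confirm that with Atlassian before trying anything like this on a production server.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def in_maintenance_window(now: datetime, start_hour: int = 22, end_hour: int = 6) -> bool:
    """True if now.hour is in [start_hour, end_hour); the window may wrap midnight."""
    if start_hour <= end_hour:
        return start_hour <= now.hour < end_hour
    return now.hour >= start_hour or now.hour < end_hour


def find_bare_repos(root: Path) -> list[Path]:
    """Heuristic (an assumption): a direct child directory with HEAD and objects/."""
    return sorted(
        p for p in root.iterdir()
        if p.is_dir() and (p / "HEAD").is_file() and (p / "objects").is_dir()
    )


def gc_command(repo: Path) -> list[str]:
    """Build the git invocation for one repository; nothing is executed here."""
    return ["git", "-C", str(repo), "gc", "--quiet"]


def run_nightly_gc(root: Path, now: datetime, dry_run: bool = True) -> list[list[str]]:
    """Return the gc commands for every repo under root, running them unless dry_run."""
    if not in_maintenance_window(now):
        return []  # Office hours: do nothing.
    commands = [gc_command(repo) for repo in find_bare_repos(root)]
    if not dry_run:
        for cmd in commands:
            # One repository at a time, to limit the load on the server.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_nightly_gc(Path("/path/to/repositories"), datetime.now())` with the default `dry_run=True` only returns the commands it would run, which makes it easy to review the list before letting a scheduled job execute them. The path shown is a placeholder.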
