
Bitbucket Backup Client Performance

Vitalii Djiguir June 1, 2016

Hello, 

We have been running Stash, and then Bitbucket Server, for some time now. It runs on a dedicated Windows VM with an external MSSQL DB in Azure. The Backup Client is scheduled to run every night at 2 am, which might be overkill. The total backup size is just over 2 GB, and the backup takes just over 2 hours. We are about to migrate a lot more code onto this server, and I am a bit concerned about the amount of time a backup would then take.

So, I started to look at some statistics from Atlassian and found this article: https://confluence.atlassian.com/bitbucketserver/using-the-bitbucket-server-backup-client-776640064.html#UsingtheBitbucketServerBackupClient-Howitworks

The last paragraph of the "How it works" section states:

"As an indication of the unavailability time that can be expected when using the Bitbucket Server Backup Client, in our testing and internal use we have seen downtimes for Bitbucket Server of 7–8 minutes with repositories totalling 6 GB in size. For comparison, using Bitbucket Server DIY Backup for the same repositories typically results in a downtime of less than a minute."

According to the backup logs, it takes about 6-7 minutes to reach the following line, which is then logged for the next hour:

2016-05-28 02:07:08,024 INFO  [main] c.a.b.i.b.c.l.DefaultApplicationHome Verifying Bitbucket home

and then the following line is logged for another hour:

2016-05-28 03:09:34,011 INFO  [main] c.a.b.i.b.c.l.DefaultApplicationHome Backing up Bitbucket home
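To put a number on each phase, the gap between two log lines can be computed from their timestamps. This is a minimal sketch that assumes every line starts with a timestamp in the format shown in the two lines above; the helper names `parse_timestamp` and `elapsed_between` are made up for illustration and are not part of the Backup Client.

```python
from datetime import datetime

# Timestamp layout assumed from the log lines quoted above,
# e.g. "2016-05-28 02:07:08,024" (23 characters).
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TIMESTAMP_LENGTH = 23


def parse_timestamp(line):
    """Parse the leading timestamp of a backup log line."""
    return datetime.strptime(line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)


def elapsed_between(first_line, second_line):
    """Return the time that passed between two log lines as a timedelta."""
    return parse_timestamp(second_line) - parse_timestamp(first_line)
```

Applied to the "Verifying Bitbucket home" and "Backing up Bitbucket home" lines above, `elapsed_between` returns a timedelta of 1:02:25.987000, so the verification phase alone accounts for roughly an hour.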

Can someone explain what we are doing wrong and why it takes that long to back up our server?

3 answers

0 votes
Vitalii Djiguir June 7, 2016

Hi Everyone,

Thanks for the suggestions. I will look into DIY Backup at some point in the future when I have more time.

For now, I will give Johannes's suggestion a try to see if it helps. I have also been monitoring the server and noticed that it constantly runs at 90% memory utilization. It is a VM in Azure with 4 GB of RAM, so that might also be an issue.

Regards,

Vitalii

0 votes
Johannes Kilian
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
June 1, 2016

Hi Vitalii,

I had the same question, but in our case the backup ran for more than 8(!) hours. Analysing the reasons showed that we had a lot of repositories with a lot of commits accumulated over a long period of time. The hosted repositories had not been packed; instead, each commit was still stored as individual objects within the Git object store. As a result, we ended up with several hundred thousand (more than 1,000,000) files in the folder where Bitbucket stores its repositories. Having so many files pushed the OS (Windows) to the limits of its performance, which made the backup take a very long time. (Even a plain 1:1 copy of the repository folders runs for several hours with this number of files...)
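To check whether a repository is in this state, one option is to count its loose object files and pack files. The sketch below relies on the standard Git layout, where loose objects live in two-character hexadecimal subdirectories of `objects` and packs live in `objects/pack`; the function name `count_object_files` is made up for illustration.

```python
import os

HEX_DIGITS = "0123456789abcdef"


def count_object_files(repo_dir):
    """Return (loose_objects, pack_files) for one bare repository directory."""
    objects_dir = os.path.join(repo_dir, "objects")
    loose = 0
    packs = 0
    for entry in os.scandir(objects_dir):
        if not entry.is_dir():
            continue
        if entry.name == "pack":
            # Each pack is a .pack file plus index files; count the packs only.
            packs = sum(1 for f in os.listdir(entry.path) if f.endswith(".pack"))
        elif len(entry.name) == 2 and all(c in HEX_DIGITS for c in entry.name):
            # Loose objects are stored as objects/<2 hex chars>/<remaining hash>.
            loose += len(os.listdir(entry.path))
    return loose, packs
```

A repository with a very high loose-object count and few or no packs is a likely candidate for the cleanup described in this answer. Git's own `git count-objects -v` reports similar figures.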

Finally I came up with a script which does a manual garbage collection and a repack on all repositories on the server:

  • Stop the Bitbucket service
  • Iterate over all hosted repositories in $BITBUCKET_HOME\shared\data\repositories
  • In each repository, run:
    • git gc
    • git repack -adf --window=200 --depth=200
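The steps above could be scripted roughly as follows. This is only a sketch, not the script referred to in this answer: it assumes each repository is a bare Git directory sitting directly under the repositories folder, that `git` is on the PATH, and that the Bitbucket service has already been stopped. The function names and the `run` parameter are made up for illustration.

```python
import subprocess
from pathlib import Path

# The two maintenance commands from the list above, in the same order.
MAINTENANCE_COMMANDS = [
    ["git", "gc"],
    ["git", "repack", "-adf", "--window=200", "--depth=200"],
]


def find_repositories(repos_root):
    """Return subdirectories that look like bare Git repositories."""
    root = Path(repos_root)
    return sorted(
        d for d in root.iterdir()
        if d.is_dir() and (d / "HEAD").is_file() and (d / "objects").is_dir()
    )


def repack_all(repos_root, run=subprocess.run):
    """Run the maintenance commands in every repository.

    Returns a list of (repository name, command) pairs that failed.
    The `run` parameter exists only so the function can be tried
    without touching real repositories.
    """
    failures = []
    for repo in find_repositories(repos_root):
        for cmd in MAINTENANCE_COMMANDS:
            result = run(cmd, cwd=str(repo))
            if result.returncode != 0:
                failures.append((repo.name, cmd))
                break  # skip the repack if gc already failed for this repo
    return failures
```

A call would look like `repack_all(r"D:\bitbucket-home\shared\data\repositories")`, where the path is a placeholder for your own $BITBUCKET_HOME. An aggressive repack with `--window=200 --depth=200` can itself take a long time on large repositories, so it is worth trying on a single repository first.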

After this cleanup, our file count dropped drastically (to fewer than 100,000) and the backup time went back to a duration we could accept (approx. 1 hour), since we run the backup overnight.

Hint: Be sure to have an up-to-date backup before running the procedure above.

Johannes

Johannes Kilian
Rising Star
June 7, 2016

Cleaning up the repositories might even speed up the DIY Backup ... ;-)

0 votes
ThiagoBomfim
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
June 1, 2016

Hi Vitalii,

If your Backup Client run already takes 2 hours and you are going to migrate more code into Bitbucket/Stash, we recommend that you look into Bitbucket Server DIY Backup.

As we describe on that page:

The key to reducing downtime is the use of optimal, vendor-specific database and file system backup tools. Such tools are generally able to take snapshots (though sometimes in a vendor-specific format) in much less time than the generic, vendor-neutral format used by the Bitbucket Server Backup Client.

The scripts were developed to help customers who were struggling with the Backup Client. I know you're running on Windows; the current scripts should work for you if you install Cygwin.

This approach (with small modifications) can be used for running DIY Backups on:

  • Linux and Unix
  • OSX
  • Windows with cygwin (note that cygwin Git is not supported by Bitbucket Server).

Let us know if that helps.

Thiago

Mike Friedrich
Rising Star
June 2, 2016

I can confirm that the DIY Backup is great.

Bitbucket is unavailable for only a minute or two during our backups (about 90 GB in repositories).
