Bitbucket Backup Client Performance

Hello, 

We have been running Stash and then Bitbucket Server for some time now. It runs on a dedicated Windows VM with an external MSSQL DB in Azure. The Backup Client is scheduled to run every night at 2 am, which might be overkill. The total backup size is just over 2GB and it takes just over 2 hours. We are about to migrate a lot more code onto this server and I am a bit concerned about the amount of time a backup would take.

So, I started to look at some statistics from Atlassian and found this article: https://confluence.atlassian.com/bitbucketserver/using-the-bitbucket-server-backup-client-776640064.html#UsingtheBitbucketServerBackupClient-Howitworks

The last paragraph of the "How it works" section states:

"As an indication of the unavailability time that can be expected when using the Bitbucket Server Backup Client, in our testing and internal use we have seen downtimes for Bitbucket Server of 7–8 minutes with repositories totalling 6 GB in size. For comparison, using Bitbucket Server DIY Backup for the same repositories typically results in a downtime of less than a minute."

According to the backup logs, it takes about 6-7 minutes to reach the following line, which is then logged for the next hour:

2016-05-28 02:07:08,024 INFO  [main] c.a.b.i.b.c.l.DefaultApplicationHome Verifying Bitbucket home

and then the following line is logged for another hour:

2016-05-28 03:09:34,011 INFO  [main] c.a.b.i.b.c.l.DefaultApplicationHome Backing up Bitbucket home
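
For reference, the gap between the two log lines above can be worked out from their timestamps. A minimal Python sketch, assuming the timestamps always follow the `YYYY-MM-DD HH:MM:SS,mmm` layout seen in those lines (the helper name `elapsed` is made up for illustration):

```python
from datetime import datetime

# Layout of the leading timestamp, e.g. "2016-05-28 02:07:08,024".
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed(start_line, end_line):
    """Return the time between two log lines, read from their leading timestamps."""
    # The timestamp occupies the first 23 characters of each line.
    start = datetime.strptime(start_line[:23], LOG_TIME_FORMAT)
    end = datetime.strptime(end_line[:23], LOG_TIME_FORMAT)
    return end - start


verifying = "2016-05-28 02:07:08,024 INFO  [main] c.a.b.i.b.c.l.DefaultApplicationHome Verifying Bitbucket home"
backing_up = "2016-05-28 03:09:34,011 INFO  [main] c.a.b.i.b.c.l.DefaultApplicationHome Backing up Bitbucket home"

print(elapsed(verifying, backing_up))  # 1:02:25.987000
```

So the "Verifying Bitbucket home" step alone accounts for a little over an hour of the run.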

Can someone explain what we are doing wrong and why it takes that long to back up our server?

 

3 answers

Hi Vitalii,

If your backup client is already taking 2 hours and you are going to migrate more code into Bitbucket/Stash, we would recommend that you look into Bitbucket Server DIY Backup.

As we describe on the page:

The key to reducing downtime is the use of optimal, vendor-specific database and file system backup tools. Such tools are generally able to take snapshots (though sometimes in a vendor-specific format) in much less time than the generic, vendor-neutral format used by the Bitbucket Server Backup Client.

So the scripts were developed to help customers who were struggling with the Backup Client. I know you're running on Windows, and the current scripts should work for you if you install Cygwin.

This approach (with small modifications) can be used for running DIY Backups on:

  • Linux and Unix
  • OSX
  • Windows with cygwin (note that cygwin Git is not supported by Bitbucket Server).

Let us know if that helps.

Thiago

I can confirm, the DIY Backup is great.

In our case Bitbucket is unavailable for only a minute or two during the backup (about 90GB in repositories).

Hi Vitalii,

I had the same question, but in our case the backup ran for more than 8(!) hours. Analysing the cause showed that we had a lot of repositories with a lot of commits accumulated over a long period of time. The hosted repositories had not been packed; instead, each commit was still stored as a single object in the Git object store. As a result we ended up with several 100,000 (>1,000,000) files in the folder where Bitbucket stores its repositories. Having so many files pushed the OS (Windows) to the limits of its performance, causing a very long backup duration. (Even a plain 1:1 copy of the repository folders runs for several hours with this many files.)
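
To check whether a server is in the same situation, one option is to count the files under each repository folder. A rough Python sketch, not something taken from Bitbucket itself: the function name `count_files_per_repository` is hypothetical, and the argument is assumed to be the folder where Bitbucket stores its repositories.

```python
import os


def count_files_per_repository(repos_root):
    """Return {subfolder name: number of files below it} for each folder in repos_root."""
    counts = {}
    for entry in os.scandir(repos_root):
        if entry.is_dir():
            # os.walk visits every nested directory; add up the files in each one.
            counts[entry.name] = sum(len(files) for _, _, files in os.walk(entry.path))
    return counts


def total_files(repos_root):
    """Return the overall number of files across all repository folders."""
    return sum(count_files_per_repository(repos_root).values())
```

Repositories with very high counts are the most likely candidates for a repack.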

Finally I came up with a script which does a manual garbage collection and a repack on all repositories on the server:

  • Switch off Bitbucket-Service
  • Iterate over all hosted repositories in $BITBUCKET_HOME\shared\data\repositories
  • On each repository do a:
    • git gc
    • git repack -adf --window=200 --depth=200
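
The steps above could be scripted roughly as follows. This is only a sketch under some assumptions, not the script described in this answer: it is written in Python, it assumes `git` is on the PATH, it assumes every repository is a direct subfolder of the repositories folder and can be recognised by a `HEAD` file plus an `objects` folder, and the names `find_repositories` and `repack_all` are made up for illustration. Stopping the Bitbucket service is left as a manual step.

```python
import subprocess
from pathlib import Path

# The two commands from the steps above, run in this order on each repository.
MAINTENANCE_COMMANDS = [
    ["git", "gc"],
    ["git", "repack", "-adf", "--window=200", "--depth=200"],
]


def find_repositories(repos_root):
    """Return the direct subfolders of repos_root that look like bare Git repositories."""
    root = Path(repos_root)
    return sorted(
        p for p in root.iterdir()
        if p.is_dir() and (p / "HEAD").is_file() and (p / "objects").is_dir()
    )


def repack_all(repos_root, dry_run=True):
    """Run the maintenance commands in every repository.

    With dry_run=True (the default) the commands are only printed.
    Returns the list of (repository path, command) pairs that were handled.
    """
    handled = []
    for repo in find_repositories(repos_root):
        for command in MAINTENANCE_COMMANDS:
            handled.append((repo, command))
            if dry_run:
                print(f"[dry run] {repo}: {' '.join(command)}")
            else:
                # check=True stops at the first failing command instead of continuing.
                subprocess.run(command, cwd=repo, check=True)
    return handled
```

Calling `repack_all(path)` first prints what would be run; `repack_all(path, dry_run=False)` actually runs the commands, and should only be done while Bitbucket is stopped.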

After this cleanup our file count dropped drastically (<100,000) and the backup time went back to a duration we could accept (approx. 1 h), as we run the backup overnight.

Hint: Be sure to have an up-to-date backup before running the procedure above.

 

Johannes

 

Cleaning up the repositories might even speed up the DIY Backup ;)

Hi Everyone,

Thanks for the suggestions. I will look into DIY Backup at some point in the future when I have more time.

For now, I will try Johannes's suggestion to see if that helps. I have also been monitoring the server and noticed that it is constantly running at 90% memory utilization. It is a VM in Azure with 4GB of RAM, so that might also be an issue.

Regards,

Vitalii
