Repository size on cleared repo

After overwriting my repo, its size is far larger than it should be. Bitbucket reports the repo as 1.2 GB, but if I clone it, I get a 380 MB folder.

Here is what I did earlier to reduce the repo size (back then it was about 800 MB):

  • locally deleted my .git folder
  • added some entries to my .gitignore
  • $ git init
  • $ git add --all ./
  • $ git commit -m "initial commit"
  • $ git remote add origin [...]
  • $ git push --force -u origin master

Can someone explain why this leads to a repo size of 1.2 GB in Bitbucket, while cloning results in a 380 MB folder?

1 answer

Git relies on garbage collection to manage its object storage automatically; see e.g. Garbage Collecting and especially git-gc:

Runs a number of housekeeping tasks within the current repository, such as compressing file revisions (to reduce disk space and increase performance) and removing unreachable objects which may have been created from prior invocations of git add.


Users are encouraged to run this task on a regular basis within each repository to maintain good disk space utilization and good operating performance. [emphasis mine]

Some git commands may automatically run git gc; see the --auto flag below for details. [...] 
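To make the quoted description concrete, here is a minimal local sketch of the situation in the question. It assumes a bare repository on disk can stand in for the hosted remote; the directory names, the 1 MiB file, and the `repo_kib` helper are made up for illustration, and Bitbucket's actual server-side behavior is not reproduced. It force-pushes a fresh history over an old one and compares the object storage of the "remote" with that of a fresh clone:

```shell
#!/bin/sh
set -eu

# Scratch directory; every path below is a hypothetical stand-in.
work=$(mktemp -d)
cd "$work"

# Report object storage in KiB (loose "size" plus packed "size-pack").
repo_kib() {
    git -C "$1" count-objects -v |
        awk '/^size(-pack)?:/ { total += $2 } END { print total }'
}

# A bare repository plays the role of the hosted remote.
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/master

# Old history: one commit with a 1 MiB file of random (incompressible) data.
git init -q old
git -C old config user.name "Demo"
git -C old config user.email "demo@example.com"
head -c 1048576 /dev/urandom > old/big.bin
git -C old add big.bin
git -C old commit -q -m "old history"
git -C old push -q "$work/remote.git" HEAD:refs/heads/master

# New history: a fresh repository without the big file, force-pushed over the old one.
git init -q new
git -C new config user.name "Demo"
git -C new config user.email "demo@example.com"
echo "small" > new/readme.txt
git -C new add readme.txt
git -C new commit -q -m "initial commit"
git -C new push -q --force "$work/remote.git" HEAD:refs/heads/master

# The old objects are now unreachable, but they are still stored in the remote.
remote_before=$(repo_kib remote.git)

# --no-local forces the regular transfer path, which sends only reachable objects.
git clone -q --no-local "$work/remote.git" clone
clone_kib=$(repo_kib clone)

# Pruning without a grace period drops the unreachable objects from the remote.
git -C remote.git gc -q --prune=now
remote_after=$(repo_kib remote.git)

echo "remote before gc: ${remote_before} KiB"
echo "fresh clone:      ${clone_kib} KiB"
echo "remote after gc:  ${remote_after} KiB"
```

In this sketch the remote stays large until it is garbage collected, while the clone is small from the start, because a clone only receives objects that are reachable from the pushed branch.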

As further outlined in the Configuration section, there are various related settings, and many of them default to several days or even months. All of these can be changed, of course; in a hosted service like Bitbucket, running and configuring appropriate garbage collection (the emphasized activity) is the responsibility of the provider. The size difference you are observing has been the subject of many issues in the Bitbucket tracker already. Eric van Zijst explains how and why the Bitbucket schedule for garbage collection is somewhat dynamic in his recent answer to Purge dangling objects in Git:

We originally ran a gc on every push. However that can be expensive for larger repos and so the frequency was tied to several repo properties, including push frequency and size.

Furthermore, we don't run git gc --prune=now and so a gc run does not necessarily remove (all) unreferenced objects. This is necessary as some of Bitbucket's internal processes rely on unreferenced objects having a grace period. [...]
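The grace period mentioned in the quote can be observed locally. The following is a rough sketch, assuming a throwaway repository and a 64 KiB random blob as the unreferenced object; the `object_state` helper and all names are hypothetical. It pins `gc.pruneExpire` to Git's documented default of `2.weeks.ago` so that a global override does not change the outcome, then compares a plain `git gc` with `git gc --prune=now`:

```shell
#!/bin/sh
set -eu

# Throwaway repository; the path and names are hypothetical.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Demo"
git config user.email "demo@example.com"
# Pin the documented default grace period explicitly.
git config gc.pruneExpire 2.weeks.ago

echo "tracked" > readme.txt
git add readme.txt
git commit -q -m "initial commit"

# Write a blob that no commit, tree, or index entry refers to.
oid=$(head -c 65536 /dev/urandom | git hash-object -w --stdin)

# Print "present" if the object still exists in the object store, else "pruned".
object_state() {
    if git cat-file -e "$1" 2>/dev/null; then
        echo present
    else
        echo pruned
    fi
}

# A plain gc keeps unreachable objects that are younger than the grace period.
git gc -q
after_default_gc=$(object_state "$oid")

# --prune=now removes unreachable objects regardless of their age.
git gc -q --prune=now
after_prune_now=$(object_state "$oid")

echo "after git gc:             $after_default_gc"
echo "after git gc --prune=now: $after_prune_now"
```

This only illustrates stock Git behavior on a local repository; how and when a hosted service schedules its own collection is up to the provider, as the quote describes.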

See Eric's answer for details on how to contact support, if this turns out to be an issue in your scenario.
