Git repo size in Bitbucket has changed after conversion from SVN

Hi,

We recently migrated from CVSNT to Git. Since no direct conversion path was available, we converted CVS to SVN and then SVN to Git. The conversion completed without any errors. Afterwards we wanted to perform some cleanup, as part of which we deleted some of the files in most of the repos. As part of the migration we also split the repo, which was ~4GB, into individual repos of ~300MB each so that they would fit on Bitbucket Cloud, which has a 2GB restriction on repo size. However, after the migration every Git repo occupies around 700MB on disk, irrespective of its previous size. Even the repos which were around 200MB previously now occupy 700MB on disk. Together the repos occupy around 16GB on disk, which is unacceptable. Is this behaviour intended, or did we miss something?

This has been the case for all repos.
Kindly clarify.

 

Thanks in advance
 

1 answer

Jeff Thomas Atlassian Team Jul 12, 2016

The repository may need to be repacked. Sometimes the conversion process can create some inefficiently packed repositories. Try the following commands on a copy of the repository (as a test in case there are any issues):

git gc --auto
 
git fsck
 
git repack -adf --window=200 --depth=20
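
To check whether repacking actually changes anything, it can help to compare the pack size before and after on a throwaway copy. The following is a minimal sketch, not an official procedure: the helper names `pack_size` and `repack_copy` and both paths are hypothetical, and it assumes a non-bare or bare repository that `git -C` can open.

```shell
#!/bin/sh
# Hypothetical helper: print the "size-pack" line for the repository at $1.
pack_size() {
    git -C "$1" count-objects -vH | grep '^size-pack:'
}

# Hypothetical helper: copy the repository at $1 to $2, repack the copy with
# the options suggested above, and print the pack size before and after.
# The original repository is never modified.
repack_copy() {
    src=$1
    copy=$2
    cp -R "$src" "$copy" || return 1
    echo "before: $(pack_size "$copy")"
    git -C "$copy" repack -a -d -f --window=200 --depth=20 -q || return 1
    echo "after:  $(pack_size "$copy")"
}
```

It would be called as, for example, `repack_copy /path/to/repo /tmp/repo-copy`. If the two numbers are close, the size is probably dominated by history rather than by inefficient packing.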

Hi @Jeff Thomas,

Thanks for the workaround, but we presume that is not the case. We dug a little deeper and found that we had deleted some directories from the repos, but their objects are still there, which is the cause of the large size.
Can you suggest a way to remove the files and folders along with their respective objects?
So far we have only used "git rm" or "rm -rf".

Thanks in advance

--Sravan

CC: @Shankar Asam 

Jeff Thomas Atlassian Team Jul 13, 2016

Deleted files are still preserved in the repository history, so you would likely need to rewrite history to clean up data that is truly no longer needed. The BFG Repo Cleaner may be able to help.

If you haven't yet, I would still suggest trying the commands I mentioned to repack the repository to see if it does make a difference.
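
To see which deleted files are still taking up space in the history, one option is to list the largest blobs reachable from any ref. This is a sketch under assumptions: `largest_blobs` is a hypothetical helper name, and the sizes shown are uncompressed blob sizes, not their size inside the pack.

```shell
#!/bin/sh
# Hypothetical helper: list the N largest blobs (default 10) reachable from
# any ref of the repository at $1, as "<size in bytes> <path>".
# Files removed with "git rm" still appear here while old commits reference them.
largest_blobs() {
    git -C "$1" rev-list --objects --all |
        git -C "$1" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk '$1 == "blob" { sub(/^blob /, ""); print }' |
        sort -n -r |
        head -n "${2:-10}"
}
```

For example, `largest_blobs /path/to/repo 20` prints the twenty largest blobs, which gives a list of candidates for a history rewrite.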

We tried running the commands, but they only reduce the size of the existing files in the repo; the ".git" folder remains the same at ~700MB. The ".git/objects/pack" folder contains the .pack file, which holds all the objects and could not be manipulated.

We have looked at this tool, but note the part that says "The BFG will update your commits and all branches and tags so they are clean, but it doesn't physically delete the unwanted stuff". Physical deletion is exactly what we want: delete the files completely, including their respective objects.
Keep only the files that are needed, along with their history, branches and tags.

Is this feasible?

Thanks

--Sravan 
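
The physical deletion asked about here is usually a second step after the history rewrite: once no ref or reflog entry points at the old commits, `git gc --prune=now` can drop their objects. Below is a minimal sketch, assuming a clean working tree and a disposable copy of the repository. It uses `git filter-branch`, which ships with Git, in place of the BFG mentioned above (Git's own documentation recommends `git filter-repo` for larger jobs). The helper name `purge_path` is hypothetical, and the path argument must not contain single quotes.

```shell
#!/bin/sh
# Hypothetical helper: remove path $2 from every commit on every branch and
# tag of the repository at $1, then physically delete the orphaned objects.
# This rewrites history; run it on a copy first.
purge_path() (
    cd "$1" || exit 1
    # Rewrite all refs, dropping the path from each commit's index.
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f \
        --index-filter "git rm -r --cached --ignore-unmatch -q -- '$2'" \
        --prune-empty --tag-name-filter cat -- --all || exit 1
    # filter-branch keeps backups under refs/original/; delete them so the
    # old commits become unreachable.
    git for-each-ref --format='%(refname)' refs/original/ |
        while read -r ref; do
            git update-ref -d "$ref"
        done
    # Expire reflog entries that still point at the old history, then prune.
    git reflog expire --expire=now --all &&
        git gc --prune=now --aggressive --quiet
)
```

It would be called as, for example, `purge_path /tmp/repo-copy old/directory`. The rewritten branches and tags then have to be force-pushed, and other clones need to be re-cloned, because every affected commit gets a new ID.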
