We recently migrated from CVSNT to Git. Since no direct conversion path was available, we converted CVS to SVN and then SVN to Git. The conversion completed without any errors. Afterwards we wanted to perform some cleanup, as part of which we deleted some of the files in most of the repos. As part of the migration we also split the original ~4GB repo into individual repos of ~300MB each so that they fit on Bitbucket Cloud, which has a 2GB limit on repo size. However, after the migration every Git repo occupies around 700MB on disk, regardless of its previous size. Even the repos that were previously around 200MB now occupy 700MB. Taken together the repos occupy around 16GB on disk, which is unacceptable. Is this behaviour intended, or did we miss something?
This has been the case for all repos
Thanks in advance
The repository may need to be repacked. Sometimes the conversion process can create some inefficiently packed repositories. Try the following commands on a copy of the repository (as a test in case there are any issues):
git gc --auto
git fsck
git repack -adf --window=200 --depth=20
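To see whether the repack actually changes anything, you could compare the output of git count-objects before and after. The following is only a sketch: it builds a small throwaway repository (the scratch path, file names and identity are hypothetical) so that it is safe to run anywhere. On a real repository you would run just the last three commands inside your test copy.

```shell
#!/bin/sh
set -e

# Throwaway repository so the sketch does not touch real data (hypothetical path).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# A few small commits so there is something to pack.
for i in 1 2 3; do
  printf 'content of file %s\n' "$i" > "file$i.txt"
  git add "file$i.txt"
  git commit -q -m "add file$i.txt"
done

# Size report before repacking: 'count'/'size' are loose objects,
# 'in-pack'/'size-pack' are packed objects.
git count-objects -vH

# The repack suggested above.
git repack -adf --window=200 --depth=20

# Size report after repacking; compare 'size-pack' with the earlier value.
git count-objects -vH
```

If 'size-pack' barely moves after the repack, the space is most likely taken by objects that are still reachable from history rather than by inefficient packing.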
Hi @Jeff Thomas,
Thanks for the workaround, but we presume that is not the case. We dug a little deeper and found that we had deleted some directories from the repos, but their objects are still there, which is the cause of the large size.
Can you suggest a way to remove the files and folders along with their respective objects?
So far we have only used "git rm" or "rm -rf".
Thanks in advance
CC: @Shankar Asam
Deleted files are still preserved in the repository history, so you would likely need to rewrite history to clean up data that is truly no longer needed. The BFG Repo Cleaner may be able to help.
If you haven't yet, I would still suggest trying the commands I mentioned to repack the repository to see if it does make a difference.
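To illustrate why a deleted directory still takes up space, and how a history rewrite followed by pruning physically removes it, here is a minimal sketch on a throwaway repository. It uses Git's built-in filter-branch instead of the BFG purely so that it needs nothing but git; the BFG or a similar tool is usually much faster on large repositories. The directory name legacy, the file names and the scratch path are all hypothetical. Try anything like this on a copy first, since it rewrites every commit.

```shell
#!/bin/sh
set -e

# Throwaway repository (hypothetical path and contents).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir legacy
head -c 1048576 /dev/urandom > legacy/big.bin   # 1 MiB of incompressible data
echo "keep me" > keep.txt
git add .
git commit -q -m "initial import"
blob=$(git rev-parse HEAD:legacy/big.bin)       # object id of the big file

# Deleting the directory only adds a new commit; the old blob stays in history.
git rm -r -q legacy
git commit -q -m "remove legacy directory"
git cat-file -e "$blob" && echo "blob still present after git rm"

# Rewrite history so that no commit references the directory any more.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f \
  --index-filter 'git rm -r -q --cached --ignore-unmatch legacy' \
  --prune-empty -- --all

# Drop the backup refs left by filter-branch, expire the reflog,
# then prune: this is the step that physically deletes the objects.
git for-each-ref --format='%(refname)' refs/original/ |
while read -r ref; do git update-ref -d "$ref"; done
git reflog expire --expire=now --all
git gc -q --prune=now --aggressive

if git cat-file -e "$blob" 2>/dev/null; then
  status=present
else
  status=removed
fi
echo "blob is $status after rewrite and prune"
```

The reflog expiry and gc with --prune=now are the same follow-up the BFG documentation asks you to run after it has cleaned the commits. Note that a rewritten repository has new commit ids, so it has to be pushed to a fresh remote or force-pushed, and existing clones need to be re-cloned.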
We have tried running the commands, but they only reduce the size of the files currently in the repo; the ".git" folder stays at the same ~700MB. The ".git/objects/pack" folder contains the .pack file, which holds all the objects and which we could not manipulate.
We have taken a look at this tool, but note the part that says "The BFG will update your commits and all branches and tags so they are clean, but it doesn't physically delete the unwanted stuff ". Physical deletion is exactly what we want: delete the files completely, including their respective objects.
Keep only the files that are needed along with their history, branches and tags.
Is this feasible?