So we had some .FBX files being tracked that were taking up a lot of space in our repo. I used the terminal in Sourcetree to track all .FBX files in Git LFS, to free up some space for now, with this command:
git lfs track "*.fbx"
After using the command and pushing, I can see a lot of files added to LFS when viewing the repo online in Bitbucket, as expected, but the repo size is unchanged, as if the .FBX files are now both in LFS and still being tracked as before in the main repo. How do I get them to be stored only in LFS and not also take up repo space?
Is it standard practice to use BFG after adding files to LFS to reduce the repo size?
Hello @Ben Cole ,
When you move files to LFS, they start being tracked there from the moment of that conversion. However, all historical files and versions of the currently existing matching files remain in the repository itself — this is why it remains large. So yes, if your goal is to reduce the size you need to rewrite the repository history with a tool like BFG and "extract" all matching files to LFS storage.
It is a non-trivial operation. Internally, Git stores objects that represent commits, each of which references its parent(s) and a file tree, which in turn references blobs that represent files. After a commit is created, it is normally never changed again. However, in this particular case you need to convert some of those blobs in all historical commits (not just at the tip of your main branch), which means you need to rewrite the file trees and the commit objects that reference those blobs. And you need to do that for all commits that reference any of the matching files. This is why the process is called a history rewrite, and this is roughly what tools like BFG do.
From Git's perspective, an LFS file is a tiny plain-text file, also called a pointer file. It is the LFS client (a "plugin" that you install locally to work with LFS files) that knows what to do with pointer files: it downloads and uploads the matching actual files from and to LFS storage. Bitbucket, in turn, manages that storage and access to the files in it. But for Git, the pointer file is the only thing it has to care about.
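For illustration, here is a small Python sketch of what such a pointer file contains, based on the Git LFS pointer format (a `version` line, then `oid` and `size`). The function names and the 5 MB of zero bytes standing in for an .FBX file are made up for the example; in practice the LFS client generates and reads these files for you.

```python
import hashlib
from typing import Dict

LFS_SPEC_VERSION = "https://git-lfs.github.com/spec/v1"


def make_pointer(content: bytes) -> str:
    """Build the pointer text that Git stores in place of the real file."""
    oid = hashlib.sha256(content).hexdigest()
    return f"version {LFS_SPEC_VERSION}\noid sha256:{oid}\nsize {len(content)}\n"


def parse_pointer(text: str) -> Dict[str, str]:
    """Split each 'key value' line of a pointer file into a dictionary."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# Hypothetical stand-in for a large binary asset.
fake_fbx = b"\x00" * 5_000_000
pointer = make_pointer(fake_fbx)

print(pointer)
print(len(fake_fbx), "bytes of content ->", len(pointer.encode()), "bytes of pointer")
```

The pointer stays roughly the same size no matter how large the real file is, which is why converting old versions to pointers shrinks the Git repository itself.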
Does this make sense? Let me know if you have any questions.
Cheers,
Daniil
Hi Daniil,
Ah, that makes sense, it was too good to be true to expect the repo size to automatically reduce after adding those files to LFS after they'd already been added previously to the repo. I'll let you know if I have any other questions, but I think I understand now what I need to do. Thanks for the information and explaining all of that.