Permanently remove stripped commits and objects they point to


Consider the following sequence of actions:

1. Make a bad commit and push it to Bitbucket (e.g. one containing a password or anything else that should not be there)

2. Realize the mistake and strip the commit (e.g. by rewriting history locally and force-pushing)

At this point Bitbucket says that the commit has been stripped. Is there a way to have the corresponding data actually erased?

Currently it is not erased at all. It is enough to note the id (hash) of the stripped commit, open any commit in the repository, and replace the current commit's id in the URL with the stripped one to see all the relevant files.
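To make the substitution concrete, here is a minimal Python sketch. The URL layout, the host name and both hashes are made up for illustration; the only assumption is that the commit page URL contains a full 40-character hexadecimal commit id.

```python
import re

# Assumption: the commit id appears in the URL as a full 40-character hex SHA-1.
SHA1_RE = re.compile(r"\b[0-9a-f]{40}\b")


def swap_commit_id(url: str, stripped_id: str) -> str:
    """Return `url` with its first commit id replaced by `stripped_id`."""
    if not SHA1_RE.fullmatch(stripped_id):
        raise ValueError("expected a full 40-character hexadecimal commit id")
    new_url, count = SHA1_RE.subn(stripped_id, url, count=1)
    if count == 0:
        raise ValueError("no commit id found in URL")
    return new_url


# Hypothetical values: a visible commit and a stripped one.
current_url = "https://bitbucket.example/team/repo/commits/" + "a" * 40
stripped = "b" * 40
print(swap_commit_id(current_url, stripped))
```

Nothing here is specific to Bitbucket: the point is only that knowing the stripped id is all it takes to build the link.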

I think this happens because Bitbucket keeps a reflog, just as Git does on a local machine, and does not erase anything that the reflog still references. Is there a way to force-delete specific commit objects and the objects they point to?

2 answers

1 accepted

Since I keep being asked by email to act so that this question can be marked as resolved, I'm answering my own question with what I have concluded from the comments received so far and from the documentation I have found:

- If you mistakenly commit something that contains sensitive data (e.g. passwords), it is impossible to promptly remove it from Bitbucket and make it inaccessible to third parties, unless you delete the whole repository/project.

- Keeping the repository and just stripping the commit (e.g. removing it locally and then force-pushing) does not help, because the commit, together with the corresponding files, remains available via the Bitbucket web interface. All it takes is crafting a URL with the right commit id. The commit stays available until the corresponding reflog entry expires, which takes no less than 60 days, since one cannot force-expire reflogs on Bitbucket.

- In fact, stripping the commit containing private data may be counterproductive. To access the commit one needs to know its id, and when someone strips a commit, the Bitbucket interface loudly advertises 'commit id so-and-so has been stripped'. What the interface is doing is a bit like saying 'Look! This commit has been erased. Maybe it contained some interesting private data. Here is the id you need to go and look at it. Even though the commit was deleted immediately, you have 2 months to search through it'.

So the conclusion is:

- If you use Bitbucket and have mistakenly committed something that should have remained private, delete the whole project as soon as possible and create a new one.

Thanks for the thorough explanation of how Bitbucket handles stripped commits. Have you tried asking support to run a gc as Marcus suggested? When he says "support", I believe he means

From my understanding, gc does not erase anything that is still in the reflogs. At least, this is how it works in standard Git. In any case, I'll contact support.

I expect this is just the way Git works. Although, if someone had time to write down the commit string (not sure why anyone would do that unless they expected the commit to be erased), they probably also had time to write down any accidentally-committed confidential information.

You might try this article from GitHub, which applies to Git in general, not any specific Git host:

This is correct. Git automatically cleans up old commits once they are unreferenced. It may take a little while before this happens though. The commits aren't referenced by anything else on Bitbucket, but as long as they exist in the objects or pack files, we will display them. If you need us to force a gc earlier, you can always request it in support.

I do not think it is so simple. If you realize that you made a mistake and immediately strip the commit, the commit may stay up for just 1 second. The issue is that the notification that the commit has been stripped (which includes the commit id) stays up for days, providing an obvious opportunity to go and look at what has just been deleted.

It is like saying out loud 'Look! Mr. A has cancelled this commit. Maybe he had a password in it. You only need a string to go and check, and here it is.'

As for the remove-sensitive-data article, its advice seems wise, but unfortunately it is totally inapplicable on Bitbucket right now, since (unless I am missing something) you do not have the opportunity to edit or expire the reflogs, which is the key to the actual cleanup.
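For comparison, on a local clone, where the reflogs are under your control, that cleanup comes down to expiring the reflog entries and then pruning unreachable objects. A minimal Python sketch wrapping the two standard Git commands follows. The function names and the dry-run default are my own choices for illustration, and this only affects the local clone, not the copy hosted on Bitbucket.

```python
import subprocess
from typing import List


def local_cleanup_commands() -> List[List[str]]:
    """Standard Git commands that drop all reflog entries, then prune
    objects that are no longer reachable from any ref."""
    return [
        ["git", "reflog", "expire", "--expire=now", "--all"],
        ["git", "gc", "--prune=now"],
    ]


def purge_unreachable(repo_dir: str, dry_run: bool = True) -> List[str]:
    """Return the cleanup commands as strings; run them in `repo_dir`
    only when dry_run is False (the operation is irreversible)."""
    planned = []
    for cmd in local_cleanup_commands():
        planned.append(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, cwd=repo_dir, check=True)
    return planned


# Dry run: only shows what would be executed.
for line in purge_unreachable(".", dry_run=True):
    print(line)
```

The order matters: as long as a reflog entry still points at the stripped commit, gc treats it as reachable and keeps it.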

IMHO it would be desirable to introduce the following changes:

1) When a commit is stripped, avoid advertising the commit id in the user or repo notifications.

2) Similarly, when a commit is created, avoid advertising the commit id in the user or repo notifications (otherwise it is too easy to spot commits that were made but no longer appear in the history).

3) Give the repo administrator an advanced repo view that lists the removed commits and allows erasing the corresponding reflog entries, so that the repo can actually be cleaned.
