
Permanently remove stripped commits and objects they point to

Sergio Callegari April 1, 2014

Hi,

consider the following sequence of actions:

1. Make a wrong commit and push it to bitbucket (e.g. containing a password or anything that should not go there)

2. Realize it and strip the commit (e.g. rewriting history locally and forcing a push)
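For concreteness, step 2 could look like the following on the local machine. This is only a minimal sketch: the helper names `strip_last_commit_commands` and `run_all`, the remote name `origin`, and the branch name `master` are assumptions made for illustration, and it assumes the wrong commit is the most recent one on the branch.

```python
import subprocess


def strip_last_commit_commands(remote="origin", branch="master"):
    """Return the git commands (as argv lists) that drop the most recent
    commit locally and then overwrite the remote branch.

    The remote and branch names are placeholders, not taken from the post.
    """
    return [
        # Move the branch back by one commit. Note: --hard also discards
        # any uncommitted changes in the working tree.
        ["git", "reset", "--hard", "HEAD~1"],
        # Overwrite the remote branch with the rewritten local history.
        ["git", "push", "--force", remote, branch],
    ]


def run_all(commands, cwd="."):
    """Run each command in the repository at `cwd`, stopping on the first failure."""
    for argv in commands:
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_all(strip_last_commit_commands())` inside the clone would perform the strip. It only removes the commit from the branch history; it does not delete the commit object on the server, which is exactly the problem described next.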

At this point bitbucket says that the commit is stripped. Is there a way to have the corresponding data actually erased?

Currently it is not erased at all. It is enough to note the id of the stripped commit, open any commit in the repo, and edit the URL so that the stripped commit's id replaces the current commit's id: all the relevant files are then visible.

I think this is because bitbucket keeps a reflog, just as git does on a local machine, and does not erase anything the reflog still references. Is there a way to force the deletion of specific commit objects and the objects they point to?

2 answers

1 accepted

0 votes
Answer accepted
Sergio Callegari April 17, 2014

Since I keep getting emails asking me to act so that this question can be marked as resolved, I'm answering my own question with what I have concluded from the comments received so far and from the documentation I have found:

- If you mistakenly commit something that contains sensitive data (e.g. passwords), it is impossible to promptly remove it from bitbucket and make it inaccessible to third parties, unless you erase the whole repository/project.

- Trying to keep the repository and just stripping the commit (e.g. removing it locally and then forcing a push) does not help, because the commit (together with the corresponding files) remains available via the bitbucket web interface. To reach it, it is sufficient to craft a URL with the proper commit id. In fact, it remains available until the corresponding log expires, which takes no less than 60 days, since one cannot force-expire logs on bitbucket.

- In fact, stripping the commit containing private data may be counterproductive. This is because accessing the commit requires knowing the commit id, and when someone strips a commit, the bitbucket interface loudly advertises 'commit id so and so has been stripped'. What the interface is doing is a bit like saying 'Look! This commit has been erased. Maybe it contained some interesting private data. Here is the id you need to go and look inside it. Even though the commit was deleted immediately, you have 2 months to search through it'.

So the conclusion is:

- If you use bitbucket and have mistakenly committed something that should have remained private, erase the whole project as soon as possible and create a new one.

Seth
Rising Star
April 17, 2014

Thanks for the thorough explanation of how Bitbucket handles stripped commits. Have you tried asking support to run a gc as Marcus suggested? When he says "support", I believe he means support.atlassian.com.

Sergio Callegari April 17, 2014

As far as I understand, gc does not erase anything that is still in the reflogs. At least, this is how it works in standard git. In any case, I'll contact support.
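To illustrate the point about standard git: on a local clone, where one does have access to the reflogs, the usual cleanup is to expire the reflog entries first and only then run gc. The sketch below shows that order. The helper names `local_purge_commands` and `run_all` are hypothetical, and none of this can be run against a repository hosted on bitbucket, which is the limitation under discussion.

```python
import subprocess


def local_purge_commands():
    """Return the git commands (as argv lists) that make unreferenced
    commits collectable in a LOCAL repository and then prune them."""
    return [
        # Drop all reflog entries immediately, so that stripped commits
        # are no longer kept alive by the reflogs.
        ["git", "reflog", "expire", "--expire=now", "--all"],
        # Garbage-collect and prune unreachable objects regardless of age.
        ["git", "gc", "--prune=now"],
    ]


def run_all(commands, cwd="."):
    """Run each command in the repository at `cwd`, stopping on the first failure."""
    for argv in commands:
        subprocess.run(argv, cwd=cwd, check=True)
```

Running only the second command leaves commits that the reflogs still mention in place, which is why a gc on its own would not be expected to remove a freshly stripped commit.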

1 vote
Seth
Rising Star
April 1, 2014

I expect this is just the way Git works. Although, if someone had time to write down the commit string (not sure why anyone would do that unless they expected the commit to be erased), they probably also had time to write down any accidentally-committed confidential information.

You might try this article from Github, which applies to Git in general, not any specific Git host: https://help.github.com/articles/remove-sensitive-data

aMarcus
Atlassian Team
April 1, 2014

This is correct. Git automatically cleans up old commits once they are unreferenced. It may take a little while before this happens though. The commits aren't referenced by anything else on Bitbucket, but as long as they exist in the objects or pack files, we will display them. If you need us to force a gc earlier, you can always request it in support.

Sergio Callegari April 7, 2014

I do not think it is so simple. If you realize that you made a mistake and immediately strip the commit, the commit may stay up for just 1 second. The issue is that the notification that the commit has been stripped (which includes the commit id) stays up for days, providing an obvious opportunity to go and look at what has just been deleted.

It is like announcing loudly 'Look! Mr. A has cancelled this commit. Maybe he had a password in it. You only need a string to go and check, and here it is.'

As for the remove-sensitive-data article, its advice seems wise, but unfortunately it is totally inapplicable to bitbucket right now, since (unless I am missing something) you have no way to edit or expire the reflogs, which is the key to the actual cleanup.

IMHO it would be desirable to introduce the following changes:

1) When a commit is stripped, avoid advertising the commit id in the user or repo notifications.

2) Similarly, when a commit is created, avoid advertising the commit id in the user or repo notifications (otherwise, it is too easy to look up commits that were made but no longer appear in the history).

3) Provide the repo administrator with an advanced repo view that lists the removed commits and offers the option to erase the corresponding reflog entries, so that the repo can actually be cleaned.
