What to do with your Mercurial repos when Bitbucket sunsets support

526 comments

Peter Koppstein August 23, 2019

@ruslanberozov I was able to create a Mercurial repository at https://helixteamhub.cloud. There is doubtless an easier way, but I first created a project, and then a repository (which turned out to be Git), deleted it, and then worked my way through the process beginning at:

There are no repositories yet. You can go ahead and create a new repository.

Peter Koppstein August 23, 2019

Since this thread seems to have become a meeting ground for like-minded fans of Mercurial, I'm hoping someone will be able to explain how to use Dropbox reliably as a host for an Hg repo. Clearly it has its risks if not done carefully, but it seems to me that if there is only ONE person (userid) with write access to the Dropbox hg repository, things should be as safe as one can reasonably expect. Not as convenient, of course, but at least safe. Is there some flaw in my thinking? My question is specific to Dropbox, though I'd also be interested to know which services of that ilk are known to be unsuitable for hosting a one-writer Hg repository. Thank you!

torohill August 23, 2019

@Ruslan Berezyuk yes, 100% sure they have Mercurial support. It's not easy to find but it is mentioned at https://www.perforce.com/products/helix4git in the "What Is Helix TeamHub" section. I was also able to sign up for an account and create a Mercurial repo, as @Peter Koppstein mentioned.

Peter Koppstein August 23, 2019

@torohill, @RuslanBerezyuk - It seems that at present the wiki associated with an Hg repo on HelixTeamHub has to be a Git repo!  If anyone would care to join me in appealing for Hg-backed wikis, there's a way to do so at http://support.helixteamhub.com/

aplsimple August 23, 2019

Even if the Atlassian team has religious haters of Mercurial, they could treat their clients more decently, namely: let the old Mercurial repos live on and only stop ("sunset", sh..) creating new ones. But for now, Atlassian behaves like a cuckoo with her eggs: too much trouble with them, so out they go. In that case, we eggs will be going to GitHub.

How can 1% (as they say) of old and loyal clients cause any harm to such a big (as they say) corporation?

jakobkappel August 23, 2019

After much consideration, McDonald's has decided to remove vegetarian offerings from our menu.


According to some survey, only 3% of all fast-food customers are vegetarians. We know precisely how large a share of our customer base is vegetarian, but we’re not telling you. But it’s steadily declining, believe us.


This wasn’t an easy decision, and vegetarian meals will always have a special place in McDonald’s history.


Are you a vegetarian? Here’s a short guide on how to start eating meat :)

kalthad August 23, 2019

@Tor

 

I have tried out sourcehut. Quite nice open software, slower than Bitbucket. So far no issue system as far as I can see, so you would need a Mercurial extension such as Artemis.

Helix: I just followed the link you posted.

Strange, they want me to download some sort of server software.

Did anybody create a repository with Helix?

kalthad August 23, 2019

@Peter Koppstein 

I followed the link you posted:

They want me to connect with my userid and password.

 

Where do I get these? I looked on their web page but could not find anything.

Any pointers are welcome.

Peter Koppstein August 23, 2019

@kalthad  - At the top RHS there's a link:

"New and returning users may sign in"

Or you might be able to create an account using the form at:
https://www.perforce.com/products/helix-core/free-version-control

torohill August 24, 2019

@kalthad I signed up to Helix TeamHub at https://info.perforce.com/try-perforce-helix-teamhub-free, which is linked from the 'Try It Now' button on the pricing page I originally posted. I haven't tried the signup link from @Peter Koppstein. Once you complete the signup process you can sign in at https://helixteamhub.cloud.

For sourcehut you can find issue tracking at https://todo.sr.ht/

Arjailer August 24, 2019

Moved my (34) repos to GitHub yesterday - it has a nice automatic Mercurial importer.

I didn't have any wikis and only a handful of issues (which I just manually re-created), so not too much hassle for me.

Been thinking of moving to Git for a while, so ... thanks I guess for giving me the push?

devzendo August 24, 2019

@Peter Koppstein re: using Dropbox to store a Mercurial repo. You're better off leaving it to hg, rather than Dropbox, to keep the state of the repo consistent. Many have not had problems, some have; see https://stackoverflow.com/questions/1964347/mercurial-and-i-guess-git-with-dropbox-any-drawbacks for details.

kalthad August 24, 2019

@torohill 

 

Thanks, I did this and created my first repo.

 

Helix looks a bit like a Bitbucket clone (or the other way around).

The question of course remains:

how long will they support Mercurial?

Peter Koppstein August 24, 2019

@devzendo - maybe I didn’t explain the single-writer idea properly. It is intended to overcome all those concerns you cited. Also, I’m not suggesting that all Dropbox-like services are sufficiently similar for present purposes. I am specifically focused on Dropbox, but would be happy to know if there’s anything better than Dropbox for providing a place in the cloud for an .hg folder. By the way, even if Dropbox somehow managed to screw things up once in a blue moon, it wouldn’t matter too much in the sense that recovery would almost certainly be possible, if inconvenient.

Alex Bream August 24, 2019

@Peter Koppstein - any and all cloud-services for "just files" is The Bad Idea (tm) for replication of repositories, especially when (or if) they sync the local and cloud copies in the background.

Also, their file recovery doesn't recover repositories reliably: you always have a big chance of ending up with a broken repo and lost history in all replicas of it. But if you like Russian roulette, try it.

philipstarkey August 24, 2019

For those interested, I've added Phabricator to the list of mercurial hosting options on the wiki. It seems like you can either pay for their cloud option, or get a free copy to self-host on your own server (e.g. AWS lightsail, Azure, Google cloud, Digital Ocean, etc). Seems like a pretty comprehensive piece of software (based on their website), so I'm going to test it out on a lightsail instance.

While it obviously costs money for the hosting, the advantage of self-hosting is that you aren't at the mercy of another company!

aplsimple August 24, 2019

hg/git remove atlassian.com

hg/git commit -m "Farewell to Atlassian."

cd github.com

hg/git init

hg/git commit -m "Hi GitHub!"

Tara McGrew August 24, 2019

@Peter Koppstein 

> By the way, even if Dropbox somehow managed to screw things up once in a blue moon, it wouldn’t matter too much in the sense that recovery would almost certainly be possible, if inconvenient.

The problem with using cloud services for a repo that you'll write to from more than one machine is that they don't have the concept of "changesets". They track each file individually, so it's like going back from Mercurial or Subversion to CVS. Your .hg folder could have thousands of files in it; if there's a conflict, you don't want to have to check the timestamps on each file to try and make a consistent snapshot of your repo.

If you're only pushing from one machine, and it's read-only on every other machine, that's probably safer.

Peter Koppstein August 24, 2019

@Alex Bream wrote:

> any and all cloud-services for "just files" is The Bad Idea (tm) for replication of repositories

 

I'm afraid that merely restating a generic position is not very helpful with regard to the specific single-writer scenario I've tried to describe. As @Tara McGrew wrote:

> If you're only pushing from one machine, and it's read-only on every other machine, that's probably safer.

 

Here's a sketch of a proof that under certain weak assumptions, the single-writer scenario I have in mind is quite safe (e.g. compared to the probability of Atlassian junking your repository while you're away on sabbatical).

Let:

* M be a master repository (not known directly to Dropbox);

* D be an .hg-only clone of M (as in `hg clone -U`), D being known to Dropbox and writeable only by the owner of M;

* C1, ..., Cn be clones of D that are not known to Dropbox. 

* The only updates of the Cs are via pulls from D.

Furthermore let us assume that M is pushed to D only occasionally, so that if for example C1 happens to initiate a pull from D while D is being updated, resulting in the corruption of C1, a later attempt to reclone D will succeed, assuming Dropbox is doing its thing properly.  So let us also assume that the person pulling from D to one of the Cs runs `hg verify` after every pull from D, and knows what to do if there is a problem.

So normally (i.e., while both M and D are as they should be), everything is hunky-dory.

Now let's consider the abnormal situations:

1) M becomes corrupted, e.g. because of a disk failure.  Then normally, D will be untainted and M can be reconstructed via D.

2) D becomes corrupted due to some bug in Dropbox.  Then almost certainly D can be recovered from M.

So the worst-case scenario, and an extremely rare one, would be if both M and D die at roughly the same time. This is probably even less likely than Atlassian intentionally discarding your Mercurial repository. And even then, chances are that between all the C clones and the backups of M and D (thank you, Dropbox), not much if anything other than an administrator's time will be lost. Notice that Dropbox's policies regarding the availability of prior versions of files are an important part of the argument.

 

So my main question really boils down to this: if C1 pulls from D while D is being updated, is the pull guaranteed to abort before C1 becomes corrupted? If not, will there be some kind of error message, or is it necessary to run `hg verify` after pulling to detect the problem?
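
For readers who want to try the single-writer scheme above, here is a minimal shell sketch of the three roles. The function names and example paths are made up for illustration, D is assumed to sit inside a Dropbox-synced folder, and nothing here answers the open question of whether a pull from a half-synced D always fails cleanly.

```shell
# Hypothetical helpers for the M -> D -> C1..Cn scheme.
# Names and paths are illustrative, not an established workflow.

# Owner, once: create D as an .hg-only clone of M (no working directory).
create_d() {            # usage: create_d /path/to/M ~/Dropbox/D
    hg clone -U "$1" "$2"
}

# Owner, occasionally: publish new changesets from M to D.
# Note: hg push exits with status 1 when there is nothing to push.
publish_to_d() {        # usage: publish_to_d /path/to/M ~/Dropbox/D
    hg -R "$1" push "$2"
}

# Reader: pull from the locally synced copy of D, then check integrity.
# Returns non-zero if either step fails, so the caller knows to re-clone.
pull_and_verify() {     # usage: pull_and_verify /path/to/C1 ~/Dropbox/D
    hg -R "$1" pull "$2" && hg -R "$1" verify
}
```

For example, the owner might run `create_d ~/work/M ~/Dropbox/D` once and `publish_to_d ~/work/M ~/Dropbox/D` after each batch of commits, while a reader runs `pull_and_verify ~/work/C1 ~/Dropbox/D` and re-clones from D if it returns non-zero.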

philipstarkey August 24, 2019

@Peter Koppstein the success of your scheme depends on the order that dropbox synchronises the files in the .hg folder. Since you can't guarantee that order, there will absolutely be times where a race condition occurs that screws with repositories Cn during a pull from D.

You might want to read how Mercurial handles file locking here: https://www.mercurial-scm.org/wiki/LockingDesign#The_repository_lock. That page also details how things can go wrong if hg actions are performed during an rsync (which is going to be similar to how Dropbox works).

I'll also add that I've seen cloud storage just fail to synchronise one of many files, or synchronise a file in the wrong direction. I don't know how Dropbox works (can you ensure that changes to D do not propagate to M and that M always overwrites D?), but you should consider the risk that D might accidentally corrupt M, with that change syncing back to D, and then to Cn, ruining all of the repos.

Alex Bream August 25, 2019

@Peter Koppstein - your workflow seems rather complex (unnecessarily so), and even in this scenario you aren't protected from "nightmares", as @philipstarkey already wrote.

More steps and more checks, with no additional benefit for you compared to plain `hg serve`… Well, it can work if you add "sync D with Dropbox only on demand, not in the background" (if that's possible). But compare it with plain `hg serve` (locally) or `hg push SSH-URL` + `hg pull SSH-URL` (cheap web hosting with all the advantages).
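
As a rough illustration of the SSH alternative, assuming a host that has Mercurial installed and gives you an SSH login. The URL in the comments is a placeholder and the function names are hypothetical.

```shell
# Hypothetical wrappers for hosting a repo on any box reachable over SSH.
# Example URL: ssh://user@example.com/repos/project
# (the path is relative to the remote user's home directory).

# Once: create an empty repository on the remote host.
init_remote() {         # usage: init_remote SSH-URL
    hg init "$1"
}

# Writer: send new changesets from a local repo to the remote one.
push_to_remote() {      # usage: push_to_remote /path/to/local SSH-URL
    hg -R "$1" push "$2"
}

# Anyone with SSH access: fetch new changesets into a local clone.
pull_from_remote() {    # usage: pull_from_remote /path/to/clone SSH-URL
    hg -R "$1" pull "$2"
}
```

Because every transfer goes through hg itself, writes to the remote repository happen under Mercurial's own repository lock rather than as a file-by-file sync.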

kalthad August 25, 2019

@Alex Bream 

Hi so are you suggesting: 

cd $HOME/Dropbox/repo

hg serve

and then

hg push 

hg pull

and then kill the server when you are finished?

And should your collaborator do the same on his/her machine?

Although this seems easy enough, it could be more comfortable.

Alex Bream August 25, 2019

@kalthad - no, I suggested eliminating Dropbox from the chain and serving only M directly (in pull-only mode).
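
A sketch of what serving M directly could look like, with Dropbox out of the picture. The host name, port, and function names are placeholders. The built-in server rejects pushes unless `web.allow-push` is configured, so with an untouched configuration it is effectively pull-only.

```shell
# Hypothetical wrappers; host, port and paths are placeholders.

# Owner: serve M over HTTP until interrupted (Ctrl-C).
# Pushes are refused as long as web.allow-push is not set.
serve_m() {             # usage: serve_m /path/to/M 8000
    hg -R "$1" serve --port "$2"
}

# Reader: pull straight from the owner's machine into a local clone.
pull_from_m() {         # usage: pull_from_m /path/to/C1 http://owner-host:8000/
    hg -R "$1" pull "$2"
}
```

Note that the built-in server does no authentication, so anyone who can reach the port can pull.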

Peter Koppstein August 25, 2019

@pstarkey - Thanks for taking the time to understand the proposal, and thank you especially for the link (https://www.mercurial-scm.org/wiki/LockingDesign#The_repository_lock), which I take to be extremely promising with regard to my proposal (e.g. it says "readers do not need to acquire a lock").

My experience with Dropbox (including with a variant of the proposed solution) makes me quite hopeful that its simplicity for our users (which boils down to: "do not use `rsync`") will work quite well for us (e.g. better than moving to any git-based solution).

@Alex Bream - Your proposed solution (using `hg serve`) is not a solution in our case, as should be evident from my emphasis on using Dropbox. Please also note that (given the considerations in the above paragraph), the proposed solution is very simple from the perspective of C1 ... Cn: they need only be advised to use nothing but `hg pull`.

Our predicament is that we want a free hosted solution, with private repositories and control over who can pull from them, with very few limitations. E.g. Sourcehut (hg.sr.ht) would be promising but they do not support https access to private repos. Perforce is promising, but as pointed out elsewhere in this thread, the likelihood of Perforce pulling the plug on hg seems quite high (see esp. https://www.perforce.com/blog/vcs/git-vs-mercurial-how-are-they-different). 

(Another drawback of Perforce from our point of view is that the wikis are git-based.)

philipstarkey August 25, 2019

@Peter Koppstein I'm not sure I follow your logic. If dropbox synchronises the files in D in an order different to that detailed on the mercurial wiki (for instance if the changelog index is synchronised first), then any of the Cn repositories that read the dropbox copy during that synchronisation may get a corrupted copy of the repository.

Maybe it's not quite as bad as I originally thought, but it's possible that the Cn repository would need to be recloned from dropbox in the case of corruption. That should be easy to test though....just selectively revert various things (e.g. revlogs) in the .hg folder to an older version synced to dropbox, while maintaining the changelog index intact, and then see how badly the Cn repositories behave (aka simulate a Cn sync to a partially synced D). Maybe hg verify is sufficient....but I would suggest checking every variant you can think of.
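
The test suggested above could be scripted along these lines. This is a rough sketch under several assumptions: the function name is hypothetical, the repository uses the usual store layout with `.hg/store/00manifest.i`, and an older manifest revlog taken from a fresh `hg clone --pull -r` is only an approximation of what a half-finished Dropbox sync would leave behind. It covers one variant (stale manifest, current changelog); other variants would need their own file swaps.

```shell
# Hypothetical helper: build a copy of D whose manifest revlog is stale
# while the changelog is current, then see what `hg verify` reports.
# SCRATCH should be a fresh directory; D itself is never modified.
simulate_partial_sync() {   # usage: simulate_partial_sync /path/to/D OLD_REV SCRATCH
    sps_d=$1 sps_rev=$2 sps_tmp=$3
    mkdir -p "$sps_tmp" || return 2

    # "new": a plain file copy of D as it looks after a complete sync.
    cp -R "$sps_d" "$sps_tmp/new" || return 2

    # "old": D as of OLD_REV; --pull makes hg write fresh store files.
    hg clone -U --pull -r "$sps_rev" "$sps_d" "$sps_tmp/old" || return 2

    # Pretend the manifest revlog has not been synced yet.
    cp "$sps_tmp/old/.hg/store/00manifest.i" \
       "$sps_tmp/new/.hg/store/00manifest.i" || return 2

    # The exit status of this function is that of hg verify.
    hg -R "$sps_tmp/new" verify
}
```

A pull from `SCRATCH/new` into a throwaway clone is the other half of the experiment, since the question is how a Cn repository behaves when it reads such a copy.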

And of course, your scenario seems to assume that Cn repositories will never push, only pull (so for other people reading this, this is unlikely to be a solution for you as most people don't have a large number of repositories that only pull from a master repo and never push local changes back to master).
