How to get only new changesets in pre-receive hook on new-branch push?

When pushing a new branch to a Stash repository, a pre-receive hook gets a RefChange with fromHash=0. Is this the intended behaviour?

Putting that in a ChangesetsBetweenRequest.Builder.exclude(fromHash)... leads to the full repo history being returned.

How can I get only the changesets that are not already in the target repo?

3 answers

1 accepted

3 votes
Answer accepted

Hi Marian,

This is, perhaps somewhat counter-intuitively, expected behaviour of Git (and thus Stash). Git makes no distinction about what are new commits vs commits that _just happened_ to exist in the repository when you pushed.

Perhaps an ASCII diagram to help:

o---o---o---o---o A
 \
  --o---o---o---o B
         \
          --o---o C

The question is - what commits are 'on' branch C? Is it 2 commits, or 4? The answer (as far as Git is concerned) depends entirely on what you're comparing it to. If you compare it to B then it has 2 commits, but if you compare it to A then it has 4. The fact that you might have pushed C after B is irrelevant, and as such when you create a new branch the post-receive hook will pass fromHash=0, and not the last 'seen' commit. What happens if branch B had been pushed and then deleted? If Git hadn't gc'd those commits yet then it would need to walk the entire graph to work out what was now 'visible'.

I won't lie though - this is the cause of many frustrations for Stash, because it would be very handy to know things like - which branch a commit was first 'seen' on, and who 'pushed' it. (And wait until you start having to worry about forks).

For now, as you've discovered, you can use ChangesetIndex.isMemberOf(), which is something we use to index which commits we've 'seen' before. But that is definitely a Stash specific tool/concept, and not related to how Git actually works. We may enhance how some of this works over time to meet our own requirements, but it's not going to be a trivial task.

I hope that helps?

Charles

Thanks for the explanation, it definitely helps to understand the problem.

I do have one problem with ChangesetIndex.isMemberOf() though. I consistently have the case that on any new: fork, clone, create a new branch, push new branch to Stash (no new commits), for all commits (which are existing commits) isMemberOf returns false. The javadoc does state: "true if the provided changeset is (indexed as) a member of repository"

So it seems the repo is not yet indexed. This begs the question: how do I trigger an indexing of a Repository?

Thanks,

Marian

Hi Marian,

Indexing happens on another thread just after a push. Note that it can't happen during a pre-receive because it needs the refs to exist in the repository. Basically you can't really trigger it manually.

As I hinted at in my previous message, forks get even harder. We basically don't fully index forks because of the potential explosion in DB relationships. For large repositories each commit would have a relationship with every fork. We are still investigating how to efficiently store/retrieve this information.

My suggestion for forks would be to maybe try calling getChangeset() instead of isMemberOf(), which will return null if it was the first time it was pushed.

Let me know if that works.

Charles

Hi Charles,

indeed, getChangeset() works on the forked repo where isMemberOf() does not.

Can you confirm that getChangeset bypasses the ChangesetIndex when getting the data?

Many thanks,

Marian

Hi Marian,

It doesn't bypass the index at all - it's a direct connection to the database entry. While this is going to sound confusing we do index commits in a fork, but we don't index the 'membership'. So a null value from that method indicates that Stash has never seen it before in _any_ repository, where-as a non-null ChangesetIndex indicates that Stash has indexed it _somewhere_, but IndexedChangeset.getRepositories() may possibly return nothing if the commit is only on a fork. When you merge that commit to the main repository it will then be indexed correctly and isMemberOf() would start to work for that parent repository only.

Does that make sense?

You're really getting in the nitty-gritty of how Stash works at this point, and my hope is that over time we can introduce a more accurate (but not too costly) indexing of forks.

Cheers,

Charles

Hey Charles,

thanks for the insight. I think I get the idea - which is probably wrong :)

For our workflow it looks like getChangeset is a good substitute, as we are not fork-heavy and "not in any forked repo or master" is identical to "not in this repo", but this won't be the case for everyone out there.

Looking forward to seeing those improvements soon and thanks again for your support,

Marian

Is there an Atlassian feature request for a simpler API for this? Its a common function for most pre-receive hooks. As well as the way mentioned here, the YACC plugin basically excludes refs/heads/*, which seems to work although feels like a different hack (see getBranches in https://github.com/sford/yet-another-commit-checker/blob/master/src/main/java/com/isroot/stash/plugin/ChangesetsServiceImpl.java#L84)

BTW, the atlassian-supplied (but not supported) filesize plugin (https://bitbucket.org/atlassianlabs/stash-filesize-hook-plugin) also gets this wrong - with a small max-file-size configured, using the demo project "git clone https://stash/stash/projects/PROJECT_1/repos/rep_1/browse;git checkout -b test_branch; git push" fails because the plugin checks all the commits in the branch history....

Marian / Charles - Can either one of you tell me how to use "ChangesetIndex.getChangeset" with some code. [ ChangesetIndex is an interface, not a class & I dont see any class in API that implements ChangesetIndex ]

Found a workaround using ChangesetIndex.isMemberOf()

Still, it would be nice to find out if fromHash=0 is the intended behaviour.

Suggest an answer

Log in or Sign up to answer
Community showcase
Published Thursday in Summit

Find a your Summit 2019 Buddy!

Can you believe there's less than one month until Summit 2019? It's time to start planning your agendas, not to mention packing ... In the meantime, introduce yourselves on this thread so that you ...

217 views 15 6
Read article

Atlassian User Groups

Connect with like-minded Atlassian users at free events near you!

Find a group

Connect with like-minded Atlassian users at free events near you!

Find my local user group

Unfortunately there are no AUG chapters near you at the moment.

Start an AUG

You're one step closer to meeting fellow Atlassian users at your local meet up. Learn more about AUGs

Groups near you