How to get only new changesets in pre-receive hook on new-branch push?

Marian Grigoras October 16, 2013

When pushing a new branch to a Stash repository, a pre-receive hook gets a RefChange with fromHash=0. Is this the intended behaviour?

Putting that in a ChangesetsBetweenRequest.Builder.exclude(fromHash)... leads to the full repo history being returned.

How can I get only the changesets that are not already in the target repo?

3 answers

3 votes
Answer accepted
cofarrell
Rising Star
October 22, 2013

Hi Marian,

This is, perhaps somewhat counter-intuitively, expected behaviour of Git (and thus Stash). Git makes no distinction about what are new commits vs commits that _just happened_ to exist in the repository when you pushed.

Perhaps an ASCII diagram to help:

o---o---o---o---o A
 \
  --o---o---o---o B
         \
          --o---o C

The question is - what commits are 'on' branch C? Is it 2 commits, or 4? The answer (as far as Git is concerned) depends entirely on what you're comparing it to. If you compare it to B then it has 2 commits, but if you compare it to A then it has 4. The fact that you might have pushed C after B is irrelevant, and as such when you create a new branch the post-receive hook will pass fromHash=0, and not the last 'seen' commit. What happens if branch B had been pushed and then deleted? If Git hadn't gc'd those commits yet then it would need to walk the entire graph to work out what was now 'visible'.
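
To make the "depends on what you compare it to" point concrete, here is a minimal Java sketch that models the diagram above as a parent map and counts the commits reachable from C but not from another branch. The commit labels (a1..a5, b1..b4, c1, c2) and the CommitGraph class are made up for illustration; this is not Stash or Git code, only the same include/exclude set arithmetic.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CommitGraph {
    // child -> parents; the commit names are hypothetical labels for the diagram
    private final Map<String, List<String>> parents = new HashMap<>();

    void add(String commit, String... parentIds) {
        parents.put(commit, Arrays.asList(parentIds));
    }

    /** All commits reachable from the given tip, including the tip itself. */
    Set<String> reachable(String tip) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(tip);
        while (!todo.isEmpty()) {
            String c = todo.pop();
            if (seen.add(c)) {
                for (String p : parents.getOrDefault(c, Collections.<String>emptyList())) {
                    todo.push(p);
                }
            }
        }
        return seen;
    }

    /** Commits reachable from include but not from exclude (null excludes nothing). */
    Set<String> between(String include, String exclude) {
        Set<String> result = reachable(include);
        if (exclude != null) {
            result.removeAll(reachable(exclude));
        }
        return result;
    }

    /** Builds the A/B/C graph from the ASCII diagram. */
    static CommitGraph diagram() {
        CommitGraph g = new CommitGraph();
        g.add("a1"); g.add("a2", "a1"); g.add("a3", "a2"); g.add("a4", "a3"); g.add("a5", "a4");
        g.add("b1", "a1"); g.add("b2", "b1"); g.add("b3", "b2"); g.add("b4", "b3");
        g.add("c1", "b2"); g.add("c2", "c1");
        return g;
    }

    public static void main(String[] args) {
        CommitGraph g = diagram();
        System.out.println(g.between("c2", "b4").size()); // C compared to B: prints 2
        System.out.println(g.between("c2", "a5").size()); // C compared to A: prints 4
        System.out.println(g.between("c2", null).size()); // nothing excluded: prints 5
    }
}
```

The last call mirrors the fromHash=0 case: with nothing to exclude, the whole history behind the new branch comes back.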

I won't lie though - this is the cause of many frustrations for Stash, because it would be very handy to know things like - which branch a commit was first 'seen' on, and who 'pushed' it. (And wait until you start having to worry about forks).

For now, as you've discovered, you can use ChangesetIndex.isMemberOf(), which is something we use to index which commits we've 'seen' before. But that is definitely a Stash-specific tool/concept, and not related to how Git actually works. We may enhance how some of this works over time to meet our own requirements, but it's not going to be a trivial task.

I hope that helps?

Charles

Marian Grigoras October 24, 2013

Thanks for the explanation, it definitely helps to understand the problem.

I do have one problem with ChangesetIndex.isMemberOf() though. I consistently see the following: after I fork a repository, clone it, create a new branch and push that branch to Stash (with no new commits), isMemberOf() returns false for every commit, even though all of them already exist. The javadoc does state: "true if the provided changeset is (indexed as) a member of repository"

So it seems the repo is not yet indexed. This raises the question: how do I trigger an indexing of a Repository?

Thanks,

Marian

cofarrell
October 24, 2013

Hi Marian,

Indexing happens on another thread just after a push. Note that it can't happen during a pre-receive because it needs the refs to exist in the repository. Basically you can't really trigger it manually.

As I hinted at in my previous message, forks get even harder. We basically don't fully index forks because of the potential explosion in DB relationships. For large repositories each commit would have a relationship with every fork. We are still investigating how to efficiently store/retrieve this information.

My suggestion for forks would be to try calling getChangeset() instead of isMemberOf(); getChangeset() will return null if this is the first time the commit has been pushed.

Let me know if that works.

Charles

Marian Grigoras October 24, 2013

Hi Charles,

indeed, getChangeset() works on the forked repo where isMemberOf() does not.

Can you confirm that getChangeset bypasses the ChangesetIndex when getting the data?

Many thanks,

Marian

cofarrell
October 24, 2013

Hi Marian,

It doesn't bypass the index at all - it's a direct connection to the database entry. While this is going to sound confusing, we do index commits in a fork, but we don't index the 'membership'. So a null value from that method indicates that Stash has never seen the commit before in _any_ repository, whereas a non-null IndexedChangeset indicates that Stash has indexed it _somewhere_, but IndexedChangeset.getRepositories() may return nothing if the commit is only on a fork. When you merge that commit to the main repository it will then be indexed correctly and isMemberOf() will start to work for that parent repository only.
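
The three cases described above can be summarised in a small sketch. The interfaces below are simplified stand-ins written for this example: they model only getChangeset(), getRepositories() and isMemberOf() as described in this thread, with repositories reduced to plain strings. They are not the real Stash types, and the real signatures may differ.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class IndexStates {
    enum Seen { NEVER, SOMEWHERE_WITHOUT_MEMBERSHIP, IN_THIS_REPOSITORY }

    /** Hypothetical stand-in for an indexed changeset. */
    interface IndexedEntry {
        Set<String> getRepositories();
    }

    /** Hypothetical stand-in for the index; getChangeset returns null if never seen. */
    interface Index {
        IndexedEntry getChangeset(String changesetId);
        boolean isMemberOf(String changesetId, String repository);
    }

    static Seen classify(Index index, String changesetId, String repository) {
        if (index.getChangeset(changesetId) == null) {
            return Seen.NEVER; // not indexed in any repository
        }
        if (index.isMemberOf(changesetId, repository)) {
            return Seen.IN_THIS_REPOSITORY;
        }
        return Seen.SOMEWHERE_WITHOUT_MEMBERSHIP; // e.g. only pushed to a fork
    }

    /** Tiny in-memory index backed by a changeset -> repositories map. */
    static Index inMemory(final Map<String, Set<String>> membership) {
        return new Index() {
            public IndexedEntry getChangeset(String id) {
                final Set<String> repos = membership.get(id);
                return repos == null ? null : () -> repos;
            }
            public boolean isMemberOf(String id, String repository) {
                Set<String> repos = membership.get(id);
                return repos != null && repos.contains(repository);
            }
        };
    }

    static Index demoIndex() {
        Map<String, Set<String>> m = new HashMap<>();
        m.put("abc123", Collections.singleton("main-repo"));
        m.put("def456", Collections.<String>emptySet()); // indexed, but no membership recorded
        return inMemory(m);
    }

    public static void main(String[] args) {
        Index idx = demoIndex();
        System.out.println(classify(idx, "abc123", "main-repo")); // prints IN_THIS_REPOSITORY
        System.out.println(classify(idx, "def456", "main-repo")); // prints SOMEWHERE_WITHOUT_MEMBERSHIP
        System.out.println(classify(idx, "zzz999", "main-repo")); // prints NEVER
    }
}
```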

Does that make sense?

You're really getting into the nitty-gritty of how Stash works at this point, and my hope is that over time we can introduce a more accurate (but not too costly) indexing of forks.

Cheers,

Charles

Marian Grigoras October 28, 2013

Hey Charles,

thanks for the insight. I think I get the idea - which is probably wrong :)

For our workflow it looks like getChangeset is a good substitute, as we are not fork-heavy and "not in any forked repo or master" is identical to "not in this repo", but this won't be the case for everyone out there.

Looking forward to seeing those improvements soon and thanks again for your support,

Marian

Bradley Baetz July 7, 2014

Is there an Atlassian feature request for a simpler API for this? It's a common function for most pre-receive hooks. As well as the way mentioned here, the YACC plugin basically excludes refs/heads/*, which seems to work, although it feels like a different hack (see getBranches in https://github.com/sford/yet-another-commit-checker/blob/master/src/main/java/com/isroot/stash/plugin/ChangesetsServiceImpl.java#L84)

BTW, the Atlassian-supplied (but not supported) filesize plugin (https://bitbucket.org/atlassianlabs/stash-filesize-hook-plugin) also gets this wrong - with a small max-file-size configured, using the demo project "git clone https://stash/stash/projects/PROJECT_1/repos/rep_1/browse;git checkout -b test_branch; git push" fails because the plugin checks all the commits in the branch history...
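
The "exclude the existing branch heads" idea mentioned above boils down to a set difference. Below is a minimal, hypothetical model of it: the commit ids, the parent map and the newCommits helper are invented for this example and are not taken from YACC or the Stash API. It collects everything reachable from the pushed tip and drops whatever is already reachable from a branch that existed before the push.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class NewBranchCommits {
    /** Every commit reachable from any of the given tips, tips included. */
    static Set<String> reachable(Map<String, List<String>> parents, Collection<String> tips) {
        Set<String> seen = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>(tips);
        while (!todo.isEmpty()) {
            String c = todo.pop();
            if (seen.add(c)) {
                todo.addAll(parents.getOrDefault(c, Collections.<String>emptyList()));
            }
        }
        return seen;
    }

    /** Commits behind pushedTip that no pre-existing branch head can reach. */
    static Set<String> newCommits(Map<String, List<String>> parents,
                                  String pushedTip,
                                  Collection<String> existingHeads) {
        Set<String> result = new TreeSet<>(reachable(parents, Collections.singleton(pushedTip)));
        result.removeAll(reachable(parents, existingHeads));
        return result;
    }

    /** Hypothetical history: master is a1-a2-a3, a new branch adds n1-n2 on top of a2. */
    static Map<String, List<String>> demoParents() {
        Map<String, List<String>> p = new HashMap<>();
        p.put("a1", Collections.<String>emptyList());
        p.put("a2", Arrays.asList("a1"));
        p.put("a3", Arrays.asList("a2"));
        p.put("n1", Arrays.asList("a2"));
        p.put("n2", Arrays.asList("n1"));
        return p;
    }

    public static void main(String[] args) {
        Map<String, List<String>> p = demoParents();
        Set<String> master = Collections.singleton("a3");
        System.out.println(newCommits(p, "n2", master)); // prints [n1, n2]
        System.out.println(newCommits(p, "a3", master)); // branch with no new commits: prints []
    }
}
```

The second call corresponds to the test_branch scenario: a new branch that points at an existing commit contributes nothing new, so a hook built this way has nothing to check.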

1 vote
Zeeshan Maqbool March 27, 2014

Marian / Charles - can either of you show me, with some code, how to use "ChangesetIndex.getChangeset"? [ChangesetIndex is an interface, not a class, and I don't see any class in the API that implements ChangesetIndex.]
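
On obtaining an instance: an interface like this is normally not instantiated by plugin code. The usual pattern is that the plugin framework supplies an implementation, typically through a constructor parameter. The following is a minimal sketch of that pattern under stated assumptions: ChangesetLookup is a hypothetical stand-in that only mirrors the getChangeset() call discussed in this thread, the injection wiring itself is not shown, and a real hook would declare the Stash type instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class NewChangesetFilter {
    /** Hypothetical stand-in: returns null when the changeset was never indexed. */
    interface ChangesetLookup {
        Object getChangeset(String changesetId);
    }

    private final ChangesetLookup index;

    // The container (or a test) passes the implementation in; this class never creates one.
    NewChangesetFilter(ChangesetLookup index) {
        this.index = index;
    }

    /** Keeps only the ids for which the lookup returns null, i.e. never seen before. */
    List<String> onlyNew(List<String> pushedIds) {
        List<String> result = new ArrayList<>();
        for (String id : pushedIds) {
            if (index.getChangeset(id) == null) {
                result.add(id);
            }
        }
        return result;
    }

    /** Demo wiring with an in-memory lookup that knows two made-up changeset ids. */
    static NewChangesetFilter demo() {
        final Set<String> known = new HashSet<>(Arrays.asList("abc123", "def456"));
        return new NewChangesetFilter(id -> known.contains(id) ? id : null);
    }

    public static void main(String[] args) {
        List<String> pushed = Arrays.asList("abc123", "fff000", "def456", "eee111");
        System.out.println(demo().onlyNew(pushed)); // prints [fff000, eee111]
    }
}
```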

Gabor Nagy _Midori_
November 24, 2015
0 votes
Marian Grigoras October 17, 2013

Found a workaround using ChangesetIndex.isMemberOf()

Still, it would be nice to find out if fromHash=0 is the intended behaviour.
