When pushing a new branch to a Stash repository, a pre-receive hook gets a RefChange with fromHash=0. Is this the intended behaviour?
Putting that in a ChangesetsBetweenRequest.Builder.exclude(fromHash)... leads to the full repo history being returned.
How can I get only the changesets that are not already in the target repo?
This is, perhaps somewhat counter-intuitively, expected behaviour of Git (and thus Stash). Git makes no distinction about what are new commits vs commits that _just happened_ to exist in the repository when you pushed.
Perhaps an ASCII diagram to help:
o---o---o---o---o A
     \
      --o---o---o---o B
             \
              --o---o C
The question is: which commits are 'on' branch C? Is it 2 commits, or 4? The answer (as far as Git is concerned) depends entirely on what you're comparing it to. If you compare it to B then it has 2 commits, but if you compare it to A then it has 4. The fact that you might have pushed C after B is irrelevant, so when you create a new branch the hook is passed fromHash=0 rather than the last 'seen' commit. And what happens if branch B had been pushed and then deleted? If Git hadn't gc'd those commits yet, it would need to walk the entire graph to work out what was now 'visible'.
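To make the counting concrete, here is a minimal sketch of the diagram as a toy commit graph in plain Java. It does not use the Stash API; the class `CommitGraph`, its methods, and the commit ids (`a1`..`a5`, `b1`..`b4`, `c1`, `c2`) are made up for illustration, and the point where B forks from A is an assumption.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy commit graph (hypothetical, not Stash API): each commit id maps to its parent ids. */
public class CommitGraph {
    private final Map<String, List<String>> parents = new HashMap<>();

    public CommitGraph add(String id, String... parentIds) {
        parents.put(id, List.of(parentIds));
        return this;
    }

    /** All commits reachable from tip, including tip itself. */
    public Set<String> reachable(String tip) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(tip);
        while (!todo.isEmpty()) {
            String id = todo.pop();
            if (seen.add(id)) {
                for (String p : parents.getOrDefault(id, List.of())) {
                    todo.push(p);
                }
            }
        }
        return seen;
    }

    /** Commits reachable from include but not from exclude; a null exclude removes nothing. */
    public Set<String> between(String include, String exclude) {
        Set<String> result = reachable(include);
        if (exclude != null) {
            result.removeAll(reachable(exclude));
        }
        return result;
    }

    /** The graph from the diagram: tips are a5 (A), b4 (B) and c2 (C). */
    public static CommitGraph diagram() {
        return new CommitGraph()
            .add("a1").add("a2", "a1").add("a3", "a2").add("a4", "a3").add("a5", "a4")
            .add("b1", "a2").add("b2", "b1").add("b3", "b2").add("b4", "b3")
            .add("c1", "b2").add("c2", "c1");
    }
}
```

With this graph, `between("c2", "b4")` contains 2 commits and `between("c2", "a5")` contains 4. Passing `null` as the exclusion, which plays the role of fromHash=0 here, returns all 6 commits reachable from C, i.e. the full history.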
I won't lie though - this is the cause of many frustrations for Stash, because it would be very handy to know things like - which branch a commit was first 'seen' on, and who 'pushed' it. (And wait until you start having to worry about forks).
For now, as you've discovered, you can use ChangesetIndex.isMemberOf(), which is something we use to index which commits we've 'seen' before. But that is definitely a Stash specific tool/concept, and not related to how Git actually works. We may enhance how some of this works over time to meet our own requirements, but it's not going to be a trivial task.
I hope that helps?
Thanks for the explanation, it definitely helps to understand the problem.
I do have one problem with ChangesetIndex.isMemberOf() though. Whenever I fork, clone, create a new branch and push that branch to Stash (with no new commits), isMemberOf returns false for every commit, even though they are all existing commits. The javadoc does state: "true if the provided changeset is (indexed as) a member of
So it seems the repo is not yet indexed. This raises the question: how do I trigger an indexing of a Repository?
Indexing happens on another thread just after a push. Note that it can't happen during a pre-receive because it needs the refs to exist in the repository. Basically you can't really trigger it manually.
As I hinted at in my previous message, forks get even harder. We basically don't fully index forks because of the potential explosion in DB relationships. For large repositories each commit would have a relationship with every fork. We are still investigating how to efficiently store/retrieve this information.
My suggestion for forks would be to maybe try calling getChangeset() instead of isMemberOf(), which will return null if it was the first time it was pushed.
Let me know if that works.
It doesn't bypass the index at all - it's a direct connection to the database entry. While this is going to sound confusing, we do index commits in a fork, but we don't index the 'membership'. So a null value from that method indicates that Stash has never seen the commit before in _any_ repository, whereas a non-null IndexedChangeset indicates that Stash has indexed it _somewhere_, but IndexedChangeset.getRepositories() may return nothing if the commit is only on a fork. When you merge that commit to the main repository it will then be indexed correctly and isMemberOf() will start to work for that parent repository only.
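The three cases can be summarised in a small sketch. `SeenStatus`, `IndexedCommit` and `classify` are hypothetical stand-ins, not the Stash API; in a real hook the first argument would be whatever getChangeset() returned, and the repository set would come from IndexedChangeset.getRepositories().

```java
import java.util.Set;

/** Hypothetical sketch of how to read the index lookup result; not Stash API. */
public class SeenStatus {

    /** Stand-in for an indexed commit and the repositories it is recorded as a member of. */
    public static final class IndexedCommit {
        final String id;
        final Set<String> repositories;

        public IndexedCommit(String id, Set<String> repositories) {
            this.id = id;
            this.repositories = repositories;
        }
    }

    public enum Status {
        NEVER_SEEN,          // lookup returned null: not indexed in any repository
        SEEN_NOT_A_MEMBER,   // indexed somewhere, but not as a member of this repository
        MEMBER_OF_REPO       // indexed as a member of this repository
    }

    /** Classifies a lookup result for the repository identified by repoSlug. */
    public static Status classify(IndexedCommit indexed, String repoSlug) {
        if (indexed == null) {
            return Status.NEVER_SEEN;
        }
        if (indexed.repositories.contains(repoSlug)) {
            return Status.MEMBER_OF_REPO;
        }
        // Covers the fork case, where membership is not indexed and the set may be empty.
        return Status.SEEN_NOT_A_MEMBER;
    }
}
```

Only `NEVER_SEEN` is a reliable "this commit is new" signal across forks; the middle state says nothing about which repository the commit actually lives in.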
Does that make sense?
You're really getting in the nitty-gritty of how Stash works at this point, and my hope is that over time we can introduce a more accurate (but not too costly) indexing of forks.
Thanks for the insight. I think I get the idea - which is probably wrong :)
For our workflow it looks like getChangeset is a good substitute, as we are not fork-heavy and "not in any forked repo or master" is identical to "not in this repo", but this won't be the case for everyone out there.
Looking forward to seeing those improvements soon, and thanks again for your support.
Is there an Atlassian feature request for a simpler API for this? It's a common function for most pre-receive hooks. As well as the way mentioned here, the YACC plugin basically excludes refs/heads/*, which seems to work, although it feels like a different hack (see getBranches in https://github.com/sford/yet-another-commit-checker/blob/master/src/main/java/com/isroot/stash/plugin/ChangesetsServiceImpl.java#L84)
BTW, the Atlassian-supplied (but not supported) filesize plugin (https://bitbucket.org/atlassianlabs/stash-filesize-hook-plugin) also gets this wrong: with a small max-file-size configured, using the demo project "git clone https://stash/stash/projects/PROJECT_1/repos/rep_1/browse;git checkout -b test_branch; git push" fails because the plugin checks all the commits in the branch history....
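For reference, the idea behind excluding refs/heads/* can be sketched without any Stash or YACC code: take everything reachable from the pushed tip and subtract everything reachable from the branch heads that already exist. This is roughly what `git rev-list <tip> --not --branches` computes. The class `NewCommits`, its method names and the commit ids are hypothetical, and this is an illustration of the approach, not the YACC implementation.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: new commits = reachable(pushed tip) minus reachable(existing heads). */
public class NewCommits {

    public static Set<String> find(Map<String, List<String>> parents,
                                   String pushedTip,
                                   Collection<String> existingHeads) {
        // Everything already reachable from a branch that existed before the push.
        Set<String> known = new LinkedHashSet<>();
        for (String head : existingHeads) {
            walk(parents, head, known);
        }
        // Everything reachable from the pushed tip, with the known commits removed.
        Set<String> fresh = new LinkedHashSet<>();
        walk(parents, pushedTip, fresh);
        fresh.removeAll(known);
        return fresh;
    }

    /** Adds tip and all of its ancestors to seen. */
    private static void walk(Map<String, List<String>> parents, String tip, Set<String> seen) {
        Deque<String> todo = new ArrayDeque<>();
        todo.push(tip);
        while (!todo.isEmpty()) {
            String id = todo.pop();
            if (seen.add(id)) {
                for (String p : parents.getOrDefault(id, List.of())) {
                    todo.push(p);
                }
            }
        }
    }
}
```

For a new branch that points at an existing commit (the test_branch case above), the result is empty, so a hook built this way would have nothing to check. The trade-off is that it only knows about heads in the target repository, so it says nothing about commits that exist only in forks.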