
Stash webhook should not page the changes

David Hoyt January 9, 2014

The webhook seems to be truncating the list of commits at 100. Is there a way to prevent this from happening? The webhook should POST all the related data at once.

1 answer

0 votes
cofarrell
Rising Star
January 9, 2014

Hi David,

That's very much intentional. We try to limit everything we do in Stash so that we don't start running into performance problems when user data is unexpectedly large.

In this particular case there is a global setting that you can set in the $STASH_HOME/stash-config.properties file.

plugin.com.atlassian.stash.plugin.hook.changesLimit=100

EDIT: Apologies, I think the limit you want is the following, although the default should be 500, not 100.

plugin.com.atlassian.stash.plugin.hook.changesetsLimit=500

I would be very careful about making this too large.

Cheers,

Charles

David Hoyt January 9, 2014

This is a problem because the service receiving the POST must know to reach back into Stash and potentially make multiple requests. Many race conditions could occur, not to mention network-partition problems. For webhooks, the data should be provided in one payload, period. For the web UI this is fine -- there I would want it paged -- but not for webhooks.

David Hoyt January 9, 2014

It's unclear why you would have performance-related problems. 100 changes in a single commit is not that much to process (granted, a person pushing that many commits at once should rethink what they're doing, but during a large merge it could be feasible). For the webhook I'd prefer to disable the limit entirely.

As for the performance problem, why not stream the data out? You can use a constant amount of memory if architected correctly.
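To illustrate the streaming point: a producer can emit an arbitrarily long commit list in a single HTTP response while holding only one commit in memory at a time, by writing the JSON array incrementally instead of serializing the whole list at once. This is just a sketch of the idea, not anything Stash actually does; the function and payload shape are made up for the example.

```python
import json

def stream_commit_array(commits, write):
    """Write a JSON array of commits one element at a time via write().

    `commits` can be any iterable (e.g. a generator walking the
    repository), so memory use stays constant regardless of how many
    commits there are.
    """
    write("[")
    for i, commit in enumerate(commits):
        if i:
            write(",")
        write(json.dumps(commit))
    write("]")
```

Hooked up to a chunked HTTP response, `write` would be the socket/stream writer; the consumer still receives one complete JSON payload.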

David Hoyt January 9, 2014

That's not a feasible solution for many webhook consumers. Yes, you can have a network-partition problem whenever two machines communicate over an unreliable medium (the network), but I was referring to the fact that the issue is exacerbated when multiple calls are required. It also requires the consumer to know something about the producer (my service needs to know how to formulate a query to get the next page of info. -- speaking of that, how is that done?).

Following your suggestion requires much more work, plus additional storage for each repository, plus having to issue git commands and/or using additional dependencies in order to properly discover and process the necessary information. And all of that still suffers from network partitioning problems.

At the very least, can I specify "plugin.com.atlassian.stash.plugin.hook.changesLimit=-1" to indicate I want "no limits whatsoever"?

cofarrell
January 9, 2014

Hi David,

Honestly, I wouldn't rely on the webhook data like that. As you mention, you might hit network-partition problems and lose the data. If it were me, I would use the webhook only to see which refs changed, and then actually fetch the latest data with Git. That way you can guarantee you never miss anything. You could also poll to handle missed hook calls.
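A minimal sketch of this approach: treat the webhook payload purely as a notification of which refs moved, then ask Git for the authoritative commit list. The `refChanges`/`refId`/`fromHash`/`toHash` field names below are assumptions about the payload shape, and the repository is assumed to be a local clone you keep fetched.

```python
import subprocess

def changed_ranges(payload):
    # Assumed payload shape: {"refChanges": [{"refId": ..., "fromHash": ..., "toHash": ...}]}
    return [(c["refId"], c["fromHash"], c["toHash"])
            for c in payload.get("refChanges", [])]

def list_new_commits(repo_dir, old, new):
    """Fetch, then list every commit in old..new -- no truncation."""
    subprocess.run(["git", "fetch", "origin"], cwd=repo_dir, check=True)
    out = subprocess.run(["git", "rev-list", f"{old}..{new}"],
                         cwd=repo_dir, check=True,
                         capture_output=True, text=True)
    return out.stdout.split()
```

Since `git rev-list` walks the local object database, the webhook payload size no longer matters, and a polling pass over the refs can recover anything a missed hook call dropped.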

Otherwise just make that limit really large...

Cheers,

Charles

cofarrell
January 9, 2014

Hi David,

To page more results you could use the Stash REST API:

https://developer.atlassian.com/static/rest/stash/latest/stash-rest.html#idp852240

This endpoint can take since/until for the commit range, as well as start/limit query parameters for paging.

Unfortunately, it doesn't look like our common paging API will allow -1 as a limit; it will default to a minimum of 1.
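For what it's worth, draining such a paged endpoint is a short loop. Stash's paged responses carry `values`, `isLastPage`, and `nextPageStart` fields; the sketch below takes the page-fetching call as an injected function (in practice an authenticated HTTP GET with `start`/`limit` query parameters against the commits endpoint above).

```python
def drain_pages(fetch_page, limit=100):
    """Collect all values from a Stash-style paged endpoint.

    fetch_page(start, limit) must return a dict shaped like a Stash
    page: {"values": [...], "isLastPage": bool, "nextPageStart": int}.
    """
    start = 0
    values = []
    while True:
        page = fetch_page(start, limit)
        values.extend(page["values"])
        if page.get("isLastPage", True):
            return values
        start = page.get("nextPageStart", start + limit)
```

This is still N round trips rather than the single payload David is asking for, but it hides the paging behind one call.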

Cheers,

Charles

PS. See my edit in the answer above; I may have given you the wrong property before.

David Hoyt January 9, 2014

So I would need to call back into the REST API? All of this is starting to sound extremely silly to me. None of what I'm hearing seems consistent with what I know and am familiar with regarding webhooks (certainly happy to be educated, though, if this is what goes on elsewhere). :D

To get around the API limitation you could always loop through the pages and stream the data a page at a time.

Perhaps I should simply fork the webhook plugin and make the changes I need? Would you be able to point me to where the source lives for the plugin?
