Any ideas on how to prevent the multiple DVCS sync warning problem?

Irwin Schreiman April 25, 2019

I've now had my logs fill up with duplicate DVCS sync warnings a few times. The most recent time there were over 7,000 duplicate entries. The article at https://confluence.atlassian.com/jirakb/logs-are-full-of-dvcs-sync-warnings-found-more-than-one-match-for-953648787.html is great for resolving the issue, but it doesn't explain why it happens or how to prevent it. We're using the DVCS connector to sync with GitHub and also using Fisheye to sync with some Perforce and SVN repositories. Any ideas?

1 answer

Answer accepted
Andy Heinzer
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
May 6, 2019

Hi Irwin,

The KB you cited alludes to one cause of the problem: syncs that are routinely failing. When this happens, we're likely getting into a state where the database is confused because it has more than one entry for a field/value that should be unique.

The KB does mention one cause of this for Bitbucket:

The sync may fail if there are too many duplicate entries (JIRA keeps retrying probably leading to Rate Limit Exceeded)

So it seems logical to me that you could be exceeding GitHub's rate limits in some way, which might be causing problems when Jira tries to sync that data. I found some details in Understanding rate limits for GitHub Apps that might help.
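
One way to test the rate-limit theory is to look at the remaining quota for the account the DVCS connector authenticates as. GitHub's REST API reports this at `GET https://api.github.com/rate_limit`. The following is only a minimal Python sketch that summarizes such a response: the sample payload and the helper name `summarize_core_limit` are made up for illustration, and it assumes the `limit`, `remaining`, and `reset` fields under `resources.core`.

```python
import json
from datetime import datetime, timezone


def summarize_core_limit(payload: dict) -> dict:
    """Summarize the 'core' bucket of a GitHub /rate_limit response."""
    core = payload["resources"]["core"]
    limit, remaining = core["limit"], core["remaining"]
    return {
        "limit": limit,
        "remaining": remaining,
        # Share of the hourly quota already consumed.
        "used_fraction": (limit - remaining) / limit if limit else 0.0,
        # 'reset' is a Unix timestamp in seconds.
        "resets_at": datetime.fromtimestamp(core["reset"], tz=timezone.utc),
    }


# Hypothetical sample payload, shaped like a /rate_limit response.
sample = json.loads(
    '{"resources": {"core": {"limit": 5000, "remaining": 120, "reset": 1557230400}}}'
)
summary = summarize_core_limit(sample)
print(
    f"{summary['remaining']} of {summary['limit']} requests left; "
    f"resets at {summary['resets_at']:%H:%M} UTC"
)
```

If the remaining count drops close to zero shortly after a sync starts, the rate limit becomes a more plausible culprit.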

On a side note, have you recently cloned your Jira instance, perhaps to do a test upgrade? I ask because if you created an XML backup of your Jira database, or cloned the existing Jira database to set up another Jira site for testing, it's possible that the cloned instance is making the same calls to the same GitHub host.

I hope this helps.

Andy

Irwin Schreiman May 6, 2019

Thanks, Andrew. I have cloned, but I deleted the DVCS connection on the clone. I think you may be on the right track with the rate limit thing. We have over 400 users and 400 repos. Perhaps that is too many for the system; I'm not sure. I haven't configured any special transitions and instead use the out-of-the-box integration. We do use pull requests for code reviews, so the same change could generate multiple pull requests. Could that also be a problem?

Andy Heinzer
Atlassian Team
May 7, 2019

It is possible. Commits, pull requests, branches, and merges are all data that Jira will try to sync and then match up with the Jira issue key in question, so a large number of them could be a factor here.
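
As a rough sanity check of that idea: GitHub documents a limit of 5,000 REST API requests per hour for authenticated user requests (the actual limit depends on how the connection authenticates, so treat the figure as an assumption). Spread over the 400 repositories mentioned above, the budget per repository is small. How many calls the DVCS connector really makes per repository is not known here; the helper below is purely illustrative.

```python
def requests_per_repo(hourly_limit: int, repo_count: int) -> float:
    """Average API requests available per repository within one hourly window."""
    if repo_count <= 0:
        raise ValueError("repo_count must be positive")
    return hourly_limit / repo_count


# Assumed figures: 5,000 requests per hour shared by 400 repositories.
budget = requests_per_repo(5000, 400)
print(f"{budget:.1f} requests per repository per hour")  # 12.5
```

A repository with many new commits, branches, and pull requests could plausibly need more than that in a single sync.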

Irwin Schreiman May 7, 2019

Thanks again. As far as I can tell, there is no way to throttle the syncing in Jira or to stagger the polling time per repository (like I do with Perforce and Fisheye to prevent a similar problem on the Perforce server). Is that correct?

Andy Heinzer
Atlassian Team
May 7, 2019

I don't know of any per-repo settings you could adjust for DVCS in Jira, but you could follow the KB How to change the interval or schedule of the DVCS repositories sync. The default interval is 1 hour; extending it to a longer interval might help here. The trade-off is that the longer the interval, the longer Jira has stale data.

Please note that there are two different values that might have to change here, next_run and interval_millis: one sets the next run time, and the other determines how much time passes between runs.
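
Assuming, as the name suggests, that interval_millis is expressed in milliseconds, and that next_run is stored as a Unix timestamp in milliseconds (an assumption; check the KB and your own database before changing anything), the values for a longer interval can be worked out with a short Python sketch like this. The helper names are illustrative only.

```python
from datetime import datetime, timedelta, timezone


def interval_millis(hours: float) -> int:
    """Convert a sync interval in hours to milliseconds."""
    return int(timedelta(hours=hours).total_seconds() * 1000)


def next_run_millis(now: datetime, hours: float) -> int:
    """Epoch milliseconds one interval after `now` (assumed storage format)."""
    return int((now + timedelta(hours=hours)).timestamp() * 1000)


print(interval_millis(1))  # 3600000, the default one-hour interval
print(interval_millis(4))  # 14400000, a four-hour interval

# Example: schedule the next run four hours after a fixed reference time.
now = datetime(2019, 5, 7, 12, 0, tzinfo=timezone.utc)
print(next_run_millis(now, 4))
```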

Irwin Schreiman May 7, 2019

Thanks, that could help. I'm also thinking of trying something else: instead of creating one GitHub DVCS app for the entire organization, separating it out into multiple ones based on team. That way each connection to GitHub will have fewer repos to sync. I'm not sure it will work, but I plan to try it.
