Stash backup client: Why do we have a timeout waiting for SCM drain?

Pierre Zurmely December 31, 2014

I currently get this error message during my backup:

2014-12-31 02:01:28,478 ERROR [main] c.a.s.i.backup.client.BackupMain A backup could not be created. Reason: Operations from one or more SCMs did not finish within the allotted timeout. To prevent corruption due to inconsistent state, the backup has been aborted. Please try backup up again when the system is under less load.

I have set the following value in stash-config.properties:

backup.drain.database.timeout=120

But:

Is this parameter also valid for SCM?

What are the risks of increasing this wait? Does the client start backing things up even if the database and the SCM are not yet both drained?

Thanks to anyone who can share a hint about this setting.

Regards,

Pierre

5 answers

1 vote
ThiagoBomfim
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
January 22, 2015

Hi Pierre,

Thanks for your observation. Indeed, there was a mistake in the KB, and I've corrected it so that it lists the right parameters (they need tweaking on both the server and the client):

That said, could you please follow the recommendations above and let us know how you go?

Best regards,
Thiago Bomfim
Atlassian Support

Edit: I also changed the KB title so it's more searchable.

ThiagoBomfim
Atlassian Team
January 27, 2015

Hi Pierre, Did you get a chance to look at the KB above? All the best, Thiago

Pierre Zurmely January 27, 2015

Thank you Thiago. Since any change in stash-config.properties requires a Stash restart to take effect, I will do this at our next allowed maintenance window. I will keep you informed of the results after next Tuesday.

ThiagoBomfim
Atlassian Team
January 28, 2015

Please do. Cheers!

0 votes
Pierre Zurmely January 20, 2015

Hello,

I am still having trouble with the latest 1.6 version of the backup tool.

And I am really confused by this page:

https://confluence.atlassian.com/display/STASHKB/Backup+client+-+Failed+to+drain+SCM

The page seems to mix up the SCM/Git drain error with a setting that acts only on the database side of things.

So, how can I handle errors like:

2015-01-21 02:02:08,891 ERROR [main] c.a.s.i.backup.client.BackupMain A backup could not be created. Reason: Operations from one or more SCMs did not finish within the allotted timeout. To prevent corruption due to inconsistent state, the backup has been aborted. Please try backup up again when the system is under less load.

Do I have to go into maintenance mode several minutes before starting the backup script?

How can I add a delay?

0 votes
ThiagoBomfim
Atlassian Team
January 4, 2015

Hi Pierre,

The description of this parameter is in the document below:

  • Stash config properties - Backup

    "Defines the number of seconds Stash will wait for connections to the database to drain and latch in preparation for a backup.

    Value is in SECONDS."

You might have come across the KB below, which gives you more details on how to troubleshoot it:

Best regards,
Thiago Bomfim

Pierre Zurmely January 4, 2015

Thank you Thiago, I have already read this "Stash config properties - Backup" document. Does it mean that it concerns only the database drain and that we cannot define any timeout for SCM drain activities? My error message is about SCM: "Operations from one or more SCMs did not finish within the allotted timeout." After increasing the timeout from 60 to 120 seconds, I no longer reproduce the error, but I am not sure there is a direct link, as it seems to be a setting for database operations only. I had missed this KB; I will consider the possibility of going into UPM safe mode. It is very interesting, thank you! Regards, Pierre Zurmely

ThiagoBomfim
Atlassian Team
January 4, 2015

Hi Pierre, Yes, it is the connections to the database only. I hope this has helped you. Thiago

0 votes
Mike Friedrich
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
December 31, 2014

The timeout exists to allow the backup to wait until internal buffers are flushed and the system is in a consistent state. The backup doesn't wait indefinitely, in order to avoid deadlocks (usually caused by a bug, but it has already happened).
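
As a rough illustration of the general idea (this is not Stash's actual implementation), a drain with a timeout can be sketched as follows. The `DrainLatch` class and its method names are hypothetical, invented for this example: new operations are refused once the latch is closed, and the caller waits a bounded time for in-flight operations to finish.

```python
import threading


class DrainLatch:
    """Hypothetical sketch of a drain-and-latch step with a timeout."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0
        self._latched = False

    def begin_operation(self):
        """Register a new operation; refused while the latch is closed."""
        with self._cond:
            if self._latched:
                return False
            self._in_flight += 1
            return True

    def end_operation(self):
        """Mark one in-flight operation as finished and wake any waiter."""
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def drain(self, timeout_seconds):
        """Close the latch and wait up to timeout_seconds for in-flight work.

        Returns True if everything drained. On timeout, returns False and
        reopens the latch so the caller can abort instead of waiting forever.
        """
        with self._cond:
            self._latched = True
            drained = self._cond.wait_for(
                lambda: self._in_flight == 0, timeout=timeout_seconds
            )
            if not drained:
                self._latched = False
            return drained

    def release(self):
        """Reopen the latch once the backup has finished."""
        with self._cond:
            self._latched = False
```

In this sketch, a caller would abort the backup when `drain()` returns `False`, which mirrors the "did not finish within the allotted timeout" behaviour described in the error message.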

Pierre Zurmely January 1, 2015

Thank you Mike for those two answers. I do understand that it is useful to have a timeout to handle any unexpected behavior. Do you know more about the nature of this backup.drain.database.timeout setting? Is it valid for both the database and SCM, or only for the database? Since this drain is used to ensure that the system is in a consistent state, does it mean that the backup waits for both drains before starting the sync, and therefore that there is no risk in increasing it? Thanks and regards, Pierre

0 votes
Mike Friedrich
Rising Star
December 31, 2014

It should give up after the timeout. At least the DIY backup script examples do that, and the internal backup should behave the same way.
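
Since the backup gives up after the timeout and the error message suggests trying again under less load, one option is to wrap the backup in a small retry loop. This is only a minimal sketch: `run_backup` is a hypothetical callable that you would write yourself to start the backup and return `True` on success, and the attempt count and delay are arbitrary illustration values.

```python
import time


def run_with_retries(run_backup, attempts=3, delay_seconds=600, sleep=time.sleep):
    """Call run_backup() until it succeeds or the attempts run out.

    run_backup is a hypothetical callable returning True on success.
    Returns the number of the successful attempt, or None if all failed.
    """
    for attempt in range(1, attempts + 1):
        if run_backup():
            return attempt
        # Wait before the next try, but not after the final failure.
        if attempt < attempts:
            sleep(delay_seconds)
    return None
```

How `run_backup` detects success is up to you; for example, it could launch your backup command and inspect the result, after verifying how that command actually reports failure.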
