Atlassian high availability: cold failover?

We are trying to set up a Disaster Recovery solution for our Atlassian applications (so far, Crowd, JIRA, Confluence and Crucible). The production environment consists of the following servers:

  • Crowd: running on Windows, linked with AD to provide user authentication
  • JIRA, Confluence and Crucible: each of these runs on its own Ubuntu 10.04 server
  • MS SQL Server 2008: Database server shared by all the above

Our approach is to have every server replicated to a cold server (in a different geographical location), do an rsync to keep the different data folders up to date and have a secondary database server that we keep up to date with database replication.

The first issue is to make sure we filter which files are replicated through rsync, so we do not overwrite cold server settings such as the database configuration (which should point to the failover DB server).
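As a rough illustration of the file filtering described above, here is a minimal Python sketch that builds an rsync command line with `--exclude` patterns. The file names (`dbconfig.xml` for JIRA, `confluence.cfg.xml` for Confluence), the paths and the host name `dr-jira.example.com` are assumptions for the example, not taken from the poster's environment; check where each application actually keeps its database settings before using anything like this.

```python
import shlex

# ASSUMPTION: files that hold the database connection settings per application.
# Verify these against your own installation before relying on them.
EXCLUDES = {
    "jira": ["dbconfig.xml"],
    "confluence": ["confluence.cfg.xml"],
}


def build_rsync_command(app, source_dir, target):
    """Return the rsync argument list for one application's data folder."""
    cmd = ["rsync", "-a", "--delete"]
    for pattern in EXCLUDES.get(app, []):
        # Excluded files are neither copied nor deleted on the cold server.
        cmd.append(f"--exclude={pattern}")
    # Trailing slash: copy the directory contents, not the directory itself.
    cmd += [source_dir.rstrip("/") + "/", target]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths and host name, for illustration only.
    cmd = build_rsync_command(
        "jira",
        "/var/atlassian/jira-home",
        "dr-jira.example.com:/var/atlassian/jira-home/",
    )
    print(shlex.join(cmd))
```

Keeping the exclude list in one place means an upgrade that moves or renames a configuration file only needs a change in a single spot.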

The second problem is to filter which tables get replicated for the databases. The latest releases of the Atlassian apps store the User Directory configuration in the DB. This means that if we do not filter these settings, the failover JIRA server would point to the production Crowd instead of the failover one.

We still haven't completed this setup, but would like to hear any thoughts about it and other possible solutions to provide resilience to our Atlassian environment. I'm especially concerned about the administrative burden this will bring when upgrading the live environment. Also, any changes to the configuration files and/or the configuration settings stored in the DB in future releases would probably break our cold failover environment.

2 answers

1 accepted

3 votes
Answer accepted

Sounds unnecessarily complex... your JDBC URL should contain the DNS alias for the database server, so that if the database is failed over, the same URL automatically points to the DR database system. Unless you are a very small company, I would have thought this should be provided for you by the DBAs.
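To make the DNS alias idea concrete, here is a small Python sketch. It assumes the jTDS driver URL format for SQL Server (`jdbc:jtds:sqlserver://host:port/database`); the alias `atlassian-db.example.com` and the database name `jiradb` are made up for the example. The point is only that production and DR carry the same URL, and that a quick check can flag a config that still names a physical server.

```python
from urllib.parse import urlparse

# ASSUMPTION: a DNS alias (CNAME) that the DBAs repoint on failover.
DB_ALIAS = "atlassian-db.example.com"


def jdbc_url(db_host, database, port=1433):
    """Build a jTDS-style JDBC URL for SQL Server."""
    return f"jdbc:jtds:sqlserver://{db_host}:{port}/{database}"


def uses_alias(url, alias=DB_ALIAS):
    """Return True if the JDBC URL targets the alias, not a physical host."""
    prefix = "jdbc:jtds:"
    if not url.startswith(prefix):
        return False
    # What remains looks like sqlserver://host:port/db, which urlparse handles.
    return urlparse(url[len(prefix):]).hostname == alias


if __name__ == "__main__":
    prod = jdbc_url(DB_ALIAS, "jiradb")  # hypothetical database name
    dr = jdbc_url(DB_ALIAS, "jiradb")
    print(prod)
    print("identical on both sites:", prod == dr)
    print("physical host flagged:", not uses_alias(jdbc_url("sqlprod01", "jiradb")))
```

With the alias in place, the database configuration file no longer needs to be excluded from replication, since it is the same on both sides.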

I don't use Crowd, but the same thing applies to LDAP servers. You point to one that gets round-robinned by DNS, and any that are down get dropped automatically. So I'd suggest you just set up DR for Crowd and use F5s or whatever to automatically have the Crowd URL directed to the correct Crowd server.

We use a clustered filesystem so in the event of failover the filesystem is automatically mounted on the DR machine. If we had to change configuration files or ensure that they had not been synced that would just increase the chance of a problem in an already panicked situation.

In short, at least for the DB thing, try to leverage whatever your DBAs recommend.

Thanks for the tick, hopefully other people will chime in with more information. One final piece of advice: test it! And then again every 6 months or so.

Have to say your solution is embarrassingly simple :)

I agree it'd be good to hear about other people's implementations.

I'm thinking of creating static entries in the failover servers' hosts files to point to the LDAP and DB servers. This way we can test it without bringing the prod environment down, and there'll be fewer steps to follow in case of failover. We are thinking of doing this manually, no F5s ;)
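A minimal sketch of the static hosts entries idea, in Python. The service names and the `192.0.2.x` addresses (a documentation-only range) are placeholders; the script only renders the lines that would be appended to `/etc/hosts` on a failover server and does not touch the file itself.

```python
# ASSUMPTION: service names used in the application configs, mapped to the
# addresses of the failover LDAP/Crowd and DB servers. All values are made up.
DR_OVERRIDES = {
    "atlassian-db.example.com": "192.0.2.10",
    "crowd.example.com": "192.0.2.20",
}


def render_hosts_entries(overrides):
    """Render hosts-file lines that pin each service name to a DR address."""
    lines = ["# DR overrides: pin service names to the failover hosts"]
    for name, ip in sorted(overrides.items()):
        lines.append(f"{ip}\t{name}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_hosts_entries(DR_OVERRIDES), end="")
```

Because the overrides live only on the failover servers, the replicated application configs can keep the same names on both sites.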

3 votes
Stefan Broda Atlassian Team Jun 05, 2012

On this topic: Atlassian has just released a dedicated best practice guide for High Availability. It covers a cold failover scenario and includes implementation details on reverse proxying, monitoring, replication and failover mechanisms:

https://confluence.atlassian.com/display/ATLAS/Failover+for+JIRA

How does one access this document? We're about to start a migration/combination and this doc would really come in handy.

No... I can't access it anymore. Presumably, now that Data Center is available, this document has been retired?

It would be useful for the rest of us, as I need to test our cold standby environment, and it's been a few months since I last reviewed this doc!

Can someone at Atlassian free it up from its black hole?

CB Atlassian Team Jul 30, 2014

Hi, you can find the newest version of the document here: https://confluence.atlassian.com/display/ENTERPRISE/Failover+for+JIRA+Data+Center

Hi Christine, I don't see any data other than a basic image.

Your previous doc had Heartbeat and DRBD information and a bit on database replication.

Cheers

Sadly, the new link doesn't have much information at all. There are many of us who are either not using Jira Data Center yet, or choose not to for various reasons. For example, my company has datacenters in different geographical regions. Jira Data Center doesn't cluster between different geographic locations yet. So for us, the cold failover approach makes more sense. But I can't seem to find cold failover documents for Jira *anywhere* on Atlassian's site; the few pages that still exist appear to be restricted. I see stuff for Confluence, Bamboo, Stash... but not Jira. If I were a conspiracy theorist, I'd say we are being heavily encouraged to use Jira Data Center. ;)
