Disaster Recovery deployment for JSM

Guys,

This is purely a discussion about deploying disaster recovery (DR) for JSM.

Previously, I tried the approach described at https://confluence.atlassian.com/enterprise/disaster-recovery-guide-for-jira-692782022.html, but I found the whole process quite complicated. So I tried another procedure suited to my own environment:

We have a PostgreSQL cluster managed by Consul and Patroni (three PostgreSQL instances: one primary and two replicas; only the primary is read/write). In the production site, I deployed JSM connecting to PostgreSQL through HAProxy, and it has been working fine so far. In the DR site, I installed another JSM instance without bringing it up. This is the initial state (and maybe the normal state in the future).

In this experiment, I copied dbconfig.xml to the DR site, shut down the production JSM manually, and then started the JSM in DR. It came up and is running. Since the data is retrieved from the same database, all system-level configuration is identical, including the base URL, system ID and so on (for now I have to modify the base URL to access it, since there is no DNS or load balancer configured in front).
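A quick sanity check before starting the DR instance is to confirm that both copies of dbconfig.xml point at the same JDBC endpoint. The following is only a minimal sketch: the sample XML follows the usual Jira dbconfig.xml layout, while the host name, database name and credentials in it are made-up placeholders, and `jdbc_url` and `same_database` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Sample content for illustration only; host, database name and credentials
# are placeholders, not values from this thread.
SAMPLE_DBCONFIG = """<jira-database-config>
  <name>defaultDS</name>
  <delegator-name>default</delegator-name>
  <database-type>postgres72</database-type>
  <schema-name>public</schema-name>
  <jdbc-datasource>
    <url>jdbc:postgresql://haproxy.example.internal:5432/jsmdb</url>
    <driver-class>org.postgresql.Driver</driver-class>
    <username>jsm</username>
    <password>changeme</password>
  </jdbc-datasource>
</jira-database-config>
"""


def jdbc_url(dbconfig_xml: str) -> str:
    """Return the JDBC URL stored under <jdbc-datasource><url>."""
    root = ET.fromstring(dbconfig_xml)
    node = root.find("./jdbc-datasource/url")
    if node is None or not node.text:
        raise ValueError("no <jdbc-datasource><url> element found")
    return node.text.strip()


def same_database(prod_xml: str, dr_xml: str) -> bool:
    """True when both configs point at the same JDBC endpoint."""
    return jdbc_url(prod_xml) == jdbc_url(dr_xml)
```

In practice you would read the two files from the production and DR home directories and compare them; the check only covers the URL, not credentials or pool settings.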

What concerns me now is how to sync the files generated or installed in the production site (such as attachments, avatars and installed plugins) to the DR site. I left index snapshots unchecked because I would like to run a full re-index once the failover is done. I tried the Replication function, and it does copy those files to the secondary site (I tried NFS; the permissions part almost killed me), but the files are too scattered to sync back to the expected directories in DR.
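One way to avoid the scattered layout is to mirror each directory to the same relative path on the DR side. The sketch below only builds the rsync command lines and does not run them. It assumes the usual Jira home layout (`data/attachments`, `data/avatars`, `plugins/installed-plugins`), SSH access from production to DR, and made-up host and path names; `rsync_commands` is a hypothetical helper, not part of any Atlassian tooling.

```python
import shlex
from pathlib import PurePosixPath

# Assumed subdirectories of the Jira (shared) home; adjust to your layout.
SYNC_DIRS = ("data/attachments", "data/avatars", "plugins/installed-plugins")


def rsync_commands(prod_home, dr_host, dr_home, dirs=SYNC_DIRS):
    """Build one rsync command per directory, keeping the same relative path.

    -a        preserves permissions, ownership and timestamps
    --delete  removes files on DR that no longer exist in production
    -e ssh    transfers over SSH
    The parent directories must already exist on the DR host.
    """
    commands = []
    for rel in dirs:
        src = str(PurePosixPath(prod_home) / rel) + "/"
        dst = f"{dr_host}:{PurePosixPath(dr_home) / rel}/"
        commands.append(["rsync", "-a", "--delete", "-e", "ssh", src, dst])
    return commands


if __name__ == "__main__":
    # Placeholder host and paths for illustration.
    for cmd in rsync_commands("/var/atlassian/jira-shared", "dr-node",
                              "/var/atlassian/jira-shared"):
        print(shlex.join(cmd))
```

Because source and destination share the same relative path, the DR instance finds the files where it expects them. Whether `--delete` is desirable depends on how much you want DR to trail production.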

Your suggestions would be much appreciated, and I will try them out in my lab.

Thanks.

Jason Du

2 comments

Hi Jason,

We also run JSM Data Center with an HA setup and a Patroni cluster. However, the application displays I/O errors. The DB is not corrupt, yet we still get an I/O error.

May I ask whether your setup involved anything special?

We run the cluster across different data centers: our own data center and Azure.

Did you follow the officially recommended documentation from Atlassian?

https://confluence.atlassian.com/adminjiraserver/running-jira-data-center-in-a-cluster-993929598.html

 

Best wishes,

Dennis

"What concerns me now is how to sync the files generated or installed in the production site (such as attachments, avatars and installed plugins) to the DR site. I left index snapshots unchecked because I would like to run a full re-index once the failover is done. I tried the Replication function, and it does copy those files to the secondary site (I tried NFS; the permissions part almost killed me), but the files are too scattered to sync back to the expected directories in DR."

Maybe you shouldn't. 

Your <atl-home>/shared directory should be where everything lives, bar installed plugins (and they should re-download). Attachments and avatars should be in /shared, and I believe a copy of the plugins should be there too.

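This <atl-home>/shared should be a FAST NFS share. It should be accessible from all nodes as a common directory, so it survives the failure of any single node. With EC2 we're talking about an EFS share (EBS is per-instance block storage rather than a shared file system); with Azure you can do an equivalent.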

 

However, this is a DR site. Not production, DR. If I were doing DR, I would keep a replicated copy of the NFS share that is one month older than production, plus a monthly/yearly archive. Why?

RANSOMWARE.

The most likely reason to invoke DR is a hacker, particularly when you are talking about cloud (not cloud security as such, but someone hacking a workstation and uploading malware to your Jira tickets). When recovering to DR, I would want to know I am going to a clean environment first and then restoring the latest data.
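To make the "clean copy first, latest data second" idea concrete, here is a minimal sketch of picking the newest snapshot that is at least a month old. The 30-day default and the snapshot dates in the usage example are assumptions for illustration, and `pick_clean_snapshot` is a hypothetical helper; real snapshot tooling will have its own way of listing copies.

```python
from datetime import date, timedelta


def pick_clean_snapshot(snapshot_dates, today, min_age_days=30):
    """Return the newest snapshot taken at least min_age_days before today.

    Returns None when no snapshot is old enough.
    """
    cutoff = today - timedelta(days=min_age_days)
    eligible = [d for d in snapshot_dates if d <= cutoff]
    return max(eligible) if eligible else None


if __name__ == "__main__":
    # Made-up snapshot dates for illustration.
    snapshots = [date(2024, 1, 1), date(2024, 2, 1),
                 date(2024, 3, 1), date(2024, 3, 20)]
    print(pick_clean_snapshot(snapshots, today=date(2024, 3, 25)))
```

The lag only helps if the compromise is noticed within that window, which is why the monthly/yearly archives still matter.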

Just food for thought. 
