I understand that much, much more information would be needed to answer this question accurately, but I think someone could likely provide some insight into this situation, and any additional insight would be appreciated.
We have a relatively complex server instance of JIRA/JIRA Service Management, running on Windows Server as a service with a MSQL database. To give a bit of context: ~700 custom fields, 30-40 actively managed workflows, many scripted fields, jobs, and listeners, many behaviours, and a few fragments. We have many mail handlers consuming email from various locations, as well as Service Desk mail handling. We generate around 75k-100k issues a year across all projects. We have about 25 applications, the big ones being:
Deviniti Extensions (Bundled fields, Queues)
JIRA Misc Custom Fields
Many others, but I'm guessing those are the most heavily impacted.
A bit more before the questions...
We have some custom PowerShell scripts that run weekly to copy the production instance and database. They copy the production JIRA data and the production database over to a test environment, we insert Dev keys for everything pre-boot, and then the instance boots up with Dev licenses and all production data intact.
For the second time now, we've had a failure in this process which resulted in dbconfig config copy failures and inadvertently ended up with both Prod and Test JIRA pointed to the same DB for a time. The last time this happened it was ~12 hours. The most recent occurrence of this was closer to 72 hours as it included a weekend.
Luckily, not much activity takes place in the test instance, so I am not too worried about data integrity from changes happening there. My big question is the long-term impact of something like this happening.
Noticeable effects that I've seen both times prior to a disconnect and reboot are: Mail Handlers fail to work properly, anything dependent on cron statements tends to fail, configuration settings from test seem to at times override production config settings.
I've spoken with Atlassian support on this, and obviously the recommendation was to roll back, but little insight could be given on the impact on the third-party application configurations. Does anything stick out glaringly as a long-term problem once the initial cause is remediated, and everything is pointed back to where it belongs and restarted?
Thank you for any insight on this topic, again I know a lot of specifics would be needed to truly understand all the impact.
There's one very simple thing that stands out to me:
>which resulted in dbconfig config copy failures and inadvertently ended up with both Prod and Test JIRA pointed to the same DB for a time.
Why in heck's name are you copying production settings to a test system?
By all means, copy the database to a test system, and the attachments, and do all the things to isolate test from everywhere else, but why are you copying the database connection to production over to a test system?
The thing that sticks out as a long term problem is "stop connecting test systems to your production database"
I know that's quite a harsh and bloody-minded attitude, but I really can't see why you would do this, or think it's a useful way to get a test system. This is not as complex as you seem to think - just stop connecting test to live and you are ok.
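To make "copy the data but not the connection settings" concrete, here is a minimal sketch in Python. It is not the poster's actual PowerShell process. It assumes the Jira home directory is copied as a plain directory tree and that `dbconfig.xml` is the only file that must stay behind; the function name `copy_home` and the `EXCLUDED` list are hypothetical. The check at the end fails loudly if the test instance still ends up with a byte-for-byte copy of the production connection file.

```python
import shutil
from pathlib import Path

# Assumption: files that must never travel from production to test.
# Extend this list to match your own environment.
EXCLUDED = ("dbconfig.xml",)


def copy_home(prod_home, test_home):
    """Copy prod_home into test_home, skipping EXCLUDED files, then verify."""
    shutil.copytree(
        prod_home,
        test_home,
        ignore=shutil.ignore_patterns(*EXCLUDED),  # applied at every directory level
        dirs_exist_ok=True,  # requires Python 3.8+; keeps files already in test_home
    )

    # Fail loudly if test now holds an identical copy of the production config.
    prod_cfg = Path(prod_home) / "dbconfig.xml"
    test_cfg = Path(test_home) / "dbconfig.xml"
    if (
        prod_cfg.exists()
        and test_cfg.exists()
        and prod_cfg.read_bytes() == test_cfg.read_bytes()
    ):
        raise RuntimeError("test dbconfig.xml is identical to production; aborting")
```

The point of the final check is that an exclusion rule alone can silently stop working, which is apparently what happened here, so the script should refuse to continue rather than let the test instance boot.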
Thank you for the blunt answer, and I 100% and wholeheartedly agree, maybe I did a bad job explaining.
We never did this on purpose; some process error on the team here has inadvertently caused it to happen, twice now. The scripting they did was meant to exclude all config settings files, but somehow they were copied over (outside of my scope).
My question above is very specific to the potential impacts inside JIRA from this happening. I think the feeling is mutual that this is kind of insane.
Ok, the potential impacts are very simple - a totally corrupt data set that initially appears to be ok, but fails later, is the worst case.
You absolutely need to go back to the backup you took before this error was made, or you'll never know if your data was damaged in ways that are going to bite you later. I've seen something like this done in Jira break the attempted upgrade a year later, by which time of course, it's far too late to roll back to a backup.
For the unfortunate current situation, you have already received a lot of valid information from Nic.
For the future, I am wondering if it makes sense to put some kind of firewalling between the prod and non-prod environments.
Any other solution will work as well: basically anything that prevents access from a non-production environment to the live database, even in case of an erroneous configuration.
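As one example of such a safeguard on the application side (it complements a firewall rather than replacing one), a pre-boot check could refuse to start the test instance if its `dbconfig.xml` points at production. The sketch below is in Python and assumes the usual Jira layout where the JDBC URL sits in `<jdbc-datasource><url>`. The marker strings in `PROD_MARKERS` are hypothetical placeholders for your real production server or database names.

```python
import xml.etree.ElementTree as ET

# Hypothetical identifiers that only appear in the production JDBC URL.
PROD_MARKERS = ("prod-sql01", "jira_prod")


def jdbc_url(dbconfig_xml):
    """Return the JDBC URL from the text of a dbconfig.xml file."""
    root = ET.fromstring(dbconfig_xml)
    node = root.find("./jdbc-datasource/url")
    if node is None or not node.text:
        raise ValueError("no <jdbc-datasource><url> element found")
    return node.text.strip()


def points_at_production(dbconfig_xml, markers=PROD_MARKERS):
    """True if the JDBC URL contains any production marker (case-insensitive)."""
    url = jdbc_url(dbconfig_xml).lower()
    return any(marker.lower() in url for marker in markers)
```

A wrapper script would read the test instance's `dbconfig.xml`, call `points_at_production` on its contents, and exit with a non-zero status before the Jira service is started if it returns true.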