I'm curious whether anyone has ever had issues in Jira Data Center where they lost a large amount of data or went days without access to the system, whether due to a new add-on or a new configuration setup that led to an error of that type. I guess those things happen more on the Cloud solution than on Data Center, but I'd like to see if there are any such cases.
Just to be clear, I'm not trying to diss any add-on or anything, just wanted to see what can happen in daily operations and how I can prevent possible incidents before they happen. :)
We have implemented a lot of Data Center environments over the years, so hopefully this helps. First off, I am assuming your question is specific to a multi-node implementation, as Data Center can also be implemented on a single node. Everything is a single point of failure in a single-node DC implementation, just as it is on a Server implementation.
I think the biggest mistake I see people make is underestimating the level of effort and skill required to maintain Data Center (and any clustered application, for that matter). A Data Center environment adds a lot of technology that you don't have on a Server implementation. Often we are asked to set up configuration management, shared storage, highly available databases and clustered load balancers. It is important that you have the staff to manage these technologies, maintain them, and modify them as your needs change. Honestly, this is the root cause of most issues that we have seen.
We also recommend only running supported Data Center Apps, and not running unsupported Apps on a multi-node instance. We have seen cases where an unsupported App caused a node to fall over. The user retried the thing that caused the first node to die and killed a second node, then repeated until all were down. If you must run an unsupported App, I highly recommend that you design a robust testing plan to exercise it fully, with load on the Data Center instance, to determine whether it is safe to use. It's still better to have a policy that you don't run unsupported Apps. I am not aware of any unsupported Apps that misbehave on a single-node DC implementation, but there may be some that I haven't seen.
I hope that helps.
Thanks @Dave Theodore _Coyote Creek Consulting_ for your answer. This is actually a single-node implementation, and I am aware that it can cause a lot of issues. So far we have adopted the approach that only supported apps can be installed, so everyone knows what software proposals they can make. :)
Staffing is important, especially having people who can weigh the pros and cons of improvement requests. It's unbelievable how this one thing can make your workday go wrong. :)
Anyway, thank you once again.