Splitting instances in more detail (Part 2)

In this article we go through the detailed steps of making sure that we do a good split of Jira and Confluence instances. If you do not know the background, I encourage you to check Part 1 of the story, where you get a better picture of the reasons behind this task.

So far, after doing our research, we know that the split is possible, and we know what we would move and what method we would use.

However, no matter which solution we choose to get two separate environments, you cannot simply start both at the same time and use them. There is a lot of work to reconfigure things so that the new machine does not cause any problems. This is where it gets tricky. There are obvious things like the Base URL, but we are not creating a test environment that only a few people would use. We are doing a complete separation and need to make sure that nothing causes damage. This is where we decided to imagine possible scenarios.

Step 5: Visualization

When we started to dig a little deeper, we realized that we had to do more analysis on things like integrations, automation and notifications. We needed to determine who is using subscriptions, whether there are any customizations that send custom mails, and which 3rd party systems are integrated with Jira/Confluence. Following this path we ended up with a big question: what would happen when we change the Jira and Confluence URLs? Would links still work? What do we do with hard links to the old systems across the whole documentation?
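To make the hard link question more concrete, here is a minimal sketch of how such links could be found and rewritten in exported page text. It is not the tooling we used; the base URLs and both function names are hypothetical placeholders, and it assumes you already have the page content as plain text or storage-format markup.

```python
import re

# Hypothetical base URLs, used only for illustration. Replace with your own.
OLD_BASE = "https://jira.old.example.com"
NEW_BASE = "https://jira.new.example.com"

# Characters that end a URL in running text or markup.
_URL_END = r"\s\"'<>)\]"


def _base_pattern(old_base: str) -> str:
    # Match the old base only when a path, query, fragment, delimiter or
    # end of text follows, so that longer host names are not touched.
    return re.escape(old_base) + r"(?=[/?#" + _URL_END + r"]|$)"


def find_hard_links(text: str, old_base: str = OLD_BASE) -> list[str]:
    """Return every URL in the text that points at the old base URL."""
    pattern = re.compile(_base_pattern(old_base) + r"[^" + _URL_END + r"]*")
    return pattern.findall(text)


def rewrite_hard_links(
    text: str, old_base: str = OLD_BASE, new_base: str = NEW_BASE
) -> str:
    """Swap the old base URL for the new one and keep the rest of each link."""
    pattern = re.compile(_base_pattern(old_base))
    # A function is used as replacement so new_base is inserted literally.
    return pattern.sub(lambda _match: new_base, text)
```

Running `find_hard_links` first gives a report that can be reviewed before anything is changed. Note that a URL directly followed by sentence punctuation, such as a trailing period, would be reported with that punctuation attached.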

You might know that leaving this as is could introduce confusion, because people would receive duplicated notifications, access the wrong system and add data there, and from then on it would only get worse. Poor preparation here would create a lot of work later!

This is where we noticed that we first need to think about how we do things, for example how to prevent users who would be migrated from accessing the old system. Deactivating all (~1000) users manually would not be an ideal option. Adding an Active Directory filter in User Directories was our first idea. But we knew that users with Remember Me sessions would not simply be logged out, and even if we blocked access it would not apply until we removed all tokens and restarted the system. Normally we would do that, but how do we make it fully transparent for the users on the source instance who do not migrate? We did not want to suddenly cut their access.

It was the same with other things. Visualization and brainstorming gave us a lot of input, which we used to create a risk table with mitigations in case something went wrong. That had to be done.

Step 6: Testing and Preparation

This is where it started to be fun. We knew our deadline, and we knew what might potentially go wrong and what we still did not know. We did not yet have final procedures, so all of this should become clear during final testing.

We put everything we knew about this process into Confluence and tried to think about what we still had to do. Using Action Items and building one huge TODO list helped us monitor our progress and what was still left. The integration with Jira helped us quickly create tickets from these items and keep track of them later. After a few weeks we had a working test environment that was migrated to DC, where we were able to test all of our ideas on how to proceed. Going through this gave us an estimate of the effort, which we later used to build the exact project schedule.
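The step from measured test timings to a schedule can be sketched roughly as below. This is only an illustration of the arithmetic, not our actual planning sheet: the function names, the step names in the usage note and the 25% safety buffer are all assumptions.

```python
from datetime import timedelta


def estimate_total(
    step_durations: dict[str, timedelta], buffer_ratio: float = 0.25
) -> timedelta:
    """Sum the durations measured on the test run and add a safety buffer.

    buffer_ratio is an assumed margin (0.25 means 25% extra time).
    """
    total = sum(step_durations.values(), timedelta())
    return total + total * buffer_ratio


def fits_window(
    step_durations: dict[str, timedelta],
    window: timedelta,
    buffer_ratio: float = 0.25,
) -> bool:
    """Tell whether the buffered estimate fits into the migration window."""
    return estimate_total(step_durations, buffer_ratio) <= window
```

For example, with hypothetical steps of 3 hours for copying attachments, 2 hours for syncing the database and 3 hours for reconfiguration, the buffered estimate is 10 hours, which fits a 12 hour window but not a 9 hour one.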

Step 7: Timeline

Looking at the schedule, we noticed that there were so many things to be done that we might not have enough time over the weekend for the whole migration of Jira / Confluence, including reconfiguring everything and changing to DC, with the risk that heavily used plugins like Insight migrate from a plugin to a built-in Jira Service Management feature. At the end we also had to include user testing of key business processes, so that at the GO/NO GO decision we could say that everything is good, let's roll out.

Of course every case is different, but in ours we decided that we needed one full workday where we could put the entire team into one room at the office and, like in old times (before COVID), make this separation real. We chose a date in the middle of the month, when all key processes related to timesheet approval and invoicing are done, and also used Friday for the migration. Thanks to that, the systems should be ready for testing on Sunday and, if everything was fine, go live, so that on Monday everyone could start using the new system.

At this point what is crucial is clear communication to the business users: what will happen, why, what the result of this change will be, and so on. We set up a Q/A session for anyone who wanted to know more, we added a special Confluence page with details that everyone could read, and we used banners and reminders that the day of separation was coming. Users had to know that today they are using this URL, that after this date they would be using a new one, and what this means for everyone.

Step 8: Migration

Yes, finally the day of migration. We had been preparing everyone and ourselves for this for months. Two or three times we changed how we would handle this, but finally, after cooperating with the IT team responsible for infrastructure, we decided to create a copy of the system first, then reconfigure and prepare everything on the system and network side, similar to creating a new test system. The difference is that this system would be synced again with production data on the day of migration. By doing this we reduced the risk that something would not work on the IT side and also reduced the final effort to a minimum.

Whatever we could, we did earlier, like moving GBs of attachments the night before the migration or creating exact command line procedures so that thinking would be reduced to zero. We just had to do the same things that we did on TEST and hope that they would take the same time.

Of course we still used Friday and the idea of gathering everyone in one room. A day before the migration we also synchronized our TEST system so that all business users could access data when they needed it. A read-only system was a really good idea. Thanks to that we also reduced the pressure of someone not being able to make progress on a task because the system is not available. That worked perfectly. Of course we underlined that this was only a basic read mode and that changes would not be moved, since this would make things more complicated.

I do not want to get into all the details of what we did and how, but a few things are definitely worth mentioning. We spent a lot of time on deactivating users on the old system. Only a deactivated user would be unable to access the system and would not receive notifications, so this became our main goal before we even started playing with the new system. The problem was that you cannot deactivate a user who is a project or component lead, and we did not want to change everything, so that we would still have a chance to revert. So this was not helping. But of course we found a workaround: deactivating the user from Confluence, which uses Jira for user management (thanks to that it bypasses this error).
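Before starting a bulk deactivation it helps to know up front which users will hit the lead restriction. The sketch below shows one way to split the list, assuming you have already exported the migrating usernames and the project and component leads. The data shapes, the `DeactivationPlan` class and the `plan_deactivation` function are hypothetical and do not call any Jira API.

```python
from dataclasses import dataclass, field


@dataclass
class DeactivationPlan:
    # Users that can be deactivated directly.
    ready: list[str] = field(default_factory=list)
    # Users blocked by a lead role, with the reasons for each.
    blocked: dict[str, list[str]] = field(default_factory=dict)


def plan_deactivation(
    migrating_users: list[str],
    project_leads: dict[str, str],
    component_leads: dict[tuple[str, str], str],
) -> DeactivationPlan:
    """Split migrating users into directly deactivatable and blocked ones.

    project_leads maps a project key to its lead's username.
    component_leads maps (project key, component name) to the lead's username.
    """
    reasons: dict[str, list[str]] = {}
    for project, lead in project_leads.items():
        reasons.setdefault(lead, []).append(f"project lead of {project}")
    for (project, component), lead in component_leads.items():
        reasons.setdefault(lead, []).append(
            f"component lead of {project}/{component}"
        )

    plan = DeactivationPlan()
    # Sorting and de-duplicating keeps the result stable between runs.
    for user in sorted(set(migrating_users)):
        if user in reasons:
            plan.blocked[user] = reasons[user]
        else:
            plan.ready.append(user)
    return plan
```

The `blocked` part of the plan is then the list of users that need the workaround or a manual decision, while the `ready` part can be processed in bulk.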

We also decided in our process to change the Server ID [https://confluence.atlassian.com/jirakb/change-the-server-id-for-an-instance-of-jira-server-285839562.html], which later caused some problems with Application Links and of course with the integration between Jira and Confluence. The official procedure says that this step is the last one and that it is optional, but we think it was required in order to separate the systems completely and to create new licenses connected with the new Server ID, etc. So this is something everyone needs to pay attention to.

There were a couple more roadblocks. When we did smoke tests for Confluence, we wanted to test whether a space is deleted successfully, and that started a whole bunch of problems. Deleting a test space caused an unexpected error that we had not seen on the TEST system. That broke the entire instance. After a long investigation we started to think that we would not be able to fix it and that the whole migration would need to be cancelled. Finally we found that it was a problem with the MySQL database: some procedures were missing, and the database needed to be recreated with a specific parameter and with specific privileges granted to the user that runs the database. Fortunately that helped, and we were finally able to hand the instance over for final testing.

Testing was successful. Process owners tested the key procedures for the business and the result was promising, so the decision was to move forward and start using this environment as the new PRD for 1000 users. We were very tired and happy, but we also knew that it was not the end of the work for us. The show would begin on Monday morning.

If you would like to know what else needs to be done and how to handle it, jump to Part 3 of the story.
