In light of the GDPR, the CJEU's "Schrems II" decision and concerns about protecting intellectual property, customers asked me to find out whether there is a way to migrate "the other way 'round", i.e. from Cloud to on-premise.
I ran different tests and stumbled over problems again and again. First, the ZIP files I created in the Cloud as "backup ZIPs" were corrupted. During the restore of a Confluence Cloud ZIP, the process stopped somewhere in the middle, leaving an unusable Confluence Server. The web GUI and the log files pointed to "CRC errors".
I tried different zip applications, e.g. Ubuntu's unzip/zip, CentOS unzip/zip, macOS unzip/zip and WinZIP. Unfortunately, each of them produced CRC errors during the restore, except for ZIP files I created with "7zip".
So I repacked the ZIP and restarted the restore, and got new errors telling me about a corrupted entities.xml. I corrected these errors via "atlassian-xml-cleaner", repacked the ZIP and restarted the restore again. I still got one CRC error with an attachment.
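For anyone who wants to check a backup ZIP for CRC problems before starting a restore, here is a minimal sketch using Python's standard `zipfile` module. The function name `find_crc_errors` is made up for illustration. It only reports which archive members fail the CRC check when read by Python's own ZIP implementation, which may not behave the same way as the library Confluence uses during a restore.

```python
import zipfile
import zlib


def find_crc_errors(zip_path):
    """Return the names of archive members that cannot be read cleanly.

    zip_path may be a file name or a seekable file-like object.
    """
    bad = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            try:
                with zf.open(info) as member:
                    # Read the member in 1 MiB chunks; zipfile compares the
                    # CRC-32 with the stored value once the end is reached.
                    while member.read(1 << 20):
                        pass
            except (zipfile.BadZipFile, zlib.error):
                # Bad CRC-32 or a damaged compressed stream
                bad.append(info.filename)
    return bad
```

Calling `find_crc_errors("backup.zip")` (the file name is a placeholder) returns an empty list if every member reads cleanly, otherwise the list of affected entries, which at least tells you which attachment to look at.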
Then I tried the same with a backup ZIP from a Jira Software Cloud instance. This time all the attachments and graphic logos weren't automatically placed in their proper directories, e.g. under /var/atlassian/application-data/jira/data/attachments/....
I had to unzip the ZIP file and move the attachment files and logos manually.
Then I stumbled over different usernames. Both the Jira and the Confluence backup ZIPs were created based on the same user accounts coming from Access. But in Confluence the user accounts had names like "firstname.lastname@somedomain.blah", whereas in Jira the account names were created without the domain name, e.g. "firstname.lastname". So Jira and Confluence were unable to sync user data: when using Jira as a directory server for Confluence, I got two different user accounts for one physical, real user.
I had to use some SQL statements and REST calls on Jira to change the usernames into the same pattern Confluence uses, and had to set Jira as the primary directory server in Confluence.
The next problem was (or is) the missing application links. With backups from Server or Data Center, I can change a previously set application link between Jira and Confluence by "repairing" the existing link entry.
With the Cloud backups, there wasn't any application link. So any useful link from and to Jira and Confluence is broken.
I've already found and read a bunch of "solutions" for a few of the problems, e.g. the corrupted entities.xml file, but I am really missing a simple restore process that is worthy of the name. The out-of-the-box mechanisms do not work (create a backup of the Cloud, download the ZIP, restore on a fresh Confluence or Jira server), and I do not want to recommend a Cloud product to customers if they are never able to "get back" their data and run Jira or Confluence on-premise.
I wonder if I am the only one who tries to leave the Cloud and use Server or Data Center products.
Daniel,
been there, done that.
Meaning: I've studied every guide, Q&A and hint from Atlassian support I was able to get hold of.
The ZIP download was fine, i.e. no corruption during download. The CRC errors in the backup ZIP were caused on Atlassian's server. And the CRC errors are detected only by the on-premise Confluence server. I guess that the "unzip" library in Confluence causes the trouble.
The broken entities.xml is the result of a badly implemented algorithm which obviously ignores i18n settings and data, i.e. special characters like German umlauts. These special characters have to be masked in the XML via "CDATA" or HTML entities like "&auml;". But they aren't. So one either has to use an editor to search and replace in the entities.xml or use the "atlassian-xml-cleaner" Java app.
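As an illustration of the kind of cleanup involved, here is a minimal Python sketch that removes characters the XML 1.0 specification does not allow at all (most control characters, for example). This is not the implementation of atlassian-xml-cleaner, and the function names are made up for this example. It leaves umlauts untouched, because they are legal XML characters as long as the file's encoding is declared and read correctly, so it only helps if the parser is actually choking on forbidden characters. UTF-8 as the file encoding is an assumption here.

```python
import re

# Everything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text):
    """Remove characters that may not appear in an XML 1.0 document."""
    return _INVALID_XML_CHARS.sub("", text)


def clean_file(src, dst, encoding="utf-8"):
    """Copy src to dst line by line, dropping forbidden characters.

    Undecodable bytes are replaced with U+FFFD instead of aborting.
    """
    with open(src, encoding=encoding, errors="replace") as fin, \
            open(dst, "w", encoding=encoding) as fout:
        for line in fin:
            fout.write(strip_invalid_xml_chars(line))
```

Something like `clean_file("entities.xml", "entities-clean.xml")` writes a cleaned copy next to the original, so the untouched export stays available for comparison.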
Anyway, it's disappointing that there isn't any straightforward backup/restore process and that errors like the one with entities.xml haven't been fixed.
I used the atlassian-xml-cleaner.jar for similar encoding issues many times before too.
XML exports/restores are very error-prone in my experience, unfortunately.