migrating H2 database to postgres

Kim Greer January 5, 2018

We are running Confluence 5.10.8 using an embedded H2 DB, and need to convert to Postgres external DB. We have created a test machine also running 5.10.8 on which to import the _site_ backup file and this fails every time. We've tried using empty, example and new site/spaces at initialization. We've tried exporting and importing _spaces_ one at a time. These will often succeed  but breaks Confluence, not being able to log in again.

Ultimate goal is to bring Confluence version up to date using external DB.

Our DB backup file is about 450 MB. We can create a Postgres listener, which seems to work OK.

However we have tried many permutations of installation with an external DB, changing the session timeout to 60000 from 60 seconds, memory "CATALINA_OPTS="-Xms1024m -Xmx1024m" to 2048 or 3072 from 1024, installing the Postgres driver in the correct lib directory and so forth, and nothing seems to work. All restore operations either fail, or seem to succeed but trashes the existing (new) site, where we are unable to log in again, or at best goes to an error page. We have tried using a duplicate machine for restore operation and using the original machine (a cloned VM) with new instance of Confluence running on different ports from default.

We are trying to find some tool with which to convert the H2 DB to Postgres, but nothing (free) seems to be available, and the builtin restore seems to always fail. Do we need a third party tool to perform the DB conversion?

Please advise some method to convert our embedded H2 format DB to external Postgres. 

We have seen page https://confluence.atlassian.com/doc/migrating-to-another-database-148867.html?_ga=2.58186351.495878581.1515160684-350701382.1514301010, but it does not seem to help in our situation.

Thanks.

3 answers

2 votes
AnnWorley
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
January 5, 2018

Hi Kim,

The article you are using: Migrating to Another Database

is the best guide we have for changing databases. Here is an overview of the steps from a high level:
  1. Install Confluence 5.10.8, using PostgreSQL as the database.
  2. Make an XML site backup of your existing instance (the one with the H2 database) Note the admin credentials for that instance.
  3. Restore the XML back-up to the new PostgreSQL based instance.
  4. Log into the restored instance using the credentials from the old instance, as the site restore will have overwritten the entire instance. (See Restoring a Site

You describe several obstacles your team has run into, which must be frustrating. I will address some of them the best I can:

  • The behavior of having the destination logins overwritten is expected, as mentioned in Restoring a Site 
  • If you run into trouble again, please let us know what error message you are seeing when the import fails.
  • Raising the heap memory as you said you tried can be helpful during the restore but you should consider setting it back to the default for day to day operation.
  • The PostgreSQL driver is bundled with Confluence 5.10.8 so you shouldn't have to move it on the file system.

You also mentioned bringing Confluence up to date. If you are planning an upgrade, it needs to be another procedure, after the db migration. Here is our upgrade guide: Upgrading Confluence

 I look forward to any follow up questions.

Thanks,

Ann

 

0 votes
Kim Greer January 9, 2018

First, Ann, thank you for the great checklist. We followed those instructions before and after the original posting. The question was brought up "what do you mean by failing?".

By failing: (1) The web page on which we perform the restore operation gets usually to about 39% and the progress bar stops. A message is shown below the progress bar similar to "A" and "B" below. (2) Failure to complete the restore operation usually results in an error page, and most times prevents us from logging in again. The system at this point has apparently had the password db entry wiped out. In cases in which we are able to log in again, 100% of the time we are presented with error pages basically saying "something has gone wrong". Sorry - I don't have the exact wording.

A. "Import failed. Check your server logs for more information.
com.atlassian.confluence.importexport.ImportExportException: Unable to
complete import because the data does not match the constraints in the
Confluence schema. Cause: PSQLException: ERROR: duplicate key value violates unique constraint "unq_lwr_username" Detail: Key (lower_username)=([removed]_sa) already exists.", or ...

B. "Import failed. Check your server logs for more information. Transaction rolled back because it has been marked as rollback-only".

In the case of "A": this username is from our institution's Active Directory, which worked well in pre-DB-conversion. We tried making backup zip files with AD as part of our configuration and without it being tied to at all. Either way, AD contents (thousands of users and groups) are in the entities.xml file if we examine the backup contents. Maybe we could manually remove the duplicate from entities.xml and make a new zipped backup file?  We might have to do this any number of times depending on the # of duplicate entries.

We have tried to find free tools that would let us examine and delete specific DB entries, such as the duplicates, but have found nothing yet. Neither my co-worker or I are fluent in DB issues, so we are at a disadvantage there. We checked with a colleague who manages DB systems, but he is not familiar with H2 or Postgres.

At this time, my co-worker anI believe we have a chance to restore successfully if we could just removed the AD entries in the backup file

 

As for "B": We found no obvious text explaining the reason for the failure.

AnnWorley
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
January 9, 2018

Thanks for all the details, it provides a lot of context.

Are you reinstalling Confluence between attempted restores? Once a restore has failed, it cannot be re-done without a fresh database. The errors you are reporting are unique constraint violations - the restore is trying to write user names a second time when they are required to be unique.

if you want to migrate databases without the LDAP users, then I have a idea for that:

  1. Copy your entire instance including the H2 database to a test instance
  2. Remove the LDAP User directory (or set the Base DN to a container with no users and then synchronize, which will remove the users)
  3. Check in the Confluence Administration>Users to make sure the users are gone
  4. Make sure you can log in with an internal administrator from the Confluence built-in User Directory
  5. export the instance. 
  6. import it to a PostgreSQL-based instance
  7. log in with the internal administrator from step 4
  8. Re-create the LDAP User directory using the same settings

I look forward to hearing how it goes.

Kim Greer January 16, 2018

Ann,

 

Thank you for the time and effort you are putting into this.

 

We believe we have tried the instructions with all the steps you include. We thought this would remove the AD username and group entries from the backup / entities.xml file, but surprisingly, all the AD usernames and groups were still present. Maybe we were doing something else wrong or copying the wrong backups or something.

 

Another surprising thing:

We totally removed AD from User Directories, and then made a backup with not including attachments. Our zip file went down from 450 MB to about 55 MB, which seems reasonable (I think). Note that our AD has 74K + usernames. However if we unzip the backup manually and look at the entities.xml file, the AD usernames and groups are still there! Note that our attachments directory is 558MB of raw space/contents.

 

We wanted to just take standard editors to remove a lot of the AD usernames and groups blocks of xml code, but all editors (vi, emacs, jove) that we have tried would fail with “line too long” messages. And we haven’t found a free or open source XML editor that will handle the 1.5 GB entities.xml file either. The XML editors would eventually fail with out-of-memory errors.

 

We will keep trying with other ideas, but it looks like we are stuck right now. We are going to try installing Confluence 6.x and importing just some spaces’ backups. I don’t think we’ve tried that yet, but I’m pretty sure the times we tried starting from scratch with 5.10.8, the spaces would SEEM to succeed, but navigating to other pages in Confluence (like the admin pages) would either result in page errors or with the administrator (“admin”) password getting blown away, leaving us with no option but to uninstall and re-install afresh and try other ideas.

 

Thanks!

KG

0 votes
Nic Brough -Adaptavist-
Community Leader
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
January 5, 2018

You have not explained what "fail" means.  What is going wrong?  Errors?

Suggest an answer

Log in or Sign up to answer
TAGS
AUG Leaders

Atlassian Community Events