Faster page moves in Confluence Data Center

Hi Community!

The Confluence Data Center team recently fixed an issue that we know affected some of you. Now, there’s no need to worry about attachments going missing during a Confluence page move. With a bit of hard work, this issue has been resolved thanks to our new attachment storage format, which has a number of benefits.

First released on 14 February 2023 in Confluence 8.1, the new attachment storage format:

  • prevents attachments going missing when page moves are interrupted

  • retrieves any previously missing attachments

  • seriously speeds up page moves.

These improvements make storing your attachments in Confluence easier to manage, and your page moves faster and more stable.

The bonus surprise - faster page moves!

While the purpose of this project was to fix the issue of missing attachments, the design of the new solution additionally brings about a significant performance improvement for page moves. This is because files no longer need to be moved on the disk.

We tested this on an internal Confluence instance where we compared the speed of page moves for the old (v3) and new (v4) attachment storage formats using pages with one, 500 and 1,000 attachments.

The results speak for themselves:

[Chart: page move times with the old (v3) and new (v4) attachment storage formats, for pages with one, 500 and 1,000 attachments]

💪 Page moves for pages with one attachment are 10% faster, and for pages with 500 and 1,000 attachments they are 11x faster!

This not only improves the experience for the user (who wants to wait 15 seconds when they could wait one instead?), but also shortens the window in which something can go wrong during a page move, such as an admin restarting the instance. That makes the entire process more stable!

How Confluence lost track of attachments

Sometimes, attachments were not where they were supposed to be in Confluence. This was caused by page moves, and more specifically by the layout of attachment file paths, which contained a reference to the space the attachments were located in. Every time a page was moved to another space, Confluence had to not only update the page's details in the database, but also update its attachments and move their files on disk. This might not sound too bad at first, but with 1,000 child pages that contain 10,000 attachments, this very quickly becomes… a process to avoid, especially if those attachments also include several versions.

When a page move failed or a server was restarted, the job to move the files did not get rolled back. Confluence would try to access the attachment in its old location, when it had actually already been moved to a new home. This is how attachments went missing.
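
To make the failure mode concrete, here is a minimal Python sketch. It is not Confluence's actual code: it assumes a hypothetical layout in which the space key is part of the file path, and it uses a plain dictionary as a stand-in for the database. The names `space_based_path`, `interrupted_move` and `findable_after_interrupted_move` are made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def space_based_path(root: Path, space_key: str, attachment_id: int) -> Path:
    # Hypothetical layout: the space key is part of the file path.
    return root / space_key / str(attachment_id)


def interrupted_move(root: Path, db: dict, attachment_id: int, new_space: str) -> None:
    """Move the file on disk, then fail before the database is updated."""
    old_path = space_based_path(root, db[attachment_id], attachment_id)
    new_path = space_based_path(root, new_space, attachment_id)
    new_path.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(old_path), str(new_path))
    # Simulated restart: db[attachment_id] is never set to new_space,
    # and nothing moves the file back.
    raise RuntimeError("simulated restart during page move")


def findable_after_interrupted_move() -> bool:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        db = {42: "SPACEA"}  # attachment 42 is recorded in space SPACEA
        original = space_based_path(root, "SPACEA", 42)
        original.parent.mkdir(parents=True)
        original.write_text("attachment data")
        try:
            interrupted_move(root, db, 42, "SPACEB")
        except RuntimeError:
            pass
        # The lookup still derives the path from the stale space key.
        return space_based_path(root, db[42], 42).exists()
```

In this sketch `findable_after_interrupted_move()` returns `False`: the file still exists on disk, but under the new space's directory, where the stale record no longer points.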

We provided workarounds to mitigate this issue, but the recovery script required downtime. Understandably, this was not great.

The good news: it’s fixed!

One of our engineers, George, came up with the solution: a more robust way of storing attachments in our file system that no longer references the space or page. Instead, the file path is now based on a single identifying attribute - the ID of the attachment itself.
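
As a rough illustration of why an ID-based path helps, the Python sketch below compares two hypothetical path schemes. The directory names and the functions `old_layout_path` and `new_layout_path` are assumptions for this example only; the real on-disk layout in Confluence is more involved and is not reproduced here.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass
class Attachment:
    attachment_id: int
    space_key: str  # changes when the page moves to another space


def old_layout_path(a: Attachment) -> PurePosixPath:
    # Hypothetical space-based scheme: the path depends on the current space.
    return PurePosixPath("attachments/old-layout") / a.space_key / str(a.attachment_id)


def new_layout_path(a: Attachment) -> PurePosixPath:
    # Hypothetical ID-based scheme: the path depends only on the attachment ID.
    return PurePosixPath("attachments/new-layout") / str(a.attachment_id)


attachment = Attachment(attachment_id=1001, space_key="DOCS")
old_before = old_layout_path(attachment)
new_before = new_layout_path(attachment)

attachment.space_key = "ARCHIVE"  # the page moves to another space

old_path_changed = old_layout_path(attachment) != old_before  # file would have to move
new_path_changed = new_layout_path(attachment) != new_before  # file stays where it is
```

With the ID-based scheme, a page move only touches metadata, so there is no file operation left to be interrupted halfway.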

This brings with it a multitude of benefits:

🚀 More reliable page moves! Attachments won’t go missing since the file stays where it was created.

🚀 Faster page moves! Files no longer have to be moved on the disk.

🚀 Faster incremental backups! The backup tool has to back up fewer files.

🚀 Less disk consumption! The new layout uses fewer index nodes (we’ve observed a 10% reduction).

Since the move from the old (v3) to the new and improved (v4) attachment storage format 🖇️ in Confluence 8.1+, we’ve so far received no support requests about missing attachments from customers running Confluence 8.1.0 or later.

What’s great about the fix

A migration process begins upon startup of Confluence 8.1+ that moves all attachments from the old (v3) folder structure to the new one (v4) in the background. Here’s what makes this process great:

⭐️ Runs in the background - move on with your day in the meantime!

⭐️ Attachments remain available - Confluence can find them, whether they’ve been migrated yet or not.

⭐️ Missing attachments are restored - any previous data loss is reversed!
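
One way to picture how attachments can stay available while a background migration is still running is a lookup that checks the new location first and falls back to the old one. The Python sketch below is only an assumption about how such a fallback could work, not a description of Confluence's implementation; the directory names and `locate_attachment` are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Optional


def locate_attachment(root: Path, space_key: str, attachment_id: int) -> Optional[Path]:
    """Return the file's location, preferring the new layout over the old one."""
    candidates = [
        root / "new-layout" / str(attachment_id),              # already migrated
        root / "old-layout" / space_key / str(attachment_id),  # not migrated yet
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        migrated = root / "new-layout" / "1"
        pending = root / "old-layout" / "DOCS" / "2"
        for file in (migrated, pending):
            file.parent.mkdir(parents=True, exist_ok=True)
            file.write_text("attachment data")
        return {
            "migrated_found": locate_attachment(root, "DOCS", 1) == migrated,
            "pending_found": locate_attachment(root, "DOCS", 2) == pending,
            "unknown_is_none": locate_attachment(root, "DOCS", 3) is None,
        }
```

Because the lookup tolerates both layouts, the migration in this sketch can proceed file by file without a maintenance window.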

This project also paved the way for enabling Amazon S3 object storage, aimed at those of you with large or increasing data needs. The new attachment storage format removed the need for complex data migration code, enabled reverse migrations and minimised downtime during migration to S3.

Thanks for being part of our community!

The Confluence Data Center team

3 comments

Marco February 21, 2024

That sounds very good! Does the migration happen automatically during the update from 7.19 LTS to 8.5 LTS? Or is there a scheduled job that executes the migration afterward? And if so, is there a way to see if it has been completed? I ask because we have around 10TB of attachments...

agawron
Atlassian Team
February 21, 2024

Hi Marco, thanks for your questions.

> Does the migration happen automatically during the update from 7.19 LTS to 8.5 LTS? Or is there a scheduled job that executes the migration afterward?

A migration process starts in the background once the upgrade to Confluence 8.1+ is complete.

> And if so, is there a way to see if it has been completed?

When the migration task is finished, a report file v3-to-v4-report.log will be available in the attachments directory. A new report is created for each migration run. It contains a list of files with corresponding issues, and the migration status is printed at the bottom – for example, completed successfully, completed with warnings, or interrupted. The report file does not print successfully migrated attachments to avoid huge log files.

> I ask because we have around 10TB of attachments...

10TB shouldn't be a concern. You can track migration progress at any time: an entry is printed in atlassian-confluence.log every 50,000 migrated attachments. Search this log file for the string "Attachments migration from ver003 to v4 progressed".
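
If you prefer a script to a manual search, a small Python sketch along these lines could pull out the progress entries. Only the search string comes from the reply above; the function name `progress_entries` is made up here, and the sketch makes no assumption about the rest of the log line format.

```python
from typing import Iterable, List

# Search string quoted from the reply above.
PROGRESS_MARKER = "Attachments migration from ver003 to v4 progressed"


def progress_entries(lines: Iterable[str]) -> List[str]:
    """Return every log line that contains the migration progress marker."""
    return [line.rstrip("\n") for line in lines if PROGRESS_MARKER in line]
```

To use it, open your instance's atlassian-confluence.log (its location depends on your setup) and pass the file object in, for example `with open(path_to_log, encoding="utf-8", errors="replace") as log: entries = progress_entries(log)`, where `path_to_log` is a placeholder. The last entry returned is the most recent progress message.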

You can refer to this page for details and FAQ.

Hua Soon SIM [Akeles]
Atlassian Partner
February 26, 2024

I love the feature that the upgrade can reverse previous data loss 👍

For those who are curious whether they are affected by the missing attachments, Confluence admins can trigger a scan with the Missing Attachment Scanner.
