What is the maximum length of a .docx file for uploading

Stacey Plowright April 16, 2024

When trying to upload using the "from Word doc" feature, I get the following error:

The content is too long

You can split the content into smaller files and import them separately.

Our files are huge. I need to know the maximum length/size it accepts (and whether there are any other reasons this error might appear).
Thanks,

2 answers

1 vote
Sumit Uniyal
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
April 18, 2024

@Stacey Plowright  

The maximum size for a Word document that can be imported into Confluence Cloud is 20MB. If your Word document is larger than this, you will need to split it into smaller files before importing.

The "content is too long" error can also occur if the document contains more than 2 million characters. This is because Confluence has a character limit for each page, and if your document exceeds this limit, you will need to split it into smaller sections.
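To get a rough idea of how a document compares with a character limit like the one mentioned above, the text inside a .docx can be counted directly, since a .docx file is a ZIP archive whose main body lives in `word/document.xml`. The following is a minimal sketch, not an Atlassian tool: the function name `count_docx_characters` is hypothetical, it counts only the text runs of the main body (not headers, footers, or footnotes), and how Confluence itself counts characters is not stated in this thread.

```python
import zipfile
from xml.etree import ElementTree

# Qualified name of the <w:t> (text run) element in WordprocessingML.
W_T = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}t"


def count_docx_characters(source):
    """Count characters in the text runs of the main document part.

    `source` may be a file path or a binary file-like object.
    Assumption: only word/document.xml is inspected.
    """
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ElementTree.fromstring(xml_bytes)
    return sum(len(node.text or "") for node in root.iter(W_T))
```

Calling `count_docx_characters("manual.docx")` (with a hypothetical file name) returns a single integer that can be compared against whatever limit applies to your site.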

In addition to size and character limit, the error might appear due to the following reasons:

1. The document contains complex formatting or elements that Confluence can't import. This includes some types of tables, text boxes, and certain types of images.

2. The document is password-protected or encrypted. Confluence can't import these types of documents.

3. The document is corrupted or damaged. If this is the case, you might be able to open it in Word, but not import it into Confluence.


Regards,
Sumit Uniyal

Stacey Plowright April 18, 2024

Hi Sumit:

Can you elaborate on certain types of images? If I can strip them of something on the oXygen end, it would be helpful if I could get them to import correctly on the Confluence end.

Sumit Uniyal
April 18, 2024

@Stacey Plowright Just a small correction: the file size limit for .docx is 100 MB for a single file. For bulk import, the limit is up to 30 files and 50 MB in total, and each file cannot exceed 10 MB.
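As an illustration of how these limits combine, here is a small sketch that checks a set of file sizes against the numbers quoted in this reply. The constants are taken from the reply, not from documentation, the function names are hypothetical, and treating 1 MB as 1024 × 1024 bytes is an assumption.

```python
MB = 1024 * 1024  # assumption: binary megabytes

# Limits as quoted in the reply above; verify them for your own site.
SINGLE_FILE_MAX = 100 * MB
BULK_MAX_FILES = 30
BULK_TOTAL_MAX = 50 * MB
BULK_PER_FILE_MAX = 10 * MB


def fits_single_import(size_bytes):
    """True if one file is within the quoted single-file limit."""
    return size_bytes <= SINGLE_FILE_MAX


def bulk_import_problems(sizes_bytes):
    """Return a list of problems for a bulk import; empty means it fits."""
    problems = []
    if len(sizes_bytes) > BULK_MAX_FILES:
        problems.append(f"too many files: {len(sizes_bytes)} > {BULK_MAX_FILES}")
    total = sum(sizes_bytes)
    if total > BULK_TOTAL_MAX:
        problems.append(f"total size {total / MB:.1f} MB exceeds 50 MB")
    for index, size in enumerate(sizes_bytes):
        if size > BULK_PER_FILE_MAX:
            problems.append(f"file {index} is {size / MB:.1f} MB, over 10 MB")
    return problems
```

For the file sizes reported elsewhere in this thread (49,111 KB, 15,069 KB and 3,216 KB), each file is under the single-file limit (the largest is roughly 48 MB), but two of them are over the 10 MB per-file limit for bulk import.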

Regarding your question about certain types of images, here are some general tips that might help:

  1. Simplify Your Document: If possible, simplify your document before importing. This could involve reducing the number of images, tables, or other complex elements.

  2. Split Large Documents: If your document is very large or complex, consider splitting it into smaller, more manageable parts before importing. This can help avoid issues related to file size limits and can make the import process smoother.

  3. Check Image Formats: Ensure that the images in your document are in a format supported by Confluence. If not, you might need to convert them before importing.

  4. Optimize Images: Large, high-resolution images can significantly increase the size of your document. Consider optimizing these images to reduce their file size before importing.

  5. Check Table Complexity: If your document contains complex tables, check if they are displayed correctly after import. Confluence might not support all table features, so you might need to adjust the tables in Confluence after import.
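For tips 3 and 4, it can help to see which images a document actually contains before importing it. Word stores embedded images under `word/media/` inside the .docx ZIP archive, so they can be listed without opening Word. This is a minimal sketch with a hypothetical function name; which formats Confluence accepts is not stated in this thread, so the sketch only inventories the files and leaves the judgment to you.

```python
import os
import zipfile


def list_docx_media(source):
    """Return (name, extension, size_in_bytes) tuples, largest first.

    `source` may be a file path or a binary file-like object.
    Sizes are the uncompressed sizes recorded in the archive.
    """
    with zipfile.ZipFile(source) as archive:
        media = [
            (info.filename, os.path.splitext(info.filename)[1].lower(), info.file_size)
            for info in archive.infolist()
            if info.filename.startswith("word/media/") and not info.is_dir()
        ]
    return sorted(media, key=lambda item: item[2], reverse=True)
```

The largest entries are the first candidates for optimization, and unusual extensions (for example vector formats such as .emf or .wmf, named here only as an assumption about what may not import well) are worth converting and retesting.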


See the documentation below for more information about this feature:

https://support.atlassian.com/confluence-cloud/docs/import-content-into-confluence-cloud/ 

Stacey Plowright April 18, 2024

Thank you. I'm progressing, I think, but slowly. I've been able to import from Word (their article about receiving an error when opening a document and using the "Open and Repair" option seemed to have helped with that).

However, the images all seem to be maxed out when they're imported from Word to Confluence. What was originally an inline 14x14 image in oXygen that looked okay in Word imports as a 1000+ px image into Confluence. It seems like the larger images are also "maximized", but they're smaller (the original image size would have been much larger)? I think it may have something to do with Confluence's minimum image size, which seems to be around 24 px or something like that?

I also found the "PDF embed for Confluence" and "Office Word" elements. Either of these would work, except it looks like PDF embed doesn't support search (which would be one of the reasons to put the document in Confluence) and Office Word lets you download the content (which is something I'm trying to avoid allowing).

Sumit Uniyal
April 19, 2024

@Stacey Plowright If you face further issues, please raise a support ticket with Atlassian directly, as we would require access to your instance for further troubleshooting.
As per the Atlassian Support offerings, for any technical support you need to ask your site admin to log a ticket on your behalf. Please see the screenshot below for reference.

Screenshot 2023-06-02 at 5.41.36 PM.png

 

Regards,

Sumit Uniyal

Stacey Plowright April 19, 2024

Okay. Thanks!

0 votes
Dave Rosenlund
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
April 16, 2024

Welcome to the community, @Stacey Plowright  👋

The default max size is 100 MB, but this is configurable by your site admin, so your max file size could be different. More here.

Best,

-dave

Andrii Maliuta April 16, 2024

Hello @Dave Rosenlund ,

I suppose the 100 MB size is related to attachments, that is, when you upload a .docx as a file to page attachments.

For Import from Word, there does not seem to be a documented maximum size (in characters, file size, or words), though maybe the Atlassian team can clarify it.

Stacey Plowright April 17, 2024

Andrii is correct. I'm trying to do the Import from Word option. Does the 100 MB max file size apply to both avenues? My files are smaller than that: 49,111 KB, 15,069 KB, and 3,216 KB.

I am actually trying to get things from oXygen (.dita files) to Confluence (I would want them as the final document, or large chunks of it, in Confluence, not the 600 or so topics the document is comprised of), so I've tried transforming from oXygen to Word and then importing to Confluence, and transforming from oXygen to PDF and then from Adobe to Word.

The 49,111 KB and 15,069 KB files should be the same .docx document created the two different ways; the 3,216 KB file was a piece of the document.

All three gave me the same error, so I'm trying to figure out how small I need to go or if there might be something in one of the transformations causing things to fail.

Dave Rosenlund
April 17, 2024

Thanks for highlighting that my assumption about the limit applying in both cases is likely wrong. Sorry about that.

Yes. Let’s escalate this to Atlassian (which I have done). 

-dave

Kristian Klima
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
April 17, 2024

Speaking from experience, the success seems to depend on the complexity and 'richness' of the imported doc file.

I was able to import fairly large chunks from a Paligo export doc file but the quality of imported content progressively decreased. Elements that were rendered nicely at the beginning were all over the place later on. 

Stacey Plowright April 17, 2024

@Dave Rosenlund Okay. Thanks!

@Kristian Klima That's good (if disappointing) to know. 

Kristian Klima
April 17, 2024

@Stacey Plowright 

As with any import/migration, consistency of the source is extremely important. One needs to pick one's battles.

In the end, we decided to copy-paste the content from a static website (rather than from Paligo or exports) into Confluence, as we found that the humble copy-paste was faster at the end of the day.

Of course, if importing doc files is an integral part of your workflow, it makes sense to look into the structure of the imported documents so they make post-import sanity checks in Confluence as smooth as possible.

Stacey Plowright April 17, 2024

@Kristian Klima Oooh, there's an idea. But I'm not sure about how to get the content from oXygen into a single web page. Hmmmm...

(I'm basically just trying to figure out if I can create an internal version of our external-facing content that is more likely to be read/found by our people. I don't really want to attach PDFs if I can help it so people can't accidentally send the wrong version to customers.) It needs to be a quick process -- I wouldn't be able to spend the time needed to make the Confluence version usable every time I updated the source files.

Kristian Klima
April 17, 2024

Well, back in the day (at CA Technologies) we developed a custom XML converter that did one very simple thing: it converted XML exported from AIT into Confluence XML.

After an extensive cleanup in AIT, we gradually migrated all content to Confluence and just continued to work on docs - for internal and external audiences - in Confluence.

Anyway... I presume you have multiple pages in your oXygen. From my previous migrations experience, you have a couple of options:

  • Write a converter so you can import XML into Confluence as native Confluence XML
  • See how brute-force copy-paste of Word documents' content into Confluence's editor works
  • Use an 'in between' format / tool. You may get a better result if you import/copy-paste a Word doc into Google Docs and then into Confluence (I sometimes do these acrobatics with Excel/spreadsheet files :)
  • Generate a one-off simple website from oXygen (if possible) and copy-paste into Confluence page by page

We're on our 4th migration at Emplifi, there's no magic bullet here :)

Now we have almost everything in Confluence and use a Scroll Viewport app to generate two Documentation websites - one is public, the other is internal accessible via our SSO. Authoring all in a single source and using conditional content macros.

 

Andrii Maliuta April 17, 2024

These are very good points by @Kristian Klima.

I suppose there is really no single approach for importing the documents well directly. We also applied different ways to migrate data from other tools to Confluence: some by copying HTML directly, cleaning it, and converting it to xHTML format (Confluence Storage format), and some by converting to Wiki markup where it is more relevant and easier.

As for the Word import there are also some points to keep in mind:

  1. Some symbols in headings are replaced when the headings become Confluence page titles, so the titles will not correspond to what you have in Word
  2. If you have '[', ']', '{', '}' in the Word document, they will break the content, as Confluence will think these are macros and Wiki format, so you need to carefully replace them before import.
  3. Tables, though they seem OK, are usually broken, as the header rows will have incorrect formatting and will just be "colored" as if they are header columns
  4. ...

So, I used Word VBA macros to run automatically and "clean" content to prepare it for import to Confluence. And then after import I used Groovy scripts to replace it back to what it was :) 
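The clean-then-restore idea can be sketched in a few lines. This is a Python illustration of the general technique only, not the VBA macros or Groovy scripts mentioned above; the placeholder tokens and function names are hypothetical, and the sketch works on plain text, so applying it inside a .docx or to a published Confluence page needs its own tooling.

```python
# Hypothetical placeholder tokens; choose strings that never occur in your content.
PLACEHOLDERS = {
    "[": "@@LBRACKET@@",
    "]": "@@RBRACKET@@",
    "{": "@@LBRACE@@",
    "}": "@@RBRACE@@",
}


def protect(text):
    """Replace characters the importer may treat as markup with placeholders."""
    for char, token in PLACEHOLDERS.items():
        text = text.replace(char, token)
    return text


def restore(text):
    """Reverse protect() after the import has finished."""
    for char, token in PLACEHOLDERS.items():
        text = text.replace(token, char)
    return text
```

Because none of the tokens contains a bracket or brace, the replacements do not interfere with each other, and `restore(protect(text))` returns the original text as long as the tokens themselves never appear in the source.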

I suppose HTML is really the best option, as you can control it and format it into xHTML for Confluence, and there are a lot of tools to work with HTML and clean it as needed before import.
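As a rough illustration of the kind of HTML cleanup described here, the sketch below uses Python's standard `html.parser` to drop presentational wrappers and attributes and to self-close void elements. The tag and attribute lists are illustrative assumptions, not a Confluence-defined set; the sketch does not balance unclosed tags and does not produce Confluence-specific storage-format elements, so it covers only the generic cleanup step.

```python
from html import escape
from html.parser import HTMLParser

# Illustrative choices only; adjust to your own content.
VOID_TAGS = {"br", "hr", "img"}
UNWRAP_TAGS = {"span", "font", "div"}
KEEP_ATTRS = {"a": {"href"}, "img": {"src", "alt"}}


class XhtmlCleaner(HTMLParser):
    """Re-emit HTML with wrappers and most attributes removed."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in UNWRAP_TAGS:
            return  # drop the wrapper tag but keep its children
        allowed = KEEP_ATTRS.get(tag, set())
        kept = []
        for name, value in attrs:
            if name in allowed:
                kept.append(' {}="{}"'.format(name, escape(value or "", quote=True)))
        closing = " />" if tag in VOID_TAGS else ">"
        self.parts.append("<" + tag + "".join(kept) + closing)

    def handle_endtag(self, tag):
        if tag in UNWRAP_TAGS or tag in VOID_TAGS:
            return  # void tags were already self-closed
        self.parts.append("</" + tag + ">")

    def handle_data(self, data):
        # Character references were decoded by the parser; escape them again.
        self.parts.append(escape(data, quote=False))


def clean_html(markup):
    cleaner = XhtmlCleaner()
    cleaner.feed(markup)
    cleaner.close()  # flush any buffered trailing text
    return "".join(cleaner.parts)
```

For example, `clean_html('<p style="color:red">A &amp; B<br><span class="x">c</span></p>')` yields a paragraph with the style attribute and the span wrapper removed and the line break written as `<br />`.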

So some home-made tools are necessary for this process :) 

Kristian Klima
April 18, 2024

@Andrii Maliuta you reminded me of my macro set for converting some weird formats from mainframe into Confluence HTML. I had two Word macros and two Notepad++ macros. It took me two days to design, but it saved days of work at each use.

Stacey Plowright April 18, 2024

oXygen was able to direct me on getting things on one webpage. :) 

The text seems reasonable enough when copied and pasted into Confluence, but the images are not. I'm getting "Preview unavailable" on all of them, and when I publish, they are still unavailable.

If I copy and paste a single image by itself from the webpage, it seems to be okay. Any thoughts on formatting changes to try in the output that might make it work (is Confluence allergic to alt text, etc.)?

Kristian Klima
April 18, 2024

@Stacey Plowright Yes, copy/pasting images sometimes works and sometimes doesn't ... but we always took it and ran :)

Deployment type: Cloud
Product plan: Premium