Can I use Confluence to index and serve a large collection of document files in formats such as PDF and MS Office?

Although this is not the main use for Confluence, I would like to import, index and serve a large existing document library, with the intention of importing more content over time while linking to it as needed from new pages and the search engine. Clearly this can be done piecemeal using the Office Connector plus document attachments; however, I would like to slurp an entire library, maintaining the existing folder structure. Is this feasible, and what performance or scaling limitations might I expect? I have read about the WebDAV plugin and also the SOAP API for adding attachments, but it is not clear whether these would preserve the directory structure and also support very large numbers of files.

Please note - the ability to index and search the foreign content is vital for this application.

thanks

1 answer

1 accepted


Hi Nic,

Confluence isn't designed to be a Document Management System: It has capabilities/features that mirror some of the DMS feature set, but we're certainly not feature complete.

Attachments in Confluence are not first-class entities; a standalone document can't have child resources. You could use pages to emulate "folders", and I suppose you could use pages as a metadata storage format, but... it's a bit of a stretch.
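To make the "pages as folders" idea concrete, here is a minimal sketch in Python. It only walks a local directory and builds a plan of which page each folder would become and which files would be attached to it; it does not call any Confluence API. The function name `plan_page_tree` and the dictionary keys are hypothetical, chosen for illustration. Titles use the full relative path on the assumption that page titles need to be unique within a space.

```python
import os
from pathlib import Path


def plan_page_tree(root):
    """Map each folder under `root` to a hypothetical page entry.

    Returns a list of dicts holding the page title, the parent page
    title (None for the top-level folder), and the file names that
    would be attached to that page. Nothing is uploaded anywhere.
    """
    root = Path(root).resolve()

    def title_for(rel):
        # Use the full relative path as the title so that two folders
        # with the same name in different places do not collide.
        if rel == Path("."):
            return root.name
        return f"{root.name}/{rel.as_posix()}"

    plan = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        rel = Path(dirpath).relative_to(root)
        parent = None if rel == Path(".") else title_for(rel.parent)
        plan.append(
            {
                "title": title_for(rel),
                "parent": parent,
                "attachments": sorted(filenames),
            }
        )
    return plan
```

A plan like this could be reviewed before any import is attempted, for example to count how many pages and attachments the library would produce.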

Performance- and scalability-wise, we make a best-effort attempt to index certain document types, but we rely on the capabilities of third-party open source libraries to do this. Indexing is not perfect, and doesn't work well on extremely large instances. Mass import isn't very fast either: we have no good way to stuff huge numbers of attachments into a space. WebDAV is really designed for casual use: it isn't implemented robustly enough to import huge numbers of files en masse.

I'd suggest simply playing with the WebDAV and indexing capabilities to ascertain for yourself whether Confluence does an adequate job, but I will certainly maintain that Confluence is a first-class collaboration wiki with attachment capability, rather than a Document Management System with a wiki.

Tim

Thanks Tim. That is just the answer I needed. I wanted to manage expectations appropriately and this allows me to do so. It's a wiki I want...!

Glad I was able to help out :)
