Can I use Confluence to index and serve a large collection of document files in formats such as PDF and MS Office?

Nic Hart December 27, 2012

Although this is not the main use for Confluence, I would like to import, index and serve a large existing document library, with the intention of importing more and more content over time while linking to it as needed from new pages and the search engine. Clearly this can be done piecemeal using the Office Connector plus document attachments; however, I would like to slurp in an entire library, maintaining the existing folder structure. Is this feasible, and what performance or scaling limitations might I expect? I have read about the WebDAV plugin and also the SOAP API for adding attachments, but it is not clear whether these would let me preserve the directory structure and also support very large numbers of files.

Please note: the ability to index and search the imported content is vital for this application.

thanks

Answer accepted
twong_atlassian
Rising Star
December 28, 2012

Hi Nic,

Confluence isn't designed to be a Document Management System: It has capabilities/features that mirror some of the DMS feature set, but we're certainly not feature complete.

Attachments in Confluence are not first-class entities; a standalone document can't have child resources. You could use pages to emulate "folders", and I suppose you could use pages as a metadata storage format, but... it's a bit of a stretch.
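To make the "pages as folders" idea concrete, here is a rough sketch in Python that walks an existing directory tree and works out which pages would be needed and which files would hang off each one as attachments. It only builds a plan; it does not talk to Confluence at all, and the names `PagePlan` and `plan_pages` are made up for this illustration. Titles are built from the full relative path on the assumption that page titles have to be unique within a space, so two folders both called "Archive" would not collide.

```python
import os
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PagePlan:
    """One page standing in for one folder (hypothetical structure)."""
    title: str                      # path-based title, e.g. "library/specs"
    parent: Optional[str]           # title of the parent page, None for the root
    attachments: List[str] = field(default_factory=list)  # file names in the folder


def plan_pages(root: str) -> List[PagePlan]:
    """Map each directory under `root` to a page, parents before children."""
    root = os.path.abspath(root)
    root_name = os.path.basename(root)
    plans: List[PagePlan] = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sort in place so the walk order is deterministic
        rel = os.path.relpath(dirpath, root)
        parts = [root_name] if rel == "." else [root_name] + rel.split(os.sep)
        title = "/".join(parts)
        parent = "/".join(parts[:-1]) if len(parts) > 1 else None
        plans.append(PagePlan(title, parent, sorted(filenames)))
    return plans
```

Because `os.walk` visits directories top-down, every parent appears in the list before its children, so whatever import mechanism you end up trying could create pages in list order. How the pages and attachments actually get created, and how well that scales, is a separate question.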

Performance/scalability-wise, we make a best effort to index certain document types, but we rely on the capabilities of third-party libraries from the open source community to do this. Indexing is not perfect, and doesn't work well on extremely large instances. Mass import isn't very fast either: we have no good way to stuff huge numbers of attachments into a space. WebDAV is really designed for casual use: it's not implemented robustly enough to import huge numbers of files en masse.

I'd suggest simply playing with the WebDAV and indexing capabilities to ascertain for yourself whether Confluence does an adequate job, but I will certainly maintain that Confluence is first and foremost a collaboration wiki with attachment capability, rather than a Document Management System with a wiki.

Tim

Nic Hart December 29, 2012
Thanks Tim. That is just the answer I needed. I wanted to manage expectations appropriately and this allows me to do so. It's a wiki I want...!
twong_atlassian
December 30, 2012

Glad I was able to help out :)
