Reorganising existing content

Hello everyone,

I've inherited a wiki that has 1000s of pages of content of all types. At the moment I have to take a couple of the spaces that are two or more years old and reorganise the content in the following ways:

  • remove/archive the old info
  • update the page designs to get them standardised
  • introduce a standard design throughout the spaces (using templates where necessary)
  • do whatever I can to help improve the search results.

The problem I'm having is that there are 1000s of pages to go through, which is obviously very time consuming and in reality not very likely to happen. Some of the content is highly organised, some has just been thrown together. It's all a bit mix n match as far as page design and layout is concerned.

My approach at the moment is to look at the various areas that make up the spaces, get a feel for what the dozens of contributors have done and see how I can improve things by making a variety of changes. My next step will be to recommend various changes that page owners (is there really such a thing after two or more people have edited a page?) have to make. After that, it's pretty much up to them to make the changes, or I do it all myself.

Has anyone else done this before, and if so how did you go about it? At the moment, I'm looking at climbing a mountain as I can't see too many people having the time to make the changes I think are necessary.

Thanks in advance.

3 answers

Answer accepted

This is a pretty broad topic, and it will need lots of follow-up questions.

I'll kick off the discussion by addressing the issue of using standardized designs... if by this you mean the header/footer and general appearance of each page, then forget templates. Those only pre-fill a new page with macros or other content at the moment it is created; they do nothing for pages that already exist.

To the topic of design: how many spaces are we talking about here? It's not trivial, but not overly complicated either, to modify the decorators to create a standard design across multiple spaces. You can use the main, global, page, or space decorators for each space to create a standard look and feel.

A good way to tie lots of disparate docs together is to use the pagetree macro. Add that to your left gutter. You can stack them or combine content from multiple spaces to make it all appear as if it's part of one large doc set. This can get a little tricky, but it's doable.

As for improving search results, I find using consistent labels is one of the best ways to achieve this. Confluence uses Lucene for its internal search, and much of what you might know about SEO (Search Engine Optimization) for Google and other major search providers applies to it: use good headings (h1, h2, h3), and use important keywords in the first paragraph or so.
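To make the point about consistent labels concrete, here is a minimal sketch in Python. It assumes you have already exported the labels in use as plain strings (how you export them is up to you); `normalize_label` and `find_label_variants` are hypothetical helper names, not part of Confluence, and the normalization rule (hyphens instead of spaces or underscores) is only one possible convention.

```python
import re
from collections import defaultdict


def normalize_label(label: str) -> str:
    """Lowercase, trim, and collapse runs of spaces, underscores or hyphens into one hyphen."""
    cleaned = re.sub(r"[\s_\-]+", "-", label.strip().lower())
    return cleaned.strip("-")


def find_label_variants(labels):
    """Group raw labels by their normalized form.

    Returns only the groups that contain more than one spelling,
    i.e. the labels worth merging into a single agreed form.
    """
    groups = defaultdict(set)
    for raw in labels:
        groups[normalize_label(raw)].add(raw)
    return {norm: sorted(raws) for norm, raws in groups.items() if len(raws) > 1}


if __name__ == "__main__":
    # Hypothetical labels exported from a space.
    exported = ["release-notes", "release_notes", "faq", "how-to", "how_to"]
    for agreed, spellings in find_label_variants(exported).items():
        print(agreed, "<-", spellings)
```

Note that this only catches spelling variants of the same words; "howto" and "how-to" would still count as different labels, so a quick manual pass over the output is worthwhile.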

Matthew, thanks for taking the time to answer. It was more a case of how do I review all this info than what comes afterwards. There's so much of it to go through, so is it sensible to go through it all, or to review a percentage and extrapolate an answer from that? If so, how accurate is that likely to be? Cheers.
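On the "how accurate is that likely to be?" part, a rough sketch in Python, assuming the pages you review are picked at random and each one is simply judged "needs work" or "fine". It uses the standard margin-of-error formula for a proportion with a finite population correction; the 3,000-page figure is a made-up stand-in for "1000s of pages", and the function names are hypothetical.

```python
import math


def margin_of_error(sample_size, population, proportion=0.5, z=1.96):
    """Margin of error for an estimated proportion from a simple random sample.

    z=1.96 corresponds to roughly 95% confidence; proportion=0.5 is the
    worst case and is the safe choice when you have no prior estimate.
    """
    if not 0 < sample_size <= population:
        raise ValueError("sample_size must be between 1 and population")
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    if population > 1:
        # Finite population correction: sampling a large share of the pages helps.
        fpc = math.sqrt((population - sample_size) / (population - 1))
    else:
        fpc = 0.0
    return z * standard_error * fpc


def required_sample_size(population, margin, proportion=0.5, z=1.96):
    """Smallest random sample that keeps the margin of error at or below `margin`."""
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))


if __name__ == "__main__":
    pages = 3000  # hypothetical total page count
    n = required_sample_size(pages, margin=0.05)
    print(f"Review {n} random pages for about +/-5% at 95% confidence")
    print(f"Margin with 100 pages: +/-{margin_of_error(100, pages):.1%}")
```

The catch is the random selection: reviewing only the pages you happen to stumble on, or only one contributor's pages, will not extrapolate to the rest of the space.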

Oh, I see. I know from experience reviewing docs that you can probably identify which users' documents require more review than others... so maybe sort the content by creator, and then start by reviewing the docs created by the people with the worst "track record".
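A small sketch of that "track record" idea in Python, assuming you have already reviewed some pages and noted, for each one, who created it and whether it needed work. The `(creator, needs_work)` pairs and the `rank_creators` name are hypothetical; nothing here comes from Confluence itself.

```python
from collections import Counter


def rank_creators(reviewed):
    """Rank creators by the share of their reviewed pages that needed work.

    `reviewed` is an iterable of (creator, needs_work) pairs.
    Returns (creator, flagged, total, rate) tuples, worst rate first;
    ties go to the creator with more reviewed pages, then by name.
    """
    totals = Counter()
    flagged = Counter()
    for creator, needs_work in reviewed:
        totals[creator] += 1
        if needs_work:
            flagged[creator] += 1
    rows = [(c, flagged[c], totals[c], flagged[c] / totals[c]) for c in totals]
    return sorted(rows, key=lambda row: (-row[3], -row[2], row[0]))


if __name__ == "__main__":
    # Hypothetical review notes.
    notes = [("alice", True), ("alice", False), ("bob", True),
             ("bob", True), ("carol", False)]
    for creator, bad, total, rate in rank_creators(notes):
        print(f"{creator}: {bad}/{total} pages needed work ({rate:.0%})")
```

A rate based on one or two pages says very little, so it is worth reviewing a handful of pages per creator before trusting the order.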

Mick, you should definitely start your work with the Archiving Plugin.

As you wrote, the first step should be archiving (not deleting!) the pages that are not needed anymore. The Archiving Plugin will give you a lot of help in finding what needs to be archived (what is out of date? what is not viewed by anyone?), and then in actually doing the archiving.

Even if you decide to do the work manually, the plugin will show you accurate statistics about the "outdatedness" of each space. That gives you good insight into which spaces and space categories in your Confluence site are the most problematic.

I don't think that the refactoring part of the work can be easily automated, as your organization may have its own custom policies and structures. You should define the rules here, and then either apply them manually, or maybe write some scripts and use the Confluence CLI plugin to automate it. (With 1000s of pages it may well be worth the effort.)
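As an illustration of "define the rules, then script them", here is a minimal Python sketch. It assumes you can get a title, last-modified date and view count for each page from somewhere (an export, a plugin report); the `Page` record, the `triage` function and the two-year threshold are all hypothetical and stand in for whatever policy your organization agrees on. It only classifies pages; it does not call Confluence or the CLI plugin.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Page:
    title: str
    last_modified: date
    views_last_year: int


def triage(page, today, stale_after=timedelta(days=730)):
    """Apply a simple example policy to one page.

    - "archive": untouched for longer than `stale_after` and never viewed
    - "review":  untouched for that long but still being read
    - "keep":    edited recently
    """
    is_stale = today - page.last_modified > stale_after
    if is_stale and page.views_last_year == 0:
        return "archive"
    if is_stale:
        return "review"
    return "keep"


if __name__ == "__main__":
    today = date(2024, 1, 1)  # fixed date so the example is repeatable
    pages = [  # hypothetical export
        Page("Old release plan", date(2020, 3, 1), 0),
        Page("VPN setup", date(2020, 5, 1), 42),
        Page("Team contacts", date(2023, 11, 1), 7),
    ]
    for page in pages:
        print(f"{triage(page, today):8} {page.title}")
```

Keeping the rules in one small function means the page owners can read and argue about the policy before anything is actually moved.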

After you have cleaned up and reorganized the current content, it is a very good idea to let the Archiving Plugin run continuously (aka continuous wiki gardening), to avoid getting into this situation again in the future.

Make sure you read the manual.


Thanks for that info. I'll have a look at the plugin: it would be interesting to find out which pages aren't being viewed at all.

I've actually done a lot of this work now, mainly archiving really, and am about to start talking to all the page owners about the pages that are left. Wish me luck. :)


I'm sorry I didn't find this question sooner. I could have saved you a lot of really tedious work... Hope it's not too late though.

Thanks, but no worries, that's my problem! :)
