Maintain a Confluence instance

Gary Spross
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
November 11, 2024

We have a Confluence Cloud instance that has been in use for at least 6 years. There are tons of Spaces and Pages. We want to "clean" it up and reduce outdated, inactive, unnecessary clutter. We are finding this EXTREMELY difficult for many reasons, such as:

  • The "last viewed" Page attribute only goes back 1 year
    • Anything beyond that is listed as "N/A"
  • Page restrictions
    • Unlike Jira, Confluence (as far as I'm aware) doesn't provide a way for admins to override Page restrictions to be able to "see" restricted pages via automation
      • I'm aware of the Admin key, but that doesn't help when performing Page branching in an automation rule
    • This is necessary as I'm not planning to go through the Spaces manually to perform this cleanup effort...
  • Lack of an "Automation for Confluence" account
    • Not a dealbreaker, but it would be nice if the actions performed were associated with a common account as opposed to the individual tasked with creating the automation rules
  • 1,000 Page limitation within automation rules
    • I understand there have to be limitations so an automation isn't running for several minutes or even hours, but...
    • We created a rule (even with the limitations provided) to label the Pages to be impacted
    • On subsequent runs, the rule just loops through the same 1,000 Pages, so there's no way to "advance"
  • Inability to export impacted Pages
    • Confluence's search UI is ugly
    • That and there is no ability to export the Pages
    • We wanted to be able to export the Pages to be impacted by this effort to provide communication to our userbase that these are the Pages to be archived/trashed before actually performing the maintenance effort
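
For readers facing the same export gap, one possible workaround is to turn API search results into a CSV file that can be shared with page owners. The following is only a minimal sketch: the function name `pages_to_csv`, the sample record, and the `example.atlassian.net` site are hypothetical, and the record shape (`id`, `title`, `space.key`, `version.when`, `_links.webui`) is an assumption about what the Confluence Cloud content search returns when `space` and `version` are expanded.

```python
import csv
import io


def pages_to_csv(pages, base_url):
    """Render page records as CSV text (hypothetical helper, not an Atlassian API)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "title", "space", "last_modified", "url"])
    for page in pages:
        writer.writerow([
            page.get("id", ""),
            page.get("title", ""),
            page.get("space", {}).get("key", ""),
            page.get("version", {}).get("when", ""),
            # webui links are assumed to be relative to the site's /wiki base URL
            base_url.rstrip("/") + page.get("_links", {}).get("webui", ""),
        ])
    return buffer.getvalue()


# Made-up sample record in the assumed shape
sample_pages = [
    {
        "id": "101",
        "title": "Old runbook",
        "space": {"key": "OPS"},
        "version": {"when": "2021-03-02T10:00:00.000Z"},
        "_links": {"webui": "/spaces/OPS/pages/101/Old+runbook"},
    },
]

csv_text = pages_to_csv(sample_pages, "https://example.atlassian.net/wiki")
print(csv_text)
```

The resulting text can be written to a file and attached to the announcement that precedes the archive/trash step.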

I tried looking for an add-on that could assist with this maintenance effort, but wasn't finding anything. That said, for the cost of Atlassian products, I would expect to be able to perform this level of maintenance without needing to pay extra for 3rd party tooling...

Any advice would be much appreciated!

7 answers

3 accepted

10 votes
Answer accepted
Levente Szabo _Midori_
Atlassian Partner
November 12, 2024

@Gary Spross When it comes to maintaining Confluence, and content lifecycle management specifically, Better Content Archiving and Analytics is a popular solution that already meets your requirements today. The Data Center version has been on the market since 2008. When we at Midori worked on the cloud version, we used all the knowledge we gathered from working with large enterprises with huge Confluence instances, like this well-known professional social platform or this household brand animation studio.

We also summarized our thoughts about Confluence content lifecycle management in this article. I recommend reading this to learn more about our philosophy on this subject.

1. Better Content Archiving and Analytics implements its own automation engine available for all tiers, including free. You are not restricted by any limitation of Automation for Confluence.

2. The app allows you to simply create your own statuses, up to 20 instead of the 5 built-in.

3. You can create custom rules for what counts as a "not viewed", "not updated", "to review", etc. status. The statuses update automatically or can be adjusted manually.

4. You can automate archiving or delete actions within the app that will take action based on your statuses and the schedule you specify.

5. It's not exporting, but the app delivers an advanced notification system that automatically sends emails to the designated recipients (based on roles like page owner, group membership, or individual email addresses). Note that the app also implements its own page owner concept that has advantages over the built-in one. The email content can be customized, but there are many templates built in.

6. There is also a comprehensive analytics solution built into the app that gives insights similar to Confluence Analytics and more, as it contains content lifecycle-related information, like statuses and status changes over time, status overview, and more. Watch this overview about the reporting dashboards available for all tiers:

I suggest you look into it and give it a try for free.

Especially if you considered Premium only for the automation and content lifecycle features, I think you may be better off staying on Standard and using Better Content Archiving and Analytics.

I'm available for a consultation anytime to look at your CLM needs and how we can address them today. Reach out to us here or shoot me an email directly at levente.szabo@midori-global.com

(I'm part of the Midori team developing Better Content Archiving and Analytics since 2008.)

Gary Spross
Community Leader
November 13, 2024

Thanks @Levente Szabo _Midori_. At this time we are attempting a workaround, without an add-on, using what's available to us. If we are unsuccessful, I will look into your app.

Gary Spross
Community Leader
December 6, 2024

@Levente Szabo _Midori_, I'm definitely realizing that Confluence on its own just doesn't provide any legitimate ways to be maintained (which is kind of sad...). I started looking into the Better Content Archiving and Analytics app that you suggested and it seems to align perfectly with our use case.

Before I attempt to get budgeting approval, I want to be sure I understand the capabilities of the app and can provide good reasoning why this is something we should get. So I installed the app in a test instance I have, however I have 2 questions:

  1. I'm not seeing the content statuses get added to any of my Spaces
    1. I've ensured a scheme is applied, ensured the Spaces aren't excluded, and done both an initialize of the content events and refresh of the content statuses
      1. The initialization & refresh jobs complete successfully
    2. What am I missing?
  2. Is there anywhere to see the configuration of the schedule of the jobs?
    1. I see they can be run manually, but how often are they scheduled to run and can this be changed?
Gary Spross
Community Leader
December 6, 2024

I figured it out about the content statuses. I was conflating Page status & Content Status. The content statuses used by this app are separate and appear inline with the page stats just under the title of the page.

I'm still unsure about the job schedules and how they are configured.

Levente Szabo _Midori_
Atlassian Partner
December 8, 2024

@Gary Spross Yes, we implement our own statuses, so those can be more flexible, diverse and not tied to Atlassian's decisions.

Here you can read more about every job. The status refresh runs every 6 hours (a manual update or refresh changes the status immediately, of course). The rest is scheduled by you inside a scheme.

For example, if you schedule a notification email automation, you also specify the schedule. There you can also verify when the job will run in the future.

job-schedule.png

It works the same for archiving/deletion automation. If you have further questions, feel free to reach out to us via our support or to me personally (levente.szabo@midori-global.com), so we can help you implement the app faster and provide tailored suggestions!

1 vote
Answer accepted
Jim Knepley - ReleaseTEAM
Atlassian Partner
November 11, 2024

Hi @Gary Spross 

I wouldn't use automation for this, I would write something by hand (probably with Python+Pandas, but whatever) that interacts with Confluence via the API. It would take some work.
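
As a rough illustration of what such a script could start from, here is a minimal sketch that builds a CQL query for stale pages and walks through paginated search results. Several things are assumptions rather than confirmed details: the helper names, the `cleanup-candidate` label, the injected `fetch_json` callable (a stand-in for an authenticated HTTP GET against the site's `/wiki` base URL), and the expectation that the content search endpoint returns a `results` list plus a `_links.next` path while more pages remain. Excluding pages that already carry the label is one way successive runs could advance past a fixed batch size; whether `label != ...` also matches pages with no labels at all is worth verifying on a test space.

```python
from urllib.parse import urlencode


def build_stale_cql(label, space_key=None, older_than="-1y"):
    """Build a CQL string for pages not modified recently and not yet labeled."""
    clauses = [
        "type = page",
        f'label != "{label}"',
        f'lastmodified < now("{older_than}")',
    ]
    if space_key:
        clauses.append(f'space = "{space_key}"')
    return " AND ".join(clauses)


def iter_search_results(fetch_json, cql, limit=100):
    """Yield every result, following `_links.next` until it is absent."""
    query = urlencode({"cql": cql, "limit": limit, "expand": "version,space"})
    path = "/rest/api/content/search?" + query
    while path:
        payload = fetch_json(path)
        yield from payload.get("results", [])
        path = payload.get("_links", {}).get("next")


def fake_fetch(path):
    """Offline stand-in for an HTTP GET; returns two made-up result pages."""
    if "cursor=abc" in path:
        return {"results": [{"id": "3"}], "_links": {}}
    return {
        "results": [{"id": "1"}, {"id": "2"}],
        "_links": {"next": "/rest/api/content/search?cursor=abc"},
    }


cql = build_stale_cql("cleanup-candidate", space_key="OPS")
ids = [page["id"] for page in iter_search_results(fake_fetch, cql)]
print(cql)
print(ids)
```

In a real script, `fake_fetch` would be replaced by a function that issues the request with an API token and returns the parsed JSON; the results could then be loaded into a Pandas DataFrame for filtering.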

If you have access to Rovo, you might be able to use it to identify clusters of pages that cover similar topics. I was able to get some sensible results with this prompt:

"You are an AI agent that analyzes Confluence pages. Using a list of page summaries, identify pages that address the same topics and list those pages."

It generated a list of page categories and the pages that fit those categories. I think that could be a useful start toward consolidating content or removing redundant pages.

(edit: you get a VERY similar response to that prompt from Atlassian Intelligence)

Gary Spross
Community Leader
November 11, 2024

Thanks for the suggestions @Jim Knepley - ReleaseTEAM. I was trying to avoid having to go down the path of basically creating my own app.

We also don't have Rovo at the moment. Can't get past the cost associated with that...

Jim Knepley - ReleaseTEAM
Atlassian Partner
November 11, 2024

@Gary Spross I edited my previous answer, but you likely won't be notified of the update. 

You get a very similar response to that prompt from Atlassian Intelligence.

0 votes
Answer accepted
Kristian Klima
Community Leader
November 11, 2024

Hi @Gary Spross 

Some of your issues can be dealt with by apps like Panorama by Kolekti or Pages Manager by Ricksoft. For example, Panorama will display Last Viewed data that goes back well beyond one year and will show you what restrictions are applied on the page.

You can also archive individual pages directly from the app.

Also, Pages Manager (free app) lets you bulk edit parameters and labels.

I can think of a solution in which you'd identify 'old' pages, apply labels (or statuses) en masse and then use automation (or an app), to manipulate the pages.

Say that you bulk apply label Dump to specific pages - combine with an automation rule - notify owner when label Dump applied - you warn them.

Taking it to a new level, Space Sync for Confluence (Ricksoft) can 'sync' (copy) pages with specific statuses/labels to another space. So you can create a dumping space for all useless pages from multiple spaces - and you can trigger that from Pages Manager by bulk changing statuses.

Then you archive the dumping space (automate notifications when archiving), and then delete it.

The idea is to create a chain of specific apps - actions they can perform (view parameters, bulk change parameters, bulk copy), and combine them with automation.

Details, of course, depend on the specifics of your setup but I think you should be able to achieve quite a lot with the aforementioned apps.

(Disclaimer - I'm not affiliated with Kolekti/Adaptavist or Ricksoft)

Gary Spross
Community Leader
November 13, 2024

"Panorama will display Last Viewed data that goes back well beyond one year and will show you what restrictions are applied on the page."

It baffles me that a 3rd party app can gather that data, yet Atlassian themselves don't provide a way (that I can find or that Support has informed me of) for users to gather it themselves.

"I can think of a solution in which you'd identify 'old' pages, apply labels (or statuses) en masse and then use automation (or an app), to manipulate the pages."

This is exactly the approach we're taking. We're using automation to label all Pages that have been "inactive" for more than 1 year (the furthest back according to Atlassian). We're then using Atlassian Analytics (luckily we have access to this tool) to query pages with the applied label that also haven't been edited in the last 2.5 years. From there we are communicating to the masses and then will use the Delete page API call to delete all of the impacted pages.
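
The final step of that workflow could be sketched roughly as follows: take the labeled pages, keep only those whose last edit is older than about 2.5 years, and produce a dry-run list of delete requests to review before anything is actually sent. This is a sketch under stated assumptions, not the exact script in use: the helper names and sample records are hypothetical, the timestamp format is assumed to match `version.when` from the REST API, and the `/wiki/api/v2/pages/{id}` path reflects my understanding of the Confluence Cloud v2 "Delete page" endpoint, which should move a page to the trash rather than purge it.

```python
from datetime import datetime, timedelta, timezone


def parse_when(value):
    """Parse a timestamp such as 2021-03-02T10:00:00.000Z (assumed format)."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")


def plan_deletions(pages, now, max_age_days=913):
    """Return (method, path) tuples for pages last edited before the cutoff.

    913 days is roughly 2.5 years. Nothing is sent; this is a dry run.
    """
    cutoff = now - timedelta(days=max_age_days)
    plan = []
    for page in pages:
        last_edit = parse_when(page["version"]["when"])
        if last_edit < cutoff:
            plan.append(("DELETE", f"/wiki/api/v2/pages/{page['id']}"))
    return plan


# Made-up records standing in for the labeled pages returned by a query
labeled_pages = [
    {"id": "101", "version": {"when": "2021-03-02T10:00:00.000Z"}},
    {"id": "102", "version": {"when": "2024-06-01T08:30:00.000Z"}},
]

now = datetime(2024, 11, 13, tzinfo=timezone.utc)
plan = plan_deletions(labeled_pages, now)
for method, path in plan:
    print(method, path)
```

Keeping the plan separate from the execution makes it easy to share the list with users first and to run the actual deletes only after the communication period has passed.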

After this initial "purge", we should be able to utilize an automation rule to keep up with the maintenance so that we don't fall into this trap of having many thousands of inactive/outdated Pages existing in our instance.

Note: Because of the restrictions feature, there will be Pages that meet the criteria, but aren't "found", therefore aren't "cleaned up". We've accepted this, because we have to, but I really wish Atlassian would implement an ability to default add users/groups to be able to bypass restrictions. At the very least, the confluence-admin group. Confluence isn't a personal tool. It's Enterprise software. Admins should be able to access any and all data (and not via an Admin key!).

Ideally, I'd like to avoid the need for another add-on. Mainly because the cost (on top of what the Atlassian products cost already) is not going to be an easy sell...

I appreciate your response!

Kristian Klima
Community Leader
November 13, 2024

Hi @Gary Spross 

Confluence Premium does have Content manager which can do some bulk ops...

It's weird that you have Content Manager, Pages Manager, Panorama (and probably many more), each with the same goal in mind. They overlap and yet, frustratingly, each misses one different thing.

Each doing a different 80% chunk of features and you need all three to achieve a goal that you feel should be the obvious primary use-case :) 

(Content manager is fast improving, though - I got used to using third party tools but I was surprised to see many changes).

Gary Spross
Community Leader
November 13, 2024

I do agree that Content Manager has had many improvements. A few additional improvements I'd like to see them make to it are:

  1. Ability to sort by column
  2. Ability to filter by date on specific columns
  3. Ability to export
  4. Ability to Expand/Collapse all

That said, I was looking for an automated way to perform this cleanup. The last thing I want to do is manually go through the Content Manager in a couple hundred Spaces. Yuck...

3 votes
Matthew_Christiansen_Adaptavist
Atlassian Partner
November 11, 2024

Good afternoon @Gary Spross 

I am a marketer at Kolekti (part of Adaptavist) for a third-party app that will soon be released on the Marketplace (aiming for next week).

The main purpose of this app is to improve the efficiency of space management and eliminate the often tedious tasks of cleaning up content, including updating labels, status, etc. It also allows for a more holistic view of the content's property information.

 

Given your current challenge with clutter and outdated content across your instance, it sounds like this could be something our app could very much support you with.

I would be more than happy to jump on a call and demonstrate the new application to you.

Let me know, and we can set something up when it's convenient for you or your team.

Thanks

Gary Spross
Community Leader
November 13, 2024

Thanks @Matthew_Christiansen_Adaptavist. At this time we are attempting a workaround, without an add-on, using what's available to us. If we are unsuccessful I will reach out to learn more about your app.

0 votes
Adrian Hülsmann - B1NARY
Atlassian Partner
January 24, 2025

Hi @Gary Spross

You may also take a look at Breeze, which provides automated review workflows, that seem to fit your use case.

community_breeze_overview.png

Breeze automatically identifies outdated pages for you, creates reports, sends notifications to responsible persons, and provides dedicated review UIs for users to track the overall content quality and progress.

It would be too much to mention all its features here, but if you like, you can try it for free in the Atlassian Marketplace or schedule a demo with me, in which I can show you how it works and discuss whether it fits your needs.

All the best,
Adrian from B1NARY (we are the developers of Breeze)

0 votes
Mattia _bitvoodoo ag_
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
November 20, 2024

Hello Gary,

I have read that you are currently looking for a workaround without an add-on. I can't suggest a satisfactory native solution either.


Should you consider an add-on solution in the future, I can offer to demonstrate our Viewtracker app to you.

Kind regards,

Mattia

0 votes
Stavros_Rougas_EasyApps
Atlassian Partner
November 18, 2024

@Gary Spross spot on post. And good for you to appreciate that herding cats (cleaning up spaces) makes a difference.

Take a look at Space Content Manager, it's all in Forge. It's built around the concept of bulk managing content. I come from the content side of things, and some things should be easy to fix like 'confluence' not 'Confluence' on public documentation.

Screenshot 2024-11-18 at 15.48.17.png

We keep adding modules. Feel free to contact our support with your shopping list.

Deployment type: Cloud
Product plan: Premium
Permissions level: Product Admin
Tags: AUG Leaders
