I'm analyzing how many Confluence pages are currently open. To do this, I exported the Confluence space I'm working in and loaded it into Excel so I could add a few formulas to remove duplicates. I wasn't sure which unique identifier shows which Confluence pages are currently live and which are archived.
On my first attempt I used "Navigationtype" because it seemed like it would work, but when I ran another export of the same space later, "Navigationtype" gave me a wildly different number of pages. My next thought was to use "spaceid", but I'm not sure that is correct either.
So my question is: what is the correct unique identifier in the Confluence Space Export for identifying and counting the pages we currently have live?
If you want to see the number of pages per space and per status, then the Status Overview report gives you exactly that (it excludes archived and deleted pages!).
If you want to know who the owners are and when the pages were last updated, you can get that by displaying the "Owners" and the "Last updated" columns on the content list.
If you want to track how the status distribution changes over time (to confirm that the quality is improving), use the Status report.
To me, your use case sounds like a classic example of content lifecycle management: updating or archiving content systematically. The Better Content Archiving app has been developed for exactly that for more than a decade!
You can try it free any time.
(Disclosure: this paid and supported app is developed by our team. Free for 10 users.)
Hi @Jason_Tenney, and welcome to the Community.
I'm not sure if I'm reading your question correctly, but if you just need to identify live documents and pages in the space, you can use Content Manager - it comes with your Confluence as standard.
You need to click Columns and select Type, but then any live doc will be identified as such (note the different icon in the Type column).
Archived pages are listed under Settings > Content, and they do not show up in the space tree.
Thanks, that's good to know about the Content Manager. It would give me some more information to use.
My goal is to identify
While I can see which documents are currently live in the Content Manager, getting that information into Excel to create reports for management is a challenge. I was hoping there was a marker in the CSV export (since it seems to contain all the data anyway) that indicates which documents are currently live and which aren't. If there isn't a reliable way to do that, I might just need to scrape what I can from the Content Manager and put that into Excel.
I thought I could easily copy the cells from the Content Manager, but it looks like I cannot. So the Content Manager wouldn't be a viable option for getting what I need into Excel to run reports.
@Jason_Tenney, what do you mean by "live"? This is confusing, because you used it in both the original question and the reply. "Live" is not a term in Confluence by itself.
- page
- live doc
- database
- whiteboard
Does "live" mean the total of these that are not archived?
Yes, by "live" I mean the total pages that are not archived, deleted, or in draft form: all documents within a space that all users in our organization can see. Sorry about the confusion.
@Jason_Tenney, I asked because our app, Space Content Manager, might help. Select the spaces you want and get statistics on page numbers by type.
Note that these numbers do not include deleted or archived pages.
We do not currently have an export, but it is something we are considering. It would be great if you could send me an email at stavros@easyapps.app so I can tell you more and get your feedback on exactly what you need it to do.
If your goal is to count the pages that are currently live, I'd avoid using NavigationType since it's not intended to be a unique page identifier, and can vary between exports.
A better option is to use the page's unique Content ID (or Page ID, depending on the export format), as it remains consistent for each page. You can then filter based on the page status (current vs. archived) to determine which pages are live. SpaceID identifies the space itself, not individual pages, so it won't help distinguish unique pages within the same space.
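As a rough illustration of that approach outside Excel, here is a minimal Python sketch that de-duplicates on the ID column and keeps only rows whose status is "current". The column names "Content ID" and "Page Status" and the sample rows are assumptions made up for the example; check them against the headers in your own CSV.

```python
import csv
import io

# Hypothetical column names; adjust to match the headers in your export.
ID_COLUMN = "Content ID"
STATUS_COLUMN = "Page Status"


def count_live_pages(csv_file, live_status="current"):
    """Count unique content IDs whose status equals live_status."""
    live_ids = set()
    for row in csv.DictReader(csv_file):
        content_id = (row.get(ID_COLUMN) or "").strip()
        status = (row.get(STATUS_COLUMN) or "").strip().lower()
        if content_id and status == live_status:
            live_ids.add(content_id)  # a set ignores duplicate rows
    return len(live_ids)


# Made-up sample rows, only to show the shape of the input.
sample = io.StringIO(
    "Content ID,Title,Page Status\n"
    "101,Onboarding,current\n"
    "101,Onboarding,current\n"
    "102,Old runbook,archived\n"
    "103,Release notes,Current\n"
)
live_count = count_live_pages(sample)
print(live_count)
```

For a real export you would pass an opened file instead of the sample, for example `open("export.csv", newline="", encoding="utf-8")`, where the file name is again just a placeholder.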
If you can share which type of Confluence export you're using (CSV, XML, or another format), it would be easier to recommend the exact field to use.
Thanks for the input. I'm currently using the CSV export to put everything into some easy-to-read pivot tables in Excel.
So I did try filtering by "Page Status" (current vs. archived), but that still gives me pages that have been deleted. When I try to open some of the documents that are still listed as "current", I land on a "page not found" error.
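One way to dig into this might be to tally every distinct value in the status column, to see whether the export uses any values besides "current" and "archived" (or leaves the field blank) for the rows that turn out to be deleted. This is only a diagnostic sketch: the "Page Status" header and the sample values below are assumptions for illustration, not confirmed fields of the export.

```python
import csv
import io
from collections import Counter

STATUS_COLUMN = "Page Status"  # hypothetical header; match it to your CSV


def tally_statuses(csv_file):
    """Return a Counter of the values found in the status column."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        status = (row.get(STATUS_COLUMN) or "").strip().lower()
        counts[status or "(blank)"] += 1
    return counts


# Made-up rows, including a blank and an unexpected status value.
sample = io.StringIO(
    "Content ID,Title,Page Status\n"
    "101,Onboarding,current\n"
    "102,Old runbook,archived\n"
    "103,Release notes,current\n"
    "104,Draft plan,\n"
    "105,Removed page,trashed\n"
)
status_counts = tally_statuses(sample)
for status, n in status_counts.most_common():
    print(status, n)
```

If every deleted page really does show up as "current" in the tally, the status column alone cannot separate them, and another column in the export would be needed.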