Find and identify all empty pages in Confluence

As a user of our company wiki based on Confluence, I want to find and identify pages that are empty apart from a title, so that I can decide whether to trash them or ping peers so that they can fill them with our knowledge.

Answer accepted

Here's a macro that will do it. Not the most efficient macro, but it sounds like you just need to run this once to get a list. I can't vouch for it being perfect, but I did a quick test and it seems to work.

## Macro title: Find Empty Pages
## Macro has a body: N
## Body processing: n/a
## Output: HTML
##
## Developed by: Matthew J. Horn
## Date created: 07/31/2013
## @noparams

#set ($pageListArray = [])
#set ($spaceHome = $space.getHomePage())

#macro ( process $rp )
  #set ($pagelist = $rp.getSortedChildren() )  ## returns List<Page>
  #foreach( $child in $pagelist )
    #set($p = $pageListArray.add( $child ) )
    #if( $child.hasChildren() )
      #process ( $child )
    #end
  #end
#end

#process ( $spaceHome )

<table class="confluenceTable">
 <tbody>
 <tr>
  <th class="confluenceTh">Title</th>
  <th class="confluenceTh">Size</th>
 </tr>

 #foreach( $child in $pageListArray)   ## child is of type Page
   <tr>
     <td class="confluenceTd">$child.getTitle()</td>
     <td class="confluenceTd">$child.getBodyAsStringWithoutMarkup().length()</td>
   </tr>
 #end 

</tbody>
</table>

Matthew, this is working well. However, how can I see this at the level of the complete instance? I have some 20-odd spaces and want to know how many empty pages I have in each space.

Hello,

Maybe a late answer, but we had the same problem and the same requests from our users. I will post my solution in case somebody can use it.

 

I checked Matthew J. Horn's great answer, but it has some problems:

* it lists all the pages in a space, not only the empty ones

* only the page title is shown; links to the pages would be convenient

 

I did some tests and found that it is hard to identify empty pages with 100% certainty when using the length of the content as a String. I didn't find a better way to check for an empty page, though, so that does seem to be the best tool at our disposal.

I tested with empty pages that have some layout (like sections), pages that use macros without any other text (with or without output), pages with very little text, very small images, etc.

I found that with a threshold of 10 (length of the String), almost all of the non-empty pages are filtered out, although some false positives can remain.
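
To make the threshold idea concrete outside Confluence, here is a minimal Python sketch. It is not part of the macro: the function name `looks_empty` is hypothetical, and the regex-based tag stripping is only a crude stand-in for what `getBodyAsStringWithoutMarkup()` does inside Confluence.

```python
import re
from html import unescape

EMPTY_THRESHOLD = 10  # same cut-off as discussed above


def looks_empty(storage_body, threshold=EMPTY_THRESHOLD):
    """Return True when the text left after stripping markup is at most `threshold` characters.

    Assumption: removing anything that looks like a tag approximates the
    markup-free body; the real Confluence method may behave differently.
    """
    if storage_body is None:
        return True
    text = re.sub(r"<[^>]+>", "", storage_body)  # drop tags, keep the text between them
    text = unescape(text).strip()                # decode entities, trim surrounding whitespace
    return len(text) <= threshold


samples = {
    "layout only": "<ac:layout><ac:layout-section><ac:layout-cell><p> </p></ac:layout-cell></ac:layout-section></ac:layout>",
    "short stub": "<p>TODO</p>",
    "real content": "<p>Release notes for version 2.0</p>",
}
for name, body in samples.items():
    print(name, looks_empty(body))
```

The "short stub" sample is reported as empty even though it contains text, which is the kind of false positive mentioned above; raising or lowering the threshold trades such false positives against missed pages.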

 

So starting from Matthew J. Horn's solution, I made this:

## Macro title: Find Empty Pages
## Macro has a body: N
## Body processing: n/a
## Output: HTML
##
## Original by: Matthew J. Horn : https://community.atlassian.com/t5/Confluence-questions/Find-and-identify-all-empty-pages-in-Confluence/qaq-p/131649
## Updated by: Loïc Dewerchin
## Date created: 07/31/2013
## @noparams

#set ($pageListArray = [])
#set ($spaceHome = $space.getHomePage())

#macro ( process $rp )
  #set ($pagelist = $rp.getSortedChildren() )  ## returns List<Page>
  #foreach( $child in $pagelist )
    #set($p = $pageListArray.add( $child ) )
    #if( $child.hasChildren() )
      #process ( $child )
    #end
  #end
#end

#process ( $spaceHome )

<ac:macro ac:name="note">
<ac:rich-text-body>
    <p>Add a warning about possible false positives</p>
  </ac:rich-text-body>
</ac:macro>

<table class="confluenceTable">
 <tbody>
 <tr>
  <th class="confluenceTh">Page</th>
  <th class="confluenceTh">Author</th>
  <th class="confluenceTh">Creation date</th>
  <th class="confluenceTh">Update date</th>
 </tr>

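 #foreach( $child in $pageListArray)   ## child is of type Page
   ## Treat a page as empty when its markup-free body is at most 10 characters long
   #if( $child.getBodyAsStringWithoutMarkup().length() <= 10 )
     <tr>
       ## Prefix the context path so the link also works when Confluence is not deployed at the server root
       <td class="confluenceTd"><a href="${req.contextPath}$child.getUrlPath()">$child.getTitle()</a></td>
       <td class="confluenceTd">$child.getCreatorName()</td>
       <td class="confluenceTd">$child.getCreationDate()</td>
       <td class="confluenceTd">$child.getLastModificationDate()</td>
     </tr>
   #end
 #end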

</tbody>
</table>

 

This user macro shows only the empty pages and provides a link to each page plus some additional info.

Great Macro...

Is it possible to limit the search by labels?

Hi Stefan

Did you try a simple select on the database?

"select contentid from bodycontent where body is NULL" will show you all contentids that don't have a body.

"select * from content where contentid = XYZ" should then list some more information about each of those pages.

Sure, you can combine those SQL queries with joins or subselects, but that's something I'm not into :-)

Kind regards
André

EDIT:

Hmm, Confluence is tricky...
The body column is a CLOB and can't be combined out of the box...
I searched around, did some trial and error, and found:

SQL: select contentid from bodycontent where to_char(substr(body,0,100)) is NULL;

that should list all pages/contentid's where the first 100 chars are NULL :-)
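
For anyone who does want the join mentioned above, here is a rough sketch of its shape, run from Python against a toy in-memory SQLite database so it can be tried without touching a real instance. The table names `content` and `bodycontent` and the columns `contentid` and `body` come from the queries above; the `title` and `contenttype` columns, the sample rows, and the function name `find_empty_pages` are assumptions made for illustration. A real Confluence database would also need the CLOB handling shown above and some care around page versions.

```python
import sqlite3


def find_empty_pages(conn):
    """Return (contentid, title) for pages whose body is missing, NULL or blank."""
    query = """
        SELECT c.contentid, c.title
        FROM content c
        LEFT JOIN bodycontent b ON b.contentid = c.contentid
        WHERE c.contenttype = 'PAGE'
          AND (b.body IS NULL OR TRIM(b.body) = '')
        ORDER BY c.contentid
    """
    return conn.execute(query).fetchall()


# Toy stand-in for the two tables; not the real Confluence schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE content (contentid INTEGER PRIMARY KEY, title TEXT, contenttype TEXT);
    CREATE TABLE bodycontent (contentid INTEGER, body TEXT);
    INSERT INTO content VALUES
        (1, 'Home', 'PAGE'),
        (2, 'Stub', 'PAGE'),
        (3, 'Blank', 'PAGE'),
        (4, 'Note', 'COMMENT'),
        (5, 'Orphan', 'PAGE');
    INSERT INTO bodycontent VALUES
        (1, '<p>Welcome</p>'),
        (2, NULL),
        (3, '   '),
        (4, NULL);
""")

empty_pages = find_empty_pages(conn)
for contentid, title in empty_pages:
    print(contentid, title)
```

The LEFT JOIN keeps pages that have no row in `bodycontent` at all, so they are reported alongside pages whose body is NULL or only whitespace.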

Hi Andre, thank you for your reply. I forgot to say that I do not have database access at the moment. This could be a solution anyway, but I would prefer one integrated into the advanced page for a given space, for example. This does not seem to exist yet?

I have modified the code, and this gets me a list of all the spaces with the names of their empty pages.

## Macro title: Find Empty Pages
## Macro has a body: N
## Body processing: n/a
## Output: HTML
##
## Developed by: Matthew J. Horn
## Date created: 07/31/2013
## @noparams
## Modified by: Pranjal Shukla on 13/1/2016

## Recursively collect all descendants of $rp into $pageListArray.
## Defined once, outside the loop over spaces.
#macro ( process $rp )
  #set ($pagelist = $rp.getSortedChildren() )  ## returns List<Page>
  #foreach( $child in $pagelist )
    #set($p = $pageListArray.add( $child ) )
    #if( $child.hasChildren() )
      #process ( $child )
    #end
  #end
#end

#set ($spaces = $spaceManager.getAllSpaces())
#foreach( $space in $spaces )
  #set ($spaceHome = $space.getHomePage())
  #set ($pageListArray = [])   ## reset the list for each space

  #process ( $spaceHome )

  <h1>$space.getName()</h1>
  <table class="confluenceTable">
   <tbody>
   <tr>
    <th class="confluenceTh">Title</th>
    <th class="confluenceTh">Size</th>
   </tr>

   #foreach( $child in $pageListArray)   ## child is of type Page
     ## Only list pages whose markup-free body is completely empty
     #if( $child.getBodyAsStringWithoutMarkup().length() == 0 )
       <tr>
         <td class="confluenceTd">$child.getTitle()</td>
         <td class="confluenceTd">$child.getBodyAsStringWithoutMarkup().length()</td>
       </tr>
     #end
   #end

  </tbody>
  </table>
#end

I'm admittedly rather late to the party, but this would be a good use case for ScriptRunner for Confluence's Search Extractors.

A custom search extractor with the following code:

import com.atlassian.confluence.pages.Page
import org.apache.lucene.document.Field
import org.apache.lucene.document.StringField

if (searchable instanceof Page) {
    Page page = searchable as Page
    if (page.bodyAsStringWithoutMarkup.isEmpty() || page.bodyAsStringWithoutMarkup.isAllWhitespace()) {
        document.add(new StringField("empty", "true", Field.Store.YES))
    }
}

Will find all pages where the body is either empty or all whitespace. Of course, you can tweak the above script to match your own ideas about what constitutes an "empty" page.

You'll need to rebuild Confluence's indexes afterward, but then a simple Confluence search for empty : true should find any empty pages.
