Find orphaned pages with CLI

Does anybody know how to find orphaned pages with the Confluence CLI?

1 answer

Using the Confluence CLI and the "getPageList" action, it appears that a page is orphaned when its "Parent id" is "0".

./confluence.sh --server <url> --user <user> --password <password> \
  --connectionTimeout 0 --action getPageList --space <space> \
  --ancestors --outputFormat "2"

"--outputFormat 2" renders the output in CSV format.  Some post-CLI processing is needed to find the rows (pages) where the third CSV value, "Parent id", is "0".

Be aware that for spaces with a large number of pages, the CLI command failed with "Client error: java.net.ConnectException: Operation timed out".

Output with a "parent" and "child" page (output edited, removing URL information):

./confluence.sh --server <url> --user <user> --password <password> \
  --connectionTimeout 0 --action getPageList --space <space> \
  --ancestors --outputFormat "2"
5 pages in list
"Title","Id","Parent id","Author","Created","Modifier","Modified","Version","Url"
"Child","104007125","104007123" [truncated]
"Getting started","104007120","104007119"[truncated]
"JSANDTEST","104007119","0" [truncated]
"Making a template","104007121","104007120"[truncated]
"Parent","104007123","104007119"[truncated]

Output after "parent" page was deleted (output edited, removing URL information):

./confluence.sh --server <url> --user <user> --password <password> \
  --connectionTimeout 0 --action getPageList --space <space> \
  --ancestors --outputFormat "2"
4 pages in list
"Title","Id","Parent id","Author","Created","Modifier","Modified","Version","Url"
"Child","104007125","0"[truncated]
"Getting started","104007120","104007119"[truncated]
"JSANDTEST","104007119","0"[truncated]
"Making a template","104007121","104007120"[truncated]

Note how the "Parent id" for the "Child" page went from "104007123" to "0".

Here's the CLI command with the "post processing":

./confluence.sh --server <url> --user <user> --password <password> \
  --connectionTimeout 0 --action getPageList --space <space> \
  --ancestors --outputFormat "2" | awk -F"\",\"" '{print $2":"$3}'
Id:Parent id
104007138:104007119
104007125:0
104007120:104007119
104007119:0
104007121:104007120
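
The pipeline above prints every Id and Parent id pair. As a rough sketch, the awk step could be tightened so that only rows with a Parent id of 0 are printed. The function name orphan_candidates is hypothetical, and the sketch assumes the CSV layout shown in the output above: double-quoted fields, "Parent id" in third position, and no "," sequence inside a page title.

```shell
# Hypothetical helper: reads getPageList CSV (--outputFormat 2) on stdin and
# prints "Id:Title" for every row whose third field (Parent id) is 0.
# Assumes quoted fields and titles that never contain the sequence ",".
orphan_candidates() {
  awk -F'","' '
    NF >= 3 {                     # skips the "N pages in list" summary line
      parent = $3
      sub(/".*$/, "", parent)     # drop a trailing quote if Parent id is the last field
      if (parent == "0") {        # the header row has "Parent id" here, so it is skipped
        title = $1
        sub(/^"/, "", title)      # strip the opening quote left on the first field
        print $2 ":" title
      }
    }'
}
```

It would be used by piping the CLI output into it, for example by replacing the awk command above with orphan_candidates. The result still includes the space's Home page, which also has a Parent id of 0.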

Some further refinement is needed so that you don't remove the space's Home page, which also has a Parent id of 0.  You would have to run the CLI with the "getSpace" action to discover the page ID of the Home page.
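
As a minimal sketch of that refinement, the "Id:Parent id" pairs can be filtered once more to drop the Home page. The function name orphans_from_pairs is hypothetical, and the Home page ID is assumed to have been looked up separately (for example from the "getSpace" output) and passed in by hand as the first argument.

```shell
# Hypothetical helper: reads "Id:Parent id" lines on stdin (the format printed
# by the awk post-processing step) and prints the Ids whose parent is 0,
# leaving out the Home page ID given as the first argument.
orphans_from_pairs() {
  awk -F: -v home="$1" '
    # Appending "" forces a string comparison of the two IDs.
    $2 == "0" && ($1 "") != (home "") { print $1 }'
}
```

With the sample output above, piping the pairs into orphans_from_pairs 104007119 would leave only 104007125, the page whose parent was deleted.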

Net:  there ought to be a CLI feature that returns a list of orphaned pages, but I hope the above helps.
