Find orphaned pages with CLI

Does anybody know how to find orphaned pages with the Confluence CLI?

1 answer

Using the Confluence CLI's "getPageList" action, it appears you can identify orphaned pages by checking whether a page's "Parent id" is "0".

./confluence.sh --server <url> \
 --user <user> --password <password> --connectionTimeout 0 \
 --action getPageList --space <space> --ancestors --outputFormat "2"

"--outputFormat 2" renders the output in CSV format.  Some post-CLI processing is needed to find the rows (pages) whose third CSV value ("Parent id") is "0".

FYI: for spaces with a large number of pages, the CLI command failed with "Client error: java.net.ConnectException: Operation timed out".
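
The post-CLI processing mentioned above could be sketched in Python roughly as follows. This is only an illustration and not part of the CLI: the function name find_parentless_pages is hypothetical, and it assumes the output consists of an "N pages in list" line followed by a CSV header starting with "Title","Id","Parent id". The sample data keeps only the first three columns, since the listings in this answer are truncated.

```python
import csv
import io
from typing import Dict, List


def find_parentless_pages(cli_output: str) -> List[Dict[str, str]]:
    """Return the CSV rows whose "Parent id" column is "0"."""
    lines = cli_output.splitlines()
    # Skip everything before the CSV header, such as the "4 pages in list" line.
    start = next(
        (i for i, line in enumerate(lines) if line.startswith('"Title"')), None
    )
    if start is None:
        return []
    reader = csv.DictReader(io.StringIO("\n".join(lines[start:])))
    return [row for row in reader if row.get("Parent id") == "0"]


# Sample in the shape of the CLI's CSV output, first three columns only (assumption).
SAMPLE = """4 pages in list
"Title","Id","Parent id"
"Child","104007125","0"
"Getting started","104007120","104007119"
"JSANDTEST","104007119","0"
"Making a template","104007121","104007120"
"""

if __name__ == "__main__":
    for row in find_parentless_pages(SAMPLE):
        print(row["Id"], row["Title"])
```

Using the csv module rather than splitting on commas keeps titles that contain commas or quotes intact.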

Output with a "parent" and "child" page (output edited, removing URL information):

./confluence.sh --server <url> \
 --user <user> --password <password> --connectionTimeout 0 \
 --action getPageList --space <space> --ancestors --outputFormat "2"
5 pages in list
"Title","Id","Parent id","Author","Created","Modifier","Modified","Version","Url"
"Child","104007125","104007123"[truncated]
"Getting started","104007120","104007119"[truncated]
"JSANDTEST","104007119","0"[truncated]
"Making a template","104007121","104007120"[truncated]
"Parent","104007123","104007119"[truncated]

Output after "parent" page was deleted (output edited, removing URL information):

./confluence.sh --server <url> \
 --user <user> --password <password> --connectionTimeout 0 \
 --action getPageList --space <space> --ancestors --outputFormat "2"
4 pages in list
"Title","Id","Parent id","Author","Created","Modifier","Modified","Version","Url"
"Child","104007125","0"[truncated]
"Getting started","104007120","104007119"[truncated]
"JSANDTEST","104007119","0"[truncated]
"Making a template","104007121","104007120"[truncated]

Note how the "Parent Id" for the "Child" page went from "104007123" to "0".

Here's the CLI command with the "post processing":

./confluence.sh --server <url> \
 --user <user> --password <password> --connectionTimeout 0 \
 --action getPageList --space <space> --ancestors --outputFormat "2" |
 awk -F"\",\"" '{print $2":"$3}'
Id:Parent id
104007138:104007119
104007125:0
104007120:104007119
104007119:0
104007121:104007120

Some further refinement is needed, because the space's Home page also has a Parent Id of "0" and you would not want to remove it.  You would have to run the CLI with the "getSpace" action to discover the page ID of the Home page and exclude it from the list.
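
One way to apply that exclusion to the "Id:Parent id" pairs printed by the awk step is sketched here in Python. The function name orphan_candidates and the home_page_id parameter are hypothetical; the Home page ID is assumed to have been looked up separately (for example with the "getSpace" action). The sample also assumes that 104007119 is the Home page, which the listing suggests but does not confirm.

```python
from typing import List


def orphan_candidates(pairs_output: str, home_page_id: str) -> List[str]:
    """Return page IDs whose parent ID is 0, leaving out the Home page."""
    orphans = []
    for line in pairs_output.splitlines():
        page_id, sep, parent_id = line.strip().partition(":")
        # Skip the "Id:Parent id" header and any line that is not a numeric pair.
        if not (sep and page_id.isdigit() and parent_id.isdigit()):
            continue
        if parent_id == "0" and page_id != home_page_id:
            orphans.append(page_id)
    return orphans


# Pairs as printed by the awk post-processing step in this answer.
SAMPLE_PAIRS = """Id:Parent id
104007138:104007119
104007125:0
104007120:104007119
104007119:0
104007121:104007120
"""

if __name__ == "__main__":
    # Assumption: 104007119 is the Home page ID of the space.
    print(orphan_candidates(SAMPLE_PAIRS, "104007119"))
```

Passing the Home page ID in as a parameter keeps the filter independent of how that ID is obtained.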

Net:  there ought to be a CLI feature that returns a list of orphaned pages, but I hope the above helps.
