Find orphaned pages with CLI

Does anybody know how to find orphaned pages with the Confluence CLI?

1 answer

With the Confluence CLI "getPageList" action, it appears you can identify orphaned pages by checking whether a page's "Parent Id" is "0".

./confluence.sh --server <url> --user <user> --password <password> \
  --connectionTimeout 0 --action getPageList --space <space> \
  --ancestors --outputFormat "2"

"--outputFormat 2" renders the output as CSV.  Some post-CLI processing is needed to pick out the rows (pages) whose third CSV value, "Parent id", is "0".

Be aware that for spaces with a large number of pages, the CLI command failed with "Client error: java.net.ConnectException: Operation timed out".

Output with a "parent" and "child" page (output edited, removing URL information):

./confluence.sh --server <url> --user <user> --password <password> \
  --connectionTimeout 0 --action getPageList --space <space> \
  --ancestors --outputFormat "2"
5 pages in list
"Title","Id","Parent id","Author","Created","Modifier","Modified","Version","Url"
"Child","104007125","104007123" [truncated]
"Getting started","104007120","104007119"[truncated]
"JSANDTEST","104007119","0" [truncated]
"Making a template","104007121","104007120"[truncated]
"Parent","104007123","104007119"[truncated]

Output after "parent" page was deleted (output edited, removing URL information):

./confluence.sh --server <url> --user <user> --password <password> \
  --connectionTimeout 0 --action getPageList --space <space> \
  --ancestors --outputFormat "2"
4 pages in list
"Title","Id","Parent id","Author","Created","Modifier","Modified","Version","Url"
"Child","104007125","0"[truncated]
"Getting started","104007120","104007119"[truncated]
"JSANDTEST","104007119","0"[truncated]
"Making a template","104007121","104007120"[truncated]

Note how the "Parent Id" for the "Child" page went from "104007123" to "0".

Here's the CLI command with the "post processing":

./confluence.sh --server <url> --user <user> --password <password> \
  --connectionTimeout 0 --action getPageList --space <space> \
  --ancestors --outputFormat "2" | awk -F"\",\"" '{print $2":"$3}'
Id:Parent id
104007138:104007119
104007125:0
104007120:104007119
104007119:0
104007121:104007120
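
Taking the same awk split one step further, here is a minimal sketch that keeps only the rows whose parent ID is 0. It assumes the column order matches the header shown earlier ("Title","Id","Parent id",...), that every data row has more than three columns, and that no title contains the sequence `","`. The function name `orphan_candidates` is only illustrative and is not part of the CLI.

```shell
#!/bin/sh
# Print "Id:Title" for every CSV row whose third column (Parent id) is 0.
# Reads the files given as arguments, or standard input when none are given.
orphan_candidates() {
  awk -F'","' '
    NF < 3     { next }        # skip non-CSV lines such as "5 pages in list"
    $2 == "Id" { next }        # skip the header row
    $3 == "0" {
      title = $1
      sub(/^"/, "", title)     # drop the opening quote left on the first field
      print $2 ":" title
    }
  ' "$@"
}
```

It could be fed either from a saved CSV file or directly from the pipe, for example by appending `| orphan_candidates` to the getPageList command in place of the awk call above. The result still includes the space's top-level page, since its parent ID is also 0.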

Further refinement is needed to identify the space's "Home" page, which also has a Parent Id of 0, so that you don't remove it by mistake.  You would have to run the CLI with the "getSpace" action to discover the page ID of the Home page.
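
Once the Home page ID is known, excluding it is a small filter over the "Id:Parent id" lines. The sketch below assumes the ID is passed in by hand as the first argument. It does not parse "getSpace" output, because I have not checked that action's output format. The function name `orphans_excluding_home` is hypothetical.

```shell
#!/bin/sh
# Read "Id:Parent id" lines on standard input and print the IDs of pages
# whose parent is 0, leaving out the space home page.
# $1 is the home page ID, looked up separately (for example with getSpace).
orphans_excluding_home() {
  awk -F: -v home="$1" '
    $1 == "Id"              { next }      # skip the "Id:Parent id" header line
    $2 == "0" && $1 != home { print $1 }  # parent 0 and not the home page
  '
}
```

For example, if the Home page ID turned out to be 104007119, piping the "Id:Parent id" output shown above into `orphans_excluding_home 104007119` should leave only 104007125.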

Net:  there ought to be a CLI feature that returns a list of orphaned pages, but I hope the above helps.
