I found a nice Python program, written by a Confluence tech writer, that checks for broken links in Confluence pages.
I've gotten it working and it's finding broken links, but as far as I can tell it won't find any links that are inside an HTML macro, which represents most of the links I'm interested in checking.
After examining the program, I think this makes sense: it seems to rely on Confluence's internal "broken red link" errors, which as far as I can tell are only issued for regular wiki markup links. In other words, if Confluence doesn't report any errors for links inside an HTML macro, there's nothing for the program to find.
Does anyone have any ideas about how to validate links inside HTML macros, where each link contains the full URL of the target Confluence page? The pages can't be accessed anonymously; they require a user name and password. Here is an example of what our links look like (shortened to show just one link inside an HTML macro):
<ac:structured-macro ac:name="html"> <ul><li><a href=http://confluence_url.com:18090/name/name>link label text</a></li></ul> <ac:plain-text-body></ac:structured-macro>
These are links to other Confluence pages that are in the same space. To make them work in the HTML macro, we have them as full http paths, which works well. I just want to spot the ones that are broken links.
If I put a regular Confluence link on the same page, but outside the HTML macro, one like this:
[link to a page]
with no destination, then the Python program reports it as a broken link. However, it won't spot any broken links inside the HTML macro section.
Thanks,
It looks like the above script uses the API to figure out which links are broken. Since the links in the HTML macro are not created through built-in Confluence link functionality, they will not be caught. Instead, you will need something that can crawl your site and check each link.
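As a rough illustration of the first half of that crawling idea, here is a minimal sketch that pulls every `href` out of a page's HTML using only the Python standard library. The names `LinkCollector` and `extract_links` are made up for this example, and it assumes you already have the page's HTML as a string; how you fetch it is left open here.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(page_html):
    """Return all <a href> values found in page_html, in document order."""
    parser = LinkCollector()
    parser.feed(page_html)
    parser.close()
    return parser.links
```

The standard-library parser is fairly tolerant, so unquoted attributes like the `href=http://...` in the question should still be picked up. Each collected URL would then need to be requested to see whether it resolves.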
Okay, thanks Davin, that's what I thought it might be. Do you have any suggestions offhand for what to use? Several restrictions: the pages need a login (no anonymous viewing), and I can't install Confluence plugins. I have space admin rights but not Confluence admin rights. If there's a plugin that works, I can submit a request to get it installed, but it would be great to have something that doesn't require a plugin. I'm looking into writing an Ant build to do it, since I already use Ant builds to publish to Confluence over WebDAV, but I'll need to research how to create that. Thanks again
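For a plugin-free option under those restrictions, a small script could request each URL with credentials and flag the ones that fail. The sketch below is only a starting point and rests on two assumptions that would need verifying against your instance: that it accepts HTTP Basic authentication, and that a missing page comes back with a 4xx status rather than a redirect to a login or "not found" page that returns 200. The function names (`basic_auth_header`, `fetch_status`, `find_broken_links`) are hypothetical.

```python
import base64
import urllib.error
import urllib.request


def basic_auth_header(username, password):
    """Build an HTTP Basic Authorization header (assumes the server accepts Basic auth)."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def fetch_status(url, headers, timeout=10):
    """Return the HTTP status code for url, or None if the server could not be reached."""
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; the code is what we want.
        return err.code
    except urllib.error.URLError:
        return None


def find_broken_links(urls, headers, fetch=fetch_status):
    """Return (url, status) pairs for links that fail; status is None when unreachable."""
    broken = []
    for url in urls:
        status = fetch(url, headers)
        # Assumption: a status of 400 or above means the link is broken.
        if status is None or status >= 400:
            broken.append((url, status))
    return broken
```

Usage would look something like `find_broken_links(urls, basic_auth_header("user", "secret"))`, where `urls` is the list of link targets taken from the HTML macro. The `fetch` parameter is there so the checking logic can be tried out with a stand-in function before pointing it at a real server.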
Unfortunately, I don't know anything off the top of my head.