Hello,
The company I work for, along with many others, has been affected by this Confluence bug (https://jira.atlassian.com/browse/CONFSERVER-55928), which causes images in Confluence pages not to appear within the page. Atlassian has released a fix in version 7.7.3 of Confluence that will prevent this from happening again, but it won't fix existing pages or identify the pages that are affected.
Does anyone know if it would be possible to write code using ScriptRunner for Confluence that would identify the pages affected by this issue, so we can quickly find them all and correct them?
Thank you,
Anja Brkljacic
Hey there Anja! :D
In short: yes, absolutely, you can do this! :)
I've reproduced the bug that you linked, and it looks like any page experiencing the bug will contain a placeholder "unknown-attachment" image in its Storage Format that looks something like this:
<p><br /></p>
<p><ac:image><ri:url ri:value="{hiddenPersonalUrl}/plugins/servlet/confluence/placeholder/unknown-attachment?locale=en_US&version=2" /></ac:image></p>
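If you want to sanity-check a few pages by hand before touching the index, here is a minimal Groovy sketch of the same idea that runs outside Confluence on plain Storage Format strings. Everything in it is an assumption for illustration: the BrokenImageCheck class and its method names are hypothetical (they are not part of ScriptRunner or Confluence), the wiki.example.com URL is made up, and the pattern assumes the placeholder always appears as an ri:url whose value contains /placeholder/unknown-attachment, which is based only on the single sample above.

```groovy
import java.util.regex.Pattern

// Hypothetical helper for illustration only; not a ScriptRunner or Confluence API.
class BrokenImageCheck {
    // Assumption: the placeholder is always an <ri:url> whose ri:value
    // contains the "/placeholder/unknown-attachment" path.
    static final Pattern PLACEHOLDER = Pattern.compile(
        '<ri:url\\s+ri:value="[^"]*/placeholder/unknown-attachment[^"]*"')

    // True when the given Storage Format text contains the placeholder image.
    static boolean hasBrokenImage(String storageFormat) {
        storageFormat != null && PLACEHOLDER.matcher(storageFormat).find()
    }

    // Returns the titles whose body contains the placeholder, in input order.
    static List<String> affectedTitles(Map<String, String> bodiesByTitle) {
        bodiesByTitle.findAll { title, body -> hasBrokenImage(body) }.collect { it.key }
    }
}

// Made-up sample data: one broken page, one healthy page, and one page
// that only mentions the bug in its text.
def sample = [
    'Broken page'        : '<p><ac:image><ri:url ri:value="https://wiki.example.com/plugins/servlet/confluence/placeholder/unknown-attachment?locale=en_US&version=2" /></ac:image></p>',
    'Healthy page'       : '<p><ac:image><ri:attachment ri:filename="diagram.png" /></ac:image></p>',
    'Page mentioning bug': '<p>See the unknown-attachment bug report.</p>',
]

println BrokenImageCheck.affectedTitles(sample)
```

Matching the whole ri:url element is stricter than a plain substring check, so a page that merely mentions "unknown-attachment" in its text is not flagged. The trade-off is that it could miss placeholders written in a form other than the sample above.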
So, one thing we can do is search your pages and their content to see whether a page contains this "unknown-attachment" image. You could technically just create a custom script that searches every single page in your instance, but something like that could take a very long time to run on a large instance, so it's probably not the best approach. Instead, an easier way to do this would be to create a simple Search Extractor. Doing this will allow you to go through each space individually (instead of every space all at once) and identify troublesome pages. I tested this locally and used a search extractor with the following code as my Inline script:
import com.atlassian.confluence.pages.Page
import org.apache.lucene.document.Field
import org.apache.lucene.document.StringField

// Only pages are checked; other content types are left untouched.
if (searchable instanceof Page) {
    Page page = searchable as Page
    // Flag the page if any of its body contents includes the placeholder image.
    def containsUnknown = page.bodyContents.any { it.body?.contains("unknown-attachment") }
    // Store the flag as "true" or "false" so it can be searched after a reindex.
    document.add(new StringField("containsUnknown", containsUnknown.toString(), Field.Store.YES))
}
Keep in mind that after creating this extractor you'll need to reindex your instance so that all of your content is appropriately flagged with the "containsUnknown" field. After indexing, you should be able to run an Advanced Search on that field and specify which space(s) you'd like to search.
Now, disclaimer, I only tested this on a very small group of test pages and in an instance that's basically empty, so your mileage may vary. But I'd give that a shot and see if it returns the problem pages that you're looking for.
Hope that helps! :D
Best,
Aidan
Thanks, Aidan! We will try this out and let you know how it worked/mark it as the answer :)
We tried this and it seems to have worked so far :) thanks, again!
Hi all, this script is now available in our script library for ScriptRunner for Confluence Server/DC (tested by our engineers).
Feel free to copy or customise it as you wish https://library.adaptavist.com/entity/identify-pages-with-broken-image-links