Bulk Updating Page Content in Confluence Server using ScriptRunner

UPDATE: If you're looking to update macros in pages, take a look at our Update Macro feature and Josh's sweet post about it.

One question we get fairly frequently is how to bulk update page content using ScriptRunner.

One of the complexities of doing so is dealing with the Confluence storage format, which is an XHTML-based document. Doing simple string replacements on HTML using familiar tools like regular expressions can be a bit fraught, but that doesn't mean we can't easily get things done. The convenience of the Groovy programming language and powerful libraries like jsoup combine to make these sorts of changes feasible.

As a simple example, suppose that we have several different pages in our space that each contain a table. That table has a header column with some placeholder text that we want to change.

Example Table.png

Suppose further that all of these pages are children of a known parent page. A script like the one below can be run from the Script Console to perform the update, while leaving placeholder text that's not in a table header alone.

import com.atlassian.confluence.pages.PageManager
import com.atlassian.sal.api.component.ComponentLocator
import org.jsoup.Jsoup

def pageManager = ComponentLocator.getComponent(PageManager)
// Change this line to point to the right parent page (space key, page title).
def rootPage = pageManager.getPage('DS', 'Welcome to Confluence')
rootPage.children.each { page ->
    log.debug "Inspecting page ${page.title}"
    // Parse the page's storage format so it can be queried with CSS selectors
    def body = page.bodyContent.body
    def parsedBody = Jsoup.parse(body)
    // Match only table header cells that contain the placeholder text
    def tableHeaderWithPlaceholder = parsedBody.select('th:contains(placeholdertext)')
    if (!tableHeaderWithPlaceholder.empty) {
        log.debug "Found table header with placeholder text: ${tableHeaderWithPlaceholder}"
        // Save the change as a new version of the page
        pageManager.saveNewVersion(page) { pageObject ->
            tableHeaderWithPlaceholder.html("${page.creator.fullName}'s Column")
            pageObject.setBodyAsString(parsedBody.toString())
        }
    }
}

 

That transforms the table as you might expect to something like this:

Transformed Table.png

Of course, the above script can be updated for more complex use cases, such as iterating through the results of a CQL search, changing content besides placeholders, and so on.
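Before pointing a script like this at real pages, it can help to try the selector on a small fragment outside Confluence. The following is a minimal sketch, assuming a standalone Groovy runtime where jsoup is fetched with @Grab (the version number is an arbitrary choice, and inside the Script Console the @Grab line should not be needed). The table markup and the name "Jane Doe" are made up for illustration.

```groovy
@Grab('org.jsoup:jsoup:1.15.3')
import org.jsoup.Jsoup

// Hypothetical fragment: the placeholder appears in a header cell and in a body cell
def storage = '<table><tbody><tr><th>placeholdertext</th></tr>' +
        '<tr><td>placeholdertext</td></tr></tbody></table>'

// Parse as a body fragment and keep jsoup from re-indenting the output
def doc = Jsoup.parseBodyFragment(storage)
doc.outputSettings().prettyPrint(false)

// Only <th> elements containing the text are selected; the <td> is ignored
def headers = doc.select('th:contains(placeholdertext)')
headers.html("Jane Doe's Column")

def result = doc.body().html()
println result
```

Printing the result should show the header cell rewritten while the body cell keeps its placeholder text, which is the behavior the script above relies on.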

Happy scripting!

15 comments

Comments for this post are closed

Community moderators have prevented the ability to post new comments.

Kurt Rosivatz May 22, 2020

Great article! Thank you.

For those who want to manipulate macros I have an example for changing a status macro via Jsoup:

import com.atlassian.confluence.pages.PageManager
import com.atlassian.sal.api.component.ComponentLocator
import org.jsoup.Jsoup

def pageManager = ComponentLocator.getComponent(PageManager)
def page = pageManager.getPage('scriptrunner', 'Change Status')

def soup = Jsoup.parse(page.bodyAsString)
soup.outputSettings().prettyPrint(false) // prevents extra whitespace in the macro parameters

// assuming first status macro should be changed
def status_macro = soup.selectFirst("ac|structured-macro[ac:name='status']")

// change the title
def status_macro_title = status_macro.selectFirst("ac|parameter[ac:name='title']")
status_macro_title.text("OK")

// change the colour
def status_macro_value = status_macro.selectFirst("ac|parameter[ac:name='colour']")
status_macro_value.text("Green")

pageManager.saveNewVersion(page) { newVersion ->
    newVersion.bodyAsString = soup.toString()
}
Michael Scholze May 27, 2020

Hey there, 

this is great news! I've spent weeks and weeks trying to find a marketplace app or a "non-error-prone" solution, such as manipulating space exports or using outdated add-ons, to no avail, and was already giving up on this topic as we don't really get much support on Atlassian's side either. This is a huge gap in the current app market for Confluence and I still wonder why there hasn't been a more user-friendly solution to a common issue such as this one.

With that said, the solution above looks feasible, but also risky without prior coding experience or a test instance to run it on. With that in mind, would it be possible to get this as part of ScriptRunner's included scripts some day? Being able to select a space or page tree and recursively replace content in the storage format within it would be tremendously useful to have.

Kurt Rosivatz May 27, 2020

Hi @Michael_Scholz

the ScriptRunner guys won't like this answer (sorry), but Bob Swift's Atlassian CLI for Confluence has that functionality built-in. If you have no coding experience a tool like CLI is easier to handle.

You can issue a command like

acli -a runFromPageList --labels toModify --common "-a modifyPage --findReplace this:that"

This will replace all 'this' with 'that' in all pages labelled 'toModify'.

The problem with CLI is (that's what the ScriptRunner guys want to hear) that you replace the "coding nightmare" with the "quoting nightmare" (see this old post on a CLI problem).

I would love to see ScriptRunner going in a direction where they offer more abstraction of the Confluence API. Something like the above CLI command in a kind of Groovy domain specific language:

Confluence.eachPageWithLabels('toModify') { it.replace('this', 'that') }

or at least opening some of their "canned scripts" for use in our own scripts.

Regards,

Kurt

Michael Kuhl _Appfire_
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
May 27, 2020

Hi @Kurt Rosivatz and @Michael Scholze  - You can sometimes avoid the "quoting nightmare" by using the --special parameter.  This lets you specify alternate characters for the default delimiters the CLI uses.  I use this often to specify an alternate to ":" for example.

The CLI support team is happy to help and quick to respond if needed.

FYI - I'm the product manager for the Bob Swift CLI product line.

Michael Scholze May 27, 2020

@Michael Kuhl  / @Kurt Rosivatz  Thank you both for the in-depth explanations. Indeed I had CLI and Scriptrunner on the table before, but wondered where the intersections between both apps are.

Both solutions come down to having a working Confluence test instance running, so that's something to discuss with our IT in future. I'll bookmark this helpful conversation until then. :)

Jonny Carter
Rising Star
July 6, 2020

A bit late getting back to the party, but shout-out to y'all for a great conversation!

@Michael Scholze - the short answer is, "Yup, we've already got a feature on the backlog." See https://productsupport.adaptavist.com/browse/SRCONF-1238. Definitely watch/vote for that issue if you'd like to see it. If there are additional use cases you'd like to see, hit up our support portal to request them as features!

Jonny Carter
Rising Star
July 6, 2020

To reply to Kurt & Michael, no offense taken. :)

To say a bit more about one of the suggestions:

> I would love to see ScriptRunner going in a direction where they offer more abstraction of the Confluence API. Something like the above CLI command in a kind of Groovy domain specific language

This touches a topic that we talk about a lot internally. There are a few courses we can take when trying to make ScriptRunner more accessible:

  1. Provide DSLs, convenience APIs, and so on (as you described)
  2. Just make another built-in script
  3. Provide example code in the documentation & the script library
  4. Provide better tooling (like the in-browser code editor) to make writing code a bit less intimidating

Historically, we've tended to focus on those last three. DSLs & APIs are attractive for a lot of reasons, and they can scratch itches that the other paths can't quite reach. Our intuition has been that for the vast majority of people who find code intimidating, a DSL won't lower the barrier to entry as well as a built-in script, while the potential maintenance and support problems will be considerably higher. Still, it's good to see your interest, and I'm definitely going to be plugging this comment thread with our product owner.

All that said, I would love to see you hit up our support portal with a feature request describing a few use cases like the one above.

Dominic Lagger
Rising Star
July 17, 2020

This is just awesome! 
Thanks a lot! 

Michael Scholze July 23, 2020

Hello @Jonny Carter 

I just tried your script on a test space.

On standalone pages it runs fine, but on pages with "Metadata for Confluence" used in conjunction with a Page Properties box it seems to severely destroy the layout after the batch processing.

Green = Target to be renamed, also shown in editing view

Red = The "th:contains(Massnahmen-Name)" part of your script seems to severely interfere with the Metadata macro content.

 

 Regards, 

Michael

23_07_2020_2419.png
23_07_2020_2420.png
23_07_2020_2421.png

Kurt Rosivatz July 23, 2020

Hi @Michael Scholze ,

the script manipulates the source code of the page. To diagnose what's going wrong, it would be helpful to compare the page source before and after the script has run. I recommend installing the https://marketplace.atlassian.com/apps/1210722/confluence-source-editor (free and from Atlassian, but unsupported) if you have not already done so.

One possible source of the problem could be the pretty printing behaviour of Jsoup, so check that your script contains

parsedBody.outputSettings().prettyPrint(false)

 HTH

Kurt
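To make the pretty-printing point concrete, here is a minimal sketch that serializes the same fragment with jsoup's default settings and with pretty printing switched off. It assumes a standalone Groovy runtime with jsoup fetched via @Grab (an arbitrary version, and not needed in the Script Console), and the one-cell table is a made-up example. Whether this is the cause of the layout problem described above is not confirmed by the thread.

```groovy
@Grab('org.jsoup:jsoup:1.15.3')
import org.jsoup.Jsoup

// Hypothetical one-cell table in storage format
def storage = '<table><tbody><tr><th>Name</th></tr></tbody></table>'

// Default settings: jsoup re-indents block elements when serializing
def prettyHtml = Jsoup.parseBodyFragment(storage).body().html()

// Pretty printing disabled: the markup is written out without added whitespace
def compactDoc = Jsoup.parseBodyFragment(storage)
compactDoc.outputSettings().prettyPrint(false)
def compactHtml = compactDoc.body().html()

println prettyHtml
println compactHtml
```

The default output contains line breaks and indentation that were not in the input. Comparing the two strings is a quick way to see how much whitespace a round trip through jsoup adds to a page body.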

Michael Scholze July 27, 2020

Hello @Kurt Rosivatz - Thanks for getting back so quickly. We only had 30 pages to update so I checked them manually in this case as the matter was pressing. I'm not a coder so I abstained from making changes on this script (especially since we don't have a test environment).

At least I now know that the routine works for plain text in the header. A table that contains macros or metadata might require more testing.

Overall this is a VERY useful script to have and it would be really handy to have something in the long run that covers find-replace in a more detailed fashion.

Expanded Use Case ideas from my side could be (in priority order) 

  1. Specify where to replace (in table headers, table body, content, macros)
  2. Define multiple search and replace-pairs beforehand
  3. Support for Regex Find&Replace (powerful with the previous case)
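The first two ideas in that list can be roughed out with jsoup alone. The following is only a sketch under stated assumptions: replaceIn is a hypothetical helper, not a ScriptRunner or Confluence API, it does literal (non-regex) replacement, and it runs in a standalone Groovy runtime with jsoup fetched via @Grab (an arbitrary version, not needed in the Script Console). The sample markup is invented.

```groovy
@Grab('org.jsoup:jsoup:1.15.3')
import org.jsoup.Jsoup

// Hypothetical helper: applies literal find/replace pairs to the text found
// inside every element matched by the CSS selector, leaving other text alone.
String replaceIn(String storage, String selector, Map<String, String> pairs) {
    def doc = Jsoup.parseBodyFragment(storage)
    doc.outputSettings().prettyPrint(false)
    doc.select(selector).each { element ->
        // Visit the matched element and its descendants; assumes matches are not nested
        element.allElements.each { child ->
            child.textNodes().each { node ->
                def text = node.wholeText
                pairs.each { find, replacement -> text = text.replace(find, replacement) }
                node.text(text)
            }
        }
    }
    doc.body().html()
}

def storage = '<table><tbody><tr><th>Owner TBD</th></tr>' +
        '<tr><td>Owner TBD</td></tr></tbody></table>'
def updated = replaceIn(storage, 'th', ['Owner': 'Lead', 'TBD': 'unassigned'])
println updated
```

Changing the selector (for example to 'td' or 'p') changes where replacements apply. Swapping text.replace for text.replaceAll would be one way to approach the regex idea, with the usual care around escaping.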

Something more sophisticated like this could easily justify its own UI or app for Confluence Space tools, since it's a real gap in the Atlassian market.

Regards, 
Michael

Stephen Letch December 9, 2020

Since there are some big wigs in this thread, I'd just like to add that I wish add-on makers such as Adaptavist would make far more obvious functions easier. I've spent the last 2 hours trying to find a way or an example script that will allow me to bulk move pages from one space to another based on a label. You'd think something like this would be a built-in script :)

Jonny Carter
Rising Star
January 13, 2021

Bigwigs? Where!?! ;)

Joking aside, @Stephen Letch , that's a valid point. While that's technically possible with a CQL Script Job, it could be made considerably easier. Thanks for the suggestion!

Peter-Dave Sheehan
Community Leader
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
March 19, 2021

@Jonny Carter do you have any thoughts on how to deal with shared drafts that might not have been published when doing this sort of bulk update? 

The draft version will not have these updates. So the next time someone opens the page for editing and they publish, the changes introduced with this method will be lost.

I'm guessing it would involve deleteDraft(ContentId) from ContentDraftServiceImpl or SharedContentDraftServiceImpl

But I haven't found a way to use either.

Jonny Carter
Rising Star
March 25, 2021

@Peter-Dave Sheehan - I reckon you're on the right track with the ContentDraftService. I imagine

import com.atlassian.confluence.api.service.content.ContentDraftService
import com.atlassian.sal.api.component.ComponentLocator

def draftService = ComponentLocator.getComponent(ContentDraftService)

would get you the service, and the publishNewDraft method would let you create a new draft with the content you wanted. I've not tested that, but if you want to start a new post around that specific case, feel free to @ me. :)

As an update to @Stephen Letch, @Kurt Rosivatz, and anyone else watching this thread, we've made a first step in the 6.22.0 release to make updating macros easier. Check out the Update Macro built-in script. It only supports macro parameters at the moment, but we hope to expand the means of manipulating macros in the future.

