
Extracting content from Confluence Page

Patrick December 3, 2023

Hi,

I am looking to extract information from a specific section within a Confluence page, but I am having trouble working out which smart values to use.

For example, I have an automation that triggers, and when it does, a Slack message is sent with the page title using `{{page.title}}`. I have another section/heading within the page called Minutes, and I was wondering if there's a way to extract its content as well.

I created an excerpt and added the content in there, but I'm not sure how to query that information or if there's another way to do so.

Any assistance would be greatly appreciated!

Answer accepted
Darryl Lee
Community Leader
December 3, 2023

Hi @Patrick - Unfortunately Atlassian has not seen fit to give us access to the content of Confluence pages in Automation.

You could use Automation's Send web request action to call the Confluence REST API, which would return the content of the page as a smart value.

You could then use the various smart value text functions to search for your section and grab the content in it.

The specific endpoint you'd want to hit is something like:

https://your-domain.atlassian.net/wiki/rest/api/content/3965072?expand=body.storage
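For anyone who wants to try the same call outside of Automation first, here is a rough Python sketch of the request. It only builds the request object and doesn't send anything. The helper name `build_page_request`, the example email, and the token are made up for illustration, and Basic auth with an account email plus API token is an assumption about how your site is set up.

```python
import base64
import urllib.request


def build_page_request(domain: str, page_id: str, email: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a page's storage-format body."""
    url = f"https://{domain}.atlassian.net/wiki/rest/api/content/{page_id}?expand=body.storage"
    # Assumption: the site accepts Basic auth with an account email and API token.
    credentials = base64.b64encode(f"{email}:{api_token}".encode()).decode()
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {credentials}", "Accept": "application/json"},
    )


# Hypothetical values, matching the example endpoint above.
request = build_page_request("your-domain", "3965072", "me@example.com", "hypothetical-token")
print(request.full_url)
# To actually send it: urllib.request.urlopen(request).read()
```

In the Send web request action, the same URL goes in the URL field and the Authorization header goes in the Headers section.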

But yeah, it's a wee bit of work. There's a good tutorial on this here:

It says it's for Jira, but it should work for Confluence as well.

The page data ought to be accessible via this smart value:

{{webResponse.body.storage.value}}

So you could find the content by looking for the excerpt macro, like this:

<ac:structured-macro ac:name=\"excerpt\" 

Or looking for text between Headings: 

<ac:rich-text-body><h1>Minutes</h1><p>Here are the minutes</p></ac:rich-text-body></ac:structured-macro><h1>Not Minutes</h1>
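To make the excerpt approach a bit more concrete, here is a small Python sketch that pulls the rich-text body out of an excerpt macro. The sample string is pieced together from the two fragments above; the `ac:schema-version` attribute and the variable names are assumptions for illustration, and real storage format may carry extra attributes.

```python
import re

# Sample storage-format markup, assembled from the fragments shown above.
storage = (
    '<ac:structured-macro ac:name="excerpt" ac:schema-version="1">'
    "<ac:rich-text-body><h1>Minutes</h1><p>Here are the minutes</p></ac:rich-text-body>"
    "</ac:structured-macro><h1>Not Minutes</h1>"
)

# Find the excerpt macro, then lazily capture everything inside its rich-text body.
EXCERPT_BODY = re.compile(
    r'<ac:structured-macro ac:name="excerpt".*?<ac:rich-text-body>(.*?)</ac:rich-text-body>',
    re.DOTALL,
)

match = EXCERPT_BODY.search(storage)
excerpt_html = match.group(1) if match else None
print(excerpt_html)
```

This would likely break on nested macros inside the excerpt, so treat it as a starting point only.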

But yeah, it's a little tricky. If you're up for the challenge, give it a shot, and let us know if you run into any problems. :-}

Patrick December 4, 2023

ANOTHER UPDATE:

I feel like I'm back in my software engineering days, having stared at a JSON file for so long. It turns out the value is nested inside another body. So I finally got my value, but I had to use {{webResponse.body.body.storage.value}}.

Now the only thing I need to do is format it because the message is being sent as HTML text.
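For anyone else who hits the double-body issue, this Python sketch shows roughly why it happens: the response payload (what `{{webResponse.body}}` gives you) is a JSON object that has its own `body` key. The JSON below is a trimmed, hypothetical response; the title is made up and a real response has many more fields.

```python
import json

# Trimmed, hypothetical payload for GET .../content/3965072?expand=body.storage
web_response_body = json.loads("""
{
  "id": "3965072",
  "type": "page",
  "title": "Team meeting",
  "body": {
    "storage": {
      "value": "<h1>Minutes</h1><p>Here are the minutes</p>",
      "representation": "storage"
    }
  }
}
""")

# Equivalent of {{webResponse.body.body.storage.value}}
storage_value = web_response_body["body"]["storage"]["value"]

# Equivalent of {{webResponse.body.storage}}: no such key at the top level
top_level_storage = web_response_body.get("storage")

print(storage_value)
print(top_level_storage)
```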

 

UPDATE:

So it looks like I'm now able to get what I need, but only using {{webResponse.body}}. If I go any deeper, {{webResponse.body.storage.value}} for example, the audit log just shows an empty value.

What I'm really hoping for is to get that value and use it in the Send Slack message action. I feel like I'm close!

 

Hi @Darryl Lee

Thank you so much for this! Following your instructions along with the other link you provided, I think I'm on the right track. Within {{body.storage.value}} I see the excerpt values, which are what I need. I'm a little unsure what to provide for the Custom Data portion of the automation rule.

Darryl Lee
Community Leader
December 4, 2023

Hey yeah, you don't need Custom Data unless you are making a web request to update something with the REST API. So you can leave that empty.

I was able to parse this on a page:

Minutes

These are the minutes of the meeting

Not Minutes

This is the next section, which is not minutes.

Using this code:

{{webResponse.body.body.storage.value.match(".*<h1>Minutes</h1>(.+?)<h1>.*")}}

What this means is:

"Match any characters (or none at all) before <h1>Minutes</h1>, and then capture the text that follows, up until you find another <h1>"

So, assuming the minutes of your meeting fall between a Heading 1 of "Minutes" and some other Heading 1 text, that should capture it.
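If you want to test the pattern outside of Automation, here is a quick Python sketch of the same match against the sample page above. Python's `re` module is not necessarily the same regex engine Automation uses, so treat this as an approximation; `re.DOTALL` is my addition so the pattern also copes with line breaks in the markup.

```python
import re

# Same pattern as the smart value; group 1 is the text between the two headings.
SECTION = re.compile(r".*<h1>Minutes</h1>(.+?)<h1>.*", re.DOTALL)

# Storage-format version of the sample page above.
storage_value = (
    "<h1>Minutes</h1><p>These are the minutes of the meeting</p>"
    "<h1>Not Minutes</h1><p>This is the next section, which is not minutes.</p>"
)

match = SECTION.match(storage_value)
minutes_html = match.group(1) if match else None
print(minutes_html)

# Without a following Heading 1 the pattern finds nothing.
no_following_heading = SECTION.match("<h1>Minutes</h1><p>Only section</p>")
print(no_following_heading)
```

Note the second case: if Minutes is the last Heading 1 section on the page, this pattern won't capture anything.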

BTW, unless you're using it for something else, I would skip using an Excerpt, because then you get rid of all of the "<ac:structured-macro..." stuff too.

ANYWAYS, assuming the Slack action can accept HTML, you could put the code above into the Message section of the "Send Slack message" action.

Oh dang, Slack probably can't deal with HTML, so you'll end up with paragraph tags (<p>), and possibly other formatting you don't want.

Ugh, stripping HTML can be a pain, but yeah, you could tack on some replaceAll commands, like:

{{webResponse.body.body.storage.value.match(".*<h1>Minutes</h1>(.+?)<h1>.*").replaceAll("</p>","\n").replaceAll("</*.+?>","")}}

That replaces each closing paragraph tag with a newline, and then strips out every other tag. It looks like Slack supports mrkdwn syntax, so if you wanted to get fancy, you could add these (before the final tag-stripping replaceAll) to keep some formatting:

.replaceAll("</*strong>","*")

.replaceAll("</*em>","_")

.replaceAll("</*del>","~")

.replaceAll("<li>","- ")

Yeah, that seems to have worked:

{{webResponse.body.body.storage.value.match(".*<h1>Minutes</h1>(.+?)<h1>.*").replaceAll("</p>","\n").replaceAll("</*strong>","*").replaceAll("</*em>","_").replaceAll("</*del>","~").replaceAll("<li>","- ").replaceAll("</*.+?>","")}}
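Here is the same chain as a Python sketch, in case it's easier to read or test that way. The function name `html_to_mrkdwn` and the sample input are made up for illustration, and Python's `re.sub` may not behave identically to Automation's replaceAll in every edge case.

```python
import re

# (pattern, replacement) pairs, in the same order as the replaceAll chain.
REPLACEMENTS = [
    (r"</p>", "\n"),        # closing paragraph tag becomes a newline
    (r"</*strong>", "*"),   # bold
    (r"</*em>", "_"),       # italic
    (r"</*del>", "~"),      # strikethrough
    (r"<li>", "- "),        # list items become dash bullets
    (r"</*.+?>", ""),       # strip every remaining tag; this has to run last
]


def html_to_mrkdwn(html: str) -> str:
    """Apply each replacement in order and return the resulting text."""
    text = html
    for pattern, replacement in REPLACEMENTS:
        text = re.sub(pattern, replacement, text)
    return text


sample = "<p>Hello <strong>bold</strong> and <em>it</em></p><ul><li>one</li></ul>"
result = html_to_mrkdwn(sample)
print(result)
```

The order matters: if the catch-all tag stripper ran first, there would be no strong/em/del tags left to convert.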

It's gnarly though, and parsing HTML with Regex is dangerous territory. :-}

Patrick December 5, 2023

Thank you so much for this @Darryl Lee ! 
