Confluence XML manipulation

JamieA
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
June 6, 2013

Afternoon...

Vague question coming up. I want to do stuff in a Confluence plugin, like running XPath queries on pages, inserting child nodes, replacing values, etc.

Actually I can do all that, and it works OK. My problem is that I seem to have to jump through a silly number of hoops, and clearly there are some Confluence APIs for this that I haven't found or don't know how to use.

For example, in order to use XPath I need to wrap the page's storage format in an <xml> element that contains namespace definitions for ac, ri, and at. Therefore, after manipulating it I need to unwrap it again before storing. Likewise, I have problems importing nodes from other documents: if I don't print the XML declaration then each node carries its own xmlns attributes.
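For readers following along, the wrap / query / unwrap routine described above might look roughly like the sketch below, using only the XML APIs that ship with the JDK. The class name and the `urn:example:*` namespace URIs are placeholders invented for this illustration, not Confluence's real values; because the wrapper is stripped again before storing, any URIs work as long as the XPath namespace context uses the same ones. Serializing the wrapper as a whole and cutting off its start and end tags is one way to avoid xmlns attributes being repeated on every node.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class StorageFragmentXPath {
    // ASSUMPTION: placeholder URIs for illustration only, not Confluence's real namespaces.
    static final Map<String, String> NS = new HashMap<String, String>();
    static {
        NS.put("ac", "urn:example:ac");
        NS.put("ri", "urn:example:ri");
        NS.put("at", "urn:example:at");
    }

    /** Wraps the fragment in an <xml> element that declares the prefixes, then parses it. */
    static Document parse(String storage) {
        StringBuilder wrapped = new StringBuilder("<xml");
        for (Map.Entry<String, String> e : NS.entrySet()) {
            wrapped.append(" xmlns:").append(e.getKey()).append("=\"").append(e.getValue()).append('"');
        }
        wrapped.append('>').append(storage).append("</xml>");
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(wrapped.toString())));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse wrapped fragment", e);
        }
    }

    /** Evaluates an XPath expression that may use the ac, ri and at prefixes. */
    static NodeList select(Document doc, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                String uri = NS.get(prefix);
                return uri != null ? uri : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) { return null; }
            public Iterator<String> getPrefixes(String uri) {
                return Collections.<String>emptyList().iterator();
            }
        });
        try {
            return (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expression, e);
        }
    }

    /** Serializes the whole wrapper once, then strips its start and end tags. */
    static String unwrap(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc.getDocumentElement()), new StreamResult(out));
            String xml = out.toString();
            int open = xml.indexOf('>');            // end of the wrapper's start tag
            int close = xml.lastIndexOf("</xml>");  // -1 if the wrapper was written as <xml ... />
            return close < 0 ? "" : xml.substring(open + 1, close);
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not serialize fragment", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<p>Hello</p><ac:macro ac:name=\"info\">"
                + "<ac:parameter ac:name=\"title\">Note</ac:parameter></ac:macro>");
        NodeList hits = select(doc, "//ac:macro[@ac:name='info']/ac:parameter");
        hits.item(0).setTextContent("Changed");
        System.out.println(unwrap(doc));
    }
}
```

This sketch does not handle HTML entities such as &nbsp;, which would need entity definitions in the wrapper.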

There must be a simpler way but I don't know what it is. Has anyone got a pointer to a simple, preferably self-contained class or plugin that manipulates page content using the Java 6 XML APIs?

2 answers

1 accepted

1 vote
Answer accepted
JamieA
June 12, 2013

I worked it out in the end... the key things were that Confluence uses StAX, which is more complex than DOM-based APIs, and I wasn't overly familiar with it.

The other thing was that when a page is fed to the parser it gets wrapped in an <xml> element which contains namespace mappings for several Atlassian namespaces, plus DTDs which allow you to use entities such as &nbsp;. This content is then itself wrapped in an XML reader that ignores events for the <xml> element, hence I never saw them when debugging. I was struggling to work out how a page's content could be treated as valid XML until this enlightenment.
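To make the "reader that ignores the <xml> element" idea concrete, here is a minimal sketch using plain JDK StAX. It is not Confluence's actual reader: the class name and the `urn:example:*` namespace URIs are made up for illustration, and the DTD/entity part of the wrapper is left out. It tracks element depth and simply drops the start and end events of the outermost element, so the calling code only ever sees the page's own elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

final class HiddenWrapperReader {
    /** Lists the element events a caller would see once the wrapper's own events are hidden. */
    static List<String> visibleEvents(String storage) {
        // ASSUMPTION: placeholder namespace URIs, only there so the prefixes resolve.
        String wrapped = "<xml xmlns:ac=\"urn:example:ac\" xmlns:ri=\"urn:example:ri\">"
                + storage + "</xml>";
        List<String> seen = new ArrayList<String>();
        try {
            XMLEventReader reader =
                    XMLInputFactory.newInstance().createXMLEventReader(new StringReader(wrapped));
            int depth = 0; // 0 means we are outside the wrapper element
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    // depth 0 -> 1 is the wrapper's start tag: swallow it
                    if (depth++ > 0) {
                        seen.add("start:" + label(event.asStartElement().getName()));
                    }
                } else if (event.isEndElement()) {
                    // depth 1 -> 0 is the wrapper's end tag: swallow it
                    if (--depth > 0) {
                        seen.add("end:" + label(event.asEndElement().getName()));
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read wrapped fragment", e);
        }
        return seen;
    }

    private static String label(QName name) {
        String prefix = name.getPrefix();
        return prefix == null || prefix.isEmpty()
                ? name.getLocalPart()
                : prefix + ":" + name.getLocalPart();
    }

    public static void main(String[] args) {
        System.out.println(visibleEvents("<p>Hi</p><ac:macro ac:name=\"toc\"/>"));
    }
}
```

Because the events are consumed one at a time, nothing here requires the whole page body to be held as a tree in memory.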

I used the XhtmlLinkUpdater class to work this out...

Once I got over these hurdles I was OK.

Joe Clark
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
June 12, 2013

Definitely the way to go. StAX also means you can avoid loading the entire page body into memory all in one go, which can have pretty negative performance implications on large or busy instances.

The reason the API for this isn't super awesome is that Confluence rarely manipulates its own storage format directly - the editor transforms the storage format into editor format (HTML), manipulates that, and then sends it back to storage format.

Glad you worked it out!

Eddie Stanley June 27, 2013

Hi Jamie, I'm trying to do something similar.

I'm writing a footnotes macro plugin (I know there's one already, but it doesn't meet our needs for various reasons).

The idea is that each time the footnote macro is invoked I'll capture the body text which was fed to the macro. Then the plugin will check the XHTML storage DOM for a custom element (footnote definitions), creating it if it doesn't already exist (first footnote). It will count the number of entries already in the footnote definitions element to arrive at the "next footnote number". It will add an entry (footnote number + footnote text) into the footnote definitions element. Finally, the macro will return something like "(1)" (i.e. the footnote number) as the macro text.
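The find-or-create, count, and append steps described above could be sketched with the JDK's DOM API roughly as follows. The element names `footnotes` and `footnote`, the `number` attribute, and the class itself are hypothetical, and the sketch assumes the page body has already been wrapped in a single root element as discussed earlier in the thread. Whether Confluence would accept such a custom element in its storage format is not something this sketch can answer.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class FootnoteSketch {
    /** Parses a page body wrapped in a single <xml> root element. */
    static Document parse(String body) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader("<xml>" + body + "</xml>")));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse body", e);
        }
    }

    /** Adds a footnote entry and returns the number it was given. */
    static int addFootnote(Document doc, String text) {
        Element root = doc.getDocumentElement();

        // Find the (hypothetical) definitions element, creating it for the first footnote.
        NodeList existing = root.getElementsByTagName("footnotes");
        Element definitions;
        if (existing.getLength() == 0) {
            definitions = doc.createElement("footnotes");
            root.appendChild(definitions);
        } else {
            definitions = (Element) existing.item(0);
        }

        // The next number is one more than the entries already present.
        int number = definitions.getElementsByTagName("footnote").getLength() + 1;

        Element entry = doc.createElement("footnote");
        entry.setAttribute("number", String.valueOf(number));
        entry.setTextContent(text);
        definitions.appendChild(entry);
        return number;
    }

    public static void main(String[] args) {
        Document doc = parse("<p>Some text</p>");
        System.out.println("(" + addFootnote(doc, "First note") + ")");
        System.out.println("(" + addFootnote(doc, "Second note") + ")");
    }
}
```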

Can you provide details on how you went about using StAX within your Confluence plugin to manipulate the storage?

Can you see any reasons why what I'm trying to do won't work or why it might be a really bad idea?

JamieA
June 27, 2013

Hello Eddie,

The fruits of my labours are at Confluence Script Runner, see also Installation.

You can download the jar - most of the source is in there. I hear what Joe is saying, but personally I am surprised that the overhead of using DOM is that significant, considering the extra code complexity a streaming API brings. Most Confluence pages are small in XML document terms, so I would be surprised if DOM caused much memory usage.

Anyway, I used StAX for updating page links; look at the NonWikiLinkConverter, or something like that. But there is also an example of using DOM and XPath, which might be easier for you. I'm not sure if I included an example of adding nodes with DOM; I think it's in my test code area.

> Can you see any reasons why what I'm trying to do won't work or why it might be a really bad idea?

I'm not sure I fully understand what you are trying to do... you won't want to modify the content when the macro is "invoked" I think? Unless just the first time?

1 vote
LucasA
Rising Star
June 12, 2013

Hi Jamie,

I'm not a development expert, but I believe that this document can help you: https://developer.atlassian.com/display/CONFDEV/Confluence+XML-RPC+and+SOAP+APIs.

There you'll find all the available documentation regarding Confluence remote API.

Cheers,

Lucas Lima

JamieA
June 12, 2013

thanks, but... not really what I was looking for.
