JDOM2 parsing problems on RemotePage.getContent() in the v2 SOAP API

I am trying to parse the result of RemotePage.getContent() into a JDOM2 document and then output that document again.

I have so far run into the following problems:

  1. The result of RemotePage.getContent() is the content of the XHTML <body>, but without a top-level element. I got around this by wrapping the getContent() result in a <body> tag.
  2. The next error message I got was 'The entity "oslash" was referenced, but not declared.' I got around this by slurping xhtml-lat1.ent into the string lat1Entities and prefixing "<body>" with "<!DOCTYPE body [" + lat1Entities + "]> ".
  3. The third error was a parsing error caused by the "ac" namespace prefix used for a macro (I haven't figured out a workaround for this one yet).

I also foresee a possible problem with getting XMLOutputter to output the content without the body element. A possible workaround is to iterate over all element children of the root element and output each one separately, but that is more cumbersome than it should be.

Also, I suspect that Confluence will prefer Latin-1 characters (and other special characters) written as character entities, and I don't know what XMLOutputter will do here.

Is there a simpler approach? Is there a way I can get the full XHTML for the page, including namespace and DOCTYPE declarations?

Are there more correct ways of handling the problems I have encountered than the ones I've used so far?

Accepted answer

Ok, now I have successfully round-tripped the XHTML content using the SOAP API.

First: the element to wrap around the RemotePage.getContent() result before sending it to the SAXBuilder is:

<ac:confluence
  xmlns:ac="http://www.atlassian.com/schema/confluence/4/ac/"
  xmlns:ri="http://www.atlassian.com/schema/confluence/4/ri/"
  xmlns="http://www.atlassian.com/schema/confluence/4/">
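To illustrate why this wrapper makes the content parseable, here is a minimal sketch. It uses the JDK's built-in DOM parser instead of JDOM2 so that it runs without extra dependencies. The class name WrapperSketch, the sample fragment, and the single hard-coded oslash entity are assumptions for illustration only; the real code would put the whole of xhtml-lat1.ent into the DOCTYPE.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class WrapperSketch {
    static final String AC_NS = "http://www.atlassian.com/schema/confluence/4/ac/";
    static final String RI_NS = "http://www.atlassian.com/schema/confluence/4/ri/";
    static final String DEFAULT_NS = "http://www.atlassian.com/schema/confluence/4/";

    // Wrap a storage-format fragment in a root element that declares the
    // namespaces, preceded by a DOCTYPE that declares one entity (assumption:
    // only oslash is needed for this sample).
    static String wrap(String fragment) {
        return "<!DOCTYPE ac:confluence [ <!ENTITY oslash \"&#248;\"> ]>"
                + "<ac:confluence xmlns:ac=\"" + AC_NS + "\""
                + " xmlns:ri=\"" + RI_NS + "\""
                + " xmlns=\"" + DEFAULT_NS + "\">"
                + fragment
                + "</ac:confluence>";
    }

    // Parse with namespace awareness so the ac: prefix is resolved.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical page content: an entity reference and a macro element.
        String fragment = "<p>Sm&oslash;r</p><ac:macro ac:name=\"toc\"/>";
        Document doc = parse(wrap(fragment));
        Element macro = (Element) doc.getElementsByTagNameNS(AC_NS, "macro").item(0);
        System.out.println("macro name: " + macro.getAttributeNS(AC_NS, "name"));
        System.out.println("paragraph text: "
                + doc.getDocumentElement().getFirstChild().getTextContent());
    }
}
```

Without the xmlns:ac declaration on the wrapper, a namespace-aware parser rejects the ac: prefix, and without the DOCTYPE it rejects the entity reference, which matches the two errors described in the question.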

The round-trip goes like this:

  • Get a remote page Java bean with service.getPage(securityToken, spaceKey, pageTitle)
  • Get the page content as a string using RemotePage.getContent()
  • Wrap the page content in an <ac:confluence> element with namespace declarations, and a DOCTYPE adding the necessary character entities
  • Parse the results into JDOM
  • Output the children of the root element into a string
  • Prepend an extra "<p>Hello</p>" paragraph to the string
  • Update the RemotePage bean
  • Send the updated bean to Confluence with the service.updatePage() method
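The middle steps of the round trip above (parse the wrapped content, then output only the children of the root element) can be sketched as follows. This is not the JDOM2 code from this answer: it swaps in the JDK's DOM parser and identity Transformer so that it runs standalone, and the class name ChildSerializerSketch and the sample input are assumptions.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChildSerializerSketch {
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Serialize every child node of the root, one after another, so that the
    // wrapper element itself never appears in the output.
    static String serializeChildren(Element root) throws TransformerException {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            identity.transform(new DOMSource(children.item(i)), new StreamResult(out));
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical wrapped content, without the DOCTYPE for brevity.
        String wrapped = "<ac:confluence"
                + " xmlns:ac=\"http://www.atlassian.com/schema/confluence/4/ac/\""
                + " xmlns=\"http://www.atlassian.com/schema/confluence/4/\">"
                + "<p>Hello</p><ac:macro ac:name=\"toc\"/>"
                + "</ac:confluence>";
        System.out.println(serializeChildren(parse(wrapped).getDocumentElement()));
    }
}
```

Because each child is serialized on its own, the serializer has to repeat any namespace declarations the child inherited from the wrapper, so extra xmlns attributes should be expected on the output elements.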

The re-serialized JDOM contained xmlns declarations that were not in the parsed input, and did not encode Latin-1 characters as character entities, but neither of these created any problems that I could see.

A useful resource for me was the blog post "How to build a Confluence SOAP client in 5 minutes". I used m2e in Eclipse to build the POM, so I had to add some extra configuration to make Eclipse generate Java from the WSDL file and then compile the generated Java code.

Here is the class. ApplicationProperties is just a helper class that reads META-INF/application.properties and provides static getters for the properties. Each getter first checks System.getProperty() for the same key, so that -D flags can override the application.properties values:

import java.io.IOException;
import java.io.StringReader;

import javax.xml.rpc.ServiceException;

import org.apache.commons.io.IOUtils;
import org.jdom2.Document;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;
import org.jdom2.output.XMLOutputter;

// ConfluenceSoapService, ConfluenceSoapServiceServiceLocator, RemotePage and
// RemotePageUpdateOptions are generated from the WSDL by wsdl2java; import them
// from whatever package the generator produced.

public class PageReader
{
    public static void main(String[] args) throws ServiceException, JDOMException, IOException
    {
        final ConfluenceSoapService service;
        ConfluenceSoapServiceServiceLocator serviceLocator = new ConfluenceSoapServiceServiceLocator();
        service = serviceLocator.getConfluenceserviceV2();

        // insert your account data here
        String token = service.login(ApplicationProperties.getSoapUser(), ApplicationProperties.getSoapPassword());

        // Fetch a page to see what we have.
        String spaceKey = "~STEINARB";
        String pageTitle = "This is a test page";
        RemotePage page = service.getPage(token, spaceKey, pageTitle);
        String content = wrapContentInBodyElement(page.getContent());
        System.out.println("content: " + page.getContent());

        // Parse the wrapped content into a JDOM2 document.
        SAXBuilder builder = new SAXBuilder();

        Document doc = builder.build(new StringReader(content));

        // Output only the children of the wrapper element, not the wrapper itself.
        XMLOutputter xmlOutputter = new XMLOutputter();
        String processedContent = xmlOutputter.outputElementContentString(doc.getRootElement());
        processedContent = "<p>Hello</p>" + processedContent;
        System.out.println();
        System.out.println("processedContent: " + processedContent);

        // Send the modified content back to Confluence.
        page.setContent(processedContent);
        service.updatePage(token, page, new RemotePageUpdateOptions(true, "Says Hello"));
    }

    // Despite the name, this wraps the content in an <ac:confluence> element
    // (not <body>) and adds a DOCTYPE that declares the Latin-1 entities.
    private static String wrapContentInBodyElement(String content) throws IOException {
        String lat1Entities = IOUtils.toString(PageReader.class.getResourceAsStream("/xhtml/xhtml-lat1.ent"));
        return "<!DOCTYPE ac:confluence [ "
                + lat1Entities
                + "]> <ac:confluence xmlns:ac=\"http://www.atlassian.com/schema/confluence/4/ac/\" xmlns:ri=\"http://www.atlassian.com/schema/confluence/4/ri/\" xmlns=\"http://www.atlassian.com/schema/confluence/4/\" >"
                + content + "</ac:confluence>";
    }
}
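
The ApplicationProperties helper used above is not shown. A minimal sketch of what it could look like, based only on the description given earlier, follows. The property keys soap.user and soap.password are assumptions, as is the choice to treat a missing properties file as empty.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

final class ApplicationProperties {
    // Values read once from the classpath resource; empty if the file is absent.
    private static final Properties FILE_PROPERTIES = load();

    private ApplicationProperties() {
    }

    private static Properties load() {
        Properties properties = new Properties();
        try (InputStream in = ApplicationProperties.class
                .getResourceAsStream("/META-INF/application.properties")) {
            if (in != null) {
                properties.load(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties;
    }

    // A -Dkey=value flag wins; otherwise fall back to the properties file.
    private static String get(String key) {
        return System.getProperty(key, FILE_PROPERTIES.getProperty(key));
    }

    // "soap.user" and "soap.password" are hypothetical key names.
    static String getSoapUser() {
        return get("soap.user");
    }

    static String getSoapPassword() {
        return get("soap.password");
    }
}
```

With this shape, running the client with -Dsoap.user=someone overrides whatever the properties file contains for that key.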

Here is the POM file that will generate Java from WSDL and compile the generated Java:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>no.steria.steinarb</groupId>
  <artifactId>confluence-soap-client</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>confluence-soap-client</name>
  <url>http://maven.apache.org</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.10</version>
    </dependency>
    <dependency>
      <groupId>axis</groupId>
      <artifactId>axis</artifactId>
      <version>1.4</version>
    </dependency>
    <dependency>
      <groupId>javax.xml</groupId>
      <artifactId>jaxrpc-api</artifactId>
      <version>1.1</version>
    </dependency>
    <dependency>
      <groupId>commons-discovery</groupId>
      <artifactId>commons-discovery</artifactId>
      <version>0.4</version>
    </dependency>
    <dependency>
      <groupId>javax.xml.soap</groupId>
      <artifactId>saaj-api</artifactId>
      <version>1.3</version>
    </dependency>
    <dependency>
      <groupId>axis</groupId>
      <artifactId>axis-wsdl4j</artifactId>
      <version>1.5.1</version>
    </dependency>
    <dependency>
      <groupId>commons-io</groupId>
      <artifactId>commons-io</artifactId>
      <version>2.4</version>
    </dependency>
    <dependency>
      <groupId>org.jdom</groupId>
      <artifactId>jdom</artifactId>
      <version>2.0.2</version>
    </dependency>
  </dependencies>

  <build>
    <pluginManagement>
      <plugins>
        <plugin>
          <groupId>org.eclipse.m2e</groupId>
          <artifactId>lifecycle-mapping</artifactId>
          <version>1.0.0</version>
          <configuration>
            <lifecycleMappingMetadata>
              <pluginExecutions>
                <pluginExecution>
                  <pluginExecutionFilter>
                    <groupId>org.codehaus.mojo</groupId>
                    <artifactId>axistools-maven-plugin</artifactId>
                    <versionRange>[1.0,)</versionRange>
                    <goals>
                      <goal>wsdl2java</goal>
                    </goals>
                  </pluginExecutionFilter>
                  <action>
                    <execute />
                  </action>
                </pluginExecution>
                <pluginExecution>
                  <pluginExecutionFilter>
                    <groupId>org.codehaus.mojo</groupId>
                    <artifactId>build-helper-maven-plugin</artifactId>
                    <versionRange>[1.0,)</versionRange>
                    <goals>
                      <goal>parse-version</goal>
                      <goal>add-source</goal>
                      <goal>maven-version</goal>
                      <goal>add-resource</goal>
                      <goal>add-test-resource</goal>
                      <goal>add-test-source</goal>
                    </goals>
                  </pluginExecutionFilter>
                  <action>
                    <execute>
                      <runOnConfiguration>true</runOnConfiguration>
                      <runOnIncremental>true</runOnIncremental>
                    </execute>
                  </action>
                </pluginExecution>
              </pluginExecutions>
            </lifecycleMappingMetadata>
          </configuration>
        </plugin>
      </plugins>
    </pluginManagement>

    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>1.7</source>
          <target>1.7</target>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>axistools-maven-plugin</artifactId>
        <executions>
          <execution>
            <phase>generate-sources</phase>
            <goals>
              <goal>wsdl2java</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>build-helper-maven-plugin</artifactId>
        <executions>
          <execution>
            <id>add-source</id>
            <phase>generate-sources</phase>
            <goals>
              <goal>add-source</goal>
            </goals>
            <configuration>
              <sources>
                <source>${project.build.directory}/generated-sources/axistools/wsdl2java</source>
              </sources>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

The first bit of the output part (output only children of the root element) was easy:

String processedContent = xmlOutputter.outputElementContentString(doc.getRootElement());

I haven't figured out how to make it output character entities yet, though.
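
One possible workaround, sketched here under assumptions, is to post-process the serialized string and replace every non-ASCII character with a numeric character reference. In XML, a numeric reference such as &#248; means the same as the named entity &oslash;, though whether Confluence actually needs this is not established in this thread. The class and method names are hypothetical. JDOM2's Format.setEncoding("US-ASCII") is likely to have a similar effect inside XMLOutputter, but that is an untested assumption here.

```java
class NumericEntitySketch {
    // Replace each code point above 127 with a decimal character reference.
    // This is only safe for text and attribute values; inside a CDATA section
    // the reference would be taken literally, so that case is not handled.
    static String escapeNonAscii(String xml) {
        StringBuilder escaped = new StringBuilder(xml.length());
        xml.codePoints().forEach(codePoint -> {
            if (codePoint < 128) {
                escaped.append((char) codePoint);
            } else {
                escaped.append("&#").append(codePoint).append(';');
            }
        });
        return escaped.toString();
    }

    public static void main(String[] args) {
        // Hypothetical serialized content containing Latin-1 characters.
        System.out.println(escapeNonAscii("<p>Bl\u00e5b\u00e6rsyltet\u00f8y</p>"));
    }
}
```

Iterating over code points rather than chars keeps characters outside the Basic Multilingual Plane intact as a single reference.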

Not entirely there: I get an xmlns:ac declaration directly on the macro elements that isn't there in the input.

Besides not being in the input, it exposes my dummy namespace URL.
