JDOM2 parsing problems on RemotePage.getContent() in the v2 SOAP API

Steinar Bang
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
May 6, 2014

I am trying to parse the RemotePage.getContent into a JDOM2 document and then output that document again.

I have so far run into the following problems:

  1. The result of RemotePage.getContent() is the content of the XHTML <body>, but without a top-level element. I got around this by wrapping the getContent() result in a <body> tag.
  2. The next error message I got was 'The entity "oslash" was referenced, but not declared.' I got around this by slurping xhtml-lat1.ent into the string lat1Entities and prefixing "<body>" with "<!DOCTYPE body [" + lat1Entities + "]> ".
  3. The third error was a parsing error caused by the "ac" namespace prefix on a macro element (I haven't figured out a workaround for this one yet).
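
The first two workarounds can be sketched without Confluence at all. This is only a minimal sketch: it uses the JDK's built-in DOM parser instead of JDOM2 so that it runs without extra dependencies, and the ENTITIES constant is a hypothetical one-line stand-in for the full xhtml-lat1.ent file.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class FragmentWrapSketch {
    // Hypothetical stand-in for the contents of xhtml-lat1.ent: one entity only.
    static final String ENTITIES = "<!ENTITY oslash \"&#248;\">";

    // Give the fragment a single root element and declare the entities
    // in an internal DTD subset placed in front of it.
    static String wrap(String fragment) {
        return "<!DOCTYPE body [" + ENTITIES + "]><body>" + fragment + "</body>";
    }

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // Two sibling paragraphs and an &oslash; reference: both fail without wrap().
        Document doc = parse(wrap("<p>S&oslash;rensen</p><p>second</p>"));
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

The same wrapped string should be usable with JDOM2's SAXBuilder, since the wrapping happens before any parser sees the text.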

I also foresee a possible problem with getting XMLOutputter to output the content without the body element. A possible workaround is to iterate over all child elements of the root element and output each one separately, but that is more cumbersome than it should be.

Also, I suspect that Confluence will prefer Latin-1 (and other special) characters as character entities, and I don't know what XMLOutputter will do here.

Is there a simpler approach? Is there a way I can get the full XHTML for the page, including namespace and DOCTYPE declarations?

Are there more correct ways of handling the problems I have encountered than the ones I've used so far?

3 answers

1 accepted

0 votes
Answer accepted
Steinar Bang
May 6, 2014

OK, now I have successfully round-tripped the XHTML content using the SOAP API.

First: the element to wrap around the RemotePage.getContent() result before sending it to the SAXBuilder is:

<ac:confluence
  xmlns:ac="http://www.atlassian.com/schema/confluence/4/ac/"
  xmlns:ri="http://www.atlassian.com/schema/confluence/4/ri/"
  xmlns="http://www.atlassian.com/schema/confluence/4/">
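
To illustrate why the namespace declarations on the wrapper matter, here is a small sketch. It assumes the JDK's namespace-aware DOM parser rather than JDOM2 (to stay dependency-free), and the macro fragment is just an example in the style of the storage format, not content from a real page.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NamespaceWrapSketch {
    static final String AC_URI = "http://www.atlassian.com/schema/confluence/4/ac/";

    // Wrap the fragment in a root element that binds the ac and ri prefixes
    // and sets the default namespace.
    static String wrap(String fragment) {
        return "<ac:confluence"
                + " xmlns:ac=\"" + AC_URI + "\""
                + " xmlns:ri=\"http://www.atlassian.com/schema/confluence/4/ri/\""
                + " xmlns=\"http://www.atlassian.com/schema/confluence/4/\">"
                + fragment + "</ac:confluence>";
    }

    // Returns the parsed document, or null if the input is rejected by the parser.
    static Document parseOrNull(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String fragment = "<ac:structured-macro ac:name=\"info\"/>";
        // The bare fragment uses the ac prefix without binding it, so parsing fails.
        System.out.println("bare parses: " + (parseOrNull(fragment) != null));
        System.out.println("wrapped parses: " + (parseOrNull(wrap(fragment)) != null));
    }
}
```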

The round-trip goes like this:

  • Get a remote page Java bean with service.getPage(securityToken, spaceKey, pageTitle)
  • Get the page content as a string using RemotePage.getContent()
  • Wrap the page content in an <ac:confluence> element with namespace declarations, and a DOCTYPE adding the necessary character entities
  • Parse the results into JDOM
  • Output the children of the root element into a string
  • Prepend an extra "<p>Hello</p>" paragraph to the string, so that the update visibly changes the page
  • Update the RemotePage bean
  • Send the updated bean to Confluence with the service.updatePage() method
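
The parse-and-reserialize middle of that list can be tried in isolation. The following is a sketch under some assumptions: it uses the JDK's DOM parser and identity Transformer in place of JDOM2's SAXBuilder and XMLOutputter, and a hypothetical plain <wrapper> element instead of <ac:confluence> to keep it short.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class RoundTripSketch {
    static String roundTrip(String fragment) {
        try {
            // Wrap so the parser sees exactly one root element.
            String wrapped = "<wrapper>" + fragment + "</wrapper>";
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wrapped)));

            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            // Serialize each child of the root, so the wrapper itself is left out.
            StringWriter out = new StringWriter();
            for (Node child = doc.getDocumentElement().getFirstChild();
                    child != null; child = child.getNextSibling()) {
                serializer.transform(new DOMSource(child), new StreamResult(out));
            }
            // Prepend the extra paragraph, as in the list above.
            return "<p>Hello</p>" + out;
        } catch (Exception e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("<p>one</p><p>two</p>"));
    }
}
```

The returned string is what would be passed to page.setContent() before calling service.updatePage().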

The re-serialized JDOM output contained xmlns declarations that were not in the parsed input, and did not encode Latin-1 characters as character entities, but neither of these created any problems that I could see.

A useful resource for me was the blog post "How to build a Confluence SOAP client in 5 minutes". I used m2e in Eclipse to build the POM, so I had to add some extra configuration to make Eclipse generate Java from the WSDL file and then compile the generated Java code.

Here is the class. The ApplicationProperties class is just a class that reads META-INF/application.properties and provides static getters for the properties. Each getter first checks System.getProperty() for the same property, to let -D flags override the application.properties values:

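import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;

import javax.xml.rpc.ServiceException;

import org.apache.commons.io.IOUtils;
import org.jdom2.Document;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;
import org.jdom2.output.XMLOutputter;

// Also import the classes generated by wsdl2java (ConfluenceSoapService,
// ConfluenceSoapServiceServiceLocator, RemotePage, RemotePageUpdateOptions);
// their package names depend on the WSDL the code was generated from.

public class PageReader
{
    public static void main(String[] args) throws ServiceException, JDOMException, IOException
    {
        final ConfluenceSoapService service;
        ConfluenceSoapServiceServiceLocator serviceLocator = new ConfluenceSoapServiceServiceLocator();
        service = serviceLocator.getConfluenceserviceV2();

        // insert your account data here
        String token = service.login(ApplicationProperties.getSoapUser(), ApplicationProperties.getSoapPassword());

        // Fetch a page to see what we have.
        String spaceKey = "~STEINARB";
        String pageTitle = "This is a test page";
        RemotePage page = service.getPage(token, spaceKey, pageTitle);
        String content = wrapContentInRootElement(page.getContent());
        System.out.println("content: " + page.getContent());

        // Parse the wrapped content into a JDOM2 document.
        SAXBuilder builder = new SAXBuilder();
        Document doc = builder.build(new StringReader(content));

        // Serialize only the children of the root element, dropping the wrapper.
        XMLOutputter xmlOutputter = new XMLOutputter();
        String processedContent = xmlOutputter.outputElementContentString(doc.getRootElement());
        processedContent = "<p>Hello</p>" + processedContent;
        System.out.println();
        System.out.println("processedContent: " + processedContent);

        // Send the modified content back to Confluence.
        page.setContent(processedContent);
        service.updatePage(token, page, new RemotePageUpdateOptions(true, "Says Hello"));
    }

    // Wraps the page content in an <ac:confluence> root element that declares the
    // Confluence namespaces, preceded by a DOCTYPE that declares the Latin-1 entities.
    private static String wrapContentInRootElement(String content) throws IOException {
        String lat1Entities;
        try (InputStream in = PageReader.class.getResourceAsStream("/xhtml/xhtml-lat1.ent")) {
            lat1Entities = IOUtils.toString(in);
        }
        return "<!DOCTYPE ac:confluence [ "
                + lat1Entities
                + "]> <ac:confluence xmlns:ac=\"http://www.atlassian.com/schema/confluence/4/ac/\" xmlns:ri=\"http://www.atlassian.com/schema/confluence/4/ri/\" xmlns=\"http://www.atlassian.com/schema/confluence/4/\" >"
                + content + "</ac:confluence>";
    }
}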
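
The ApplicationProperties helper used by the class above is not shown in this post. A minimal sketch of what such a class might look like follows; the property keys "soap.user" and "soap.password" are assumptions made for illustration, not the names from the actual project.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

class ApplicationProperties {
    // Values read once from the classpath; stays empty if the file is missing.
    private static final Properties FILE_PROPERTIES = load();

    private static Properties load() {
        Properties properties = new Properties();
        try (InputStream in = ApplicationProperties.class
                .getResourceAsStream("/META-INF/application.properties")) {
            if (in != null) {
                properties.load(in);
            }
        } catch (IOException e) {
            throw new IllegalStateException("Could not read application.properties", e);
        }
        return properties;
    }

    // A -Dkey=value flag on the command line wins over the value in the file.
    private static String get(String key) {
        String override = System.getProperty(key);
        return override != null ? override : FILE_PROPERTIES.getProperty(key);
    }

    // Hypothetical key names, chosen for this sketch only.
    static String getSoapUser() {
        return get("soap.user");
    }

    static String getSoapPassword() {
        return get("soap.password");
    }
}
```

With this shape, running the client with -Dsoap.user=someone would override whatever user is set in the properties file.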

Here is the POM file that will generate Java from WSDL and compile the generated Java:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>no.steria.steinarb</groupId>
  <artifactId>confluence-soap-client</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>confluence-soap-client</name>
  <url>http://maven.apache.org</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.10</version>
    </dependency>
    <dependency>
      <groupId>axis</groupId>
      <artifactId>axis</artifactId>
      <version>1.4</version>
    </dependency>
    <dependency>
      <groupId>javax.xml</groupId>
      <artifactId>jaxrpc-api</artifactId>
      <version>1.1</version>
    </dependency>
    <dependency>
      <groupId>commons-discovery</groupId>
      <artifactId>commons-discovery</artifactId>
      <version>0.4</version>
    </dependency>
    <dependency>
      <groupId>javax.xml.soap</groupId>
      <artifactId>saaj-api</artifactId>
      <version>1.3</version>
    </dependency>
    <dependency>
      <groupId>axis</groupId>
      <artifactId>axis-wsdl4j</artifactId>
      <version>1.5.1</version>
    </dependency>
    <dependency>
      <groupId>commons-io</groupId>
      <artifactId>commons-io</artifactId>
      <version>2.4</version>
    </dependency>
    <dependency>
      <groupId>org.jdom</groupId>
      <artifactId>jdom</artifactId>
      <version>2.0.2</version>
    </dependency>
  </dependencies>

  <build>
    <pluginManagement>
      <plugins>
        <plugin>
          <groupId>org.eclipse.m2e</groupId>
          <artifactId>lifecycle-mapping</artifactId>
          <version>1.0.0</version>
          <configuration>
            <lifecycleMappingMetadata>
              <pluginExecutions>
                <pluginExecution>
                  <pluginExecutionFilter>
                    <groupId>org.codehaus.mojo</groupId>
                    <artifactId>axistools-maven-plugin</artifactId>
                    <versionRange>[1.0,)</versionRange>
                    <goals>
                      <goal>wsdl2java</goal>
                    </goals>
                  </pluginExecutionFilter>
                  <action>
                    <execute />
                  </action>
                </pluginExecution>
                <pluginExecution>
                  <pluginExecutionFilter>
                    <groupId>org.codehaus.mojo</groupId>
                    <artifactId>build-helper-maven-plugin</artifactId>
                    <versionRange>[1.0,)</versionRange>
                    <goals>
                      <goal>parse-version</goal>
                      <goal>add-source</goal>
                      <goal>maven-version</goal>
                      <goal>add-resource</goal>
                      <goal>add-test-resource</goal>
                      <goal>add-test-source</goal>
                    </goals>
                  </pluginExecutionFilter>
                  <action>
                    <execute>
                      <runOnConfiguration>true</runOnConfiguration>
                      <runOnIncremental>true</runOnIncremental>
                    </execute>
                  </action>
                </pluginExecution>
              </pluginExecutions>
            </lifecycleMappingMetadata>
          </configuration>
        </plugin>
      </plugins>
    </pluginManagement>

    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>1.7</source>
          <target>1.7</target>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>axistools-maven-plugin</artifactId>
        <executions>
          <execution>
            <phase>generate-sources</phase>
            <goals>
              <goal>wsdl2java</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>build-helper-maven-plugin</artifactId>
        <executions>
          <execution>
            <id>add-source</id>
            <phase>generate-sources</phase>
            <goals>
              <goal>add-source</goal>
            </goals>
            <configuration>
              <sources>
                <source>${project.build.directory}/generated-sources/axistools/wsdl2java</source>
              </sources>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

0 votes
Steinar Bang
May 6, 2014

Not entirely there: I get an xmlns:ac declaration directly on the macro elements that isn't there in the input.

In addition to not being in the input, it exposes my dummy namespace URL.

0 votes
Steinar Bang
May 6, 2014

The first bit of the output part (output only children of the root element) was easy:

String processedContent = xmlOutputter.outputElementContentString(doc.getRootElement());

I haven't figured out how to make it output character entities yet, though.
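
One possible approach to the entity question is to post-process the serialized string and replace every non-ASCII character with a numeric character reference. This is only a sketch of that idea: it emits numeric references such as &#248; rather than named entities such as &oslash;, and the class and method names are made up for illustration. JDOM2's Format.setEncoding("US-ASCII") may achieve something similar inside the outputter, but I have not verified that here.

```java
class NumericEntitySketch {
    // Replaces each code point outside 7-bit ASCII with &#NNN; and leaves markup as-is.
    // Caveat: this also rewrites characters inside CDATA sections, where
    // references are not interpreted, so it is not safe for every input.
    static String escapeNonAscii(String xml) {
        StringBuilder out = new StringBuilder(xml.length());
        for (int i = 0; i < xml.length(); ) {
            int codePoint = xml.codePointAt(i);
            if (codePoint < 0x80) {
                out.append((char) codePoint);
            } else {
                out.append("&#").append(codePoint).append(';');
            }
            // Advance by one or two chars, depending on surrogate pairs.
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeNonAscii("<p>S\u00f8rensen</p>"));
    }
}
```

Numeric references need no DOCTYPE or entity declarations, so the result can be parsed again without the xhtml-lat1.ent workaround.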
