Junk HTML develops over time in Confluence pages

Is anyone else seeing this problem? Over time, I see a lot of junk HTML inserted into my Confluence pages.

I will work on a page, then come back to it and work on it further (applying the styles from the menu or just typing directly in Confluence). Over time, the HTML source (as viewed in the Confluence editor) collects "junk". For example, a non-breaking space gets inserted at random, part of a paragraph changes to a different color, the font changes to a different style (bold, italic), letter spacing changes, etc. Rather than simple HTML, complex formatting gets inserted seemingly at random. The only way I could see this "junk" entering Confluence is if someone were copying text from a pre-formatted source into the page, which I've already ruled out: as the sole editor of a page I can close and re-open it moments or days later and see new junk in the HTML that I know I didn't enter (and there were no other editors according to the page history).

Are you seeing this problem?

Examples of simple HTML tags that got corrupted over time: (corrupted parts underlined)

  • <p><span style="color: rgb(0,0,0);">blah blah blah...
  • <h2><span style="letter-spacing: -0.008em;">Features</span></h2>
  • <p>Refer to the chapter&nbsp;<em>blah blah blah...
  • <p class="BodyText1">Blah blah blah&nbsp;blah blah blah...
  • <h6><strong><span style="color: rgb(94,108,132);">Blah blah blah...</span></strong></h6>
  • <p><span style="letter-spacing: 0.0px;">Blah blah blah...
  • <h2><span style="font-size: 20.0px;letter-spacing: -0.008em;">Blah blah blah...
  • Blah blah blah...<strong style="letter-spacing: 0.0px;">blah blah blah</strong>
  • Blah blah blah <em>blah </em>blah blah blah...
  • <li><span style="letter-spacing: 0.0px;">Blah blah blah

This junk, when exported to Word, creates an even bigger mess of styles that I have to manually correct and map to a single (example) Heading 1 style or body style.

Atlassian's position is that the only solution is to buy a third-party extension (not in my budget) that will fix the junk output (according to the sales pitches of the extension manufacturers...).

1 answer

2 votes
Bill Bailey Community Leader Aug 18, 2020

I have been a Confluence user for probably 8 years now. And the only time I have seen this type of thing is from people copying and pasting from Word or from another HTML view (even of a Confluence page).

Have you seen this with a different browser? Just trying to think of what could be affecting the entry of formatted text. Browser extension? TinyMCE extension?

My fix for this is the free version of the source editor and Regex to go through and cleanse a page.

Hm. That's a thought... I can't swear nobody's been copying from one Confluence page to another... We know better than to copy from Word or the Internet or anything, but I suppose I haven't specifically prohibited copying from one page to another within Confluence. I'll look into that, thanks!

It's definitely not browser (or platform)-specific; I'm seeing this on hundreds of pages. Regex-fixing each page <shudder>... I suppose that'll be my best solution. Ouch.

Thanks, Bill! I'll check into this.

Bill Bailey Community Leader Aug 18, 2020

Well maybe after you educate users and then clean up pages, it will stop. I have a long page of various RegEx patterns I use to clean out crap, for example, span tags:

</?span.*?>

Have fun!
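
For anyone who wants to try the span pattern above outside the source editor, here is a minimal Python sketch. It assumes the page source has been copied out as a plain string. The function names and the second pattern (for stray inline style attributes on other tags) are illustrative assumptions, not something from Bill's list. Note that `.*?` does not cross line breaks by default, so a tag split over two lines would be missed, and regex cleanup of HTML is best checked against a copy of the page first.

```python
import re

# Pattern quoted in the post above: matches opening and closing span tags.
SPAN_TAG = re.compile(r"</?span.*?>")

# Assumption (not from the thread): an inline style attribute on any tag,
# e.g. <strong style="letter-spacing: 0.0px;">.
STYLE_ATTR = re.compile(r'\sstyle="[^"]*"')


def strip_span_tags(html: str) -> str:
    """Remove <span ...> and </span> tags, keeping the text between them."""
    return SPAN_TAG.sub("", html)


def strip_style_attrs(html: str) -> str:
    """Remove style="..." attributes but leave the tags themselves in place."""
    return STYLE_ATTR.sub("", html)


def cleanse(html: str) -> str:
    """Apply both cleanups: drop span wrappers, then leftover style attributes."""
    return strip_style_attrs(strip_span_tags(html))


if __name__ == "__main__":
    heading = '<h2><span style="letter-spacing: -0.008em;">Features</span></h2>'
    print(cleanse(heading))  # <h2>Features</h2>
```

Removing every span also removes any color or highlighting someone applied on purpose, so it may be worth running this on one page and comparing the result before doing hundreds.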


Hi Bill and Laura,

could you share your regexes and additional wisdom?

We're in a similar spot (hundreds of requirements copy/pasted from Excel, Word, PDF and HTML, including tables - plus text first colored red, then black ("Black must be the default!?")). You can't imagine the mess...

Thanks!

Hi Roman,

I do feel your pain. Unfortunately, I have no additional wisdom to add beyond what Bill Bailey said: no copying-pasting or you'll get junk. It's an epic fail on the part of Atlassian, because training developers to not copy-paste is not a solution to bad software design on the part of Atlassian. Developers are not technical writers and have no idea what a style or stylesheet is or how it should be used. And they shouldn't need to know this.

Unfortunately, our respective organizations are using a tool not meant for requirements or tech docs development/storage/output. It is for collaboration on ideas, taking meeting notes, etc. Here is their page describing how it could be used (https://www.atlassian.com/software/confluence), but it overstates the usefulness when it comes to collaboration: what good is collaborating on creating information if it cannot be output to common tools like Word? And cannot take input from any outside sources without creating stylistic chaos? <shrug>

This is a failure to use the tool as it was intended: taking meeting notes and basic blogging with no intention of output or formatting consistency. Sorry I don't have better news for you.

Best of luck,

Laura
