ViewSource bug? View Source versions of Confluence pages appearing in Google Search Results

While checking Google search results for our public Confluence space today, I came across some strange results, which are shown in the attached screenshot. The base URL for all of these is:

docs.delphix.com/plugins/viewsource/viewpagesrc.action

with the page GUID appended. If you click through, you see a version of this page without any wrapper around it, just the content (which is formatted as it should be).

This is a very small subset of the total number of topics, and I cannot find anything that they have in common.

On further investigation, I was able to find many other examples of these search results for other publicly accessible Confluence spaces, as shown in the second screenshot (which includes the search terms). What's slightly distressing is that some of these are https sites, so it seems that this bug might enable someone to see a secured page, even if it's in "source" format.

Can anyone explain to me why these pages are showing up in Google, and what the viewsource plugin has to do with it? I would really prefer that my content not show up in Google without the other information attached to it.

1 answer

You can keep search engine spiders from reaching those pages by setting up a robots.txt file.

In the first comment of Prevent Search Engine Indexing Using Robots.txt there is a great example of one.
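As a rough sketch of what such a rule might look like, the snippet below builds a robots.txt that disallows only the viewsource path from the question and leaves the rest of the site crawlable, then checks it with Python's standard `urllib.robotparser`. The `pageId` query parameter, the `/display/DOCS/Home` path, and the `Googlebot` user-agent string are assumptions made for illustration; they are not taken from the actual site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block only the View Source action, allow everything else.
ROBOTS_TXT = """\
User-agent: *
Disallow: /plugins/viewsource/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Assumed example URLs; the query string and page path are placeholders.
    source_view = "https://docs.delphix.com/plugins/viewsource/viewpagesrc.action?pageId=12345"
    normal_page = "https://docs.delphix.com/display/DOCS/Home"

    print("view source crawlable:", is_crawlable(ROBOTS_TXT, "Googlebot", source_view))
    print("normal page crawlable:", is_crawlable(ROBOTS_TXT, "Googlebot", normal_page))
```

A narrow `Disallow` like this would keep the regular pages eligible for indexing while asking crawlers to skip the source views. It does not explain why those URLs were discovered in the first place.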

But the issue is that I want my site indexed and shown in search results; I just don't know why these particular pages are showing up with the viewsource URL. And, as I pointed out, there are other random pages from Confluence sites out there that you can find by searching on the View Source keywords, some of which are supposedly behind login firewalls. I don't think it's a question of setting up a robots.txt file; I think there's something strange going on with Confluence that exposes these pages in View Source mode to search engine indexing. Could Confluence be caching pages in some way while they are being edited, exposing those page versions to indexing?
