Google Cannot Index Public Spaces: Confluence does not permit indexing

James Bourne
Contributor
September 29, 2024

Hello all,

I am trialing Atlassian Confluence, hosted in the cloud, to determine whether it is usable for Internet-facing product documentation and other knowledge base articles.

Whilst pages can be crawled (per the robots.txt directive), they cannot be indexed (per the per-page robots meta directive).

Where can this setting be adjusted?

Thanks in advance.

2 answers

1 vote
Kristian Klima
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
September 30, 2024

Hi @James Bourne 

We're using Confluence for our public-facing documentation. Confluence works as a CMS/LCM tool, and our website is built with an app called Scroll Viewport.

The great thing is full Google Analytics integration and excellent SEO. Many of our pages are in the top 5 search results for some generic terms in the industry, often beating our corporate site, and I'm not lifting a finger to achieve that, except for writing good content.

There are other similar apps in the Marketplace that de-confluence your doc site, improve the experience, integrate with GA and, yes, make your site visible (Refined, Spacecraft, Instant Websites).

James Bourne
Contributor
September 30, 2024

Thanks, Kristian. So what you are saying is that Confluence Public Spaces are hopeless from an Internet-facing document and knowledge base perspective and you have to buy third-party plugins to make them look OK and for them to work as expected (i.e. SEO and spidering). I've been down this route in the past with WordPress and the horror of that product's plugins. It just becomes a headache to manage and maintain.

James Bourne
Contributor
September 30, 2024

Sorry, what are/is LCM, Refined, Spacecraft, and Instant Websites?

Chris Matkaris
Contributor
October 11, 2024

Hello @James Bourne

LCM stands for Learning Content Management (System).

Refined, Spacecraft, and Instant Websites are Marketplace Apps that allow you to create a separate website and connect the existing Confluence content.

If you look closely, the documentation link that @Kristian Klima, the great, shared is not under atlassian.net. It is a separate website created with Scroll Viewport, as Kristian said and as you can also see at the bottom of the footer. The content, however, is created and managed in Confluence.

Regarding your initial question, I am surprised to see a public page containing the noindex directive! Could you please share an example URL?

 

James Bourne
Contributor
October 11, 2024

https://firedaemon.atlassian.net/wiki/external/NGM2NjM4NzA1YjBkNGE3ODg4MTg0ZTE5OThiNDI5ZTc

View source (of course!) to see:

<meta name="robots" content="noindex">
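A minimal sketch of how that tag could be detected programmatically, using only Python's standard library. The class and function names (`RobotsMetaParser`, `robots_directives`) are hypothetical, and the sample HTML is a stripped-down stand-in for the real page source. Note that a `noindex` directive can also be delivered through an `X-Robots-Tag` HTTP response header, which this sketch does not inspect.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None for valueless attributes.
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html_text):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.directives


# Stand-in for the page source; only the meta tag comes from the thread.
sample = '<html><head><meta name="robots" content="noindex"></head></html>'
print(robots_directives(sample))  # {'noindex'}
```

Running this against the saved source of a page would show whether the tag is present without reading the markup by hand.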

Chris Matkaris
Contributor
October 13, 2024

Thank you James!

It is odd that it appears there.
However, it does not appear on either of the two pages in your public space.

Could a plugin be adding it there? Or maybe it is the macros that are forbidding their content from appearing?

[Image: Screenshot 2024-10-14 at 08.54.02.png]

James Bourne
Contributor
October 14, 2024

That was just the default theme I chose for the space. I deleted that space and created another, ostensibly blank, one and the same problem persists.

https://firedaemon.atlassian.net/wiki/external/NjY1YWY0MWVhZjgxNDcwMmI1YWFkZjM1MDg3MGNiM2E

So I suspect this is intentional. Plus, there's no way to configure any SEO settings. So how does one request a feature or report a bug?

Chris Matkaris
Contributor
October 14, 2024
Kristian Klima
Community Leader
October 14, 2024

@James Bourne 

I wonder, would this help?

[Image: 2024-10-14_11-10-59.png]

It's in the Confluence site settings under Security.

Kristian Klima
Community Leader
October 14, 2024

Actually, @James Bourne one more thing.... That link that you shared appears to be a Public link.

Public links are intended for ad hoc sharing of individual pages with specific people/entities outside of your org. 

If you want to share the entire space as a standalone documentation portal, you have to go to the specific space's settings and allow anonymous access to that space (preferably, view only).

This way, you're sharing the entire space as a whole unit - links between the pages will work and, my hunch is, the site will get indexed etc.

 

James Bourne
Contributor
October 14, 2024

I've allowed anonymous access. The security configuration settings do nothing meaningful in the context of a public space with anonymous access. From my reading they relate to mitigating blog spamming.

Kristian Klima
Community Leader
October 14, 2024

Do you have both Public links and Anonymous access enabled in that space?

If so, disable public links. 

A public link is, by definition, redundant when anonymous access is enabled (a couple of cases here showed conflicts between the two).

 

Example of the public site created on Confluence:

https://appfire.atlassian.net/wiki/spaces/CDACL/overview
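The two kinds of URL shared in this thread have different path shapes: the public links start with `/wiki/external/`, while the anonymously accessible space above starts with `/wiki/spaces/`. A minimal sketch for telling them apart, assuming those two shapes hold in general (this is inferred only from the URLs in this thread, not from Atlassian documentation); the function name `classify_confluence_url` is hypothetical.

```python
from urllib.parse import urlparse


def classify_confluence_url(url):
    """Guess how a Confluence Cloud page is shared from its URL path.

    Assumption: the path prefixes below are taken from the example
    URLs in this thread and may not cover every case.
    """
    path = urlparse(url).path
    if path.startswith("/wiki/external/"):
        return "public-link"
    if path.startswith("/wiki/spaces/"):
        return "space-page"
    return "unknown"


print(classify_confluence_url(
    "https://firedaemon.atlassian.net/wiki/external/NGM2NjM4NzA1YjBkNGE3ODg4MTg0ZTE5OThiNDI5ZTc"
))  # public-link
print(classify_confluence_url(
    "https://appfire.atlassian.net/wiki/spaces/CDACL/overview"
))  # space-page
```

If the hunch in this thread is right, only URLs of the second kind would be candidates for indexing, so this check is a quick way to see which kind is being submitted to a search engine.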

0 votes
Brant Schroeder
Community Leader
September 29, 2024

@James Bourne I am not aware of a feature that allows you to do this within Confluence.  

What you could do is manually request that Google remove the specific URL from their indexing using the Google Webmaster tools.

There are some feature requests that relate to this here that you could vote for.

James Bourne
Contributor
September 29, 2024

Thanks for this. This is not about URL removal. It's about getting webpages spidered and indexed by Google. The problem is Confluence itself, and the inclusion of a specific meta tag in the page header which contradicts robots.txt.

Robots.txt contains the following:

Allow: /wiki/

The Public page contains:

<meta name="robots" content="noindex">

 How do I configure Confluence so that it does not contain the meta tag and Public Spaces are fully indexable and searchable via Google?
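The distinction between the two signals can be sketched as follows: robots.txt governs whether a crawler may fetch a URL, while the robots meta tag governs whether a fetched page may be indexed. This is a rough illustration using Python's standard `urllib.robotparser`. The `User-agent: *` line, the `example.atlassian.net` URL, and the helper names are assumptions added for the example; only the `Allow: /wiki/` rule and the `noindex` value come from this thread.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def is_index_allowed(meta_content):
    """Return True unless the robots meta content forbids indexing."""
    directives = {d.strip().lower() for d in meta_content.split(",")}
    # "none" is shorthand for "noindex, nofollow".
    return not ({"noindex", "none"} & directives)


# Assumed robots.txt: the thread shows only the Allow line.
robots_txt = "User-agent: *\nAllow: /wiki/\n"
# Hypothetical URL used purely for illustration.
url = "https://example.atlassian.net/wiki/spaces/DOCS/overview"

print(is_crawl_allowed(robots_txt, url))  # True
print(is_index_allowed("noindex"))        # False
```

The two results together match the situation described above: the page can be crawled, but the meta tag tells the crawler not to add it to the index.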

Brant Schroeder
Community Leader
September 29, 2024

@James Bourne Sorry, I misread that. I am not aware of a way to remove that. Did you set the space up for anonymous access? https://support.atlassian.com/confluence-cloud/docs/make-a-space-public/

James Bourne
Contributor
September 29, 2024

No worries. I don't think I was clear in my initial question. Apologies. Yes, it's set up publicly for anonymous access, and I can access it anonymously via a private browser tab (or other non-tracked browser instance). You can't actually test the URL in Google Webmaster tools, as the URL is <site>.atlassian.net/wiki and, clearly, I don't have access to that domain.

Deployment type: Cloud
Product plan: Standard