How to avoid bad URLs getting indexed by Google?

Matthias Gidda August 21, 2017

With Confluence Server, Google indexes some bad URLs that shouldn't be visible to users (like https://myconfluencesite/plugins/recently-updated/).

How can I avoid that?
Is the only solution to try and use a robots.txt and disallow /plugins/?

If so, how do you add a robots.txt to a Confluence Server install?

Any help would be appreciated, thanks!

1 answer

0 votes
AnnWorley
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
August 21, 2017

Hi Matthias,

Please see Prevent Search Engine Indexing Using Robots.txt. I look forward to any follow up questions and to hearing whether it works for your instance.
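As a rough illustration of the rule Matthias asks about, here is a minimal sketch that checks a candidate robots.txt locally with Python's standard urllib.robotparser before it is deployed. The robots.txt content, the helper name is_allowed, and the /display/DOC/Home path are assumptions made for the example; they are not taken from the linked article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: ask every crawler to stay out of /plugins/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /plugins/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # URL from the question, plus a hypothetical regular page for comparison.
    for url in (
        "https://myconfluencesite/plugins/recently-updated/",
        "https://myconfluencesite/display/DOC/Home",
    ):
        print(url, "allowed" if is_allowed(ROBOTS_TXT, url) else "blocked")
```

Note that robots.txt only asks crawlers not to fetch the matching paths; the file still has to be served from the root of the site for crawlers to find it.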

Thanks,

Ann

Matthias Gidda August 22, 2017

Hi Ann,

Thanks for your reply and the link. It says that if I can't upload a robots.txt file (which is the case for my setup, unfortunately), then I should use the meta tag. But as I understand it, the meta tag doesn't let me block a specific URL like https://myconfluencesite/plugins/recently-updated/ from getting indexed, right? Or what would a meta tag that does that look like?

Thanks,

Matthias

Matthias Gidda August 28, 2017

Hi Ann, do you have an update for me?
Thanks,

Matthias

AnnWorley
Atlassian Team
August 28, 2017

My understanding is that you can block your entire instance from Google's indexing; see Block search indexing with 'noindex'. The examples below show what the meta tag would look like. You would paste it in under Confluence Admin > Custom HTML > At the end of the head:

<meta> tag

To prevent most search engine web crawlers from indexing a page on your site, place the following meta tag into the <head> section of your page:

<meta name="robots" content="noindex">

To prevent only Google web crawlers from indexing a page:

<meta name="googlebot" content="noindex">

You should be aware that some search engine web crawlers might interpret the noindex directive differently. As a result, it is possible that your page might still appear in results from other search engines.
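To confirm that the tag actually ends up in the rendered page, the page HTML can be inspected for it. The following is a minimal sketch using Python's standard html.parser; the class name RobotsMetaParser and the helper has_noindex are hypothetical names chosen for this example, and the sketch only looks at meta tags, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, nofollow".
            self.directives[name] = {
                part.strip().lower() for part in content.split(",") if part.strip()
            }


def has_noindex(html: str, crawler: str = "robots") -> bool:
    """Return True if the page carries a noindex directive for `crawler`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives.get(crawler, set())


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex"></head></html>'
    print(has_noindex(page))
```

In practice the HTML string would come from fetching a page of the instance after saving the Custom HTML setting; the fetch is left out here so the sketch stays self-contained.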
