How do I search Confluence for an external URL?

I need to find all the instances of an external URL in our wiki. Is there any way to do this through an advanced search?

6 answers

1 accepted

1 vote
Answer accepted

Hi Mandy,

There are some limitations with the Confluence search.  We had an existing feature request to extend the search capabilities, but this has since been closed.  Your best bet will be to use the workaround described here: https://confluence.atlassian.com/confkb/how-to-perform-a-confluence-site-search-for-keywords-and-links-through-the-database-830284252.html.  This involves using the Confluence database instead.

Kind regards,

Miranda Rawson

 

The link above is not working.

AnnWorley Atlassian Team Jul 12, 2018

Here is an updated link: How to perform a Confluence site search for keywords and links through the database

In case there is an issue accessing it, here is an excerpt:

For example, an external URL may have been inserted into numerous pages, but over time those URLs might turn into dead links because of changes in the subdomain or URL path.

Solution
To search for these contents, run the SQL query below on your Confluence database. Replace the <INSERT_KEYWORD_HERE> with your keyword. The % symbol represents a wildcard search.
The SQL results will return the content type, title along with the space details including the spacestatus (either CURRENT or ARCHIVED). If a space is Archived, it won't be searchable in Confluence's User Interface.

select c.CONTENTTYPE,c.TITLE, s.SPACENAME, s.SPACEKEY, s.SPACETYPE, s.SPACESTATUS 
from content c join spaces s on c.SPACEID=s.SPACEID
where CONTENTID in
(select CONTENTID from bodycontent where BODY like '%<INSERT_KEYWORD_HERE>%')
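To illustrate how the `LIKE '%...%'` wildcard in this query behaves, here is a minimal Python sketch that runs the same query against a toy in-memory SQLite database. The three tables are heavily simplified stand-ins containing only the columns the query uses, the sample rows are made up, and the function name `find_pages_containing` is hypothetical. A real Confluence database has many more columns, runs on a different engine, and may differ in identifier casing, so treat this as a demonstration of the query shape only.

```python
import sqlite3

def find_pages_containing(conn, keyword):
    """Return content and space details for every body that contains keyword."""
    query = """
        SELECT c.CONTENTTYPE, c.TITLE, s.SPACENAME, s.SPACEKEY, s.SPACESTATUS
        FROM content c JOIN spaces s ON c.SPACEID = s.SPACEID
        WHERE c.CONTENTID IN
            (SELECT CONTENTID FROM bodycontent WHERE BODY LIKE ?)
        ORDER BY c.TITLE
    """
    # The % wildcards are added around the keyword, as in the query above.
    return conn.execute(query, (f"%{keyword}%",)).fetchall()

# Toy schema and data (assumptions for illustration, not the real Confluence schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE spaces (SPACEID INTEGER, SPACENAME TEXT, SPACEKEY TEXT, SPACESTATUS TEXT);
    CREATE TABLE content (CONTENTID INTEGER, CONTENTTYPE TEXT, TITLE TEXT, SPACEID INTEGER);
    CREATE TABLE bodycontent (CONTENTID INTEGER, BODY TEXT);
    INSERT INTO spaces VALUES (1, 'Docs', 'DOC', 'CURRENT');
    INSERT INTO content VALUES (10, 'PAGE', 'Setup', 1);
    INSERT INTO content VALUES (11, 'PAGE', 'FAQ', 1);
    INSERT INTO bodycontent VALUES (10, '<a href="https://twiki.company.com/x">old</a>');
    INSERT INTO bodycontent VALUES (11, 'no links here');
""")

rows = find_pages_containing(conn, "twiki.company.com")
print(rows)
```

Passing the keyword as a bound parameter rather than pasting it into the SQL string avoids quoting problems when the URL contains apostrophes. Note that `%` and `_` inside the keyword itself would still act as `LIKE` wildcards.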
13 votes

There is a little-known feature of Confluence search: it can also do regular expression searches. Try doing this for your search ...

/.*{your url here}.*/

Now, you will have to format the URL to escape any regular expression reserved characters. The biggest one would be periods, but if you have any of these characters in the URL you would need to change them. See below for replacements.

\ -> \\
. -> \.
( -> \(
) -> \)
[ -> \[
^ -> \^
$ -> \$
| -> \|
* -> \*
+ -> \+
? -> \?
{ -> \{

So if your URL was https://www.google.com/stuff+things, you would search using this syntax ...

/.*https://www\.google\.com/stuff\+things.*/
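The escaping step can be automated. Below is a minimal Python sketch that applies exactly the replacements listed in this answer and wraps the result in `/.*` and `.*/`. The function names are hypothetical, and the sketch has not been checked against a Confluence instance. It covers only the characters listed above, so other characters that the search engine may treat as reserved are left untouched.

```python
# Characters from the replacement list in this answer. Anything else,
# including "/" and "]", is passed through unchanged (an assumption that
# mirrors the answer rather than the full search syntax).
RESERVED = "\\.()[^$|*+?{"

def escape_for_confluence_regex(url: str) -> str:
    """Prefix each listed reserved character with a backslash."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in url)

def build_regex_query(url: str) -> str:
    """Wrap the escaped URL so it matches anywhere inside a term."""
    return "/.*" + escape_for_confluence_regex(url) + ".*/"

query = build_regex_query("https://www.google.com/stuff+things")
print(query)
```

For the example URL this prints the same search string as shown above. One thing to watch for: the curly braces in the `{your url here}` template are placeholders and should not be typed literally, since an unescaped `{` is likely to be parsed as the start of a repetition count.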

@Davin Studer

This is really interesting (and I didn't know about regex in Confluence at all). But would this show me the instances of URL if the full URL appeared on the page OR would this show me if the URL occurred in the source?

I tried that and I got a system error and a HUGE stack trace; am self-hosted running Confluence 6.9.3 on CentOS6.

 

System Error

Cause

java.lang.IllegalArgumentException: integer expected at position 3
    at org.apache.lucene.util.automaton.RegExp.parseRepeatExp(RegExp.java:896)

Stack Trace:

java.lang.IllegalArgumentException: integer expected at position 3
 at org.apache.lucene.util.automaton.RegExp.parseRepeatExp(RegExp.java:896)
 at org.apache.lucene.util.automaton.RegExp.parseConcatExp(RegExp.java:880)

... etc ....

    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:684)
    at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1539)
    at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1495)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:748)


 

I also am trying to search for a URL in our Confluence site. 

We are migrating a Twiki to Confluence.

The Twiki URLs are of the form:

https://twiki.company.com

I have tried this search which works:

/.*twiki.*/

Which will find those URLs, but also every other instance of Twiki.

This search does not work:

/.*twiki\.company.*/

Which doesn't make sense. I've escaped the full stop.

Any ideas?

For example, if the external site URL to search for is https://demo.site.com/..... , this worked in Confluence for me:

http*demo*site*com*

 

Regards,

Ankit

This solution works and is simple in application.


I can't thank you enough for this solution! This worked great!

Does this search still work for anyone?

We are migrating a Twiki to Confluence.

The Twiki URLs are of the form:

https://twiki.company.com

I have tried this which does not work. 

*twiki*

There are instances of twiki in the pages being searched.

Any ideas?

Try, for example, https*twiki*company*com or any combination of words that make up your url. The more words you use, the more accurate results you will get.

You guys understand that not being able to find a basic text string in your pages is a pretty major failing of a data repository in this day and age, right?  Sure, there are workarounds.  Sure, you can use a regex.  But critically, when I type a string I know is in there and can't find it, I start asking some very fundamental questions about this product.  Strongly recommend you get the search tools up to a point that they meet basic expectations of a search tool.  This has been one of my biggest issues with Confluence, and the kind of thing that would lead us to consider alternatives.

Even Sharepoint has a way to dredge the entire contents and Find and Replace.  This is 1980's technology, and it is shocking that it is not possible in this tool.

It's just a show stopper: to have a wiki where you cannot find all occurrences of links you want to update. How are you supposed to keep links updated in your system?

Again this is just a result of the childish "ephemeral chat" perspective and not a serious, professional system to produce and maintain knowledge.

 

The response "We had an existing feature request to extend the search capabilities, but this has since been closed." is just unacceptable and rude. I know you don't care; you don't need to say it to my face...

If you have access to the Reporting add-on, you can create a report to find all these links. I wrote about doing that here:  https://community.atlassian.com/t5/Confluence-articles/Finding-and-fixing-broken-links-with-Reporting-for-mere-mortals/ba-p/1334589 

Ironically, the link to a recipe to fix broken links was broken by some trailing characters ;-)

Searching for it found it https://community.atlassian.com/t5/Confluence-articles/Finding-and-fixing-broken-links-with-Reporting-for-mere-mortals/ba-p/1334589 

Could someone PLEASE expand on how you created a report, what Service Rocket is and how to DO all these things?

This article assumes so many bits of knowledge I have no idea where to start. 

What Macros are required? And / or what addons are required?

I am pretty new to Confluence, and landed here because I need to try to find URLs in a site we are migrating. 

Links to information are totally fine.

Service Rocket is the producer of the Confluence plugin Reporting for Confluence.

It is very powerful but not easy to use for a newbie or non-developer - it is basically "programming by macro design" - and quite expensive.

The article shows you all the macros required and how to nest them for this specific use case, but this requires the plugin. The plugin documentation contains plenty of other use cases with "recipes".

Thank you. 

Since Atlassian search should be able to find things like URLs trivially, I will keep bugging them about actually getting search to work.

A plugin as described above is overkill for something that should work anyway.


As a follow-up to this, I found a section in the Confluence search help that says, basically, you can't use a * or ? at the start of a search term. The search will just fail silently.

Hence my attempts at *twiki*company*com were just failing silently.

@Grzegorz Pitek suggested https*twiki*company*com which does indeed work.

But the basic takeaway is that a * or ? at the start will silently fail with Confluence's search implementation.
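For anyone building these wildcard searches for many URLs, here is a minimal Python sketch of the approach that worked in this thread: split the URL into its words and join them with `*`, so the query never starts with a wildcard. The function name and the `trailing_wildcard` option are hypothetical, and the split rule (anything that is not a letter or digit is a separator) is an assumption.

```python
import re

def build_wildcard_query(url: str, trailing_wildcard: bool = False) -> str:
    """Join the alphanumeric words of a URL with * wildcards.

    The result always begins with a word, never with * or ?, because a
    leading wildcard is reported in this thread to fail silently.
    """
    words = [w for w in re.split(r"[^A-Za-z0-9]+", url) if w]
    if not words:
        raise ValueError("URL contains no searchable words")
    query = "*".join(words)
    return query + "*" if trailing_wildcard else query

twiki_query = build_wildcard_query("https://twiki.company.com")
demo_query = build_wildcard_query("http://demo.site.com", trailing_wildcard=True)
print(twiki_query)
print(demo_query)
```

The first call reproduces the `https*twiki*company*com` search suggested above, and the second reproduces the `http*demo*site*com*` form with the trailing wildcard.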
