Hi Folks,
We have a documentation space which is published to our clients, and a private drafting space which we use internally to draft pages prior to publishing. Occasionally, links to the draft space are left in the published pages, which means clients are unable to use them. We also have the occasional dead link that makes it through.
I'd love to be able to generate a report of all links in a given space so that we could scan through it and find the ones that need to be cleaned up. I know you can see this on a page-by-page basis, but we have a large collection of pages and doing it that way would be quite time-consuming. It would be much easier if we could generate a report for the whole space and then search it.
Is there a native way to do this, or an app we could use to make it work?
-Brendan
I'd use the Find and Replace app by Easy Apps to find and modify URLs in one go. I left a comment with a screenshot in this thread.
More importantly...
If I may ask, what method do you use to move content from the drafting space to the client-facing space? Manual copy-paste, or a space sync app?
Hi @brendan_long,
Here are some apps that come to mind:
Links Hierarchy
Space Content Manager
Better Content Archiving
Scroll Documents
Personally, I would go the API route for more control over the kind of link report I generate. Here's a sample Python script:
import requests
from bs4 import BeautifulSoup
import csv

# ==============================
# CONFIGURATION
# ==============================
BASE_URL = "https://your-domain.atlassian.net/wiki"
SPACE_KEY = "SPACEKEY"
EMAIL = "your-email@example.com"
API_TOKEN = "your-api-token"
OUTPUT_FILE = "confluence_space_links.csv"
PAGE_LIMIT = 50

# ==============================
# AUTH & HEADERS
# ==============================
auth = (EMAIL, API_TOKEN)
headers = {
    "Accept": "application/json"
}

# ==============================
# FETCH ALL PAGES IN SPACE
# ==============================
def get_all_pages():
    pages = []
    start = 0
    while True:
        url = f"{BASE_URL}/rest/api/content"
        params = {
            "spaceKey": SPACE_KEY,
            "limit": PAGE_LIMIT,
            "start": start,
            "expand": "body.storage,version"
        }
        response = requests.get(url, headers=headers, auth=auth, params=params)
        response.raise_for_status()
        data = response.json()
        pages.extend(data["results"])
        # Stop when the API no longer advertises a next page of results
        if "_links" not in data or "next" not in data["_links"]:
            break
        start += PAGE_LIMIT
    return pages

# ==============================
# EXTRACT LINKS FROM HTML
# ==============================
def extract_links(html):
    # Note: this picks up <a href> links; Confluence storage format can also
    # represent page links as <ac:link>/<ri:page> elements, which you may want to parse too.
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for a in soup.find_all("a", href=True):
        href = a["href"]
        if href.startswith("http"):
            link_type = "External"
        elif href.startswith("/wiki"):
            link_type = "Internal"
        else:
            link_type = "Other"
        links.append((href, link_type))
    return links

# ==============================
# MAIN EXECUTION
# ==============================
def main():
    pages = get_all_pages()
    with open(OUTPUT_FILE, mode="w", newline="", encoding="utf-8") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow([
            "Space",
            "Page Title",
            "Page ID",
            "Page URL",
            "Link URL",
            "Link Type"
        ])
        for page in pages:
            title = page["title"]
            page_id = page["id"]
            page_url = f"{BASE_URL}{page['_links']['webui']}"
            html = page["body"]["storage"]["value"]
            links = extract_links(html)
            for link_url, link_type in links:
                writer.writerow([
                    SPACE_KEY,
                    title,
                    page_id,
                    page_url,
                    link_url,
                    link_type
                ])
    print(f"✅ Report generated: {OUTPUT_FILE}")

if __name__ == "__main__":
    main()
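To zero in on the links that point at your drafting space, you can filter the generated CSV afterwards. A minimal sketch, assuming the drafting space key is DRAFT (replace with yours); Cloud page URLs usually contain /spaces/&lt;KEY&gt;/ or the older /display/&lt;KEY&gt;/ form:

import csv

DRAFT_SPACE_KEY = "DRAFT"  # assumption: replace with your drafting space key

with open("confluence_space_links.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        url = row["Link URL"]
        # Links into a space typically contain /spaces/<KEY>/ or /display/<KEY>/
        if f"/spaces/{DRAFT_SPACE_KEY}/" in url or f"/display/{DRAFT_SPACE_KEY}/" in url:
            print(f'{row["Page Title"]}: {url}')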
👆🏻 Above is sample Python code! Let me know if you need anything else.
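One more thought on the dead links you mentioned: you could extend the report with a rough liveness check on each external URL. A sketch using requests (HEAD first, falling back to GET, since some servers reject HEAD; internal /wiki links would need the same auth as the script above):

import requests

def check_link(url, auth=None, timeout=10):
    # Returns the final HTTP status code, or None if the URL is unreachable.
    try:
        resp = requests.head(url, auth=auth, allow_redirects=True, timeout=timeout)
        if resp.status_code >= 400:
            # Some servers reject HEAD; retry with GET before flagging the link as dead.
            resp = requests.get(url, auth=auth, allow_redirects=True, timeout=timeout)
        return resp.status_code
    except requests.RequestException:
        return None

You could run this over every "External" row in the CSV and flag anything returning None or a 4xx/5xx status.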