Generate a report of all links in a space (Cloud)

brendan_long
February 1, 2026

Hi Folks,

We have a documentation space which is published to our clients, and a private drafting space which we use internally to draft pages prior to publishing. Occasionally, links to the draft space are left in the published pages, which means that clients are unable to use them. We also have the occasional dead link that makes it through.

I'd love to be able to generate a report of all links in a given space so that we could scan through that and find the ones that need to be cleaned up. I know you can see this on a page by page basis, but we have a large collection of pages and this would be quite time consuming. It would be much easier if we could generate a report for the whole space and then search it.

Is there a native way to do this, or an app we could use to make it work?

-Brendan

2 answers

0 votes
Kris Klima _K15t_
Community Champion
February 1, 2026

Hi @brendan_long 

I'd use the Find and Replace app by Easy Apps to find and modify URLs in one go. I left a comment with a screenshot in this thread.

More importantly...

If I may ask, what method do you use to move content from the drafting space to the client-facing space? Manual copy-paste, or a space sync app?

 

0 votes
Bibek Behera
Community Champion
February 1, 2026

Hi @brendan_long ,

There are a few apps I can think of:

  • Links Hierarchy

  • Space Content Manager

  • Better Content Archiving

  • Scroll Documents

I would personally go via API route to have more control over what kind of report I want to generate for the links.

import requests
from bs4 import BeautifulSoup
import csv

# ==============================
# CONFIGURATION
# ==============================
BASE_URL = "https://your-domain.atlassian.net/wiki"
SPACE_KEY = "SPACEKEY"
EMAIL = "your-email@example.com"
API_TOKEN = "your-api-token"

OUTPUT_FILE = "confluence_space_links.csv"
PAGE_LIMIT = 50

# ==============================
# AUTH & HEADERS
# ==============================
auth = (EMAIL, API_TOKEN)
headers = {
    "Accept": "application/json"
}

# ==============================
# FETCH ALL PAGES IN SPACE
# ==============================
def get_all_pages():
    pages = []
    start = 0

    while True:
        url = f"{BASE_URL}/rest/api/content"
        params = {
            "spaceKey": SPACE_KEY,
            "limit": PAGE_LIMIT,
            "start": start,
            "expand": "body.storage,version"
        }

        response = requests.get(url, headers=headers, auth=auth, params=params)
        response.raise_for_status()
        data = response.json()

        pages.extend(data["results"])

        if "_links" not in data or "next" not in data["_links"]:
            break

        start += PAGE_LIMIT

    return pages

# ==============================
# EXTRACT LINKS FROM HTML
# ==============================
def extract_links(html):
    soup = BeautifulSoup(html, "html.parser")
    links = []

    for a in soup.find_all("a", href=True):
        href = a["href"]
        if href.startswith("http"):
            link_type = "External"
        elif href.startswith("/wiki"):
            link_type = "Internal"
        else:
            link_type = "Other"

        links.append((href, link_type))

    return links

# ==============================
# MAIN EXECUTION
# ==============================
def main():
    pages = get_all_pages()

    with open(OUTPUT_FILE, mode="w", newline="", encoding="utf-8") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow([
            "Space",
            "Page Title",
            "Page ID",
            "Page URL",
            "Link URL",
            "Link Type"
        ])

        for page in pages:
            title = page["title"]
            page_id = page["id"]
            page_url = f"{BASE_URL}{page['_links']['webui']}"
            html = page["body"]["storage"]["value"]

            links = extract_links(html)

            for link_url, link_type in links:
                writer.writerow([
                    SPACE_KEY,
                    title,
                    page_id,
                    page_url,
                    link_url,
                    link_type
                ])

    print(f"✅ Report generated: {OUTPUT_FILE}")

if __name__ == "__main__":
    main()
Bibek Behera
Community Champion
February 1, 2026

👆🏻 Above is sample Python code! Let me know if you need anything else.
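Once the CSV report exists, the original problem (draft-space links left in published pages) becomes a simple filter over it. Here is a minimal sketch; it assumes the report format from the script above, and uses a hypothetical drafting-space key `DRAFTKEY` (Confluence Cloud internal links typically contain `/wiki/spaces/<KEY>/`, but do verify that against your own report before relying on it):

```python
import csv
import io

DRAFT_SPACE_KEY = "DRAFTKEY"  # hypothetical key of the private drafting space

def find_draft_links(rows, draft_key):
    """Return report rows whose link URL points into the drafting space."""
    needle = f"/spaces/{draft_key}/"
    return [row for row in rows if needle in row.get("Link URL", "")]

# Demo with an in-memory sample of the report; in practice, pass
# csv.DictReader(open("confluence_space_links.csv", newline="")) instead.
sample = io.StringIO(
    "Page Title,Link URL\n"
    "Getting Started,/wiki/spaces/DRAFTKEY/pages/123\n"
    "FAQ,https://example.com/help\n"
)
for row in find_draft_links(csv.DictReader(sample), DRAFT_SPACE_KEY):
    print(f'{row["Page Title"]} -> {row["Link URL"]}')
```

The same pattern extends to the dead-link half of the question: iterate the "External" rows and probe each URL (e.g. with `requests.head`), flagging any that error out.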

DEPLOYMENT TYPE
CLOUD
PRODUCT PLAN
STANDARD
PERMISSIONS LEVEL
Product Admin
TAGS
AUG Leaders
