Finding storage usage of Confluence space and page using REST API

Hi,

I am trying to fetch the storage consumed by our spaces using the Python script documented here: https://confluence.atlassian.com/confkb/finding-storage-usage-of-confluence-space-and-page-using-rest-api-1063555292.html

The script runs fine but stops after 25 spaces. I can't find any limit in the script itself, so perhaps the REST API is limiting the results.

Can somebody help me fetch all results?

Regards,

Jonny

2 answers

1 vote

It seems the Python script does not follow the `next` links (under `_links` in each response) when there are more than 25 results.

I believe the script needs to be modified to handle this.
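To illustrate what following the `next` link could look like, here is a minimal sketch. It assumes each v1 response carries a `results` list and, when more data exists, a `_links.next` path that is appended to `_links.base`. The names `iter_results` and `get_json` are hypothetical and not part of the original script; `get_json` stands for any callable that takes a URL and returns the decoded JSON body.

```python
def iter_results(get_json, wiki_base, first_path):
    """Yield every item of a paginated collection by following `_links.next`.

    get_json: hypothetical callable, URL -> decoded JSON dict.
    wiki_base: e.g. "https://your-site.atlassian.net/wiki" (assumption).
    first_path: e.g. "/rest/api/space?limit=25".
    """
    url = wiki_base + first_path
    while url:
        body = get_json(url)
        yield from body.get("results", [])
        links = body.get("_links", {})
        next_path = links.get("next")
        # Assumption: `next` is relative to `_links.base`; fall back to wiki_base.
        url = links.get("base", wiki_base) + next_path if next_path else None


# Possible wiring with the requests library (not executed here):
#   session = requests.Session()
#   session.auth = (USER, TOKEN)
#   get_json = lambda url: session.get(url, headers={"Accept": "application/json"}).json()
#   for space in iter_results(get_json, BASE_URL + "/wiki", "/rest/api/space"):
#       print(space["key"])
```

Keeping the HTTP call behind `get_json` means the loop can be checked against canned responses before it is pointed at a real site.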


Hi there,

Has anyone already updated the script?

Thanks

Patrick


0 votes

I found a solution and edited the script to request up to 500 results:

# This code sample uses the 'requests', 'json' and 'csv' libraries:
import requests
import json
import csv

# INSERT "USER", "TOKEN", "BASE_URL" HERE
USER = "User"
TOKEN = "Token"
BASE_URL = "https://xxxxxxxxx.atlassian.net"

# newline='' keeps the csv module from writing blank rows on Windows
with open('per_page.csv', 'w', newline='') as pagecsvfile, open('per_space.csv', 'w', newline='') as spacecsvfile:
    perPageWriter = csv.writer(pagecsvfile, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
    perSpaceWriter = csv.writer(spacecsvfile, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
    perPageWriter.writerow(['pageid', 'attachment_size(byte)'])
    perSpaceWriter.writerow(['space_name', 'space_key', 'attachment_size(byte)'])

    headers = {
        "Accept": "application/json"
    }

    # limit=500 asks for a larger first page of spaces; this script does not follow pagination
    response = requests.request(
        "GET",
        BASE_URL + "/wiki/rest/api/space?limit=500",
        headers=headers,
        auth=(USER, TOKEN)
    )

    site_attachment_volume = 0

    # Get all space keys
    space_key_results = json.loads(response.text)["results"]
    for space in space_key_results:
        space_attachment_volume = 0
        # Get related page IDs from space keys
        response = requests.request(
            "GET",
            BASE_URL + "/wiki/rest/api/space/" + space["key"] + "/content",
            headers=headers,
            auth=(USER, TOKEN)
        )
        print("Space Key: " + space["key"])
        page_results = json.loads(response.text)["page"]["results"]
        for page in page_results:
            page_attachment_volume = 0
            # Get attachments from each page
            print(" " + "Page ID: " + page["id"])
            response = requests.request(
                "GET",
                BASE_URL + "/wiki/rest/api/content/" + page["id"] + "/child/attachment",
                headers=headers,
                auth=(USER, TOKEN)
            )
            attachment_results = json.loads(response.text)["results"]
            for attachment in attachment_results:
                print(" " + "Attachment Name: " + json.dumps(attachment["title"]) + ", " + json.dumps(attachment["extensions"]["fileSize"]) + " bytes")
                # fileSize is already numeric, so no JSON round trip is needed
                page_attachment_volume += int(attachment["extensions"]["fileSize"])

            space_attachment_volume += page_attachment_volume
            print(" -->" + "PAGE TOTAL: " + str(page_attachment_volume))

            # Write to CSV
            perPageWriter.writerow([page["id"], str(page_attachment_volume)])

        print("\n " + "SPACE TOTAL: " + str(space_attachment_volume) + " bytes")
        print("----------")

        # Write to CSV
        perSpaceWriter.writerow([space["name"], space["key"], str(space_attachment_volume)])
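The script above raises the limit only for the space list; the page and attachment requests still use the server's default page size, so totals may be undercounted for large spaces or pages. As a rough sketch of one way around that, the following steps through a collection with the v1 `start` and `limit` query parameters. The helper names `fetch_all`, `attachment_bytes` and `get_json` are hypothetical, and the sketch assumes the response reports the `limit` the server actually applied.

```python
from urllib.parse import urlencode


def fetch_all(get_json, url, limit=100):
    """Collect every item of a v1 collection using start/limit offsets.

    get_json: hypothetical callable, URL -> decoded JSON dict.
    """
    start = 0
    items = []
    while True:
        body = get_json(url + "?" + urlencode({"start": start, "limit": limit}))
        results = body.get("results", [])
        items.extend(results)
        # Assumption: the server may cap the page size and reports it as `limit`.
        effective_limit = body.get("limit", limit)
        if not results or len(results) < effective_limit:
            break
        start += len(results)
    return items


def attachment_bytes(get_json, base_url, page_id):
    """Sum the fileSize of all attachments on one page, across all result pages."""
    url = f"{base_url}/wiki/rest/api/content/{page_id}/child/attachment"
    return sum(int(a["extensions"]["fileSize"]) for a in fetch_all(get_json, url))
```

The same `fetch_all` helper could be reused for any endpoint that returns a top-level `results` list. The space content endpoint nests its results under `page`, so it would need a small adaptation.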


Hi there,

Would anyone be able to help update this script to work with Atlassian's v2 API for Confluence? With v2 I can use the pagination and the response header to pull more than 250 results, which I need because of the number of spaces we have.

I have updated the endpoints but am receiving the below error:

    page_results = json.loads(response.text)["page"]["results"]
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 3 column 1 (char 10)

Thanks

Jan
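A `JSONDecodeError` like the one above means the response body was not valid JSON, which often happens when the server returns an error page for a wrong URL or failed authentication. A minimal sketch for surfacing the real cause before parsing follows; `decode_json_body` is a hypothetical helper, and the actual cause in this case is not known from the thread.

```python
import json


def decode_json_body(status_code, content_type, text):
    """Return the parsed JSON body, or raise an error that shows what came back."""
    if status_code != 200:
        raise RuntimeError(f"HTTP {status_code}; body starts with {text[:200]!r}")
    if "json" not in (content_type or "").lower():
        raise RuntimeError(f"Unexpected Content-Type {content_type!r}; body starts with {text[:200]!r}")
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise RuntimeError(f"Body is not valid JSON ({exc}); it starts with {text[:200]!r}") from exc


# Possible use with a requests response object (not executed here):
#   body = decode_json_body(response.status_code,
#                           response.headers.get("Content-Type"),
#                           response.text)
```

Separately, the failing line indexes the body with `["page"]["results"]`, which matches the v1 space content response. If the v2 endpoints return a top-level `results` list instead, that lookup would also need to change once the body parses.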


Hi,

This will probably require some programming work, as the v2 API differs from v1.

Do you need the output on the command line, or would output in a table in Confluence also be OK?


Hi,

Ideally it would keep the same format and save to a CSV. I only actually need per_space.csv, not the individual pages, if that helps.

I am also not sure how the response header works in Python. I already have the GET spaces request working in Postman with the v2 API and can see the response header at the bottom.

Thanks

Jan
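On reading the response header in Python: a minimal sketch of header-based pagination, assuming the v2 API returns a `Link` header of the form `</wiki/api/v2/spaces?cursor=...>; rel="next"` while more results exist. The names `next_url`, `iter_v2` and `fetch` are hypothetical; `fetch` stands for any callable that takes a URL and returns a `(headers, body)` pair.

```python
import re
from urllib.parse import urljoin

# Matches the target of a rel="next" entry in a Link header.
_NEXT_LINK = re.compile(r'<([^>]+)>\s*;\s*rel="?next"?')


def next_url(link_header, current_url):
    """Return the absolute URL of the next page, or None if there is none."""
    if not link_header:
        return None
    match = _NEXT_LINK.search(link_header)
    if not match:
        return None
    # The header may hold a relative path, so resolve it against the current URL.
    return urljoin(current_url, match.group(1))


def iter_v2(fetch, first_url):
    """Yield every item, following the Link header until no next page remains."""
    url = first_url
    while url:
        headers, body = fetch(url)
        yield from body.get("results", [])
        url = next_url(headers.get("Link"), url)


# Possible wiring with the requests library (not executed here):
#   def fetch(url):
#       r = requests.get(url, headers={"Accept": "application/json"}, auth=(USER, TOKEN))
#       r.raise_for_status()
#       return r.headers, r.json()
```

With `requests`, the parsed header is also available as `response.links`, so `response.links.get("next", {}).get("url")` could stand in for the regular expression.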


Confluence

DEPLOYMENT TYPE

CLOUD

PRODUCT PLAN

STANDARD

PERMISSIONS LEVEL

Product Admin
