Best method for auditing/archiving/deleting attachments on Confluence?

Rory Aptekar April 10, 2018

Attachment usage on our Confluence Server instance has grown significantly over the past few months, and I'm unsure of the best way to get a handle on it. Ideally we'd like to either delete or archive attachments that haven't been accessed in some period of time, or at least see attachment storage per user so we can reach out to their teams and confirm that everything they've uploaded is still needed.

Any advice on what's worked for attachment analytics or archival, so that we aren't just retaining stale data? Even a way to get an overview of the biggest attachments would help.

2 answers

1 accepted

2 votes
Answer accepted
AnnWorley
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
April 10, 2018

Hi Rory,

I don't have a way to report on attachments by user, but you can start by finding the largest attachments and seeing whose pages they are on: "How to find the largest attachment files in your Confluence instance".
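
For a rough idea of what that kind of query looks like, here is a minimal sketch. The table and column names (CONTENT, CONTENTPROPERTIES, user_mapping, and the FILESIZE property) are assumptions about the Confluence Server schema and can differ between versions, so please check them against the linked article and try it on a copy of the database first.

```sql
-- Sketch only: table and column names are assumed, verify for your version.
-- Lists attachments from largest to smallest, with the containing page
-- and the user recorded as creator of the attachment row.
SELECT a.title    AS attachment_title,
       p.title    AS page_title,
       u.username AS uploaded_by,
       cp.longval AS size_bytes
FROM CONTENT a
JOIN CONTENTPROPERTIES cp ON cp.contentid = a.contentid
LEFT JOIN CONTENT p       ON p.contentid = a.pageid
LEFT JOIN user_mapping u  ON u.user_key = a.creator
WHERE a.contenttype = 'ATTACHMENT'
  AND cp.propertyname = 'FILESIZE'
ORDER BY cp.longval DESC;
```

The sketch has no row limit because the syntax for that depends on your database, so you may want to add one before running it on a large instance.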

Also, the page "How to get more statistical data (disk space, contents created) from Confluence's usage" has a query to find the total size of attachments (in bytes) across all pages and blogs in each space.
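
As an illustration only, a per-space total along those lines might look like the sketch below. It assumes that an attachment row points at its containing page through pageid and that the page row carries the spaceid; both are assumptions about the schema, so treat the query on the linked page as the reference.

```sql
-- Sketch only: table and column names are assumed, verify for your version.
-- Sums attachment sizes per space, largest space first.
SELECT s.spacekey,
       s.spacename,
       SUM(cp.longval) AS total_bytes
FROM CONTENT a
JOIN CONTENTPROPERTIES cp ON cp.contentid = a.contentid
JOIN CONTENT p            ON p.contentid = a.pageid
JOIN SPACES s             ON s.spaceid = p.spaceid
WHERE a.contenttype = 'ATTACHMENT'
  AND cp.propertyname = 'FILESIZE'
GROUP BY s.spacekey, s.spacename
ORDER BY total_bytes DESC;
```

If older attachment versions are stored as their own rows in your version, they are included in these totals as well.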

Hopefully you can extrapolate who the users are from the permissions on the larger spaces.

Please let me know if you have any follow-up questions.

Thanks,

Ann

Rory Aptekar April 10, 2018

Thanks Ann - this should be helpful to start with!

AnnWorley
Atlassian Team
April 10, 2018

A colleague saw your question and let me know about this query, which identifies files that appear in more than one place. Anything unusual in the results could indicate, for example, a page with lots of attachments being copied multiple times:


SELECT title, COUNT(title)
FROM CONTENT
WHERE contenttype = 'ATTACHMENT'
GROUP BY title
HAVING COUNT(title) > 1
ORDER BY COUNT(title) DESC;

Rory Aptekar April 10, 2018

Thanks again - we'll run the queries to see what we can come up with in the short term.

Jaime Sotelo Trujillo August 6, 2020

Hello Jhon,

My question is about how data deletion is carried out on cloud platforms once a subscription has been cancelled and the retention period has ended. What methods are used, and what does the process look like? Regards.

2 votes
Darin - Opus Guard
Marketplace Partner
Marketplace Partners provide apps and integrations available on the Atlassian Marketplace that extend the power of Atlassian products.
April 12, 2024

@Rory Aptekar it sounds like you may want to consider an auto-delete/purge feature, with ongoing automation driven by a retention policy. Putting Confluence pages/blog posts into the Archive still counts against your total storage. Additionally, for compliance purposes, archived and deleted content is still recoverable by users and is considered discoverable by auditors.

Check out Content Retention Manager for Confluence in the Atlassian Marketplace. It's intended to provide simple, straightforward content retention management that helps clean up spaces and supports ISO/SOC2 compliance at the same time. It can also mark content to live on beyond the retention period, for content that is too good to purge.
