Bulk deletion of attachments using API

As a Jira administrator, new use cases can come up at any time, so it pays to understand how to make the best use of the Atlassian API. If you prefer working with scripts, this article is for you. It assumes that you have a basic understanding of making an API call, or that you can follow the instructions below to achieve the desired outcome.

Disclaimer: Please note that deleting attachments is a destructive action that is not recoverable, so take proper care before attempting to use the below method.

This article focuses on two use cases where the REST API is used to administer Jira projects by deleting unwanted attachments from issues.

  • Storage space cleanup

  • GDPR compliance

If you have been administering Jira (Jira Software, Jira Service Management (formerly Jira Service Desk), or Jira Work Management (formerly Jira Core)) for a while, you may have reached a point where your storage allocation is almost full and you want to get rid of old files that take up too much space in your cloud instance. This use case uses the REST API to do that. Please note that Automation for Jira can bulk-delete attachments by filename, with a limit of 1K issues. The API method described here adds the following options:

  • Delete by users

  • Delete by file extension

  • Delete by file size

  • Delete by date range

  • No issue query limit

All of the above is possible with the REST API endpoint provided by Atlassian and a little scripting. This method suits administrators who want to reduce the attachment storage used on their instance, as well as those who want to automate attachment deletion from certain projects at a set interval. If you would like to automate this process, continue reading to find out more.

The other use case concerns GDPR compliance, where an organization is required to delete personal information from its projects. Suppose you are an IT Service Desk manager and your organization requires that attachments on customer tickets be deleted within 60 to 90 days. If your project has been running for two or more years and has 40K to 50K tickets, how do you find every attachment and delete each one? And how do you delete only the attachments of certain customers who want their data removed from your tickets?

Oh! You’re still waiting for an answer. Well, the simple fact is that all of this is possible. I will show you how to achieve it within your Jira projects and stay GDPR compliant by removing user-generated content that contains personal or identifiable details. I’ll also show you the basic usage of this API and how to integrate it with your local system (macOS or Linux) to delete attachments automatically at a regular interval using cron jobs.

There are two ways to delete attachments with this API: the file method or the search method. I will keep the process as straightforward as possible. We’ll use Python as our scripting language and a library called jiraone, which can be installed with pip install jiraone or python3 -m pip install jiraone

Store your credentials in a JSON file, e.g. config.json, as shown below.

{
"user": "prince.nyeche@example.com",
"password": "API-Token-here",
"url": "https://yourinstance.atlassian.net"
}
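Before logging in, it can help to confirm that the credentials file is readable and complete. The sketch below is not part of jiraone: load_config and REQUIRED_KEYS are hypothetical names used for illustration, and the only assumption is that the file holds the three keys shown above. It reads the file with an explicit encoding and closes it afterwards.

```python
import json
from pathlib import Path

# Keys taken from the example config.json above.
REQUIRED_KEYS = ("user", "password", "url")


def load_config(path="config.json"):
    """Read the credentials file and check that the expected keys are set."""
    # An explicit encoding avoids platform-dependent defaults.
    with Path(path).open(encoding="utf-8") as handle:
        config = json.load(handle)
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"config file is missing: {', '.join(missing)}")
    return config
```

You could then pass the result to the login call in the same way as the examples in this article, e.g. LOGIN(**load_config()).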

 

Using Search Method 

If the file argument is provided, the search argument is ignored. If the file argument is not provided, the search argument must be provided, otherwise an error is returned. E.g.

from jiraone import LOGIN, delete_attachments
import json

config = json.load(open('config.json'))
LOGIN(**config)

ext = [".png", ".pdf"]
users = ["5abcXXX", "617bcvXXX"]
size = ">30MB"
jql = {"jql": "project in (COM) ORDER BY Rank DESC"}
delete_attachments(search=jql, extension=ext, by_user=users, by_size=size)

Let me explain what each argument does and how you can use it. The above example is one way to make a request. The search argument accepts any of the forms below

# previous expression

key = "COM-120" # a string as issue key
key = "COM-120,TP-15" # a string separated by comma
key = ["COM-120", "IP-18", 10034] # a list of issue keys or issue id
key = {"jql": "project in (COM) ORDER BY Rank DESC"} # a dict with a valid jql

Any of the above lets you search for issues that have attachments. The extension argument can be used as below

# previous expression

ext = ".png" # a string
ext = ".png,.pdf" # a string separated by comma
ext = [".png", ".zip", ".csv"] # a list of extensions

You can also omit the “dot” prefix on the extension. However, when you supply multiple extensions, either as a string or a list, be consistent: either every extension has the dot prefix or none does. E.g.

Valid 

# previous expression
ext = [".png", ".zip", ".csv"] # a list of extensions

Valid

# previous expression
ext = ["png", "zip", "csv"] # a list of extensions
ext = "png,zip,pdf" # a string separated by comma

Invalid

# previous expression
ext = [".png", "zip", ".csv"] # a list of extension

In the invalid example, notice that one of the extensions doesn’t have a “dot” prefix! When this happens, that file extension is skipped and attachments with it won’t be deleted when the API runs.
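Since a mixed list is skipped silently, you may want to catch it before making the request. The following is a minimal sketch and not jiraone’s own validation: check_extensions is a hypothetical helper that accepts the same string or list forms shown above and raises an error when dotted and undotted extensions are mixed.

```python
def check_extensions(extensions):
    """Return the extensions as a list, rejecting mixed dot prefixes."""
    if isinstance(extensions, str):
        # Accept the comma-separated string form, e.g. ".png,.pdf".
        extensions = [part.strip() for part in extensions.split(",") if part.strip()]
    dotted = [ext.startswith(".") for ext in extensions]
    # Either every entry has the dot prefix or none of them does.
    if any(dotted) and not all(dotted):
        raise ValueError(f"mixed dot prefixes in extensions: {list(extensions)}")
    return list(extensions)
```

For example, ext = check_extensions([".png", ".zip", ".csv"]) returns the list unchanged, while the invalid example above raises a ValueError.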

The by_user argument lets you filter the deletion by user, identified by accountId. This argument expects a list of users

# previous expression

users = ["5abcXXX", "617bcvXXX"]

The deletion occurs when a matching user is found.

The by_size argument filters the deletion by file size. Specify the limit using the format below

size = [condition][number][byte type]

  • Condition uses the greater than (>) or less than (<) symbol

  • Number can be any number, written in digits

  • Byte type is the unit of the size: kb, mb, or gb, or left blank for a size in bytes

# previous expression

size = ">12mb" # greater than 12mb in size
size = "<150mb" # less than 150mb in size
size = ">400kb" # greater than 400kb in size
size = "<20000" # less than 20000 bytes, with no byte type suffix specified
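To make the [condition][number][byte type] format concrete, here is a rough sketch of how such an expression could be interpreted. It is not jiraone’s implementation: parse_size and size_matches are hypothetical names, and I am assuming 1 kb = 1024 bytes and case-insensitive units, which the library may handle differently.

```python
import re

# Assumption: binary multiples (1 kb = 1024 bytes); blank means plain bytes.
UNITS = {"": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}
SIZE_PATTERN = re.compile(r"^\s*([<>])\s*(\d+)\s*(kb|mb|gb)?\s*$", re.IGNORECASE)


def parse_size(expression):
    """Split an expression such as '>12mb' into (condition, limit_in_bytes)."""
    match = SIZE_PATTERN.match(expression)
    if match is None:
        raise ValueError(f"unrecognised size expression: {expression!r}")
    condition, number, unit = match.groups()
    return condition, int(number) * UNITS[(unit or "").lower()]


def size_matches(expression, size_in_bytes):
    """Return True when a file of size_in_bytes satisfies the expression."""
    condition, limit = parse_size(expression)
    if condition == ">":
        return size_in_bytes > limit
    return size_in_bytes < limit
```

Under these assumptions, size_matches(">400kb", 500 * 1024) is true and size_matches("<20000", 25000) is false.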

The by_date argument determines whether an attachment should be removed based on its age. It takes the current local time from your machine, down to the second, and compares it with the time the attachment was created to get a time delta. If the period you specify is less than the time the attachment has existed, the check returns true; otherwise it returns false. You can make the request in any of the ways below.

# previous expression

dates = "3 days ago"
dates = "3 months ago" # you can say 3 months
dates = "15 weeks" # the ago part can be left out and it won't matter.

The accepted format for this argument is given below; only strings are accepted.

dates = "[number] <space> [time_info] <space> ago"

The ago part is optional (i.e. not needed, but it reads well), but the number and time_info parts are required. These are the expected values for time_info

  • minute or minutes, hour or hours, day or days, week or weeks, month or months, year or years

Use the singular or plural form, whichever reads correctly in English.

# previous expression

dates = "14 hours ago"
dates = "1 year ago"
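The comparison described for by_date can be sketched as follows. This is only an illustration of the idea, not jiraone’s code: parse_age and is_older_than are hypothetical names, a month is assumed to be 30 days and a year 365 days, and the timestamps are plain naive datetime values.

```python
import re
from datetime import datetime, timedelta

# Assumption: a month is 30 days and a year is 365 days.
SECONDS = {
    "minute": 60,
    "hour": 3600,
    "day": 86400,
    "week": 7 * 86400,
    "month": 30 * 86400,
    "year": 365 * 86400,
}
DATE_PATTERN = re.compile(
    r"^\s*(\d+)\s+(minute|hour|day|week|month|year)s?(?:\s+ago)?\s*$",
    re.IGNORECASE,
)


def parse_age(expression):
    """Turn '3 days ago' or '15 weeks' into a timedelta."""
    match = DATE_PATTERN.match(expression)
    if match is None:
        raise ValueError(f"unrecognised date expression: {expression!r}")
    number, unit = match.groups()
    return timedelta(seconds=int(number) * SECONDS[unit.lower()])


def is_older_than(created, expression, now=None):
    """Return True when the attachment has existed longer than the period."""
    now = now or datetime.now()
    return now - created > parse_age(expression)
```

With these assumptions, an attachment created a full day before the current time satisfies "14 hours ago", while one created four hours before does not.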

Besides using the standard way to call this function, you can always mix and match your search criteria using these four arguments: extension, by_user, by_size, by_date

The hierarchy follows the order in which they are listed above.

E.g. By extension and by size

# previous expression

delete_attachments(search=jql, extension=ext, by_size=size)

E.g. By size only

# previous expression

delete_attachments(search=jql, by_size=size)

E.g. By user and by size

# previous expression

delete_attachments(search=jql, by_user=users, by_size=size)

E.g. By user and by date

# previous expression

delete_attachments(search=jql, by_user=users, by_date=dates)

E.g. By date and by size

# previous expression

delete_attachments(search=jql, by_date=dates, by_size=size)

I think you get the general idea: these four arguments can be combined in any way, and each can also be used on its own.

Using File Method

Alternatively, if you do not want to run a search, you can export your filter query from the Jira UI: navigate to Filter > Advanced issue search, type your query to get the desired result, and export it as CSV with all fields.

You do not have to edit the file or change the format in any way. If you’ve exported it as an xlsx file or modified the file by removing other columns, add a delimiter argument and use “;” as the delimiter. Always ensure that the headers are present and not removed. Also, ensure that the “Attachment” and “Issue key” columns are always present in the file.

# previous login statement

ext = [".csv", ".mov", ".png"]
file = "Jira-export.csv"
delete_attachments(file=file, extension=ext)

Example with delimiter parameter.

# previous login statement with variable options

delete_attachments(file=file, extension=ext, delimiter=";")

# You can only filter by extension when using the file method.
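Because the file method depends on the “Attachment” and “Issue key” headers, a quick check of the export before running the deletion may save you a failed run. The sketch below uses only the standard library; missing_columns and REQUIRED_COLUMNS are hypothetical names, and reading with utf-8-sig is an assumption to tolerate a byte order mark if the file has one.

```python
import csv

# Header names the article says must be present in the export.
REQUIRED_COLUMNS = ("Issue key", "Attachment")


def missing_columns(path, delimiter=","):
    """Return the required header names that are absent from the export."""
    # utf-8-sig strips a leading byte order mark if one is present.
    with open(path, newline="", encoding="utf-8-sig") as handle:
        header = next(csv.reader(handle, delimiter=delimiter), [])
    return [column for column in REQUIRED_COLUMNS if column not in header]
```

An empty list means both headers were found; pass delimiter=";" when the file uses that delimiter.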

Turning on Safe mode

If you just want to test the function without actually deleting any attachments, with either the file or the search method, set the delete argument to False, which turns on safe mode. E.g.

# previous login statement with variable options

delete_attachments(file=file, extension=ext, delimiter=";", delete=False)
# result
# Safe mode on: Attachment will not be deleted "jira_workflow_vid.mp4" | Key: COM-701

The same argument is available when you use the search method.

# previous login statement with variable options

delete_attachments(search=jql, delete=False)
# result
# Safe mode on: Attachment will not be deleted "jira_workflow_vid.mp4" | Key: COM-701

When safe mode is on, all filtering is ignored.

Checkpoint

This API comes with a checkpoint: if the script stops or is disconnected for any reason (e.g. a dropped internet connection or termination of the script), it will resume from the iteration where it stopped once you restart it. Even if you’re running a JQL search with 50K+ issues and it stopped at the 35K issue during the extraction, it will resume from that iteration and from the marginal record (i.e. within +1 or -1 of the point where it stopped; in most instances it will be the exact record) to complete the extraction. As long as you answer “yes” when asked to use “the save point data”, the script will resume from the last known iteration.
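If you are curious how a save point of this kind can work in general, here is a small standalone sketch. It does not show how jiraone stores its save point data: process_with_checkpoint, the JSON file layout, and the next_index key are all hypothetical and only illustrate the idea of recording progress after each item and resuming from it.

```python
import json
from pathlib import Path


def process_with_checkpoint(items, handler, checkpoint_file):
    """Handle items in order, saving the next index so a rerun can resume."""
    checkpoint = Path(checkpoint_file)
    start = 0
    if checkpoint.exists():
        # Resume from the position recorded by an earlier, interrupted run.
        saved = json.loads(checkpoint.read_text(encoding="utf-8"))
        start = saved["next_index"]
    for index in range(start, len(items)):
        handler(items[index])
        # Record progress only after the item was handled successfully.
        checkpoint.write_text(
            json.dumps({"next_index": index + 1}), encoding="utf-8"
        )
    # A clean finish means the save point is no longer needed.
    if checkpoint.exists():
        checkpoint.unlink()
```

If the handler raises partway through, the file keeps the index of the first unprocessed item, so calling the function again with the same arguments continues from there.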

Making a cron script to auto-delete at Intervals

You can configure a simple shell script to run at set intervals on your local device. Wrap the Python script in a shell script that runs based on your configuration. This applies to macOS and Linux users, who can set up a crontab. On macOS, you might need to go to your security and privacy settings and grant “Full disk access” to the terminal app and cron in order to use the below.

Create a bash script; a simple one such as the one below will do. This assumes that you’ve already created the Python script using a similar example to those above and placed it in a folder on your machine. You can give the shell script any name, e.g. cronScript.sh. Ensure that you have execute permission on this file, e.g. chmod u+x cronScript.sh

#!/usr/bin/env bash
/usr/bin/python delete_attachment.py

Make sure the shell script works properly. Then open your terminal and type the below

crontab -e

The crontab -e command opens your crontab in an editor, vim in this case. Setting up cron requires that you familiarize yourself with its format; you can use crontab.guru to generate the expression that you need.

Press i to go to INSERT mode and then enter your expression. The one below is just an example.

0,15,30,45 * * * * cd /home/name/Desktop/scripts && ./cronScript.sh

Run pwd inside the folder where your script is located to get the directory path to use in the expression above.

Here, cronScript.sh is located in a folder called scripts, and this folder should contain both your Python script and your credentials file.

Once you’re done entering the expression, hit the esc key and type :wq to save and exit the editor. Make sure your shell script calls the Python executable by its full path rather than the python alias. Find the path by running which python in your terminal, then use that path (e.g. /usr/bin/python) directly in your shell script. Please note that this script requires a Python version greater than v3.6.x.

To list your crontab configurations, type crontab -l in your terminal.

The automation part is done. Now, when the scheduled time arrives, your script will run and automatically delete attachments based on your configuration. So if you have customers’ attachment data on tickets that needs to be removed within 60 or 90 days after the tickets are closed, you now have a way to do that automatically.

You can use this script to perform your bulk attachment deletions on Jira issues, filter the deletion however you want, or even integrate it with other systems to get your task done.

And that’s how you can automate the deletion of attachments and be GDPR compliant on your Jira project as an administrator.

 

22 comments

Hector Eduardo Calzada Tenorio June 16, 2022

Hi Prince, I hope you are fine!

I'm trying to get a report that tells me the storage used in my Jira instance by project in specific and then back up the projects with the most storage used and then delete them...

Can I get this report with jiraone? I'm looking information about it but can't find anything.

Prince Nyeche
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
June 20, 2022

You can use a different API of jiraone to get a report of attachments on each issue called

PROJECT.get_attachments_on_projects

That API returns the total number of attachments based on JQL with a direct URL to where those attachments are on your instance. See an example code of how to request the API from this article. In your use case you can separate each request to the API by project key in your JQL search, so you can get a report per project; that way you can see the total number of storage used.

Hector Eduardo Calzada Tenorio June 28, 2022

Hi Prince!

 

The information you share with me in the previous comment is very helpful, I'm working as you mentioned it and it seems to work only that I have a problem, I can't get all the information because I get the following error: UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 6045: character maps to <undefined> 

 

Error.PNG

 

Can you help me with this error?

I currently have the following versions:

Python 3.10.3

Jiraone 0.6.3

Pip 22.1.2

 

I hope you can help me, I will thank you so much.

Prince Nyeche
July 2, 2022

I think I know what the problem might be. It seems to be a character encoding issue on Windows, where the encoding has to be explicit when reading files. I'll push an update in v0.6.4 soon, as I have new features coming with that update.

Hector Eduardo Calzada Tenorio July 7, 2022

Hi Prince!

Thank you so much for your reply. I will be on the lookout for the update.

Hector Eduardo Calzada Tenorio August 18, 2022

Hi Prince, nice to greet you again.

I wanted to ask you if with Jiraone we can obtain a report that shows us all the Linked issues

to all the issues of a project.

 

Link.PNG

 

I have tried several options, but I don't get anything favorable. Do you know if with Jiraone I can get a report similar to the following?

 

Excel Link.PNG

 

 

I hope you can help me, I will be very grateful.

I await your comments.

Best regards.

Prince Nyeche
August 22, 2022

Hey @Hector Eduardo Calzada Tenorio 

I don't have this yet on jiraone but a report can be constructed with jiraone by calling multiple APIs but I think you're referring to a simple call or function to get this kind of report data. If that's the case, I can add work log data as part of an additional feature report but all those might be in the future updates. Probably from v0.6.5 and above. Please can you create a request here, that way I can keep it in view?

About the update from before. I've been caught up in other tasks but I should have v0.6.4 updated before the end of this week.

Prince Nyeche
August 26, 2022

Hey @Hector Eduardo Calzada Tenorio 

About the previous issue, you can download the latest version of jiraone to fix that problem. Let me know how that goes.

Kirill March 7, 2023

Please help me. My account has administrator rights, I got it. Why do I have such an error? [screenshot: 1.png]

Prince Nyeche
March 7, 2023

Hey @Kirill 

The error shows that you're not authenticated. This means probably the way you've connected might not be valid. Are you connecting to a Jira cloud or Jira Server/DC?

Kirill March 7, 2023

Jira 9.4.0 installed on our local server. 

Prince Nyeche
March 7, 2023

Hey @Kirill 

Add this line LOGIN.api = False to your script.

from jiraone import LOGIN, delete_attachments
import json

config = json.load(open('config.json'))
LOGIN.api = False
LOGIN(**config)

Then attempt your connection again. 

Kirill March 7, 2023

again this error [screenshot: 444.png]

Prince Nyeche
March 7, 2023

Okay, let's take a step back here and see if everything with your actual login is true. Use the below script

from jiraone import LOGIN, endpoint
import json

config = json.load(open('config.json'))
LOGIN.api = False
LOGIN(**config)

output = LOGIN.get(endpoint.myself())
print("Connection:", output.status_code, output.reason)
print(output.json())

See what output you get, and then we'll continue from there later on. 

Kirill March 7, 2023

@Prince Nyeche not bad) [screenshot: 200.png]

Prince Nyeche
March 7, 2023

Okay, now we know that works. Let's try the function again

from jiraone import LOGIN, delete_attachments
import json

config = json.load(open('config.json'))
LOGIN.api = False
LOGIN(**config)

jql = {"jql": "project = ABC ORDER BY Rank DESC"}
# ensure that the above project is accessible by you.
dates = "14 hours ago"
delete_attachments(search=jql, by_date=dates)

See if you still get the authentication error problem.

Kirill March 7, 2023

@Prince Nyeche it worked! Tell me how to write conditions if I want to delete attachments older than n years? [screenshot: work.png]

Prince Nyeche
March 7, 2023

Nice, just add the condition

# for the last 1 year from your current time
dates = "1 year ago"
# or referring to the last 3 years from your current time
dates = "3 years ago"

Use the by_date argument and that should do it. Please be careful when using this function as it will delete all attachments based on the action set.

Kirill March 7, 2023

@Prince Nyeche I asked wrongly. Delete attachments created later than 2 years ago .

Prince Nyeche
March 8, 2023

I think I get what you mean, so the function doesn't have a "from" or "range" time point, probably something I can add as an improvement later on.

Kirill March 8, 2023

@Prince Nyeche  Thanks. If it becomes available, please let us know

Prince Nyeche
March 9, 2023

Hey @Kirill so I don't forget this ask, please can you kindly create it as a request here
