Community
Products
Jira
Questions
Find Excel XLS or CSV file attachments within Jira and confluence- to check for PI.

Find Excel XLS or CSV file attachments within Jira and confluence- to check for PI.

Hi, our organisation wants to find and remove any PI - personal information uploaded into Jira or confluence. This is hard, as anyone can load anything in descriptive fields. The typical method we have seen our staff do, is to attach a CSV or XLS file to a Jira task. I'm looking for a method for finding all issue types with a CSV or XLS file attached. Any extra APPs will have to allow data residency to be stored in Australia. I got excited about 'scriptrunner' but their data residency options are limited to EU or US locations. Would appreciate any suggestions. thank you

2 answers

1 accepted

2 votes

Answer accepted

For anyone who finds this and is curious, here's how we'd do it with ScriptRunner:

def maxResults = 100

def startAt = 0

def attachmentArr = []

def stillPaginating = true

//Define the global variables

while (stillPaginating == true){

//We need to paginate through the results

//We're iterating by 100 results at a time

def getIssueKeys = get("/rest/api/3/search?jql=project%20is%20not%20EMPTY&maxResults=${maxResults}&startAt=${startAt}")

.header('Content-Type', 'application/json')

.asObject(Map)

//We start by returning all of the issues in the instance with a JQL search

getIssueKeys.body.issues.each{ issueKey ->

//Next we're iterating through each result (issue) that was returned

def issueObject = get("/rest/api/3/issue/${issueKey.key}/")

.header('Content-Type', 'application/json')

.asObject(Map)

//We query the system for more information about the issue in question

issueObject.body.fields.attachment.each { attachmentID ->

//Specifically, we're after the ID of any attachment on the issue

def attachment = get("/rest/api/3/attachment/${attachmentID.id}")

.header('Content-Type', 'application/json')

.asObject(Map)

//Once we have the ID of the attachment, we can use that ID to get more information about the attachment

if(attachment.body.filename.toString.contains(".csv") || attachment.body.filename.toString.contains(".xlsx")){

attachmentArr.add("Issue ${issueKey.key} has an attachment: ${attachment.body.filename}")

}

}//End attachment-each loop

}//End getIssueKeys loop

if(getIssueKeys.body.total < maxResults){

stillPaginating = false;

}

//Finally, we check to see if we're still paginating

//If the total number of results is less than the total number of maximum possible results,

//We must be at the end of the line and can stop paginating by terminating the loop

}

return attachmentArr

//Return the results

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

1 vote

Hi Andrew:

Is this jira server or jira cloud? I can write you a Python script do it, if you're just looking for a report.

- Ken

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

We are using jira cloud. Yes we are just looking for a report, then will do a manual review of XLS and CSV attachments for PI

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

Here's a Python script that will do it.

Please note: the steps required to generate the authorization token:

1. Go to https://id.atlassian.com/manage-profile/security/api-tokens and generate a token
2. Go to https://www.base64encode.net/ and encode it in this style:
YourLoginEmail@domain.com:APIToken

You can read more about that process here:
https://developer.atlassian.com/cloud/jira/platform/basic-auth-for-rest-apis/

Here is the script:

import requests

max_results = 100

start_at = 0

attachment_arr = []

still_paginating = True

yourDomain = "<domain>"

headers = {

'Authorization': 'Basic <base-64 encoded string>',

'Content-Type': 'application/json',

'Accept': 'application/json',

}

# Define the global variables

while still_paginating:

# We need to paginate through the results

# We're iterating by 100 results at a time

response = requests.get(f"https://{yourDomain}.atlassian.net/rest/api/3/search?jql=project%20is%20not%20EMPTY&maxResults={max_results}&startAt={start_at}",

headers=headers)

#print(response.content)

issue_keys = response.json().get("issues")

# We start by returning all of the issues in the instance with a JQL search

for issue in issue_keys:

# Next, we're iterating through each result (issue) that was returned

issue_key = issue.get("key")

issue_response = requests.get(f"https://{yourDomain}.atlassian.net/rest/api/3/issue/{issue_key}",

headers=headers)

#print(issue_response.content)

issue_data = issue_response.json()

# We query the system for more information about the issue in question

attachments = issue_data.get("fields", {}).get("attachment", [])

for attachment in attachments:

# Specifically, we're after the ID of any attachment on the issue

attachment_id = attachment.get("id")

attachment_response = requests.get(f"https://{yourDomain}.atlassian.net/rest/api/3/attachment/{attachment_id}",

headers=headers)

#print(attachment_response.content)

attachment_data = attachment_response.json()

# Once we have the ID of the attachment, we can use that ID to get more information about the attachment

filename = attachment_data.get("filename")

if filename and (".csv" in filename or ".xlsx" in filename):

attachment_arr.append(f"Issue {issue_key} has an attachment: {filename}")

if len(issue_keys) < max_results:

still_paginating = False

# Finally, we check to see if we're still paginating

# If the total number of results is less than the total number of maximum possible results,

# we must be at the end of the line and can stop paginating by terminating the loop

print(attachment_arr)

# Print the results

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

Like • like this

Thank you Ken, much appreciated, I will give it a run.

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

Like • Susan Waldrip likes this

Suggest an answer

Was this helpful?

Thanks!

Jira

DEPLOYMENT TYPE

CLOUD

PRODUCT PLAN

PREMIUM

PERMISSIONS LEVEL

Product Admin

Forums

Product Q&A

Community resources

Support

Top groups

Community resources

Support

Learn

Community resources

Support

Events

Community resources

Support

Find Excel XLS or CSV file attachments within Jira and confluence- to check for PI.

2 answers

1 accepted

Suggest an answer

Was this helpful?

Thanks!

DEPLOYMENT TYPE

PRODUCT PLAN

PERMISSIONS LEVEL

TAGS

Community showcase

Atlassian Community Events

Forums

Have more fun with Kudos

Product Q&A

Community resources

Support

Top groups

Community resources

Support

Learn

Community resources

Support

Events

Community resources

Support

Find Excel XLS or CSV file attachments within Jira and confluence- to check for PI.

2 answers

1 accepted

Suggest an answer

Was this helpful?

Thanks!

DEPLOYMENT TYPE

PRODUCT PLAN

PERMISSIONS LEVEL

TAGS

Community showcase

Atlassian Community Events

Welcome to the new Atlassian Community