The hidden risk of PII stored in Jira attachments (and how to find it)

Ask a Jira admin what's inside their attachments and you'll usually get silence. That silence is the problem.

Most data-protection effort in Jira goes into the things you can read: issue fields, comments, permission schemes. But the riskiest content tends to hide where text-based tools can't see it - inside the files people attach. A password pasted into a screenshot. A customer's ID scan on a service-desk ticket. Card numbers hidden in an attached spreadsheet. It piles up quietly, and it usually surfaces at the worst possible moment: an audit, a security review, or a data-subject access request.

This article explains where personally identifiable information (PII) hides in Jira attachments, why it's so easy to miss, and a practical way to scan Jira attachments for PII and clean them up.

Why PII ends up in Jira attachments in the first place

Over a few years, a single instance accumulates thousands of files, and a surprising share contain sensitive data nobody meant to store long-term:

  • Screenshots with passwords or API keys. Someone shares a config screen or a terminal window to reproduce a bug - and the secret is right there in the image.

  • Service-desk uploads. In Jira Service Management, customers attach ID documents, invoices, and screenshots directly to tickets. You didn't choose to collect that data; they handed it to you.

  • Imported spreadsheets and CSVs. Bug reports and data tickets often carry exports full of names, emails, account numbers, or card data.

  • Scanned PDFs. Contracts, forms, and signed documents - scanned as images, so the text isn't selectable or searchable.

None of this is malicious. It's the normal byproduct of using Jira for real work. The risk is simply that it's invisible to the tools most teams rely on.

Why field-level controls and search don't catch it

Here's the uncomfortable part: the usual defenses stop at the boundary of the file.

  • Jira search and JQL index issue fields, not the contents of attached files. You can't search for a password that lives inside a PNG.

  • Permission schemes control who can see an issue. They do nothing about what's inside the attachments on it.

  • Most data-loss-prevention (DLP) tooling reads text in fields, comments, and plain documents. The moment sensitive data is trapped inside an image or a scanned PDF, text-based scanning goes blind.

That last point is the crux. A password in a screenshot and an SSN on a scanned form are, to a text scanner, just pixels. To read them, you need OCR (optical character recognition) - the ability to extract text from images and scanned documents. Without it, your attachment "coverage" has a hole exactly where the highest-risk content lives.

The compliance stakes

This isn't just hygiene. Depending on your sector, undiscovered PII in attachments can put you on the wrong side of:

  • GDPR - you're expected to know what personal data you hold and where, and to honor deletion and access requests. "We don't know what's in our attachments" is not a defensible answer.

  • PCI-DSS - cardholder data sitting in a Jira spreadsheet or screenshot is a finding waiting to happen.

  • Internal security policy - secrets sprawl (passwords, tokens, API keys pasted into issues) is one of the most common audit flags, and screenshots are a favorite hiding place.

The recurring theme: compliance starts with knowing what's actually in there. You can't protect, redact, or delete data you can't see.

A practical workflow to find and remove sensitive data in Jira attachments

This works regardless of tooling. The goal is coverage without chaos - find the real risks, review them, and remediate with a trail.

  1. Decide what you're looking for. Define patterns for the data that matters to you: passwords, API keys, credit card numbers, SSNs, IBANs, or any custom identifier. Simple keyword patterns catch the obvious cases; regex catches the structured ones (a card number, a national ID format).

  2. Decide where to look. Scope the scan so it's meaningful and proportionate: for example, one project, the last 90 days, or only open tickets. In Jira, JQL is the natural way to express this.

  3. Read every file type (including images and scanned PDFs). This is the step most approaches skip. If your method can't OCR screenshots and scanned PDFs, assume the highest-risk content is going unscanned.

  4. Review matches before you touch anything. Each hit should show the issue, the file, the matched text, and surrounding context, so a human can confirm it's a real problem and not a false positive.

  5. Remediate with an audit trail. Delete the offending attachments deliberately, and keep a record of what was removed and by whom. Avoid anything that deletes automatically - you want a person in the loop.
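
Steps 1, 2, and 4 above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular tool works: the pattern names, the regexes, the JQL string, and the Finding and scan_text names are all made up for the example, and it assumes the attachment's text has already been extracted (by a document parser or by OCR). The Luhn checksum is a common way to cut false positives on card-number regexes.

```python
import re
from dataclasses import dataclass

# Step 2 (assumption): an example JQL scope; project key and window are hypothetical.
SCOPE_JQL = "project = SUPPORT AND created >= -90d AND attachments IS NOT EMPTY"

# Step 1 (assumption): illustrative patterns only, not a complete PII catalogue.
PATTERNS = {
    "card_number": re.compile(r"\b(?:\d[ -]?){12,18}\d\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "password_keyword": re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),
}


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


@dataclass
class Finding:
    """Step 4: everything a reviewer needs to confirm or dismiss a hit."""
    issue_key: str
    file_name: str
    pattern: str
    matched_text: str
    context: str


def scan_text(issue_key: str, file_name: str, text: str, context_chars: int = 30):
    """Run every pattern over already-extracted text and collect findings."""
    findings = []
    for name, regex in PATTERNS.items():
        for m in regex.finditer(text):
            # Drop digit runs that merely look like card numbers.
            if name == "card_number" and not luhn_valid(m.group()):
                continue
            start = max(0, m.start() - context_chars)
            end = min(len(text), m.end() + context_chars)
            findings.append(
                Finding(issue_key, file_name, name, m.group(), text[start:end])
            )
    return findings
```

Calling scan_text("PROJ-1", "export.csv", extracted_text) returns a list of Finding records that a person can review before deleting anything. Nothing in this sketch deletes or modifies attachments, which keeps the human in the loop as step 5 recommends.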

Where Attachment Scanner fits

This is exactly the gap we built Attachment Scanner – OCR, PII & Password Detection for Jira to close. You define a pattern (simple text or regex) and a JQL scope; it reads every supported file type and reports every match with the issue key, file name, matched text, and surrounding context. You review the findings, then bulk-delete what shouldn't be there, with every deletion captured in an audit log. It works with both Jira Cloud and Jira Service Management.

If you want to try it on your own data, evaluation licenses include a monthly credit allowance so you can test OCR scanning before committing.

NO OCR - FORGE. OCR - GPU SERVER MANAGED BY US

The takeaway

PII in Jira attachments is a blind spot precisely because it sits inside files rather than in searchable fields - and the worst of it hides in images and scanned PDFs that ordinary scanning can't read. The fix isn't complicated: define what you're looking for, scope it sensibly, read the files (not just the fields) including images via OCR, review the matches, and remediate with an audit trail - using a tool that doesn't become a data risk itself.

What's the most surprising thing you've found in a Jira attachment? I'd genuinely like to hear it in the comments.

2 comments

Ulrich Kuhnhardt _IzymesCo_
Atlassian Partner
June 3, 2026

Interesting, especially for compliance. Can you expand on "NO OCR - FORGE. OCR - GPU SERVER MANAGED BY US" - Thanks

Nik Surmanidze _Actonic_
Contributor
June 4, 2026

Hi!

Thank you for the question!

Attachment Scanner has two modes for detecting PII: one runs only on Forge, and the other uses a service outside the Atlassian ecosystem, namely our GPU servers. The latter runs on demand only and is used solely for OCR.

1. Full scan - including OCR

This mode checks every supported file and also performs OCR. Non-image files are scanned entirely within the Forge architecture; when OCR is needed (images, PDFs), the app calls our GPU server to detect PII inside the file. On arrival, the file is processed only in VRAM and discarded as soon as processing finishes. We use a locally installed AI OCR model that doesn't communicate with the outside world and never sends any data, not even telemetry, outside our servers. We don't store ANY information on our servers. We have comprehensive policies in place, and ISO certification is coming in a month.

2. Document-only scan - no OCR

This mode checks every supported text file (including office files) for PII and runs entirely inside the Atlassian ecosystem, meaning no data ever leaves it. The only downside is that it doesn't include OCR, as we cannot run such models inside Atlassian Forge. With this scan, customers can run any check without worrying about additional data processors or new compliance agreements with third parties. Safe, secure, and the data stays in Atlassian.

Lastly, the full OCR scan is only run at the customer's request. It can be skipped or left unused entirely if it isn't needed. If compliance teams require it, we can add a feature that lets administrators disable this mode entirely, so that no one uses it by accident.

The document-only scan is a powerful feature that covers any known text file format and is perfect for detecting information inside those files.

If you have any questions, please let me know!

Thank you!
