What is the best way to scan documents into JIRA?

Pei April 7, 2013

What are some ways JIRA users are scanning documents into JIRA and is anyone using Optical Character Recognition (OCR) for the scans?

Here's the ideal scenario... A JIRA user scans a request form using some type of software or tool and then sends an email to JIRA that creates a ticket with the scan attached. The user then has to go into the ticket to fill out the customer's information. It would be nice if there were a way, or a plugin, to transfer specific content from the attachment into a custom field using OCR, e.g. the customer's user ID. Or perhaps to capture part of the metadata?

Where I am now with this plan is to create an Issue Type just for the scan form's email address. This way, each form coming from a unique email address will be created under an Issue Type.

Thoughts?

~Pei

2 answers

1 accepted

0 votes
Answer accepted
Radu Dumitriu
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
April 7, 2013

It can certainly be automated, provided you have an OCR tool. I believe there are many ways to do it. One is to stage the JIRA issues received by mail (with JEMH, perhaps?) and then run a service that applies the following procedure to each staged issue:

1. extract scan

2. OCRize

3. Get the text from it

4. fill in the issue and move it to the corresponding project / stage
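
As a rough illustration of steps 1 to 3, here is a minimal Python sketch. It assumes the `tesseract` binary is installed and on the PATH, and that the scan has already been saved from the staged issue to a local file. The function names, the `User ID:` label and the `customer_user_id` key are made up for the example; they are not part of JIRA, JEMH or Tesseract. Step 4 (updating and moving the issue) is left out because it depends on how your instance is set up.

```python
import re
import subprocess
from typing import Callable, Dict, Optional


def ocr_with_tesseract(image_path: str) -> str:
    """Run the Tesseract command-line tool and return the recognized text.

    Assumes a reasonably recent `tesseract` binary on PATH; the special
    output name "stdout" makes it print the text instead of writing a file.
    """
    result = subprocess.run(
        ["tesseract", image_path, "stdout"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def process_staged_issue(
    attachment_path: str,
    ocr: Callable[[str], str] = ocr_with_tesseract,
) -> Dict[str, Optional[str]]:
    """OCR one scanned attachment and pull out a (hypothetical) user id."""
    text = ocr(attachment_path)
    # The "User ID:" label is an assumption about how the form is printed.
    match = re.search(r"User\s*ID\s*[:#]?\s*(\w+)", text, re.IGNORECASE)
    return {
        "customer_user_id": match.group(1) if match else None,
        "ocr_text": text,
    }
```

The `ocr` parameter is there so a different engine (or a stub, for testing) can be swapped in without touching the rest of the flow.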

That can be addressed either via a plugin (Java) or via direct scripting, I suppose.

Here's an engine to do the OCR, if you do not have anything else: http://code.google.com/p/tesseract-ocr/

Andy Brook [Plugin People]
April 7, 2013

JEMH has pluggable Field Processors. It would be interesting to implement an OCR Field Processor, though I'd probably try to steer clear of native code; perhaps http://sourceforge.net/projects/javaocr/

I guess an approach that maps 'found' fields to 'actual' JIRA fields, similar to the Regexp Field Processor, would be flexible enough, but determining the boundaries of which text belongs to which field could be fun...
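
To make the mapping idea concrete, here is a small Python sketch of a regex table that maps labels found in OCR text to JIRA field ids. This is not JEMH's Regexp Field Processor, just an illustration of the same idea. The field ids (`customfield_10001`, `customfield_10002`) and the form labels are hypothetical. The boundary problem is handled in the simplest possible way: a value is assumed to start after the label's colon and end at the end of that line.

```python
import re
from typing import Dict

# Hypothetical mapping: JIRA field id -> pattern for the printed form label.
# Only spaces/tabs are allowed around the value, so a blank field cannot
# swallow the next line of the form.
FIELD_PATTERNS = {
    "customfield_10001": re.compile(
        r"^[ \t]*User[ \t]*ID[ \t]*:[ \t]*(\S.*?)[ \t]*$",
        re.IGNORECASE | re.MULTILINE,
    ),
    "customfield_10002": re.compile(
        r"^[ \t]*Department[ \t]*:[ \t]*(\S.*?)[ \t]*$",
        re.IGNORECASE | re.MULTILINE,
    ),
}


def map_fields(ocr_text: str) -> Dict[str, str]:
    """Return {jira_field_id: value} for every label found in the text."""
    found = {}
    for field_id, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        if match:
            found[field_id] = match.group(1)
    return found
```

Real scans will be messier than this (misread characters, values wrapping over several lines), so the patterns would need tuning against actual forms.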

Radu Dumitriu
April 8, 2013

@Andy

As I said, there are - I believe - many ways to automate that. :)

Regarding native code, IMHO the problem is framed wrongly: the rule should be "stay away from native code that does not support the platforms I'm hooked on". Anyway, Tesseract is the best free OCR I know ...

Pei April 8, 2013

Thank you everyone for your help. I'm leaning towards the OCR for now and will speak to my developer about it.

Andy Brook [Plugin People]
April 8, 2013

Exactly, I don't want to be hooked to any platform. There are some Java-only solutions; I'll see what can be done!

0 votes
Tanner Wortham
April 7, 2013

I'll assume that the request form you mention is essentially a list of fields populated by the customer that requires some kind of action. So why not make JIRA that request form and cut out the middle man (the paper form)?

Have you considered using this, which is included in JIRA 5.1 and above:

http://www.youtube.com/watch?v=VplPRpmeJys

(I thought a video from Atlassian was less boring than a wall of text.)
