One of my users just reported this one. He was trying to do a simple-mode search for all issues which contain the word "accessibility" and found that most of the results did not in fact contain this word. Instead they contain the word "access". After a bit more experimentation I discovered that entering "accessible" does the same thing. We cannot find a workaround - even enclosing the word in quote-marks doesn't help.
Does JIRA seriously think that these three words are synonyms?!
Searching for "accesses" also produces these results, although fortunately "accessor" is treated as being different.
This happens because JIRA stems words by default, so JQL searches retrieve issues based on the 'root' forms of words instead of requiring exact matches. You can read more about Word Stemming in the documentation, and I'll quote part of that text below:
For example, if you search for issues using the query term 'customise' on the Summary field, JIRA stems this word to its root form 'custom' and will retrieve all issues whose Summary field also contains any word that can be stemmed back to 'custom'. Hence, the following query:
summary ~ "customise"
will retrieve issues whose Summary field contains the following words:
- customised
- customising
- customs
- customer
- etc.
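To make the behaviour described above concrete, here is a minimal sketch of suffix-stripping stemming. It is not the algorithm that JIRA or Lucene actually uses: the rule list `SUFFIXES` and the helpers `toy_stem` and `matches` are hypothetical names invented for this illustration, and the rules are deliberately tiny.

```python
# Toy suffix stripper for illustration only.
# This is NOT the stemmer used by JIRA/Lucene; the rule list is an assumption.
SUFFIXES = ["ibility", "ible", "es"]  # checked in this order


def toy_stem(word):
    """Strip the first matching suffix, provided at least 3 letters remain."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches(query, text):
    """True if any word in `text` shares a stem with `query`."""
    root = toy_stem(query)
    return any(toy_stem(token) == root for token in text.split())


for w in ["accessibility", "accessible", "accesses", "accessor"]:
    print(w, "->", toy_stem(w))
```

With these rules the first three words collapse to the same root while "accessor" is left alone, which mirrors what was reported in the question. A search for "accessibility" would therefore also match an issue that only says "access".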
You can disable word stemming (so that JIRA will find issues based on exact matches with words) by changing the Indexing Language to Other (under Administration > System > General Configuration). Note that you'll need to rebuild JIRA's search index after making this change.
Changing this setting should fix all the discrepancies mentioned above. These are the queries I ran after changing the indexing language (I created six issues for testing):
- summary ~ "dele*"
- summary ~ "delete"
- summary ~ "deleted"
I hope this helps!
Yes, I know what stemming is :-) My point is that 'access' is by no stretch of the imagination the stem for 'accessibility'! I think we're going to have to turn this off - unless it's possible to configure it to work correctly?
Not when you're discussing software it isn't...
We actually use Lucene ourselves, and we do not have this problem with our product, presumably because we're using a different configuration.
Alex, access is the actual stem (root) of the word accessibility or accessible :P
http://dictionary.reference.com/browse/accessibility?s=t
Indeed, there is no way to configure this to work differently, so my suggestion would be to disable it if needed.
Yes, I imagined that was what was happening; I've worked on search algorithms myself. So it's basically a bug in Lucene, then :-( Terrific...
Hi Alex,
JIRA does not search for issues directly in the database, as this is very expensive. Instead, it scans the index directory for terms that match the query.
In general, JIRA uses Lucene to tokenise the text that is stored in the index directory. You can use Luke to see this in action.
I haven't checked how "accessibility" is tokenised in JIRA, but you could try it to understand this further. I believe that Lucene stems the words.
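As a rough sketch of why this matters, here is a tiny inverted index that stores terms after an analysis step. It is only an illustration, not how JIRA or Lucene is implemented: the `STEMS` lookup table stands in for a real rule-based stemmer, and `analyze`, `build_index` and `search` are hypothetical names. If the index is built this way, the original word is no longer in the index at query time, which might be why putting the word in quotes makes no difference.

```python
from collections import defaultdict

# Hypothetical stand-in for an analyzer's stemming step (an assumption;
# a real analyzer applies rules rather than a lookup table).
STEMS = {"accessibility": "access", "accessible": "access", "accesses": "access"}


def analyze(text, stem=True):
    """Lowercase, split on whitespace, and optionally map words to stems."""
    tokens = text.lower().split()
    return [STEMS.get(t, t) if stem else t for t in tokens]


def build_index(docs, stem=True):
    """Map each analyzed term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text, stem):
            index[term].add(doc_id)
    return index


def search(index, query, stem=True):
    """Return ids of documents that contain every analyzed query term."""
    hits = [index.get(term, set()) for term in analyze(query, stem)]
    return set.intersection(*hits) if hits else set()


docs = {
    "A-1": "improve accessibility of login form",
    "A-2": "user cannot access admin page",
}
stemmed = build_index(docs, stem=True)
exact = build_index(docs, stem=False)

print(sorted(search(stemmed, "accessibility")))            # both issues
print(sorted(search(exact, "accessibility", stem=False)))  # only A-1
```

The second index corresponds loosely to what you get after disabling stemming and rebuilding the index: only exact word matches are found.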