One of our customers is experiencing a weird search issue in their Chinese department. I broke the issue down to the simplest case: one page has the label 其他产品 (other products) and another has the label 产品信息 (product information). When I run a label search such as labelText:其他产品, both pages are found.
Does anyone have a clue why this happens?
Regards, Felix [Scandio]
Taking a look at the documentation for the Lucene tokenizer that deals with CJK characters, it seems that it splits the text into overlapping two-character tokens rather than whole words.
That would explain the match: the two characters for "product" (产品) form a token in both labels, so a search for 其他产品 also hits 产品信息.
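To make that concrete, here is a rough sketch of two-character tokenization applied to the two labels. It is not Lucene's actual code, and the function name `cjk_bigrams` is made up for illustration; it only assumes the tokenizer emits every overlapping pair of adjacent characters.

```python
def cjk_bigrams(text: str) -> list[str]:
    """Split text into overlapping two-character tokens (hypothetical helper)."""
    if len(text) < 2:
        # A single character (or empty string) cannot form a pair.
        return [text] if text else []
    return [text[i:i + 2] for i in range(len(text) - 1)]


# The two labels from the question.
query_tokens = cjk_bigrams("其他产品")   # label being searched for
other_tokens = cjk_bigrams("产品信息")   # label on the second page

# Tokens that both labels have in common.
shared = set(query_tokens) & set(other_tokens)

print(query_tokens)
print(other_tokens)
print(shared)
```

If the query is treated as "match any of these tokens", the shared token is enough for the second page to show up in the results. Whether the tokens are combined that way depends on how the search query is built, which I have not verified.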