JIRA Mail handler rejecting mail comments containing certain characters

Adam Nachman October 30, 2017

Since our upgrade from 7.0.4 to 7.4.4, the JIRA mail handler has been rejecting emails that contain characters it does not recognize. These include ordinary emoticons from Outlook (a basic :) after Outlook converts the keystrokes to an image).

The database collation is utf8_general_ci (and is unchanged), and the version of MySQL is 5.6.13 (unchanged).

The error message is

(SQL Exception while executing the following:INSERT INTO jiraaction (ID, issueid, AUTHOR, actiontype, actionlevel, rolelevel, actionbody, CREATED, UPDATEAUTHOR, UPDATED, actionnum) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) (Incorrect string value: '\xF0\x9F\x98\x81\x0A\x0A...' for column 'actionbody' at row 1))"
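
For readers hitting the same error, the bytes quoted in the message can be decoded to see which character is being rejected. The following is a minimal Python sketch, not part of the original thread; the helper name `describe` is made up for illustration. It relies on the fact that MySQL's `utf8` character set stores at most three bytes per character, so any character that needs four bytes in UTF-8 triggers "Incorrect string value".

```python
# Bytes copied from the error message: '\xF0\x9F\x98\x81\x0A\x0A...'
raw = b"\xF0\x9F\x98\x81\x0A\x0A"

# Decode as UTF-8 and look at the first character only.
text = raw.decode("utf-8")
first = text[0]


def describe(ch: str) -> dict:
    """Return the code point and UTF-8 byte length of a single character."""
    encoded = ch.encode("utf-8")
    return {
        "code_point": f"U+{ord(ch):04X}",
        "utf8_bytes": len(encoded),
        # Characters above U+FFFF need four bytes in UTF-8.
        "outside_bmp": ord(ch) > 0xFFFF,
    }


info = describe(first)
print(info)  # {'code_point': 'U+1F601', 'utf8_bytes': 4, 'outside_bmp': True}
```

The two trailing `\x0A` bytes are ordinary line feeds; only the leading four-byte sequence is the problem.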

My expectation would be that these would be saved as "unprintables" and either rendered as "?", or stripped out if they could not be handled.
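
The behaviour described above (render as "?" or strip) could look roughly like the sketch below. This is only an illustration in Python under the assumption that "unprintable" means any character above U+FFFF; `replace_non_bmp` is a hypothetical helper, not something the Jira mail handler is known to provide.

```python
def replace_non_bmp(text: str, placeholder: str = "?") -> str:
    """Replace every character above U+FFFF (four bytes in UTF-8) with a placeholder."""
    return "".join(ch if ord(ch) <= 0xFFFF else placeholder for ch in text)


# Sample comment body containing the emoji from the error message (U+1F601).
body = "Thanks \U0001F601\n\nRegards"

cleaned = replace_non_bmp(body)                    # render as "?"
stripped = replace_non_bmp(body, placeholder="")   # strip out entirely

print(repr(cleaned))   # 'Thanks ?\n\nRegards'
print(repr(stripped))  # 'Thanks \n\nRegards'
```

Either result contains only characters that fit in three UTF-8 bytes, so it would fit a column using MySQL's three-byte `utf8` character set.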

For obvious reasons, we cannot expect the users to only select a specific subset of characters, and cannot afford to be dealing with dropped messages.

What is this change in behaviour, and how do we rectify it?

1 answer

0 votes
Shannon S
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
October 31, 2017

Hi Adam,

I recommend reviewing the article below and making sure your system matches the requirements listed in its resolution:

Unless your collation matches our requirements in connecting Jira applications to MySQL, you might run into issues such as this, even if you hadn't previously.

Have a look and do let us know if you have any trouble.

Kind Regards,
Shannon

Adam Nachman October 31, 2017

Shannon, I've reviewed the linked article, and have the following comments:

  1. The JIRA database was created by the JIRA application with the selected collation at the time of installing JIRA for the first time. Any existing collation was programmatically selected by JIRA (version 5.x, I believe)
  2. The database collation - while not _bin, is still utf-8 and should support the character set, so I don't believe that this is likely to be the cause and changing collation is not likely to have any effect on the valid character set. The difference between general_ci and _bin is not the character set, but the internal comparison and sort order operations.
  3. In many cases for us, _general_ci is a preference to allow multi
  4. There have been no historical incidents in 4 years of use with the software of a single email with a smiley emoticon being rejected. We also have other examples of unicode characters in the database (which is understandable, since this is a utf-8 collation)
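
The distinction drawn in point 2, between a character set and the comparison rules layered on top of it, can be illustrated with a rough analogy. The Python sketch below is an assumption-laden stand-in: `equal_bin` and `equal_ci` are hypothetical helpers, and `str.casefold()` only approximates a case-insensitive collation; MySQL's actual `utf8_general_ci` rules differ in detail.

```python
def equal_bin(a: str, b: str) -> bool:
    """Binary-style comparison: strings match only if their encoded bytes match."""
    return a.encode("utf-8") == b.encode("utf-8")


def equal_ci(a: str, b: str) -> bool:
    """Rough stand-in for a case-insensitive collation, using casefold()."""
    return a.casefold() == b.casefold()


words = ["banana", "Cherry", "apple"]

# Byte order puts uppercase ASCII before lowercase.
sorted_bin = sorted(words, key=lambda w: w.encode("utf-8"))
# Case-insensitive order ignores the capital letter.
sorted_ci = sorted(words, key=str.casefold)

print(equal_bin("Jira", "JIRA"), equal_ci("Jira", "JIRA"))  # False True
print(sorted_bin)  # ['Cherry', 'apple', 'banana']
print(sorted_ci)   # ['apple', 'banana', 'Cherry']
```

Both comparisons operate on the same strings and the same encoding; only equality and ordering change, which is the sense in which a collation does not alter the set of storable characters.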

Again, this has only occurred since our upgrade to 7.4.4. It may be a direct result of the upgrade, a latent bug in the software, or part of the upgrade itself.

I have noted that there do not appear to be any encoding options under System Info as described in the linked article at all.

  1. Does that mean that the upgrade is likely to have reverted these settings?
  2. How do we re-introduce them?
  3. What is the impact of doing so?

Regards

Adam

Shannon S
Atlassian Team
November 1, 2017

Hi Adam,

I can tell you that the exact encoding you're using was mentioned by our previous bug master in JRASERVER-59427 as not being supported by Jira, so it does need to be utf8_bin.

Can you let me know what you mean when you say you use _general_ci to allow multi? I'm not familiar with that.

And even though there are no historical incidents, we develop Jira on the assumption that the user is using utf8_bin, so issues may arise if you don't have that particular collation set.

I had a look at my Jira instance which I installed as 7.4.4 with the default settings, and it does appear that the encoding option should be listed as an option in System Info:

[Image: Screen Shot 2017-11-01 at 12.00.30 PM.png]

However, this may still display as UTF-8 for you since that's technically what you're using. If this is the case, then you will want to proceed by setting the utf8_bin collation.

It may also help if you have a look at the bug related to the error: JRASERVER-36135.

If you still have issues after this, let me know, and I'll create a support ticket for you and we can have a deeper look at your support logs. However, I can tell you that we will need the collation to match our recommendation before we proceed further.

Kind Regards,
Shannon

Adam Nachman November 1, 2017

"In many cases for us, _general_ci is a preference to allow multi"

Apologies. Happens when you get distracted while typing sometimes.

_general_ci is a preference for multi-language support in many cases to emulate anglicised sort order and string comparisons for search.

Interestingly, JIRA now reports the database encoding as utf8_bin on that screen, even though no changes have been made. I've now checked the MySQL instance directly, and the database is utf8_bin. I originally pulled the utf8_general_ci setting from System Info.

Other than our regular backups, JIRA is the only task that touches the database, so this has not been updated (and certainly not in the last 2 days!), and there are only two administrators that have access to the MySQL Instance.

In terms of the System Encoding, that screen simply does not show the entry for system encoding at all. In other words, there is no setting visible there (and therefore none that I can change). Does this mean that it defaults to utf8_bin silently, or would it defer to some default Windows system code page instead?

Again, my questions are:

  1. How do I add this setting?
  2. What is the impact of doing so?

Shannon S
Atlassian Team
November 2, 2017

Adam,

It's very odd that you wouldn't have the System Encoding listed on the System Info page, as it's hard-coded in. It's not a setting you can enable. You also cannot change the encoding from this page; it simply lets you know what the Jira database is running on.

Could you send me a screenshot of your System Info page?

Also, Jira is not able to change the encoding of your database, so I'm not sure why you would have experienced the encoding changing without you doing so. However, now that your database and all its tables are on utf8_bin, are you still having the issue? Is there any chance that you were looking at a different database?

I would also recommend checking the encoding of your system. If you're on Linux you can do this by typing locale charmap in a terminal.
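
On systems where `locale charmap` is not available (Windows, for example), a rough cross-platform check is possible from Python. This is only a sketch with a hypothetical helper name, `system_encodings`; it reports the encodings the Python interpreter sees, which is an assumption-level proxy and not necessarily what the JVM running Jira uses.

```python
import locale
import sys


def system_encodings() -> dict:
    """Collect the encodings this Python process picked up from the OS."""
    return {
        # Encoding suggested by the current locale settings.
        "preferred": locale.getpreferredencoding(False),
        # Encoding used for file names.
        "filesystem": sys.getfilesystemencoding(),
        # Encoding of standard output, if it exposes one.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


encodings = system_encodings()
for name, value in encodings.items():
    print(f"{name}: {value}")
```

A value other than UTF-8 for the preferred encoding would suggest the operating system default is a legacy code page.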

Lastly, I recommend you run the 2nd workaround from the bug I sent you:

SET GLOBAL innodb_file_format='Barracuda';

ALTER TABLE t1 ENGINE=INNODB ROW_FORMAT=DYNAMIC;


Let us know if you have any questions.

Kind Regards,
Shannon
