Exporting data with accent marks to Splunk

Thomas Artopoulos January 12, 2021

Hey,

 

I'm trying to export names and issues to Splunk. The problem is that I'm getting a strange encoding and I cannot work with the data in Splunk.

 

Has anyone had the same issue? Should I transform the data when indexing?

 

 

1 answer

0 votes
Andy Heinzer
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
January 14, 2021

Hi Thomas,

If I gather correctly, you are trying to export some data from Jira Cloud into Splunk, but in the process some characters are not appearing correctly. This can happen when moving data between systems that use different character encodings.
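As a rough illustration of that kind of mismatch (not taken from Thomas's data; the name "José" is just a made-up example), here is a minimal Python sketch of what happens when UTF-8 bytes are read by a system that assumes Latin-1:

```python
# Illustration only: the same bytes interpreted under two different encodings.
name = "José"  # hypothetical sample value, not from the original export

utf8_bytes = name.encode("utf-8")      # "é" becomes the two bytes 0xC3 0xA9
wrong = utf8_bytes.decode("latin-1")   # each byte is read as its own character
right = utf8_bytes.decode("utf-8")     # the bytes are read back as one character

print(utf8_bytes)  # b'Jos\xc3\xa9'
print(wrong)       # JosÃ©
print(right)       # José
```

If the accented characters show up in Splunk as pairs like "Ã©", a mismatch of this kind is a likely suspect. If they show up as something else, the cause is probably different.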

I know that Jira Cloud uses UTF-8 to encode text data. I also did some digging into how Splunk can be set up on this topic, such as Configure character set encoding, which indicates that UTF-8 is supported by default, in addition to the other character sets that can be configured there.

If you were using Jira Server or Jira Data Center, then it is possible that the host operating system for Jira could be set up to use some other character set. In cases like that I'd recommend following Troubleshoot character display issues in Jira server in order to set up Jira to always use UTF-8.

But it sounds like you're using Jira Cloud, in which case that KB won't apply here. Could you tell me more about your environment? For example, what steps are you taking to extract this data from Jira? How do these accented characters appear in Jira, and how do those same characters then appear in Splunk? (Do they appear as a similar character or as something completely different?)

Perhaps if I can learn more about your situation, I can help troubleshoot this further.

Let me know.

Andy

Thomas Artopoulos January 15, 2021

Hi Andy,

I'm using Jira Cloud. The problem occurs when searching my Splunk index for a name containing an accent mark. It's strange because Splunk supports multiple encodings, and in this case it's not working properly.

What I've tried is changing the encoding to latin-1, but the result is the same. I don't know if there's something I can do in the Jira environment. This problem is also present when importing issue and project names with accent marks.

 

Hope you can help me Andy,

Thomas.

[Attached screenshots: Soporte_1.png, Soporte_2.png]

Andy Heinzer
Atlassian Team
January 15, 2021

Thanks for that info. It really helps to shed more light on the problem here. I can see that the character sequence \u00e9 appears to be the C or Java source representation of “é”. Jira was originally built in Java, so that makes more sense now.
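For what it's worth, the same \uXXXX notation is also how JSON escapes non-ASCII characters, so it is not an encoding problem in the usual sense. A small Python sketch of the round trip follows; the field name "displayName" and the value are assumptions for illustration, not taken from the screenshots:

```python
import json

# Hypothetical exported text: the "é" is written as the six-character escape \u00e9.
raw = '{"displayName": "Andr\\u00e9"}'

# A JSON parser turns the escape back into the real character.
parsed = json.loads(raw)
print(parsed["displayName"])  # André

# Python's json.dumps escapes non-ASCII characters by default,
# which reproduces the escaped form seen above.
escaped = json.dumps({"displayName": "André"})
print(escaped)  # {"displayName": "Andr\u00e9"}
```

In other words, the escaped text and the accented text can represent the same value. What matters is whether the consumer parses the escapes or treats them as literal characters.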

Jira Cloud has its own indexing, and you can choose which language it should use for your site. More details are in Configure Jira application options. You could change this to a value such as 'Other', which we recommend if the site is using multiple languages. However, I'm not confident that this will change the way data is exported out of Jira for Splunk to use. That setting is specifically relevant to how Jira indexes data, not Splunk.

However, this problem appears to be rather common with Splunk. I found a number of similar threads on the topic, such as this one: Best practice for dealing with Unicode codepoints in Splunk? One suggestion there is to set up a props.conf file as explained in Configure character set encoding. However, I'm not certain exactly which encoding is necessary. This appears to be a Unicode character, but I would also try some of the encodings listed on Splunk's page. Specifically, I would try:

  • utf-16be (aka, ISO-10646-UCS-2, UCS-2, CSUNICODE, UCS-2BE, UNICODE-1-1, UNICODEBIG, CSUNICODE11, UTF-16)
  • utf-7 (aka, UNICODE-1-1-UTF-7, CSUNICODE11UTF7)
  • c99 (aka, java)

Try using one of those and let me know if this works.
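If none of those settings helps, another option is to transform the data before Splunk indexes it. The sketch below assumes the export is one JSON object per line, which this thread does not confirm. The function name unescape_json_line and the sample field "summary" are hypothetical.

```python
import json

def unescape_json_line(line: str) -> str:
    """Re-serialize one JSON line so non-ASCII characters are written literally.

    Assumes `line` is a complete JSON document containing \\uXXXX escapes.
    """
    data = json.loads(line)
    # ensure_ascii=False keeps "é", "ó", etc. as real characters instead of escapes.
    return json.dumps(data, ensure_ascii=False)

# Hypothetical sample line, for illustration only.
sample = '{"summary": "Revisi\\u00f3n t\\u00e9cnica"}'
print(unescape_json_line(sample))  # {"summary": "Revisión técnica"}
```

The resulting text would then need to be written out as UTF-8 (for example with open(path, "w", encoding="utf-8")) so that it matches what Splunk expects by default.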

Andy

DEPLOYMENT TYPE
CLOUD
PRODUCT PLAN
FREE