I'm trying to export names and issues to Splunk. The problem is that I'm getting a strange encoding and I cannot work with the data in Splunk.
Has anyone had the same issue? Should I transform the data when indexing?
If I gather correctly, you are trying to export some data from Jira Cloud into Splunk, but in the process some characters are not appearing correctly. This can happen when moving data between systems that use different character encodings.
I know that Jira Cloud uses UTF-8 to encode text data. I also did some digging into how Splunk can be set up on this topic, such as Configure character set encoding, which indicates that UTF-8 is a supported default encoding, in addition to the other character sets that can be configured there.
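To illustrate what an encoding mismatch usually looks like, here is a small Python sketch. The name is a made-up sample value, not taken from your data, and I'm only assuming that one side reads UTF-8 bytes as Latin-1:

```python
# Illustration only: "José" is a made-up sample value.
name = "José"

# The bytes a UTF-8 system would send for this name.
utf8_bytes = name.encode("utf-8")

# Decoding those bytes with the wrong charset produces garbled text.
wrong = utf8_bytes.decode("latin-1")

# Decoding with the matching charset restores the original text.
right = utf8_bytes.decode("utf-8")

print(repr(wrong))
print(repr(right))
```

If what you see in Splunk resembles the first printed value, a charset mismatch is the likely cause. If it looks quite different, something else is probably going on.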
If you were using Jira Server or Jira Data Center, it is possible that the host operating system for Jira could be set up to use some other character set. In cases like that I'd recommend following Troubleshoot character display issues in Jira server in order to set up Jira to always use UTF-8.
But it sounds like you're using Jira Cloud, in which case that KB won't apply. Could you tell me more about your environment? For example, what steps are you taking to extract this data from Jira? How do these accented characters appear in Jira, and how do those same characters then appear in Splunk (as a similar character or as something completely different)?
Perhaps if I can learn more about your situation, I can help troubleshoot this further.
Let me know.
I'm using Jira Cloud. The problem occurs when I search my Splunk index for a name containing an accent mark. It's strange because Splunk supports multiple encodings, and in this case it's not working properly.
What I've tried is changing the encoding to latin-1, but the result is the same. I don't know if there's something I can do in the Jira environment. This problem is also present when importing issue and project names with accent marks.
Hope you can help me, Andy.
Thanks for that info. It really helps to shed more light on the problem here. I can see that \u00e9 appears to be a C or Java source representation of “é”. Jira was originally built in Java, so that makes more sense now.
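The same \uXXXX notation is also the standard escape for non-ASCII characters in JSON, so if the export is JSON, a JSON parser will turn it back into the real character. Here is a minimal Python sketch of that idea. The field name and the sample event are hypothetical, and I'm assuming the data passes through a script of yours before it reaches Splunk:

```python
import json

# Hypothetical raw event text; the raw string keeps the backslash escape literal.
raw = r'{"displayName": "Jos\u00e9"}'

# Parsing the JSON converts \u00e9 into the actual character.
parsed = json.loads(raw)
print(parsed["displayName"])

# Re-serialize without escaping non-ASCII characters, so the output
# contains the accented character itself rather than the escape.
clean = json.dumps(parsed, ensure_ascii=False)
print(clean)
```

If the output is then written to a file or sent over HTTP, it would need to be encoded as UTF-8 so that it matches what Splunk expects.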
Jira Cloud has its own indexing, for which you can choose the language used for your site. More details are in Configure Jira application options. You could change this to a value such as 'Other', which we recommend if the site is using multiple languages. However, I'm not confident that this will change the way data is exported out of Jira for Splunk to use. That setting is specifically relevant to how Jira indexes data, not Splunk.
However, this problem appears to be rather common with Splunk. I found a number of similar threads on the topic, such as this one: Best practice for dealing with Unicode codepoints in Splunk? One suggestion there is to set up a props.conf file as explained in Configure character set encoding. However, I'm not certain exactly which encoding is necessary. This appears to be a Unicode character, but I would also try some of the encodings listed on Splunk's page. Specifically, I would try:
- utf-16be (aka, ISO-10646-UCS-2, UCS-2, CSUNICODE, UCS-2BE, UNICODE-1-1, UNICODEBIG, CSUNICODE11, UTF-16)
- utf-7 (aka, UNICODE-1-1-UTF-7, CSUNICODE11UTF7)
- c99 (aka, java)
Try using one of those and let me know if this works.
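For reference, a props.conf entry for this might look roughly like the sketch below. The stanza name is a placeholder for whatever source type or source your Jira data uses, and the CHARSET value is just one of the candidates from the list above, not a confirmed fix:

```
# props.conf (sketch only; the stanza name is a placeholder)
[your_jira_sourcetype]
# Swap in each candidate charset from the list above when testing.
CHARSET = c99
```

As far as I know, this setting is applied when data is read in, so it would only affect events indexed after the change. You would need to send a fresh sample to check the result.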