How do I parse an Outlook HTML message with JIRA Email This Issue?
The email looks like this:
Leading Text
Capture text
Plain-text emails are easily parsed with Field Context regular expressions like this:
(?i)lead in text\s*\r\n(.*)
However, when the email comes from Outlook, it won't parse. The HTML source generated by Outlook looks like this:
<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><style><!--
/** Edited for brevity */
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link="#0563C1" vlink="#954F72"><div class=WordSection1><p class=MsoNormal>leading text<o:p></o:p></p><p class=MsoNormal>capture text<o:p></o:p></p></div></body></html>
The following regular expression works when I run the test, but fails on an incoming email:
(?i)leading text\s*(<.*?>)+(.*?)<.*?>
Or, more restrictively:
(?i)leading text\s*(<[\/:op]*?>|<p class=MsoNormal>)+(.*?)<[\/:op]*?>
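To make the behaviour reproducible outside JIRA, here is a minimal sketch that runs the two patterns above against the body fragment of the Outlook source. It uses Python's `re` module as a stand-in for the Java regex engine that JIRA runs on, so small differences are possible. The names `OUTLOOK_BODY`, `WRAPPED_BODY`, `PATTERNS` and `capture` are my own, and the "wrapped" variant (a CRLF between the two paragraphs) is purely a hypothetical input, not something I have confirmed the mail handler actually receives.

```python
import re

# Body fragment copied from the Outlook source above (head section omitted).
OUTLOOK_BODY = (
    '<div class=WordSection1><p class=MsoNormal>leading text<o:p></o:p></p>'
    '<p class=MsoNormal>capture text<o:p></o:p></p></div>'
)

# Hypothetical variant: identical markup, but with a CRLF between the
# two paragraphs. This is an assumption for comparison only.
WRAPPED_BODY = OUTLOOK_BODY.replace('</p><p', '</p>\r\n<p')

# The two patterns from the post, unchanged.
PATTERNS = {
    'loose': r'(?i)leading text\s*(<.*?>)+(.*?)<.*?>',
    'strict': r'(?i)leading text\s*(<[\/:op]*?>|<p class=MsoNormal>)+(.*?)<[\/:op]*?>',
}


def capture(pattern, html):
    """Return group 2 of the first match, or None when nothing matches."""
    match = re.search(pattern, html)
    return match.group(2) if match else None


# Run every pattern against both inputs and collect the captured text.
results = {
    (name, label): capture(pattern, html)
    for name, pattern in PATTERNS.items()
    for label, html in (('posted', OUTLOOK_BODY), ('wrapped', WRAPPED_BODY))
}

for key, value in results.items():
    print(key, repr(value))
```

On the source exactly as posted, both patterns capture `capture text`. On the hypothetical wrapped variant, both still match but capture an empty string, because `.` does not cross the line break. I don't know whether that is the difference between the test and a real incoming email, but it is one thing I can rule in or out.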
What am I missing?
Thanks!