Getting intermittent SSL errors from the REST API when creating an issue

James Anziano February 27, 2020

We have a server that uses the Python API to create tickets under certain conditions. Starting a few weeks ago, around 10% of the requests we make started failing with SSL errors. I can confirm it's not anything about the request itself: we log all the requests coming to the server, and I can re-send the exact same failed request at a later time and it works fine. The Python stack trace is below. Is anyone else experiencing this, or does anyone know how to resolve it?

[('SSL routines', 'ssl3_get_record', 'decryption failed or bad record mac')]
Traceback (most recent call last):
  File "/var/www/TRportal/TRportal/backend/investigations_req.py", line 86, in make_ticket
    jira_ticket = self.jira.create_ticket(issue_dict)
  File "/var/www/TRportal/TRportal/backend/lib/jira_wrapper.py", line 22, in create_ticket
    ticket = self.jira_api.create_issue(fields=issue_dict)
  File "/var/www/TRportal/venv/lib/python3.7/site-packages/jira/client.py", line 1107, in create_issue
    r = self._session.post(url, data=json.dumps(data))
  File "/var/www/TRportal/venv/lib/python3.7/site-packages/jira/resilientsession.py", line 154, in post
    return self.__verb('POST', url, **kwargs)
  File "/var/www/TRportal/venv/lib/python3.7/site-packages/jira/resilientsession.py", line 125, in __verb
    response = method(url, timeout=self.timeout, **kwargs)
  File "/var/www/TRportal/venv/lib/python3.7/site-packages/requests/sessions.py", line 581, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/var/www/TRportal/venv/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/var/www/TRportal/venv/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/var/www/TRportal/venv/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/var/www/TRportal/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/var/www/TRportal/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 384, in _make_request
    six.raise_from(e, None)
  File "<string>", line 2, in raise_from
  File "/var/www/TRportal/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 380, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.7/http/client.py", line 1336, in getresponse
    response.begin()
  File "/usr/lib/python3.7/http/client.py", line 306, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.7/http/client.py", line 267, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.7/socket.py", line 589, in readinto
    return self._sock.recv_into(b)
  File "/var/www/TRportal/venv/lib/python3.7/site-packages/urllib3/contrib/pyopenssl.py", line 285, in recv_into
    return self.connection.recv_into(*args, **kwargs)
  File "/var/www/TRportal/venv/lib/python3.7/site-packages/OpenSSL/SSL.py", line 1822, in recv_into
    self._raise_ssl_error(self._ssl, result)
  File "/var/www/TRportal/venv/lib/python3.7/site-packages/OpenSSL/SSL.py", line 1647, in _raise_ssl_error
    _raise_current_error()
  File "/var/www/TRportal/venv/lib/python3.7/site-packages/OpenSSL/_util.py", line 54, in exception_from_error_queue
    raise exception_type(errors)
OpenSSL.SSL.Error: [('SSL routines', 'ssl3_get_record', 'decryption failed or bad record mac')]

1 answer

Andy Heinzer
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
March 10, 2020

Hi James,

I understand that you are using Python to make REST calls to a Jira Cloud site and that these sometimes fail with an error such as:

decryption failed or bad record mac

While I have not seen this specific error in relation to a Jira Cloud site before, I did find at least one other support case with this error. That case involved a Bitbucket customer who was using an older Git client. We found that the client relied on a rather old library that would sometimes try to connect using TLS v1 or v1.1, for which Atlassian Cloud has deprecated support since 2018. More details are in the deprecation notice, Deprecating TLSv1 and TLSv1.1 for Atlassian Cloud Products.

I'm not 100% sure this is the cause in your case, but I thought it might at least be worth investigating your Python client to see if you are on an affected version. You can find this with the terminal command:

python -V

If your client is on an older version of Python, that could explain this behavior. If not, then we would want to see what version of OpenSSL is in use, with the command:

openssl version
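The two commands above report the system-wide binaries, which are not necessarily what the server process loads (for example, when it runs from a virtualenv). A minimal sketch for checking from inside the interpreter follows; the function name `runtime_versions` is made up for illustration. Note that the traceback goes through urllib3's pyOpenSSL backend, which may use a different OpenSSL build than the standard library's `ssl` module, so treat the result as one data point rather than a definitive answer.

```python
import platform
import ssl


def runtime_versions():
    """Return the interpreter version and the OpenSSL build that the
    standard-library ssl module is linked against."""
    return {
        "python": platform.python_version(),
        "openssl": ssl.OPENSSL_VERSION,
        "openssl_info": ssl.OPENSSL_VERSION_INFO,
    }


if __name__ == "__main__":
    for name, value in runtime_versions().items():
        print(f"{name}: {value}")
```

Running this with the same interpreter the server uses (for instance the one inside its virtualenv) avoids comparing against a different Python installed elsewhere on the machine.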

Please let me know the results.  I am interested to see if we can learn more about this environment to see how we can help.

Andy

James Anziano March 30, 2020

Hi Andy,

The machine is running Python 3.7.3, and OpenSSL is version 1.1.1b.

Thanks!

Andy Heinzer
Atlassian Team
April 1, 2020

Hi James,

I haven't found anything that stands out as specific to those versions. From what I have found, it seems like this error could be caused by the client trying to use TLSv1 or 1.1 when only 1.2 is currently supported by Atlassian Cloud. But it's not clear to me yet why the client would try to do that. I'm also having a hard time finding any other reports with this error message.

I did find a thread with the same error message.  Check out https://serverfault.com/a/861260

In that case, the user found that their network adapter had TCP Segmentation Offload enabled, which was occasionally truncating some data in the process. They suggested turning off TSO/GSO/GRO on the network adapter with a command such as:

ethtool -K eth0 tso off gro off gso off ufo off

I'd be interested to see if you can try that and see if that helps.

It's very strange that you can just re-run these commands later and it works no problem.  That makes me believe this problem is specific to your environment.  How often is this happening?  Do you have a different machine that makes these same kinds of REST calls?

James Anziano April 2, 2020

Hi Andy,

Thanks, I'll try that with the network adapter.

As for frequency, it seems to vary quite a bit. A few days ago, out of ~15-20 requests (this includes the retries that worked), 6 failed. Some of these failures happened during the REST calls to add comments, so it's not limited to creating tickets. The next day, we had 25 requests and only one failed. I would guesstimate that on average it's a 10-15% failure rate.

We have multiple machines that make these kinds of REST calls, but only one is used with any real regularity. We haven't seen the issue on the other machines, but I would guess that's because they're not used enough for the issue to show up; they may make 10-15 requests in a month, combined.
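Since the same request succeeds when it is re-sent, one possible stopgap while the root cause is unknown is to retry automatically. The sketch below is a generic helper, not part of the jira library; `call_with_retry` and its parameters are hypothetical names chosen for illustration.

```python
import time


def call_with_retry(func, *args, retry_on=(Exception,), attempts=3,
                    base_delay=1.0, sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying on the given exception types.

    The delay doubles after each failed attempt (exponential backoff).
    The last failure is re-raised unchanged.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

Based on the traceback, a call might look like `call_with_retry(self.jira_api.create_issue, fields=issue_dict, retry_on=(OpenSSL.SSL.Error,))`. One caveat: the error is raised while reading the response, so it is worth checking whether the server already processed the request before retrying a create call, to avoid duplicate tickets.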

James Anziano April 2, 2020

Also just for further info, all the machines in question are Ubuntu 19.04.

James Anziano April 6, 2020

Hey Andy, turning off TCP segmentation offload does not appear to have worked; we had our first repeat of the OpenSSL error this morning. For sanity's sake, I did confirm the setting is still turned off on the machine.
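For anyone wanting to repeat that check from a script, here is a rough sketch that reads the current offload settings with `ethtool -k` (lowercase `-k` lists features, uppercase `-K` changes them). The function names and the interface name `eth0` are assumptions for illustration, and the parser only looks at the three features mentioned in this thread.

```python
import subprocess

# Feature names as printed by `ethtool -k`.
OFFLOAD_FEATURES = (
    "tcp-segmentation-offload",
    "generic-segmentation-offload",
    "generic-receive-offload",
)


def parse_offload_features(ethtool_output):
    """Map each offload feature of interest to True (on) or False (off)."""
    states = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and name in OFFLOAD_FEATURES:
            # Values look like "on", "off", or "off [fixed]".
            words = value.split()
            states[name] = bool(words) and words[0] == "on"
    return states


def read_offload_features(interface="eth0"):
    """Run `ethtool -k <interface>` and parse its feature list."""
    result = subprocess.run(
        ["ethtool", "-k", interface],
        capture_output=True, text=True, check=True,
    )
    return parse_offload_features(result.stdout)
```

Settings changed with `ethtool -K` are generally not persistent across reboots unless they are also added to the network configuration, so a periodic check like this can catch a silent revert.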

Andy Heinzer
Atlassian Team
April 14, 2020

Hi James,

Sorry for the delay. I did some more research and I think I found a potential solution. Please check out the article Choosing The SSL Version In Python Requests.

The suspicion here is that Python is sometimes selecting a bad cipher when trying to establish the SSL connection. That article explains how you can set the session to always specify TLS 1.2. The only change I think you would need to make from that article is to replace the line

ssl_version=ssl.PROTOCOL_TLSv1

into

ssl_version=ssl.PROTOCOL_TLSv1_2

When this article was written (2013), TLSv1 was likely the standard, but Atlassian Cloud now only accepts TLSv1.2. From reviewing this, it looks like it will force the Python session to use only that version of TLS when creating SSL sessions.
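On Python 3.7 and later, the version-specific constants such as `ssl.PROTOCOL_TLSv1_2` are deprecated in favor of setting a minimum version on an `SSLContext`. A minimal sketch using only the standard library follows; `make_tls12_context` is a hypothetical helper name, and whether this changes anything for the error in this thread is untested.

```python
import ssl


def make_tls12_context():
    """Build a client-side context that refuses anything older than TLS 1.2.

    create_default_context() keeps certificate and hostname verification on.
    """
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

This sets a floor rather than pinning one version, so TLS 1.3 can still be negotiated when both sides support it. urllib3's `PoolManager` accepts an `ssl_context` keyword, so a custom requests transport adapter could pass this context from its `init_poolmanager` method; how that interacts with the jira library's own session handling is an assumption that would need to be verified.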

Let me know if this helps.

Andy

James Anziano April 15, 2020

Hi Andy,

I am not creating the session, since I am using the jira python library which handles all of that.

Regardless, a coworker tried to implement a test case of the code in that article, and he started getting errors suggesting that PoolManager no longer accepts an attribute for ssl_version. It looks like the article is out of date. He found this thread, which talks about doing exactly what your article discusses, and it seems you cannot force OpenSSL to use any particular version of TLS anymore. It uses the latest version automatically.

Additionally, I found this article which provides a way to check what version of TLS python is using (the requests.get('https://www.howsmyssl.com/a/check', verify=False).json()['tls_version'] part), and I get TLS 1.3.
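The howsmyssl.com check shows what the client negotiates with that particular site. A small sketch for asking the same question of an arbitrary host, using only the standard library, is below. The function name `negotiated_tls` is made up for illustration, and because it uses the stdlib `ssl` module rather than the pyOpenSSL backend seen in the traceback, the result may not match what the failing requests negotiate.

```python
import socket
import ssl


def negotiated_tls(host, port=443, timeout=10.0):
    """Open a TLS connection to host and return (protocol_version, cipher_name).

    Certificate and hostname verification stay enabled.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.version(), tls_sock.cipher()[0]
```

Calling it with the hostname of the Jira Cloud site in question (for example `negotiated_tls("your-site.atlassian.net")`, where the hostname is a placeholder) reports the protocol and cipher agreed with that server rather than with a third-party test site.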

James Anziano April 15, 2020

Hi Andy,

I can't really pass any parameters to the session, as I am using the jira python library which handles all of that behind the scenes. Regardless, I don't think that is the issue.

We tried to implement the code in that article, even just on its own as a test, and ran into errors. It seems to be largely out of date. This link suggests that it is no longer possible to force a different version of TLS, and that Python will always use the latest version that is available under the installed version of OpenSSL. I also found this link, which provides a way to tell what version of TLS is being used, and running that in our environment shows that we're using TLS 1.3.
