@mgedmin
Last active December 29, 2015 07:39
#!/usr/bin/python
import hashlib, time
try:
    from urllib.request import urlopen
except ImportError:
    from urllib import urlopen
url = 'https://pypi.python.org/packages/source/v/virtualenv/virtualenv-1.10.1.tar.gz'
md5sum_expected = '3a04aa2b32c76c83725ed4d9918e362e'
response = urlopen(url)
##import pdb; pdb.set_trace()
socket = getattr(response.fp, 'raw', response.fp)._sock
# If you set this to False, you get SSLError exceptions.
# If you set this to True, you get truncated downloads that fail md5sum checks.
socket.suppress_ragged_eofs = False
data = b''
while True:
    chunk = socket.recv(8192)
    print("%d %d" % (len(chunk), len(data) + len(chunk)))
    if not chunk:
        break
    data += chunk
md5sum_actual = hashlib.md5(data).hexdigest()
filename = 'r%d' % time.time()
if md5sum_actual == md5sum_expected:
    print("OK")
    open(filename + '.ok', 'wb').write(data)
else:
    print("BAD: %s" % md5sum_actual)
    open(filename + '.bad', 'wb').write(data)
print("downloaded %d bytes" % len(data))
@rgbkrk

rgbkrk commented Nov 25, 2013

Well well, I'm getting OKs and errors on Windows Server 2012 and a local Windows 7 box.

8000 1303670
Traceback (most recent call last):
  File ".\hashy.py", line 16, in <module>
    chunk = socket.recv(8192)
  File "C:\Python27\lib\ssl.py", line 241, in recv
    return self.read(buflen)
  File "C:\Python27\lib\ssl.py", line 160, in read
    return self._sslobj.read(len)
ssl.SSLError: [Errno 8] _ssl.c:1363: EOF occurred in violation of protocol

@smashwilson

(One of @rgbkrk's co-workers here.) I'm getting OK and SSLErrors intermittently when I run this on my home Windows 7 box. No pattern that I can discern.

Here's the stack trace when it fails:

5634 1325303
Traceback (most recent call last):
  File ".\gistfile1.py", line 16, in <module>
    chunk = socket.recv(8192)
  File "C:\Python27\lib\ssl.py", line 241, in recv
    return self.read(buflen)
  File "C:\Python27\lib\ssl.py", line 160, in read
    return self._sslobj.read(len)
ssl.SSLError: [Errno 8] _ssl.c:1363: EOF occurred in violation of protocol

The EOF does seem to happen at a consistent offset when it does occur, though.
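
To compare failure offsets across runs, the receive loop could be wrapped so that it reports how many bytes arrived before the error. A minimal sketch, assuming a hypothetical helper `read_until_eof` that accepts any `recv`-style callable (for the script above that would be `socket.recv`); it is not part of the original gist:

```python
import ssl


def read_until_eof(recv, bufsize=8192):
    """Read until recv() returns b'' or raises ssl.SSLError.

    Hypothetical helper: returns (data, error), where error is None on a
    clean EOF, or the SSLError that interrupted the download.
    """
    chunks = []
    error = None
    while True:
        try:
            chunk = recv(bufsize)
        except ssl.SSLError as e:
            # Keep what was received so far; len(data) is the failure offset.
            error = e
            break
        if not chunk:
            break
        chunks.append(chunk)
    return b''.join(chunks), error
```

Calling `data, error = read_until_eof(socket.recv)` and printing `len(data)` whenever `error` is not None would show whether the EOF really lands at the same offset each time.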

@mgedmin
Author

mgedmin commented Dec 9, 2013

TIL: GitHub doesn't send me emails when people comment on my gists. :(

So if I understand this correctly, @rgbkrk and @smashwilson saw this error on Windows, but outside the Rackspace network? IOW the issue is on the software side?

I'm also curious about why the ssl module suppresses this SSLError by default. @smashwilson, @rgbkrk, if you change socket.suppress_ragged_eofs in this script to True, do you see MD5 checksum errors (i.e. 'BAD: ...' messages)?

BTW another user also could reproduce this on Rackspace Cloud: https://mail.python.org/pipermail/distutils-sig/2013-December/023324.html
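
With `suppress_ragged_eofs = True`, an unexpected EOF is reported as a normal end of stream, so a failed MD5 check alone does not say whether the data was cut short or altered. One way to tell the two apart is to also compare the byte count against the expected size. A rough sketch, assuming the expected length is known (for example from a Content-Length header, if the server sends one); `classify_download` is a hypothetical helper, not part of the gist:

```python
import hashlib


def classify_download(data, expected_md5, expected_length=None):
    """Return 'ok', 'truncated' or 'corrupt' for a downloaded payload.

    Hypothetical helper: expected_length is optional; when it is None,
    only the MD5 comparison is performed.
    """
    if expected_length is not None and len(data) < expected_length:
        # Fewer bytes than announced: the connection ended early.
        return 'truncated'
    if hashlib.md5(data).hexdigest() != expected_md5:
        return 'corrupt'
    return 'ok'
```

In the script above this could be called as `classify_download(data, md5sum_expected, expected_length)`, where `expected_length` might be read with something like `int(response.info().get('Content-Length'))`, assuming that header is present in the response.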
