Created
August 7, 2012 02:20
Encoding is tough stuff
>>> import cssutils
>>> cssutils.parseUrl("http://m.vk.com/css/s_mb.css?177")
WARNING 'charmap' codec can't decode byte 0x98 in position 810: character maps to <undefined>
>>> import requests
>>> r = requests.get("http://m.vk.com/css/s_mb.css?177")
>>> r.headers
{'content-encoding': 'gzip', 'transfer-encoding': 'chunked', 'expires': 'Tue, 14 Aug 2012 02:14:05 GMT', 'vary': 'Host,Accept-Encoding', 'server': 'nginx/1.2.1', 'connection': 'keep-alive', 'pragma': 'no-cache', 'cache-control': 'max-age=604800', 'date': 'Tue, 07 Aug 2012 02:14:05 GMT', 'x-powered-by': 'PHP/5.3.3-7+squeeze3', 'content-type': 'text/css; charset=windows-1251'}
>>> r.headers['content-type']
'text/css; charset=windows-1251'
I guess I have to change strategy: always use requests for the HTTP part, then decode the response body before handing it to cssutils. That might even make the code simpler.
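A minimal sketch of that strategy, not a tested fix for this stylesheet. The helper names (`charset_from_content_type`, `decode_css`, `fetch_css_text`) are made up for illustration, and the fallback to UTF-8 is an assumption. One thing worth noting: byte 0x98 is unassigned in Python's cp1251 (windows-1251) codec, so a strict decode with the declared charset can raise the same 'charmap' error; the sketch therefore decodes with errors="replace" instead of failing.

```python
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset parameter of a Content-Type value, or `default`."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset() or default


def decode_css(body, content_type, errors="replace"):
    """Decode raw response bytes using the charset the server declared."""
    charset = charset_from_content_type(content_type)
    try:
        return body.decode(charset, errors)
    except LookupError:
        # Charset name unknown to Python: fall back to UTF-8 (an assumption).
        return body.decode("utf-8", errors)


def fetch_css_text(url):
    """Do the HTTP part with requests, then decode the body ourselves."""
    import requests  # third-party; imported here so the helpers above stay stdlib-only

    r = requests.get(url)
    return decode_css(r.content, r.headers.get("content-type", ""))
```

The decoded text can then go to cssutils.parseString(text) rather than cssutils.parseUrl(url), which keeps fetching and decoding out of the parser.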