With lxml 4.5.0:
❯ python
Python 3.9.1 (default, Feb 5 2021, 17:04:50)
[Clang 12.0.0 (clang-1200.0.32.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
>>> from io import StringIO
>>> etree.parse(StringIO('<h2>πΊ</h2>'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "src/lxml/etree.pyx", line 3519, in lxml.etree.parse
  File "src/lxml/parser.pxi", line 1856, in lxml.etree._parseDocument
  File "src/lxml/parser.pxi", line 1876, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1757, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1068, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src/lxml/parser.pxi", line 601, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 711, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 640, in lxml.etree._raiseParseError
  File "<string>", line 1
lxml.etree.XMLSyntaxError: Char 0x0 out of allowed range, line 1, column 2
>>>
With lxml 4.6.1:
❯ python
Python 3.9.1 (default, Feb 5 2021, 17:04:50)
[Clang 12.0.0 (clang-1200.0.32.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
>>> from io import StringIO
>>> etree.parse(StringIO('<h2>πΊ</h2>'))
<lxml.etree._ElementTree object at 0x10ea66f40>
>>>
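For anyone who wants to rerun the comparison above as a script instead of in the REPL, here is a minimal sketch. The emoji in SAMPLE is a stand-in (an assumption, any character outside the Basic Multilingual Plane should do), and try_parse is a hypothetical helper name. It feeds the same markup to lxml once as str and once as UTF-8 bytes, which may help narrow down whether only the unicode-string path is affected.

```python
# Stand-in test document (assumption): an element containing one emoji
# outside the Basic Multilingual Plane.
SAMPLE = "<h2>\U0001F37A</h2>"


def try_parse(data):
    """Parse str or bytes with lxml and return 'ok' or the error text."""
    try:
        from lxml import etree
    except ImportError:
        # Report a missing install instead of crashing the comparison.
        return "lxml not installed"
    try:
        etree.fromstring(data)
    except etree.XMLSyntaxError as exc:
        return f"XMLSyntaxError: {exc}"
    return "ok"


if __name__ == "__main__":
    print("str  :", try_parse(SAMPLE))
    print("bytes:", try_parse(SAMPLE.encode("utf-8")))
```

If the str line fails while the bytes line passes, that would point at how the unicode string is handed to the parser rather than at the document itself.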
hehe. Long time no see.
Ah, interesting. I wonder if this is something else then, because everything you have is more recent than what I have.
Hmm, another test outside of Python, in the terminal, to see if it's an issue with libxml.
The error message comes from libxml.
And I get:
No error. This is the correct emoji.
So maybe it's something in Daniel Veillard's libxml2.
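One way to check the libxml2 theory is to compare which libxml2 each lxml install is actually using. A rough sketch, assuming nothing beyond the version constants that lxml.etree exposes (lxml_build_info is a hypothetical helper name):

```python
def lxml_build_info():
    """Return lxml and libxml2 version tuples, or None if lxml is missing."""
    try:
        from lxml import etree
    except ImportError:
        return None
    return {
        "lxml": etree.LXML_VERSION,
        # libxml2 loaded at runtime
        "libxml2 (runtime)": etree.LIBXML_VERSION,
        # libxml2 that lxml was compiled against
        "libxml2 (compiled)": etree.LIBXML_COMPILED_VERSION,
    }


if __name__ == "__main__":
    info = lxml_build_info()
    if info is None:
        print("lxml not installed")
    else:
        for name, version in info.items():
            print(name, ".".join(str(part) for part in version))
```

Running this in both environments and diffing the output would show whether the two lxml versions also differ in their libxml2.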