Created
April 19, 2019 10:10
from lxml import etree

xml = etree.parse('FA173049.xml')
for entity in xml.docinfo.internalDTD.entities():
    print(f'{entity.name}.{entity.content}')
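If the entity is:
<!ENTITY FIG10 SYSTEM "FIG10.png" NDATA png>
we obtain FIG10.png, which is correct, but only because the entity name happens to match the file name. If the entity is:
<!ENTITY artwork1 SYSTEM "FA048474_p78-471F1.png" NDATA png>
we obtain artwork1.png, which is not OK: the file is actually FA048474_p78-471F1.png. The snippet appears to print the entity name followed by the NDATA notation (png), not the system identifier.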
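One possible way around this, as a minimal sketch rather than a fix to the lxml snippet itself: the standard library's expat parser reports the system identifier of each unparsed entity directly, so the file name does not have to be rebuilt from the entity name. Note that this swaps lxml for `xml.parsers.expat`. The helper name `unparsed_entity_files` and the `SAMPLE` document are hypothetical, made up for illustration from the two declarations above.

```python
from xml.parsers import expat

# Hypothetical sample document built from the two entity declarations above.
SAMPLE = '''<?xml version="1.0"?>
<!DOCTYPE doc [
<!NOTATION png SYSTEM "image/png">
<!ENTITY FIG10 SYSTEM "FIG10.png" NDATA png>
<!ENTITY artwork1 SYSTEM "FA048474_p78-471F1.png" NDATA png>
]>
<doc/>'''


def unparsed_entity_files(xml_text):
    """Map each unparsed (NDATA) entity name to its system identifier."""
    files = {}

    def on_unparsed_entity(name, base, system_id, public_id, notation_name):
        # system_id is the quoted value after SYSTEM, i.e. the file name.
        files[name] = system_id

    parser = expat.ParserCreate()
    parser.UnparsedEntityDeclHandler = on_unparsed_entity
    parser.Parse(xml_text, True)
    return files


if __name__ == '__main__':
    for name, filename in unparsed_entity_files(SAMPLE).items():
        print(f'{name} -> {filename}')
```

For a file on disk, the same handler could be attached to a parser fed with `parser.ParseFile(open(path, 'rb'))` instead of `parser.Parse(...)`.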