#!/usr/bin/perl -w
use strict;
use warnings;
use XML::LibXML;

# Minimal document: root <a> with a single child <b>, both carrying an xml:id.
my $xml_content = <<EOL;
<?xml version="1.0" encoding="UTF-8"?>
<a xml:id="a">
<b xml:id="b"></b>
</a>
EOL
my $dom = XML::LibXML->load_xml(string => $xml_content, no_blanks => 1);
my $a = $dom->documentElement;
my $b = $a->firstChild;
# Detach <b> from the tree; $b still refers to the node.
$b->unbindNode();
my $c = $dom->createElement("c");
# Reuse the id "b" on the new element; this is the call that raises the error.
$c->setAttributeNS('http://www.w3.org/XML/1998/namespace', 'id', "b");
$a->appendChild($c);
print $dom->toString(1),"\n";
My latest interpretation is that we want to set attributes before we have a document owner, and let the appending method take care of attaching the id onto the document fragment.
This looks like a semi-legitimate concern for a libxml bug in setAttributeNS at the moment.
The question becomes which libxml methods coerce an overwrite of IDs of unbound nodes, and which don't. And is that separation a feature or a bug?
The problem with Element->new is that it creates an unnamespaced element, whereas $doc->createElement has the variant $doc->createElementNS, which not only specifies a namespace but also links to the namespace declaration node in the document. I've carefully contrived to put all the namespace declarations on the root node, and try to create elements using that interface so that libxml2 knows that all the same namespaces are actually the same. Otherwise, it starts throwing in random prefixes ('ns', 'ns1', etc.). Really ugly, and it breaks testing, even though the "Information Set" is the same.
Not to say the same effect can't be achieved other ways, but just that there's method to the madness...
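For anyone following along, here is a rough sketch of the pattern described above: declare the namespace once on the root, then create children through the document with createElementNS. The namespace URI, the 'ex' prefix and the element names are made up for illustration; this is not the LaTeXML code.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical namespace URI, chosen only for this sketch.
my $NS_URI = 'http://example.org/ns';

my $doc  = XML::LibXML::Document->new('1.0', 'UTF-8');
my $root = $doc->createElement('root');
$doc->setDocumentElement($root);

# Declare the namespace once on the root; the third argument (0) only
# declares it and leaves the root element itself unnamespaced.
$root->setNamespace($NS_URI, 'ex', 0);

# Create the children through the document, reusing the same URI and prefix.
for my $name (qw(first second)) {
    my $child = $doc->createElementNS($NS_URI, "ex:$name");
    $root->appendChild($child);
}

print $doc->toString(1);
```

Whether the serializer repeats the declaration on each child or reuses the one on the root is worth checking against your own libxml2 version; the sketch only shows the calls involved.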
OBTW: I agree that the documentation of unbindNode makes it reasonable that it should complain about duplicated ids. It's just a strange time to suddenly start enforcing that! :>
BTW: Your example above is "unconvincing" without at least an appendChild at the end:
$a->appendChild($c);
print "DOM: ".$dom->toString."\n";
This might be worth posting to either the libxml2 or perl-xml lists, or even crosspost (see https://mail.gnome.org/mailman/listinfo/xml or [email protected]).
Might either get suggestions for workarounds, or tickle the patch process!
Do you want to? Or should I? (I'm already subscribed to both; quite low-volume lists)
Feel free to mail it to them, I have my hands full at the moment. I added the appendChild at the end, but the error is raised on the setAttributeNS irrespective of what follows it.
Btw, completely destroying $b removes the warning:
$b->unbindNode();
undef $b;
But this is not so simple in LaTeXML, since we are reusing $b in a new and different subtree of its original parent.
I also have the eerie memory that explicitly calling undef on a node that was just unbound isn't safe, and that carelessly doing so quickly results in memory leaks.
I don't think the undef should be unsafe, since the DESTROY should only happen if the refcount goes to 0. If there is another reference, it simply won't get destroyed, and the id won't be cleaned up!
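To make the refcount argument concrete, here is a small pure-Perl sketch. The Tracked class is a hypothetical stand-in that only counts DESTROY calls; it says nothing about XML::LibXML's internals, only about when Perl itself runs a destructor.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a node class; it only counts DESTROY calls.
package Tracked;
our $destroyed = 0;
sub new     { return bless {}, shift }
sub DESTROY { $destroyed++ }

package main;

my $node  = Tracked->new;
my $alias = $node;                        # second reference to the same object

undef $node;                              # one reference left: no DESTROY yet
my $after_first = $Tracked::destroyed;

undef $alias;                             # last reference gone: DESTROY runs
my $after_second = $Tracked::destroyed;

print "after first undef: $after_first, after second undef: $after_second\n";
```

So an undef only triggers cleanup when it drops the last reference, which matches the point that a node still held elsewhere would keep its id registered.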
I asked on the xml list, and got some good response from Nick Wellnhofer. Basically he made the same suggestions you did! Namely, either set the node to undef or remove the id. Alternatively, you could put the node into a new document:
XML::LibXML::Document->new->adoptNode($node)
but that would be expensive.
I'm inclined to think that removing the id is the best approach:
$node->removeAttribute('xml:id');
I believe you tested it, and that got rid of the errors? At least the ones from the place where it was used; I'll have to scan through and find all the places that need that treatment. Should just be a few...
Oh, @dginev, so you get a notice! Ha!
For some reason it didn't even send me a notice ... Thanks for that, removeAttribute should do the trick, we just need to hunt down all positions where we unbind nodes (that we don't reuse later).
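A possible shape for that cleanup, as a sketch only: the helper name unbind_without_id is made up, and it assumes the unbound node is not needed with its old id afterwards. It combines the unbindNode and removeAttribute('xml:id') calls discussed above, then reuses the id on a new element as in the original script.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: detach a node and drop its xml:id so the id is
# free to be reused elsewhere in the same document.
sub unbind_without_id {
    my ($node) = @_;
    $node->unbindNode();
    $node->removeAttribute('xml:id');
    return $node;
}

my $dom = XML::LibXML->load_xml(
    string    => '<a xml:id="a"><b xml:id="b"/></a>',
    no_blanks => 1,
);
my $root = $dom->documentElement;
my $old  = unbind_without_id($root->firstChild);

# Same steps as the original script: new element, same id.
my $new = $dom->createElement('c');
$new->setAttributeNS('http://www.w3.org/XML/1998/namespace', 'id', 'b');
$root->appendChild($new);

print $dom->toString(1), "\n";
```

If a node is going to be reinserted later with the same id, this helper is the wrong tool, since the id would have to be set again at that point.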
Reading the documentation, this behaviour kind of makes sense. On one hand, we are told unbindNode() does not remove the node from the documentFragment. On the other, we are setting attributes through the document fragment, which is a cause of conflict.
This refactoring of line 22 works without any errors in libxml 2.9.2: