If you need to compare two XML documents whose elements appear in a different order, and that order does not matter (i.e. the elements represent unordered collections rather than ordered lists), here is a nice way to do it.
Consider the following files:
user@host:~/xmldiff$ cat a.xml
<?xml version="1.0"?>
<Element xmlns="https://mynamespace.com/path">
<SubElement>
<Items>
<Item>
<Field1>Value 1</Field1>
<Field2>Value 2</Field2>
<Field3>Value 3</Field3>
</Item>
</Items>
</SubElement>
</Element>
user@host:~/xmldiff$ cat b.xml
<?xml version="1.0"?>
<Element xmlns="https://mynamespace.com/path">
<SubElement>
<Items>
<Item>
<Field1>Value 1</Field1>
<Field3>Value 3</Field3>
<Field2>Value 2</Field2>
</Item>
</Items>
</SubElement>
</Element>
They do not diff cleanly:
user@host:~/xmldiff$ diff a.xml b.xml
7d6
< <Field2>Value 2</Field2>
8a8
> <Field2>Value 2</Field2>
But what if the order of Field1, Field2, and Field3 within Item does not matter? Then diff reports a difference that is not a real one, which is problematic.
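We can use this XSL template to reformat the XML. It copies every element with its attributes, sorts child elements by their name attribute if present (otherwise by their first text node, with the element name as a tie-breaker), and keeps the text of leaf elements:
user@host:~/xmldiff$ cat xmlsort.xsl
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" />
<xsl:template match="*">
<xsl:copy>
<xsl:copy-of select="@*" />
<!-- Emit child elements in a canonical order -->
<xsl:apply-templates select="*">
<xsl:sort select="(@name | text())[1]" />
<xsl:sort select="name()" />
</xsl:apply-templates>
<!-- Keep the text of leaf elements so that changed values still show up in the diff -->
<xsl:if test="not(*)"><xsl:value-of select="." /></xsl:if>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Apply it with xsltproc:
user@host:~/xmldiff$ xsltproc xmlsort.xsl a.xml
<?xml version="1.0"?>
<Element xmlns="https://mynamespace.com/path"><SubElement><Items><Item><Field1>Value 1</Field1><Field2>Value 2</Field2><Field3>Value 3</Field3></Item></Items></SubElement></Element>
user@host:~/xmldiff$ xsltproc xmlsort.xsl b.xml
<?xml version="1.0"?>
<Element xmlns="https://mynamespace.com/path"><SubElement><Items><Item><Field1>Value 1</Field1><Field2>Value 2</Field2><Field3>Value 3</Field3></Item></Items></SubElement></Element>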
Now feed the outputs to diff:
user@host:~/xmldiff$ diff <(xsltproc xmlsort.xsl a.xml) <(xsltproc xmlsort.xsl b.xml)
user@host:~/xmldiff$ echo $?
0
And we're done! Remember that in some XML documents the order of elements does matter, and for those this is not a valid diff. But if all of your data represents unordered collections rather than ordered lists, this may work well for you!
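If xsltproc is not available, the same idea (put the children into a canonical order, then compare) can be sketched in Python with only the standard library. This is a rough sketch, not a replacement for the stylesheet: the names canonical and same_unordered and the sample strings A, B, and C are made up for illustration. It compares tag names, attributes, and whitespace-stripped element text, and it ignores comments and any text that follows a child element.

```python
import xml.etree.ElementTree as ET


def canonical(elem):
    """Return a nested tuple describing elem that ignores child order."""
    # Recursively canonicalize the children, then sort them so that
    # documents differing only in child order produce equal tuples.
    children = tuple(sorted(canonical(child) for child in elem))
    attrs = tuple(sorted(elem.attrib.items()))
    text = (elem.text or "").strip()
    return (elem.tag, attrs, text, children)


def same_unordered(xml_a, xml_b):
    """True if the two XML strings match when child order is ignored."""
    return canonical(ET.fromstring(xml_a)) == canonical(ET.fromstring(xml_b))


# Hypothetical sample data modeled on a.xml and b.xml (shortened).
A = """<Element xmlns="https://mynamespace.com/path">
  <Item><Field1>Value 1</Field1><Field2>Value 2</Field2><Field3>Value 3</Field3></Item>
</Element>"""

B = """<Element xmlns="https://mynamespace.com/path">
  <Item><Field1>Value 1</Field1><Field3>Value 3</Field3><Field2>Value 2</Field2></Item>
</Element>"""

# Same structure as A, but with one changed value.
C = """<Element xmlns="https://mynamespace.com/path">
  <Item><Field1>Value 1</Field1><Field2>changed</Field2><Field3>Value 3</Field3></Item>
</Element>"""

if __name__ == "__main__":
    print(same_unordered(A, B))  # True: only the order differs
    print(same_unordered(A, C))  # False: a value differs
```

Unlike diff, this only answers whether the documents match, not where they differ, so it is better suited to a quick check in a script than to reviewing changes by eye.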