@MichaelChirico
Created September 6, 2020 03:59
Efficiency of reading XML documents with relative/absolute addresses
test_xml = '
<div>
  <div>
    <div>
      <div>
        <p>1</p>
        <p>2</p>
        <p>3</p>
      </div>
    </div>
  </div>
</div>
'
library(xml2)
doc = read_xml(test_xml)
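# Cache the innermost <div> (the parent of the <p> nodes) so that later
# queries can use short relative paths instead of the full path from `doc`.
sub_doc = xml_find_first(doc, './div/div/div')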
# Look up each <p> using the full path from the document root.
ps_from_full = function() {
  o = list(
    xml_find_first(doc, './div/div/div/p[1]'),
    xml_find_first(doc, './div/div/div/p[2]'),
    xml_find_first(doc, './div/div/div/p[3]')
  )
  sapply(o, xml_text)
}
# Look up each <p> relative to the cached innermost <div>.
ps_from_sub = function() {
  o = list(
    xml_find_first(sub_doc, './p[1]'),
    xml_find_first(sub_doc, './p[2]'),
    xml_find_first(sub_doc, './p[3]')
  )
  sapply(o, xml_text)
}
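# Compare timings; by default bench::mark() also checks that both
# expressions return equal results.
bench::mark(ps_from_full(), ps_from_sub())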