With the Python standard library only (xstream.py): stream through a large file by looking at one element, e.g. one "record", at a time.
$ metha-sync https://bop.unibe.ch/baf/oai && \
metha-cat https://bop.unibe.ch/baf/oai | python xstream.py
0 <ns0:record xmlns:ns [...] /><ns0:about /></ns0:record>
1 <ns0:record xmlns:ns [...] /><ns0:about /></ns0:record>
2 <ns0:record xmlns:ns [...] /><ns0:about /></ns0:record>
3 <ns0:record xmlns:dc [...] ta><ns0:about /></ns0:record>
4 <ns0:record xmlns:dc [...] ta><ns0:about /></ns0:record>
5 <ns0:record xmlns:dc [...] ta><ns0:about /></ns0:record>
6 <ns0:record xmlns:dc [...] ta><ns0:about /></ns0:record>
7 <ns0:record xmlns:dc [...] ta><ns0:about /></ns0:record>
8 <ns0:record xmlns:dc [...] ta><ns0:about /></ns0:record>
9 <ns0:record xmlns:dc [...] ta><ns0:about /></ns0:record>
...
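For illustration, here is a minimal sketch of how such record-at-a-time streaming can be done with xml.etree.ElementTree.iterparse. It is not the actual xstream.py; the names iter_elements, shorten and SAMPLE are hypothetical, and the sample document only mimics the assumed shape of metha-cat output (a Records root wrapping OAI record elements).

```python
import io
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"

# Tiny stand-in for metha-cat output (assumed shape, made-up identifiers).
SAMPLE = b"""<Records>
<record xmlns="http://www.openarchives.org/OAI/2.0/"><header><identifier>oai:example:1</identifier></header></record>
<record xmlns="http://www.openarchives.org/OAI/2.0/"><header><identifier>oai:example:2</identifier></header></record>
</Records>"""


def iter_elements(stream, tag):
    """Yield each completed element with the given tag from a binary stream."""
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield elem
            # Detach already processed children from the root so that
            # memory use does not grow with the size of the input.
            root.clear()


def shorten(text, head=20, tail=30):
    """Collapse whitespace and elide the middle of a long string."""
    text = " ".join(text.split())
    if len(text) <= head + tail:
        return text
    return text[:head] + " [...] " + text[-tail:]


record_tag = "{%s}record" % OAI_NS
for i, rec in enumerate(iter_elements(io.BytesIO(SAMPLE), record_tag)):
    print(i, shorten(ET.tostring(rec, encoding="unicode")))
```

To use this in a pipeline like the one above, pass sys.stdin.buffer instead of the in-memory sample. Clearing the root after each record is what keeps the approach usable for large files.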
Hacky, but ... hacky: with xmlcutty, mostly useful only for generating lists from the text within tags.
$ metha-sync https://bop.unibe.ch/baf/oai && \
metha-cat https://bop.unibe.ch/baf/oai | \
xmlcutty -path /Records/record/metadata/dc/identifier -rename '\n' | \
grep ^http
https://bop.unibe.ch/baf/article/view/3666
https://bop.unibe.ch/baf/article/view/3667
https://bop.unibe.ch/baf/article/view/4195
https://bop.unibe.ch/baf/article/view/4211
https://bop.unibe.ch/baf/article/view/4454
https://bop.unibe.ch/baf/article/view/7197
https://bop.unibe.ch/baf/article/view/7275
https://bop.unibe.ch/baf/article/view/3355
https://bop.unibe.ch/baf/article/view/3498
...
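If xmlcutty is not at hand, a similar list can probably be produced with the Python standard library. The sketch below is an assumption-laden alternative, not how xmlcutty works: it matches the Dublin Core identifier element by its namespace rather than by path, and iter_links, SAMPLE and the example.org URL are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

OAI_RECORD = "{http://www.openarchives.org/OAI/2.0/}record"
DC_IDENTIFIER = "{http://purl.org/dc/elements/1.1/}identifier"

# Tiny stand-in for metha-cat output (assumed shape, made-up values).
SAMPLE = b"""<Records>
<record xmlns="http://www.openarchives.org/OAI/2.0/"><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:identifier>https://example.org/article/view/1</dc:identifier>
<dc:identifier>10.0000/example.1</dc:identifier>
</oai_dc:dc></metadata></record>
</Records>"""


def iter_links(stream):
    """Yield the text of every dc:identifier that starts with "http"."""
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # keep a handle on the root element
    for event, elem in context:
        if event != "end":
            continue
        if elem.tag == DC_IDENTIFIER:
            text = (elem.text or "").strip()
            if text.startswith("http"):  # same filter as `grep ^http`
                yield text
        elif elem.tag == OAI_RECORD:
            root.clear()  # drop finished records to keep memory flat


for link in iter_links(io.BytesIO(SAMPLE)):
    print(link)
```

As with the pipeline above, identifiers that are not URLs (for example DOIs) are filtered out.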
Generate a Go struct that can handle a certain XML document structure. With zek, the -p flag generates a small example program that converts XML to JSON in a streaming fashion.
$ metha-sync https://bop.unibe.ch/baf/oai && \
metha-cat https://bop.unibe.ch/baf/oai | \
zek -p -j > example.go
$ metha-sync https://bop.unibe.ch/baf/oai && \
metha-cat https://bop.unibe.ch/baf/oai | \
GO111MODULE=off go run example.go | jq .
{
"XMLName": {
"Space": "",
"Local": "Records"
},
"Xsi": "http://www.w3.org/2001/XMLSchema-instance",
"Record": [
{
"Text": "",
"Xmlns": "http://www.openarchives.org/OAI/2.0/",
"Header": {
"Text": "",
"Status": "deleted",
"Identifier": "oai:ojs.bop.unibe.ch:article/2781",
"Datestamp": "2016-06-02T13:46:06Z",
"SetSpec": "baf:ART"
},
"Metadata": {
"Text": "",