Christopher Kullenberg christopherkullenberg

christopherkullenberg / swepubxmlparser.py
Created December 30, 2015 13:07
Question concerning XML
"""
Data structure: http://libris.kb.se/xsearch?d=swepub&hitlist&q=l%C3%A4ros%C3%A4te%3agu&f=ext&spell=true&hist=true&n=200&p=1
Goal: extract only the text of <subfield code="u"> from records such as:

<datafield tag="700" ind1="1" ind2=" ">
    <subfield code="a">Alvestad, Torgeir,</subfield>
    <subfield code="d">1960-,</subfield>
    <subfield code="u">Göteborgs universitet, Institutionen för pedagogik och didaktik, University of Gothenburg, Department of Education</subfield>
    <subfield code="4">edt</subfield>
    <subfield code="0">(SwePub:chalmers.se)xalvto</subfield>
</datafield>
"""
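One possible way to pull out just the `code="u"` value is sketched below with the standard library's `xml.etree.ElementTree`. It is a minimal sketch under assumptions, not a tested solution for the live service: the sample record is the one quoted above, the wrapping `<record>` root element and the helper name `affiliations` are hypothetical, and the sample carries no XML namespace. If the real response declares a namespace on its elements, the tag names in the path would need to be namespace-qualified.

```python
import xml.etree.ElementTree as ET

# Sample record quoted in the question, wrapped in a hypothetical <record> root
# so that it parses as a complete document.
SAMPLE = """<record>
<datafield tag="700" ind1="1" ind2=" ">
<subfield code="a">Alvestad, Torgeir,</subfield>
<subfield code="d">1960-,</subfield>
<subfield code="u">Göteborgs universitet, Institutionen för pedagogik och didaktik, University of Gothenburg, Department of Education</subfield>
<subfield code="4">edt</subfield>
<subfield code="0">(SwePub:chalmers.se)xalvto</subfield>
</datafield>
</record>"""


def affiliations(xml_text, tag="700"):
    """Return the text of every <subfield code="u"> inside <datafield tag=...>."""
    root = ET.fromstring(xml_text)
    # ElementTree's limited XPath subset supports [@attribute='value'] predicates.
    path = ".//datafield[@tag='{}']/subfield[@code='u']".format(tag)
    return [subfield.text for subfield in root.findall(path)]


for value in affiliations(SAMPLE):
    print(value)
```

Filtering on both the `tag` and `code` attributes in the path keeps the other subfields (`a`, `d`, `4`, `0`) out of the result, so no post-filtering in Python is needed.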
import json
from os import listdir

for filename in listdir("GU20151228json/"):  # all files in the directory
    with open("GU20151228json/" + filename) as currentFile:
        jsondata = json.load(currentFile)
    print(jsondata)

from urllib.request import urlopen

counter = 1
while True:
    url = 'http://libris.kb.se/xsearch?d=swepub&hitlist&q=l%C3%A4ros%C3%A4te%3agu&f=ext&spell=true&hist=true&n=200&format=json&start=' + str(counter)
    print("Fetching: " + url)
    data = urlopen(url).read()
    # A page without any "identifier" key contains no records.
    if b'"identifier"' not in data:
        print("No more records!")