@zacharysyoung
Last active March 11, 2023 01:21
Trying to help answer SO-75698546
  • input.xml: a sample of OP's XML. The downloaded XML incorrectly declares its encoding as ISO-8859-1; it is actually encoded as Windows-1252. I tried viewing the Raw representation in this Gist and copying and pasting it over my original file; git reports no modifications when I do, so I presume the copy-paste preserves the Windows-1252 encoding.
  • main.py: OP's original program with some small tweaks for style and type correction; I also fixed the issue with the rupture nodes not being iterated.
  • output.csv: what main.py generates given input.xml
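To illustrate why the declared encoding matters, here is a minimal sketch. The byte string is a made-up sample, not taken from OP's file. ISO-8859-1 and Windows-1252 agree on most bytes but differ in the 0x80-0x9F range, where Windows-1252 places printable characters such as the euro sign and curly quotes, and ISO-8859-1 places control characters:

```python
# Hypothetical sample bytes: a curly apostrophe (0x92) and a euro sign (0x80)
# as they would be stored in a Windows-1252 file.
raw = b"l\x92essence 2\x80"

as_cp1252 = raw.decode("windows-1252")  # 0x92 -> U+2019, 0x80 -> U+20AC
as_latin1 = raw.decode("iso-8859-1")    # 0x92 -> U+0092, 0x80 -> U+0080 (control characters)

# ascii() escapes non-ASCII characters, so the difference is visible on any console
print(ascii(as_cp1252))
print(ascii(as_latin1))
```

Both decodes succeed without raising, which is why a wrong declaration can go unnoticed until the text is displayed or re-encoded.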
<?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
<pdv_liste>
  <pdv id="1000013" latitude="4619851.83794" longitude="524350.637881" cp="01000" pop="R">
    <adresse>BOULEVARD CHARLES DE GAULLE</adresse>
    <ville>Bourg-en-Bresse</ville>
    <horaires automate-24-24="">
      <jour id="1" nom="Lundi" ferme="" />
      <jour id="2" nom="Mardi" ferme="" />
      <jour id="3" nom="Mercredi" ferme="" />
      <jour id="4" nom="Jeudi" ferme="" />
      <jour id="5" nom="Vendredi" ferme="" />
      <jour id="6" nom="Samedi" ferme="" />
      <jour id="7" nom="Dimanche" ferme="" />
    </horaires>
    <services>
      <service>Carburant additivé</service>
      <service>DAB (Distributeur automatique de billets)</service>
    </services>
    <prix nom="Gazole" id="1" maj="2023-03-09T00:01:00" valeur="1.805" />
    <prix nom="E10" id="5" maj="2023-03-09T00:01:00" valeur="1.843" />
    <prix nom="SP98" id="6" maj="2023-03-09T00:01:00" valeur="1.919" />
    <rupture id="2" nom="SP95" debut="2022-09-27T09:15:21" fin="" />
    <rupture id="3" nom="E85" debut="2022-09-27T09:15:21" fin="" />
    <rupture id="4" nom="GPLc" debut="2022-09-27T09:15:22" fin="" />
  </pdv>
</pdv_liste>
#!/usr/bin/env python3
import csv
import xml.etree.ElementTree as ET

fields = [
    "pdv_id",
    "latitude",
    "longitude",
    "cp",
    "pop",
    "adresse",
    "ville",
    "jour_nom",
    "ferme",
    "prix_nom",
    "valeur",
    "rupture_nom",
    "debut",
    "fin",
]

with (
    # XML appears to actually be encoded as Windows-1252
    open("input.xml", encoding="windows-1252") as f_xml,
    open("output.csv", "w", newline="") as f_csv,
):
    writer = csv.DictWriter(f_csv, fieldnames=fields)
    writer.writeheader()

    root = ET.parse(f_xml).getroot()
    for pdv in root.iter("pdv"):
        row = {
            "pdv_id": pdv.attrib["id"],
            "latitude": pdv.attrib["latitude"],
            "longitude": pdv.attrib["longitude"],
            "cp": pdv.attrib["cp"],
            "pop": pdv.attrib["pop"],
            "adresse": "",
            "ville": "",
        }
        for nodename in ["adresse", "ville"]:
            node = pdv.find(nodename)
            # Compare against None: an Element with no child elements is falsy
            if node is not None:
                row[nodename] = node.text

        # Extract jour data and append to row
        for jour in pdv.iter("jour"):
            row["jour_nom"] = jour.attrib["nom"]
            row["ferme"] = jour.attrib["ferme"]
            writer.writerow(row)

        # Extract prix data and append to row
        for prix in pdv.iter("prix"):
            row["prix_nom"] = prix.attrib["nom"]
            row["valeur"] = prix.attrib["valeur"]
            writer.writerow(row)

        # Extract rupture data and append to row
        for rupture in pdv.iter("rupture"):
            row["rupture_nom"] = rupture.attrib["nom"]
            row["debut"] = rupture.attrib["debut"]
            row["fin"] = rupture.attrib["fin"]
            writer.writerow(row)
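One detail worth spelling out for the adresse/ville lookup: `Element.find()` returns `None` when nothing matches, and an element without child elements has length 0, so a plain truthiness test can skip an element that exists and has text. A minimal sketch with an inline, made-up XML snippet (not OP's file):

```python
import xml.etree.ElementTree as ET

# Made-up snippet: <ville> is present with text, <adresse> is absent
pdv = ET.fromstring("<pdv><ville>Bourg-en-Bresse</ville></pdv>")

ville = pdv.find("ville")      # exists and has text, but no child elements
missing = pdv.find("adresse")  # no match, so find() returns None

found_ville = ville is not None
found_adresse = missing is not None
n_children = len(ville)        # 0: length counts child elements, not text

print(found_ville, found_adresse, n_children, ville.text)
```

Comparing against `None` distinguishes "not found" from "found but childless", which is the case for both adresse and ville in the sample XML.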
pdv_id,latitude,longitude,cp,pop,adresse,ville,jour_nom,ferme,prix_nom,valeur,rupture_nom,debut,fin
1000013,4619851.83794,524350.637881,01000,R,BOULEVARD CHARLES DE GAULLE,Bourg-en-Bresse,Lundi,,,,,,
1000013,4619851.83794,524350.637881,01000,R,BOULEVARD CHARLES DE GAULLE,Bourg-en-Bresse,Mardi,,,,,,
1000013,4619851.83794,524350.637881,01000,R,BOULEVARD CHARLES DE GAULLE,Bourg-en-Bresse,Mercredi,,,,,,
1000013,4619851.83794,524350.637881,01000,R,BOULEVARD CHARLES DE GAULLE,Bourg-en-Bresse,Jeudi,,,,,,
1000013,4619851.83794,524350.637881,01000,R,BOULEVARD CHARLES DE GAULLE,Bourg-en-Bresse,Vendredi,,,,,,
1000013,4619851.83794,524350.637881,01000,R,BOULEVARD CHARLES DE GAULLE,Bourg-en-Bresse,Samedi,,,,,,
1000013,4619851.83794,524350.637881,01000,R,BOULEVARD CHARLES DE GAULLE,Bourg-en-Bresse,Dimanche,,,,,,
1000013,4619851.83794,524350.637881,01000,R,BOULEVARD CHARLES DE GAULLE,Bourg-en-Bresse,Dimanche,,Gazole,1.805,,,
1000013,4619851.83794,524350.637881,01000,R,BOULEVARD CHARLES DE GAULLE,Bourg-en-Bresse,Dimanche,,E10,1.843,,,
1000013,4619851.83794,524350.637881,01000,R,BOULEVARD CHARLES DE GAULLE,Bourg-en-Bresse,Dimanche,,SP98,1.919,,,
1000013,4619851.83794,524350.637881,01000,R,BOULEVARD CHARLES DE GAULLE,Bourg-en-Bresse,Dimanche,,SP98,1.919,SP95,2022-09-27T09:15:21,
1000013,4619851.83794,524350.637881,01000,R,BOULEVARD CHARLES DE GAULLE,Bourg-en-Bresse,Dimanche,,SP98,1.919,E85,2022-09-27T09:15:21,
1000013,4619851.83794,524350.637881,01000,R,BOULEVARD CHARLES DE GAULLE,Bourg-en-Bresse,Dimanche,,SP98,1.919,GPLc,2022-09-27T09:15:22,
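In the output above, values set by an earlier loop persist in later rows (Dimanche appears in every prix row, and SP98 with 1.919 in every rupture row) because the same `row` dict is updated and written again. If that carry-over is not wanted (an assumption; the question may intend it), one possible variation builds each child row from a fresh copy of the shared station fields. The helper name `child_rows` and the trimmed-down sample element are hypothetical:

```python
import xml.etree.ElementTree as ET


def child_rows(pdv, base):
    """Yield one dict per jour/prix/rupture child, each built from a copy of base."""
    for jour in pdv.iter("jour"):
        yield {**base, "jour_nom": jour.attrib["nom"], "ferme": jour.attrib["ferme"]}
    for prix in pdv.iter("prix"):
        yield {**base, "prix_nom": prix.attrib["nom"], "valeur": prix.attrib["valeur"]}
    for rupture in pdv.iter("rupture"):
        yield {
            **base,
            "rupture_nom": rupture.attrib["nom"],
            "debut": rupture.attrib["debut"],
            "fin": rupture.attrib["fin"],
        }


# Trimmed-down, made-up station with one child of each kind
sample = ET.fromstring(
    '<pdv id="1">'
    '<horaires><jour id="7" nom="Dimanche" ferme="" /></horaires>'
    '<prix nom="SP98" id="6" valeur="1.919" />'
    '<rupture id="2" nom="SP95" debut="2022-09-27T09:15:21" fin="" />'
    "</pdv>"
)

rows = list(child_rows(sample, {"pdv_id": sample.attrib["id"]}))
for r in rows:
    print(r)
```

Each dict can be passed to `writer.writerow()`; `csv.DictWriter` fills fields missing from the dict with its `restval`, which is an empty string by default.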