- input.xml: a sample of OP's XML. The downloaded XML incorrectly declares its encoding as ISO-8859-1; it is really encoded as Windows-1252. I tried viewing the Raw representation in this Gist and copy-pasting it over my original file; git didn't report any modifications, so I presume the copy-paste preserves the Windows-1252 encoding.
- main.py: OP's original program with some small style and type corrections; I also fixed the issue where the rupture nodes were not being iterated.
- output.csv: what main.py generates given input.xml.
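To illustrate why the declared encoding matters, here is a minimal sketch of where Windows-1252 and ISO-8859-1 disagree. The sample strings are made up for the demonstration and are not taken from OP's data; the point is only that bytes in the 0x80-0x9F range decode to printable characters under Windows-1252 but to C1 control characters under ISO-8859-1, while ordinary accented letters such as "é" decode identically under both.

```python
# Hypothetical sample text: "œ" (0x9C) and "€" (0x80) live in the byte range
# where the two encodings differ.
sample = "Cœur 1,80 €"
raw = sample.encode("windows-1252")

as_cp1252 = raw.decode("windows-1252")
as_latin1 = raw.decode("iso-8859-1")

print(as_cp1252)        # round-trips to the original text
print(repr(as_latin1))  # same length, but with control characters instead

# Text that only uses characters shared by both encodings is unaffected.
shared = "Carburant additivé".encode("windows-1252").decode("iso-8859-1")
print(shared)
```

This is why a file can look fine under the wrong declaration until a character from that range shows up.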
Last active
March 11, 2023 01:21
Trying to help answer SO-75698546
<?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
<pdv_liste>
  <pdv id="1000013" latitude="4619851.83794" longitude="524350.637881" cp="01000" pop="R">
    <adresse>BOULEVARD CHARLES DE GAULLE</adresse>
    <ville>Bourg-en-Bresse</ville>
    <horaires automate-24-24="">
      <jour id="1" nom="Lundi" ferme="" />
      <jour id="2" nom="Mardi" ferme="" />
      <jour id="3" nom="Mercredi" ferme="" />
      <jour id="4" nom="Jeudi" ferme="" />
      <jour id="5" nom="Vendredi" ferme="" />
      <jour id="6" nom="Samedi" ferme="" />
      <jour id="7" nom="Dimanche" ferme="" />
    </horaires>
    <services>
      <service>Carburant additivé</service>
      <service>DAB (Distributeur automatique de billets)</service>
    </services>
    <prix nom="Gazole" id="1" maj="2023-03-09T00:01:00" valeur="1.805" />
    <prix nom="E10" id="5" maj="2023-03-09T00:01:00" valeur="1.843" />
    <prix nom="SP98" id="6" maj="2023-03-09T00:01:00" valeur="1.919" />
    <rupture id="2" nom="SP95" debut="2022-09-27T09:15:21" fin="" />
    <rupture id="3" nom="E85" debut="2022-09-27T09:15:21" fin="" />
    <rupture id="4" nom="GPLc" debut="2022-09-27T09:15:22" fin="" />
  </pdv>
</pdv_liste>
#!/usr/bin/env python3
import csv
import xml.etree.ElementTree as ET

fields = [
    "pdv_id",
    "latitude",
    "longitude",
    "cp",
    "pop",
    "adresse",
    "ville",
    "jour_nom",
    "ferme",
    "prix_nom",
    "valeur",
    "rupture_nom",
    "debut",
    "fin",
]

with (
    # XML appears to actually be encoded as Windows-1252
    open("input.xml", encoding="windows-1252") as f_xml,
    open("output.csv", "w", newline="") as f_csv,
):
    writer = csv.DictWriter(f_csv, fieldnames=fields)
    writer.writeheader()

    root = ET.parse(f_xml).getroot()
    for pdv in root.iter("pdv"):
        row = {
            "pdv_id": pdv.attrib["id"],
            "latitude": pdv.attrib["latitude"],
            "longitude": pdv.attrib["longitude"],
            "cp": pdv.attrib["cp"],
            "pop": pdv.attrib["pop"],
            "adresse": "",
            "ville": "",
        }

        for nodename in ["adresse", "ville"]:
            node = pdv.find(nodename)
            # find() returns None when the node is missing; compare against
            # None rather than truth-testing the Element itself
            if node is not None:
                row[nodename] = node.text

        # The same row dict is reused below, so values set in one loop
        # carry over into the rows written by the later loops.

        # Extract jour data and write one row per jour
        for jour in pdv.iter("jour"):
            row["jour_nom"] = jour.attrib["nom"]
            row["ferme"] = jour.attrib["ferme"]
            writer.writerow(row)

        # Extract prix data and write one row per prix
        for prix in pdv.iter("prix"):
            row["prix_nom"] = prix.attrib["nom"]
            row["valeur"] = prix.attrib["valeur"]
            writer.writerow(row)

        # Extract rupture data and write one row per rupture
        for rupture in pdv.iter("rupture"):
            row["rupture_nom"] = rupture.attrib["nom"]
            row["debut"] = rupture.attrib["debut"]
            row["fin"] = rupture.attrib["fin"]
            writer.writerow(row)
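A note on checking the result of `find()` for leaf nodes like `adresse` and `ville`: as far as I know, truth-testing an `Element` has historically reflected whether it has child elements (and is deprecated in recent Python versions), so a present-but-childless node does not reliably pass a plain `if node:` test. The sketch below uses a one-node document made up from the sample above to show the checks that do not depend on that behavior: comparing against `None`, or using `findtext()` with a default.

```python
import xml.etree.ElementTree as ET

# Trimmed-down document for illustration; only <adresse> is present.
pdv = ET.fromstring("<pdv><adresse>BOULEVARD CHARLES DE GAULLE</adresse></pdv>")

adresse = pdv.find("adresse")  # present, but has no child elements
ville = pdv.find("ville")      # absent, so find() returns None

print(adresse is not None, len(adresse))  # the node exists and has 0 children
print(ville is None)

# findtext() folds the lookup and the fallback into one call.
adresse_text = pdv.findtext("adresse", default="")
ville_text = pdv.findtext("ville", default="")
print(repr(adresse_text), repr(ville_text))
```

Either form keeps the text of a present node and falls back to an empty string for a missing one.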
| pdv_id | latitude | longitude | cp | pop | adresse | ville | jour_nom | ferme | prix_nom | valeur | rupture_nom | debut | fin |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1000013 | 4619851.83794 | 524350.637881 | 01000 | R | BOULEVARD CHARLES DE GAULLE | Bourg-en-Bresse | Lundi | | | | | | |
| 1000013 | 4619851.83794 | 524350.637881 | 01000 | R | BOULEVARD CHARLES DE GAULLE | Bourg-en-Bresse | Mardi | | | | | | |
| 1000013 | 4619851.83794 | 524350.637881 | 01000 | R | BOULEVARD CHARLES DE GAULLE | Bourg-en-Bresse | Mercredi | | | | | | |
| 1000013 | 4619851.83794 | 524350.637881 | 01000 | R | BOULEVARD CHARLES DE GAULLE | Bourg-en-Bresse | Jeudi | | | | | | |
| 1000013 | 4619851.83794 | 524350.637881 | 01000 | R | BOULEVARD CHARLES DE GAULLE | Bourg-en-Bresse | Vendredi | | | | | | |
| 1000013 | 4619851.83794 | 524350.637881 | 01000 | R | BOULEVARD CHARLES DE GAULLE | Bourg-en-Bresse | Samedi | | | | | | |
| 1000013 | 4619851.83794 | 524350.637881 | 01000 | R | BOULEVARD CHARLES DE GAULLE | Bourg-en-Bresse | Dimanche | | | | | | |
| 1000013 | 4619851.83794 | 524350.637881 | 01000 | R | BOULEVARD CHARLES DE GAULLE | Bourg-en-Bresse | Dimanche | | Gazole | 1.805 | | | |
| 1000013 | 4619851.83794 | 524350.637881 | 01000 | R | BOULEVARD CHARLES DE GAULLE | Bourg-en-Bresse | Dimanche | | E10 | 1.843 | | | |
| 1000013 | 4619851.83794 | 524350.637881 | 01000 | R | BOULEVARD CHARLES DE GAULLE | Bourg-en-Bresse | Dimanche | | SP98 | 1.919 | | | |
| 1000013 | 4619851.83794 | 524350.637881 | 01000 | R | BOULEVARD CHARLES DE GAULLE | Bourg-en-Bresse | Dimanche | | SP98 | 1.919 | SP95 | 2022-09-27T09:15:21 | |
| 1000013 | 4619851.83794 | 524350.637881 | 01000 | R | BOULEVARD CHARLES DE GAULLE | Bourg-en-Bresse | Dimanche | | SP98 | 1.919 | E85 | 2022-09-27T09:15:21 | |
| 1000013 | 4619851.83794 | 524350.637881 | 01000 | R | BOULEVARD CHARLES DE GAULLE | Bourg-en-Bresse | Dimanche | | SP98 | 1.919 | GPLc | 2022-09-27T09:15:22 | |
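In the output above, the last jour ("Dimanche") and the last prix ("SP98") carry over into later rows. That may or may not be what OP wants. If it isn't, one possible variation is to build each row from a fresh copy of the per-pdv values. The following is only a sketch under that assumption: the `CHILD_COLUMNS` mapping and the `rows_for` helper are names I made up, the XML is a trimmed-down version of the sample, and the column list is shortened to keep the example small.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Trimmed-down sample with one child of each kind (illustration only).
SAMPLE = """<pdv_liste>
  <pdv id="1000013" cp="01000">
    <horaires><jour id="7" nom="Dimanche" ferme="" /></horaires>
    <prix nom="Gazole" id="1" valeur="1.805" />
    <rupture id="2" nom="SP95" debut="2022-09-27T09:15:21" fin="" />
  </pdv>
</pdv_liste>"""

FIELDS = ["pdv_id", "cp", "jour_nom", "ferme", "prix_nom", "valeur",
          "rupture_nom", "debut", "fin"]

# Hypothetical mapping: child tag -> {CSV column: XML attribute}
CHILD_COLUMNS = {
    "jour": {"jour_nom": "nom", "ferme": "ferme"},
    "prix": {"prix_nom": "nom", "valeur": "valeur"},
    "rupture": {"rupture_nom": "nom", "debut": "debut", "fin": "fin"},
}


def rows_for(pdv):
    """Yield one row per jour/prix/rupture, each built from a fresh copy."""
    base = {"pdv_id": pdv.attrib["id"], "cp": pdv.attrib["cp"]}
    for tag, columns in CHILD_COLUMNS.items():
        for child in pdv.iter(tag):
            row = dict(base)  # copy, so nothing leaks from earlier rows
            for column, attribute in columns.items():
                row[column] = child.attrib[attribute]
            yield row


out = io.StringIO()  # stands in for output.csv
writer = csv.DictWriter(out, fieldnames=FIELDS)  # missing keys are written as ""
writer.writeheader()
for pdv in ET.fromstring(SAMPLE).iter("pdv"):
    writer.writerows(rows_for(pdv))

print(out.getvalue())
```

With this shape, a prix row leaves the jour columns empty and a rupture row leaves both the jour and prix columns empty.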