Data To Concatenate
Headword,Syriaca_URI,Map_1,Map_2,Map_3,Map_4,Map_5,Map_6,Map_7,Map_8,Map_9,Map_10,Map_11,Map_12,Map_13,Map_14
Abila,syriaca.org/place/4355,1,,,4,,,,,,,,,,
Abivard,syriaca.org/place/4222,,,,,,,,,9,,,,,
Abr Shahr,syriaca.org/place/4220,,,,,,,,,9,10,,,,
Acre|Akko,syriaca.org/place/14,1,2,,,5,,,,,,,,,
Headword,Syriaca_URI,Map_1,Map_2,Map_3,Map_4,Map_5,Map_6,Map_7,Map_8,Map_9,Map_10,Map_11,Map_12,Map_13,Map_14
Abila,syriaca.org/place/4355,1,,,4,,,,,,,,,,
Abivard,syriaca.org/place/4222,,,,,,,,,9,,,,,
Abr Shahr,syriaca.org/place/4220,,,,,,,,,9,10,,,,
Acre,syriaca.org/place/14,,2,,,5,,,,,,,,,
Akko,syriaca.org/place/14,1,,,,5,,,,,,,,,
Hi Dave,
Here's a query expression that should do the trick of merging rows with the same identifiers:
xquery version "3.1";
(: Merges content from rows with the same identifiers :)
let $uri := "https://gist.githubusercontent.com/CliffordAnderson/592c2c43870a9ef3335d26714e51b38a/raw/2b1c467d2bafdbba13e0a1dd91abafebfd2e7722/Original.csv"
let $csv := fetch:text($uri) => csv:parse(map{"header":fn:true()})
(: One output record per distinct identifier :)
let $ids := fn:distinct-values($csv/csv/record/Syriaca_URI)
(: Distinct column names across all records :)
let $rows := fn:distinct-values($csv/csv/record/* ! fn:node-name(.))
for $id in $ids
return
  element record {
    for $cell in $rows
    return element {$cell} {
      let $contents := $csv/csv/record[Syriaca_URI = $id]/*[fn:name(.) = fn:string($cell)]/text()
      (: Drop repeated values, then join whatever remains with "|" :)
      return fn:string-join(fn:distinct-values($contents), "|")
    }
  }
Output:
<record>
  <Headword>Abila</Headword>
  <Syriaca_URI>syriaca.org/place/4355</Syriaca_URI>
  <Map_1>1</Map_1>
  <Map_2/>
  <Map_3/>
  <Map_4>4</Map_4>
  <Map_5/>
  <Map_6/>
  <Map_7/>
  <Map_8/>
  <Map_9/>
  <Map_10/>
  <Map_11/>
  <Map_12/>
  <Map_13/>
  <Map_14/>
</record>
<record>
  <Headword>Abivard</Headword>
  <Syriaca_URI>syriaca.org/place/4222</Syriaca_URI>
  <Map_1/>
  <Map_2/>
  <Map_3/>
  <Map_4/>
  <Map_5/>
  <Map_6/>
  <Map_7/>
  <Map_8/>
  <Map_9>9</Map_9>
  <Map_10/>
  <Map_11/>
  <Map_12/>
  <Map_13/>
  <Map_14/>
</record>
<record>
  <Headword>Abr Shahr</Headword>
  <Syriaca_URI>syriaca.org/place/4220</Syriaca_URI>
  <Map_1/>
  <Map_2/>
  <Map_3/>
  <Map_4/>
  <Map_5/>
  <Map_6/>
  <Map_7/>
  <Map_8/>
  <Map_9>9</Map_9>
  <Map_10>10</Map_10>
  <Map_11/>
  <Map_12/>
  <Map_13/>
  <Map_14/>
</record>
<record>
  <Headword>Acre|Akko</Headword>
  <Syriaca_URI>syriaca.org/place/14</Syriaca_URI>
  <Map_1>1</Map_1>
  <Map_2>2</Map_2>
  <Map_3/>
  <Map_4/>
  <Map_5>5</Map_5>
  <Map_6/>
  <Map_7/>
  <Map_8/>
  <Map_9/>
  <Map_10/>
  <Map_11/>
  <Map_12/>
  <Map_13/>
  <Map_14/>
</record>
We have two goals. First, we want the script to merge rows that share a URI and concatenate their headwords (Acre|Akko). Second, we want to merge the map numbers for that URI (1,2,,,5,,,,,,,,,) so that no map number is duplicated.
The end goal is further transformation in XML, so an XML result is fine. In that case the merged headwords could look like:
<headword><label>Acre</label><label>Akko</label></headword>
rather than being concatenated into a single element.
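The nested-label form described above could be produced along these lines. This is only a minimal sketch, not a tested solution: the inline `$records` sample is a hypothetical stand-in for the parsed CSV (trimmed to the Acre and Akko rows and three map columns), and it assumes a processor that supports the XQuery 3.0 `group by` clause.

```xquery
xquery version "3.1";

(: Hypothetical sample standing in for the parsed CSV; only the
   Acre/Akko rows and three of the map columns are reproduced. :)
declare variable $records :=
  <csv>
    <record>
      <Headword>Acre</Headword>
      <Syriaca_URI>syriaca.org/place/14</Syriaca_URI>
      <Map_1/><Map_2>2</Map_2><Map_5>5</Map_5>
    </record>
    <record>
      <Headword>Akko</Headword>
      <Syriaca_URI>syriaca.org/place/14</Syriaca_URI>
      <Map_1>1</Map_1><Map_2/><Map_5>5</Map_5>
    </record>
  </csv>;

(: After "group by", $record holds every record that shares the URI :)
for $record in $records/record
group by $id := $record/Syriaca_URI/string()
return
  <record>
    <headword>{
      (: One <label> per distinct headword instead of a joined string :)
      for $name in distinct-values($record/Headword)
      return <label>{$name}</label>
    }</headword>
    <Syriaca_URI>{$id}</Syriaca_URI>
    {
      (: One element per map column; the order of the columns follows
         distinct-values, which is implementation-dependent :)
      for $col in distinct-values($record/*[starts-with(name(), "Map_")] ! name())
      (: Keep only non-empty cells and drop repeated map numbers :)
      let $values := distinct-values($record/*[name() = $col][normalize-space()])
      return element {$col} { string-join($values, "|") }
    }
  </record>
```

For the sample rows this should give a single record whose headword element holds one label for Acre and one for Akko, with map 5 listed once even though both source rows carry it. Grouping on the URI string keeps the merge in one pass over the records, and the `$records` variable can be swapped for the real parsed CSV.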