Skip to content

Instantly share code, notes, and snippets.

@davidamichelson
Created April 7, 2020 18:03
Show Gist options
  • Save davidamichelson/d6f8a727313026a79c10fe390bab05cd to your computer and use it in GitHub Desktop.
Save davidamichelson/d6f8a727313026a79c10fe390bab05cd to your computer and use it in GitHub Desktop.
Data To Concatenate
Headword,Syriaca_URI,Map_1,Map_2,Map_3,Map_4,Map_5,Map_6,Map_7,Map_8,Map_9,Map_10,Map_11,Map_12,Map_13,Map_14
Abila,syriaca.org/place/4355,1,,,4,,,,,,,,,,
Abivard,syriaca.org/place/4222,,,,,,,,,9,,,,,
Abr Shahr,syriaca.org/place/4220,,,,,,,,,9,10,,,,
Acre|Akko,syriaca.org/place/14,1,2,,,5,,,,,,,,,
Headword,Syriaca_URI,Map_1,Map_2,Map_3,Map_4,Map_5,Map_6,Map_7,Map_8,Map_9,Map_10,Map_11,Map_12,Map_13,Map_14
Abila,syriaca.org/place/4355,1,,,4,,,,,,,,,,
Abivard,syriaca.org/place/4222,,,,,,,,,9,,,,,
Abr Shahr,syriaca.org/place/4220,,,,,,,,,9,10,,,,
Acre,syriaca.org/place/14,,2,,,5,,,,,,,,,
Akko,syriaca.org/place/14,1,,,,5,,,,,,,,,
@CliffordAnderson
Copy link

CliffordAnderson commented Apr 7, 2020

Hi Dave,

Here's a query expression that should do the trick of merging rows with the same identifiers:

xquery version "3.1";

(: Merges content from rows with the same identifiers :)

let $uri := "https://gist.githubusercontent.com/CliffordAnderson/592c2c43870a9ef3335d26714e51b38a/raw/2b1c467d2bafdbba13e0a1dd91abafebfd2e7722/Original.csv"
let $csv := fetch:text($uri) => csv:parse(map{"header":fn:true()})
let $ids := fn:distinct-values($csv/csv/record/Syriaca_URI)
let $rows :=  fn:distinct-values($csv/csv/record/* ! fn:node-name(.))
for $id in $ids
return 
  element record {
    for $cell in $rows
    return element {$cell} {
      let $contents := $csv/csv/record[Syriaca_URI = $id]/*[fn:name(.) = fn:string($cell)]/text()
      return
        if (count($contents) ne count(fn:distinct-values($contents)))
        then $contents[1]
        else fn:string-join($contents, "|")
      }
  }

Output:

<record>
  <Headword>Abila</Headword>
  <Syriaca_URI>syriaca.org/place/4355</Syriaca_URI>
  <Map_1>1</Map_1>
  <Map_2/>
  <Map_3/>
  <Map_4>4</Map_4>
  <Map_5/>
  <Map_6/>
  <Map_7/>
  <Map_8/>
  <Map_9/>
  <Map_10/>
  <Map_11/>
  <Map_12/>
  <Map_13/>
  <Map_14/>
</record>
<record>
  <Headword>Abivard</Headword>
  <Syriaca_URI>syriaca.org/place/4222</Syriaca_URI>
  <Map_1/>
  <Map_2/>
  <Map_3/>
  <Map_4/>
  <Map_5/>
  <Map_6/>
  <Map_7/>
  <Map_8/>
  <Map_9>9</Map_9>
  <Map_10/>
  <Map_11/>
  <Map_12/>
  <Map_13/>
  <Map_14/>
</record>
<record>
  <Headword>Abr Shahr</Headword>
  <Syriaca_URI>syriaca.org/place/4220</Syriaca_URI>
  <Map_1/>
  <Map_2/>
  <Map_3/>
  <Map_4/>
  <Map_5/>
  <Map_6/>
  <Map_7/>
  <Map_8/>
  <Map_9>9</Map_9>
  <Map_10>10</Map_10>
  <Map_11/>
  <Map_12/>
  <Map_13/>
  <Map_14/>
</record>
<record>
  <Headword>Acre|Akko</Headword>
  <Syriaca_URI>syriaca.org/place/14</Syriaca_URI>
  <Map_1>1</Map_1>
  <Map_2>2</Map_2>
  <Map_3/>
  <Map_4/>
  <Map_5>5</Map_5>
  <Map_6/>
  <Map_7/>
  <Map_8/>
  <Map_9/>
  <Map_10/>
  <Map_11/>
  <Map_12/>
  <Map_13/>
  <Map_14/>
</record>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment