In this example the same sequence, GQ247641, appears in two sequence datasets ("European Molecular Biology Laboratory Australian Mirror" and "Geographically tagged INSDC sequences"), and the voucher specimen ("AM W.35546.001" or "AMS:W.35546") also occurs in GBIF (provided by "Australian Museum provider for OZCAM"). Linking the two sequence occurrences is trivial: we just match on the accession "GQ247641". Linking the sequence to the museum specimen requires matching the slightly different strings "AM W.35546.001" and "AMS:W.35546".
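One way to match the two specimen strings is to reduce each to a normalised key and compare the keys. The Python sketch below is only an illustration of that idea. The function name `specimen_key`, the regular expression, and the alias table are hypothetical, and it assumes that "AM" and "AMS" both stand for the Australian Museum and that a trailing ".001" is a part suffix that can be ignored.

```python
import re

# Hypothetical alias table: assumes "AM" and "AMS" denote the same institution.
INSTITUTION_ALIASES = {"AM": "AM", "AMS": "AM"}

# Assumed code layout: <institution> <separator> <collection letter(s)>.<number>[.<part>]
CODE_PATTERN = re.compile(
    r"""^\s*
    (?P<inst>[A-Za-z]+)   # institution code, e.g. AM or AMS
    [\s:]+                # separator: whitespace or a colon
    (?P<coll>[A-Za-z]+)   # collection prefix, e.g. W
    \.?\s*
    (?P<num>\d+)          # the catalogue number itself
    (?:\.\d+)?            # optional part suffix such as .001 (ignored)
    \s*$""",
    re.VERBOSE,
)


def specimen_key(code):
    """Return a normalised key for a specimen code, or None if it does not parse."""
    m = CODE_PATTERN.match(code)
    if m is None:
        return None
    inst = m.group("inst").upper()
    inst = INSTITUTION_ALIASES.get(inst, inst)  # fold known aliases together
    return f"{inst}:{m.group('coll').upper()}.{int(m.group('num'))}"


if __name__ == "__main__":
    for code in ("AM W.35546.001", "AMS:W.35546", "GQ247641"):
        print(code, "->", specimen_key(code))
```

Two records whose keys are equal could then be attached to the same Sample node. Strings that do not fit the assumed layout, such as the accession "GQ247641", return None rather than a guessed key.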
The graph links three records in GBIF that all refer to the same thing:
CREATE (occurrence487122480:Occurrence { name: "gbif487122480", catalogNumber: "AM W.35546.001" }),
(sample1:Sample { name: "AM W.35546.001" }),
(occurrence487122480)-[:HASCODE]->(sample1),
(occurrence488829630:Occurrence { name: "gbif488829630", catalogNumber: "GQ247641" }),
(GQ247641:Sequence { accession: "GQ247641", catalogNumber: "AMS:W.35546" }),
(GQ247641)-[:SAMEAS]->(occurrence488829630),
(occurrence1006303667:Occurrence { name: "gbif1006303667", catalogNumber: "GQ247641" }),
(GQ247641)-[:SAMEAS]->(occurrence1006303667),
(dataset1:Dataset { name: "European Molecular Biology Laboratory Australian Mirror"})<-[:SOURCE]-(occurrence488829630),
(dataset2:Dataset { name: "Geographically tagged INSDC sequences"})<-[:SOURCE]-(occurrence1006303667),
(dataset3:Dataset { name: "Australian Museum provider for OZCAM"})<-[:SOURCE]-(occurrence487122480),
(GQ247641)-[:HASCODE]->(sample2:Sample { name: "AMS:W.35546" }),
(occurrence487122480)-[:HASCODE]->(sample2)
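This query then finds, for each occurrence that shares a sample code with a sequence, the other GBIF occurrences of that sequence:
MATCH (o1:Occurrence)-[:HASCODE]-(:Sample)-[:HASCODE]-(:Sequence)-[:SAMEAS]-(o2:Occurrence)
WITH o1, collect(o2.name) AS os
RETURN o1.name, os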