Forked from PeterHovenkamp/Taxonomic-mind-mapper
Last active
August 29, 2015 14:01
= The taxonomic mind-mapper
Nicky Nicolson, Peter Hovenkamp
:neo4j-version: 2.1.0
:author: Nicky Nicolson
:twitter: nickynicolson

== Domain

We investigate the scheme described in https://docs.google.com/document/d/1FIxNrrGrIZs0l4QJEdGfctXbYVL4cQhKspmMEHmKAGg/edit, using the taxonomy of Diplazium tomentosum (https://docs.google.com/document/d/1vni44RBwGNZ7iRCFf243NtcD-7xwjHrDeAPehato7KY/edit?usp=sharing) as a use case.

== Data model

=== Node types

* CollectingEvent - shown in red. Properties:
** source (e.g. "L0051073")
** collector (e.g. "Blume")
** collNumber (e.g. "386")
** locality (e.g. "Java")
** country (e.g. "Indonesia")
** eventDate (e.g. "1914/4")
* Specimen - shown in purple. Properties:
** id (e.g. "L0051073")
** url (e.g. "http://plants.jstor.org/specimen/l0051073")
** heldIn (e.g. "L")
* Name - shown in green. Properties:
** name (e.g. "Diplazium tomentosum Blume")
** ipniid (e.g. "17089100-1")
** url (e.g. "http://biodiversitylibrary.org/page/31163025")

=== Relationship types

The patterns are written in the direction in which the queries below create them.

* ConspecificWith
** (Specimen)-[:ConspecificWith]->(Specimen) or (CollectingEvent)-[:ConspecificWith]->(Specimen)
** assertedBy (e.g. "PH")
** asIdentifiedBy (e.g. "M.G.Price 1989")
* DerivedFrom
** (CollectingEvent)-[:DerivedFrom]->(Specimen)
** assertedBy (e.g. "PH")
* TypeOf
** (CollectingEvent)-[:TypeOf]->(Name) or (Specimen)-[:TypeOf]->(Name)
** assertedBy (e.g. "Morton C.V. in http://www.biodiversitylibrary.org/page/409837#page/305")
** typeOfType (e.g. "Lecto")

== Use case

Our case starts with one of the types from the Leiden herbarium (http://plants.jstor.org/specimen/l0051073). It is the type of Diplazium tomentosum Blume:

//setup
[source,cypher]
----
CREATE (c:CollectingEvent
{source: 'L0051073'
,collector:'Blume'
,locality:"Java"
,country:"Indonesia"}
)
,(s:Specimen
{id:"L0051073"
,url:'http://plants.jstor.org/specimen/l0051073'
,heldIn:'L'}
)
,(n:Name
{name:"Diplazium tomentosum Blume"
,ipniid: "17089100-1"
,url:"http://biodiversitylibrary.org/page/31163025"}
)
,(s)<-[:DerivedFrom]-(c)
,(c)-[:TypeOf]->(n)
RETURN s, n, c
----
//table

Now we add details of another type specimen from Berlin: http://plants.jstor.org/specimen/b%2020%200051655

This time there are no data about the collecting event, so there is no collecting event to attach the specimen to.

//setup
[source,cypher]
----
CREATE (s:Specimen
{id:'B-200051655'
,heldIn:'B'}
)
RETURN s
----
//table

Berlin, however, states that it is a type of Diplazium tomentosum Blume.
We model that as an assertion by Berlin that it is derived from the same collecting event as the Leiden specimen.

//setup
[source,cypher]
----
MATCH ({name:"Diplazium tomentosum Blume"})<-[:TypeOf]-(c)
MATCH (s:Specimen
{id:'B-200051655'}
)
CREATE (s) <-[r:DerivedFrom{assertedBy: 'Berlin'}]- (c)
RETURN s, c, r
----
//graph

The graph now shows the basic relations between the three different types of nodes.
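
As a quick check on the model so far, a query along the following lines could list the specimens attached to the type collecting event, together with whoever asserted each link. This is a sketch rather than part of the original walkthrough, and the column aliases are illustrative. The Leiden link was created without an assertedBy property, so that column would be empty for it.

[source,cypher]
----
// List every specimen derived from the collecting event that typifies the name,
// with the source of the DerivedFrom assertion (null when none was recorded)
MATCH (n:Name {name: "Diplazium tomentosum Blume"})<-[:TypeOf]-(c:CollectingEvent)-[r:DerivedFrom]->(s:Specimen)
RETURN s.id AS specimen, s.heldIn AS herbarium, r.assertedBy AS assertedBy
----
//table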

We next add several other specimens from different herbaria, all of which have been identified as Diplazium tomentosum,
starting with this one (http://plants.jstor.org/specimen/gh00022916) from the GH herbarium.
We have to introduce a new name to accommodate the type status of the specimen. We model the identification of the specimen as a Conspecificity link (ConspecificWith) from the type collecting event, carrying an asIdentifiedBy property. Once we have examined the specimen we can confirm this identification,
which we express by adding a Conspecificity link to one of the specimens derived from that collecting event.

//setup
[source,cypher]
----
MATCH ({name:"Diplazium tomentosum Blume"})<-[:TypeOf]-(c2)-[:DerivedFrom]->(s3:Specimen{heldIn: 'L'})
CREATE (c1:CollectingEvent
{collNumber:'386'
,collector:'H. Cuming'}
)
,(s1:Specimen{id:'GH00022916'
,heldIn:'GH'
,url:'http://plants.jstor.org/specimen/gh00022916'}
)
, (n1:Name
{name:'Asplenium deflexum Mett.'
,ipniid:'17042170-1'}
)
, (s1)<-[:ConspecificWith
{asIdentifiedBy:'M.G.Price 1989'}]-(c2)
, (s1)<-[:DerivedFrom]- (c1)
, (s1) <-[:ConspecificWith
{assertedBy: 'PH'}] -(s3)
, (c1)-[:TypeOf]->(n1)
RETURN s1, s3, c2, n1, c1
----
//graph
//table

We find that a duplicate of this specimen is at the Michigan herbarium (http://plants.jstor.org/specimen/mich1190057)
and this can now easily be added:

//setup
[source,cypher]
----
MATCH (c:CollectingEvent
{collector:'H. Cuming'
,collNumber: '386'}
)
CREATE (s:Specimen
{id:'MICH1190057'
,url:'http://plants.jstor.org/specimen/mich1190057'
,heldIn:'MICH'})
, (s)<-[:DerivedFrom]-(c)
RETURN s, c
----
//graph
//table
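
Because duplicates share a collecting event, they can be recovered by grouping specimens per event. The query below is a minimal sketch of that idea under the model as built so far; the aliases are illustrative, and events created without a collNumber would show an empty value in that column.

[source,cypher]
----
// Group specimens by the collecting event they derive from;
// events with more than one specimen in the list have duplicates
MATCH (c:CollectingEvent)-[:DerivedFrom]->(s:Specimen)
RETURN c.collector AS collector, c.collNumber AS collNumber, collect(s.id) AS specimens
ORDER BY collector
----
//table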

Next stop is Brussels, where another specimen is held (http://plants.jstor.org/specimen/br0000006990008).
The annotations that come with this specimen are quite extensive, and introduce a number of new nodes.

//setup
[source,cypher]
----
MATCH ({name:"Diplazium tomentosum Blume"})<-[:TypeOf]-(c2)
CREATE (c:CollectingEvent
{collector: 'Roxburgh W.'
,collNumber:'S.N.'}
)
,(s:Specimen{id:'BR0000006990008'
,url:'http://plants.jstor.org/specimen/br0000006990008'
,heldIn:'BR'}
)
, (n:Name{ipniid: '17044840-1',name:'Asplenium hemionitoides Roxb.'})
, (s)<-[:DerivedFrom]- (c)
, (s)<-[conspec:ConspecificWith
{assertedBy:'Morton C.V. 1970/7/1'}]-(c2)
, (s) <-[:ConspecificWith
{assertedBy: 'PH'}] -(c2)
, (s)-[type:TypeOf
{assertedBy: 'Morton C.V. in http://www.biodiversitylibrary.org/page/409837#page/305'
, typeOfType: "Lecto"}]->(n)
RETURN n,s, c, c2, conspec, type
----
//graph
//table

And to accommodate the annotations by Morton, again a new specimen has to be introduced, from the Geneva herbarium.

//setup
[source,cypher]
----
MATCH (c:CollectingEvent
{collector: 'Roxburgh W.'
,collNumber:'S.N.'}
)
CREATE (s:Specimen
{collector: 'Christopher Smith'
,locality: 'Amboina'
,heldIn: 'G'}
)
,(s)<-[:DerivedFrom]-(c)
RETURN s,c
----
//graph
//table

Next stop is Sweden, where an interesting specimen from Taiwan is held under the same name, but marked as the type of another name:

//setup
[source,cypher]
----
CREATE (c:CollectingEvent
{collector:'Faurie, U.J.'
,collNumber:'168'
,locality:'Formosa. Urai.'
,country:'Taiwan'
,eventDate:'1914/4'}
)
,(s:Specimen{id:'SP10962'
,url:'http://plants.jstor.org/specimen/s-p-10962'
,heldIn:'SP'}
)
, (n:Name
{name:'Diplazium crenato-serratum (Blume) T.Moore var. hirta Rosenst.'}
)
, (s)<-[:DerivedFrom]-(c)
, (c)-[:TypeOf]->(n)
RETURN n, c, s
----
//graph
//table

To model the identification of this specimen, the specimen is linked to the type material of D. tomentosum:

//setup
[source,cypher]
----
MATCH ({collector:'Faurie, U.J.'
,collNumber:'168'})-->(s)
MATCH ({name:"Diplazium tomentosum Blume"})<-[:TypeOf]-(c)
CREATE (s)<-[:ConspecificWith
{asIdentifiedBy: 'S'}] -(c)
RETURN s.heldIn
----
//graph
//table

And finally a specimen from Sumatra, also in Sweden, where it is labeled as type of D. burchardii. The only type specimen we know of for this species is in L, at least according to Morton's annotations in Geneva; he has seen and photographed it, with the number "Rosenst. fil. sumatranae exs. 22".
But we know that Rosenstock frequently renumbered specimens in the series he distributed, so we can express this as an assertion that the Swedish specimen indeed derives from the same collection as the Leiden one.

//setup
[source,cypher]
----
CREATE (s1:Specimen{id:'SP4702'
,url:'http://plants.jstor.org/specimen/s-p-4702'
,heldIn:'SP'}
)
,(s2:Specimen{
heldIn:'L'
,collNumber:'Rosenst. fil. sumatranae exs. 22'}
)
,(c:CollectingEvent{
collector:'Burchard, O.'
,collNumber:'121'
,locality:'Sumatera: Indragiri, inter Tjinaco et Pukan Herun.'
,eventDate:'1907'}
)
, (n:Name{
name:'Diplazium burchardii Rosenst.'
,ipniid:'17246030-1'}
)
, (s1)<-[:DerivedFrom
{assertedBy: 'S'}]-(c)
, (s2)<-[:DerivedFrom
{assertedBy: 'PH'}]-(c)
, (c)-[:TypeOf]->(n)
RETURN s1, s2, n, c
----
//graph
//table

Knowing this, we could ask someone in L to examine the specimen and compare it with
any of the other specimens in L that we have already connected to D. tomentosum. A Conspecificity link with any of the specimens in the network would include the name D. burchardii in the list of available names.

//setup
[source,cypher]
----
MATCH (n1:Name{name:"Diplazium burchardii Rosenst."})<--()-->(s2{heldIn: "L"})
MATCH (n2:Name{name:"Diplazium tomentosum Blume"})<--()-->(s1{heldIn: "L"})
CREATE(s1)<-[r:ConspecificWith
{assertedBy: "someone in L"}]-(s2)
RETURN s1, s2, r
----
//table
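
The list of available names mentioned above could be retrieved with a query like the following sketch. It assumes that a name counts as available when it is the target of a TypeOf link from any specimen or collecting event reachable from the Leiden type through ConspecificWith or DerivedFrom links, regardless of who asserted them. The 10-step bound mirrors the one used in the other queries of this gist, and the alias is illustrative. Once the link above exists, Diplazium burchardii Rosenst. should appear in the result.

[source,cypher]
----
// Walk ConspecificWith and DerivedFrom links in either direction from the
// Leiden type, then collect every name typified by a node reached that way.
// *0..10 also covers names attached directly to the starting specimen.
MATCH (typeL:Specimen {id: "L0051073"})-[:ConspecificWith|DerivedFrom*0..10]-(x)-[:TypeOf]->(n:Name)
RETURN DISTINCT n.name AS availableName
----
//table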

We may now list all the specimens that are connected by Conspecificity links to the original Blume type specimen.

[source,cypher]
----
MATCH (typeL:Specimen{id: "L0051073"})-[conspec:ConspecificWith*1..10]-()-[:DerivedFrom*1..10]-(s:Specimen)
WHERE s <> typeL
RETURN DISTINCT s, conspec
----
//table

Instead of listing a path, we could just output the names of the people who asserted the Conspecificity links connected to the original Blume type specimen.

[source,cypher]
----
MATCH (typeL:Specimen{id: "L0051073"})-[conspec:ConspecificWith*1..10]-()-[:DerivedFrom*1..10]-(s:Specimen)
WHERE s <> typeL
RETURN DISTINCT s, reduce(x=[], r in conspec | x + (CASE WHEN (r.assertedBy IN x OR r.assertedBy IS NULL) THEN [] ELSE [r.assertedBy] END)) as namesOfAsserters
----
//table

By applying stricter criteria to the paths traversed to arrive at these results, we are now in principle able to recover different taxon concepts and specify these as specimen lists with associated names.

[source,cypher]
----
MATCH (typeL:Specimen{id: "L0051073"})-[conspec:ConspecificWith*1..10{assertedBy:"PH"}]-()-[:DerivedFrom*1..10]-(s:Specimen)
WHERE s <> typeL
RETURN DISTINCT s, conspec
----
//table

[source,cypher]
----
MATCH (typeL:Specimen{id: "L0051073"})-[conspec:ConspecificWith*1..10]-()-[deriv:DerivedFrom*1..10]-(s:Specimen)
WHERE s <> typeL
AND all(x in conspec WHERE x.assertedBy IN ['PH', 'Morton C.V. 1970/7/1', 'Morton C.V. in http://www.biodiversitylibrary.org/page/409837#page/305'])
AND all(x in conspec WHERE has(x.assertedBy))
AND all(x in deriv WHERE x.assertedBy IN ['PH', 'Morton C.V. 1970/7/1', 'Morton C.V. in http://www.biodiversitylibrary.org/page/409837#page/305'])
AND all(x in deriv WHERE has(x.assertedBy))
RETURN typeL, conspec, deriv, s
----
//table
//graph
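
To decide which asserters to accept in filters such as the ones above, it may help to see who contributed links in the first place. The following sketch counts ConspecificWith links per source, falling back to asIdentifiedBy where assertedBy is absent; the aliases are illustrative.

[source,cypher]
----
// Count Conspecificity links per asserting source; identifications recorded
// only as asIdentifiedBy are counted under that value instead
MATCH ()-[r:ConspecificWith]->()
RETURN coalesce(r.assertedBy, r.asIdentifiedBy) AS source, count(r) AS links
ORDER BY links DESC
----
//table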