Last active
October 21, 2024 18:40
Example of ordered vs. unordered in JSON-LD
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Dataset creator names in order\n", | |
"\n", | |
"Ensuring that the order of creator names in a `schema.org/Dataset` is preserved. See `science-on-schema.org` [issue #135](https://github.com/ESIPFed/science-on-schema.org/issues/135)." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Kirk N. Sato\n", | |
"Alexander Gagnon\n", | |
"Billie Swalla\n", | |
"Emily Carrington\n", | |
"Kenneth Sebens\n", | |
"Kristy Kull\n", | |
"Evelyn J. Lessard\n", | |
"Jan Newton\n", | |
"J. Dylan Crosby\n", | |
"---\n", | |
"Evelyn J. Lessard\n", | |
"Jan Newton\n", | |
"Billie Swalla\n", | |
"Alexander Gagnon\n", | |
"Kirk N. Sato\n", | |
"Emily Carrington\n", | |
"Kristy Kull\n", | |
"Kenneth Sebens\n", | |
"J. Dylan Crosby\n", | |
"---\n", | |
"Kenneth Sebens\n", | |
"Kristy Kull\n", | |
"Alexander Gagnon\n", | |
"Billie Swalla\n", | |
"Evelyn J. Lessard\n", | |
"Emily Carrington\n", | |
"J. Dylan Crosby\n", | |
"Kirk N. Sato\n", | |
"Jan Newton\n", | |
"---\n" | |
] | |
} | |
], | |
"source": [ | |
"import requests\n", | |
"import rdflib\n", | |
"import json\n", | |
"\n", | |
"def loadGraph(url):\n", | |
"    # Fetch a JSON-LD document and parse it into an rdflib Dataset\n", | |
"    response = requests.get(url)\n", | |
"    response.raise_for_status()\n", | |
"    g = rdflib.Dataset()\n", | |
"    return g.parse(data=response.text, format=\"json-ld\")\n", | |
"\n", | |
"base = \"https://raw.githubusercontent.com/DataONEorg/SlenderNodes/schema-org-indexing/schema_org_indexing/examples/\"\n", | |
"\n", | |
"# Plain JSON-LD arrays become unordered sets of triples, so no order is implied here\n", | |
"q_unordered = '''\n", | |
"PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n", | |
"PREFIX SO: <https://schema.org/>\n", | |
"SELECT ?name\n", | |
"WHERE {\n", | |
"  ?ds rdf:type SO:Dataset .\n", | |
"  ?ds SO:creator ?creatorlist .\n", | |
"  ?creatorlist SO:name ?name .\n", | |
"}\n", | |
"'''\n", | |
"\n", | |
"# Load and query 3 times to see if the order changes\n", | |
"for i in range(3):\n", | |
"    eg_unordered = loadGraph(base + \"eg_bcodmo_01.jsonld\")\n", | |
"    res = eg_unordered.query(q_unordered)\n", | |
"    for r in res:\n", | |
"        print(r[0].value)\n", | |
"    print(\"---\")" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Load a JSON-LD `SO:Dataset` that has been modified to use the `@list` keyword on `creator`. The SPARQL query uses RDF list semantics (`rdf:first`/`rdf:rest`) to compute each creator's position in the list. Based on this [explanation by Joshua Taylor](https://stackoverflow.com/questions/17523804/is-it-possible-to-get-the-position-of-an-element-in-an-rdf-collection-in-sparql/17530689#17530689)." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"0, Kenneth Sebens\n", | |
"1, Emily Carrington\n", | |
"2, Alexander Gagnon\n", | |
"3, Evelyn J. Lessard\n", | |
"4, Jan Newton\n", | |
"5, Billie Swalla\n", | |
"6, J. Dylan Crosby\n", | |
"7, Kristy Kull\n", | |
"8, Kirk N. Sato\n", | |
"---\n", | |
"0, Kenneth Sebens\n", | |
"1, Emily Carrington\n", | |
"2, Alexander Gagnon\n", | |
"3, Evelyn J. Lessard\n", | |
"4, Jan Newton\n", | |
"5, Billie Swalla\n", | |
"6, J. Dylan Crosby\n", | |
"7, Kristy Kull\n", | |
"8, Kirk N. Sato\n", | |
"---\n", | |
"0, Kenneth Sebens\n", | |
"1, Emily Carrington\n", | |
"2, Alexander Gagnon\n", | |
"3, Evelyn J. Lessard\n", | |
"4, Jan Newton\n", | |
"5, Billie Swalla\n", | |
"6, J. Dylan Crosby\n", | |
"7, Kristy Kull\n", | |
"8, Kirk N. Sato\n", | |
"---\n" | |
] | |
} | |
], | |
"source": [ | |
"q_ordered = '''\n", | |
"PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n", | |
"PREFIX SO: <https://schema.org/>\n", | |
"SELECT (count(?mid)-1 as ?position) ?name\n", | |
"WHERE {\n", | |
"  ?ds rdf:type SO:Dataset .\n", | |
"  ?ds SO:creator ?creatorlist .\n", | |
"  # ?mid ranges over every list node from the head up to ?node,\n", | |
"  # so count(?mid)-1 is the zero-based position of ?node\n", | |
"  ?creatorlist rdf:rest* ?mid .\n", | |
"  ?mid rdf:rest* ?node .\n", | |
"  ?node rdf:first ?creator .\n", | |
"  ?creator SO:name ?name .\n", | |
"}\n", | |
"GROUP BY ?node ?creator ?name\n", | |
"ORDER BY ?position\n", | |
"'''\n", | |
"for i in range(3):\n", | |
"    eg_ordered = loadGraph(base + \"eg_bcodmo_01-hacked.jsonld\")\n", | |
"    res = eg_ordered.query(q_ordered)\n", | |
"    for r in res:\n", | |
"        print(f\"{r[0].value}, {r[1].value}\")\n", | |
"    print(\"---\")" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Hack the JSON-LD after loading by adding\n", | |
"```\n", | |
"  \"creator\": {\n", | |
"    \"@container\":\"@list\"\n", | |
"  }\n", | |
"```\n", | |
"to the `@context`, so that the unordered `creator` array is interpreted as an ordered list." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"0, Kenneth Sebens\n", | |
"1, Emily Carrington\n", | |
"2, Alexander Gagnon\n", | |
"3, Evelyn J. Lessard\n", | |
"4, Jan Newton\n", | |
"5, Billie Swalla\n", | |
"6, J. Dylan Crosby\n", | |
"7, Kristy Kull\n", | |
"8, Kirk N. Sato\n", | |
"---\n", | |
"0, Kenneth Sebens\n", | |
"1, Emily Carrington\n", | |
"2, Alexander Gagnon\n", | |
"3, Evelyn J. Lessard\n", | |
"4, Jan Newton\n", | |
"5, Billie Swalla\n", | |
"6, J. Dylan Crosby\n", | |
"7, Kristy Kull\n", | |
"8, Kirk N. Sato\n", | |
"---\n", | |
"0, Kenneth Sebens\n", | |
"1, Emily Carrington\n", | |
"2, Alexander Gagnon\n", | |
"3, Evelyn J. Lessard\n", | |
"4, Jan Newton\n", | |
"5, Billie Swalla\n", | |
"6, J. Dylan Crosby\n", | |
"7, Kristy Kull\n", | |
"8, Kirk N. Sato\n", | |
"---\n" | |
] | |
} | |
], | |
"source": [ | |
"def loadGraphOrdered(url):\n", | |
"    # Fetch the JSON-LD and mark `creator` as an ordered list before parsing\n", | |
"    response = requests.get(url)\n", | |
"    response.raise_for_status()\n", | |
"    data = json.loads(response.text)\n", | |
"    # Assumes the document's @context is a JSON object (dict)\n", | |
"    data[\"@context\"][\"creator\"] = {\"@container\": \"@list\"}\n", | |
"    g = rdflib.Dataset()\n", | |
"    return g.parse(data=json.dumps(data), format=\"json-ld\")\n", | |
"\n", | |
"for i in range(3):\n", | |
"    g = loadGraphOrdered(base + \"eg_bcodmo_01.jsonld\")\n", | |
"    res = g.query(q_ordered)\n", | |
"    for r in res:\n", | |
"        print(f\"{r[0].value}, {r[1].value}\")\n", | |
"    print(\"---\")" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.8.5" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 4 | |
} |
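The last code cell in the notebook patches `@context` in place and so only works when `@context` is a JSON object. JSON-LD also allows `@context` to be a string (a remote context IRI) or an array of contexts. The following is a minimal sketch of a more defensive version of that patch. The function name `with_ordered_creators` and the constant `CREATOR_AS_LIST` are hypothetical and not part of the notebook, and the sketch assumes the existing context supplies an `@vocab` (or equivalent) so that the bare `creator` term still resolves to an IRI.

```python
import copy

# Term definition from the notebook: treat `creator` values as an ordered list.
CREATOR_AS_LIST = {"creator": {"@container": "@list"}}


def with_ordered_creators(doc):
    """Return a copy of a JSON-LD document with `creator` declared as an @list.

    Hypothetical helper: handles @context given as an object, an array,
    or a string. The input document is left unmodified.
    """
    patched = copy.deepcopy(doc)
    context = patched.get("@context")
    if context is None:
        # Without a context there is no vocabulary to resolve `creator` against.
        raise ValueError("document has no @context to extend")
    if isinstance(context, dict):
        # Same in-place patch the notebook applies.
        context["creator"] = {"@container": "@list"}
    elif isinstance(context, list):
        # Later entries override earlier ones, so append the definition.
        context.append(copy.deepcopy(CREATOR_AS_LIST))
    else:
        # A string (remote context IRI): keep it and layer the definition after it.
        patched["@context"] = [context, copy.deepcopy(CREATOR_AS_LIST)]
    return patched
```

The result can be serialized with `json.dumps` and passed to `rdflib` in the same way `loadGraphOrdered` does. Returning a copy rather than mutating the input keeps the original document available for comparison against the unordered case.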