{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# A JSON-LD 1.1 Context For `nbformat.v4`\n",
"For a long time, a fairly [basic JSON-LD context](https://gist.github.com/bollwyvl/e7f8136bd2ea5674dd00) could _almost_ represent the structure of the Jupyter [Notebook format](https://github.com/jupyter/nbformat). With the addition of [nested properties](https://json-ld.org/spec/ED/json-ld/20180215/#nested-properties) in JSON-LD 1.1, the crucial contents of `metadata` can be hoisted so that they describe the object that contains them (the root of the notebook document, or one of its `cells` or `outputs`) and then put back in `metadata` when reserializing.\n",
"\n",
"> ## This means the notebook can now be considered a _Linked Data native_ format\n",
"\n",
"Without further ado, here's a new context that uses `@nest` to put most of the known metadata back in its rightful place."
]
},
{
"cell_type": "code",
"execution_count": 116,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Overwriting nbformat.v4.jsonld\n"
]
}
],
"source": [
"%%file nbformat.v4.jsonld\n",
"{\n",
" \"@context\": {\n",
" \"@version\": 1.1,\n",
" \"@vocab\": \"http://ipython.org/nbformat/v4/\",\n",
"\n",
" \"nb4\": \"http://ipython.org/nbformat/v4/\",\n",
" \"xsd\": \"http://www.w3.org/2001/XMLSchema#\",\n",
" \"foaf\": \"http://xmlns.com/foaf/0.1/\",\n",
"\n",
" \"metadata\": \"@nest\",\n",
"\n",
" \"language\": {\"@type\": \"@id\"},\n",
" \"codemirror_mode\": {\"@type\": \"@id\"},\n",
"\n",
" \"cell_type\": {\"@id\": \"@type\"},\n",
" \"output_type\": {\"@id\": \"@type\"},\n",
"\n",
" \"cells\": {\"@container\": \"@list\"},\n",
" \"source\": {\"@container\": \"@list\"},\n",
" \"outputs\": {\"@container\": \"@list\"},\n",
" \"text\": {\"@container\": \"@list\"},\n",
" \"traceback\": {\"@container\": \"@list\"},\n",
"\n",
" \"kernelspec\": {\"@nest\": \"metadata\"},\n",
" \"language_info\": {\"@nest\": \"metadata\", \"@nest\": \"metadata\"},\n",
" \"title\": {\"@nest\": \"metadata\"},\n",
" \"authors\": {\"@nest\": \"metadata\"},\n",
" \"orig_nbformat\": {\"@nest\": \"metadata\"},\n",
" \"jupyter\": {\"@nest\": \"metadata\"},\n",
" \"collapsed\": {\"@nest\": \"metadata\"},\n",
" \"scrolled\": {\"@nest\": \"metadata\"},\n",
" \"name\": {\"@nest\": \"metadata\"},\n",
" \"tags\": {\"@container\": \"@set\", \"@nest\": \"metadata\"},\n",
"\n",
" \"text/html\": {\"@container\": \"@list\"},\n",
" \"image/png\": {\"@container\": \"@list\"},\n",
" \"text/plain\": {\"@container\": \"@list\"},\n",
" \"application/javascript\": {\"@container\": \"@list\"},\n",
"\n",
" \"collapsed\": {\"@type\": \"xsd:boolean\"},\n",
" \"execution_count\": {\"@type\": \"xsd:int\"},\n",
" \"nbformat_minor\": {\"@type\": \"xsd:int\", \"@nest\": \"metadata\"},\n",
" \"nbformat\": {\"@type\": \"xsd:int\", \"@nest\": \"metadata\"},\n",
" \"signature\": {\"@type\": \"foaf:sha1\"}\n",
" }\n",
"}\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"At present, [pyld](https://github.com/digitalbazaar/pyld) is the primary implementation of JSON-LD in Python, and the only one that supports JSON-LD 1.1."
]
},
{
"cell_type": "code",
"execution_count": 109,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"from pathlib import Path\n",
"from pyld import jsonld\n",
"from nbformat import v4, NotebookNode\n",
"import IPython"
]
},
{
"cell_type": "code",
"execution_count": 112,
"metadata": {},
"outputs": [],
"source": [
"context = json.loads(Path(\"nbformat.v4.jsonld\").read_text())"
]
},
{
"cell_type": "code",
"execution_count": 130,
"metadata": {},
"outputs": [],
"source": [
"def roundtrip(nb: dict) -> NotebookNode:\n",
"    \"\"\"\n",
"    Expand a notebook document to a self-describing JSON-LD document,\n",
"    then re-compact it and reparse the result as an nbformat v4 notebook.\n",
"    \"\"\"\n",
"    rt = jsonld.compact(\n",
"        jsonld.expand(nb, dict(expandContext=context)),\n",
"        context\n",
"    )\n",
"    # compaction leaves the @context on the result; drop it before reparsing\n",
"    del rt[\"@context\"]\n",
"    # JSON-LD will remove empty cell `metadata`, so restore it where missing\n",
"    for cell in rt[\"cells\"]:\n",
"        cell.setdefault(\"metadata\", {})\n",
"    return v4.reads(json.dumps(rt))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook is as good as any!"
]
},
{
"cell_type": "code",
"execution_count": 134,
"metadata": {},
"outputs": [
{
"data": {
"application/json": {
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": "# A JSON-LD 1.1 Context For `nbformat.v4`\nFor a long time, a fairly [basic JSON-LD context](https://gist.github.com/bollwyvl/e7f8136bd2ea5674dd00) could _almost_ represent the structure of the Jupyter [Notebook format](https://github.com/jupyter/nbformat). With the addition of [nested properties](https://json-ld.org/spec/ED/json-ld/20180215/#nested-properties) in JSON-LD 1.1, the crucial contents of `metadata` can be hoisted to refer to the object at the root of the notebook document, or subsequently it's `cell`s or `outputs`, and put it back in `metadata` when reserializing.\n\n> ## This means the notebook can now be considered a _Linked Data native_ format\n\nWithout further ado, here's a new context that uses `@nest` to put (most) of the known metadata back in its righful place."
},
{
"cell_type": "code",
"execution_count": 116,
"metadata": {},
"outputs": [
{
"cell_type": "stream",
"metadata": {
"name": "stdout"
},
"text": [
"Overwriting nbformat.v4.jsonld\n"
]
}
],
"source": "%%file nbformat.v4.jsonld\n{\n \"@context\": {\n \"@version\": 1.1,\n \"@vocab\": \"http://ipython.org/nbformat/v4/\",\n\n \"nb4\": \"http://ipython.org/nbformat/v4/\",\n \"xsd\": \"http://www.w3.org/2001/XMLSchema#\",\n \"foaf\": \"http://xmlns.com/foaf/0.1/\",\n\n \"metadata\": \"@nest\",\n\n \"language\": {\"@type\": \"@id\"},\n \"codemirror_mode\": {\"@type\": \"@id\"},\n\n \"cell_type\": {\"@id\": \"@type\"},\n \"output_type\": {\"@id\": \"@type\"},\n\n \"cells\": {\"@container\": \"@list\"},\n \"source\": {\"@container\": \"@list\"},\n \"outputs\": {\"@container\": \"@list\"},\n \"text\": {\"@container\": \"@list\"},\n \"traceback\": {\"@container\": \"@list\"},\n\n \"kernelspec\": {\"@nest\": \"metadata\"},\n \"language_info\": {\"@nest\": \"metadata\", \"@nest\": \"metadata\"},\n \"title\": {\"@nest\": \"metadata\"},\n \"authors\": {\"@nest\": \"metadata\"},\n \"orig_nbformat\": {\"@nest\": \"metadata\"},\n \"jupyter\": {\"@nest\": \"metadata\"},\n \"collapsed\": {\"@nest\": \"metadata\"},\n \"scrolled\": {\"@nest\": \"metadata\"},\n \"name\": {\"@nest\": \"metadata\"},\n \"tags\": {\"@container\": \"@set\", \"@nest\": \"metadata\"},\n\n \"text/html\": {\"@container\": \"@list\"},\n \"image/png\": {\"@container\": \"@list\"},\n \"text/plain\": {\"@container\": \"@list\"},\n \"application/javascript\": {\"@container\": \"@list\"},\n\n \"collapsed\": {\"@type\": \"xsd:boolean\"},\n \"execution_count\": {\"@type\": \"xsd:int\"},\n \"nbformat_minor\": {\"@type\": \"xsd:int\", \"@nest\": \"metadata\"},\n \"nbformat\": {\"@type\": \"xsd:int\", \"@nest\": \"metadata\"},\n \"signature\": {\"@type\": \"foaf:sha1\"}\n }\n}\n"
},
{
"cell_type": "markdown",
"metadata": {},
"source": "At present, [pyld](https://github.com/digitalbazaar/pyld) is the primary implementation of JSON-LD in python, and the only one that suports JSON-LD 1.1."
},
{
"cell_type": "code",
"execution_count": 109,
"metadata": {},
"outputs": [],
"source": "import json\nfrom pathlib import Path\nfrom pyld import jsonld\nfrom nbformat import v4, NotebookNode\nimport IPython"
},
{
"cell_type": "code",
"execution_count": 112,
"metadata": {},
"outputs": [],
"source": "context = json.loads(Path(\"nbformat.v4.jsonld\").read_text())"
},
{
"cell_type": "code",
"execution_count": 130,
"metadata": {},
"outputs": [],
"source": "def roundtrip(nb: dict) -> NotebookNode:\n \"\"\"\n expand a notebook document to a self-describing JSON-LD document,\n then re-compact and reparse, validating against the nbformat schema\n \"\"\"\n rt = jsonld.compact(\n jsonld.expand(nb, dict(expandContext=context)), \n context\n )\n # the @context stays around\n del rt[\"@context\"]\n # JSON-LD will remove empty cell `metadata`\n [cell.update(metadata={}) for cell in rt[\"cells\"] if not \"metadata\" in cell]\n return v4.reads(json.dumps(rt))"
},
{
"cell_type": "markdown",
"metadata": {},
"source": "This notebook is as good as any!"
},
{
"cell_type": "code",
"execution_count": 132,
"metadata": {},
"outputs": [],
"source": "nb = roundtrip(reads(Path(\"A New Context for Notebooks.ipynb\").read_text()))\nIPython.display.JSON(nb)"
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"metadata": {
"name": "python3"
}
},
"language_info": {
"codemirror_mode": {
"metadata": {
"name": "ipython"
},
"version": 3
},
"file_extension": ".py",
"metadata": {
"name": "python"
},
"mimetype": "text/x-python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.1"
},
"nbformat": 4,
"nbformat_minor": 2
}
},
"text/plain": [
"<IPython.core.display.JSON object>"
]
},
"execution_count": 134,
"metadata": {
"application/json": {
"expanded": false,
"root": "root"
}
},
"output_type": "execute_result"
}
],
"source": [
"nb = roundtrip(v4.reads(Path(\"A New Context for Notebooks.ipynb\").read_text()))\n",
"IPython.display.JSON(nb)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}