@tonyfast
Created September 24, 2022 05:29
{
"cells": [
{
"cell_type": "code",
"execution_count": 43,
"id": "a06d12c8-f672-4458-86db-70ea4e1d9bf0",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"    shell.displays_manager.template_cls = pidgy.weave.IPythonMarkdown"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"    shell.displays_manager.template_cls = pidgy.weave.IPythonMarkdown"
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "b5d8b8a1-a959-4457-8d9b-8cbfad0c3838",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"the default `nbconvert` html templates combine html, javascript, and css to generate a static representation that is congruent with the interactive views. their primary concern is style consistency, which may conflict with the intent of an accessible interface.\n",
"\n",
"we are searching for a standard specification that improves the accessibility of exported notebooks.\n",
"\n",
"\n",
"## the default views\n",
"\n",
"* __[lab]__ the default `nbconvert` html template is the lab stylesheet.\n",
"* __[classic]__ there is another template for the classic notebook theme\n",
"\n",
"### what is the same across these templates\n",
"\n",
"* the `base.html.j2` provides scope for the input/output code and markdown cells\n",
"* the `index.html.j2` has a buncha css and javascript.\n",
"\n",
"## two approaches\n",
"\n",
"### templating approach\n",
"\n",
"### remediation approach\n",
"\n",
"## what does success look like\n",
"\n",
"* our signal in the accessibility snapshot improves\n",
"* modifications to existing themes to improve the AOM signal\n",
"* a minimal html standard for accessible documents\n",
"* the increase in signal from readability.js\n",
"\n",
"https://stackoverflow.com/questions/43289280/is-it-more-correct-to-use-main-or-article-for-the-main-content-of-a-page-whi\n",
"\n",
"[lab]: https://github.com/jupyter/nbconvert/tree/main/share/templates/lab \"the nbconvert lab theme\"\n",
"[classic]: https://github.com/jupyter/nbconvert/tree/main/share/templates/classic \"the nbconvert classic theme\"\n",
"."
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"the default `nbconvert` html templates combine html, javascript, and css to generate a static representation that is congruent with the interactive views. their primary concern is style consistency, which may conflict with the intent of an accessible interface.\n",
"\n",
"we are searching for a standard specification that improves the accessibility of exported notebooks.\n",
"\n",
"\n",
"## the default views\n",
"\n",
"* __[lab]__ the default `nbconvert` html template is the lab stylesheet.\n",
"* __[classic]__ there is another template for the classic notebook theme\n",
"\n",
"### what is the same across these templates\n",
"\n",
"* the `base.html.j2` provides scope for the input/output code and markdown cells\n",
"* the `index.html.j2` has a buncha css and javascript.\n",
"\n",
"## two approaches\n",
"\n",
"### templating approach\n",
"\n",
"### remediation approach\n",
"\n",
"## what does success look like\n",
"\n",
"* our signal in the accessibility snapshot improves\n",
"* modifications to existing themes to improve the AOM signal\n",
"* a minimal html standard for accessible documents\n",
"* the increase in signal from readability.js\n",
"\n",
"https://stackoverflow.com/questions/43289280/is-it-more-correct-to-use-main-or-article-for-the-main-content-of-a-page-whi\n",
"\n",
"[lab]: https://github.com/jupyter/nbconvert/tree/main/share/templates/lab \"the nbconvert lab theme\"\n",
"[classic]: https://github.com/jupyter/nbconvert/tree/main/share/templates/classic \"the nbconvert classic theme\"\n",
"."
]
},
{
"cell_type": "markdown",
"id": "95bcc66b-9dad-4d43-b1d8-9bbd4adb5171",
"metadata": {},
"source": [
"remediating existing templates with beautiful soup\n",
"\n",
"find isomorphic selectors for the classic and lab themes then fix them"
]
},
{
"cell_type": "code",
"execution_count": 45,
"id": "950803f2-820b-4ed3-86cb-6a32284afd12",
"metadata": {},
"outputs": [],
"source": [
"    \n",
"we have to use the async api in IPython because the event loop is running\n",
"    \n",
"\n",
"    import asyncio, enum, pathlib, shlex\n",
"\n",
"    import nbconvert\n",
"    from bs4 import BeautifulSoup\n",
"    from nbformat import v4"
]
},
{
"cell_type": "code",
"execution_count": 46,
"id": "6dcb1afe-3a52-4915-9752-70536f4ef611",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"    TEMPLATES = enum.Enum(\"TEMPLATES\", \"lab classic\")"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"    TEMPLATES = enum.Enum(\"TEMPLATES\", \"lab classic\")"
]
},
{
"cell_type": "code",
"execution_count": 47,
"id": "48a496ef-80ab-4ee3-a8b2-57e1b9c269d6",
"metadata": {},
"outputs": [],
"source": [
"    \n",
"    def get_export(file, template=TEMPLATES.classic):\n",
"        # export the notebook with the chosen nbconvert template; returns (html, resources)\n",
"        e = nbconvert.exporters.html.HTMLExporter(template_name=template.name)\n",
"        nb = v4.reads(pathlib.Path(file).read_text())\n",
"        return nbconvert.export(e, nb)\n",
"    \n",
"    def get_notebook_html(file, template=TEMPLATES.classic):\n",
"        # unpack the (html, resources) pair directly; indexing [0] first would leave only the html string\n",
"        html, meta = get_export(file, template)\n",
"        return html\n",
"\n",
"    def get_notebook_soup(file, template=TEMPLATES.classic):\n",
"        from bs4 import BeautifulSoup\n",
"        return BeautifulSoup(get_notebook_html(file, template))\n",
"    "
]
},
{
"cell_type": "code",
"execution_count": 49,
"id": "dd167124-e251-4edb-9ace-300012fde0c5",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"selectors\n",
"\n",
"    MAIN = \"#notebook\"\n",
"    CELL = \".cell, .jp-Cell\"\n",
"    CODE = \".code_cell, .jp-CodeCell\"\n",
"    MD = \".text_cell, .jp-MarkdownCell\"\n",
"    OUT = \".output, .jp-OutputArea.jp-Cell-outputArea\"\n",
"    IN = \".code_cell .input .input_area\"\n",
"    PROMPT = \".input_prompt\""
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"selectors\n",
"\n",
"    MAIN = \"#notebook\"\n",
"    CELL = \".cell, .jp-Cell\"\n",
"    CODE = \".code_cell, .jp-CodeCell\"\n",
"    MD = \".text_cell, .jp-MarkdownCell\"\n",
"    OUT = \".output, .jp-OutputArea.jp-Cell-outputArea\"\n",
"    IN = \".code_cell .input .input_area\"\n",
"    PROMPT = \".input_prompt\""
]
},
{
"cell_type": "code",
"execution_count": 50,
"id": "ec34d767-2266-46dc-a67e-29dfd98de89d",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"    def set_main(soup):\n",
"map the `MAIN` tags to the primary notebook section\n",
"    \n",
"        e = soup.select_one(MAIN)\n",
"        e.attrs.pop(\"tabindex\", None)\n",
"        e.name = \"main\"\n",
"    \n",
"    def set_main_aside(soup):\n",
"[Move Metadata to the top](https://github.com/Iota-School/notebooks-for-all/issues/21)"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"    def set_main(soup):\n",
"map the `MAIN` tags to the primary notebook section\n",
"    \n",
"        e = soup.select_one(MAIN)\n",
"        e.attrs.pop(\"tabindex\", None)\n",
"        e.name = \"main\"\n",
"    \n",
"    def set_main_aside(soup):\n",
"[Move Metadata to the top](https://github.com/Iota-School/notebooks-for-all/issues/21)"
]
},
{
"cell_type": "code",
"execution_count": 51,
"id": "731d9cd8-1fe0-4f3a-9923-93dbc1bdd350",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"improving landmarks and roles in a notebook.\n",
"\n",
"https://github.com/Iota-School/notebooks-for-all/issues/19#issuecomment-1251245078\n",
"\n",
"    def set_cells(soup):\n",
"        # select each cell kind directly: the CODE and MD classes sit on the cell element itself\n",
"        for element in soup.select(CODE):\n",
"            set_code_cell(element)\n",
"            set_displays(element)\n",
"        for element in soup.select(MD):\n",
"            set_md_cell(element)\n",
"\n",
"code and markdown cells are articles with a label added\n",
"    \n",
"    def set_code_cell(e):\n",
"        e.name = \"article\"\n",
"        e.attrs.setdefault(\"aria-label\", \"code cell\")\n",
"    \n",
"    def set_md_cell(e):\n",
"        e.name = \"article\"\n",
"        e.attrs.setdefault(\"aria-label\", \"markdown cell\")\n",
"    \n",
"    def set_displays(e):\n",
"introduces a section tag to the outputs\n",
"    \n",
"        out = e.select_one(OUT)\n",
"        if out:\n",
"            out.name = \"section\"\n",
"            out.attrs.setdefault(\"aria-label\", \"code outputs\")"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"improving landmarks and roles in a notebook.\n",
"\n",
"https://github.com/Iota-School/notebooks-for-all/issues/19#issuecomment-1251245078\n",
"\n",
"    def set_cells(soup):\n",
"        # select each cell kind directly: the CODE and MD classes sit on the cell element itself\n",
"        for element in soup.select(CODE):\n",
"            set_code_cell(element)\n",
"            set_displays(element)\n",
"        for element in soup.select(MD):\n",
"            set_md_cell(element)\n",
"\n",
"code and markdown cells are articles with a label added\n",
"    \n",
"    def set_code_cell(e):\n",
"        e.name = \"article\"\n",
"        e.attrs.setdefault(\"aria-label\", \"code cell\")\n",
"    \n",
"    def set_md_cell(e):\n",
"        e.name = \"article\"\n",
"        e.attrs.setdefault(\"aria-label\", \"markdown cell\")\n",
"    \n",
"    def set_displays(e):\n",
"introduces a section tag to the outputs\n",
"    \n",
"        out = e.select_one(OUT)\n",
"        if out:\n",
"            out.name = \"section\"\n",
"            out.attrs.setdefault(\"aria-label\", \"code outputs\")\n"
]
},
{
"cell_type": "code",
"execution_count": 52,
"id": "9a97277e-eb20-4a14-bf3a-4ff2b73fb78c",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"    def set_inputs(soup):\n",
"https://github.com/Iota-School/notebooks-for-all/issues/15\n",
"https://github.com/Iota-School/notebooks-for-all/issues/20\n",
"    \n",
"        for inp in soup.select(IN):\n",
"            # build <pre><code> with new_tag so the source text is escaped instead of parsed as html\n",
"            pre, code = soup.new_tag(\"pre\"), soup.new_tag(\"code\")\n",
"            code.string = inp.text\n",
"            pre.append(code)\n",
"            inp.replace_with(pre)"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"    def set_inputs(soup):\n",
"https://github.com/Iota-School/notebooks-for-all/issues/15\n",
"https://github.com/Iota-School/notebooks-for-all/issues/20\n",
"    \n",
"        for inp in soup.select(IN):\n",
"            # build <pre><code> with new_tag so the source text is escaped instead of parsed as html\n",
"            pre, code = soup.new_tag(\"pre\"), soup.new_tag(\"code\")\n",
"            code.string = inp.text\n",
"            pre.append(code)\n",
"            inp.replace_with(pre)"
]
},
{
"cell_type": "code",
"execution_count": 53,
"id": "ae2deb48-739b-4fce-a8d0-e714d130267d",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"    def set_prompts(soup):\n",
"https://github.com/Iota-School/notebooks-for-all/issues/20#issuecomment-1247172797\n",
"    \n",
"        for prompt in soup.select(PROMPT):\n",
"            prompt.name = \"aside\""
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"    def set_prompts(soup):\n",
"https://github.com/Iota-School/notebooks-for-all/issues/20#issuecomment-1247172797\n",
"    \n",
"        for prompt in soup.select(PROMPT):\n",
"            prompt.name = \"aside\""
]
},
{
"cell_type": "code",
"execution_count": 55,
"id": "91e832c1-7ec1-4820-b5d3-f471e1419d6b",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"    def set_notebook(soup):\n",
"        set_cells(soup)\n",
"        set_inputs(soup)\n",
"        set_prompts(soup)"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"    def set_notebook(soup):\n",
"        set_cells(soup)\n",
"        set_inputs(soup)\n",
"        set_prompts(soup)"
]
},
{
"cell_type": "code",
"execution_count": 135,
"id": "3752f7fe-b14e-4787-9d5d-6daa5464876f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"279477"
]
},
"execution_count": 135,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "56330381b7d44a77bcfb26afb0f20a65",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"HTML(value='<pre><code>target = pathlib.Path(&quot;indexed.html&quot;)\\ntarget.write_text(str(soup))\\n# target…"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"    target = pathlib.Path(\"indexed.html\")\n",
"    target.write_text(str(soup))\n",
"    # target.write_text(\"\"\"<body><article><section>fuck</section></article></body>\"\"\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7be5f655-8535-4432-bc54-f70d565dba34",
"metadata": {},
"outputs": [],
"source": [
"    import playwright.async_api\n",
"    async with playwright.async_api.async_playwright() as play:\n",
"        browser = await play.chromium.launch(\n",
"            args=shlex.split('--enable-blink-features=\"AccessibilityObjectModel\"'),\n",
"            headless=False, \n",
"            channel=\"chrome-beta\"\n",
"        )\n",
"        page = await browser.new_page()\n",
"        state = await page.goto(target.absolute().as_uri())\n",
"        # note: asyncio.sleep takes seconds, so this holds the headed browser open for about 33 minutes\n",
"        await asyncio.sleep(2000)\n",
"        data = await page.accessibility.snapshot()\n",
"        await browser.close()"
]
},
{
"cell_type": "code",
"execution_count": 139,
"id": "feb772d9-7c59-4f3a-b5ed-572eb3ad5d12",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'role': 'WebArea',\n",
" 'name': 'Notebook',\n",
" 'children': [{'role': 'text', 'name': 'In\\xa0[117]:'},\n",
" {'role': 'text',\n",
" 'name': 'we have to use the async api in IPython cause the event loop is running import nbconvert from nbformat import v4'},\n",
" {'role': 'text', 'name': 'In\\xa0[118]:'},\n",
" {'role': 'text',\n",
" 'name': 'e = nbconvert.exporters.html.HTMLExporter( template_name=\"classic\" ) nb = v4.reads(pathlib.Path(\"2022-09-18-.ipynb\").read_text())'},\n",
" {'role': 'text',\n",
" 'name': \"HTML(value='<pre><code>e = nbconvert.exporters.html.HTMLExporter(\\\\n template_name=&quot;classic&quot;\\\\n)\\\\n\\\\…\"},\n",
" {'role': 'text', 'name': 'In\\xa0[119]:'},\n",
" {'role': 'text', 'name': 'from bs4 import BeautifulSoup'},\n",
" {'role': 'text',\n",
" 'name': \"HTML(value='<pre><code>from bs4 import BeautifulSoup</code></pre>\\\\n')\"},\n",
" {'role': 'text', 'name': 'In\\xa0[120]:'},\n",
" {'role': 'text', 'name': 'soup = BeautifulSoup(nbconvert.export(e, nb)[0])'},\n",
" {'role': 'text',\n",
" 'name': \"HTML(value='<pre><code>soup = BeautifulSoup(nbconvert.export(e, nb)[0])</code></pre>\\\\n')\"},\n",
" {'role': 'text', 'name': 'In\\xa0[121]:'},\n",
" {'role': 'text',\n",
" 'name': '+++ MAIN = \"#notebook\" CELL = \".cell, .jp-Cell\" CODE = \".code_cell, .jp-CodeCell\" MD = \".text_cell, .jp-MarkdownCell\" OUT = \".output, .jp-OutputArea.jp-Cell-outputArea\" IN = \".code_cell .input .input_area\" PROMPT = \".input_prompt\" +++ e = soup.select_one(MAIN) e.attrs.pop(\"tabindex\", None) e.name = \"main\"'},\n",
" {'role': 'text', 'name': 'In\\xa0[122]:'},\n",
" {'role': 'text',\n",
" 'name': 'for element in soup.select(CELLS): element.name = \"article\" code, md = element.select_one(CODE), element.select_one(MD) if code or md: element.attrs.setdefault(\"aria-label\", \" \".join([code and \"code\" or \"markdown\", \"cell\"])) out = element.select_one(OUT) if out: out.name = \"section\" element.attrs.setdefault(\"aria-label\", \"code cell output\")'},\n",
" {'role': 'text',\n",
" 'name': \"HTML(value='<pre><code>for element in soup.select(CELLS):\\\\n element.name = &quot;article&quot;\\\\n code, m…\"},\n",
" {'role': 'text', 'name': 'In\\xa0[123]:'},\n",
" {'role': 'text',\n",
" 'name': 'for inp in soup.select(IN): inp.replace_with(BeautifulSoup(F\"'},\n",
" {'role': 'text', 'name': '{inp.text}'},\n",
" {'role': 'text', 'name': '\").select_one(\"code\"))'},\n",
" {'role': 'text',\n",
" 'name': \"HTML(value='<pre><code>for inp in soup.select(IN):\\\\n inp.replace_with(BeautifulSoup(F&quot;&lt;code&gt;&lt;…\"},\n",
" {'role': 'text', 'name': 'In\\xa0[124]:'},\n",
" {'role': 'text',\n",
" 'name': 'for prompt in soup.select(PROMPT): prompt.name = \"aside\"'},\n",
" {'role': 'text',\n",
" 'name': \"HTML(value='<pre><code>for prompt in soup.select(PROMPT):\\\\n prompt.name = &quot;aside&quot;</code></pre>\\\\n'…\"},\n",
" {'role': 'text', 'name': 'In\\xa0[125]:'},\n",
" {'role': 'text',\n",
" 'name': 'target = pathlib.Path(\"indexed.html\") target.write_text(str(soup)) # target.write_text(\"\"\"'},\n",
" {'role': 'text', 'name': 'fuck'},\n",
" {'role': 'text', 'name': '\"\"\")'},\n",
" {'role': 'text', 'name': 'Out[125]:'},\n",
" {'role': 'text', 'name': '650636'},\n",
" {'role': 'text',\n",
" 'name': \"HTML(value='<pre><code>target = pathlib.Path(&quot;indexed.html&quot;)\\\\ntarget.write_text(str(soup))\\\\n# target…\"},\n",
" {'role': 'text', 'name': 'In\\xa0[126]:'},\n",
" {'role': 'text',\n",
" 'name': 'import playwright.async_api async with playwright.async_api.async_playwright() as play: browser = await play.chromium.launch( args=shlex.split(\\'--enable-blink-features=\"AccessibilityObjectModel\"\\'), # headless=False, channel=\"chrome-beta\" ) page = await browser.new_page() state = await page.goto(target.absolute().as_uri()) data = await page.accessibility.snapshot() await browser.close()'},\n",
" {'role': 'text',\n",
" 'name': 'HTML(value=\"<pre><code>import playwright.async_api\\\\nasync with playwright.async_api.async_playwright() as play…'},\n",
" {'role': 'text', 'name': 'data'},\n",
" {'role': 'text', 'name': 'In\\xa0[\\xa0]:'}]}"
]
},
"execution_count": 139,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "c0c05c3ff675456dad5245fbb393ef74",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"HTML(value='<pre><code>data</code></pre>\\n')"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"    data"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "617db39f-2234-40de-bdfa-026f5ffb0f74",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "pidgy",
"language": "markdown",
"name": "pidgy"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.10"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
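
a minimal sketch, not part of the notebook above, of one way the "signal in the accessibility snapshot" could be measured: it tallies the `role` values in a snapshot dictionary shaped like the `data` output. `count_roles` and both sample dictionaries are hypothetical names and data made up for illustration, and the roles in `remediated` are assumed rather than observed.

```python
from collections import Counter


def count_roles(node):
    """Tally every role in a nested accessibility snapshot dictionary."""
    roles = Counter()
    stack = [node]
    while stack:
        current = stack.pop()
        roles[current.get("role", "unknown")] += 1
        # children is optional on leaf nodes
        stack.extend(current.get("children", []))
    return roles


# hypothetical snapshots: one flat like the recorded output, one with landmarks
flat = {"role": "WebArea", "name": "Notebook", "children": [
    {"role": "text", "name": "In [1]:"},
    {"role": "text", "name": "import nbconvert"},
]}
remediated = {"role": "WebArea", "name": "Notebook", "children": [
    {"role": "main", "name": "", "children": [
        {"role": "article", "name": "code cell", "children": [
            {"role": "text", "name": "import nbconvert"},
        ]},
    ]},
]}

before, after = count_roles(flat), count_roles(remediated)
# Counter subtraction keeps only the roles that became more frequent
print(after - before)
```

a snapshot whose children are all `text`, like the recorded one, would show no landmark roles in this tally, so comparing the tallies before and after remediation gives a rough signal.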