Created
March 15, 2025 19:05
Save mgaitan/03974a60b834ad673e5166d8d1ac5279 to your computer and use it in GitHub Desktop.
{ | |
"cells": [ | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"id": "927fdcb4-76fb-4e5b-b23e-982379b48abe", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"from ytelegraph import md_to_dom" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"id": "30904760-9052-487b-afbf-362b20b34bc5", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"from ytelegraph.md_to_dom_2 import md_to_dom as md_to_dom_2" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"id": "3391977a-4213-4b9c-8d3d-e1ba67e9e3ba", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"content = \"\"\"\n", | |
"## Markdown example\n", | |
"\n", | |
"- **Bold text:** Use `**text**` to make text **bold**.\n", | |
"- *Italic text:* Use `*text*` or `_text_` to make text *italic*.\n", | |
"- ***Bold and Italic:*** You can combine them with triple asterisks, like ***this example***.\n", | |
"- ~~Strikethrough text:~~ Wrap text in `~~` to get ~~strikethrough~~.\n", | |
"\n", | |
"### Mixed Formats in a Sentence\n", | |
"\n", | |
"You can combine several formats in one sentence. For example, here is some ***bold and italic text*** alongside `inline code` to emphasize certain elements.\n", | |
"\n", | |
"### Links with Code Formatting\n", | |
"\n", | |
"It is also possible to have a link whose text is formatted as inline code. For instance: [`Special_Link`](https://www.example.com). This link uses code formatting for its text.\n", | |
"\n", | |
"### Additional Inline Formatting\n", | |
"\n", | |
"Sometimes you might want to mix more styles:\n", | |
"- **Bold**, _italic_, and `inline code` can all appear in the same sentence.\n", | |
"- Try this: **This is bold**, _this is italic_, and `this is code` all together.\n", | |
"\n", | |
"**Final Example:** Check out [**Ultimate_Link**](https://www.example.com/ultimate) which combines bold into one link!\n", | |
"\n", | |
"Enjoy testing your parser with this rich variety of inline formatting!\n", | |
"\n", | |
"### What You'll Find\n", | |
"\n", | |
"- **Lists:**\n", | |
" - Unordered lists with nested items:\n", | |
" - Item 1\n", | |
" - Sub-item 1a\n", | |
" - Sub-item 1b\n", | |
" - Item 2\n", | |
"- **Ordered Lists:**\n", | |
" 1. First item\n", | |
" 2. Second item\n", | |
" 1. Nested first\n", | |
" 2. Nested second\n", | |
"- **Links:** \n", | |
" You can visit [Example Website](https://www.example.com) for more information.\n", | |
"- **Images:** \n", | |
" Here's an absolute URL image: \n", | |
" \n", | |
"\n", | |
"## Code Samples\n", | |
"\n", | |
"Below is a simple Python code example:\n", | |
"\n", | |
"```python\n", | |
"def greet(name):\n", | |
" print(f\"Hello, {name}!\")\n", | |
" \n", | |
"greet(\"World\")\n", | |
"```\n", | |
"\"\"\"" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"id": "e0222694-48ed-4f5a-90cf-d8f96caa63c6", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[{'tag': 'h4', 'children': ['Markdown example']},\n", | |
" {'tag': 'ul',\n", | |
" 'children': [{'tag': 'li',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Bold text:']},\n", | |
" 'Use',\n", | |
" {'tag': 'code', 'children': ['**text**']},\n", | |
" 'to make text',\n", | |
" {'tag': 'strong', 'children': ['bold']},\n", | |
" '.']},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'em', 'children': ['Italic text:']},\n", | |
" 'Use',\n", | |
" {'tag': 'code', 'children': ['*text*']},\n", | |
" 'or',\n", | |
" {'tag': 'code', 'children': ['_text_']},\n", | |
" 'to make text',\n", | |
" {'tag': 'em', 'children': ['italic']},\n", | |
" '.']},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'strong',\n", | |
" 'children': [{'tag': 'em', 'children': ['Bold and Italic:']}]},\n", | |
" 'You can combine them with triple asterisks, like',\n", | |
" {'tag': 'strong',\n", | |
" 'children': [{'tag': 'em', 'children': ['this example']}]},\n", | |
" '.']},\n", | |
" {'tag': 'li',\n", | |
" 'children': ['~~Strikethrough text:~~ Wrap text in',\n", | |
" {'tag': 'code', 'children': ['~~']},\n", | |
" 'to get ~~strikethrough~~.']}]},\n", | |
" {'tag': 'p',\n", | |
" 'children': [{'tag': 'strong',\n", | |
" 'children': ['Mixed Formats in a Sentence']}]},\n", | |
" {'tag': 'p',\n", | |
" 'children': ['You can combine several formats in one sentence. For example, here is some',\n", | |
" {'tag': 'strong',\n", | |
" 'children': [{'tag': 'em', 'children': ['bold and italic text']}]},\n", | |
" 'alongside',\n", | |
" {'tag': 'code', 'children': ['inline code']},\n", | |
" 'to emphasize certain elements.']},\n", | |
" {'tag': 'p',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Links with Code Formatting']}]},\n", | |
" {'tag': 'p',\n", | |
" 'children': ['It is also possible to have a link whose text is formatted as inline code. For instance:',\n", | |
" {'tag': 'a',\n", | |
" 'attrs': {'href': 'https://www.example.com'},\n", | |
" 'children': [{'tag': 'code', 'children': ['Special_Link']}]},\n", | |
" '. This link uses code formatting for its text.']},\n", | |
" {'tag': 'p',\n", | |
" 'children': [{'tag': 'strong',\n", | |
" 'children': ['Additional Inline Formatting']}]},\n", | |
" {'tag': 'p',\n", | |
" 'children': ['Sometimes you might want to mix more styles:\\n-',\n", | |
" {'tag': 'strong', 'children': ['Bold']},\n", | |
" ',',\n", | |
" {'tag': 'em', 'children': ['italic']},\n", | |
" ', and',\n", | |
" {'tag': 'code', 'children': ['inline code']},\n", | |
" 'can all appear in the same sentence.\\n- Try this:',\n", | |
" {'tag': 'strong', 'children': ['This is bold']},\n", | |
" ',',\n", | |
" {'tag': 'em', 'children': ['this is italic']},\n", | |
" ', and',\n", | |
" {'tag': 'code', 'children': ['this is code']},\n", | |
" 'all together.']},\n", | |
" {'tag': 'p',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Final Example:']},\n", | |
" 'Check out',\n", | |
" {'tag': 'a',\n", | |
" 'attrs': {'href': 'https://www.example.com/ultimate'},\n", | |
" 'children': [{'tag': 'strong', 'children': ['Ultimate_Link']}]},\n", | |
" 'which combines bold into one link!']},\n", | |
" {'tag': 'p',\n", | |
" 'children': ['Enjoy testing your parser with this rich variety of inline formatting!']},\n", | |
" {'tag': 'p',\n", | |
" 'children': [{'tag': 'strong', 'children': [\"What You'll Find\"]}]},\n", | |
" {'tag': 'ul',\n", | |
" 'children': [{'tag': 'li',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Lists:']}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': ['Unordered lists with nested items:',\n", | |
" {'tag': 'ul',\n", | |
" 'children': [{'tag': 'li', 'children': ['Item 1']},\n", | |
" {'tag': 'li', 'children': ['Sub-item 1a']},\n", | |
" {'tag': 'li', 'children': ['Sub-item 1b']},\n", | |
" {'tag': 'li', 'children': ['Item 2']}]}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Ordered Lists:']},\n", | |
" '1. First item\\n 2. Second item',\n", | |
" {'tag': 'ol',\n", | |
" 'children': [{'tag': 'li', 'children': ['Nested first']},\n", | |
" {'tag': 'li', 'children': ['Nested second']}]}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Links:']},\n", | |
" {'tag': 'br', 'children': []},\n", | |
" 'You can visit',\n", | |
" {'tag': 'a',\n", | |
" 'attrs': {'href': 'https://www.example.com'},\n", | |
" 'children': ['Example Website']},\n", | |
" 'for more information.']},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Images:']},\n", | |
" {'tag': 'br', 'children': []},\n", | |
" \"Here's an absolute URL image:\",\n", | |
" {'tag': 'br', 'children': []},\n", | |
" {'tag': 'img', 'attrs': {'src': 'https://via.placeholder.com/200'}}]}]},\n", | |
" {'tag': 'h4', 'children': ['Code Samples']},\n", | |
" {'tag': 'p', 'children': ['Below is a simple Python code example:']},\n", | |
" {'tag': 'pre',\n", | |
" 'children': [{'tag': 'code',\n", | |
" 'children': ['def greet(name):\\n print(f\"Hello, {name}!\")\\n\\ngreet(\"World\")']}]}]" | |
] | |
}, | |
"execution_count": 6, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"md_to_dom(content)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"id": "2da05e8b-69a3-4595-8597-85a71038acbb", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[{'tag': 'h4', 'children': ['Markdown example']},\n", | |
" {'tag': 'ul',\n", | |
" 'children': [{'tag': 'li',\n", | |
" 'children': [{'tag': 'p',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Bold text:']},\n", | |
" ' Use ',\n", | |
" {'tag': 'code', 'children': ['**text**']},\n", | |
" ' to make text ',\n", | |
" {'tag': 'strong', 'children': ['bold']},\n", | |
" '.']}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'p',\n", | |
" 'children': [{'tag': 'em', 'children': ['Italic text:']},\n", | |
" ' Use ',\n", | |
" {'tag': 'code', 'children': ['*text*']},\n", | |
" ' or ',\n", | |
" {'tag': 'code', 'children': ['_text_']},\n", | |
" ' to make text ',\n", | |
" {'tag': 'em', 'children': ['italic']},\n", | |
" '.']}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'p',\n", | |
" 'children': [{'tag': 'em',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Bold and Italic:']}]},\n", | |
" ' You can combine them with triple asterisks, like ',\n", | |
" {'tag': 'em',\n", | |
" 'children': [{'tag': 'strong', 'children': ['this example']}]},\n", | |
" '.']}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'p',\n", | |
" 'children': [{'tag': 'del', 'children': ['Strikethrough text:']},\n", | |
" ' Wrap text in ',\n", | |
" {'tag': 'code', 'children': ['~~']},\n", | |
" ' to get ~~strikethrough~~.']}]}]},\n", | |
" {'tag': 'p',\n", | |
" 'children': [{'tag': 'strong',\n", | |
" 'children': ['Mixed Formats in a Sentence']}]},\n", | |
" {'tag': 'p',\n", | |
" 'children': ['You can combine several formats in one sentence. For example, here is some ',\n", | |
" {'tag': 'em',\n", | |
" 'children': [{'tag': 'strong', 'children': ['bold and italic text']}]},\n", | |
" ' alongside ',\n", | |
" {'tag': 'code', 'children': ['inline code']},\n", | |
" ' to emphasize certain elements.']},\n", | |
" {'tag': 'p',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Links with Code Formatting']}]},\n", | |
" {'tag': 'p',\n", | |
" 'children': ['It is also possible to have a link whose text is formatted as inline code. For instance: ',\n", | |
" {'tag': 'a',\n", | |
" 'attrs': {'href': 'https://www.example.com'},\n", | |
" 'children': [{'tag': 'code', 'children': ['Special_Link']}]},\n", | |
" '. This link uses code formatting for its text.']},\n", | |
" {'tag': 'p',\n", | |
" 'children': [{'tag': 'strong',\n", | |
" 'children': ['Additional Inline Formatting']}]},\n", | |
" {'tag': 'p', 'children': ['Sometimes you might want to mix more styles:']},\n", | |
" {'tag': 'ul',\n", | |
" 'children': [{'tag': 'li',\n", | |
" 'children': [{'tag': 'p',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Bold']},\n", | |
" ', ',\n", | |
" {'tag': 'em', 'children': ['italic']},\n", | |
" ', and ',\n", | |
" {'tag': 'code', 'children': ['inline code']},\n", | |
" ' can all appear in the same sentence.']}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'p',\n", | |
" 'children': ['Try this: ',\n", | |
" {'tag': 'strong', 'children': ['This is bold']},\n", | |
" ', ',\n", | |
" {'tag': 'em', 'children': ['this is italic']},\n", | |
" ', and ',\n", | |
" {'tag': 'code', 'children': ['this is code']},\n", | |
" ' all together.']}]}]},\n", | |
" {'tag': 'p',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Final Example:']},\n", | |
" ' Check out ',\n", | |
" {'tag': 'a',\n", | |
" 'attrs': {'href': 'https://www.example.com/ultimate'},\n", | |
" 'children': [{'tag': 'strong', 'children': ['Ultimate_Link']}]},\n", | |
" ' which combines bold into one link!']},\n", | |
" {'tag': 'p',\n", | |
" 'children': ['Enjoy testing your parser with this rich variety of inline formatting!']},\n", | |
" {'tag': 'p',\n", | |
" 'children': [{'tag': 'strong', 'children': [\"What You'll Find\"]}]},\n", | |
" {'tag': 'ul',\n", | |
" 'children': [{'tag': 'li',\n", | |
" 'children': [{'tag': 'p',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Lists:']}]},\n", | |
" {'tag': 'ul',\n", | |
" 'children': [{'tag': 'li',\n", | |
" 'children': [{'tag': 'p',\n", | |
" 'children': ['Unordered lists with nested items:']},\n", | |
" {'tag': 'ul',\n", | |
" 'children': [{'tag': 'li',\n", | |
" 'children': [{'tag': 'p', 'children': ['Item 1']},\n", | |
" {'tag': 'ul',\n", | |
" 'children': [{'tag': 'li',\n", | |
" 'children': [{'tag': 'p', 'children': ['Sub-item 1a']}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'p', 'children': ['Sub-item 1b']}]}]}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'p', 'children': ['Item 2']}]}]}]}]}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'p',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Ordered Lists:']}]},\n", | |
" {'tag': 'ol',\n", | |
" 'children': [{'tag': 'li',\n", | |
" 'children': [{'tag': 'p', 'children': ['First item']}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'p', 'children': ['Second item']},\n", | |
" {'tag': 'ol',\n", | |
" 'children': [{'tag': 'li',\n", | |
" 'children': [{'tag': 'p', 'children': ['Nested first']}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'p', 'children': ['Nested second']}]}]}]}]}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'p',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Links:']},\n", | |
" {'tag': 'br'},\n", | |
" 'You can visit ',\n", | |
" {'tag': 'a',\n", | |
" 'attrs': {'href': 'https://www.example.com'},\n", | |
" 'children': ['Example Website']},\n", | |
" ' for more information.']}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'p',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Images:']},\n", | |
" {'tag': 'br'},\n", | |
" \"Here's an absolute URL image:\",\n", | |
" {'tag': 'br'},\n", | |
" {'tag': 'img',\n", | |
" 'attrs': {'src': 'https://via.placeholder.com/200',\n", | |
" 'alt': ['Placeholder Image']}}]}]}]},\n", | |
" {'tag': 'h4', 'children': ['Code Samples']},\n", | |
" {'tag': 'p', 'children': ['Below is a simple Python code example:']},\n", | |
" {'tag': 'pre',\n", | |
" 'children': [{'tag': 'code',\n", | |
" 'children': ['def greet(name):\\n print(f\"Hello, {name}!\")\\n\\ngreet(\"World\")\\n'],\n", | |
" 'attrs': {'class': 'language-python'}}]}]" | |
] | |
}, | |
"execution_count": 7, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"md_to_dom_2(content)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"id": "3e95f444-ef74-421b-9b89-b0f569076418", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"from deepdiff import DeepDiff" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 9, | |
"id": "c93ee200-4e76-4dbc-92f4-77d7056d96cc", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"{'type_changes': {\"root[1]['children'][0]['children'][0]['children'][0]\": {'old_type': str,\n", | |
" 'new_type': dict,\n", | |
" 'old_value': 'Bold text:',\n", | |
" 'new_value': {'tag': 'strong', 'children': ['Bold text:']}},\n", | |
" \"root[1]['children'][1]['children'][0]['children'][0]\": {'old_type': str,\n", | |
" 'new_type': dict,\n", | |
" 'old_value': 'Italic text:',\n", | |
" 'new_value': {'tag': 'em', 'children': ['Italic text:']}},\n", | |
" \"root[1]['children'][2]['children'][0]['children'][0]['children'][0]\": {'old_type': str,\n", | |
" 'new_type': dict,\n", | |
" 'old_value': 'Bold and Italic:',\n", | |
" 'new_value': {'tag': 'strong', 'children': ['Bold and Italic:']}},\n", | |
" \"root[1]['children'][3]['children'][0]\": {'old_type': str,\n", | |
" 'new_type': dict,\n", | |
" 'old_value': '~~Strikethrough text:~~ Wrap text in',\n", | |
" 'new_value': {'tag': 'p',\n", | |
" 'children': [{'tag': 'del', 'children': ['Strikethrough text:']},\n", | |
" ' Wrap text in ',\n", | |
" {'tag': 'code', 'children': ['~~']},\n", | |
" ' to get ~~strikethrough~~.']}},\n", | |
" \"root[8]['children'][0]['children'][0]\": {'old_type': str,\n", | |
" 'new_type': dict,\n", | |
" 'old_value': 'Final Example:',\n", | |
" 'new_value': {'tag': 'p',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Bold']},\n", | |
" ', ',\n", | |
" {'tag': 'em', 'children': ['italic']},\n", | |
" ', and ',\n", | |
" {'tag': 'code', 'children': ['inline code']},\n", | |
" ' can all appear in the same sentence.']}},\n", | |
" \"root[8]['children'][1]\": {'old_type': str,\n", | |
" 'new_type': dict,\n", | |
" 'old_value': 'Check out',\n", | |
" 'new_value': {'tag': 'li',\n", | |
" 'children': [{'tag': 'p',\n", | |
" 'children': ['Try this: ',\n", | |
" {'tag': 'strong', 'children': ['This is bold']},\n", | |
" ', ',\n", | |
" {'tag': 'em', 'children': ['this is italic']},\n", | |
" ', and ',\n", | |
" {'tag': 'code', 'children': ['this is code']},\n", | |
" ' all together.']}]}},\n", | |
" \"root[9]['children'][0]\": {'old_type': str,\n", | |
" 'new_type': dict,\n", | |
" 'old_value': 'Enjoy testing your parser with this rich variety of inline formatting!',\n", | |
" 'new_value': {'tag': 'strong', 'children': ['Final Example:']}},\n", | |
" \"root[10]['children'][0]\": {'old_type': dict,\n", | |
" 'new_type': str,\n", | |
" 'old_value': {'tag': 'strong', 'children': [\"What You'll Find\"]},\n", | |
" 'new_value': 'Enjoy testing your parser with this rich variety of inline formatting!'},\n", | |
" \"root[11]['children'][0]['children'][0]\": {'old_type': dict,\n", | |
" 'new_type': str,\n", | |
" 'old_value': {'tag': 'strong', 'children': ['Lists:']},\n", | |
" 'new_value': \"What You'll Find\"},\n", | |
" \"root[12]['children'][0]\": {'old_type': str,\n", | |
" 'new_type': dict,\n", | |
" 'old_value': 'Code Samples',\n", | |
" 'new_value': {'tag': 'li',\n", | |
" 'children': [{'tag': 'p',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Lists:']}]},\n", | |
" {'tag': 'ul',\n", | |
" 'children': [{'tag': 'li',\n", | |
" 'children': [{'tag': 'p',\n", | |
" 'children': ['Unordered lists with nested items:']},\n", | |
" {'tag': 'ul',\n", | |
" 'children': [{'tag': 'li',\n", | |
" 'children': [{'tag': 'p', 'children': ['Item 1']},\n", | |
" {'tag': 'ul',\n", | |
" 'children': [{'tag': 'li',\n", | |
" 'children': [{'tag': 'p', 'children': ['Sub-item 1a']}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'p', 'children': ['Sub-item 1b']}]}]}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'p', 'children': ['Item 2']}]}]}]}]}]}},\n", | |
" \"root[14]['children'][0]\": {'old_type': dict,\n", | |
" 'new_type': str,\n", | |
" 'old_value': {'tag': 'code',\n", | |
" 'children': ['def greet(name):\\n print(f\"Hello, {name}!\")\\n\\ngreet(\"World\")']},\n", | |
" 'new_value': 'Below is a simple Python code example:'}},\n", | |
" 'values_changed': {\"root[1]['children'][0]['children'][0]['tag']\": {'new_value': 'p',\n", | |
" 'old_value': 'strong'},\n", | |
" \"root[1]['children'][1]['children'][0]['tag']\": {'new_value': 'p',\n", | |
" 'old_value': 'em'},\n", | |
" \"root[1]['children'][2]['children'][0]['tag']\": {'new_value': 'p',\n", | |
" 'old_value': 'strong'},\n", | |
" \"root[3]['children'][0]\": {'new_value': 'You can combine several formats in one sentence. For example, here is some ',\n", | |
" 'old_value': 'You can combine several formats in one sentence. For example, here is some'},\n", | |
" \"root[3]['children'][1]['tag']\": {'new_value': 'em', 'old_value': 'strong'},\n", | |
" \"root[3]['children'][1]['children'][0]['tag']\": {'new_value': 'strong',\n", | |
" 'old_value': 'em'},\n", | |
" \"root[3]['children'][2]\": {'new_value': ' alongside ',\n", | |
" 'old_value': 'alongside'},\n", | |
" \"root[3]['children'][4]\": {'new_value': ' to emphasize certain elements.',\n", | |
" 'old_value': 'to emphasize certain elements.'},\n", | |
" \"root[5]['children'][0]\": {'new_value': 'It is also possible to have a link whose text is formatted as inline code. For instance: ',\n", | |
" 'old_value': 'It is also possible to have a link whose text is formatted as inline code. For instance:'},\n", | |
" \"root[7]['children'][0]\": {'new_value': 'Sometimes you might want to mix more styles:',\n", | |
" 'old_value': 'Sometimes you might want to mix more styles:\\n-',\n", | |
" 'diff': '--- \\n+++ \\n@@ -1,2 +1 @@\\n Sometimes you might want to mix more styles:\\n--'},\n", | |
" \"root[8]['tag']\": {'new_value': 'ul', 'old_value': 'p'},\n", | |
" \"root[8]['children'][0]['tag']\": {'new_value': 'li', 'old_value': 'strong'},\n", | |
" \"root[11]['tag']\": {'new_value': 'p', 'old_value': 'ul'},\n", | |
" \"root[11]['children'][0]['tag']\": {'new_value': 'strong', 'old_value': 'li'},\n", | |
" \"root[12]['tag']\": {'new_value': 'ul', 'old_value': 'h4'},\n", | |
" \"root[13]['tag']\": {'new_value': 'h4', 'old_value': 'p'},\n", | |
" \"root[13]['children'][0]\": {'new_value': 'Code Samples',\n", | |
" 'old_value': 'Below is a simple Python code example:'},\n", | |
" \"root[14]['tag']\": {'new_value': 'p', 'old_value': 'pre'}},\n", | |
" 'iterable_item_added': {\"root[1]['children'][0]['children'][0]['children'][1]\": ' Use ',\n", | |
" \"root[1]['children'][0]['children'][0]['children'][2]\": {'tag': 'code',\n", | |
" 'children': ['**text**']},\n", | |
" \"root[1]['children'][0]['children'][0]['children'][3]\": ' to make text ',\n", | |
" \"root[1]['children'][0]['children'][0]['children'][4]\": {'tag': 'strong',\n", | |
" 'children': ['bold']},\n", | |
" \"root[1]['children'][0]['children'][0]['children'][5]\": '.',\n", | |
" \"root[1]['children'][1]['children'][0]['children'][1]\": ' Use ',\n", | |
" \"root[1]['children'][1]['children'][0]['children'][2]\": {'tag': 'code',\n", | |
" 'children': ['*text*']},\n", | |
" \"root[1]['children'][1]['children'][0]['children'][3]\": ' or ',\n", | |
" \"root[1]['children'][1]['children'][0]['children'][4]\": {'tag': 'code',\n", | |
" 'children': ['_text_']},\n", | |
" \"root[1]['children'][1]['children'][0]['children'][5]\": ' to make text ',\n", | |
" \"root[1]['children'][1]['children'][0]['children'][6]\": {'tag': 'em',\n", | |
" 'children': ['italic']},\n", | |
" \"root[1]['children'][1]['children'][0]['children'][7]\": '.',\n", | |
" \"root[1]['children'][2]['children'][0]['children'][1]\": ' You can combine them with triple asterisks, like ',\n", | |
" \"root[1]['children'][2]['children'][0]['children'][2]\": {'tag': 'em',\n", | |
" 'children': [{'tag': 'strong', 'children': ['this example']}]},\n", | |
" \"root[1]['children'][2]['children'][0]['children'][3]\": '.',\n", | |
" \"root[9]['children'][1]\": ' Check out ',\n", | |
" \"root[9]['children'][2]\": {'tag': 'a',\n", | |
" 'attrs': {'href': 'https://www.example.com/ultimate'},\n", | |
" 'children': [{'tag': 'strong', 'children': ['Ultimate_Link']}]},\n", | |
" \"root[9]['children'][3]\": ' which combines bold into one link!',\n", | |
" \"root[12]['children'][1]\": {'tag': 'li',\n", | |
" 'children': [{'tag': 'p',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Ordered Lists:']}]},\n", | |
" {'tag': 'ol',\n", | |
" 'children': [{'tag': 'li',\n", | |
" 'children': [{'tag': 'p', 'children': ['First item']}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'p', 'children': ['Second item']},\n", | |
" {'tag': 'ol',\n", | |
" 'children': [{'tag': 'li',\n", | |
" 'children': [{'tag': 'p', 'children': ['Nested first']}]},\n", | |
" {'tag': 'li',\n", | |
" 'children': [{'tag': 'p', 'children': ['Nested second']}]}]}]}]}]},\n", | |
" \"root[12]['children'][2]\": {'tag': 'li',\n", | |
" 'children': [{'tag': 'p',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Links:']},\n", | |
" {'tag': 'br'},\n", | |
" 'You can visit ',\n", | |
" {'tag': 'a',\n", | |
" 'attrs': {'href': 'https://www.example.com'},\n", | |
" 'children': ['Example Website']},\n", | |
" ' for more information.']}]},\n", | |
" \"root[12]['children'][3]\": {'tag': 'li',\n", | |
" 'children': [{'tag': 'p',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Images:']},\n", | |
" {'tag': 'br'},\n", | |
" \"Here's an absolute URL image:\",\n", | |
" {'tag': 'br'},\n", | |
" {'tag': 'img',\n", | |
" 'attrs': {'src': 'https://via.placeholder.com/200',\n", | |
" 'alt': ['Placeholder Image']}}]}]},\n", | |
" 'root[15]': {'tag': 'pre',\n", | |
" 'children': [{'tag': 'code',\n", | |
" 'children': ['def greet(name):\\n print(f\"Hello, {name}!\")\\n\\ngreet(\"World\")\\n'],\n", | |
" 'attrs': {'class': 'language-python'}}]}},\n", | |
" 'iterable_item_removed': {\"root[1]['children'][0]['children'][1]\": 'Use',\n", | |
" \"root[1]['children'][0]['children'][2]\": {'tag': 'code',\n", | |
" 'children': ['**text**']},\n", | |
" \"root[1]['children'][0]['children'][3]\": 'to make text',\n", | |
" \"root[1]['children'][0]['children'][4]\": {'tag': 'strong',\n", | |
" 'children': ['bold']},\n", | |
" \"root[1]['children'][0]['children'][5]\": '.',\n", | |
" \"root[1]['children'][1]['children'][1]\": 'Use',\n", | |
" \"root[1]['children'][1]['children'][2]\": {'tag': 'code',\n", | |
" 'children': ['*text*']},\n", | |
" \"root[1]['children'][1]['children'][3]\": 'or',\n", | |
" \"root[1]['children'][1]['children'][4]\": {'tag': 'code',\n", | |
" 'children': ['_text_']},\n", | |
" \"root[1]['children'][1]['children'][5]\": 'to make text',\n", | |
" \"root[1]['children'][1]['children'][6]\": {'tag': 'em',\n", | |
" 'children': ['italic']},\n", | |
" \"root[1]['children'][1]['children'][7]\": '.',\n", | |
" \"root[1]['children'][2]['children'][1]\": 'You can combine them with triple asterisks, like',\n", | |
" \"root[1]['children'][2]['children'][2]\": {'tag': 'strong',\n", | |
" 'children': [{'tag': 'em', 'children': ['this example']}]},\n", | |
" \"root[1]['children'][2]['children'][3]\": '.',\n", | |
" \"root[1]['children'][3]['children'][1]\": {'tag': 'code', 'children': ['~~']},\n", | |
" \"root[1]['children'][3]['children'][2]\": 'to get ~~strikethrough~~.',\n", | |
" \"root[7]['children'][1]\": {'tag': 'strong', 'children': ['Bold']},\n", | |
" \"root[7]['children'][2]\": ',',\n", | |
" \"root[7]['children'][3]\": {'tag': 'em', 'children': ['italic']},\n", | |
" \"root[7]['children'][4]\": ', and',\n", | |
" \"root[7]['children'][5]\": {'tag': 'code', 'children': ['inline code']},\n", | |
" \"root[7]['children'][6]\": 'can all appear in the same sentence.\\n- Try this:',\n", | |
" \"root[7]['children'][7]\": {'tag': 'strong', 'children': ['This is bold']},\n", | |
" \"root[7]['children'][8]\": ',',\n", | |
" \"root[7]['children'][9]\": {'tag': 'em', 'children': ['this is italic']},\n", | |
" \"root[7]['children'][10]\": ', and',\n", | |
" \"root[7]['children'][11]\": {'tag': 'code', 'children': ['this is code']},\n", | |
" \"root[7]['children'][12]\": 'all together.',\n", | |
" \"root[8]['children'][2]\": {'tag': 'a',\n", | |
" 'attrs': {'href': 'https://www.example.com/ultimate'},\n", | |
" 'children': [{'tag': 'strong', 'children': ['Ultimate_Link']}]},\n", | |
" \"root[8]['children'][3]\": 'which combines bold into one link!',\n", | |
" \"root[11]['children'][1]\": {'tag': 'li',\n", | |
" 'children': ['Unordered lists with nested items:',\n", | |
" {'tag': 'ul',\n", | |
" 'children': [{'tag': 'li', 'children': ['Item 1']},\n", | |
" {'tag': 'li', 'children': ['Sub-item 1a']},\n", | |
" {'tag': 'li', 'children': ['Sub-item 1b']},\n", | |
" {'tag': 'li', 'children': ['Item 2']}]}]},\n", | |
" \"root[11]['children'][2]\": {'tag': 'li',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Ordered Lists:']},\n", | |
" '1. First item\\n 2. Second item',\n", | |
" {'tag': 'ol',\n", | |
" 'children': [{'tag': 'li', 'children': ['Nested first']},\n", | |
" {'tag': 'li', 'children': ['Nested second']}]}]},\n", | |
" \"root[11]['children'][3]\": {'tag': 'li',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Links:']},\n", | |
" {'tag': 'br', 'children': []},\n", | |
" 'You can visit',\n", | |
" {'tag': 'a',\n", | |
" 'attrs': {'href': 'https://www.example.com'},\n", | |
" 'children': ['Example Website']},\n", | |
" 'for more information.']},\n", | |
" \"root[11]['children'][4]\": {'tag': 'li',\n", | |
" 'children': [{'tag': 'strong', 'children': ['Images:']},\n", | |
" {'tag': 'br', 'children': []},\n", | |
" \"Here's an absolute URL image:\",\n", | |
" {'tag': 'br', 'children': []},\n", | |
" {'tag': 'img', 'attrs': {'src': 'https://via.placeholder.com/200'}}]}}}" | |
] | |
}, | |
"execution_count": 9, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"DeepDiff(Out[6], Out[7])" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 10, | |
"id": "a85ccbd4-4064-4fb3-beec-8bb30108dfb8", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"6.54 ms ± 408 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n" | |
] | |
} | |
], | |
"source": [ | |
"%timeit md_to_dom(content)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 11, | |
"id": "b10ed047-0d21-4ea4-a7a5-326918556929", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"2.35 ms ± 55.9 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n" | |
] | |
} | |
], | |
"source": [ | |
"%timeit md_to_dom_2(content)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 13, | |
"id": "13691625-990c-4c68-9295-063e912ca774", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"from ytelegraph import TelegraphAPI\n", | |
"\n", | |
"ph = TelegraphAPI()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 14, | |
"id": "ed150c06-a2b3-48c7-9042-43b75e3e153c", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'https://telegra.ph/test-original-parser-03-15'" | |
] | |
}, | |
"execution_count": 14, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"ph.create_page_md(\"test original parser\", content)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 15, | |
"id": "1792b510-6df9-4fa0-8a29-69bd25906694", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'https://telegra.ph/test-new-parser-03-15'" | |
] | |
}, | |
"execution_count": 15, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"ph.create_page(\"test new parser\", md_to_dom_2(content))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"id": "a963555e-74b4-4023-9cad-94d523ef2557", | |
"metadata": {}, | |
"outputs": [], | |
"source": [] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3 (ipykernel)", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.12.8" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 5 | |
} |
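
A fair share of the `DeepDiff` output in this notebook seems to come from surface differences between the two parsers rather than from different parses: the first strips whitespace around text nodes and emits `'children': []` on `br`, while the second keeps the whitespace and wraps list-item content in `p`. A minimal sketch of a normalizer that removes those differences before diffing is shown below. `normalize` is a hypothetical helper, not part of `ytelegraph`, and the two sample items are copied from the outputs of cells 6 and 7.

```python
# Hypothetical helper (not part of ytelegraph): normalizes a Telegraph-style
# node list so that two parser outputs can be compared structurally.

def normalize(nodes):
    """Return a cleaned copy of `nodes`.

    - text nodes are stripped and empty ones dropped
    - empty `children` lists are omitted
    - a lone, attribute-free <p> inside an <li> is unwrapped
    """
    result = []
    for node in nodes:
        if isinstance(node, str):
            text = node.strip()
            if text:
                result.append(text)
            continue
        clean = {"tag": node["tag"]}
        if "attrs" in node:
            clean["attrs"] = dict(node["attrs"])
        children = normalize(node.get("children", []))
        # Unwrap <li><p>...</p></li> into <li>...</li>.
        if (
            clean["tag"] == "li"
            and len(children) == 1
            and isinstance(children[0], dict)
            and children[0]["tag"] == "p"
            and "attrs" not in children[0]
        ):
            children = children[0].get("children", [])
        if children:
            clean["children"] = children
        result.append(clean)
    return result


# First list item as produced by md_to_dom (cell 6).
old_item = [{'tag': 'li',
             'children': [{'tag': 'strong', 'children': ['Bold text:']},
                          'Use',
                          {'tag': 'code', 'children': ['**text**']},
                          'to make text',
                          {'tag': 'strong', 'children': ['bold']},
                          '.']}]

# Same list item as produced by md_to_dom_2 (cell 7).
new_item = [{'tag': 'li',
             'children': [{'tag': 'p',
                           'children': [{'tag': 'strong', 'children': ['Bold text:']},
                                        ' Use ',
                                        {'tag': 'code', 'children': ['**text**']},
                                        ' to make text ',
                                        {'tag': 'strong', 'children': ['bold']},
                                        '.']}]}]

same_after_normalizing = normalize(old_item) == normalize(new_item)
```

List items that hold a `p` followed by a nested list are left wrapped, so genuine structural differences (for example, the flattened sub-items in the first parser's output) would still show up in a diff of the normalized trees.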