Last active
December 3, 2020 21:02
BeautifulSoup HTML parser
{ | |
"nbformat": 4, | |
"nbformat_minor": 0, | |
"metadata": { | |
"colab": { | |
"name": "beautifulsoup_html_parsing.ipynb", | |
"provenance": [], | |
"authorship_tag": "ABX9TyPlGIPxb4eNIrUwlhgQwDut", | |
"include_colab_link": true | |
}, | |
"kernelspec": { | |
"name": "python3", | |
"display_name": "Python 3" | |
} | |
}, | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "view-in-github", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"<a href=\"https://colab.research.google.com/gist/nick3499/90c021a5faab85b94a6df58e59a32b7c/beautifulsoup_html_parsing.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "05IvMHnCRX3a" | |
}, | |
"source": [ | |
"## HTML Data" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "qey1pYR4RMbv" | |
}, | |
"source": [ | |
"html_doc = '''\n", | |
"<html>\n", | |
" <head>\n", | |
" <title>Searching Tree</title>\n", | |
" </head>\n", | |
" <body>\n", | |
" <h1>Searching Parse Tree In BeautifulSoup</h1>\n", | |
" <p class=\"Main\">Learning\n", | |
" <a class=\"language\" href=\"https://docs.python.org/3/\" id=\"python\">Python</a>, \n", | |
" <a class=\"language\" href=\"https://docs.oracle.com/en/java/\" id=\"java\">Java</a> and \n", | |
" <a class=\"language\" href=\"https://golang.org/doc/\" id=\"golang\">Golang</a> are fun!\n", | |
" </p>\n", | |
" <p class=\"Secondary\">\n", | |
" <b>Please subscribe!</b>\n", | |
" </p>\n", | |
" <p class=\"Secondary\" id=\"finxter\">\n", | |
" <b>copyright - FINXTER</b>\n", | |
" </p>\n", | |
" </body>\n", | |
"</html>\n", | |
"'''" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "SIxzJDWpg2w6" | |
}, | |
"source": [ | |
"## Soup Method Docs" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "-wcoqufdenv6" | |
}, | |
"source": [ | |
"The `__doc__` of Soup `find()` method reads:\n", | |
"\n", | |
"'''Return only the first child of this Tag matching the given criteria.'''\n", | |
"\n", | |
"The `__doc__` of Soup `find_all()` method reads:\n", | |
"\n", | |
"'''Extracts a list of Tag objects that match the given criteria. You can specify the name of the Tag and any attributes you want the Tag to have.\n", | |
"\n", | |
"The value of a key-value pair in the 'attrs' map can be a string, a list of strings, a regular expression object, or a callable that takes a string and returns whether or not the string matches for some custom definition of 'matches'. The same is true of the tag name.'''" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "nEbh4y4fRdLg" | |
}, | |
"source": [ | |
"## Import BeautifulSoup" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "nDKvFlinRhgb" | |
}, | |
"source": [ | |
"from bs4 import BeautifulSoup" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "LnHFIBv1Rlrz" | |
}, | |
"source": [ | |
"Import the `BeautifulSoup` class from the `bs4` package." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "9Cy6v8QeRr9o" | |
}, | |
"source": [ | |
"## Instantiate Soup's Parse Tree" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "0Tzz5sWTRu44" | |
}, | |
"source": [ | |
"soup = BeautifulSoup(html_doc, 'html.parser')" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "BF-LFSxuRxrN" | |
}, | |
"source": [ | |
"Pass the `html_doc` string to the `BeautifulSoup` class, along with the parser name `'html.parser'`, to instantiate a parse tree named `soup`." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "hswg6v5JTt09" | |
}, | |
"source": [ | |
"## Get H1" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "S_Kw1TBETzJH", | |
"outputId": "d0e62791-4711-436f-a1b9-27a7e8c4cc70" | |
}, | |
"source": [ | |
"soup.h1" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"<h1>Searching Parse Tree In BeautifulSoup</h1>" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 4 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "gvhjSOIkT1q4" | |
}, | |
"source": [ | |
"`soup.h1` returns the first `h1` tag found in the parse tree." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "cpdBC1UxUJW6" | |
}, | |
"source": [ | |
"## Iterate through All Anchor Tags" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "ZfOC7n6zUgbX", | |
"outputId": "4a5aa7fb-211a-484a-929f-81583b686be4" | |
}, | |
"source": [ | |
"for i in soup.find_all('a'):\n", | |
" print(i)" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"text": [ | |
"<a class=\"language\" href=\"https://docs.python.org/3/\" id=\"python\">Python</a>\n", | |
"<a class=\"language\" href=\"https://docs.oracle.com/en/java/\" id=\"java\">Java</a>\n", | |
"<a class=\"language\" href=\"https://golang.org/doc/\" id=\"golang\">Golang</a>\n" | |
], | |
"name": "stdout" | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "HoSKawM5UpKy" | |
}, | |
"source": [ | |
"The `find_all()` method of `soup` returns a list of all anchor tags, which can then be iterated over with a `for` loop." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "S-ichoO6VTTl" | |
}, | |
"source": [ | |
"## Find Anchor Tag with Attribute" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "lmyK_x_tVutu", | |
"outputId": "b4c142b2-00f8-47f8-c69d-84b9e5172815" | |
}, | |
"source": [ | |
"soup.find('a', id='golang')" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"<a class=\"language\" href=\"https://golang.org/doc/\" id=\"golang\">Golang</a>" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 6 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "w0W45aCpVyeL" | |
}, | |
"source": [ | |
"The `find()` method of `soup` receives two arguments:\n", | |
"\n", | |
"- `'a'`: the tag name to search for (anchor)\n", | |
"- `id='golang'`: a keyword filter on the `id` attribute" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "cIdGoDptWQtU" | |
}, | |
"source": [ | |
"## List All Anchor Tags in Language Class" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "eok7lRQwWhP8", | |
"outputId": "923ba6aa-a99e-4e0c-b964-8dc7299a2792" | |
}, | |
"source": [ | |
"for i in soup.select('a.language'):\n", | |
" print(i)" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"text": [ | |
"<a class=\"language\" href=\"https://docs.python.org/3/\" id=\"python\">Python</a>\n", | |
"<a class=\"language\" href=\"https://docs.oracle.com/en/java/\" id=\"java\">Java</a>\n", | |
"<a class=\"language\" href=\"https://golang.org/doc/\" id=\"golang\">Golang</a>\n" | |
], | |
"name": "stdout" | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "NAZdJPbVWmh-" | |
}, | |
"source": [ | |
"The `select()` method of `soup` receives one argument:\n", | |
"\n", | |
"- `'a.language'`: a CSS selector that matches every anchor tag with the `language` class." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "FHbaY0gLXr1j", | |
"outputId": "336337cb-187a-4b3d-a406-2d615d6b42e7" | |
}, | |
"source": [ | |
"for i in soup.find_all('a', {'class': 'language'}):\n", | |
" print(i)" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"text": [ | |
"<a class=\"language\" href=\"https://docs.python.org/3/\" id=\"python\">Python</a>\n", | |
"<a class=\"language\" href=\"https://docs.oracle.com/en/java/\" id=\"java\">Java</a>\n", | |
"<a class=\"language\" href=\"https://golang.org/doc/\" id=\"golang\">Golang</a>\n" | |
], | |
"name": "stdout" | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "vjaKXudBXx-D" | |
}, | |
"source": [ | |
"Passing the dictionary `{'class': 'language'}` to `find_all()` lists all anchor tags whose `class` attribute is `language`." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "pjyMAxX8YOkP" | |
}, | |
"source": [ | |
"## Get First Anchor Tag with Language Class" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "grIyvYhHYj5T", | |
"outputId": "e06ad087-5e4a-413e-eefc-0d8197849331" | |
}, | |
"source": [ | |
"for i in soup.find('a', class_='language'):\n", | |
" print(i)" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"text": [ | |
"Python\n" | |
], | |
"name": "stdout" | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "qPWQAucfYpbU" | |
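}, | |
"source": [ | |
"The `find()` method returns only the first anchor tag with the `language` class. Iterating over that single tag with a `for` loop yields its children, here the one text node `Python`, which is why the label is printed rather than the whole tag." | |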
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "N_54NtMpV7hl" | |
}, | |
"source": [ | |
"## Type of Find()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "2mpc8c8AWBEM", | |
"outputId": "e76cb9fd-a468-499f-c9d1-019c12fc5bdb" | |
}, | |
"source": [ | |
"type(soup.find('h1'))" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"bs4.element.Tag" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 10 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "akCZrmveW5zu" | |
}, | |
"source": [ | |
"## Attrs Argument" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "2ShbUrcTW8uq", | |
"outputId": "21103000-b7d2-4e52-ddb1-f892dab3f7bf" | |
}, | |
"source": [ | |
"soup.find_all('a', attrs={'id': 'java', 'class': 'language'})" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"[<a class=\"language\" href=\"https://docs.oracle.com/en/java/\" id=\"java\">Java</a>]" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 11 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "8SUC90bFYxz3" | |
}, | |
"source": [ | |
"The `attrs` argument of `find_all()` takes a dictionary of attribute names and values. Here an anchor tag must match both the `id` and the `class` entries, which makes the search more specific." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "kwyZlz65XpSA" | |
}, | |
"source": [ | |
"## String Argument" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "zDewBhnPXs_m", | |
"outputId": "e03a00fc-30cf-4027-9fbe-c631a6caca7e" | |
}, | |
"source": [ | |
"soup.find_all(string=[\"Python\", \"Java\", \"Golang\"])" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"['Python', 'Java', 'Golang']" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 12 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "mkFacwj1YWSE" | |
}, | |
"source": [ | |
"The `string` argument makes `find_all()` search text nodes instead of tags. Given a list, it returns every string that exactly matches one of the items. In this case, they are the three anchor tag **labels** for three computer languages." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "fsij1h66X4O9" | |
}, | |
"source": [ | |
"## Limit Argument" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "UuqUVx8nX6Qz", | |
"outputId": "1a4c7903-3660-41a5-e2ce-3d5b44847ff2" | |
}, | |
"source": [ | |
"soup.find_all('a', limit=2)" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"[<a class=\"language\" href=\"https://docs.python.org/3/\" id=\"python\">Python</a>,\n", | |
" <a class=\"language\" href=\"https://docs.oracle.com/en/java/\" id=\"java\">Java</a>]" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 13 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "ZxI2ttepiImz" | |
}, | |
"source": [ | |
"The `limit` argument caps the number of elements that the `find_all()` method returns." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "gJ4JQmnM2h7A" | |
}, | |
"source": [ | |
"## Text Property" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "rX_NSb842kIV", | |
"outputId": "05af1d77-72ae-415c-bc0e-16b1af3ba1fa" | |
}, | |
"source": [ | |
"for i in soup.find_all('a'):\n", | |
" print(i.text)" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"text": [ | |
"Python\n", | |
"Java\n", | |
"Golang\n" | |
], | |
"name": "stdout" | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "E9-IyAkR28mk" | |
}, | |
"source": [ | |
"The `text` property extracts the **text labels** from anchor tags." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "R-FKZo9Kl3Yp" | |
}, | |
"source": [ | |
"## Additional Methods" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "NQdQjW9Il_Bz", | |
"outputId": "9f6c7071-bd39-4ecd-d09e-8d260bf24dce" | |
}, | |
"source": [ | |
"current = soup.find('a', id='java')\n", | |
"current.find_parent()" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"<p class=\"Main\">Learning\n", | |
" <a class=\"language\" href=\"https://docs.python.org/3/\" id=\"python\">Python</a>, \n", | |
" <a class=\"language\" href=\"https://docs.oracle.com/en/java/\" id=\"java\">Java</a> and \n", | |
" <a class=\"language\" href=\"https://golang.org/doc/\" id=\"golang\">Golang</a> are fun!\n", | |
" </p>" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 15 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "w-aNRoc1mL7u", | |
"outputId": "3540683f-b2be-4268-c70f-8e092763b7b4" | |
}, | |
"source": [ | |
"current.find_parents()" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"[<p class=\"Main\">Learning\n", | |
" <a class=\"language\" href=\"https://docs.python.org/3/\" id=\"python\">Python</a>, \n", | |
" <a class=\"language\" href=\"https://docs.oracle.com/en/java/\" id=\"java\">Java</a> and \n", | |
" <a class=\"language\" href=\"https://golang.org/doc/\" id=\"golang\">Golang</a> are fun!\n", | |
" </p>, <body>\n", | |
" <h1>Searching Parse Tree In BeautifulSoup</h1>\n", | |
" <p class=\"Main\">Learning\n", | |
" <a class=\"language\" href=\"https://docs.python.org/3/\" id=\"python\">Python</a>, \n", | |
" <a class=\"language\" href=\"https://docs.oracle.com/en/java/\" id=\"java\">Java</a> and \n", | |
" <a class=\"language\" href=\"https://golang.org/doc/\" id=\"golang\">Golang</a> are fun!\n", | |
" </p>\n", | |
" <p class=\"Secondary\">\n", | |
" <b>Please subscribe!</b>\n", | |
" </p>\n", | |
" <p class=\"Secondary\" id=\"finxter\">\n", | |
" <b>copyright - FINXTER</b>\n", | |
" </p>\n", | |
" </body>, <html>\n", | |
" <head>\n", | |
" <title>Searching Tree</title>\n", | |
" </head>\n", | |
" <body>\n", | |
" <h1>Searching Parse Tree In BeautifulSoup</h1>\n", | |
" <p class=\"Main\">Learning\n", | |
" <a class=\"language\" href=\"https://docs.python.org/3/\" id=\"python\">Python</a>, \n", | |
" <a class=\"language\" href=\"https://docs.oracle.com/en/java/\" id=\"java\">Java</a> and \n", | |
" <a class=\"language\" href=\"https://golang.org/doc/\" id=\"golang\">Golang</a> are fun!\n", | |
" </p>\n", | |
" <p class=\"Secondary\">\n", | |
" <b>Please subscribe!</b>\n", | |
" </p>\n", | |
" <p class=\"Secondary\" id=\"finxter\">\n", | |
" <b>copyright - FINXTER</b>\n", | |
" </p>\n", | |
" </body>\n", | |
" </html>, \n", | |
" <html>\n", | |
" <head>\n", | |
" <title>Searching Tree</title>\n", | |
" </head>\n", | |
" <body>\n", | |
" <h1>Searching Parse Tree In BeautifulSoup</h1>\n", | |
" <p class=\"Main\">Learning\n", | |
" <a class=\"language\" href=\"https://docs.python.org/3/\" id=\"python\">Python</a>, \n", | |
" <a class=\"language\" href=\"https://docs.oracle.com/en/java/\" id=\"java\">Java</a> and \n", | |
" <a class=\"language\" href=\"https://golang.org/doc/\" id=\"golang\">Golang</a> are fun!\n", | |
" </p>\n", | |
" <p class=\"Secondary\">\n", | |
" <b>Please subscribe!</b>\n", | |
" </p>\n", | |
" <p class=\"Secondary\" id=\"finxter\">\n", | |
" <b>copyright - FINXTER</b>\n", | |
" </p>\n", | |
" </body>\n", | |
" </html>]" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 16 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "3PVxbADvmRGm", | |
"outputId": "c933301b-e0bb-4957-e406-27f751862874" | |
}, | |
"source": [ | |
"current.find_previous_sibling()" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"<a class=\"language\" href=\"https://docs.python.org/3/\" id=\"python\">Python</a>" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 17 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "T1AewC_OmVQU", | |
"outputId": "29ab09e8-831f-4c12-f4c5-4169c1391eaa" | |
}, | |
"source": [ | |
"current.find_previous_siblings()" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"[<a class=\"language\" href=\"https://docs.python.org/3/\" id=\"python\">Python</a>]" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 18 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "W5_ImHNKmZTr", | |
"outputId": "4ded964f-8800-440b-eafc-49a824a56db9" | |
}, | |
"source": [ | |
"current.find_next()" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"<a class=\"language\" href=\"https://golang.org/doc/\" id=\"golang\">Golang</a>" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 19 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "ZJZIQ6MAmff1", | |
"outputId": "4fe0b02a-cdd2-41e0-b1c9-5d7ab073712b" | |
}, | |
"source": [ | |
"current.find_all_next()" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"[<a class=\"language\" href=\"https://golang.org/doc/\" id=\"golang\">Golang</a>,\n", | |
" <p class=\"Secondary\">\n", | |
" <b>Please subscribe!</b>\n", | |
" </p>,\n", | |
" <b>Please subscribe!</b>,\n", | |
" <p class=\"Secondary\" id=\"finxter\">\n", | |
" <b>copyright - FINXTER</b>\n", | |
" </p>,\n", | |
" <b>copyright - FINXTER</b>]" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 20 | |
} | |
] | |
} | |
] | |
} |