Created October 3, 2016 02:14
html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
"""

from bs4 import BeautifulSoup

# Parse the document with Python's built-in HTML parser.
soup = BeautifulSoup(html_doc, 'html.parser')

# Find every text node whose content is exactly 'Elsie'.
# `string=` is the current name for the older `text=` argument (bs4 >= 4.4.0).
result = soup.find_all(string='Elsie')

# Print the tag that encloses each matching text node (here, the first <a>).
print([item.parent for item in result])