Created
July 26, 2014 03:33
from bs4 import BeautifulSoup
import re
data = """
<div>
<p>D: string-1.string2 15030 9h7a2m string3.string<br/>
D: string-1.string2 15030 9h7a2m string3.string<br/>
D: string-1.string2 15030 9h7a2m string3.string</p>
<p><span id="more-1203"></span></p>
<p>D: string-1.string2 15030 9h7a2m string3.string<br/>
D: string-1.string2 15030 9h7a2m string3.string<br/>
D: string-1.string2 15030 9h7a2m string3.string<br/>
D: string-1.string2 15030 9h7a2m string3.string<br/>
<p>pinging test is positive but no works</p>
<p>how much time are online?</p>
<p><input aria-required="true" id="author" name="author" size="22" tabindex="1" type="text" value=""/>
<label for="author"><small>Name (required)</small></label></p>
<p><input aria-required="true" id="email" name="email" size="22" tabindex="2" type="text" value=""/>
<label for="email"><small>Mail (will not be published) (required)</small></label></p>
<p><input id="url" name="url" size="22" tabindex="3" type="text" value=""/>
<label for="url"><small>Website</small></label></p>
<p><textarea cols="100%" id="comment" name="comment" rows="10" tabindex="4"></textarea></p>
<p><input id="submit" name="submit" tabindex="5" type="submit" value="Submit Comment"/>
<input id="comment_post_ID" name="comment_post_ID" type="hidden" value="41"/>
<input id="comment_parent" name="comment_parent" type="hidden" value="0"/>
</p>
<p style="display: none;"><input id="akismet_comment_nonce" name="akismet_comment_nonce" type="hidden" value="1709964457"/></p>
<p style="display: none;"><input id="ak_js" name="ak_js" type="hidden" value="99"/></p>
</div>
"""
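# Name the parser explicitly so the result does not depend on which parsers are installed.
soup = BeautifulSoup(data, 'html.parser')
# The text node just before the first <br/> in the first <p> is the first "D: ..." line.
s = soup.find('p').br.previous_sibling
# Raw string avoids invalid-escape warnings; group 1 captures the token after the number.
match = re.search(r'string-1\.string2 \d+ (\w+) string3\.string', s)
print(match.group(1))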