# Actual data: http://py4e-data.dr-chuck.net/comments_24964.html (Sum ends with 73)
from urllib import request
from bs4 import BeautifulSoup

# Fetch the page and parse it with Python's built-in HTML parser
html = request.urlopen('http://python-data.dr-chuck.net/comments_24964.html').read()
soup = BeautifulSoup(html, 'html.parser')

# Add up the number inside every <span> tag
tags = soup('span')
total = 0
for tag in tags:
    total = total + int(tag.contents[0])
print(total)
Here is how you can solve this.
Working code below 👍
README: Copy the actual data URL, run the file from cmd/terminal, and then paste the URL into the terminal when the script waits for input, like so:
#! /bin/python3
from urllib.request import urlopen
from bs4 import BeautifulSoup
import ssl

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

# Leave the URL empty for now. Paste the URL after running the file in cmd or terminal.
url = input("")
html = urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, "html.parser")

# Collect the number inside each <span> tag, then add them up
spans = soup('span')
numbers = []
for span in spans:
    numbers.append(int(span.string))
print(sum(numbers))
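Several snippets in this thread build an `ssl` context with verification switched off. As a minimal sketch of what those three lines change, the code below compares a default context with a relaxed one. The helper name `make_unverified_context` is made up for illustration; only standard-library calls are used.

```python
import ssl

def make_unverified_context():
    """Hypothetical helper: build a context that skips certificate checks."""
    ctx = ssl.create_default_context()
    # check_hostname has to be turned off before verify_mode is relaxed;
    # doing it the other way round raises a ValueError.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx

default_ctx = ssl.create_default_context()
relaxed_ctx = make_unverified_context()

# The default context verifies certificates and host names; the relaxed one does not.
print(default_ctx.check_hostname, default_ctx.verify_mode)
print(relaxed_ctx.check_hostname, relaxed_ctx.verify_mode)
```

Skipping verification is convenient for the course exercise, but it is probably not something to carry over to code that talks to sites you do not control.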
This is the error I am getting. Can anybody help?
Notes Regarding the Use of BeautifulSoup
The sample code for this course and textbook examples use BeautifulSoup to parse HTML.
Using BeautifulSoup 4 with Python 3.10 or Python 3.11
Instructions for Windows 10:
- pip install beautifulsoup4 (run this command)
- if the bs4.zip file was downloaded, delete it
Instructions for MacOS:
- pip3 install beautifulsoup4 (run this command)
- if the bs4.zip file was downloaded or you have a bs4 folder, delete it
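The advice to delete a leftover bs4 folder makes sense because the folder holding your script is searched before the installed packages, so a local `bs4` directory can shadow the pip-installed one. A small sketch for checking which copy Python would import is shown below; the helper name `locate_module` is an assumption made for this example, and it uses only the standard library.

```python
import importlib.util

def locate_module(name):
    """Hypothetical helper: return the file a module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # the module cannot be found at all
    return spec.origin

# If this prints a path inside your project folder rather than site-packages,
# a local bs4 folder is probably shadowing the pip-installed package.
print(locate_module("bs4"))
```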
Using BeautifulSoup 3 (only for Python 3.8 or Python 3.9)
If you want to use our samples "as is", download our Python 3 version of BeautifulSoup 3 from
http://www.py4e.com/code3/bs4.zip
You must unzip this into a "bs4" folder and have that folder as a sub-folder of the folder where you put our sample code like:
Hello, I tried this for the same question:
# Scraping Numbers from HTML using BeautifulSoup
from urllib.request import urlopen
from bs4 import BeautifulSoup
import ssl
import re

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

url = input('Enter - ')
html = urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, "html.parser")

# Retrieve all of the span tags
counts = dict()
my_list = list()
tags = soup('span')
for tag in tags:
    # Look at the parts of a tag
    num = str(tag)
    number = re.findall('[0-9]+', num)
    # Skip tags that do not contain exactly one number
    if len(number) != 1:
        continue
    for integer in number:
        integer = int(integer)
        my_list.append(integer)
        counts[integer] = counts.get(integer, 0) + 1
print('Count ', counts)
# or you can say
# print('Count ', len(my_list))
print('Sum ', sum(my_list))
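The regular-expression approach above runs `re.findall` on the whole tag text, attributes included, which is presumably why it skips tags that yield more than one match. A minimal sketch of that behaviour follows; the helper name `numbers_in` and both sample tags are made up for illustration.

```python
import re

def numbers_in(tag_text):
    """Hypothetical helper: every run of digits in the text, as integers."""
    return [int(n) for n in re.findall('[0-9]+', tag_text)]

# A tag shaped like the ones on the course page: one number, in the body
print(numbers_in('<span class="comments">97</span>'))      # → [97]

# An invented tag with digits in its attributes: those digits match as well
print(numbers_in('<span class="c2" id="row7">41</span>'))  # → [2, 7, 41]
```

Reading the number from the tag's contents instead of its full text avoids picking up digits from attributes.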
For Windows users: follow the instructions given by the instructor in the discussion forum, and then the code at the top will work for you too.
https://www.coursera.org/learn/python-network-data/discussions/forums/G0TMJ6G0EeqqMhL7huUnrQ/threads/Fi07MzG0EeymZRIVts3h3w
import urllib.request
from bs4 import BeautifulSoup

# Prompt user for URL
url = input('Enter URL: ')

# Read HTML from URL
html = urllib.request.urlopen(url).read()

# Parse the HTML using BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')

# Find all span tags
tags = soup('span')

# Sum up the numbers
total = 0
for tag in tags:
    total += int(tag.contents[0])

# Print the sum
print(total)
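The snippets in this thread read the number either with `tag.contents[0]` or with `span.string`. As a rough sketch of how the two relate, the example below parses a made-up HTML string instead of the course page, so it runs without a network connection. It assumes `beautifulsoup4` is installed; the sample markup and variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Made-up markup: two plain spans and one span with a nested tag
sample = (
    '<span class="comments">97</span>'
    '<span class="comments">3</span>'
    '<span class="comments"><b>5</b>0</span>'
)
soup = BeautifulSoup(sample, "html.parser")

# Calling the soup object is shorthand for find_all()
spans = soup("span")
same_as_find_all = spans == soup.find_all("span")

# For a span holding only text, both accessors give the same string
first = spans[0]
print(first.contents[0], first.string)

# For a span with more than one child, .string is None,
# so int(span.string) would raise a TypeError there
nested = spans[2]
print(nested.string)

# Summing only the spans that hold plain text
total = sum(int(s.string) for s in spans if s.string is not None)
print(total)
```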
import urllib.request, urllib.parse, urllib.error
from bs4 import BeautifulSoup
import ssl
import re

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

url = input('ENTER URL:')  # http://py4e-data.dr-chuck.net/comments_1692181.html
fhand = urllib.request.urlopen(url, context=ctx).read()
soup = BeautifulSoup(fhand, 'html.parser')
# print(soup)

# Retrieve all of the span tags
tags = soup('span')
lst = list()
for tag in tags:
    tag = str(tag)
    # print(tag)
    tag2 = re.findall('[0-9]+', tag)
    tag3 = int(tag2[0])
    lst.append(tag3)
# print(lst)
total = sum(lst)
print(total)
Thank you for your help, it works.
I tried it in VS Code and got a traceback, and now in Jupyter the result is always 0.
Delete the bs4 zip file and the extracted bs4 folder, then install BeautifulSoup from your command prompt by typing:
pip install beautifulsoup4
This is the best way to answer. Run it in the terminal. Thank you.
from urllib.request import urlopen
from bs4 import BeautifulSoup
import ssl

scy = ssl.create_default_context()
scy.check_hostname = False
scy.verify_mode = ssl.CERT_NONE

ur = input("pls input url")
html = urlopen(url, context=scy).read()
soup = BeautifulSoup(html, "html.parser")
s = 1
tags = soup('span')
for t in tags:
    lines = ("contents:", t.contents[0])
    num = list(lines)
    x = int(num[1])
    if x > 0:
        newsum = newsum + x
print(newsum)
I need help. It keeps saying I am wrong. Does anyone know what I did wrong?
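One possible reading of the snippet above, offered as a sketch rather than a definitive diagnosis: the input is stored in `ur` but `urlopen` is called with `url`, which would raise a `NameError`, and `newsum` is used inside the loop before it is ever given a starting value (the `s = 1` line is never used). The version below fixes the running total and is tested on made-up markup so it runs offline. It assumes `beautifulsoup4` is installed; the function name `sum_span_numbers` and the sample HTML are illustrative.

```python
from bs4 import BeautifulSoup

def sum_span_numbers(html):
    """Hypothetical helper: add up the integers inside the <span> tags."""
    soup = BeautifulSoup(html, "html.parser")
    total = 0  # give the running total a starting value before the loop
    for tag in soup("span"):
        # contents[0] is the text inside the tag, so no tuple or list is needed
        total += int(tag.contents[0])
    return total

# Made-up sample standing in for the downloaded page
sample = (
    '<tr><td>A</td><td><span class="comments">97</span></td></tr>'
    '<tr><td>B</td><td><span class="comments">3</span></td></tr>'
)
print(sum_span_numbers(sample))  # → 100
```

With the real page, the same function could be called on `urlopen(url, context=scy).read()`, provided the value from `input()` is stored under the same name, `url`, that is passed to `urlopen`.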
# Enhanced command list with categories
COMMANDS = {
'System Information': {
'/os': 'Show operating system info',
'/ip': 'Show local IP address',
'/hostname': 'Show device hostname',
'/whoami': 'Show current user',
'/uptime': 'Show system uptime',
'/cpu': 'Show CPU information',
'/ram': 'Show RAM usage',
'/disk': 'Show disk space',
'/battery': 'Show battery status',
'/sysinfo': 'Detailed system information',
},
'File Operations': {
'/ls [path]': 'List directory contents',
'/cd [path]': 'Change directory',
'/pwd': 'Print working directory',
'/cat [file]': 'Show file contents',
'/mkdir [name]': 'Create directory',
'/rm [file]': 'Remove file',
'/download [file]': 'Download a file',
},
'Network': {
'/ping [host]': 'Ping a host',
'/netstat': 'Show network connections',
'/publicip': 'Show public IP address',
'/speedtest': 'Run internet speed test',
'/traceroute [host]': 'Trace route to host',
},
'Utilities': {
'/time': 'Show current time',
'/date': 'Show current date',
'/calendar': 'Show current month calendar',
'/random [min-max]': 'Generate random number',
'/calc [expression]': 'Simple calculator',
'/qr [text]': 'Generate QR code',
'/weather [city]': 'Get weather forecast',
},
'Entertainment': {
'/joke': 'Tell a random joke',
'/quote': 'Show inspirational quote',
'/fact': 'Show interesting fact',
'/trivia': 'Show trivia question',
'/meme': 'Show random meme',
},
'Bot Control': {
'/start': 'Show command list',
'/help': 'Show help information',
'/status': 'Show bot status',
'/restart': 'Restart the bot (admin)',
'/stop': 'Stop the bot (admin)',
}
}
# Jokes, quotes, facts databases
JOKES = [
"Why don't scientists trust atoms? Because they make up everything!",
"Did you hear about the mathematician who's afraid of negative numbers? He'll stop at nothing to avoid them.",




I work with PyCharm using Python 3.11 and encountered a similar issue after installing bs4.
Implemented @Nikowos's solution and it works! Thanks