MichelleDalalJian/4d630b054e647b2d61a5ed9bcc385f10
Created October 7, 2017 14:48
Extracting Data With Regular Expressions
Finding Numbers in a Haystack
In this assignment you will read through and parse a file with text and numbers. You will extract all the numbers in the file and compute the sum of the numbers.
Data Files
We provide two files for this assignment. One is a sample file where we give you the sum for your testi…
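import re

# Open the data file and collect every run of digits, line by line
numbers = []
with open("regex_sum_24962.txt") as hand:
    for line in hand:
        numbers += re.findall('[0-9]+', line)

# Convert the matched strings to integers and add them up
# (named total so the built-in sum() is not shadowed)
total = 0
for value in numbers:
    total += int(value)
print(total)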
My prompt was
"Actual data: http://py4e-data.dr-chuck.net/regex_sum_2076089.txt (There are 79 values and the sum ends with 942)"
```python
import re
import urllib.request
url = 'https://py4e-data.dr-chuck.net/regex_sum_2076089.txt'
response = urllib.request.urlopen(url)
data = response.read().decode()
numbers = re.findall('[0-9]+', data)
total_sum = sum(int(num) for num in numbers)
print("Sum:", total_sum)
```
Sum = 391942
This one works
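The hint in the prompt (79 values, sum ending in 942) gives a quick way to sanity-check a result before submitting it. A minimal sketch, assuming a hypothetical helper named `matches_hint` that is not part of the assignment, run here on a made-up sample string instead of the real data file:

```python
import re

def matches_hint(text, expected_count, expected_suffix):
    """Return True if the numbers in text match the assignment's hint.

    matches_hint is a hypothetical helper for illustration only.
    """
    numbers = [int(n) for n in re.findall('[0-9]+', text)]
    total = sum(numbers)
    # Compare the number of values and the trailing digits of the sum
    return len(numbers) == expected_count and str(total).endswith(expected_suffix)

# Small in-memory sample (not the real data file): 3 + 40 + 899 = 942
sample = "Why should you learn 3 things, 40 lines and 899 words?"
print(matches_hint(sample, 3, "942"))  # prints True
```

For the real file, pass the downloaded text along with the count and suffix from your own prompt.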
I'm not even sure how to open this file. I tried converting the data into a txt file and opening it through terminal but it just keeps saying it can't find it.
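A "can't find it" error usually means Python is running from a different folder than the one that holds the text file. A minimal sketch for checking this, assuming the file was saved under the name used in the gist (`regex_sum_24962.txt`) and using a hypothetical helper `find_data_file`:

```python
import os

def find_data_file(filename):
    """Return the absolute path of filename if it is an existing file, else None.

    find_data_file is a hypothetical helper for illustration only.
    """
    path = os.path.abspath(filename)
    return path if os.path.isfile(path) else None

name = "regex_sum_24962.txt"  # assumed file name; use whatever you saved it as
print("Python is looking in:", os.getcwd())
found = find_data_file(name)
if found is None:
    print("Not found. Move the file into the folder above, or pass its full path to open().")
else:
    print("Found:", found)
```

If the file is reported as not found, either `cd` into the folder that contains it before running the script, or skip the download entirely and read the URL with `urllib.request` as in the other comments.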
Actual data: http://py4e-data.dr-chuck.net/regex_sum_2301852.txt (There are 87 values and the sum ends with 524) I need help with this
import re
import urllib.request
# Step 1: Fetch data from the URL
url = 'http://py4e-data.dr-chuck.net/regex_sum_2301852.txt'
data = urllib.request.urlopen(url).read().decode()
# Step 2: Use regex to find all numbers
numbers = re.findall('[0-9]+', data)
# Step 3: Convert to integers and compute the sum
numbers = [int(num) for num in numbers]
print('Count:', len(numbers))
print('Sum:', sum(numbers))
*Gabrile Emuron*
*Explanation:*
- re.findall('[0-9]+', data) finds all sequences of digits.
- [int(num) for num in numbers] converts them to integers.
- sum(numbers) adds them all up.
When you run this code, you’ll get:
Count: 87
Sum: 50524
So the final *sum = 50524*
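To see what each of those three steps produces, they can be run on a short in-memory string instead of the downloaded file. The sample text below is made up for illustration:

```python
import re

sample = "There are 3 apples, 12 pears\nand route66 is 2448 miles long."

matches = re.findall('[0-9]+', sample)  # list of digit strings
values = [int(m) for m in matches]      # convert each string to an int
print(matches)      # ['3', '12', '66', '2448']
print(values)       # [3, 12, 66, 2448]
print(sum(values))  # 2529
```

Note that `[0-9]+` also picks up digits embedded in a word (the `66` in `route66`), which is what the assignment wants since every number in the file counts.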
This week’s project will focus on extracting meaningful information from a typical log file.
Please use the file redhat.txt for this assignment.
Your assignment is to develop a Python script that:
- Prompts the user for a file to process.
- Opens the file and iterates through each line of the file, catching and reporting any errors that occur.
- For each line in the file:
  - If the line identifies a specific worm that was detected, records the unique name of the worm.
  - Keeps track of the number of occurrences of each unique worm.
- Once all the lines have been processed, produces a prettytable result that includes the name of each unique worm and the number of occurrences that were identified. The prettytable should be sorted by highest-occurring worms.
Your script must be commented in detail.
Submit one file, logScan.py. In addition, submit a screenshot of successful execution of the script.
Any help as soon as possible.
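One possible starting point for the steps above, offered as a rough sketch and not a finished logScan.py. The format of redhat.txt is not shown in this thread, so `WORM_PATTERN` is a pure assumption that must be adjusted to the real log lines; the function names and the sample lines are also hypothetical. The third-party `prettytable` package is used if installed, with a plain-text fallback otherwise.

```python
import re
from collections import Counter

# ASSUMPTION: the real layout of redhat.txt is unknown here. This hypothetical
# pattern matches text such as "worm: Slammer"; change it to fit the actual log.
WORM_PATTERN = re.compile(r'worm\s*[:=]\s*(\S+)', re.IGNORECASE)

def count_worms(lines):
    """Count occurrences of each worm name found in an iterable of lines."""
    counts = Counter()
    for line in lines:
        match = WORM_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

def render_table(counts):
    """Return the counts as text, most frequent worm first."""
    rows = counts.most_common()  # sorted by count, highest first
    try:
        from prettytable import PrettyTable  # third-party: pip install prettytable
    except ImportError:
        # Fallback so the sketch still runs without prettytable installed
        return "\n".join(f"{name}: {count}" for name, count in rows)
    table = PrettyTable(["Worm", "Occurrences"])
    for name, count in rows:
        table.add_row([name, count])
    return table.get_string()

def main():
    """Prompt for a file, scan it, and print the table; report file errors."""
    path = input("File to process: ")
    try:
        with open(path, "r", errors="replace") as log:
            counts = count_worms(log)
    except OSError as err:
        print("Could not process file:", err)
        return
    print(render_table(counts))

# Demo on made-up lines (not taken from redhat.txt); call main() for the real file
sample_lines = [
    "Jan 1 10:00:01 host scanner: worm: Slammer blocked",
    "Jan 1 10:00:05 host scanner: worm: CodeRed blocked",
    "Jan 1 10:00:09 host scanner: worm: Slammer blocked",
    "Jan 1 10:00:12 host sshd: session opened",
]
counts = count_worms(sample_lines)
print(render_table(counts))
```

`Counter.most_common()` handles the "sorted by highest occurring" requirement, so no separate sort step is needed. To process the actual file, replace the demo at the bottom with a call to `main()` once the pattern matches the real log lines.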