@Aveek-Saha
Created May 14, 2019 18:37
A Python script that scrapes today's trending Twitter hashtags from https://trendogate.com
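from bs4 import BeautifulSoup
# Import the submodule explicitly: a bare "import urllib" does not reliably expose urllib.request
import urllib.request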
URL = "https://trendogate.com/place/23424977"
# Open the URL
page = urllib.request.Request(URL)
result = urllib.request.urlopen(page)
# Store the HTML page in a variable
resulttext = result.read()
# Creates a nested data structure
soup = BeautifulSoup(resulttext, 'html.parser')
# Since we are interested only in an element with class "list-group"
# We will search for all elements with that class in the soup
soup = soup.find_all(class_= "list-group")
# soup now holds a list of all elements with the class "list-group"
# Since the "Trending today" list is the first on the page, its index is 0
trending_list = soup[0]
# Now we will iterate through the <li> tags nested inside the <ul>
# find_all("li") skips any whitespace text nodes between the items
trending_tags = []
for li in trending_list.find_all("li"):
    # There is an <a> tag nested in each <li>
    a = li.find("a")
    if a is None:
        continue
    # The hashtag is just the text inside the <a> tag
    tag = a.get_text().strip()
    trending_tags.append(tag)
print(trending_tags)
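
The extraction step above can also be tried offline, without hitting the site. The following is a minimal sketch that swaps BeautifulSoup for the standard library's `html.parser`, so it is not the script's own method. The names `TrendListParser`, `extract_trending_tags`, and `SAMPLE_HTML` are hypothetical, and the sample markup only assumes the `<ul class="list-group">` / `<li>` / `<a>` nesting described in the comments; the real page may differ.

```python
from html.parser import HTMLParser


class TrendListParser(HTMLParser):
    """Collect <a> text from the first element whose class is "list-group"."""

    def __init__(self):
        super().__init__()
        self.tags = []           # link texts collected so far
        self._container = None   # tag name of the first "list-group" element
        self._depth = 0          # nesting depth of that tag name
        self._done = False       # True once the first list has been closed
        self._in_link = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        if self._container is None:
            classes = (dict(attrs).get("class") or "").split()
            if "list-group" in classes:
                self._container = tag
                self._depth = 1
        elif tag == self._container:
            self._depth += 1
        elif tag == "a":
            self._in_link = True
            self._buffer = []

    def handle_endtag(self, tag):
        if self._done or self._container is None:
            return
        if tag == "a" and self._in_link:
            self._in_link = False
            text = "".join(self._buffer).strip()
            if text:
                self.tags.append(text)
        elif tag == self._container:
            self._depth -= 1
            if self._depth == 0:
                self._done = True

    def handle_data(self, data):
        if self._in_link:
            self._buffer.append(data)


def extract_trending_tags(html):
    """Return the link texts found in the first "list-group" element."""
    parser = TrendListParser()
    parser.feed(html)
    parser.close()
    return parser.tags


# Hypothetical markup, shaped like the structure the script expects
SAMPLE_HTML = """
<ul class="list-group">
  <li class="list-group-item"><a href="/trend/1"> #Python </a></li>
  <li class="list-group-item"><a href="/trend/2">#OpenSource</a></li>
</ul>
<ul class="list-group"><li><a href="/trend/3">#Ignored</a></li></ul>
"""

if __name__ == "__main__":
    print(extract_trending_tags(SAMPLE_HTML))
```

Keeping the parsing in a function that takes an HTML string makes it possible to check the logic against saved pages before pointing it at the live URL.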