snoop2head
Morning: 205 commits (16.6%)
Daytime: 390 commits (31.5%)
Evening: 372 commits (30.1%)
Night: 270 commits (21.8%)
import numpy as np
import pandas as pd
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
import csv
from urllib.parse import urlparse
def crwl_as_csv(univ_query):
    page = 1
import re
from urllib.request import urlopen
from bs4 import BeautifulSoup
from xlwt import Workbook
wb = Workbook()
sheet1 = wb.add_sheet('Sheet 1', cell_overwrite_ok=True)
# HTML tag removal function
def remove_tag(content):
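
The body of `remove_tag` is not shown here. As a rough illustration of what an HTML tag removal helper might do, here is a minimal sketch that assumes a regex-based approach. The name `strip_tags_sketch` and the pattern are hypothetical and not taken from the original gist; for malformed or nested markup, a real parser such as BeautifulSoup's `get_text()` is likely the safer choice.

```python
import re
from html import unescape

# Hypothetical pattern: matches anything between "<" and the next ">".
# This is an assumption for illustration, not the original implementation.
_TAG_RE = re.compile(r"<[^>]+>")


def strip_tags_sketch(content: str) -> str:
    """Remove tag-like spans, decode HTML entities, and trim whitespace."""
    text = _TAG_RE.sub("", content)
    return unescape(text).strip()


if __name__ == "__main__":
    print(strip_tags_sketch("<p>Hello <b>world</b></p>"))
```

A regex like this is fine for quick cleanup of simple snippets, but it will not handle cases such as a literal `>` inside an attribute value.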