Last active August 16, 2022 20:58
Gist to read a Wikipedia table into a pandas DataFrame: bs4 locates the table in the page HTML and pandas parses it. Useful for data analysis in general.
import io

import bs4
import pandas as pd
import requests

# link to wiki page
wiki_link = 'https://en.wikipedia.org/wiki/List_of_chess_grandmasters'
# build the soup :P
soup = bs4.BeautifulSoup(requests.get(wiki_link).text, 'html.parser')
# get table by table ID from the wiki HTML. Use inspect element to find the ID instead of painfully going through the soup
# (the markup is wrapped in StringIO because newer pandas versions deprecate passing literal HTML strings)
table = pd.read_html(io.StringIO(str(soup.find('table', id='grandmasters'))))
# read_html returns a list of DataFrames; concatenate them into one
tableDF = pd.concat(table)
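
If the table ID is wrong or the page layout changes, `soup.find` returns `None` and the failure only surfaces later inside pandas. A minimal sketch of the same steps wrapped in a function that fails early is below. The name `table_by_id` and the sample HTML are made up for illustration and are not part of the gist; the function takes the HTML text directly, so it can be tried without a network request.

```python
import io

import bs4
import pandas as pd


def table_by_id(html: str, table_id: str) -> pd.DataFrame:
    """Return the <table> with the given id as a single DataFrame.

    Hypothetical helper: same bs4 + pandas steps as the gist,
    plus an explicit error when the table is not found.
    """
    soup = bs4.BeautifulSoup(html, 'html.parser')
    node = soup.find('table', id=table_id)
    if node is None:
        # Without this check, str(None) would be handed to pandas.
        raise ValueError(f'no <table> with id={table_id!r} found')
    # read_html returns a list of DataFrames, one per table it parses.
    frames = pd.read_html(io.StringIO(str(node)))
    return pd.concat(frames, ignore_index=True)


# Made-up sample page: two tables, only one carries the id we want.
sample_html = """
<html><body>
<table><tr><th>Ignored</th></tr><tr><td>x</td></tr></table>
<table id="players">
  <thead><tr><th>Name</th><th>Born</th></tr></thead>
  <tbody>
    <tr><td>Alice</td><td>1990</td></tr>
    <tr><td>Bob</td><td>1985</td></tr>
  </tbody>
</table>
</body></html>
"""

df = table_by_id(sample_html, 'players')
print(df)
```

For a live page, the call would presumably look like `table_by_id(requests.get(wiki_link).text, 'grandmasters')`, assuming the page still has a table with that ID.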