@DuaneR5280
Last active June 9, 2024 13:54
Extract Table from HTML element
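def table_extract(table):
    """
    Extracts data from a table in an HTML document.

    Args:
        table (HTML element): The table element to extract data from. It must
            provide css() and text() methods (assumed here to be something
            like a selectolax node).

    Returns:
        list of dict: A list of dictionaries, where each dictionary represents a row in the table.
            The keys of the dictionaries are the headers of the table, and the values are the cell values.
            Rows whose cell count differs from the header count are skipped.
    """
    data = []
    # Strip header text so the keys are cleaned the same way as the cell values.
    headers = [header.text().strip() for header in table.css("th")]
    for row in table.css("tr"):
        cells = row.css("td")
        # Skip rows with no <td> cells (such as the header row) and rows
        # whose cell count does not match the number of headers.
        if cells and len(cells) == len(headers):
            data.append({headers[i]: cell.text().strip() for i, cell in enumerate(cells)})
    return data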
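
To try the function without installing an HTML parser, here is a minimal sketch that uses only the standard library. `StubNode`, `SAMPLE_TABLE`, and `run_example` are hypothetical names introduced for illustration; they are not part of the gist or of any library. `StubNode` only mimics the two methods the function relies on, `css()` and `text()`, and it understands bare tag names rather than real CSS selectors.

```python
import xml.etree.ElementTree as ET


class StubNode:
    """Hypothetical stand-in for a parsed HTML element (not a real library class)."""

    def __init__(self, element):
        self._element = element

    def css(self, selector):
        # Assumption: only bare tag names ("th", "tr", "td") are supported.
        # Matches are returned in document order.
        return [StubNode(e) for e in self._element.iter(selector)]

    def text(self):
        # Concatenate all text inside the element, including nested tags.
        return "".join(self._element.itertext())


# Well-formed markup so the XML parser in the standard library accepts it.
SAMPLE_TABLE = """
<table>
  <tr><th>Name</th><th>Language</th></tr>
  <tr><td> Ada </td><td>Python</td></tr>
  <tr><td>Linus</td><td>C</td></tr>
  <tr><td>row with too few cells</td></tr>
</table>
"""


def run_example(extract):
    """Parse SAMPLE_TABLE and pass the stand-in table node to `extract`."""
    table = StubNode(ET.fromstring(SAMPLE_TABLE.strip()))
    return extract(table)
```

Calling `run_example(table_extract)` should return one dictionary for the Ada row and one for the Linus row, with the padding around "Ada" stripped. The header row and the one-cell row are skipped because their `td` count does not match the number of headers. With a real parser, you would pass the table node it returns in place of `StubNode`.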