Last active
August 13, 2024 15:04
-
-
Save tchen/01d1d61a985190ff6b71fc14c45f95c9 to your computer and use it in GitHub Desktop.
openpyxl: dealing with merged cells
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# When exporting excel spreadsheets with merged cells, only the first cell of the merged cell has a value | |
# This snippet allow you to take the value from the first cell for all the other cells within the merged range | |
# Tested with openpyxl 3.0.7 as of 2021-06-17 | |
# | |
# References: | |
# https://stackoverflow.com/questions/39574991/how-to-detect-merged-cells-in-excel-with-openpyxl | |
# https://openpyxl.readthedocs.io/en/stable/api/openpyxl.worksheet.merge.html | |
import openpyxl | |
def parent_of_merged_cell(cell): | |
""" Find the parent of the merged cell by iterating through the range of merged cells """ | |
sheet = cell.parent | |
child_coord = cell.coordinate | |
# Note: if there are many merged cells in a large spreadsheet, this may become inefficient | |
for merged in sheet.merged_cells.ranges: | |
if child_coord in merged: | |
return merged.start_cell.coordinate | |
return None | |
def cell_value(cell): | |
""" Reads the value of a cell, if cell is within a merged cell, | |
find the first cell in the merged cell and get its value | |
""" | |
if isinstance(cell, openpyxl.cell.cell.Cell): | |
return cell.value | |
if isinstance(cell, openpyxl.cell.cell.MergedCell): | |
coord = parent_of_merged_cell(cell) | |
parent = cell.parent[coord] | |
return parent.value | |
workbook = openpyxl.load_workbook(filename) | |
sheet = workbook['Some Sheet'] | |
# Say A1:A4 are merged, only the first cell have a value | |
sheet[A1].value # has value | |
sheet[A2].value # is None | |
cell_value(sheet[A1]) # returns sheet[A1].value | |
cell_value(sheet[A2]) # returns sheet[A1].value | |
cell_value(sheet[A3]) # returns sheet[A1].value | |
cell_value(sheet[A4]) # returns sheet[A1].value |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Here's my take:
To read an Excel file where merged cells are filled in (in other words, the Pandas DataFrame values are all the same), I used the following code:
The StackOverflow.