@thehappycheese
Last active January 19, 2023 02:50
Fill blank rows with data from the nearest populated row, using chainage from/to locations
import numpy as np
import pandas as pd

# `df` is an existing DataFrame with the columns
# road_number, cway, slk_from, slk_to and cluster
for group_index, group in df.groupby(["road_number", "cway"]):
    blank_rows = group[group["cluster"].isna()]
    filled_rows = group[group["cluster"].notna()]
    if filled_rows.empty:
        # nothing to copy from in this group; idxmin() would raise on an empty Series
        continue
    for blank_row_index, blank_row in blank_rows.iterrows():
        # find distance by looking for the minimum "signed overlap"
        overlap_min = np.maximum(filled_rows["slk_from"], blank_row["slk_from"])
        overlap_max = np.minimum(filled_rows["slk_to"], blank_row["slk_to"])
        overlap = overlap_max - overlap_min
        # when overlap is positive, the filled row overlaps the blank row
        # when overlap is negative, the filled row is some distance away
        distance_to_filled_rows = np.minimum(overlap, 0).abs()
        # finally select the index label of the row with the minimum distance
        nearest_filled_row_id = distance_to_filled_rows.idxmin()
        # fill in the blank row with the value from the nearest filled row;
        # write to df itself, because iterrows() yields copies of each row
        df.loc[blank_row_index, "cluster"] = filled_rows.loc[nearest_filled_row_id, "cluster"]
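As a rough check of the approach, the sketch below wraps the same signed-overlap logic in a function and runs it on a small made-up table. The function name `fill_blank_clusters` and the sample values are assumptions for illustration only; they are not part of the original snippet.

```python
import numpy as np
import pandas as pd


def fill_blank_clusters(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with blank "cluster" values filled from the nearest filled row."""
    result = df.copy()
    for _, group in df.groupby(["road_number", "cway"]):
        blank_rows = group[group["cluster"].isna()]
        filled_rows = group[group["cluster"].notna()]
        if filled_rows.empty:
            continue  # no populated row to copy from in this group
        for blank_row_index, blank_row in blank_rows.iterrows():
            # signed overlap: >= 0 when intervals touch or overlap, < 0 when there is a gap
            overlap = (
                np.minimum(filled_rows["slk_to"], blank_row["slk_to"])
                - np.maximum(filled_rows["slk_from"], blank_row["slk_from"])
            )
            # gap size to each filled row (0 for touching or overlapping rows)
            distance = np.minimum(overlap, 0).abs()
            result.loc[blank_row_index, "cluster"] = filled_rows.loc[distance.idxmin(), "cluster"]
    return result


# hypothetical sample data: two blank rows between two populated rows
sample = pd.DataFrame({
    "road_number": ["H001"] * 4,
    "cway": ["L"] * 4,
    "slk_from": [0.0, 1.0, 4.0, 5.0],
    "slk_to": [1.0, 2.0, 5.0, 6.0],
    "cluster": ["A", None, None, "B"],
})
filled = fill_blank_clusters(sample)
print(filled["cluster"].tolist())  # ['A', 'A', 'B', 'B']
```

The row at 1.0 to 2.0 touches the "A" row (distance 0) and is 3.0 away from the "B" row, so it takes "A"; the row at 4.0 to 5.0 touches the "B" row, so it takes "B". When several filled rows are equally near, `idxmin()` returns the first of them.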