Last active April 15, 2021 10:12
Rank child items within each parent item in a pandas DataFrame, following the source order
import pandas as pd

# Consider a list of movies with unranked actors that you want to rank
# per movie, following the order of the source list below.
# Each entry is [movie_id, actor_id].
movies_list = [
    [123, 54],
    [123, 21],
    [123, 66],
    [45, 22],
    [45, 54],
    [61, 87],
    [61, 21],
    [88, 21],
]

movies_df = pd.DataFrame(movies_list, columns=["movie_id", "actor_id"])

# Build a global rank for each row first (its position in the source list)
movies_df["ranking"] = range(len(movies_df))

# Build the per-movie rank from the global rank defined above.
# method="first" ranks by order of appearance; rank() returns floats.
movies_df["ranking"] = movies_df.groupby("movie_id")["ranking"].rank(method="first")

# Convert the rank from float to int
movies_df["ranking"] = movies_df["ranking"].astype(int)

print(movies_df)
#    movie_id  actor_id  ranking
# 0       123        54        1
# 1       123        21        2
# 2       123        66        3
# 3        45        22        1
# 4        45        54        2
# 5        61        87        1
# 6        61        21        2
# 7        88        21        1
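
The two-step approach above (a global row index followed by a grouped rank) can probably be collapsed into a single call. The sketch below is an alternative, not the method used in this gist: it assumes the rows are already in the desired order and uses `GroupBy.cumcount()`, which numbers the rows of each group from 0 in order of appearance and returns integers, so no float-to-int conversion is needed.

```python
import pandas as pd

# Same sample data as above: each entry is [movie_id, actor_id]
movies_list = [
    [123, 54],
    [123, 21],
    [123, 66],
    [45, 22],
    [45, 54],
    [61, 87],
    [61, 21],
    [88, 21],
]

movies_df = pd.DataFrame(movies_list, columns=["movie_id", "actor_id"])

# cumcount() numbers the rows within each movie_id group 0, 1, 2, ...
# in their order of appearance; adding 1 makes the ranking start at 1.
movies_df["ranking"] = movies_df.groupby("movie_id").cumcount() + 1

print(movies_df)
```

This relies on the DataFrame order being the ranking order. If the ranking should instead follow the values of some column, sorting by that column first (or keeping `rank()`) may be the better fit.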