Skip to content

Instantly share code, notes, and snippets.

@bennyistanto
Created January 6, 2023 15:52
Show Gist options
  • Save bennyistanto/1f0ccc86e5a13032ab2c2327393d49b1 to your computer and use it in GitHub Desktop.
Save bennyistanto/1f0ccc86e5a13032ab2c2327393d49b1 to your computer and use it in GitHub Desktop.
Reshape wide column structure to long following Stata-style CSV
import pandas as pd
# Load the data
min_df = pd.read_csv("min.csv")
mean_df = pd.read_csv("mean.csv")
max_df = pd.read_csv("max.csv")
# Melt the data frames to long format
min_long = pd.melt(min_df, id_vars=["lon", "lat", "orig_id"], var_name="date", value_name="min")
mean_long = pd.melt(mean_df, id_vars=["lon", "lat", "orig_id"], var_name="date", value_name="mean")
max_long = pd.melt(max_df, id_vars=["lon", "lat", "orig_id"], var_name="date", value_name="max")
# Merge the data frames
df = min_long.merge(mean_long, on=["lon", "lat", "orig_id", "date"])
df = df.merge(max_long, on=["lon", "lat", "orig_id", "date"])
# Extract year and month from the date column
df["year"] = df["date"].str[:4]
df["month"] = df["date"].str[4:6]
# Drop the date column
df = df.drop("date", axis=1)
# Rearrange the columns to the desired order
df = df[["lon", "lat", "orig_id", "year", "month", "min", "mean", "max"]]
# Save the resulting data frame to a Stata-style CSV file
df.to_csv("output.csv", index=False)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment