@amontalenti
Last active August 29, 2015 14:23
,feature,area
0,A,32.5
1,A,45.6
2,A,42.1
3,B,1.5
4,B,6.08
5,B,5.1
6,C,5.9
7,C,16.5
8,C,32.5
9,D,45.6
10,D,42.1
11,D,6.08
import pandas as pd

# load the data; the first column is the row index
df = pd.read_csv("data.csv", index_col=0)
# calculate sum of areas per feature
sums = df.groupby("feature").area.sum()
df["sums"] = df.feature.apply(
    # look up pre-computed sum, fill in every row
    lambda x: sums[x])
df["pct"] = df.apply(
    # calculate percentage of each area from its sum
    lambda row: "{:.2%}".format(row["area"] / row["sums"]),
    # apply to rows
    axis=1)
# print the result as CSV
print(df.to_csv())
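The per-row lookup in the script above can also be expressed with a grouped transform, which broadcasts each feature's sum back onto its rows in one step. The following is a minimal sketch of that alternative, not the author's method: the `CSV_TEXT` constant and the `add_pct` helper are hypothetical names introduced for illustration, and the data is inlined (assuming the first CSV column is the row index) so the example runs without `data.csv`.

```python
import io

import pandas as pd

# Inline copy of the sample data, assuming the first column is the row index.
CSV_TEXT = """,feature,area
0,A,32.5
1,A,45.6
2,A,42.1
3,B,1.5
4,B,6.08
5,B,5.1
6,C,5.9
7,C,16.5
8,C,32.5
9,D,45.6
10,D,42.1
11,D,6.08
"""


def add_pct(df):
    """Return a copy of df with per-feature sums and percentage strings."""
    out = df.copy()
    # transform("sum") returns one value per row: the sum of that row's group
    out["sums"] = out.groupby("feature")["area"].transform("sum")
    # divide column-wise, then format each ratio as a percentage string
    out["pct"] = (out["area"] / out["sums"]).map("{:.2%}".format)
    return out


df = pd.read_csv(io.StringIO(CSV_TEXT), index_col=0)
result = add_pct(df)
print(result.to_csv())
```

This avoids the Python-level `lambda` calls on every row. For a frame this small the difference is negligible, so the choice is mostly about readability.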