@rom1504
Created June 9, 2022 01:37
Display aesthetic
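import pandas as pd
from html import escape

# Load one shard of aesthetic predictions (columns used: "prediction", "url").
df = pd.read_parquet("aethetic_multi/0000.parquet")
# Score buckets of width 1: 0-1, 1-2, ..., 9-10.
buckets = [(i, i + 1) for i in range(10)]
html = "<h1>Aesthetic subsets in Laion2B-multi</h1>"
for a, b in buckets:
    # Half-open interval [a, b) so a boundary score lands in only one bucket.
    total_part = df[(df["prediction"] >= a) & (df["prediction"] < b)]
    count_part = len(total_part) / len(df) * 100
    # Scale this shard's share to the full dataset, in millions (2.2 * 10**3 M).
    estimated = len(total_part) / len(df) * 2.2 * 10**3
    # Show at most 200 sample images per bucket.
    part = total_part[:200]
    html += f"<h2>In bucket {a} - {b} there are {count_part:.2f}% ({estimated:.2f}M) samples:</h2> <div>"
    for url in part["url"]:
        # Escape the URL so quotes or ampersands cannot break the attribute.
        html += f'<img src="{escape(url)}" width="100" />'
    html += "</div>"
with open("aesthetic/aesthetic_viz_multi.html", "w") as f:
    f.write(html)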
Here's the script that generates the HTML page.
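To sanity-check the bucket math without the parquet file, a minimal sketch on synthetic scores might look like the following. It swaps the script's boolean masks for `numpy.histogram`, so it is not the same code path. `bucket_shares` and `total_millions` are hypothetical names introduced here, and the `2.2 * 10**3` default only mirrors the constant used in the script.

```python
import numpy as np
import pandas as pd


def bucket_shares(predictions, total_millions=2.2 * 10**3):
    """Per-bucket percentage and estimated sample count (in millions)."""
    scores = np.asarray(predictions, dtype=float)
    # np.histogram uses half-open bins [i, i + 1), except the last one,
    # which is closed: [9, 10].
    counts, _ = np.histogram(scores, bins=np.arange(11))
    share = counts / len(scores)
    return pd.DataFrame(
        {"percent": share * 100, "estimated_millions": share * total_millions},
        index=[f"{i} - {i + 1}" for i in range(10)],
    )


# Synthetic scores for illustration only, not real LAION predictions.
demo = bucket_shares([4.2, 4.9, 5.0, 5.5, 6.1, 6.3, 6.8, 7.4])
print(demo.loc[["4 - 5", "5 - 6", "6 - 7", "7 - 8"]])
```

With these eight scores, three fall in the 6 - 7 bucket, so that row should come out at 37.5%, and the percentages across all ten buckets should sum to 100.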