Michael Platzer mplatzer

@mplatzer
mplatzer / the-prize-flat.py
Last active July 3, 2025 20:20
Making a First Submission to the FLAT DATA challenge of The MOSTLY AI Prize 🏆
# install Synthetic Data SDK
# see also https://github.com/mostly-ai/mostlyai
#!uv pip install "mostlyai[local]"
# load training data
import pandas as pd
trn = pd.read_csv('/Users/mplatzer/github/the-prize-data/flat/flat-training.csv')
# instantiate SDK in LOCAL mode
from mostlyai.sdk import MostlyAI
@mplatzer
mplatzer / mostly-ai-dp.py
Created November 17, 2024 17:23
US Census Income Dataset - Differentially Private Synthetic Data with MOSTLY AI
# LOAD original data
import pandas as pd
census_df = pd.read_csv('https://github.com/mostly-ai/public-demo-data/raw/refs/heads/dev/census/census.csv.gz')
# INITIALIZE python client
from mostlyai import MostlyAI
mostly = MostlyAI()
# TRAIN with Differential Privacy
for m in [0.25, 0.5, 1, 1.5, 2, 4, 8, 16, 32]: # noise multipliers
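The preview cuts off before the loop body, so how each noise multiplier is passed to the SDK is not shown here. As general background only, and not MOSTLY AI's implementation, the sketch below shows what a noise multiplier usually means in DP-SGD: each per-example gradient is clipped to a maximum norm, the clipped gradients are summed, and Gaussian noise with standard deviation `noise_multiplier * max_grad_norm` is added. The function name `noisy_gradient_sum` and all of its parameters are hypothetical.

```python
import math
import random


def noisy_gradient_sum(per_example_grads, max_grad_norm, noise_multiplier, rng):
    """Clip each gradient to max_grad_norm, sum them, then add Gaussian noise.

    Hypothetical helper for illustration; gradients are plain lists of floats.
    """
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        # Scale down only if the gradient exceeds the clipping norm.
        scale = min(1.0, max_grad_norm / norm) if norm > 0 else 1.0
        for i, g in enumerate(grad):
            total[i] += g * scale
    # Noise standard deviation grows linearly with the noise multiplier.
    sigma = noise_multiplier * max_grad_norm
    return [t + rng.gauss(0.0, sigma) for t in total]


rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]
for m in [0.25, 0.5, 1, 1.5, 2, 4, 8, 16, 32]:  # noise multipliers from the gist
    print(m, noisy_gradient_sum(grads, max_grad_norm=1.0, noise_multiplier=m, rng=rng))
```

Larger multipliers add more noise per step, which generally buys a stronger privacy guarantee at the cost of accuracy.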
from mostlyai import MostlyAI
# initialize client
mostly = MostlyAI()
# train generator for BERKA dataset
config = {
    "name": "berka",
    "tables": [
        {
@mplatzer
mplatzer / resize images
Created June 29, 2020 17:00
resize & crop photos to 10:15, plus add a white border
library(magick)
fns <- list.files(pattern='jpg')
lapply(fns, function(fn) {
  cat(basename(fn), '\n')
  img <- image_read(fn)
  cdate <- exif_read(fn)$CreateDate
  if (!is.null(cdate)) {
    cdate <- as.Date(cdate, format='%Y:%m:%d')
  } else {
    cdate <- Sys.Date()
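The preview ends before the resize and crop steps, so the gist's actual geometry is not shown. The sketch below only illustrates the arithmetic behind a centered 10:15 crop and a uniform border. It assumes the ratio is flipped to 15:10 for landscape images, which the gist may or may not do, and the names `crop_box` and `bordered_size` are hypothetical.

```python
def crop_box(width, height, ratio_w=10, ratio_h=15):
    """Return (left, top, crop_width, crop_height) of the largest centered crop.

    Assumption: the short side of the ratio follows the short side of the image.
    """
    short, long_ = min(ratio_w, ratio_h), max(ratio_w, ratio_h)
    if width > height:
        ratio_w, ratio_h = long_, short   # landscape: 15:10
    else:
        ratio_w, ratio_h = short, long_   # portrait or square: 10:15
    if width * ratio_h > height * ratio_w:
        # Image is too wide for the target ratio: keep height, trim width.
        crop_h = height
        crop_w = height * ratio_w // ratio_h
    else:
        # Image is too tall (or exact): keep width, trim height.
        crop_w = width
        crop_h = width * ratio_h // ratio_w
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return left, top, crop_w, crop_h


def bordered_size(width, height, border):
    """Size of the image after adding a border of `border` pixels on every side."""
    return width + 2 * border, height + 2 * border


print(crop_box(4000, 3000))           # landscape 4:3 input
print(crop_box(3000, 4000))           # portrait 3:4 input
print(bordered_size(2666, 4000, 50))  # crop plus a 50 px border
```

Integer division keeps the crop inside the source image, at the cost of being off from the exact ratio by at most one pixel.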
@mplatzer
mplatzer / cdnow.csv
Created June 29, 2020 06:24
CDNOW Master File - converted to CSV
users_id,date,cds,amt
1,1997-01-01,1,11.77
2,1997-01-12,1,12
2,1997-01-12,5,77
3,1997-01-02,2,20.76
3,1997-03-30,2,20.76
3,1997-04-02,2,19.54
3,1997-11-15,5,57.45
3,1997-11-25,4,20.96
3,1998-05-28,1,16.99
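The rows above follow the layout `users_id,date,cds,amt`. As a minimal sketch, using only the nine sample rows visible in this preview rather than the full file, per-customer order counts and totals can be computed with the standard library. The function name `summarize` is hypothetical.

```python
import csv
import io
from collections import defaultdict

# The sample rows shown in the preview; the full file is much larger.
SAMPLE = """users_id,date,cds,amt
1,1997-01-01,1,11.77
2,1997-01-12,1,12
2,1997-01-12,5,77
3,1997-01-02,2,20.76
3,1997-03-30,2,20.76
3,1997-04-02,2,19.54
3,1997-11-15,5,57.45
3,1997-11-25,4,20.96
3,1998-05-28,1,16.99
"""


def summarize(text):
    """Aggregate number of orders, CDs bought, and amount spent per customer."""
    totals = defaultdict(lambda: {"orders": 0, "cds": 0, "amt": 0.0})
    for row in csv.DictReader(io.StringIO(text)):
        entry = totals[row["users_id"]]
        entry["orders"] += 1
        entry["cds"] += int(row["cds"])
        entry["amt"] += float(row["amt"])
    # Round amounts to cents to hide floating-point accumulation error.
    return {uid: {**e, "amt": round(e["amt"], 2)} for uid, e in totals.items()}


summary = summarize(SAMPLE)
for uid, stats in summary.items():
    print(uid, stats)
```

The same function works on the full file by passing its contents as text, since it depends only on the four column names in the header.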