@mplatzer
Last active July 3, 2025 20:20
Making a First Submission to the FLAT DATA challenge of The MOSTLY AI Prize 🏆
# install Synthetic Data SDK
# see also https://github.com/mostly-ai/mostlyai
#!uv pip install "mostlyai[local]"
# load training data
import pandas as pd
trn = pd.read_csv('/Users/mplatzer/github/the-prize-data/flat/flat-training.csv')
# instantiate SDK in LOCAL mode
from mostlyai.sdk import MostlyAI
mostly = MostlyAI(local=True)
# train a generator
g = mostly.train(config={
    'tables': [{
        'name': 'flat',
        'data': trn,  # your training data
        'tabular_model_configuration': {
            'max_training_time': 2,  # e.g. limit training to 2 minutes
        }
    }]
})
# create a synthetic dataset
sd = mostly.generate(g)
syn = sd.data()
syn.to_csv('your-flat-submission01.csv.gz', index=False)
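
Before uploading, it may be worth confirming that the written file has the same columns as the training data. The following is a minimal sketch using only the standard library. It assumes a submission is expected to carry the same column names as the training file, which is an assumption here and not something stated above. The helper names `read_header` and `compare_columns` are hypothetical and not part of the SDK.

```python
import csv
import gzip


def read_header(path):
    """Return the header row of a CSV file; paths ending in .gz are read via gzip."""
    opener = gzip.open if str(path).endswith('.gz') else open
    with opener(path, 'rt', newline='', encoding='utf-8') as f:
        return next(csv.reader(f))


def compare_columns(train_path, submission_path):
    """Return (missing, extra): column names absent from, or added to, the submission."""
    train_cols = read_header(train_path)
    sub_cols = read_header(submission_path)
    missing = [c for c in train_cols if c not in sub_cols]
    extra = [c for c in sub_cols if c not in train_cols]
    return missing, extra
```

Calling `compare_columns(...)` with the training CSV path and `'your-flat-submission01.csv.gz'` should return two empty lists if the column names line up. Reading through `gzip` works here because pandas infers gzip compression from the `.gz` suffix when writing.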