@roycoding
roycoding / pizza-rf.md
Last active June 27, 2017 18:36
Beat the Benchmark: Random Acts of Pizza

Beating the Random Acts of Pizza Benchmark

The Random Acts of Pizza competition is about predicting whether a request for a free pizza on the Random Acts of Pizza subreddit is granted. The benchmark simply guesses that no pizzas are given (or that all are). Because a constant guess cannot rank any request above another, this results in an AUC score of 0.5 (50%).
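
To illustrate why a constant guess lands at chance level, here is a minimal Python sketch that computes AUC directly from its pairwise definition, counting ties as half a win. The `auc` helper and the toy labels are assumptions made up for illustration; they are not taken from the competition data.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels (made up for illustration): 1 = pizza received, 0 = not.
labels = [1, 0, 0, 1, 0, 0, 0, 1]

all_zeros = [0.0] * len(labels)  # guess that no pizzas are given
all_ones = [1.0] * len(labels)   # guess that all pizzas are given

print(auc(labels, all_zeros))
print(auc(labels, all_ones))
```

Every positive/negative pair is a tie under a constant guess, so both calls give the same chance-level score regardless of how many requests were actually granted.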

To beat the AUC = 0.5 (50%) benchmark with a simple model, I first looked at the training and test data to find simple features. I decided to use the word counts of the request title and comment text, as longer comments might be skipped by readers.

To build the model I first extracted only the desired fields from the original JSON files with jq and used json2csv to write out CSV.
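
The post does this step with jq and json2csv; as a rough Python equivalent, the sketch below pulls out the wanted fields and computes the two word-count features. The field names `request_title` and `request_text_edit_aware`, the output column names `title_words` and `text_words`, and the sample records are assumptions for illustration. Only `request_id` and `requester_received_pizza` come from the benchmark CSV header.

```python
import csv
import io
import json

# Assumed names of the title and text fields in the competition JSON.
TITLE_FIELD = "request_title"
TEXT_FIELD = "request_text_edit_aware"


def word_count(text):
    """Count whitespace-separated tokens; missing text counts as zero words."""
    return len((text or "").split())


def extract_features(records):
    """Reduce each request to its id, two word counts, and the label if present."""
    rows = []
    for rec in records:
        rows.append({
            "request_id": rec["request_id"],
            "title_words": word_count(rec.get(TITLE_FIELD)),
            "text_words": word_count(rec.get(TEXT_FIELD)),
            # Test-set records carry no label, so fall back to an empty string.
            "requester_received_pizza": rec.get("requester_received_pizza", ""),
        })
    return rows


def write_csv(rows, handle):
    """Write the feature rows as CSV with a header line."""
    fieldnames = ["request_id", "title_words", "text_words",
                  "requester_received_pizza"]
    writer = csv.DictWriter(handle, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Tiny made-up sample standing in for the real training JSON file.
sample = json.loads("""[
  {"request_id": "t3_a", "request_title": "[Request] Hungry student",
   "request_text_edit_aware": "Finals week and no food left",
   "requester_received_pizza": true},
  {"request_id": "t3_b", "request_title": "[Request] Pizza please",
   "request_text_edit_aware": "", "requester_received_pizza": false}
]""")

buffer = io.StringIO()
write_csv(extract_features(sample), buffer)
print(buffer.getvalue())
```

Splitting on whitespace is the simplest possible tokenizer; punctuation and markup are counted as part of the words they touch.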

@roycoding
roycoding / bikeshare.md
Last active June 27, 2017 18:37
Day 3: Mean benchmark of the Bike Sharing Demand Kaggle competition.

Mean benchmark for Bike Sharing Demand competition

In the Bike Sharing Demand competition on Kaggle, the goal is to predict the demand for shared bikes in Washington, D.C. based on historical usage data. For this regression problem, the evaluation metric is root mean squared logarithmic error (RMSLE).
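
For reference, RMSLE can be sketched in a few lines of Python. This follows the usual definition (the square root of the mean squared difference between `log(1 + predicted)` and `log(1 + actual)`); the `rmsle` function name is an assumption, not something from the competition.

```python
import math


def rmsle(predicted, actual):
    """Root mean squared logarithmic error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2  # log1p(x) == log(1 + x)
        for p, a in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Perfect predictions give an error of zero.
print(rmsle([16, 40, 32], [16, 40, 32]))
```

Because the error is taken on a log scale, a miss by a factor of two is penalised similarly whether the true count is 10 or 1000.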

I decided to recreate the mean value benchmark using Unix command-line tools. The benchmark consists of using the overall usage mean from the training set for all test set datetimes (i.e. using the same, single value for all predicted counts).
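
As a rough Python equivalent of that idea (not the command-line approach this post takes), the sketch below averages the training target and writes it out for every test row. The column names `datetime` and `count`, the `mean_benchmark` helper, and the sample rows are assumptions for illustration.

```python
import csv
import io


def mean_benchmark(train_handle, test_handle, out_handle,
                   target="count", key="datetime"):
    """Predict the training-set mean of `target` for every row of the test set."""
    counts = [float(row[target]) for row in csv.DictReader(train_handle)]
    if not counts:
        raise ValueError("training data has no rows")
    mean_count = sum(counts) / len(counts)

    writer = csv.writer(out_handle, lineterminator="\n")
    writer.writerow([key, target])
    for row in csv.DictReader(test_handle):
        writer.writerow([row[key], mean_count])
    return mean_count


# Made-up rows standing in for the real train and test files.
train_csv = io.StringIO(
    "datetime,count\n"
    "2011-01-01 00:00:00,16\n"
    "2011-01-01 01:00:00,40\n"
    "2011-01-01 02:00:00,32\n"
)
test_csv = io.StringIO(
    "datetime\n"
    "2011-01-20 00:00:00\n"
    "2011-01-20 01:00:00\n"
)
submission = io.StringIO()

mean_benchmark(train_csv, test_csv, submission)
print(submission.getvalue())
```

With real data, the `io.StringIO` objects would be replaced by opened files.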

I used the csvkit suite of tools along with sed to recreate the benchmark. This was my first time using csvkit and I'm happy so far!

@roycoding
roycoding / pizza.md
Last active June 27, 2017 18:38
All zeros benchmark for Random Acts of Pizza

All zeros benchmark for Random Acts of Pizza competition

In the Random Acts of Pizza competition on Kaggle, the goal is to predict whether people posting on Reddit's Random Acts of Pizza subreddit will actually receive a free pizza based on their post. For this classification problem, the evaluation metric is AUC.

I recreated the all-zeros benchmark using a couple of Unix command-line tools.

  1. Create the CSV header:
echo "request_id,requester_received_pizza" > zero-benchmark.csv
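
For comparison, the whole all-zeros submission can be sketched in Python rather than with shell tools. This is an alternative illustration, not the remaining steps of the original recipe; the `zero_benchmark` helper and the request ids are hypothetical, while the header matches the one written above.

```python
import csv
import io


def zero_benchmark(request_ids, out_handle):
    """Write a submission that predicts 0 (no pizza) for every request id."""
    writer = csv.writer(out_handle, lineterminator="\n")
    writer.writerow(["request_id", "requester_received_pizza"])
    for request_id in request_ids:
        writer.writerow([request_id, 0])


# Hypothetical ids; in practice they would be read from the test-set file.
ids = ["t3_aaa", "t3_bbb", "t3_ccc"]
out = io.StringIO()
zero_benchmark(ids, out)
print(out.getvalue())
```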
@roycoding
roycoding / gcf.py
Last active August 29, 2015 14:05
Kaggle - Titanic: match the Gender, Class, Fare benchmark
# Python code for the Kaggle Titanic competition
# https://www.kaggle.com/c/titanic-gettingStarted
# This code implements the gender, class, fare benchmark.
# This is part of the Match 5 Kaggle Benchmarks in 5 Days challenge.
# https://www.kaggle.com/forums/t/9993/match-5-kaggle-benchmarks-in-5-days
import pandas as pd
import numpy as np
@roycoding
roycoding / klackers.md
Last active September 1, 2020 10:58
Klackers Strategy

Klackers strategy via Monte Carlo

Roy Keyes

19 May 2014 - This is a post on my blog.

Klackers (a.k.a. Shut the Box) is a dice game, often played in bars and pubs. It's a game of chance, arithmetic, and strategy. This little project is intended to find the best simple strategy for playing Klackers. Maybe you can become a Klackers shark...?

The game

Klackers is played with dice on a game board like the one pictured below.

[Image: A Shut the Box game, via Wikipedia]

@roycoding
roycoding / 2014-03-11-gists.md
Last active June 19, 2017 18:47
Github Gists: Blogging for the lazy

Github Gists: Blogging for the lazy

Roy Keyes

11 March 2014 - This is a post on my blog.

Recently I decided to revamp my website. I wanted it to be simple and mobile friendly, to have Markdown-based blogging, and not to cost an arm and a leg to host.

Static sites are all the rage these days, and not without reason. They're cheap, fast, and portable. Of the several hosting options I looked at, including S3, Github seemed like the easiest. A site is included even with your free account and you can just push a git repo to publish.

Although static site generators are very popular, I decided that I would simply use a CSS framework like Bootstrap. Having built a few websites before, I knew I wanted to start "responsive" out of the box and use something with light mental overhead. At some point I came across Skeleton and it seemed to fit the bill.