Thomas J. Fan thomasjpfan

@thomasjpfan
thomasjpfan / README.md
Last active February 5, 2019 23:04
Cost Complexity Pruning benchmarks

How to run benchmarks

Run pure python version

  1. Set up a virtual env.
  2. git clone -b ccp_prune_tree https://github.com/thomasjpfan/scikit-learn scikit-learn-ccp-python
  3. cd scikit-learn-ccp-python
  4. git checkout 25910e085cf7bb0a98ee33c050fa9233e247e523
  5. Install scikit-learn
  6. Go to directory with bench_prune_tree.py
@thomasjpfan
thomasjpfan / ghpr
Last active August 13, 2019 19:27
Checkout branch based on PR number
#!/usr/bin/env python3
# pip install GitPython PyGithub
import argparse
import os
import sys
from github import Github
from git import Repo
@thomasjpfan
thomasjpfan / gridsearch_different_estimators.py
Created May 1, 2019 16:10
GridSearch over different estimators
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
boston = load_boston()
X_train, X_test, y_train, y_test = train_test_split(
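
The preview stops before the search itself is set up. As a rough illustration of what the title describes, here is a minimal sketch of one common way to grid-search over different estimators: make the estimator a named `Pipeline` step and list candidate estimators for that step in `param_grid`. This is not necessarily what the gist does. The step names (`scaler`, `reg`), the hyperparameter values, and the synthetic `make_regression` data are all assumptions chosen for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data (an assumption; the gist uses its own dataset).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The "reg" step is a placeholder that the grid replaces with each candidate.
pipe = Pipeline([("scaler", StandardScaler()), ("reg", Ridge())])

# One dict per estimator family, so each family only sees its own parameters.
param_grid = [
    {"reg": [Ridge()], "reg__alpha": [0.1, 1.0, 10.0]},
    {"reg": [DecisionTreeRegressor(random_state=0)], "reg__max_depth": [2, 4, 8]},
]

search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X_train, y_train)

best_reg = search.best_estimator_.named_steps["reg"]
print(type(best_reg).__name__, search.score(X_test, y_test))
```

Using a list of dicts keeps the grids separate: `reg__alpha` is only tried with `Ridge`, and `reg__max_depth` only with the tree.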
@thomasjpfan
thomasjpfan / weight-initialization.ipynb
Last active May 16, 2019 15:46 — forked from ceceshao1/weight-initialization.ipynb
Weight-initialization-methods
@thomasjpfan
thomasjpfan / README.md
Last active August 15, 2019 19:09
Benchmark for scaling GBDT

For reference, here is the time it takes to run the following:

OMP_NUM_THREADS=$i python benchmarks/bench_hist_gradient_boosting_higgsboson.py \
    --n-leaf-nodes 255 --n-trees 100
lightgbm==2.2.1
xgboost==0.90
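
The command above is parameterised by `$i`, the number of OpenMP threads. A minimal sketch of driving that sweep from Python follows; the script path and flags are copied from the command, while the helper names `build_runs` and `run_all` and the choice of thread counts are hypothetical. Wall-clock timing of the whole process is an assumption about what is being measured, and it includes interpreter start-up and data loading.

```python
import os
import subprocess
import sys
import time

# Script path and flags as given in the command above.
BENCH_SCRIPT = "benchmarks/bench_hist_gradient_boosting_higgsboson.py"
BENCH_ARGS = ["--n-leaf-nodes", "255", "--n-trees", "100"]


def build_runs(thread_counts):
    """Return (n_threads, env, argv) for each requested thread count."""
    runs = []
    for n in thread_counts:
        # Copy the current environment and override OMP_NUM_THREADS only.
        env = dict(os.environ, OMP_NUM_THREADS=str(n))
        argv = [sys.executable, BENCH_SCRIPT, *BENCH_ARGS]
        runs.append((n, env, argv))
    return runs


def run_all(thread_counts):
    """Run the benchmark once per thread count; return wall-clock seconds."""
    timings = {}
    for n, env, argv in build_runs(thread_counts):
        start = time.perf_counter()
        subprocess.run(argv, env=env, check=True)
        timings[n] = time.perf_counter() - start
    return timings
```

Calling `run_all([1, 2, 4, 8])` from the scikit-learn source root would run the benchmark four times; the thread counts here are only an example.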
@thomasjpfan
thomasjpfan / partial_dependence_api_issues.ipynb
Last active August 16, 2019 19:07
Partial dependence api issues
{
"TagRemovePreprocessor": { "remove_input_tags": ["to_remove"] },
"SlidesExporter": { "reveal_theme": "simple" },
"NbConvertApp": {
"export_format": "slides"
},
"TemplateExporter": {
"exclude_input_prompt": true,
"exclude_output_prompt": true
}
@thomasjpfan
thomasjpfan / benchmark.py
Created December 2, 2019 19:14
Benchmarking script
import sklearn
import numpy as np
import scipy
import csv
import argparse
from pathlib import Path
import openml
from openml.exceptions import OpenMLRunsExistError
from sklearn.experimental import enable_hist_gradient_boosting
@thomasjpfan
thomasjpfan / sklearn_medical.ipynb
Created January 16, 2020 20:58
sklearn usage in nature articles