@lushl9301
Last active August 23, 2016 01:41

Kaggle_CrowdFlower

Download sources and data

git clone https://github.com/ChenglongChen/Kaggle_CrowdFlower.git
cd Kaggle_CrowdFlower
cd Data
cp ~/Downloads/*.csv .
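
Before moving on, it can help to confirm that the copy actually put the data files in place. The sketch below is one way to do that; the file names `train.csv` and `test.csv` are an assumption about what the competition download contains, and `missing_csvs` is a hypothetical helper, not part of the repository.

```python
import os

# Assumed file names; adjust to match the files you actually downloaded.
EXPECTED = ("train.csv", "test.csv")


def missing_csvs(data_dir, expected=EXPECTED):
    """Return the expected file names that are not present in data_dir."""
    return [name for name in expected
            if not os.path.isfile(os.path.join(data_dir, name))]


if __name__ == "__main__":
    # Run from inside the Data directory.
    missing = missing_csvs(".")
    if missing:
        print("Missing files: " + ", ".join(missing))
    else:
        print("All expected CSV files are present.")
```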

Dependencies

Install pip (see https://pip.pypa.io/en/stable/installing/)

wget https://bootstrap.pypa.io/get-pip.py
sudo python get-pip.py
sudo pip install --upgrade pip

Install modules

sudo pip install numpy scipy pandas nltk bs4 scikit-learn hyperopt keras xgboost ml_metrics
sudo yum install rgf libfm
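
A quick way to see whether the pip step succeeded is to try importing each module. This is a minimal sketch, not something shipped with the repository: `missing_modules` is a hypothetical helper, and the list uses import names, which differ from the pip package name in some cases (scikit-learn imports as `sklearn`). It does not check `rgf` or `libfm`, which are installed outside pip.

```python
import importlib

# Import names for the Python modules installed above.
MODULES = ["numpy", "scipy", "pandas", "nltk", "bs4", "sklearn",
           "hyperopt", "keras", "xgboost", "ml_metrics"]


def missing_modules(names):
    """Return the names that cannot be imported in this interpreter."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing

# Example use:
#   print(missing_modules(MODULES))
```

An empty list means every module imported; anything listed needs to be installed again with the same interpreter that will run the scripts.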

Prepare nltk

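python
>>> import nltk
>>> nltk.download()
// in the text-mode downloader that opens, enter the following in turn:
d
l
all
// d = download, l = list the available packages, all = download everything, including the word lists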
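
The same download can be done without the interactive prompts, since `nltk.download` accepts a package identifier directly. The wrapper below is only a sketch: `download_nltk_data` and its `downloader` parameter are hypothetical names, added so the function can be exercised without network access.

```python
def download_nltk_data(package="all", downloader=None):
    """Download an NLTK data package by identifier and return the result.

    By default this calls nltk.download(package); a different callable can
    be passed as `downloader`, for example in a dry run.
    """
    if downloader is None:
        import nltk  # imported here so the module loads even without NLTK
        downloader = nltk.download
    return downloader(package)

# Example use (downloads every NLTK package, which is large):
#   download_nltk_data("all")
```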

Run test

// from the repository root (Kaggle_CrowdFlower)
cd Code/Feat

python run_all.py

This may take a few hours.

Generate model library

sudo pip install pymongo networkx h5py
// do not install the separate bson package; pymongo ships its own bson module and the two conflict

// from the repository root (Kaggle_CrowdFlower)
cd Code/Model

python generate_best_single_model.py
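
Because the feature step can run for hours, it may be convenient to chain it with the model step and record how long each takes. The following is a minimal sketch under the assumption that it is started from the repository root; `run_step` and `run_all` are hypothetical helpers, not part of the repository.

```python
import subprocess
import sys
import time

# (script, working directory) pairs, relative to the repository root.
STEPS = [
    ("run_all.py", "Code/Feat"),
    ("generate_best_single_model.py", "Code/Model"),
]


def run_step(script, cwd):
    """Run one script with the current interpreter; return (exit_code, seconds)."""
    start = time.time()
    exit_code = subprocess.call([sys.executable, script], cwd=cwd)
    return exit_code, time.time() - start


def run_all(steps=STEPS):
    """Run the steps in order and stop at the first one that fails."""
    for script, cwd in steps:
        exit_code, seconds = run_step(script, cwd)
        print("%s finished with exit code %d after %.0f s"
              % (script, exit_code, seconds))
        if exit_code != 0:
            return exit_code
    return 0

# Example use, from the repository root:
#   run_all()
```

Stopping at the first non-zero exit code avoids building models on top of an incomplete feature set.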