halegreen
  • Alibaba
  • Hong Kong
@halegreen
halegreen / AbstractApiService.java
Last active April 29, 2019 09:45
Wrap a RESTful API and call it
package com.hq.modules.executor.service.impl;
import com.alibaba.fastjson.JSON;
import com.hq.common.utils.GenericSuperclassUtil;
import com.hq.common.utils.HttpResult;
import com.hq.common.utils.HttpUtils;
import com.shaw.common.model.RoadLink;
import com.shaw.common.model.TaskInfo;
import com.shaw.common.model.TrafficLightModel;
import com.shaw.common.model.TrafficStateData;
@halegreen
halegreen / pandas2ffmlib.py
Last active November 22, 2017 08:29
Transform DataFrame data into the ffmlib data format
## for category columns
def category_feature2FFM(data, category_list):
    previous_len = 0
    for i in range(len(category_list)):
        category_name = category_list[i]
        # Map each unique value of this categorical column to an integer index.
        dic = data[category_name].unique()
        dic = dict(zip(dic, range(len(dic))))
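The preview cuts off before the encoding step. As a rough illustration of the same idea, here is a minimal self-contained sketch; df_to_ffm, cat_cols, and label_col are hypothetical names rather than the gist's own, and the assumed output is the common libffm line format label field:index:value.

import pandas as pd

def df_to_ffm(df, cat_cols, label_col):
    # One global running index, so every (column, value) pair
    # receives a unique feature id across all fields.
    offset = 0
    value_maps = {}
    for col in cat_cols:
        uniques = df[col].unique()
        value_maps[col] = {v: offset + j for j, v in enumerate(uniques)}
        offset += len(uniques)
    lines = []
    for _, row in df.iterrows():
        parts = [str(int(row[label_col]))]
        for field, col in enumerate(cat_cols):
            parts.append("%d:%d:1" % (field, value_maps[col][row[col]]))
        lines.append(" ".join(parts))
    return lines

df = pd.DataFrame({"user": ["a", "b", "a"], "item": ["x", "x", "y"], "click": [1, 0, 1]})
print("\n".join(df_to_ffm(df, ["user", "item"], "click")))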
@halegreen
halegreen / FFM.py
Created November 17, 2017 02:44
Python implementation of the Field-aware Factorization Machine (FFM) model
from datetime import datetime
from csv import DictReader
from math import exp, log, sqrt, pow
import itertools
import math
from random import random, shuffle, uniform, seed
import pickle
import sys

seed(1024)
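The preview shows only the imports. As a hedged sketch of what an FFM computes, the following illustrates the score: the interaction between features j1 and j2 uses the latent vector of j1 indexed by the field of j2, and vice versa. init_weights, ffm_score, and predict_proba are hypothetical names, not functions taken from the gist.

import math
from random import uniform, seed

seed(1024)

def init_weights(n_features, n_fields, k):
    # One length-k latent vector per (feature, field) pair: w[j][f].
    return [[[uniform(0, 0.05) for _ in range(k)]
             for _ in range(n_fields)] for _ in range(n_features)]

def ffm_score(x, w):
    # x: list of (field, feature, value) triples for one sample.
    s = 0.0
    for i in range(len(x)):
        f1, j1, v1 = x[i]
        for j in range(i + 1, len(x)):
            f2, j2, v2 = x[j]
            # Field-aware pairing: j1's vector for field f2, j2's for field f1.
            s += sum(a * b for a, b in zip(w[j1][f2], w[j2][f1])) * v1 * v2
    return s

def predict_proba(x, w):
    # Logistic link on top of the FFM score.
    return 1.0 / (1.0 + math.exp(-ffm_score(x, w)))

w = init_weights(n_features=10, n_fields=2, k=4)
x = [(0, 3, 1.0), (1, 7, 1.0)]  # (field, feature, value) triples
print(predict_proba(x, w))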
@halegreen
halegreen / Model_Ensemble.py
Created November 17, 2017 02:40
A two-level model stacking ensemble. Level 1 uses XGBoost, LightGBM, RandomForest, ExtraTrees, DecisionTree, and AdaBoost (six models in total); Level 2 uses LinearRegression to fit the first-level results.
import numpy as np

class Ensemble(object):
    def __init__(self, n_splits, stacker, base_models):
        self.n_splits = n_splits
        self.stacker = stacker
        self.base_models = base_models

    def fit_predict(self, X, y, T):
        # Cast inputs so the fold indexing below works uniformly.
        X = np.array(X)
        y = np.array(y)
        T = np.array(T)
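The preview ends before the stacking loop. Below is a minimal sketch of the standard two-level stacking pattern the description implies: out-of-fold level-1 predictions on the train set, fold-averaged predictions on the test set, then a level-2 stacker fit on the out-of-fold matrix. stack_fit_predict is a hypothetical free-function rendering, and the two base models are stand-ins for the six listed above.

import numpy as np
from sklearn.model_selection import KFold
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor
from sklearn.linear_model import LinearRegression

def stack_fit_predict(X, y, T, base_models, stacker, n_splits=5):
    X, y, T = np.array(X), np.array(y), np.array(T)
    S_train = np.zeros((X.shape[0], len(base_models)))
    S_test = np.zeros((T.shape[0], len(base_models)))
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    for i, model in enumerate(base_models):
        S_test_i = np.zeros((T.shape[0], n_splits))
        for j, (tr, va) in enumerate(folds.split(X)):
            model.fit(X[tr], y[tr])
            # Level 1: each train row is predicted by a model that never saw it.
            S_train[va, i] = model.predict(X[va])
            S_test_i[:, j] = model.predict(T)
        S_test[:, i] = S_test_i.mean(axis=1)
    # Level 2: the stacker learns how to combine the base models.
    stacker.fit(S_train, y)
    return stacker.predict(S_test)

X = np.random.rand(100, 5)
y = X.sum(axis=1)
T = np.random.rand(20, 5)
base = [RandomForestRegressor(n_estimators=20, random_state=0),
        ExtraTreesRegressor(n_estimators=20, random_state=0)]
print(stack_fit_predict(X, y, T, base, LinearRegression())[:3])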
@halegreen
halegreen / tf_CBOW.py
Last active October 7, 2017 11:43
CBOW (Continuous Bag-of-Words) model implementation with TensorFlow; only the generate_batch() function needs to change.
def generate_batch(data, batch_size, skip_window):
    """
    Generates a mini-batch of training data for the CBOW embedding model.

    :param data (numpy.ndarray(dtype=int, shape=(corpus_size,))): holds the
        training corpus, with words encoded as integers
    :param batch_size (int): size of the batch to generate
    :param skip_window (int): number of words to both the left and the right
        that form the context window for the target word

    The batch is a matrix of shape (batch_size, 2*skip_window); each row
    holds all the context words for one example, and the corresponding
    label is the word in the middle of the context.
    """
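The preview stops after the docstring. A minimal sketch consistent with it follows; the random sampling of target positions is an assumption, since the gist's actual traversal order is not visible here.

import numpy as np

def generate_batch(data, batch_size, skip_window):
    # Pick target positions far enough from both ends of the corpus,
    # then collect the 2*skip_window surrounding words as the context.
    span = 2 * skip_window
    batch = np.zeros((batch_size, span), dtype=np.int32)
    labels = np.zeros((batch_size, 1), dtype=np.int32)
    positions = np.random.randint(skip_window, len(data) - skip_window, batch_size)
    for i, pos in enumerate(positions):
        context = list(data[pos - skip_window:pos]) + \
                  list(data[pos + 1:pos + skip_window + 1])
        batch[i, :] = context
        labels[i, 0] = data[pos]
    return batch, labels

corpus = np.arange(1000) % 50           # toy corpus of integer word ids
contexts, targets = generate_batch(corpus, batch_size=8, skip_window=2)
print(contexts.shape, targets.shape)     # (8, 4) (8, 1)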
@halegreen
halegreen / likelihood_encoding.py
Last active September 9, 2017 11:43
Likelihood encoding implementation using two-level CV
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold
import dill as pickle
import sys

def input_data(train_file):
    # Load the pickled training DataFrame and prepare the category list.
    with open(train_file, 'rb') as f1:
        train_data = pickle.load(f1)
    cat_feature = []
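The preview stops after loading the data. As a sketch of the core idea, here is a single-level out-of-fold variant of likelihood (target-mean) encoding; the gist's two-level CV presumably nests a second KFold inside each outer fold, which is omitted here for brevity, and likelihood_encode is a hypothetical name.

import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

def likelihood_encode(train, col, target, n_splits=5):
    # Each row is encoded with the target mean of its category computed
    # only on the other folds, which limits target leakage.
    encoded = np.full(len(train), train[target].mean())
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    for tr, va in folds.split(train):
        means = train.iloc[tr].groupby(col)[target].mean()
        prior = train.iloc[tr][target].mean()   # fallback for unseen categories
        encoded[va] = train.iloc[va][col].map(means).fillna(prior).values
    return encoded

df = pd.DataFrame({"city": list("aabbbc"), "y": [1, 0, 1, 1, 0, 1]})
df["city_lik"] = likelihood_encode(df, "city", "y", n_splits=3)
print(df)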