halegreen
  • Alibaba
  • Hong Kong
@halegreen
halegreen / AbstractApiService.java
Last active April 29, 2019 09:45
Wrap a RESTful API and call it
package com.hq.modules.executor.service.impl;
import com.alibaba.fastjson.JSON;
import com.hq.common.utils.GenericSuperclassUtil;
import com.hq.common.utils.HttpResult;
import com.hq.common.utils.HttpUtils;
import com.shaw.common.model.RoadLink;
import com.shaw.common.model.TaskInfo;
import com.shaw.common.model.TrafficLightModel;
import com.shaw.common.model.TrafficStateData;
@halegreen
halegreen / pandas2ffmlib.py
Last active November 22, 2017 08:29
Transform DataFrame data into the ffmlib data format
## for category columns
def category_feature2FFM(data, category_list):
    previous_len = 0
    for i in range(len(category_list)):
        category_name = category_list[i]
        # Map each unique value of this categorical column to an integer index.
        dic = data[category_name].unique()
        dic = dict(zip(dic, range(len(dic))))
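The preview cuts off before the encoding step. As a rough illustration of the same idea, here is a minimal self-contained sketch; df_to_ffm, cat_cols, and label_col are hypothetical names rather than the gist's own, and the assumed output is the common libffm line format label field:index:value.

import pandas as pd

def df_to_ffm(df, cat_cols, label_col):
    # One global running index, so every (column, value) pair
    # receives a unique feature id across all fields.
    offset = 0
    value_maps = {}
    for col in cat_cols:
        uniques = df[col].unique()
        value_maps[col] = {v: offset + j for j, v in enumerate(uniques)}
        offset += len(uniques)
    lines = []
    for _, row in df.iterrows():
        parts = [str(int(row[label_col]))]
        for field, col in enumerate(cat_cols):
            parts.append("%d:%d:1" % (field, value_maps[col][row[col]]))
        lines.append(" ".join(parts))
    return lines

df = pd.DataFrame({"user": ["a", "b", "a"], "item": ["x", "x", "y"], "click": [1, 0, 1]})
print("\n".join(df_to_ffm(df, ["user", "item"], "click")))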
@halegreen
halegreen / FFM.py
Created November 17, 2017 02:44
Python implementation of the Field-aware Factorization Machine (FFM) model
from datetime import datetime
from csv import DictReader
from math import exp, log, sqrt, pow
import itertools
import math
from random import random, shuffle, uniform, seed
import pickle
import sys

seed(1024)
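The preview shows only the imports. As a hedged sketch of what an FFM computes, the following illustrates the score: the interaction between features j1 and j2 uses the latent vector of j1 indexed by the field of j2, and vice versa. init_weights, ffm_score, and predict_proba are hypothetical names, not functions taken from the gist.

import math
from random import uniform, seed

seed(1024)

def init_weights(n_features, n_fields, k):
    # One length-k latent vector per (feature, field) pair: w[j][f].
    return [[[uniform(0, 0.05) for _ in range(k)]
             for _ in range(n_fields)] for _ in range(n_features)]

def ffm_score(x, w):
    # x: list of (field, feature, value) triples for one sample.
    s = 0.0
    for i in range(len(x)):
        f1, j1, v1 = x[i]
        for j in range(i + 1, len(x)):
            f2, j2, v2 = x[j]
            # Field-aware pairing: j1's vector for field f2, j2's for field f1.
            s += sum(a * b for a, b in zip(w[j1][f2], w[j2][f1])) * v1 * v2
    return s

def predict_proba(x, w):
    # Logistic link on top of the FFM score.
    return 1.0 / (1.0 + math.exp(-ffm_score(x, w)))

w = init_weights(n_features=10, n_fields=2, k=4)
x = [(0, 3, 1.0), (1, 7, 1.0)]  # (field, feature, value) triples
print(predict_proba(x, w))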
@halegreen
halegreen / Model_Ensemble.py
Created November 17, 2017 02:40
A two-level model stacking ensemble. Level 1 uses XGBoost, LightGBM, RandomForest, ExtraTrees, DecisionTree, and AdaBoost (six models in total); Level 2 uses LinearRegression to fit the first-level results.
import numpy as np

class Ensemble(object):
    def __init__(self, n_splits, stacker, base_models):
        self.n_splits = n_splits
        self.stacker = stacker
        self.base_models = base_models

    def fit_predict(self, X, y, T):
        # Cast inputs so the fold indexing below works uniformly.
        X = np.array(X)
        y = np.array(y)
        T = np.array(T)
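The preview ends before the stacking loop. Below is a minimal sketch of the standard two-level stacking pattern the description implies: out-of-fold level-1 predictions on the train set, fold-averaged predictions on the test set, then a level-2 stacker fit on the out-of-fold matrix. stack_fit_predict is a hypothetical free-function rendering, and the two base models are stand-ins for the six listed above.

import numpy as np
from sklearn.model_selection import KFold
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor
from sklearn.linear_model import LinearRegression

def stack_fit_predict(X, y, T, base_models, stacker, n_splits=5):
    X, y, T = np.array(X), np.array(y), np.array(T)
    S_train = np.zeros((X.shape[0], len(base_models)))
    S_test = np.zeros((T.shape[0], len(base_models)))
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    for i, model in enumerate(base_models):
        S_test_i = np.zeros((T.shape[0], n_splits))
        for j, (tr, va) in enumerate(folds.split(X)):
            model.fit(X[tr], y[tr])
            # Level 1: each train row is predicted by a model that never saw it.
            S_train[va, i] = model.predict(X[va])
            S_test_i[:, j] = model.predict(T)
        S_test[:, i] = S_test_i.mean(axis=1)
    # Level 2: the stacker learns how to combine the base models.
    stacker.fit(S_train, y)
    return stacker.predict(S_test)

X = np.random.rand(100, 5)
y = X.sum(axis=1)
T = np.random.rand(20, 5)
base = [RandomForestRegressor(n_estimators=20, random_state=0),
        ExtraTreesRegressor(n_estimators=20, random_state=0)]
print(stack_fit_predict(X, y, T, base, LinearRegression())[:3])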
@halegreen
halegreen / tf_CBOW.py
Last active October 7, 2017 11:43
CBOW (Continuous Bag-of-Words) model implementation with TensorFlow; only the generate_batch() function needs to change.
def generate_batch(data, batch_size, skip_window):
    """
    Generates a mini-batch of training data for the CBOW embedding model.

    :param data (numpy.ndarray(dtype=int, shape=(corpus_size,))): holds the
        training corpus, with words encoded as integers
    :param batch_size (int): size of the batch to generate
    :param skip_window (int): number of words to both the left and the right
        that form the context window for the target word

    The batch is a matrix of shape (batch_size, 2*skip_window); each row
    holds all the context words for one example, and the corresponding
    label is the word in the middle of the context.
    """
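The preview stops after the docstring. A minimal sketch consistent with it follows; the random sampling of target positions is an assumption, since the gist's actual traversal order is not visible here.

import numpy as np

def generate_batch(data, batch_size, skip_window):
    # Pick target positions far enough from both ends of the corpus,
    # then collect the 2*skip_window surrounding words as the context.
    span = 2 * skip_window
    batch = np.zeros((batch_size, span), dtype=np.int32)
    labels = np.zeros((batch_size, 1), dtype=np.int32)
    positions = np.random.randint(skip_window, len(data) - skip_window, batch_size)
    for i, pos in enumerate(positions):
        context = list(data[pos - skip_window:pos]) + \
                  list(data[pos + 1:pos + skip_window + 1])
        batch[i, :] = context
        labels[i, 0] = data[pos]
    return batch, labels

corpus = np.arange(1000) % 50           # toy corpus of integer word ids
contexts, targets = generate_batch(corpus, batch_size=8, skip_window=2)
print(contexts.shape, targets.shape)     # (8, 4) (8, 1)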
@halegreen
halegreen / likelihood_encoding.py
Last active September 9, 2017 11:43
Likelihood encoding implementation using two-level CV
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold
import dill as pickle
import sys

def input_data(train_file):
    # Load the pickled training DataFrame and prepare the category list.
    with open(train_file, 'rb') as f1:
        train_data = pickle.load(f1)
    cat_feature = []
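The preview stops after loading the data. As a sketch of the core idea, here is a single-level out-of-fold variant of likelihood (target-mean) encoding; the gist's two-level CV presumably nests a second KFold inside each outer fold, which is omitted here for brevity, and likelihood_encode is a hypothetical name.

import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

def likelihood_encode(train, col, target, n_splits=5):
    # Each row is encoded with the target mean of its category computed
    # only on the other folds, which limits target leakage.
    encoded = np.full(len(train), train[target].mean())
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    for tr, va in folds.split(train):
        means = train.iloc[tr].groupby(col)[target].mean()
        prior = train.iloc[tr][target].mean()   # fallback for unseen categories
        encoded[va] = train.iloc[va][col].map(means).fillna(prior).values
    return encoded

df = pd.DataFrame({"city": list("aabbbc"), "y": [1, 0, 1, 1, 0, 1]})
df["city_lik"] = likelihood_encode(df, "city", "y", n_splits=3)
print(df)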