Benjamin Trent benwtrent

@benwtrent
benwtrent / nab_logstash.conf
Last active August 15, 2018 22:30
NAB artificial data logstash conf
input {
  stdin {
  }
}
filter {
  csv {
    columns => ["timestamp","value"]
    separator => ","
    convert => { 'value' => 'float' }
@benwtrent
benwtrent / ml_hlrc_precheck.sh
Last active September 17, 2018 19:59
ML HLRC test running help
#!/usr/bin/env bash
# These should be deprecated when things are moved out of the protocol package
./gradlew x-pack:protocol:checkStyle
./gradlew x-pack:protocol:test
./gradlew :client:rest-high-level:checkStyle
./gradlew :client:rest-high-level:test
./gradlew :client:rest-high-level:integTest -Dtests.class=org.elasticsearch.client.documentation.MlClientDocumentationIT
@benwtrent
benwtrent / c_cpp_properties.json
Created November 9, 2018 19:59
Various .vscode files used to work with: https://github.com/elastic/ml-cpp
{
  "configurations": [
    {
      "name": "Mac",
      "includePath": [
        "${workspaceFolder}/**",
        "/usr/local/include/boost-1_65_1/",
        "/usr/local/include",
        "/usr/include"
      ]
@benwtrent
benwtrent / mix_type_and_reindex
Last active January 2, 2019 20:32
typeless reindex being called against indexes with `doc` mapping
PUT _template/template_test
{
  "index_patterns": ["test*"],
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "doc": {
      "dynamic_templates": [
        {
@benwtrent
benwtrent / mat-to-csv
Created January 3, 2019 19:25
mat to csv octave: just a simple
% FILENAME is a placeholder; file names must be passed as quoted strings
f = load('FILENAME.mat')
C = [f.X, f.y]
csvwrite('FILENAME.csv', C)
@benwtrent
benwtrent / values_true.json
Created February 28, 2019 19:25
How to select terms buckets by all values being true
PUT test-bool/
{
  "mappings" : {
    "_doc" : {
      "properties" : {
        "my-field" : {
          "type" : "boolean"
        },
        "my-terms": {
          "type" : "keyword"
        }
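The preview above is cut off before the query, so the following is not the author's aggregation. It is only a minimal Python sketch of the selection the description names: group documents by `my-terms` and keep the terms whose `my-field` is true in every document. The function name `terms_with_all_true` and the sample documents are hypothetical, and no Elasticsearch call is involved.

```python
def terms_with_all_true(docs, terms_field="my-terms", bool_field="my-field"):
    """Return the terms for which every document has a true boolean value."""
    all_true = {}
    for doc in docs:
        term = doc[terms_field]
        # A term stays selected only while every value seen so far is true.
        all_true[term] = all_true.get(term, True) and bool(doc[bool_field])
    return sorted(term for term, ok in all_true.items() if ok)


# Hypothetical sample documents, for illustration only.
sample_docs = [
    {"my-terms": "a", "my-field": True},
    {"my-terms": "a", "my-field": False},
    {"my-terms": "b", "my-field": True},
]
print(terms_with_all_true(sample_docs))
```

Inside Elasticsearch the same idea would need a per-bucket check over the boolean field; the exact request is not shown in the preview, so it is left out here.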
@benwtrent
benwtrent / painless_linear_regression.json
Created March 7, 2019 21:50
Linear Regression Model inference with Painless in Elasticsearch
PUT _scripts/linear_regression_inference
{
  "script": {
    "lang": "painless",
    "source": """
      double total = params.intercept;
      for (int i = 0; i < params.coefs.length; ++i) {
        total += params.coefs.get(i) * doc[params['x'+i]].value;
      }
      return total;
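For readers who want to check the arithmetic outside Elasticsearch, here is a rough Python sketch of the same calculation as the stored script: start from `params.intercept`, then add each coefficient times the value of the document field named by `params.x0`, `params.x1`, and so on. The function name and the example field names (`sqft`, `rooms`) are hypothetical; a plain dict stands in for the document.

```python
def linear_regression_inference(params, doc):
    """Mirror the Painless loop: intercept + sum(coefs[i] * doc[params['x' + i]])."""
    total = params["intercept"]
    for i, coef in enumerate(params["coefs"]):
        field_name = params[f"x{i}"]  # params.x0, params.x1, ... hold field names
        total += coef * doc[field_name]
    return total


# Hypothetical parameters and document, for illustration only.
example_params = {"intercept": 1.0, "coefs": [2.0, 0.5], "x0": "sqft", "x1": "rooms"}
example_doc = {"sqft": 3.0, "rooms": 4.0}
print(linear_regression_inference(example_params, example_doc))
```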
@benwtrent
benwtrent / m2cgen_painless.py
Last active May 9, 2019 21:03
This is a giant script created via some hacking against m2cgen
import m2cgen as m2c
import xgboost as xgb
from sklearn import datasets
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

diabetes = datasets.load_diabetes()  # load data
X_train, X_test, y_train, y_test = train_test_split(diabetes.data, diabetes.target, test_size=0.2, random_state=0)
print(diabetes.feature_names)
@benwtrent
benwtrent / LogParser.java
Created July 12, 2019 19:37
Java implementation of the Drain algorithm.
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
/**
* This class contains the parsing logic for implementing the DRAIN algorithm.
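The Java source is truncated here, so the following is not the gist's implementation. As a rough illustration of the core idea in Drain, this Python sketch groups log lines by token count, compares each line to the existing templates by the fraction of positions with identical tokens, and replaces mismatched positions with a wildcard when the similarity reaches a threshold. It deliberately omits Drain's fixed-depth prefix tree over leading tokens and any preprocessing; the class name `SimpleDrain`, the `<*>` wildcard, and the 0.5 threshold are assumptions made for the example.

```python
WILDCARD = "<*>"


def similarity(template, tokens):
    """Fraction of positions where the template token equals the log token."""
    if not tokens:
        return 1.0
    same = sum(1 for t, tok in zip(template, tokens) if t == tok)
    return same / len(tokens)


def merge(template, tokens):
    """Keep matching tokens; replace mismatched positions with the wildcard."""
    return [t if t == tok else WILDCARD for t, tok in zip(template, tokens)]


class SimpleDrain:
    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.groups = {}  # token count -> list of templates (token lists)

    def add(self, line):
        """Assign a log line to a template and return the (updated) template."""
        tokens = line.split()
        templates = self.groups.setdefault(len(tokens), [])
        best_index, best_sim = -1, -1.0
        for i, template in enumerate(templates):
            sim = similarity(template, tokens)
            if sim > best_sim:
                best_index, best_sim = i, sim
        if best_index >= 0 and best_sim >= self.threshold:
            templates[best_index] = merge(templates[best_index], tokens)
            return templates[best_index]
        templates.append(tokens)  # no close match: start a new template
        return tokens


parser = SimpleDrain(threshold=0.5)
for log_line in ["Connected to 10.0.0.1 port 22", "Connected to 10.0.0.2 port 22"]:
    print(" ".join(parser.add(log_line)))
```

Grouping by token count first keeps the comparison cheap, since a template is only ever compared with lines of the same length.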
@benwtrent
benwtrent / mappings
Last active November 11, 2019 17:24
building out avg price prediction on a house given Asheville Airbnb listing data
{
  "listings-ash" : {
    "aliases" : { },
    "mappings" : {
      "_meta" : {
        "created_by" : "ml-file-data-visualizer"
      },
      "properties" : {
        "@timestamp" : {
          "type" : "date"