Gray grayskripko

  • Buenos Aires, Argentina
@grayskripko
grayskripko / flawed_models.txt
Last active July 31, 2017 21:36
Flawed models
>> Regression:
dataset: [diamonds], target: [price], by_col: [color], prod(dim(df)): [539400]
        model time  memory        mse    R2
1:  svmRadial 80.8 16.6 MB   403177.8 0.974
2:        xgb  1.0  4.1 MB  1201304.2 0.924
3:        glm  0.5 19.9 MB  1585278.7 0.899
4: mean_dummy  0.0       0 15767646.5 0.000
....
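The R2 column appears consistent with the usual definition relative to the mean-predicting baseline, R2 = 1 - MSE_model / MSE_mean_dummy. That reading is an assumption, since the gist does not show how R2 was computed. A minimal check, using only the MSE values from the visible rows:

```r
# MSE values copied from the visible rows of the table.
mse <- c(svmRadial  = 403177.8,
         xgb        = 1201304.2,
         glm        = 1585278.7,
         mean_dummy = 15767646.5)

# Assumed definition: R2 relative to the mean-predicting baseline.
r2 <- 1 - mse / mse[["mean_dummy"]]

print(round(r2, 3))
```

Rounded to three decimals, this gives 0.974, 0.924, 0.899 and 0.000, matching the R2 column.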
grayskripko / model_by.txt
Last active July 31, 2017 22:15
Model_by experiments on diamonds dataset
# all xgb models on every layer were run with: eta = 0.1, nrounds = 200, lambda = 1
# glmnet: alpha = 1, nlambda = 100, standardize = T (centered on the mean and scaled)
# ranger: mtry = 8, num.trees = 200
# glm: standardize = T
>> Regression:
dataset: [diamonds], target: [price], by_col: [color], prod(dim(df)): [539400]
1-layer models: [ glm_by ]
model time memory mse R2
grayskripko / catboost_install.txt
Last active February 10, 2018 18:12
Catboost installation problems in Windows 10 for R
Main answer: https://github.com/catboost/catboost/issues/247
Another, dirtier way:
Install both the 2015 and 2017 Visual C++ Build Tools: http://landinghub.visualstudio.com/visual-cpp-build-tools
Many options are listed at the end of the article https://tech.yandex.com/catboost/doc/dg/concepts/r-installation-docpage/
Components to pick: https://github.com/catboost/catboost/issues/30#issuecomment-316545310
Check PATH and edit it if needed so that it includes python: https://github.com/catboost/catboost/issues/3#issuecomment-316547561
devtools::find_rtools();
devtools::install_github('catboost/catboost', subdir = 'catboost/R-package')
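The PATH check mentioned in the notes above can be done from inside R before running the install. This is a minimal sketch using base R only; the executable name "python" is an assumption and may differ on a given machine.

```r
# Look up python on the PATH that the current R session sees.
# "python" is an assumed executable name; adjust it if your install differs.
python_path <- Sys.which("python")

if (nzchar(python_path)) {
  message("python found at: ", python_path)
} else {
  message("python is not on PATH; current entries:")
  # .Platform$path.sep is ";" on Windows and ":" elsewhere.
  path_entries <- strsplit(Sys.getenv("PATH"), .Platform$path.sep, fixed = TRUE)[[1]]
  print(path_entries)
}
```

If python is missing, the PATH can be extended for the current session with Sys.setenv(PATH = ...), or permanently through the Windows environment variable settings.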
grayskripko / lightgbm_for_caret.txt
Last active August 2, 2017 11:15
LightGBM for caret
from https://github.com/bwilbertz/RLightGBM
in git shell:
git clone --recursive https://github.com/bwilbertz/RLightGBM.git
cd RLightGBM
R CMD build --no-build-vignettes pkg/RLightGBM
in RStudio for Windows 10:
install.packages("RLightGBM/RLightGBM_0.1.tar.gz", type = "source", repos = NULL)
grayskripko / ignor_cols.txt
Created August 9, 2017 10:53
Ignoring columns
# please note, the first 2 tests are on the same data!
> devtools::test(filter='Ignor')
Loading stackatto
Testing stackatto
>> Ignoring columns:
dataset [Ionosphere], target [V26], by_col [], n_cells [12K], seed [9171]
layer1 models: [ kknn svmLinear svmRadial glm glmnet rf ranger catboost lgbm xgb ]
Ignored: [ V30 V16 ]
grayskripko / edit_fun.R
Last active September 4, 2017 09:56
Edit external function
#' Edits a function
#' @param fun a function to change
#' @param pattern a string of the code lines to change. Not a regex
#' @param replacement a string of the new lines of code
edit_fun <- function(fun, pattern, replacement) {
  stopifnot(length(pattern) == 1 && length(replacement) == 1)
  align_spaces <- function(x) gsub(' +', ' ', gsub('\n', ' \n', x))
  deparsed_func <- align_spaces(paste(deparse(fun), collapse = '\n'))
  pattern <- align_spaces(pattern)
  stopifnot(grepl(pattern, deparsed_func, fixed = TRUE))
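The preview is cut off after the grepl check, so the rest of edit_fun is not shown here. The general technique (deparse the function, replace a literal fragment, parse and evaluate the result) can be sketched as follows. The helper name replace_in_fun and the example functions are hypothetical and are not the author's code; the whitespace normalisation done by align_spaces is left out for brevity.

```r
# Hypothetical helper, NOT the author's edit_fun: replaces a literal code
# fragment in a function's deparsed source and rebuilds the function.
replace_in_fun <- function(fun, pattern, replacement) {
  stopifnot(is.function(fun), length(pattern) == 1, length(replacement) == 1)
  src <- paste(deparse(fun), collapse = "\n")
  # Fail early if the fragment is absent (literal match, not a regex).
  stopifnot(grepl(pattern, src, fixed = TRUE))
  new_src <- sub(pattern, replacement, src, fixed = TRUE)
  # Evaluate in the original function's environment so the rebuilt
  # function sees the same enclosing variables.
  eval(parse(text = new_src), envir = environment(fun))
}

add_one <- function(x) x + 1
add_two <- replace_in_fun(add_one, "x + 1", "x + 2")
print(add_two(1))  # 3
```

Only the first occurrence of the fragment is replaced because sub is used; gsub would replace every occurrence.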
grayskripko / ssh22temporary.txt
Created September 27, 2017 12:58
AWS ssh error "port 22: Resource temporarily unavailable"
Edit the security group as described at https://spark.rstudio.com/examples-emr.html
In the security group's inbound rules, set the SSH source to your actual IP address.
Choosing the source "Anywhere" also works but is not recommended, since it opens port 22 to every address.
grayskripko / emr_permission_denied.txt
Created September 27, 2017 13:04
emr permission denied
sudo su
sudo su - hadoop
Add administrative policies to the AWS IAM roles EMR_DefaultRole and EMR_EC2_DefaultRole
grayskripko / RBuildTools.txt
Created December 18, 2017 14:26
r windows 10 install rbuildtools again
devtools::find_rtools() # solution
devtools::install_github('thomasp85/lime')
grayskripko / mlr_makelearner_assertion_failed.txt
Created December 18, 2017 14:36
mlr assertion on choices failed makeLearner
mlr::makeLearner("classif.ranger")
# > Assertion on 'choices' failed. Must be of length >= 1, but has length 0.
# Solution
library(mlr)