#!/usr/bin/env python
import argparse
import numpy as np
import pandas as pd
import scipy.sparse as sparse
# Note: sklearn.cross_validation was removed in scikit-learn 0.20;
# on newer releases its utilities live in sklearn.model_selection.
from sklearn import cross_validation, metrics
from sklearn.datasets import dump_svmlight_file
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
'''
\s    [ \n\t\f\r]
\S    [^ \n\t\f\r]
\w    [A-Za-z0-9_]
\d    [0-9]
\D    [^0-9]
\W    [^A-Za-z0-9_]
\b    word boundary
^     beginning of string
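The character classes listed above can be tried out with Python's standard `re` module. This is a minimal sketch, not part of the original snippet; the sample string `text` is made up for illustration. Python 3 `str` patterns match Unicode by default, so the sketch passes `re.ASCII` to keep `\w`, `\d`, and `\s` close to the ASCII sets in the cheat sheet.

```python
import re

# Hypothetical sample input, chosen only to exercise each class.
text = "id_42 costs $3.50\n"

# \w+ : runs of word characters (letters, digits, underscore)
words = re.findall(r"\w+", text, re.ASCII)

# \d+ : runs of digits
digits = re.findall(r"\d+", text, re.ASCII)

# \D  : anything that is not a digit; deleting it keeps only the digits
only_digits = re.sub(r"\D", "", text, flags=re.ASCII)

# \s+ : runs of whitespace, used here as a field separator
fields = re.split(r"\s+", text.strip(), flags=re.ASCII)

# \S+ : runs of non-whitespace characters
tokens = re.findall(r"\S+", text, re.ASCII)

# \b  : word boundary; "cost" is not a whole word in the sample, "costs" is
whole_cost = re.search(r"\bcost\b", text)
whole_costs = re.search(r"\bcosts\b", text)

# ^   : anchors the match at the beginning of the string
starts_with_id = re.search(r"^id", text) is not None

print(words, digits, only_digits, fields, tokens, starts_with_id)
```
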
import pandas as pd

def prep_log(sec_file, min_file, src_tz='US/Pacific', dst_tz='US/Eastern', datetime_fmt='%m/%d/%y %H:%M'):
    """Preprocess a second-level log file by aggregating it to the minute level and converting the timezone if necessary.

    Args:
        sec_file: a second-level CSV log file with timestamps in the first column
        min_file: a minute-level CSV output log file with timestamps in the first column
        src_tz: the source timezone (default: US/Pacific)
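The body of `prep_log` is not shown in this preview, so the following is only a sketch of how the behaviour described in the docstring could be done with pandas. Everything in it is an assumption: the helper name `aggregate_to_minutes` is hypothetical, the aggregation is taken to be a per-minute mean, and the input timestamps are taken to include seconds (hence a `%S` in the format, unlike the default `datetime_fmt` in the signature above).

```python
import io

import pandas as pd


def aggregate_to_minutes(sec_csv, src_tz="US/Pacific", dst_tz="US/Eastern",
                         datetime_fmt="%m/%d/%y %H:%M:%S"):
    """Hypothetical helper: average second-level rows per minute, then convert the timezone.

    sec_csv may be a file path or a file-like object; timestamps are expected
    in the first column (assumption taken from the docstring above).
    """
    df = pd.read_csv(sec_csv, index_col=0)
    # Parse the first column into a timezone-naive DatetimeIndex.
    df.index = pd.to_datetime(df.index, format=datetime_fmt)
    # Assumed aggregation: mean of all rows that fall within the same minute.
    per_minute = df.resample("1min").mean()
    # Attach the source timezone, then express the same instants in the target timezone.
    per_minute.index = per_minute.index.tz_localize(src_tz).tz_convert(dst_tz)
    return per_minute


# Made-up sample data for illustration only.
sample = io.StringIO(
    "ts,value\n"
    "01/15/20 09:00:00,1.0\n"
    "01/15/20 09:00:30,3.0\n"
    "01/15/20 09:01:10,5.0\n"
)
result = aggregate_to_minutes(sample)
print(result)
```

Writing the result with `result.to_csv(...)` would produce the minute-level output file. Whether the original function averages, sums, or takes the last value per minute cannot be told from the preview.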
# written by 'y' @ kaggle - https://www.kaggle.com/tyi2000
write.libsvm <- function(data, target, filename="out.dat") {
  out <- file(filename)
  writeLines(paste(target, apply(data, 1, function( X ) paste(apply(cbind(which(X!=0), X[which(X!=0)]), 1, paste, collapse=":"), collapse=" "))), out)
  close( out )
}
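For readers working in Python, the row format that `write.libsvm` produces (the target, then `index:value` pairs for the nonzero features, with 1-based indices) can be sketched in a few lines. This is an illustrative equivalent, not a port by the original author; the function name `to_libsvm_lines` and the sample data are hypothetical.

```python
def to_libsvm_lines(rows, targets):
    """Hypothetical helper: format dense rows as libsvm text lines.

    Each line is "<target> <index>:<value> ..." and lists only nonzero
    features. Indices start at 1, as R's which() does in the function above.
    """
    lines = []
    for target, row in zip(targets, rows):
        features = " ".join(
            f"{index}:{value}"
            for index, value in enumerate(row, start=1)
            if value != 0
        )
        # rstrip() avoids a trailing space when a row has no nonzero features.
        lines.append(f"{target} {features}".rstrip())
    return lines


# Made-up sample data for illustration only.
lines = to_libsvm_lines([[0, 1.5, 0, 2], [3, 0, 0, 0]], [1, 0])
print("\n".join(lines))
```

scikit-learn's `dump_svmlight_file` writes the same file format directly from arrays or sparse matrices; note that it uses zero-based indices unless `zero_based=False` is passed.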