Alexey Korchevsky mitallast
@mitallast
mitallast / console output
Last active October 16, 2019 11:50
Example Naive Bayes Classifier with Apache Spark Pipeline
+--------+--------------------+-----+--------------------+--------------------+--------------------+--------------------+----------+
|category|                text|label|               words|            features|       rawPrediction|         probability|prediction|
+--------+--------------------+-----+--------------------+--------------------+--------------------+--------------------+----------+
|    3001|Плойки и наборы V...| 24.0|[плойки, и, набор...|(10000,[326,796,1...|[-174.67716870697...|[6.63481663197049...|      24.0|
|     833|"Чехол-обложка дл...|  1.0|["чехол-обложка, ...|(10000,[514,986,1...|[-379.37151502387...|[5.32678001676623...|       1.0|
|     833|"Чехол-обложка дл...|  1.0|["чехол-обложка, ...|(10000,[514,986,1...|[-379.84825219376...|[2.15785456821554...|       1.0|
|     833|"Чехол-обложка дл...|  1.0|["чехол-обложка, ...|(10000,[290,514,9...|[-395.42735009477...|[6.44323423370500...|       1.0|
|     833|"Чехол-обложка дл...|  1.0|["чехол-обложка, ...|(10000,[290,514,9...|[-396.10251348
@mitallast
mitallast / console output
Last active February 2, 2016 09:01
Apache Spark NN test
+--------+--------------------+-----+--------------------+--------------------+----------+
|category|                text|label|               words|            features|prediction|
+--------+--------------------+-----+--------------------+--------------------+----------+
|       0|"Мышь беспроводна...|  0.0|["мышь, беспровод...|(10000,[372,634,6...|       3.0|
|       9|покрышка Данлоп 2...|  8.0|[покрышка, данлоп...|(10000,[118,1828,...|       0.0|
|       0|"Стилус для Nokia...|  0.0|["стилус, для, no...|(10000,[45,290,57...|       1.0|
|       9|покрышка Континен...|  8.0|[покрышка, контин...|(10000,[50,121,18...|       0.0|
|     833|Alcatel OT-890 St...|  1.0|[alcatel, ot-890,...|(10000,[971,1031,...|       0.0|
|     833|"Nokia Asha 200 G...|  1.0|["nokia, asha, 20...|(10000,[544,548,1...|       0.0|
|     833|"Samsung Champ Ne...|  1.0|["samsung, champ,...|(10000,[182,325,6...|       0.0|
@mitallast
mitallast / console output
Created February 2, 2016 10:12
Cross-validation Spark example
+--------+--------------------+-----+--------------------+--------------------+--------------------+--------------------+----------+
|category|                text|label|               words|            features|       rawPrediction|         probability|prediction|
+--------+--------------------+-----+--------------------+--------------------+--------------------+--------------------+----------+
|     833|"Чехол-обложка дл...|  1.0|["чехол-обложка, ...|(10000,[514,986,1...|[-379.50617089769...|[3.15784456725782...|       1.0|
|     833|"Чехол-обложка дл...|  1.0|["чехол-обложка, ...|(10000,[290,514,9...|[-395.54097963559...|[3.98362185323457...|       1.0|
|       0|"Держатель для мо...|  0.0|["держатель, для,...|(10000,[34,45,47,...|[-333.85171077966...|[0.88443426309164...|       0.0|
|       9|Шина Nordman RS 1...|  8.0|[шина, nordman, r...|(10000,[1124,1223...|[-70.906588615908...|[5.01470370003123...|       8.0|
|     833|"Набор для зарядк...|  1.0|["набор, для, зар...|(10000,[130,292,5...|[-530.30719860
@mitallast
mitallast / SimpleApp.scala
Created February 5, 2016 12:07
Example app to classify sales
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.NaiveBayes
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.ml.feature.{HashingTF, StringIndexer, Tokenizer}
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}
import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkConf, SparkContext}
object SimpleApp {
  def main(args: Array[String]): Unit = {
@mitallast
mitallast / impulse-response.py
Created February 14, 2016 08:57
Prints the impulse response from a WAV file (such as a reverb or guitar cab IR)
from scikits.audiolab import Sndfile
import numpy as np
import matplotlib.pyplot as plt
import scipy.signal
f = Sndfile('impulse.wav', 'r')
data = np.array(f.read_frames(f.nframes), dtype=np.float64)
# read the sample rate before closing the file handle
fs = f.samplerate
f.close()
@mitallast
mitallast / predict.py
Created March 8, 2016 13:25
Predict the next day's open price for USD from the previous 5 days
import pandas_datareader.data as web
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
import numpy as np
import datetime
start = datetime.datetime(2010, 1, 1)
end = datetime.datetime(2015, 3, 8)
usd = web.DataReader("USD", 'google', start, end)
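Predicting from "the previous 5 days" suggests a sliding-window setup: each training row holds five consecutive open prices, and the target is the open price on the following day. Below is a minimal sketch of that windowing step in plain Python. The helper name `make_windows` and the sample prices are assumptions made for illustration; they are not taken from the gist, whose remaining code is not shown here.

```python
def make_windows(prices, lag=5):
    """Build (features, targets): each feature row holds `lag`
    consecutive prices, and the target is the price on the next day."""
    features, targets = [], []
    for i in range(len(prices) - lag):
        features.append(prices[i:i + lag])
        targets.append(prices[i + lag])
    return features, targets


# Hypothetical open prices, oldest first (not real market data).
sample_prices = [10.0, 10.5, 10.2, 10.8, 11.0, 11.3, 11.1]
X, y = make_windows(sample_prices, lag=5)
```

The resulting `X` and `y` could then be passed to a regressor such as the `LinearRegression` class imported in the preview.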
$ go test -bench=.
testing: warning: no tests to run
PASS
BenchmarkJsonReflection-8 100000 14689 ns/op
BenchmarkFFJson-8 200000 6345 ns/op
ok github.com/mitallast/goprotoserver/protocol 2.974s
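To compare the two benchmark results, the ns/op figures can be divided directly. The sketch below parses the two benchmark lines from the output above and computes the ratio; the function name `parse_ns_per_op` is a hypothetical helper, and the regular expression assumes the simple `name iterations value ns/op` layout shown here.

```python
import re

# Benchmark lines copied from the console output above.
bench_output = """\
BenchmarkJsonReflection-8 100000 14689 ns/op
BenchmarkFFJson-8 200000 6345 ns/op
"""


def parse_ns_per_op(text):
    """Map each benchmark name to its ns/op value."""
    results = {}
    pattern = r"^(Benchmark\S+)\s+\d+\s+([\d.]+) ns/op"
    for match in re.finditer(pattern, text, re.M):
        results[match.group(1)] = float(match.group(2))
    return results


timings = parse_ns_per_op(bench_output)
speedup = timings["BenchmarkJsonReflection-8"] / timings["BenchmarkFFJson-8"]
print(f"FFJson takes roughly 1/{speedup:.1f} of the time per operation")
```

On these numbers the reflection-based benchmark takes a little over twice as long per operation as the FFJson one.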
# init
# L - error function (squared error)
def gbm_L(y, z):
    return (y-z)**2
# derivative of L with respect to y (equivalently, the negative gradient
# with respect to z), without the constant coefficient `2`
def dbm_L_der(y, z):
    # ignore coefficient 2
    return y - z
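The relationship between the two functions can be checked numerically: once the dropped factor 2 is restored, `y - z` should match a finite-difference derivative of the squared error with respect to `y`. This is a small sanity-check sketch, not part of the original gist. The two functions are repeated so the block runs on its own, and `numeric_der_y` and the sample values are hypothetical.

```python
def gbm_L(y, z):
    # Squared-error loss, as defined above.
    return (y - z) ** 2


def dbm_L_der(y, z):
    # Derivative with respect to y without the factor 2, as defined above.
    return y - z


def numeric_der_y(y, z, h=1e-6):
    # Central finite difference of gbm_L with respect to y
    # (hypothetical helper, for checking only).
    return (gbm_L(y + h, z) - gbm_L(y - h, z)) / (2 * h)


y_val, z_val = 3.0, 1.5                      # arbitrary sample point
analytic = 2 * dbm_L_der(y_val, z_val)       # restore the dropped factor 2
numeric = numeric_der_y(y_val, z_val)
print(abs(analytic - numeric) < 1e-6)
```

Dropping the constant 2 only rescales the gradient, which is commonly absorbed into a step size or learning rate.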
int[] a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
int[] b = [1, 2, 3, 4, 9, 8, 7, 6, 5]
int[] e = [9, 1, 8, 2, 7, 3, 6, 4, 5]
int[] f = [9, 8, 7, 6, 5, 4, 3, 2, 1]
int[] c = [1, 2, 3, 4, 5, 6, 7, 8, 10]
int[] d = [1, 2, 3, 4, 5, 6, 7, 8, 8]
int[] g = [1, 8, 3, 4, 5, 6, 7, 8, 8]