Yu Ishikawa yu-iskw

@yu-iskw
yu-iskw / file0.r
Created January 1, 2014 12:40
R's merge function accepts multiple join keys ref: http://qiita.com/hereticreader/items/ac6aed5445a2198a2c49
merge(x=data.frame.x, y=data.frame.y
, by.x=c("clm1", "clm2")
, by.y=c("clm1", "clm2")
)
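As a minimal sketch of the multi-key merge above, the example below joins two small data frames on both `clm1` and `clm2`. The data frames `x` and `y` and their value columns `vx` and `vy` are hypothetical, invented only for illustration; only rows whose two key columns both match are kept.

```r
# Hypothetical sample data: two frames sharing the key columns clm1 and clm2
x <- data.frame(clm1 = c("a", "a", "b"), clm2 = c(1, 2, 1), vx = c(10, 20, 30))
y <- data.frame(clm1 = c("a", "b", "b"), clm2 = c(1, 1, 2), vy = c(100, 200, 300))

# Inner join on the pair (clm1, clm2); a row survives only if both keys match
merged <- merge(x = x, y = y,
                by.x = c("clm1", "clm2"),
                by.y = c("clm1", "clm2"))
print(merged)
```

When the key columns have the same names in both frames, as here, passing `by = c("clm1", "clm2")` should be equivalent to giving `by.x` and `by.y` separately.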
#!/bin/sh
# Option 1: install from apt (an older Scala version will be installed)
# sudo apt-get install scala
# Option 2: remove the apt packages and install the 2.10.3 .deb from scala-lang.org
sudo apt-get remove scala-library scala
wget www.scala-lang.org/files/archive/scala-2.10.3.deb
sudo dpkg -i scala-2.10.3.deb
sudo apt-get update
yu-iskw / gist:518a48a68ef368998058
Last active August 29, 2015 14:05
Experiment comparing the performance of Math.abs() and breeze.numerics.abs

Source code, written with ScalaTest

class CheckPerformanceSuite extends FunSuite {
  test("check the performance") {
    var start = System.currentTimeMillis()
    val vectors = (1 to 3000000).map { i => Vectors.dense(Math.random(), -1 * Math.random(), Math.random())}
    var end = System.currentTimeMillis()
    println(s"Time for Creating: ${end - start}")
yu-iskw / gist:aa56aad7481c4b45192c
Created August 25, 2014 22:32
Recheck the performance of Math.abs and breeze.numerics.abs

Test Code

import breeze.numerics.abs
import org.apache.spark.mllib.linalg.Vectors
import org.scalatest.FunSuite

class CheckPerformanceSuite extends FunSuite {
  test("check the performance") {
    var start = System.currentTimeMillis()
yu-iskw / gist:37ae208c530f7018e048
Last active August 29, 2015 14:05
Distance Metric Implementation Example
import breeze.generic.UFunc
import breeze.linalg.{sum, DenseVector => DBV, Vector => BV}
import breeze.macros.expand
object distance extends UFunc {
val DEFAULT_METHOD = "euclidean"
@expand
@expand.valify
yu-iskw / gist:4e0d3a2f999effbcf640
Last active September 18, 2015 09:20
A weighted Euclidean distance function implementation
package breeze.linalg.functions
import breeze.generic.UFunc
import breeze.linalg.{SparseVector, DenseVector}
import breeze.numerics.sqrt
/**
* A weighted Euclidean distance function implementation
*/
object weightedEuclideanDistance extends UFunc {
/**
* bisecting <master> <input> <nNodes> <subIterations>
*
* divisive hierarchical clustering using bisecting k-means
* assumes input is a text file, each row is a data point
* given as numbers separated by spaces
*
*/
import org.apache.spark.SparkContext
using Distributions
export Measure, DiscreteMeasure, Estimator, EstimatorQuality,
blb, randsubset
abstract Measure
type DiscreteMeasure{S<:Number,T<:Number} <: Measure
points :: Vector{S}
weights :: Vector{T}
yu-iskw / gist:ba249f79ef338ff86967
Created June 17, 2015 23:51
A problem with adding a new operator to SparkR
yu-iskw / gist:b727ad7c515eb790f971
Created June 19, 2015 03:56
How to install a package that is not yet installed
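# pkg holds the package name as a string, so require() needs character.only = TRUE
if (!require(pkg, character.only = TRUE)) {
  install.packages(pkg)
}
# Alternative: look the package up by name in the installed package list
if (!("somepackage" %in% rownames(installed.packages()))) {
  install.packages("somepackage")
}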