@libratiger
libratiger / heap_sort.py
Created December 16, 2012 15:17
A heap sort written in Python
# -*- coding: utf-8 -*-
"""
This is the heap sort, the algorithm is dive into two step:
first: bulid the max heap
second: heap sort
By: DjvuLee @2012-12-16
"""
@libratiger
libratiger / adaboost.py
Created June 10, 2013 13:50
A simple demo of the AdaBoost algorithm
from __future__ import division
from numpy import *

class AdaBoost:
    def __init__(self, training_set):
        self.training_set = training_set
        self.N = len(self.training_set)
        self.weights = ones(self.N) / self.N
        self.RULES = []
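The `__init__` above only initializes uniform sample weights; the core AdaBoost step (not shown in this excerpt) reweights samples after each weak rule. A minimal sketch of one boosting round, with `adaboost_round` as a hypothetical helper name and labels/predictions in {-1, +1}:

```python
import numpy as np

def adaboost_round(weights, preds, labels):
    """One AdaBoost round: weighted error, rule weight alpha, updated weights."""
    # Weighted error of the weak rule on the current distribution.
    err = float(np.sum(weights[preds != labels]))
    # Rule weight: larger when the rule is more accurate.
    alpha = 0.5 * np.log((1.0 - err) / err)
    # Up-weight misclassified samples, down-weight correct ones, renormalize.
    new_w = weights * np.exp(-alpha * labels * preds)
    new_w = new_w / new_w.sum()
    return alpha, new_w
```

After normalization the misclassified samples carry half the total weight, which is what forces the next weak rule to focus on them.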
require 'formula'
class ScalaDocs < Formula
homepage 'http://www.scala-lang.org/'
url 'http://www.scala-lang.org/files/archive/scala-docs-2.9.3.zip'
sha1 '5bf44bd04b2b37976bde5d4a4c9bb6bcdeb10eb2'
end
class ScalaCompletion < Formula
homepage 'http://www.scala-lang.org/'
2014-05-18 02:53:46
Full thread dump OpenJDK 64-Bit Server VM (24.45-b08 mixed mode):
"spark-akka.actor.default-dispatcher-26" daemon prio=10 tid=0x00007fd8e000a800 nid=0x6757 waiting on condition [0x00007fd94d322000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00007fdaa1a48bd8> (a akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinPool)
at scala.concurrent.forkjoin.ForkJoinPool.scan(ForkJoinPool.java:2075)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
@libratiger
libratiger / gist:0015550b4904c7c62cfe
Created May 25, 2014 14:45
The tempest.api.identity test result
======================================================================
FAIL: tempest.api.identity.admin.test_tokens.TokensTestJSON.test_create_get_delete_token[gate]
tags: worker-2
----------------------------------------------------------------------
Empty attachments:
pythonlogging:''
stderr
stdout
Traceback (most recent call last):
@libratiger
libratiger / Husky002.res
Last active August 29, 2015 14:02
LR data, with 24 GB of input data.
#Task ID  #Deserialization time (ms)  #Execution time (ms)  #Serialization time (ms)  #Transferred data size (bytes)  #Transfer start time (worker)  #Transfer end time (worker)  #Transfer time (s)
1465 5 7884 0 562 2014 06 01 22:12:10 2014 06 01 22:12:10 -0.073
1633 6 7415 0 562 2014 06 01 22:12:16 2014 06 01 22:12:16 -0.073
1689 7 7783 0 562 2014 06 01 22:12:18 2014 06 01 22:12:18 -0.073
1836 6 7594 0 562 2014 06 01 22:12:24 2014 06 01 22:12:23 -0.073
1864 4 6939 0 562 2014 06 01 22:12:24 2014 06 01 22:12:24 -0.073
2051 5 7684 1 562 2014 06 01 22:12:31 2014 06 01 22:12:31 -0.073
2254 6 7800 0 562 2014 06 01 22:12:39 2014 06 01 22:12:39 -0.073
2471 5 7786 0 562 2014 06 01 22:12:46 2014 06 01 22:12:46 -0.073
#Task ID  #Deserialization time (ms)  #Execution time (ms)  #Serialization time (ms)  #Transfer start time  #Transfer end time  #Transfer time (master)  #Transfer time (worker)
15 624 10253 2 562 2014 06 04 16:18:21 2014 06 04 16:18:21 2014 06 04 16:18:21 -69 15
43 625 10661 0 562 2014 06 04 16:18:22 2014 06 04 16:18:22 2014 06 04 16:18:22 -66 13
71 625 11029 0 562 2014 06 04 16:18:22 2014 06 04 16:18:22 2014 06 04 16:18:22 -70 8
99 624 10574 0 562 2014 06 04 16:18:22 2014 06 04 16:18:21 2014 06 04 16:18:22 -71 11
127 624 10473 0 562 2014 06 04 16:18:21 2014 06 04 16:18:21 2014 06 04 16:18:21 -71 7
155 624 11530 0 562 2014 06 04 16:18:22 2014 06 04 16:18:22 2014 06 04 16:18:22 -70 9
183 625 10480 0 562 2014 06 04 16:18:21 2014 06 04 16:18:21 2014 06 04 16:18:21 -70 9
211 624 10452 0 562 2014 06 04 16:18:21 2014 06 04 16:18:21 2014 06 04 16:18:21 -70 9
239 624 10924 0 562 2014 06 04 16:18:22 2014 06 04 16:18:22 2014 06 04 16:18:22 -71 8
@libratiger
libratiger / driver.log
Created June 11, 2014 14:40
KMeans algorithm driver.log
14/06/11 14:34:46 INFO SparkContext: Job finished: collect at KMeans.scala:232, took 9.079383951 s
14/06/11 14:34:46 INFO SparkContext: Starting job: collectAsMap at KMeans.scala:224
14/06/11 14:34:46 INFO DAGScheduler: Registering RDD 15 (reduceByKey at KMeans.scala:224)
14/06/11 14:34:46 INFO DAGScheduler: Got job 4 (collectAsMap at KMeans.scala:224) with 9000 output partitions (allowLocal=false)
14/06/11 14:34:46 INFO DAGScheduler: Final stage: Stage 9 (collectAsMap at KMeans.scala:224)
14/06/11 14:34:46 INFO DAGScheduler: Parents of final stage: List(Stage 10)
14/06/11 14:34:46 INFO DAGScheduler: Missing parents: List(Stage 10)
14/06/11 14:34:46 INFO DAGScheduler: Submitting Stage 10 (MapPartitionsRDD[15] at reduceByKey at KMeans.scala:224), which has no missing parents
14/06/11 14:34:50 INFO DAGScheduler: Submitting 1200 missing tasks from Stage 10 (MapPartitionsRDD[15] at reduceByKey at KMeans.scala:224)
14/06/11 14:34:50 INFO TaskSchedulerImpl: Adding task set 10.0 with 1200 tasks
package mllib
import scala.util.Random
import org.jblas.DoubleMatrix
import org.apache.spark.SparkContext
import org.apache.spark.rdd._
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext._
package mllib
import scala.util.Random
import org.jblas.DoubleMatrix
import org.apache.spark._
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.apache.spark.mllib.regression.LabeledPoint