@libratiger
libratiger / heap_sort.py
Created December 16, 2012 15:17
A heap sort written in Python
# -*- coding: utf-8 -*-
"""
This is the heap sort, the algorithm is dive into two step:
first: bulid the max heap
second: heap sort
By: DjvuLee @2012-12-16
"""
@libratiger
libratiger / adaboost.py
Created June 10, 2013 13:50
A simple demo of the AdaBoost algorithm
from __future__ import division
from numpy import *

class AdaBoost:
    def __init__(self, training_set):
        self.training_set = training_set
        self.N = len(self.training_set)
        self.weights = ones(self.N) / self.N
        self.RULES = []
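The `__init__` above only initializes uniform sample weights; the core AdaBoost step (not shown in this excerpt) reweights samples after each weak rule. A minimal sketch of one boosting round, with `adaboost_round` as a hypothetical helper name and labels/predictions in {-1, +1}:

```python
import numpy as np

def adaboost_round(weights, preds, labels):
    """One AdaBoost round: weighted error, rule weight alpha, updated weights."""
    # Weighted error of the weak rule on the current distribution.
    err = float(np.sum(weights[preds != labels]))
    # Rule weight: larger when the rule is more accurate.
    alpha = 0.5 * np.log((1.0 - err) / err)
    # Up-weight misclassified samples, down-weight correct ones, renormalize.
    new_w = weights * np.exp(-alpha * labels * preds)
    new_w = new_w / new_w.sum()
    return alpha, new_w
```

After normalization the misclassified samples carry half the total weight, which is what forces the next weak rule to focus on them.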
require 'formula'
class ScalaDocs < Formula
homepage 'http://www.scala-lang.org/'
url 'http://www.scala-lang.org/files/archive/scala-docs-2.9.3.zip'
sha1 '5bf44bd04b2b37976bde5d4a4c9bb6bcdeb10eb2'
end
class ScalaCompletion < Formula
homepage 'http://www.scala-lang.org/'
2014-05-18 02:53:46
Full thread dump OpenJDK 64-Bit Server VM (24.45-b08 mixed mode):
"spark-akka.actor.default-dispatcher-26" daemon prio=10 tid=0x00007fd8e000a800 nid=0x6757 waiting on condition [0x00007fd94d322000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00007fdaa1a48bd8> (a akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinPool)
at scala.concurrent.forkjoin.ForkJoinPool.scan(ForkJoinPool.java:2075)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
@libratiger
libratiger / gist:0015550b4904c7c62cfe
Created May 25, 2014 14:45
The tempest.api.identity test result
======================================================================
FAIL: tempest.api.identity.admin.test_tokens.TokensTestJSON.test_create_get_delete_token[gate]
tags: worker-2
----------------------------------------------------------------------
Empty attachments:
pythonlogging:''
stderr
stdout
Traceback (most recent call last):
@libratiger
libratiger / Husky002.res
Last active August 29, 2015 14:02
LR data, with 24 GB of input data.
#Task ID  #Deserialization time (ms)  #Execution time (ms)  #Serialization time (ms)  #Transferred data size (bytes)  #Transfer start time (worker)  #Transfer end time (worker)  #Transfer time (s)
1465 5 7884 0 562 2014 06 01 22:12:10 2014 06 01 22:12:10 -0.073
1633 6 7415 0 562 2014 06 01 22:12:16 2014 06 01 22:12:16 -0.073
1689 7 7783 0 562 2014 06 01 22:12:18 2014 06 01 22:12:18 -0.073
1836 6 7594 0 562 2014 06 01 22:12:24 2014 06 01 22:12:23 -0.073
1864 4 6939 0 562 2014 06 01 22:12:24 2014 06 01 22:12:24 -0.073
2051 5 7684 1 562 2014 06 01 22:12:31 2014 06 01 22:12:31 -0.073
2254 6 7800 0 562 2014 06 01 22:12:39 2014 06 01 22:12:39 -0.073
2471 5 7786 0 562 2014 06 01 22:12:46 2014 06 01 22:12:46 -0.073
#Task ID  #Deserialization time (ms)  #Execution time (ms)  #Serialization time (ms)  #Transfer start time  #Transfer end time  #Transfer time (master)  #Transfer time (worker)
15 624 10253 2 562 2014 06 04 16:18:21 2014 06 04 16:18:21 2014 06 04 16:18:21 -69 15
43 625 10661 0 562 2014 06 04 16:18:22 2014 06 04 16:18:22 2014 06 04 16:18:22 -66 13
71 625 11029 0 562 2014 06 04 16:18:22 2014 06 04 16:18:22 2014 06 04 16:18:22 -70 8
99 624 10574 0 562 2014 06 04 16:18:22 2014 06 04 16:18:21 2014 06 04 16:18:22 -71 11
127 624 10473 0 562 2014 06 04 16:18:21 2014 06 04 16:18:21 2014 06 04 16:18:21 -71 7
155 624 11530 0 562 2014 06 04 16:18:22 2014 06 04 16:18:22 2014 06 04 16:18:22 -70 9
183 625 10480 0 562 2014 06 04 16:18:21 2014 06 04 16:18:21 2014 06 04 16:18:21 -70 9
211 624 10452 0 562 2014 06 04 16:18:21 2014 06 04 16:18:21 2014 06 04 16:18:21 -70 9
239 624 10924 0 562 2014 06 04 16:18:22 2014 06 04 16:18:22 2014 06 04 16:18:22 -71 8
@libratiger
libratiger / driver.log
Created June 11, 2014 14:40
KMeans algorithm driver.log
14/06/11 14:34:46 INFO SparkContext: Job finished: collect at KMeans.scala:232, took 9.079383951 s
14/06/11 14:34:46 INFO SparkContext: Starting job: collectAsMap at KMeans.scala:224
14/06/11 14:34:46 INFO DAGScheduler: Registering RDD 15 (reduceByKey at KMeans.scala:224)
14/06/11 14:34:46 INFO DAGScheduler: Got job 4 (collectAsMap at KMeans.scala:224) with 9000 output partitions (allowLocal=false)
14/06/11 14:34:46 INFO DAGScheduler: Final stage: Stage 9 (collectAsMap at KMeans.scala:224)
14/06/11 14:34:46 INFO DAGScheduler: Parents of final stage: List(Stage 10)
14/06/11 14:34:46 INFO DAGScheduler: Missing parents: List(Stage 10)
14/06/11 14:34:46 INFO DAGScheduler: Submitting Stage 10 (MapPartitionsRDD[15] at reduceByKey at KMeans.scala:224), which has no missing parents
14/06/11 14:34:50 INFO DAGScheduler: Submitting 1200 missing tasks from Stage 10 (MapPartitionsRDD[15] at reduceByKey at KMeans.scala:224)
14/06/11 14:34:50 INFO TaskSchedulerImpl: Adding task set 10.0 with 1200 tasks
package mllib
import scala.util.Random
import org.jblas.DoubleMatrix
import org.apache.spark.SparkContext
import org.apache.spark.rdd._
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext._
package mllib
import scala.util.Random
import org.jblas.DoubleMatrix
import org.apache.spark._
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.apache.spark.mllib.regression.LabeledPoint