@pochi
Last active August 29, 2015 13:56

Install

  • Install from GitHub.
git clone https://github.com/apache/incubator-spark.git
cd incubator-spark
./sbt/sbt assembly
  • Confirm that spark-shell runs.
pochi 0:15:48 % ./bin/spark-shell                                                       /opt/local/repos/incubator-spark [git incubator-spark master]
----------------
/usr/bin/java -cp :/opt/local/repos/incubator-spark/conf:/opt/local/repos/incubator-spark/assembly/target/scala-2.10/spark-assembly-1.0.0-incubating-SNAPSHOT-hadoop1.0.4.jar -Djava.library.path= -Xms512m -Xmx512m org.apache.spark.repl.Main
----------------
Unable to find a $JAVA_HOME at "/usr", continuing with system-provided Java...
14/02/24 00:16:44 INFO HttpServer: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/02/24 00:16:44 INFO HttpServer: Starting HTTP Server
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.0.0-SNAPSHOT
      /_/

Using Scala version 2.10.3 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_65)
Type in expressions to have them evaluated.
Type :help for more information.
14/02/24 00:17:19 INFO Slf4jLogger: Slf4jLogger started
14/02/24 00:17:19 INFO Remoting: Starting remoting
14/02/24 00:17:19 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:51942]
14/02/24 00:17:19 INFO Remoting: Remoting now listens on addresses: [akka.tcp://[email protected]:51942]
14/02/24 00:17:19 INFO SparkEnv: Registering BlockManagerMaster
14/02/24 00:17:19 INFO DiskBlockManager: Created local directory at /var/folders/vf/wjh62mkn1lj7bc5g4hh0nf340000gn/T/spark-local-20140224001719-cf6d
14/02/24 00:17:19 INFO MemoryStore: MemoryStore started with capacity 303.4 MB.
14/02/24 00:17:19 INFO ConnectionManager: Bound socket to port 51943 with id = ConnectionManagerId(192.168.0.4,51943)
14/02/24 00:17:19 INFO BlockManagerMaster: Trying to register BlockManager
14/02/24 00:17:19 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager 192.168.0.4:51943 with 303.4 MB RAM
14/02/24 00:17:19 INFO BlockManagerMaster: Registered BlockManager
14/02/24 00:17:19 INFO HttpServer: Starting HTTP Server
14/02/24 00:17:19 INFO HttpBroadcast: Broadcast server started at http://192.168.0.4:51944
14/02/24 00:17:19 INFO SparkEnv: Registering MapOutputTracker
14/02/24 00:17:19 INFO HttpFileServer: HTTP File server directory is /var/folders/vf/wjh62mkn1lj7bc5g4hh0nf340000gn/T/spark-9808872c-f48d-40d3-9cb5-053b780cbb95
14/02/24 00:17:19 INFO HttpServer: Starting HTTP Server
14/02/24 00:17:50 INFO SparkUI: Started Spark Web UI at http://192.168.0.4:4040
14/02/24 00:17:50 INFO Executor: Using REPL class URI: http://192.168.0.4:51935
2014-02-24 00:17:50.539 java[12334:130b] Unable to load realm info from SCDynamicStore
Created spark context..
Spark context available as sc.

scala> sc
res0: org.apache.spark.SparkContext = org.apache.spark.SparkContext@2d8dea20

scala> 

It's just that simple!

Try a sample

First, I try to execute the PageRank example.

  • Create a sample input file
http://www.yahoo.co.jp/ http://www.yahoo.co.jp/1
http://www.yahoo.co.jp/ http://www.yahoo.co.jp/2
http://www.yahoo.co.jp/ http://www.yahoo.co.jp/3
http://www.yahoo.co.jp/1 http://www.yahoo.co.jp/
http://www.yahoo.co.jp/2 http://www.yahoo.co.jp/
http://www.yahoo.co.jp/3 http://www.yahoo.co.jp/
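The six edges above can be written out with a short script. A minimal sketch in Python (the filename yahoo.txt is the one used later; the output directory is up to you):

```python
# Write the sample PageRank input: one "source target" edge per line.
edges = [
    ("http://www.yahoo.co.jp/", "http://www.yahoo.co.jp/1"),
    ("http://www.yahoo.co.jp/", "http://www.yahoo.co.jp/2"),
    ("http://www.yahoo.co.jp/", "http://www.yahoo.co.jp/3"),
    ("http://www.yahoo.co.jp/1", "http://www.yahoo.co.jp/"),
    ("http://www.yahoo.co.jp/2", "http://www.yahoo.co.jp/"),
    ("http://www.yahoo.co.jp/3", "http://www.yahoo.co.jp/"),
]

with open("yahoo.txt", "w") as f:
    for src, dst in edges:
        f.write(f"{src} {dst}\n")
```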
  • Set SPARK_PRINT_LAUNCH_COMMAND so the launch command is printed
pochi 23:58:21 % export SPARK_PRINT_LAUNCH_COMMAND=1 
  • Execute the PageRank program
pochi 23:57:12 % ./bin/run-example org.apache.spark.examples.SparkPageRank local /opt/local/repos/spark/pagerank/input/yahoo.txt 3
Spark Command: /System/Library/Frameworks/JavaVM.framework/Home/bin/java -cp /opt/local/repos/incubator-spark/examples/target/scala-2.10/spark-examples-assembly-1.0.0-incubating-SNAPSHOT.jar::/opt/local/repos/incubator-spark/conf:/opt/local/repos/incubator-spark/assembly/target/scala-2.10/spark-assembly-1.0.0-incubating-SNAPSHOT-hadoop1.0.4.jar -Djava.library.path= org.apache.spark.examples.SparkPageRank local /opt/local/repos/spark/pagerank/input/yahoo.txt 3
========================================

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/local/repos/incubator-spark/examples/target/scala-2.10/spark-examples-assembly-1.0.0-incubating-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/local/repos/incubator-spark/assembly/target/scala-2.10/spark-assembly-1.0.0-incubating-SNAPSHOT-hadoop1.0.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
log4j:WARN No appenders could be found for logger (akka.event.slf4j.Slf4jLogger).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
14/02/23 23:57:47 INFO SparkEnv: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/02/23 23:57:47 INFO SparkEnv: Registering BlockManagerMaster
14/02/23 23:57:47 INFO DiskBlockManager: Created local directory at /var/folders/vf/wjh62mkn1lj7bc5g4hh0nf340000gn/T/spark-local-20140223235747-33ad
14/02/23 23:57:47 INFO MemoryStore: MemoryStore started with capacity 74.4 MB.
14/02/23 23:57:47 INFO ConnectionManager: Bound socket to port 51859 with id = ConnectionManagerId(192.168.0.4,51859)
14/02/23 23:57:47 INFO BlockManagerMaster: Trying to register BlockManager
14/02/23 23:57:47 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager 192.168.0.4:51859 with 74.4 MB RAM
14/02/23 23:57:47 INFO BlockManagerMaster: Registered BlockManager
14/02/23 23:57:47 INFO HttpServer: Starting HTTP Server
14/02/23 23:57:47 INFO HttpBroadcast: Broadcast server started at http://192.168.0.4:51860
14/02/23 23:57:47 INFO SparkEnv: Registering MapOutputTracker
14/02/23 23:57:47 INFO HttpFileServer: HTTP File server directory is /var/folders/vf/wjh62mkn1lj7bc5g4hh0nf340000gn/T/spark-fb5fa0db-828c-431e-8a4c-75ba270ac252
14/02/23 23:57:47 INFO HttpServer: Starting HTTP Server
14/02/23 23:58:17 INFO SparkUI: Started Spark Web UI at http://192.168.0.4:4040

2014-02-23 23:58:18.005 java[97753:1003] Unable to load realm info from SCDynamicStore
14/02/23 23:58:18 INFO SparkContext: Added JAR /opt/local/repos/incubator-spark/examples/target/scala-2.10/spark-examples-assembly-1.0.0-incubating-SNAPSHOT.jar at http://192.168.0.4:51861/jars/spark-examples-assembly-1.0.0-incubating-SNAPSHOT.jar with timestamp 1393167498743
14/02/23 23:58:19 INFO MemoryStore: ensureFreeSpace(35456) called with curMem=0, maxMem=77974732
14/02/23 23:58:19 INFO MemoryStore: Block broadcast_0 stored as values to memory (estimated size 34.6 KB, free 74.3 MB)
14/02/23 23:58:19 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/02/23 23:58:19 WARN LoadSnappy: Snappy native library not loaded
14/02/23 23:58:19 INFO FileInputFormat: Total input paths to process : 1
14/02/23 23:58:19 INFO SparkContext: Starting job: collect at SparkPageRank.scala:57
14/02/23 23:58:19 INFO DAGScheduler: Registering RDD 4 (distinct at SparkPageRank.scala:46)
14/02/23 23:58:19 INFO DAGScheduler: Registering RDD 7 (distinct at SparkPageRank.scala:46)
14/02/23 23:58:19 INFO DAGScheduler: Registering RDD 16 (reduceByKey at SparkPageRank.scala:54)
14/02/23 23:58:19 INFO DAGScheduler: Registering RDD 25 (reduceByKey at SparkPageRank.scala:54)
14/02/23 23:58:19 INFO DAGScheduler: Registering RDD 34 (reduceByKey at SparkPageRank.scala:54)
14/02/23 23:58:19 INFO DAGScheduler: Got job 0 (collect at SparkPageRank.scala:57) with 1 output partitions (allowLocal=false)
14/02/23 23:58:19 INFO DAGScheduler: Final stage: Stage 0 (collect at SparkPageRank.scala:57)
14/02/23 23:58:19 INFO DAGScheduler: Parents of final stage: List(Stage 1)
14/02/23 23:58:19 INFO DAGScheduler: Missing parents: List(Stage 1)
14/02/23 23:58:19 INFO DAGScheduler: Submitting Stage 3 (MapPartitionsRDD[4] at distinct at SparkPageRank.scala:46), which has no missing parents
14/02/23 23:58:19 INFO DAGScheduler: Submitting 1 missing tasks from Stage 3 (MapPartitionsRDD[4] at distinct at SparkPageRank.scala:46)
14/02/23 23:58:19 INFO TaskSchedulerImpl: Adding task set 3.0 with 1 tasks
14/02/23 23:58:19 INFO TaskSetManager: Starting task 3.0:0 as TID 0 on executor localhost: localhost (PROCESS_LOCAL)
14/02/23 23:58:19 INFO TaskSetManager: Serialized task 3.0:0 as 2005 bytes in 8 ms
14/02/23 23:58:19 INFO Executor: Running task ID 0
14/02/23 23:58:19 INFO Executor: Fetching http://192.168.0.4:51861/jars/spark-examples-assembly-1.0.0-incubating-SNAPSHOT.jar with timestamp 1393167498743
14/02/23 23:58:19 INFO Utils: Fetching http://192.168.0.4:51861/jars/spark-examples-assembly-1.0.0-incubating-SNAPSHOT.jar to /var/folders/vf/wjh62mkn1lj7bc5g4hh0nf340000gn/T/fetchFileTemp7215571424853875723.tmp
14/02/23 23:58:20 INFO Executor: Adding file:/var/folders/vf/wjh62mkn1lj7bc5g4hh0nf340000gn/T/spark-2821ee7e-7c77-4ae5-8a14-ba641bd3aaf0/spark-examples-assembly-1.0.0-incubating-SNAPSHOT.jar to class loader
14/02/23 23:58:20 INFO BlockManager: Found block broadcast_0 locally
14/02/23 23:58:20 INFO HadoopRDD: Input split: file:/opt/local/repos/spark/pagerank/input/yahoo.txt:0+285
14/02/23 23:58:20 INFO Executor: Serialized size of result for 0 is 747
14/02/23 23:58:20 INFO Executor: Sending result for 0 directly to driver
14/02/23 23:58:20 INFO Executor: Finished task ID 0
14/02/23 23:58:20 INFO TaskSetManager: Finished TID 0 in 935 ms on localhost (progress: 1/1)
14/02/23 23:58:20 INFO DAGScheduler: Completed ShuffleMapTask(3, 0)
14/02/23 23:58:20 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks have all completed, from pool 
14/02/23 23:58:20 INFO DAGScheduler: Stage 3 (distinct at SparkPageRank.scala:46) finished in 0.947 s
14/02/23 23:58:20 INFO DAGScheduler: looking for newly runnable stages
14/02/23 23:58:20 INFO DAGScheduler: running: Set()
14/02/23 23:58:20 INFO DAGScheduler: waiting: Set(Stage 0, Stage 1, Stage 5, Stage 2, Stage 4)
14/02/23 23:58:20 INFO DAGScheduler: failed: Set()
14/02/23 23:58:20 INFO DAGScheduler: Missing parents for Stage 0: List(Stage 1)
14/02/23 23:58:20 INFO DAGScheduler: Missing parents for Stage 1: List(Stage 2, Stage 4)
14/02/23 23:58:20 INFO DAGScheduler: Missing parents for Stage 5: List(Stage 2)
14/02/23 23:58:20 INFO DAGScheduler: Missing parents for Stage 2: List()
14/02/23 23:58:20 INFO DAGScheduler: Missing parents for Stage 4: List(Stage 5, Stage 2)
14/02/23 23:58:20 INFO DAGScheduler: Submitting Stage 2 (MappedRDD[7] at distinct at SparkPageRank.scala:46), which is now runnable
14/02/23 23:58:20 INFO DAGScheduler: Submitting 1 missing tasks from Stage 2 (MappedRDD[7] at distinct at SparkPageRank.scala:46)
14/02/23 23:58:20 INFO TaskSchedulerImpl: Adding task set 2.0 with 1 tasks
14/02/23 23:58:20 INFO TaskSetManager: Starting task 2.0:0 as TID 1 on executor localhost: localhost (PROCESS_LOCAL)
14/02/23 23:58:20 INFO TaskSetManager: Serialized task 2.0:0 as 1853 bytes in 0 ms
14/02/23 23:58:20 INFO Executor: Running task ID 1
14/02/23 23:58:20 INFO BlockManager: Found block broadcast_0 locally
14/02/23 23:58:20 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 1 non-zero-bytes blocks out of 1 blocks
14/02/23 23:58:20 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote gets in  6 ms
14/02/23 23:58:20 INFO Executor: Serialized size of result for 1 is 962
14/02/23 23:58:20 INFO Executor: Sending result for 1 directly to driver
14/02/23 23:58:20 INFO Executor: Finished task ID 1
14/02/23 23:58:20 INFO DAGScheduler: Completed ShuffleMapTask(2, 0)
14/02/23 23:58:20 INFO TaskSetManager: Finished TID 1 in 45 ms on localhost (progress: 1/1)
14/02/23 23:58:20 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool 
14/02/23 23:58:20 INFO DAGScheduler: Stage 2 (distinct at SparkPageRank.scala:46) finished in 0.046 s
14/02/23 23:58:20 INFO DAGScheduler: looking for newly runnable stages
14/02/23 23:58:20 INFO DAGScheduler: running: Set()
14/02/23 23:58:20 INFO DAGScheduler: waiting: Set(Stage 0, Stage 1, Stage 5, Stage 4)
14/02/23 23:58:20 INFO DAGScheduler: failed: Set()
14/02/23 23:58:20 INFO DAGScheduler: Missing parents for Stage 0: List(Stage 1)
14/02/23 23:58:20 INFO DAGScheduler: Missing parents for Stage 1: List(Stage 4)
14/02/23 23:58:20 INFO DAGScheduler: Missing parents for Stage 5: List()
14/02/23 23:58:20 INFO DAGScheduler: Missing parents for Stage 4: List(Stage 5)
14/02/23 23:58:20 INFO DAGScheduler: Submitting Stage 5 (MapPartitionsRDD[16] at reduceByKey at SparkPageRank.scala:54), which is now runnable
14/02/23 23:58:20 INFO DAGScheduler: Submitting 1 missing tasks from Stage 5 (MapPartitionsRDD[16] at reduceByKey at SparkPageRank.scala:54)
14/02/23 23:58:20 INFO TaskSchedulerImpl: Adding task set 5.0 with 1 tasks
14/02/23 23:58:20 INFO TaskSetManager: Starting task 5.0:0 as TID 2 on executor localhost: localhost (PROCESS_LOCAL)
14/02/23 23:58:20 INFO TaskSetManager: Serialized task 5.0:0 as 6892 bytes in 3 ms
14/02/23 23:58:20 INFO Executor: Running task ID 2
14/02/23 23:58:20 INFO BlockManager: Found block broadcast_0 locally
14/02/23 23:58:20 INFO BlockManager: Found block broadcast_0 locally
14/02/23 23:58:20 INFO CacheManager: Partition rdd_9_0 not found, computing it
14/02/23 23:58:20 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 1 non-zero-bytes blocks out of 1 blocks
14/02/23 23:58:20 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote gets in  0 ms
14/02/23 23:58:20 INFO MemoryStore: ensureFreeSpace(912) called with curMem=35456, maxMem=77974732
14/02/23 23:58:20 INFO MemoryStore: Block rdd_9_0 stored as values to memory (estimated size 912.0 B, free 74.3 MB)
14/02/23 23:58:20 INFO BlockManagerMasterActor$BlockManagerInfo: Added rdd_9_0 in memory on 192.168.0.4:51859 (size: 912.0 B, free: 74.4 MB)
14/02/23 23:58:20 INFO BlockManagerMaster: Updated info of block rdd_9_0
14/02/23 23:58:20 INFO BlockManager: Found block rdd_9_0 locally
14/02/23 23:58:20 INFO Executor: Serialized size of result for 2 is 962
14/02/23 23:58:20 INFO Executor: Sending result for 2 directly to driver
14/02/23 23:58:20 INFO Executor: Finished task ID 2
14/02/23 23:58:20 INFO DAGScheduler: Completed ShuffleMapTask(5, 0)
14/02/23 23:58:20 INFO TaskSetManager: Finished TID 2 in 57 ms on localhost (progress: 1/1)
14/02/23 23:58:20 INFO TaskSchedulerImpl: Removed TaskSet 5.0, whose tasks have all completed, from pool 
14/02/23 23:58:20 INFO DAGScheduler: Stage 5 (reduceByKey at SparkPageRank.scala:54) finished in 0.058 s
14/02/23 23:58:20 INFO DAGScheduler: looking for newly runnable stages
14/02/23 23:58:20 INFO DAGScheduler: running: Set()
14/02/23 23:58:20 INFO DAGScheduler: waiting: Set(Stage 0, Stage 1, Stage 4)
14/02/23 23:58:20 INFO DAGScheduler: failed: Set()
14/02/23 23:58:20 INFO DAGScheduler: Missing parents for Stage 0: List(Stage 1)
14/02/23 23:58:20 INFO DAGScheduler: Missing parents for Stage 1: List(Stage 4)
14/02/23 23:58:20 INFO DAGScheduler: Missing parents for Stage 4: List()
14/02/23 23:58:20 INFO DAGScheduler: Submitting Stage 4 (MapPartitionsRDD[25] at reduceByKey at SparkPageRank.scala:54), which is now runnable
14/02/23 23:58:20 INFO DAGScheduler: Submitting 1 missing tasks from Stage 4 (MapPartitionsRDD[25] at reduceByKey at SparkPageRank.scala:54)
14/02/23 23:58:20 INFO TaskSchedulerImpl: Adding task set 4.0 with 1 tasks
14/02/23 23:58:20 INFO TaskSetManager: Starting task 4.0:0 as TID 3 on executor localhost: localhost (PROCESS_LOCAL)
14/02/23 23:58:20 INFO TaskSetManager: Serialized task 4.0:0 as 8489 bytes in 3 ms
14/02/23 23:58:20 INFO Executor: Running task ID 3
14/02/23 23:58:20 INFO BlockManager: Found block broadcast_0 locally
14/02/23 23:58:20 INFO BlockManager: Found block broadcast_0 locally
14/02/23 23:58:20 INFO BlockManager: Found block rdd_9_0 locally
14/02/23 23:58:20 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 1 non-zero-bytes blocks out of 1 blocks
14/02/23 23:58:20 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote gets in  0 ms
14/02/23 23:58:20 INFO Executor: Serialized size of result for 3 is 962
14/02/23 23:58:20 INFO Executor: Sending result for 3 directly to driver
14/02/23 23:58:20 INFO Executor: Finished task ID 3
14/02/23 23:58:20 INFO DAGScheduler: Completed ShuffleMapTask(4, 0)
14/02/23 23:58:20 INFO TaskSetManager: Finished TID 3 in 30 ms on localhost (progress: 1/1)
14/02/23 23:58:20 INFO TaskSchedulerImpl: Removed TaskSet 4.0, whose tasks have all completed, from pool 
14/02/23 23:58:20 INFO DAGScheduler: Stage 4 (reduceByKey at SparkPageRank.scala:54) finished in 0.031 s
14/02/23 23:58:20 INFO DAGScheduler: looking for newly runnable stages
14/02/23 23:58:20 INFO DAGScheduler: running: Set()
14/02/23 23:58:20 INFO DAGScheduler: waiting: Set(Stage 0, Stage 1)
14/02/23 23:58:20 INFO DAGScheduler: failed: Set()
14/02/23 23:58:20 INFO DAGScheduler: Missing parents for Stage 0: List(Stage 1)
14/02/23 23:58:20 INFO DAGScheduler: Missing parents for Stage 1: List()
14/02/23 23:58:20 INFO DAGScheduler: Submitting Stage 1 (MapPartitionsRDD[34] at reduceByKey at SparkPageRank.scala:54), which is now runnable
14/02/23 23:58:20 INFO DAGScheduler: Submitting 1 missing tasks from Stage 1 (MapPartitionsRDD[34] at reduceByKey at SparkPageRank.scala:54)
14/02/23 23:58:20 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
14/02/23 23:58:20 INFO TaskSetManager: Starting task 1.0:0 as TID 4 on executor localhost: localhost (PROCESS_LOCAL)
14/02/23 23:58:20 INFO TaskSetManager: Serialized task 1.0:0 as 9209 bytes in 5 ms
14/02/23 23:58:20 INFO Executor: Running task ID 4
14/02/23 23:58:20 INFO BlockManager: Found block broadcast_0 locally
14/02/23 23:58:20 INFO BlockManager: Found block broadcast_0 locally
14/02/23 23:58:20 INFO BlockManager: Found block rdd_9_0 locally
14/02/23 23:58:20 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 0 non-zero-bytes blocks out of 1 blocks
14/02/23 23:58:20 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote gets in  0 ms
14/02/23 23:58:20 INFO Executor: Serialized size of result for 4 is 962
14/02/23 23:58:20 INFO Executor: Sending result for 4 directly to driver
14/02/23 23:58:20 INFO Executor: Finished task ID 4
14/02/23 23:58:20 INFO DAGScheduler: Completed ShuffleMapTask(1, 0)
14/02/23 23:58:20 INFO TaskSetManager: Finished TID 4 in 33 ms on localhost (progress: 1/1)
14/02/23 23:58:20 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
14/02/23 23:58:20 INFO DAGScheduler: Stage 1 (reduceByKey at SparkPageRank.scala:54) finished in 0.034 s
14/02/23 23:58:20 INFO DAGScheduler: looking for newly runnable stages
14/02/23 23:58:20 INFO DAGScheduler: running: Set()
14/02/23 23:58:20 INFO DAGScheduler: waiting: Set(Stage 0)
14/02/23 23:58:20 INFO DAGScheduler: failed: Set()
14/02/23 23:58:20 INFO DAGScheduler: Missing parents for Stage 0: List()
14/02/23 23:58:20 INFO DAGScheduler: Submitting Stage 0 (MappedValuesRDD[37] at mapValues at SparkPageRank.scala:54), which is now runnable
14/02/23 23:58:20 INFO DAGScheduler: Submitting 1 missing tasks from Stage 0 (MappedValuesRDD[37] at mapValues at SparkPageRank.scala:54)
14/02/23 23:58:20 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
14/02/23 23:58:20 INFO TaskSetManager: Starting task 0.0:0 as TID 5 on executor localhost: localhost (PROCESS_LOCAL)
14/02/23 23:58:20 INFO TaskSetManager: Serialized task 0.0:0 as 2526 bytes in 1 ms
14/02/23 23:58:20 INFO Executor: Running task ID 5
14/02/23 23:58:20 INFO BlockManager: Found block broadcast_0 locally
14/02/23 23:58:20 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 0 non-zero-bytes blocks out of 1 blocks
14/02/23 23:58:20 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote gets in  0 ms
14/02/23 23:58:20 INFO Executor: Serialized size of result for 5 is 813
14/02/23 23:58:20 INFO Executor: Sending result for 5 directly to driver
14/02/23 23:58:20 INFO Executor: Finished task ID 5
14/02/23 23:58:20 INFO DAGScheduler: Completed ResultTask(0, 0)
14/02/23 23:58:20 INFO TaskSetManager: Finished TID 5 in 17 ms on localhost (progress: 1/1)
14/02/23 23:58:20 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
14/02/23 23:58:20 INFO DAGScheduler: Stage 0 (collect at SparkPageRank.scala:57) finished in 0.017 s
14/02/23 23:58:20 INFO SparkContext: Job finished: collect at SparkPageRank.scala:57, took 1.384382 s
./bin/run-example org.apache.spark.examples.SparkPageRank local  3  6.46s user 1.30s system 11% cpu 1:05.90 total
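To see what the example computed, here is a plain-Python sketch of the iteration SparkPageRank performs (this is not the Spark source; it mirrors the update rank = 0.15 + 0.85 * sum(contribs), where each page contributes rank/outdegree to its neighbors, run for the 3 iterations passed on the command line, with the sample URLs shortened to hub/p1/p2/p3 for readability):

```python
from collections import defaultdict

# Same six-edge graph as the sample input: a hub linking to three pages,
# each of which links back to the hub.
edges = [
    ("hub", "p1"), ("hub", "p2"), ("hub", "p3"),
    ("p1", "hub"), ("p2", "hub"), ("p3", "hub"),
]

links = defaultdict(list)          # page -> outgoing neighbors
for src, dst in edges:
    links[src].append(dst)

ranks = {page: 1.0 for page in links}

for _ in range(3):                 # 3 iterations, as in the command line
    contribs = defaultdict(float)  # page -> summed contributions this round
    for page, neighbors in links.items():
        share = ranks[page] / len(neighbors)
        for n in neighbors:
            contribs[n] += share
    ranks = {page: 0.15 + 0.85 * c for page, c in contribs.items()}

for page, rank in sorted(ranks.items()):
    print(f"{page} has rank: {rank:.4f}")
```

As expected, the hub ends up with a much higher rank than the three leaf pages, which all tie since the graph treats them symmetrically.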