2018-08-17 13:01:07 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-08-17 13:01:07 INFO  SparkContext:54 - Running Spark version 2.3.1
2018-08-17 13:01:07 INFO  SparkContext:54 - Submitted application: pandas_udf
2018-08-17 13:01:07 INFO  SecurityManager:54 - Changing view acls to: mikesukmanowsky
2018-08-17 13:01:07 INFO  SecurityManager:54 - Changing modify acls to: mikesukmanowsky
2018-08-17 13:01:07 INFO  SecurityManager:54 - Changing view acls groups to:
2018-08-17 13:01:07 INFO  SecurityManager:54 - Changing modify acls groups to:
2018-08-17 13:01:07 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(mikesukmanowsky); groups with view permissions: Set(); users with modify permissions: Set(mikesukmanowsky); groups with modify permissions: Set()
2018-08-17 13:01:08 INFO  Utils:54 - Successfully started service 'sparkDriver' on port 51078.
2018-08-17 13:01:08 INFO  SparkEnv:54 - Registering MapOutputTracker
2018-08-17 13:01:08 INFO  SparkEnv:54 - Registering BlockManagerMaster
2018-08-17 13:01:08 INFO  BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2018-08-17 13:01:08 INFO  BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2018-08-17 13:01:08 INFO  DiskBlockManager:54 - Created local directory at /private/var/folders/07/gnp1f3hs7kn7p0j7kcs2g4s80000gn/T/blockmgr-82f6215e-9b12-4124-a513-cb8d5a0bb750
2018-08-17 13:01:08 INFO  MemoryStore:54 - MemoryStore started with capacity 366.3 MB
2018-08-17 13:01:08 INFO  SparkEnv:54 - Registering OutputCommitCoordinator
2018-08-17 13:01:08 INFO  log:192 - Logging initialized @2046ms
2018-08-17 13:01:08 INFO  Server:346 - jetty-9.3.z-SNAPSHOT
2018-08-17 13:01:08 INFO  Server:414 - Started @2108ms
2018-08-17 13:01:08 INFO  AbstractConnector:278 - Started ServerConnector@780d9633{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2018-08-17 13:01:08 INFO  Utils:54 - Successfully started service 'SparkUI' on port 4040.
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7d97f5ee{/jobs,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@12309e0b{/jobs/json,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@407108cf{/jobs/job,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@366763bc{/jobs/job/json,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@203163bd{/stages,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@68ce3004{/stages/json,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3c4a7dfc{/stages/stage,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@36d7f08{/stages/stage/json,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6ca74704{/stages/pool,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@315dbe09{/stages/pool/json,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7abe7a83{/storage,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1243dfe3{/storage/json,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@448a25c8{/storage/rdd,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@187b76a6{/storage/rdd/json,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@11987d48{/environment,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1d28145f{/environment/json,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@899c657{/executors,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4b945fa4{/executors/json,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@19b419a3{/executors/threadDump,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@44e37c26{/executors/threadDump/json,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@20eb193{/static,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3dc2eef3{/,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@287b3151{/api,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4e5c8d1{/jobs/job/kill,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@687d1e94{/stages/stage/kill,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://192.168.1.9:4040
2018-08-17 13:01:08 INFO  SparkContext:54 - Added file file:/Users/mikesukmanowsky/code/parsely/engineering/casterisk-realtime/spark-simple.py at file:/Users/mikesukmanowsky/code/parsely/engineering/casterisk-realtime/spark-simple.py with timestamp 1534525268603
2018-08-17 13:01:08 INFO  Utils:54 - Copying /Users/mikesukmanowsky/code/parsely/engineering/casterisk-realtime/spark-simple.py to /private/var/folders/07/gnp1f3hs7kn7p0j7kcs2g4s80000gn/T/spark-5fc00be2-3067-4905-8c0e-dc8516137fad/userFiles-e52c2c94-708f-4ec6-9c51-b27035b5f307/spark-simple.py
2018-08-17 13:01:08 INFO  Executor:54 - Starting executor ID driver on host localhost
2018-08-17 13:01:08 INFO  Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 51079.
2018-08-17 13:01:08 INFO  NettyBlockTransferService:54 - Server created on 192.168.1.9:51079
2018-08-17 13:01:08 INFO  BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2018-08-17 13:01:08 INFO  BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, 192.168.1.9, 51079, None)
2018-08-17 13:01:08 INFO  BlockManagerMasterEndpoint:54 - Registering block manager 192.168.1.9:51079 with 366.3 MB RAM, BlockManagerId(driver, 192.168.1.9, 51079, None)
2018-08-17 13:01:08 INFO  BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, 192.168.1.9, 51079, None)
2018-08-17 13:01:08 INFO  BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, 192.168.1.9, 51079, None)
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@540e144{/metrics/json,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  SharedState:54 - Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/Users/mikesukmanowsky/code/parsely/engineering/casterisk-realtime/spark-warehouse/').
2018-08-17 13:01:08 INFO  SharedState:54 - Warehouse path is 'file:/Users/mikesukmanowsky/code/parsely/engineering/casterisk-realtime/spark-warehouse/'.
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1032332a{/SQL,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1ee0e9de{/SQL/json,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2522c537{/SQL/execution,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@758a303a{/SQL/execution/json,null,AVAILABLE,@Spark}
2018-08-17 13:01:08 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@8f9f390{/static/sql,null,AVAILABLE,@Spark}
2018-08-17 13:01:09 INFO  StateStoreCoordinatorRef:54 - Registered StateStoreCoordinator endpoint
/Users/mikesukmanowsky/.pyenv/versions/3.6.6/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
2018-08-17 13:01:11 INFO  ContextCleaner:54 - Cleaned accumulator 1
2018-08-17 13:01:12 INFO  CodeGenerator:54 - Code generated in 144.472757 ms
2018-08-17 13:01:12 INFO  CodeGenerator:54 - Code generated in 20.947228 ms
2018-08-17 13:01:12 INFO  CodeGenerator:54 - Code generated in 17.026646 ms
2018-08-17 13:01:12 INFO  CodeGenerator:54 - Code generated in 9.059243 ms
2018-08-17 13:01:12 INFO  SparkContext:54 - Starting job: showString at NativeMethodAccessorImpl.java:0
2018-08-17 13:01:12 INFO  DAGScheduler:54 - Registering RDD 7 (showString at NativeMethodAccessorImpl.java:0)
2018-08-17 13:01:12 INFO  DAGScheduler:54 - Got job 0 (showString at NativeMethodAccessorImpl.java:0) with 1 output partitions
2018-08-17 13:01:12 INFO  DAGScheduler:54 - Final stage: ResultStage 1 (showString at NativeMethodAccessorImpl.java:0)
2018-08-17 13:01:12 INFO  DAGScheduler:54 - Parents of final stage: List(ShuffleMapStage 0)
2018-08-17 13:01:12 INFO  DAGScheduler:54 - Missing parents: List(ShuffleMapStage 0)
2018-08-17 13:01:12 INFO  DAGScheduler:54 - Submitting ShuffleMapStage 0 (MapPartitionsRDD[7] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents
2018-08-17 13:01:12 INFO  MemoryStore:54 - Block broadcast_0 stored as values in memory (estimated size 12.4 KB, free 366.3 MB)
2018-08-17 13:01:12 INFO  MemoryStore:54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 6.4 KB, free 366.3 MB)
2018-08-17 13:01:12 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on 192.168.1.9:51079 (size: 6.4 KB, free: 366.3 MB)
2018-08-17 13:01:12 INFO  SparkContext:54 - Created broadcast 0 from broadcast at DAGScheduler.scala:1039
2018-08-17 13:01:12 INFO  DAGScheduler:54 - Submitting 2 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[7] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1))
2018-08-17 13:01:12 INFO  TaskSchedulerImpl:54 - Adding task set 0.0 with 2 tasks
2018-08-17 13:01:12 INFO  TaskSetManager:54 - Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 7874 bytes)
2018-08-17 13:01:12 INFO  TaskSetManager:54 - Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 7905 bytes)
2018-08-17 13:01:12 INFO  Executor:54 - Running task 0.0 in stage 0.0 (TID 0)
2018-08-17 13:01:12 INFO  Executor:54 - Running task 1.0 in stage 0.0 (TID 1)
2018-08-17 13:01:12 INFO  Executor:54 - Fetching file:/Users/mikesukmanowsky/code/parsely/engineering/casterisk-realtime/spark-simple.py with timestamp 1534525268603
2018-08-17 13:01:12 INFO  Utils:54 - /Users/mikesukmanowsky/code/parsely/engineering/casterisk-realtime/spark-simple.py has been previously copied to /private/var/folders/07/gnp1f3hs7kn7p0j7kcs2g4s80000gn/T/spark-5fc00be2-3067-4905-8c0e-dc8516137fad/userFiles-e52c2c94-708f-4ec6-9c51-b27035b5f307/spark-simple.py
2018-08-17 13:01:13 INFO  CodeGenerator:54 - Code generated in 11.482855 ms
2018-08-17 13:01:13 INFO  CodeGenerator:54 - Code generated in 13.829726 ms
2018-08-17 13:01:13 INFO  PythonRunner:54 - Times: total = 508, boot = 505, init = 2, finish = 1
2018-08-17 13:01:13 INFO  PythonRunner:54 - Times: total = 505, boot = 497, init = 8, finish = 0
2018-08-17 13:01:13 INFO  Executor:54 - Finished task 1.0 in stage 0.0 (TID 1). 2115 bytes result sent to driver
2018-08-17 13:01:13 INFO  Executor:54 - Finished task 0.0 in stage 0.0 (TID 0). 2115 bytes result sent to driver
2018-08-17 13:01:13 INFO  TaskSetManager:54 - Finished task 0.0 in stage 0.0 (TID 0) in 731 ms on localhost (executor driver) (1/2)
2018-08-17 13:01:13 INFO  TaskSetManager:54 - Finished task 1.0 in stage 0.0 (TID 1) in 723 ms on localhost (executor driver) (2/2)
2018-08-17 13:01:13 INFO  TaskSchedulerImpl:54 - Removed TaskSet 0.0, whose tasks have all completed, from pool
2018-08-17 13:01:13 INFO  DAGScheduler:54 - ShuffleMapStage 0 (showString at NativeMethodAccessorImpl.java:0) finished in 0.869 s
2018-08-17 13:01:13 INFO  DAGScheduler:54 - looking for newly runnable stages
2018-08-17 13:01:13 INFO  DAGScheduler:54 - running: Set()
2018-08-17 13:01:13 INFO  DAGScheduler:54 - waiting: Set(ResultStage 1)
2018-08-17 13:01:13 INFO  DAGScheduler:54 - failed: Set()
2018-08-17 13:01:13 INFO  DAGScheduler:54 - Submitting ResultStage 1 (MapPartitionsRDD[13] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents
2018-08-17 13:01:13 INFO  MemoryStore:54 - Block broadcast_1 stored as values in memory (estimated size 21.3 KB, free 366.3 MB)
2018-08-17 13:01:13 INFO  MemoryStore:54 - Block broadcast_1_piece0 stored as bytes in memory (estimated size 10.8 KB, free 366.3 MB)
2018-08-17 13:01:13 INFO  BlockManagerInfo:54 - Added broadcast_1_piece0 in memory on 192.168.1.9:51079 (size: 10.8 KB, free: 366.3 MB)
2018-08-17 13:01:13 INFO  SparkContext:54 - Created broadcast 1 from broadcast at DAGScheduler.scala:1039
2018-08-17 13:01:13 INFO  DAGScheduler:54 - Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[13] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0))
2018-08-17 13:01:13 INFO  TaskSchedulerImpl:54 - Adding task set 1.0 with 1 tasks
2018-08-17 13:01:13 INFO  TaskSetManager:54 - Starting task 0.0 in stage 1.0 (TID 2, localhost, executor driver, partition 0, PROCESS_LOCAL, 7754 bytes)
2018-08-17 13:01:13 INFO  Executor:54 - Running task 0.0 in stage 1.0 (TID 2)
2018-08-17 13:01:13 INFO  ShuffleBlockFetcherIterator:54 - Getting 0 non-empty blocks out of 2 blocks
2018-08-17 13:01:13 INFO  ShuffleBlockFetcherIterator:54 - Started 0 remote fetches in 6 ms
2018-08-17 13:01:13 INFO  CodeGenerator:54 - Code generated in 9.873753 ms
2018-08-17 13:01:13 INFO  CodeGenerator:54 - Code generated in 6.998625 ms
2018-08-17 13:01:13 INFO  CodeGenerator:54 - Code generated in 8.157707 ms
/Users/mikesukmanowsky/.pyenv/versions/3.6.6/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
objc[42215]: +[__NSPlaceholderDictionary initialize] may have been in progress in another thread when fork() was called.
objc[42215]: +[__NSPlaceholderDictionary initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
2018-08-17 13:01:13 ERROR Executor:91 - Exception in task 0.0 in stage 1.0 (TID 2)
org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:333)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:322)
	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
	at org.apache.spark.sql.execution.python.ArrowPythonRunner$$anon$1.read(ArrowPythonRunner.scala:177)
	at org.apache.spark.sql.execution.python.ArrowPythonRunner$$anon$1.read(ArrowPythonRunner.scala:121)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:252)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:253)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:392)
	at org.apache.spark.sql.execution.python.ArrowPythonRunner$$anon$1.read(ArrowPythonRunner.scala:158)
	... 24 more
2018-08-17 13:01:13 WARN  TaskSetManager:66 - Lost task 0.0 in stage 1.0 (TID 2, localhost, executor driver): org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:333)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:322)
	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
	at org.apache.spark.sql.execution.python.ArrowPythonRunner$$anon$1.read(ArrowPythonRunner.scala:177)
	at org.apache.spark.sql.execution.python.ArrowPythonRunner$$anon$1.read(ArrowPythonRunner.scala:121)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:252)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:253)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:392)
	at org.apache.spark.sql.execution.python.ArrowPythonRunner$$anon$1.read(ArrowPythonRunner.scala:158)
	... 24 more
2018-08-17 13:01:13 ERROR TaskSetManager:70 - Task 0 in stage 1.0 failed 1 times; aborting job
2018-08-17 13:01:13 INFO  TaskSchedulerImpl:54 - Removed TaskSet 1.0, whose tasks have all completed, from pool
2018-08-17 13:01:13 INFO  TaskSchedulerImpl:54 - Cancelling stage 1
2018-08-17 13:01:13 INFO  DAGScheduler:54 - ResultStage 1 (showString at NativeMethodAccessorImpl.java:0) failed in 0.594 s due to Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 2, localhost, executor driver): org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:333)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:322)
	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
	at org.apache.spark.sql.execution.python.ArrowPythonRunner$$anon$1.read(ArrowPythonRunner.scala:177)
	at org.apache.spark.sql.execution.python.ArrowPythonRunner$$anon$1.read(ArrowPythonRunner.scala:121)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:252)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:253)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:392)
	at org.apache.spark.sql.execution.python.ArrowPythonRunner$$anon$1.read(ArrowPythonRunner.scala:158)
	... 24 more
Driver stacktrace:
2018-08-17 13:01:13 INFO  DAGScheduler:54 - Job 0 failed: showString at NativeMethodAccessorImpl.java:0, took 1.507249 s
Traceback (most recent call last):
  File "/Users/mikesukmanowsky/code/parsely/engineering/casterisk-realtime/spark-simple.py", line 25, in <module>
    .apply(normalize)
  File "/Users/mikesukmanowsky/.opt/spark-2.3.1-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 350, in show
  File "/Users/mikesukmanowsky/.opt/spark-2.3.1-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/Users/mikesukmanowsky/.opt/spark-2.3.1-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
  File "/Users/mikesukmanowsky/.opt/spark-2.3.1-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o59.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 2, localhost, executor driver): org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:333)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:322)
	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
	at org.apache.spark.sql.execution.python.ArrowPythonRunner$$anon$1.read(ArrowPythonRunner.scala:177)
	at org.apache.spark.sql.execution.python.ArrowPythonRunner$$anon$1.read(ArrowPythonRunner.scala:121)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:252)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:253)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:392)
	at org.apache.spark.sql.execution.python.ArrowPythonRunner$$anon$1.read(ArrowPythonRunner.scala:158)
	... 24 more
Driver stacktrace:
	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1602)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1590)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1589)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1589)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
	at scala.Option.foreach(Option.scala:257)
	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1823)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1772)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1761)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2034)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2055)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2074)
	at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:363)
	at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
	at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3273)
	at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2484)
	at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2484)
	at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3254)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
	at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3253)
	at org.apache.spark.sql.Dataset.head(Dataset.scala:2484)
	at org.apache.spark.sql.Dataset.take(Dataset.scala:2698)
	at org.apache.spark.sql.Dataset.showString(Dataset.scala:254)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:333)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:322)
	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
	at org.apache.spark.sql.execution.python.ArrowPythonRunner$$anon$1.read(ArrowPythonRunner.scala:177)
	at org.apache.spark.sql.execution.python.ArrowPythonRunner$$anon$1.read(ArrowPythonRunner.scala:121)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:252)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:253)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	... 1 more
Caused by: java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:392)
	at org.apache.spark.sql.execution.python.ArrowPythonRunner$$anon$1.read(ArrowPythonRunner.scala:158)
	... 24 more
2018-08-17 13:01:13 INFO  SparkContext:54 - Invoking stop() from shutdown hook
2018-08-17 13:01:13 INFO  AbstractConnector:318 - Stopped Spark@780d9633{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2018-08-17 13:01:13 INFO  SparkUI:54 - Stopped Spark web UI at http://192.168.1.9:4040
2018-08-17 13:01:13 INFO  MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2018-08-17 13:01:13 INFO  MemoryStore:54 - MemoryStore cleared
2018-08-17 13:01:13 INFO  BlockManager:54 - BlockManager stopped
2018-08-17 13:01:13 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
2018-08-17 13:01:13 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2018-08-17 13:01:13 INFO  SparkContext:54 - Successfully stopped SparkContext
2018-08-17 13:01:13 INFO  ShutdownHookManager:54 - Shutdown hook called
2018-08-17 13:01:13 INFO  ShutdownHookManager:54 - Deleting directory /private/var/folders/07/gnp1f3hs7kn7p0j7kcs2g4s80000gn/T/spark-5fc00be2-3067-4905-8c0e-dc8516137fad/pyspark-4ae4108c-ea13-44d9-a714-3a42c0778703
2018-08-17 13:01:13 INFO  ShutdownHookManager:54 - Deleting directory /private/var/folders/07/gnp1f3hs7kn7p0j7kcs2g4s80000gn/T/spark-89c6c777-5431-4f30-b020-dcc25e322b3a
2018-08-17 13:01:13 INFO ShutdownHookManager:54 - Deleting directory /private/var/folders/07/gnp1f3hs7kn7p0j7kcs2g4s80000gn/T/spark-5fc00be2-3067-4905-8c0e-dc8516137fad |
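
Diagnosis: the two objc[42215] lines above are the real failure, not Spark itself. On macOS 10.13+ the Objective-C runtime aborts a fork()ed child that touches an uninitialized framework class, and PySpark's daemon forks a Python worker to run the Arrow-based pandas UDF. The worker dies on that check before writing any Arrow data back, so the JVM reader hits java.io.EOFException and Spark surfaces it as "Python worker exited unexpectedly (crashed)".

Below is a minimal sketch of the kind of script that reproduces this log, including the commonly cited workaround. The traceback only shows ".apply(normalize)" at spark-simple.py line 25 and the app name "pandas_udf", so the schema, sample data, and the body of normalize are assumptions for illustration, not the original file.

# Hypothetical reconstruction of spark-simple.py (Spark 2.3.x grouped-map
# pandas UDF); everything beyond .apply(normalize) and the app name is assumed.
import os

# Workaround for the objc fork-safety abort on macOS 10.13+. Set it before the
# gateway JVM starts so the forked Python workers inherit it (local mode).
os.environ["OBJC_DISABLE_INITIALIZE_FORK_SAFETY"] = "YES"

from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf, PandasUDFType

spark = SparkSession.builder.appName("pandas_udf").getOrCreate()

df = spark.createDataFrame(
    [(1, 1.0), (1, 2.0), (2, 3.0), (2, 5.0), (2, 10.0)],
    ("id", "v"),
)

# Grouped-map pandas UDF: each group arrives in the Python worker as a
# pandas.DataFrame over Arrow, and the returned DataFrame must match the
# declared schema.
@pandas_udf("id long, v double", PandasUDFType.GROUPED_MAP)
def normalize(pdf):
    # Subtract the per-group mean and divide by the per-group std dev.
    v = pdf.v
    return pdf.assign(v=(v - v.mean()) / v.std())

# show() triggers the job; on an affected macOS setup without the environment
# variable above, the forked Python worker crashes at this step.
df.groupby("id").apply(normalize).show()

Equivalently, exporting OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES in the shell before spark-submit avoids the abort; setting it inside the script only helps when it runs before the JVM (and therefore the workers) is spawned, as sketched above.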