25/09/10 10:02:45 INFO FileSourceStrategy: Post-Scan Filters:
25/09/10 10:02:45 INFO ShufflePartitionsUtil: For shuffle(6), advisory target size: 67108864, actual target size 1048576, minimum partition size: 1048576
25/09/10 10:02:45 INFO SparkContext: Starting job: $anonfun$withThreadLocalCaptured$1 at FutureTask.java:264
25/09/10 10:02:45 INFO DAGScheduler: Got job 16 ($anonfun$withThreadLocalCaptured$1 at FutureTask.java:264) with 1 output partitions
25/09/10 10:02:45 INFO DAGScheduler: Final stage: ResultStage 26 ($anonfun$withThreadLocalCaptured$1 at FutureTask.java:264)
25/09/10 10:02:45 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 25)
25/09/10 10:02:45 INFO DAGScheduler: Missing parents: List()
25/09/10 10:02:45 INFO DAGScheduler: Submitting ResultStage 26 (MapPartitionsRDD[65] at $anonfun$withThreadLocalCaptured$1 at FutureTask.java:264), which has no missing parents
25/09/10 10:02:45 INFO MemoryStore: Block broadcast_29 stored as values in memory (estimated size 113.4 KiB, free 426.6 MiB)
25/09/10 10:02:45 INFO MemoryStore: Block broadcast_29_piece0 stored as bytes in memory (estimated size 40.8 KiB, free 426.6 MiB)
25/09/10 10:02:45 INFO BlockManagerInfo: Added broadcast_29_piece0 in memory on ncias-d3613-v.nci.nih.gov:43165 (size: 40.8 KiB, free: 433.8 MiB)
25/09/10 10:02:45 INFO SparkContext: Created broadcast 29 from broadcast at DAGScheduler.scala:1611
25/09/10 10:02:45 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 26 (MapPartitionsRDD[65] at $anonfun$withThreadLocalCaptured$1 at FutureTask.java:264) (first 15 tasks are for partitions Vector(0))
25/09/10 10:02:45 INFO TaskSchedulerImpl: Adding task set 26.0 with 1 tasks resource profile 0
25/09/10 10:02:45 INFO TaskSetManager: Starting task 0.0 in stage 26.0 (TID 17) (ncias-d3613-v.nci.nih.gov, executor driver, partition 0, NODE_LOCAL, 9265 bytes)
25/09/10 10:02:45 INFO Executor: Running task 0.0 in stage 26.0 (TID 17)
25/09/10 10:02:45 INFO ShuffleBlockFetcherIterator: Getting 1 (12.3 KiB) non-empty blocks including 1 (12.3 KiB) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 0 (0.0 B) remote blocks
25/09/10 10:02:45 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
25/09/10 10:02:45 INFO CodeGenerator: Code generated in 5.752978 ms
25/09/10 10:02:45 INFO CodeGenerator: Code generated in 9.67193 ms
25/09/10 10:02:45 INFO Executor: Finished task 0.0 in stage 26.0 (TID 17). 27006 bytes result sent to driver
25/09/10 10:02:45 INFO TaskSetManager: Finished task 0.0 in stage 26.0 (TID 17) in 59 ms on ncias-d3613-v.nci.nih.gov (executor driver) (1/1)
25/09/10 10:02:45 INFO TaskSchedulerImpl: Removed TaskSet 26.0, whose tasks have all completed, from pool
25/09/10 10:02:45 INFO DAGScheduler: ResultStage 26 ($anonfun$withThreadLocalCaptured$1 at FutureTask.java:264) finished in 0.076 s
25/09/10 10:02:45 INFO DAGScheduler: Job 16 is finished. Cancelling potential speculative or zombie tasks for this job
25/09/10 10:02:45 INFO TaskSchedulerImpl: Killing all running tasks in stage 26: Stage finished
25/09/10 10:02:45 INFO DAGScheduler: Job 16 finished: $anonfun$withThreadLocalCaptured$1 at FutureTask.java:264, took 0.079833 s
25/09/10 10:02:45 INFO CodeGenerator: Code generated in 7.817072 ms
25/09/10 10:02:45 INFO MemoryStore: Block broadcast_30 stored as values in memory (estimated size 1026.0 KiB, free 425.6 MiB)
25/09/10 10:02:45 INFO MemoryStore: Block broadcast_30_piece0 stored as bytes in memory (estimated size 5.8 KiB, free 425.6 MiB)
25/09/10 10:02:45 INFO BlockManagerInfo: Added broadcast_30_piece0 in memory on ncias-d3613-v.nci.nih.gov:43165 (size: 5.8 KiB, free: 433.8 MiB)
25/09/10 10:02:45 INFO SparkContext: Created broadcast 30 from $anonfun$withThreadLocalCaptured$1 at FutureTask.java:264
25/09/10 10:02:45 INFO FileSourceStrategy: Pushed Filters:
25/09/10 10:02:45 INFO FileSourceStrategy: Post-Scan Filters:
25/09/10 10:02:45 INFO FileSourceStrategy: Pushed Filters:
25/09/10 10:02:45 INFO FileSourceStrategy: Post-Scan Filters:
25/09/10 10:02:45 INFO ParquetUtils: Using default output committer for Parquet: org.apache.parquet.hadoop.ParquetOutputCommitter
25/09/10 10:02:45 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
25/09/10 10:02:45 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
25/09/10 10:02:45 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
25/09/10 10:02:45 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
25/09/10 10:02:45 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
25/09/10 10:02:45 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
25/09/10 10:02:45 INFO CodeGenerator: Code generated in 23.576945 ms
25/09/10 10:02:45 INFO MemoryStore: Block broadcast_31 stored as values in memory (estimated size 203.2 KiB, free 425.4 MiB)
25/09/10 10:02:45 INFO MemoryStore: Block broadcast_31_piece0 stored as bytes in memory (estimated size 35.7 KiB, free 425.3 MiB)
25/09/10 10:02:45 INFO BlockManagerInfo: Added broadcast_31_piece0 in memory on ncias-d3613-v.nci.nih.gov:43165 (size: 35.7 KiB, free: 433.8 MiB)
25/09/10 10:02:45 INFO SparkContext: Created broadcast 31 from parquet at ekg_export.java:1285
25/09/10 10:02:45 INFO FileSourceScanExec: Planning scan with bin packing, max size: 4194304 bytes, open cost is considered as scanning 4194304 bytes.
25/09/10 10:02:45 INFO CodeGenerator: Code generated in 29.606664 ms
25/09/10 10:02:45 INFO MemoryStore: Block broadcast_32 stored as values in memory (estimated size 203.6 KiB, free 425.1 MiB)
25/09/10 10:02:45 INFO BlockManagerInfo: Removed broadcast_23_piece0 on ncias-d3613-v.nci.nih.gov:43165 in memory (size: 22.6 KiB, free: 433.8 MiB)
25/09/10 10:02:45 INFO BlockManagerInfo: Removed broadcast_28_piece0 on ncias-d3613-v.nci.nih.gov:43165 in memory (size: 54.2 KiB, free: 433.9 MiB)
25/09/10 10:02:45 INFO MemoryStore: Block broadcast_32_piece0 stored as bytes in memory (estimated size 35.8 KiB, free 425.4 MiB)
25/09/10 10:02:45 INFO BlockManagerInfo: Added broadcast_32_piece0 in memory on ncias-d3613-v.nci.nih.gov:43165 (size: 35.8 KiB, free: 433.8 MiB)
25/09/10 10:02:45 INFO SparkContext: Created broadcast 32 from parquet at ekg_export.java:1285
25/09/10 10:02:45 INFO FileSourceScanExec: Planning scan with bin packing, max size: 4194304 bytes, open cost is considered as scanning 4194304 bytes.
25/09/10 10:02:45 INFO SparkContext: Starting job: parquet at ekg_export.java:1285
25/09/10 10:02:45 INFO BlockManagerInfo: Removed broadcast_9_piece0 on ncias-d3613-v.nci.nih.gov:43165 in memory (size: 8.5 KiB, free: 433.8 MiB)
25/09/10 10:02:45 INFO DAGScheduler: Got job 17 (parquet at ekg_export.java:1285) with 2 output partitions
25/09/10 10:02:45 INFO DAGScheduler: Final stage: ResultStage 27 (parquet at ekg_export.java:1285)
25/09/10 10:02:45 INFO DAGScheduler: Parents of final stage: List()
25/09/10 10:02:45 INFO DAGScheduler: Missing parents: List()
25/09/10 10:02:45 INFO DAGScheduler: Submitting ResultStage 27 (MapPartitionsRDD[73] at parquet at ekg_export.java:1285), which has no missing parents
25/09/10 10:02:45 INFO BlockManagerInfo: Removed broadcast_6_piece0 on ncias-d3613-v.nci.nih.gov:43165 in memory (size: 7.0 KiB, free: 433.9 MiB)
25/09/10 10:02:45 INFO BlockManagerInfo: Removed broadcast_20_piece0 on ncias-d3613-v.nci.nih.gov:43165 in memory (size: 22.0 KiB, free: 433.9 MiB)
25/09/10 10:02:45 INFO BlockManagerInfo: Removed broadcast_19_piece0 on ncias-d3613-v.nci.nih.gov:43165 in memory (size: 22.6 KiB, free: 433.9 MiB)
25/09/10 10:02:45 INFO BlockManagerInfo: Removed broadcast_7_piece0 on ncias-d3613-v.nci.nih.gov:43165 in memory (size: 15.0 KiB, free: 433.9 MiB)
25/09/10 10:02:45 INFO MemoryStore: Block broadcast_33 stored as values in memory (estimated size 254.4 KiB, free 425.5 MiB)
25/09/10 10:02:45 INFO BlockManagerInfo: Removed broadcast_29_piece0 on ncias-d3613-v.nci.nih.gov:43165 in memory (size: 40.8 KiB, free: 434.0 MiB)
25/09/10 10:02:45 INFO MemoryStore: Block broadcast_33_piece0 stored as bytes in memory (estimated size 86.8 KiB, free 425.5 MiB)
25/09/10 10:02:45 INFO BlockManagerInfo: Added broadcast_33_piece0 in memory on ncias-d3613-v.nci.nih.gov:43165 (size: 86.8 KiB, free: 433.9 MiB)
25/09/10 10:02:45 INFO BlockManagerInfo: Removed broadcast_26_piece0 on ncias-d3613-v.nci.nih.gov:43165 in memory (size: 24.7 KiB, free: 433.9 MiB)
25/09/10 10:02:45 INFO SparkContext: Created broadcast 33 from broadcast at DAGScheduler.scala:1611
25/09/10 10:02:45 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 27 (MapPartitionsRDD[73] at parquet at ekg_export.java:1285) (first 15 tasks are for partitions Vector(0, 1))
25/09/10 10:02:45 INFO TaskSchedulerImpl: Adding task set 27.0 with 2 tasks resource profile 0
25/09/10 10:02:45 INFO BlockManagerInfo: Removed broadcast_11_piece0 on ncias-d3613-v.nci.nih.gov:43165 in memory (size: 8.5 KiB, free: 433.9 MiB)
25/09/10 10:02:45 INFO TaskSetManager: Starting task 0.0 in stage 27.0 (TID 18) (ncias-d3613-v.nci.nih.gov, executor driver, partition 0, PROCESS_LOCAL, 10115 bytes)
25/09/10 10:02:45 INFO TaskSetManager: Starting task 1.0 in stage 27.0 (TID 19) (ncias-d3613-v.nci.nih.gov, executor driver, partition 1, PROCESS_LOCAL, 10064 bytes)
25/09/10 10:02:45 INFO Executor: Running task 0.0 in stage 27.0 (TID 18)
25/09/10 10:02:45 INFO Executor: Running task 1.0 in stage 27.0 (TID 19)
25/09/10 10:02:45 INFO BlockManagerInfo: Removed broadcast_14_piece0 on ncias-d3613-v.nci.nih.gov:43165 in memory (size: 8.3 KiB, free: 433.9 MiB)
25/09/10 10:02:45 INFO BlockManagerInfo: Removed broadcast_15_piece0 on ncias-d3613-v.nci.nih.gov:43165 in memory (size: 16.9 KiB, free: 433.9 MiB)
25/09/10 10:02:45 INFO BlockManagerInfo: Removed broadcast_25_piece0 on ncias-d3613-v.nci.nih.gov:43165 in memory (size: 25.3 KiB, free: 433.9 MiB)
25/09/10 10:02:45 INFO CodeGenerator: Code generated in 21.971419 ms
25/09/10 10:02:45 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
25/09/10 10:02:45 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
25/09/10 10:02:45 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
25/09/10 10:02:45 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
25/09/10 10:02:45 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
25/09/10 10:02:45 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
25/09/10 10:02:45 INFO CodeGenerator: Code generated in 26.947958 ms
25/09/10 10:02:45 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
25/09/10 10:02:45 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
25/09/10 10:02:45 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
25/09/10 10:02:45 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
25/09/10 10:02:45 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
25/09/10 10:02:45 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
25/09/10 10:02:45 INFO FileScanRDD: Reading File path: file:///data/users/nicholsenpm/airflow_extractions/BTRIS_for_Databricks_09082025_140444/input/ekg_notional/ekg-notional.parquet, range: 0-33734, partition values: [empty row]
25/09/10 10:02:45 INFO CodecConfig: Compression: GZIP
25/09/10 10:02:45 INFO CodecConfig: Compression: GZIP
25/09/10 10:02:45 INFO CodecPool: Got brand-new decompressor [.zstd]
25/09/10 10:02:45 INFO ParquetOutputFormat: ParquetRecordWriter [block size: 134217728b, row group padding size: 8388608b, validating: false]
25/09/10 10:02:45 ERROR Executor: Exception in task 1.0 in stage 27.0 (TID 19)
java.lang.UnsatisfiedLinkError: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: failed to map segment from shared object
no zstd-jni-1.5.5-4 in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
Unsupported OS/arch, cannot find /linux/amd64/libzstd-jni-1.5.5-4.so or load zstd-jni-1.5.5-4 from system libraries. Please try building from source the jar or providing libzstd-jni-1.5.5-4 in your system.
at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2678)
at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830)
at java.base/java.lang.System.loadLibrary(System.java:1890)
at com.github.luben.zstd.util.Native$1.run(Native.java:69)
at com.github.luben.zstd.util.Native$1.run(Native.java:67)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at com.github.luben.zstd.util.Native.loadLibrary(Native.java:67)
at com.github.luben.zstd.util.Native.load(Native.java:154)
at com.github.luben.zstd.util.Native.load(Native.java:85)
at com.github.luben.zstd.ZstdOutputStreamNoFinalizer.<clinit>(ZstdOutputStreamNoFinalizer.java:18)
at com.github.luben.zstd.RecyclingBufferPool.<clinit>(RecyclingBufferPool.java:18)
at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:90)
at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:83)
at org.apache.parquet.hadoop.CodecFactory$HeapBytesDecompressor.decompress(CodecFactory.java:112)
at org.apache.parquet.hadoop.ColumnChunkPageReadStore$ColumnChunkPageReader.readDictionaryPage(ColumnChunkPageReadStore.java:236)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.<init>(VectorizedColumnReader.java:120)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initColumnReader(VectorizedParquetRecordReader.java:437)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:427)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:335)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextKeyValue(VectorizedParquetRecordReader.java:233)
at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:286)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131)
at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:593)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.columnartorow_nextBatch_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:385)
at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:621)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:624)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
25/09/10 10:02:45 INFO ParquetWriteSupport: Initialized Parquet WriteSupport with Catalyst schema:
{
  "type" : "struct",
  "fields" : [ {
    "name" : "date_administered",
    "type" : "date",
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "observation_text",
    "type" : "string",
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "btris_category",
    "type" : "string",
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "administering_provider",
    "type" : "string",
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "cris_orderset_category",
    "type" : "string",
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "cris_orderset_subcategory",
    "type" : "string",
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "ekg_id",
    "type" : "string",
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "sequence",
    "type" : "double",
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "observation_name",
    "type" : "string",
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "unit_of_measure",
    "type" : "string",
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "timestamp_date_administered",
    "type" : "string",
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "allocated_to_protocol",
    "type" : "string",
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "other_order_detail",
    "type" : "string",
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "priority",
    "type" : "string",
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "protocol_number_fbum",
    "type" : {
      "type" : "array",
      "elementType" : "string",
      "containsNull" : true
    },
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "mrn",
    "type" : "string",
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "protocol_number",
    "type" : {
      "type" : "array",
      "elementType" : "string",
      "containsNull" : true
    },
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "primary_date_time",
    "type" : "date",
    "nullable" : true,
    "metadata" : { }
  } ]
}
and corresponding Parquet message type:
message spark_schema {
  optional int32 date_administered (DATE);
  optional binary observation_text (STRING);
  optional binary btris_category (STRING);
  optional binary administering_provider (STRING);
  optional binary cris_orderset_category (STRING);
  optional binary cris_orderset_subcategory (STRING);
  optional binary ekg_id (STRING);
  optional double sequence;
  optional binary observation_name (STRING);
  optional binary unit_of_measure (STRING);
  optional binary timestamp_date_administered (STRING);
  optional binary allocated_to_protocol (STRING);
  optional binary other_order_detail (STRING);
  optional binary priority (STRING);
  optional group protocol_number_fbum (LIST) {
    repeated group list {
      optional binary element (STRING);
    }
  }
  optional binary mrn (STRING);
  optional group protocol_number (LIST) {
    repeated group list {
      optional binary element (STRING);
    }
  }
  optional int32 primary_date_time (DATE);
}
25/09/10 10:02:45 WARN TaskSetManager: Lost task 1.0 in stage 27.0 (TID 19) (ncias-d3613-v.nci.nih.gov executor driver): java.lang.UnsatisfiedLinkError: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: failed to map segment from shared object
no zstd-jni-1.5.5-4 in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
Unsupported OS/arch, cannot find /linux/amd64/libzstd-jni-1.5.5-4.so or load zstd-jni-1.5.5-4 from system libraries. Please try building from source the jar or providing libzstd-jni-1.5.5-4 in your system.
at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2678)
at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830)
at java.base/java.lang.System.loadLibrary(System.java:1890)
at com.github.luben.zstd.util.Native$1.run(Native.java:69)
at com.github.luben.zstd.util.Native$1.run(Native.java:67)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at com.github.luben.zstd.util.Native.loadLibrary(Native.java:67)
at com.github.luben.zstd.util.Native.load(Native.java:154)
at com.github.luben.zstd.util.Native.load(Native.java:85)
at com.github.luben.zstd.ZstdOutputStreamNoFinalizer.<clinit>(ZstdOutputStreamNoFinalizer.java:18)
at com.github.luben.zstd.RecyclingBufferPool.<clinit>(RecyclingBufferPool.java:18)
at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:90)
at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:83)
at org.apache.parquet.hadoop.CodecFactory$HeapBytesDecompressor.decompress(CodecFactory.java:112)
at org.apache.parquet.hadoop.ColumnChunkPageReadStore$ColumnChunkPageReader.readDictionaryPage(ColumnChunkPageReadStore.java:236)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.<init>(VectorizedColumnReader.java:120)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initColumnReader(VectorizedParquetRecordReader.java:437)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:427)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:335)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextKeyValue(VectorizedParquetRecordReader.java:233)
at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:286)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131)
at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:593)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.columnartorow_nextBatch_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:385)
at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:621)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:624)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
25/09/10 10:02:45 ERROR TaskSetManager: Task 1 in stage 27.0 failed 1 times; aborting job
25/09/10 10:02:45 INFO TaskSchedulerImpl: Cancelling stage 27
25/09/10 10:02:45 INFO TaskSchedulerImpl: Killing all running tasks in stage 27: Stage cancelled: Job aborted due to stage failure: Task 1 in stage 27.0 failed 1 times, most recent failure: Lost task 1.0 in stage 27.0 (TID 19) (ncias-d3613-v.nci.nih.gov executor driver): java.lang.UnsatisfiedLinkError: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: failed to map segment from shared object
no zstd-jni-1.5.5-4 in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
Unsupported OS/arch, cannot find /linux/amd64/libzstd-jni-1.5.5-4.so or load zstd-jni-1.5.5-4 from system libraries. Please try building from source the jar or providing libzstd-jni-1.5.5-4 in your system.
at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2678)
at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830)
at java.base/java.lang.System.loadLibrary(System.java:1890)
at com.github.luben.zstd.util.Native$1.run(Native.java:69)
at com.github.luben.zstd.util.Native$1.run(Native.java:67)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at com.github.luben.zstd.util.Native.loadLibrary(Native.java:67)
at com.github.luben.zstd.util.Native.load(Native.java:154)
at com.github.luben.zstd.util.Native.load(Native.java:85)
at com.github.luben.zstd.ZstdOutputStreamNoFinalizer.<clinit>(ZstdOutputStreamNoFinalizer.java:18)
at com.github.luben.zstd.RecyclingBufferPool.<clinit>(RecyclingBufferPool.java:18)
at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:90)
at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:83)
at org.apache.parquet.hadoop.CodecFactory$HeapBytesDecompressor.decompress(CodecFactory.java:112)
at org.apache.parquet.hadoop.ColumnChunkPageReadStore$ColumnChunkPageReader.readDictionaryPage(ColumnChunkPageReadStore.java:236)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.<init>(VectorizedColumnReader.java:120)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initColumnReader(VectorizedParquetRecordReader.java:437)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:427)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:335)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextKeyValue(VectorizedParquetRecordReader.java:233)
at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:286)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131)
at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:593)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.columnartorow_nextBatch_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:385)
at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:621)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:624)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Driver stacktrace:
25/09/10 10:02:45 INFO Executor: Executor is trying to kill task 0.0 in stage 27.0 (TID 18), reason: Stage cancelled: Job aborted due to stage failure: Task 1 in stage 27.0 failed 1 times, most recent failure: Lost task 1.0 in stage 27.0 (TID 19) (ncias-d3613-v.nci.nih.gov executor driver): java.lang.UnsatisfiedLinkError: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: failed to map segment from shared object
no zstd-jni-1.5.5-4 in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
Unsupported OS/arch, cannot find /linux/amd64/libzstd-jni-1.5.5-4.so or load zstd-jni-1.5.5-4 from system libraries. Please try building from source the jar or providing libzstd-jni-1.5.5-4 in your system.
at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2678)
at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830)
at java.base/java.lang.System.loadLibrary(System.java:1890)
at com.github.luben.zstd.util.Native$1.run(Native.java:69)
at com.github.luben.zstd.util.Native$1.run(Native.java:67)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at com.github.luben.zstd.util.Native.loadLibrary(Native.java:67)
at com.github.luben.zstd.util.Native.load(Native.java:154)
at com.github.luben.zstd.util.Native.load(Native.java:85)
at com.github.luben.zstd.ZstdOutputStreamNoFinalizer.<clinit>(ZstdOutputStreamNoFinalizer.java:18)
at com.github.luben.zstd.RecyclingBufferPool.<clinit>(RecyclingBufferPool.java:18)
at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:90)
at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:83)
at org.apache.parquet.hadoop.CodecFactory$HeapBytesDecompressor.decompress(CodecFactory.java:112)
at org.apache.parquet.hadoop.ColumnChunkPageReadStore$ColumnChunkPageReader.readDictionaryPage(ColumnChunkPageReadStore.java:236)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.<init>(VectorizedColumnReader.java:120)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initColumnReader(VectorizedParquetRecordReader.java:437)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:427)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:335)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextKeyValue(VectorizedParquetRecordReader.java:233)
at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:286)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131)
at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:593)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.columnartorow_nextBatch_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:385)
at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:621)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:624)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Driver stacktrace:
25/09/10 10:02:45 INFO TaskSchedulerImpl: Stage 27 was cancelled
25/09/10 10:02:45 INFO DAGScheduler: ResultStage 27 (parquet at ekg_export.java:1285) failed in 0.196 s due to Job aborted due to stage failure: Task 1 in stage 27.0 failed 1 times, most recent failure: Lost task 1.0 in stage 27.0 (TID 19) (ncias-d3613-v.nci.nih.gov executor driver): java.lang.UnsatisfiedLinkError: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: failed to map segment from shared object
no zstd-jni-1.5.5-4 in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
Unsupported OS/arch, cannot find /linux/amd64/libzstd-jni-1.5.5-4.so or load zstd-jni-1.5.5-4 from system libraries. Please try building from source the jar or providing libzstd-jni-1.5.5-4 in your system.
at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2678)
at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830)
at java.base/java.lang.System.loadLibrary(System.java:1890)
at com.github.luben.zstd.util.Native$1.run(Native.java:69)
at com.github.luben.zstd.util.Native$1.run(Native.java:67)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at com.github.luben.zstd.util.Native.loadLibrary(Native.java:67)
at com.github.luben.zstd.util.Native.load(Native.java:154)
at com.github.luben.zstd.util.Native.load(Native.java:85)
at com.github.luben.zstd.ZstdOutputStreamNoFinalizer.<clinit>(ZstdOutputStreamNoFinalizer.java:18)
at com.github.luben.zstd.RecyclingBufferPool.<clinit>(RecyclingBufferPool.java:18)
at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:90)
at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:83)
at org.apache.parquet.hadoop.CodecFactory$HeapBytesDecompressor.decompress(CodecFactory.java:112)
at org.apache.parquet.hadoop.ColumnChunkPageReadStore$ColumnChunkPageReader.readDictionaryPage(ColumnChunkPageReadStore.java:236)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.<init>(VectorizedColumnReader.java:120)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initColumnReader(VectorizedParquetRecordReader.java:437)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:427)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:335)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextKeyValue(VectorizedParquetRecordReader.java:233)
at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:286)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131)
at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:593)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.columnartorow_nextBatch_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:385)
at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:621)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:624)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Driver stacktrace:
25/09/10 10:02:45 INFO DAGScheduler: Job 17 failed: parquet at ekg_export.java:1285, took 0.203010 s
25/09/10 10:02:45 ERROR FileFormatWriter: Aborting job b03decbb-2ceb-49e5-b107-bde4d9b57b78.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 27.0 failed 1 times, most recent failure: Lost task 1.0 in stage 27.0 (TID 19) (ncias-d3613-v.nci.nih.gov executor driver): java.lang.UnsatisfiedLinkError: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: failed to map segment from shared object
no zstd-jni-1.5.5-4 in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
Unsupported OS/arch, cannot find /linux/amd64/libzstd-jni-1.5.5-4.so or load zstd-jni-1.5.5-4 from system libraries. Please try building from source the jar or providing libzstd-jni-1.5.5-4 in your system.
at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2678)
at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830)
at java.base/java.lang.System.loadLibrary(System.java:1890)
at com.github.luben.zstd.util.Native$1.run(Native.java:69)
at com.github.luben.zstd.util.Native$1.run(Native.java:67)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at com.github.luben.zstd.util.Native.loadLibrary(Native.java:67)
at com.github.luben.zstd.util.Native.load(Native.java:154)
at com.github.luben.zstd.util.Native.load(Native.java:85)
at com.github.luben.zstd.ZstdOutputStreamNoFinalizer.<clinit>(ZstdOutputStreamNoFinalizer.java:18)
at com.github.luben.zstd.RecyclingBufferPool.<clinit>(RecyclingBufferPool.java:18)
at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:90)
at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:83)
at org.apache.parquet.hadoop.CodecFactory$HeapBytesDecompressor.decompress(CodecFactory.java:112)
at org.apache.parquet.hadoop.ColumnChunkPageReadStore$ColumnChunkPageReader.readDictionaryPage(ColumnChunkPageReadStore.java:236)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.<init>(VectorizedColumnReader.java:120)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initColumnReader(VectorizedParquetRecordReader.java:437)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:427)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:335)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextKeyValue(VectorizedParquetRecordReader.java:233)
at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:286)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131)
at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:593)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.columnartorow_nextBatch_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:385)
at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:621)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:624)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2898)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2834)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2833)
at scala.collection.immutable.List.foreach(List.scala:333)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2833)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1253)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1253)
at scala.Option.foreach(Option.scala:437)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1253)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:3102)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:3036)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:3025)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:995)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2393)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeWrite$4(FileFormatWriter.scala:307)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.writeAndCommit(FileFormatWriter.scala:271)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeWrite(FileFormatWriter.scala:304)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:190)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:190)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:113)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:111)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:125)
at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.$anonfun$executeCollect$1(AdaptiveSparkPlanExec.scala:392)
at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.withFinalPlanUpdate(AdaptiveSparkPlanExec.scala:420)
at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.executeCollect(AdaptiveSparkPlanExec.scala:392)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:107)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:125)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:201)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:108)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:66)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:107)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:461)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:76)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:461)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:32)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:437)
at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:98)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:85)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:83)
at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:142)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:869)
at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:391)
at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:364)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:243)
at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:802)
at ekg_export.main(ekg_export.java:1285)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1034)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:199)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:222)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1125)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1134)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.UnsatisfiedLinkError: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: failed to map segment from shared object
no zstd-jni-1.5.5-4 in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
Unsupported OS/arch, cannot find /linux/amd64/libzstd-jni-1.5.5-4.so or load zstd-jni-1.5.5-4 from system libraries. Please try building from source the jar or providing libzstd-jni-1.5.5-4 in your system.
at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2678)
at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830)
at java.base/java.lang.System.loadLibrary(System.java:1890)
at com.github.luben.zstd.util.Native$1.run(Native.java:69)
at com.github.luben.zstd.util.Native$1.run(Native.java:67)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at com.github.luben.zstd.util.Native.loadLibrary(Native.java:67)
at com.github.luben.zstd.util.Native.load(Native.java:154)
at com.github.luben.zstd.util.Native.load(Native.java:85)
at com.github.luben.zstd.ZstdOutputStreamNoFinalizer.<clinit>(ZstdOutputStreamNoFinalizer.java:18)
at com.github.luben.zstd.RecyclingBufferPool.<clinit>(RecyclingBufferPool.java:18)
at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:90)
at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:83)
at org.apache.parquet.hadoop.CodecFactory$HeapBytesDecompressor.decompress(CodecFactory.java:112)
at org.apache.parquet.hadoop.ColumnChunkPageReadStore$ColumnChunkPageReader.readDictionaryPage(ColumnChunkPageReadStore.java:236)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.<init>(VectorizedColumnReader.java:120)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initColumnReader(VectorizedParquetRecordReader.java:437)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:427)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:335)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextKeyValue(VectorizedParquetRecordReader.java:233)
at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:286)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131)
at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:593)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.columnartorow_nextBatch_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:385)
at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:621)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:624)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
25/09/10 10:02:45 INFO CodecPool: Got brand-new compressor [.gz]
| 25/09/10 10:02:45 ERROR Utils: Aborting task | |
| org.apache.spark.TaskKilledException | |
| at org.apache.spark.TaskContextImpl.killTaskIfInterrupted(TaskContextImpl.scala:267) | |
| at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:130) | |
| at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:593) | |
| at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage21.columnartorow_nextBatch_0$(Unknown Source) | |
| at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage21.processNext(Unknown Source) | |
| at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) | |
| at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43) | |
| at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.writeWithIterator(FileFormatDataWriter.scala:91) | |
| at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeTask$1(FileFormatWriter.scala:403) | |
| at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1397) | |
| at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:410) | |
| at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100) | |
| at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893) | |
| at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893) | |
| at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) | |
| at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) | |
| at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) | |
| at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93) | |
| at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166) | |
| at org.apache.spark.scheduler.Task.run(Task.scala:141) | |
| at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:621) | |
| at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) | |
| at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) | |
| at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94) | |
| at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:624) | |
| at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) | |
| at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) | |
| at java.base/java.lang.Thread.run(Thread.java:829) | |
| 25/09/10 10:02:45 WARN FileOutputCommitter: Could not delete file:/data/users/nicholsenpm/airflow_extractions/BTRIS_for_Databricks_09082025_140444/output/ekg_export/_temporary/0/_temporary/attempt_202509101002456434935337330780011_0027_m_000000_18 | |
| 25/09/10 10:02:45 ERROR FileFormatWriter: Job job_202509101002456434935337330780011_0027 aborted. | |
| Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 27.0 failed 1 times, most recent failure: Lost task 1.0 in stage 27.0 (TID 19) (ncias-d3613-v.nci.nih.gov executor driver): java.lang.UnsatisfiedLinkError: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: failed to map segment from shared object | |
| no zstd-jni-1.5.5-4 in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib] | |
| Unsupported OS/arch, cannot find /linux/amd64/libzstd-jni-1.5.5-4.so or load zstd-jni-1.5.5-4 from system libraries. Please try building from source the jar or providing libzstd-jni-1.5.5-4 in your system. | |
| at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2678) | |
| at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830) | |
| at java.base/java.lang.System.loadLibrary(System.java:1890) | |
| at com.github.luben.zstd.util.Native$1.run(Native.java:69) | |
| at com.github.luben.zstd.util.Native$1.run(Native.java:67) | |
| at java.base/java.security.AccessController.doPrivileged(Native Method) | |
| at com.github.luben.zstd.util.Native.loadLibrary(Native.java:67) | |
| at com.github.luben.zstd.util.Native.load(Native.java:154) | |
| at com.github.luben.zstd.util.Native.load(Native.java:85) | |
| at com.github.luben.zstd.ZstdOutputStreamNoFinalizer.<clinit>(ZstdOutputStreamNoFinalizer.java:18) | |
| at com.github.luben.zstd.RecyclingBufferPool.<clinit>(RecyclingBufferPool.java:18) | |
| at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:90) | |
| at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:83) | |
| at org.apache.parquet.hadoop.CodecFactory$HeapBytesDecompressor.decompress(CodecFactory.java:112) | |
| at org.apache.parquet.hadoop.ColumnChunkPageReadStore$ColumnChunkPageReader.readDictionaryPage(ColumnChunkPageReadStore.java:236) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.<init>(VectorizedColumnReader.java:120) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initColumnReader(VectorizedParquetRecordReader.java:437) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:427) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:335) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextKeyValue(VectorizedParquetRecordReader.java:233) | |
| at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39) | |
| at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131) | |
| at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:286) | |
| at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131) | |
| at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:593) | |
| at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.columnartorow_nextBatch_0$(Unknown Source) | |
| at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.processNext(Unknown Source) | |
| at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) | |
| at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43) | |
| at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:385) | |
| at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100) | |
| at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893) | |
| at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893) | |
| at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) | |
| at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) | |
| at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) | |
| at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93) | |
| at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166) | |
| at org.apache.spark.scheduler.Task.run(Task.scala:141) | |
| at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:621) | |
| at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) | |
| at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) | |
| at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94) | |
| at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:624) | |
| at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) | |
| at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) | |
| at java.base/java.lang.Thread.run(Thread.java:829) | |
| Driver stacktrace: | |
| at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2898) | |
| at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2834) | |
| at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2833) | |
| at scala.collection.immutable.List.foreach(List.scala:333) | |
| at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2833) | |
| at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1253) | |
| at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1253) | |
| at scala.Option.foreach(Option.scala:437) | |
| at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1253) | |
| at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:3102) | |
| at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:3036) | |
| at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:3025) | |
| at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49) | |
| at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:995) | |
| at org.apache.spark.SparkContext.runJob(SparkContext.scala:2393) | |
| at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeWrite$4(FileFormatWriter.scala:307) | |
| at org.apache.spark.sql.execution.datasources.FileFormatWriter$.writeAndCommit(FileFormatWriter.scala:271) | |
| at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeWrite(FileFormatWriter.scala:304) | |
| at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:190) | |
| at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:190) | |
| at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:113) | |
| at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:111) | |
| at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:125) | |
| at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.$anonfun$executeCollect$1(AdaptiveSparkPlanExec.scala:392) | |
| at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.withFinalPlanUpdate(AdaptiveSparkPlanExec.scala:420) | |
| at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.executeCollect(AdaptiveSparkPlanExec.scala:392) | |
| at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:107) | |
| at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:125) | |
| at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:201) | |
| at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:108) | |
| at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900) | |
| at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:66) | |
| at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:107) | |
| at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98) | |
| at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:461) | |
| at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:76) | |
| at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:461) | |
| at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:32) | |
| at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267) | |
| at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263) | |
| at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32) | |
| at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32) | |
| at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:437) | |
| at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:98) | |
| at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:85) | |
| at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:83) | |
| at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:142) | |
| at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:869) | |
| at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:391) | |
| at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:364) | |
| at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:243) | |
| at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:802) | |
| at ekg_export.main(ekg_export.java:1285) | |
| at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) | |
| at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) | |
| at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) | |
| at java.base/java.lang.reflect.Method.invoke(Method.java:566) | |
| at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) | |
| at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1034) | |
| at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:199) | |
| at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:222) | |
| at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91) | |
| at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1125) | |
| at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1134) | |
| at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) | |
| Caused by: java.lang.UnsatisfiedLinkError: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: failed to map segment from shared object | |
| no zstd-jni-1.5.5-4 in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib] | |
| Unsupported OS/arch, cannot find /linux/amd64/libzstd-jni-1.5.5-4.so or load zstd-jni-1.5.5-4 from system libraries. Please try building from source the jar or providing libzstd-jni-1.5.5-4 in your system. | |
| at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2678) | |
| at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830) | |
| at java.base/java.lang.System.loadLibrary(System.java:1890) | |
| at com.github.luben.zstd.util.Native$1.run(Native.java:69) | |
| at com.github.luben.zstd.util.Native$1.run(Native.java:67) | |
| at java.base/java.security.AccessController.doPrivileged(Native Method) | |
| at com.github.luben.zstd.util.Native.loadLibrary(Native.java:67) | |
| at com.github.luben.zstd.util.Native.load(Native.java:154) | |
| at com.github.luben.zstd.util.Native.load(Native.java:85) | |
| at com.github.luben.zstd.ZstdOutputStreamNoFinalizer.<clinit>(ZstdOutputStreamNoFinalizer.java:18) | |
| at com.github.luben.zstd.RecyclingBufferPool.<clinit>(RecyclingBufferPool.java:18) | |
| at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:90) | |
| at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:83) | |
| at org.apache.parquet.hadoop.CodecFactory$HeapBytesDecompressor.decompress(CodecFactory.java:112) | |
| at org.apache.parquet.hadoop.ColumnChunkPageReadStore$ColumnChunkPageReader.readDictionaryPage(ColumnChunkPageReadStore.java:236) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.<init>(VectorizedColumnReader.java:120) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initColumnReader(VectorizedParquetRecordReader.java:437) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:427) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:335) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextKeyValue(VectorizedParquetRecordReader.java:233) | |
| at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39) | |
| at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131) | |
| at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:286) | |
| at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131) | |
| at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:593) | |
| at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.columnartorow_nextBatch_0$(Unknown Source) | |
| at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.processNext(Unknown Source) | |
| at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) | |
| at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43) | |
| at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:385) | |
| at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100) | |
| at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893) | |
| at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893) | |
| at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) | |
| at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) | |
| at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) | |
| at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93) | |
| at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166) | |
| at org.apache.spark.scheduler.Task.run(Task.scala:141) | |
| at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:621) | |
| at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) | |
| at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) | |
| at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94) | |
| at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:624) | |
| at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) | |
| at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) | |
| at java.base/java.lang.Thread.run(Thread.java:829) | |
| 25/09/10 10:02:46 INFO SparkContext: Invoking stop() from shutdown hook | |
| 25/09/10 10:02:46 INFO SparkContext: SparkContext is stopping with exitCode 0. | |
| 25/09/10 10:02:46 INFO Executor: Executor interrupted and killed task 0.0 in stage 27.0 (TID 18), reason: Stage cancelled: Job aborted due to stage failure: Task 1 in stage 27.0 failed 1 times, most recent failure: Lost task 1.0 in stage 27.0 (TID 19) (ncias-d3613-v.nci.nih.gov executor driver): java.lang.UnsatisfiedLinkError: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: failed to map segment from shared object | |
| no zstd-jni-1.5.5-4 in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib] | |
| Unsupported OS/arch, cannot find /linux/amd64/libzstd-jni-1.5.5-4.so or load zstd-jni-1.5.5-4 from system libraries. Please try building from source the jar or providing libzstd-jni-1.5.5-4 in your system. | |
| at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2678) | |
| at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830) | |
| at java.base/java.lang.System.loadLibrary(System.java:1890) | |
| at com.github.luben.zstd.util.Native$1.run(Native.java:69) | |
| at com.github.luben.zstd.util.Native$1.run(Native.java:67) | |
| at java.base/java.security.AccessController.doPrivileged(Native Method) | |
| at com.github.luben.zstd.util.Native.loadLibrary(Native.java:67) | |
| at com.github.luben.zstd.util.Native.load(Native.java:154) | |
| at com.github.luben.zstd.util.Native.load(Native.java:85) | |
| at com.github.luben.zstd.ZstdOutputStreamNoFinalizer.<clinit>(ZstdOutputStreamNoFinalizer.java:18) | |
| at com.github.luben.zstd.RecyclingBufferPool.<clinit>(RecyclingBufferPool.java:18) | |
| at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:90) | |
| at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:83) | |
| at org.apache.parquet.hadoop.CodecFactory$HeapBytesDecompressor.decompress(CodecFactory.java:112) | |
| at org.apache.parquet.hadoop.ColumnChunkPageReadStore$ColumnChunkPageReader.readDictionaryPage(ColumnChunkPageReadStore.java:236) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.<init>(VectorizedColumnReader.java:120) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initColumnReader(VectorizedParquetRecordReader.java:437) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:427) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:335) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextKeyValue(VectorizedParquetRecordReader.java:233) | |
| at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39) | |
| at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131) | |
| at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:286) | |
| at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131) | |
| at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:593) | |
| at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.columnartorow_nextBatch_0$(Unknown Source) | |
| at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.processNext(Unknown Source) | |
| at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) | |
| at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43) | |
| at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:385) | |
| at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100) | |
| at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893) | |
| at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893) | |
| at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) | |
| at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) | |
| at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) | |
| at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93) | |
| at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166) | |
| at org.apache.spark.scheduler.Task.run(Task.scala:141) | |
| at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:621) | |
| at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) | |
| at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) | |
| at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94) | |
| at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:624) | |
| at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) | |
| at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) | |
| at java.base/java.lang.Thread.run(Thread.java:829) | |
| Driver stacktrace: | |
| 25/09/10 10:02:46 WARN TaskSetManager: Lost task 0.0 in stage 27.0 (TID 18) (ncias-d3613-v.nci.nih.gov executor driver): TaskKilled (Stage cancelled: Job aborted due to stage failure: Task 1 in stage 27.0 failed 1 times, most recent failure: Lost task 1.0 in stage 27.0 (TID 19) (ncias-d3613-v.nci.nih.gov executor driver): java.lang.UnsatisfiedLinkError: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: /tmp/libzstd-jni-1.5.5-411526472785064416777.so: failed to map segment from shared object | |
| no zstd-jni-1.5.5-4 in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib] | |
| Unsupported OS/arch, cannot find /linux/amd64/libzstd-jni-1.5.5-4.so or load zstd-jni-1.5.5-4 from system libraries. Please try building from source the jar or providing libzstd-jni-1.5.5-4 in your system. | |
| at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2678) | |
| at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830) | |
| at java.base/java.lang.System.loadLibrary(System.java:1890) | |
| at com.github.luben.zstd.util.Native$1.run(Native.java:69) | |
| at com.github.luben.zstd.util.Native$1.run(Native.java:67) | |
| at java.base/java.security.AccessController.doPrivileged(Native Method) | |
| at com.github.luben.zstd.util.Native.loadLibrary(Native.java:67) | |
| at com.github.luben.zstd.util.Native.load(Native.java:154) | |
| at com.github.luben.zstd.util.Native.load(Native.java:85) | |
| at com.github.luben.zstd.ZstdOutputStreamNoFinalizer.<clinit>(ZstdOutputStreamNoFinalizer.java:18) | |
| at com.github.luben.zstd.RecyclingBufferPool.<clinit>(RecyclingBufferPool.java:18) | |
| at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:90) | |
| at org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:83) | |
| at org.apache.parquet.hadoop.CodecFactory$HeapBytesDecompressor.decompress(CodecFactory.java:112) | |
| at org.apache.parquet.hadoop.ColumnChunkPageReadStore$ColumnChunkPageReader.readDictionaryPage(ColumnChunkPageReadStore.java:236) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.<init>(VectorizedColumnReader.java:120) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initColumnReader(VectorizedParquetRecordReader.java:437) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:427) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:335) | |
| at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextKeyValue(VectorizedParquetRecordReader.java:233) | |
| at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39) | |
| at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131) | |
| at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:286) | |
| at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:131) | |
| at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:593) | |
| at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.columnartorow_nextBatch_0$(Unknown Source) | |
| at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.processNext(Unknown Source) | |
| at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) | |
| at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43) | |
| at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:385) | |
| at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100) | |
| at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893) | |
| at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893) | |
| at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) | |
| at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) | |
| at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) | |
| at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93) | |
| at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166) | |
| at org.apache.spark.scheduler.Task.run(Task.scala:141) | |
| at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:621) | |
| at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) | |
| at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) | |
| at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94) | |
| at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:624) | |
| at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) | |
| at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) | |
| at java.base/java.lang.Thread.run(Thread.java:829) | |
| Driver stacktrace:) | |
| 25/09/10 10:02:46 INFO TaskSchedulerImpl: Removed TaskSet 27.0, whose tasks have all completed, from pool | |
| 25/09/10 10:02:46 INFO SparkUI: Stopped Spark web UI at http://ncias-d3613-v.nci.nih.gov:4042 | |
| 25/09/10 10:02:46 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! | |
| 25/09/10 10:02:46 INFO MemoryStore: MemoryStore cleared | |
| 25/09/10 10:02:46 INFO BlockManager: BlockManager stopped | |
| 25/09/10 10:02:46 INFO BlockManagerMaster: BlockManagerMaster stopped | |
| 25/09/10 10:02:46 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! | |
| 25/09/10 10:02:46 INFO SparkContext: Successfully stopped SparkContext | |
| 25/09/10 10:02:46 INFO ShutdownHookManager: Shutdown hook called | |
| 25/09/10 10:02:46 INFO ShutdownHookManager: Deleting directory /tmp/spark-9f42b74d-0290-4162-acc7-fc8ae3aeb5b4 | |
| 25/09/10 10:02:46 INFO ShutdownHookManager: Deleting directory /tmp/spark-162e62e9-074b-48dd-8df6-95a706b47fe0 |
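Note: every failure above is the same root cause, a java.lang.UnsatisfiedLinkError raised while zstd-jni extracts and loads its native library from /tmp ("failed to map segment from shared object" is the message typically seen when the extraction directory is mounted noexec). Below is a minimal, hedged sketch of one possible workaround: pointing java.io.tmpdir at an exec-capable directory before any zstd-jni class is initialized, assuming zstd-jni extracts its .so via the JVM temp directory as the /tmp path in the log suggests. The class name, scratch path, and local master setting are placeholders, not taken from the original ekg_export job.

import org.apache.spark.sql.SparkSession;

public class EkgExportTmpdirSketch {
    public static void main(String[] args) {
        // Assumption: /tmp is mounted noexec, so redirect JVM temp files
        // (including the extracted libzstd-jni .so) to an exec-capable path.
        // This must run before the first temp file is created and before any
        // zstd-jni class loads. The path below is hypothetical.
        System.setProperty("java.io.tmpdir", "/data/users/nicholsenpm/tmp");

        SparkSession spark = SparkSession.builder()
                .appName("ekg_export")
                .master("local[*]")   // assumption: the log shows a single driver-executor (local mode)
                .getOrCreate();

        // ... the original read / transform / df.write().parquet(...) logic would go here ...

        spark.stop();
    }
}

Equivalently, the same property can be supplied at launch (for example, spark-submit --driver-java-options "-Djava.io.tmpdir=/path/with/exec" ...), or /tmp can be remounted with exec permissions; which option applies depends on the environment and is not confirmed by this log.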