Sivabalan Narayanan nsivabalan

21/11/13 17:19:43 ERROR UpsertPartitioner: Error trying to compute average bytes/record
org.apache.hudi.exception.HoodieIOException: Could not read commit details from /tmp/hudi-deltastreamer-ny-mw/.hoodie/20211113171516287.commit
    at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.readDataFromPath(HoodieActiveTimeline.java:669)
    at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.getInstantDetails(HoodieActiveTimeline.java:285)
    at org.apache.hudi.common.table.timeline.HoodieDefaultTimeline.getInstantDetails(HoodieDefaultTimeline.java:339)
    at org.apache.hudi.table.action.commit.UpsertPartitioner.averageBytesPerRecord(UpsertPartitioner.java:348)
    at org.apache.hudi.table.action.commit.UpsertPartitioner.assignInserts(UpsertPartitioner.java:161)
    at org.apache.hudi.table.action.commit.UpsertPartitioner.<init>(UpsertPartitioner.java:102)
    at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.getUpsertPartitioner(BaseSparkCommitActionExecutor.java:370)
    at org.apache.hudi.tabl
21/11/13 09:12:05 INFO AppInfoParser: Kafka commitId : 3402a8361b734732
21/11/13 09:12:05 INFO Metadata: Cluster ID: pzQYsU3bQX6hDUJjSJB08A
21/11/13 09:12:05 INFO KafkaOffsetGen: SourceLimit not configured, set numEvents to default value : 5000000
21/11/13 09:12:05 INFO AvroKafkaSource: About to read 20802 from Kafka for topic :impressions
21/11/13 09:12:05 INFO HoodieDeltaStreamer: Delta Sync shutdown. Error ?false
21/11/13 09:12:05 ERROR HoodieAsyncService: Monitor noticed one or more threads failed. Requesting graceful shutdown of other threads
java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: scala/Product$class
    at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
    at org.apache.hudi.async.HoodieAsyncService.lambda$monitorThreads$1(HoodieAsyncService.java:158)
/Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/bin/java -ea -Dlog4j.configuration=file:///Users/nsb/Documents/personal/projects/oct21_3/hudi/hudi-client/hudi-spark-client/src/test/resources/log4j-surefire.properties -Didea.test.cyclic.buffer.size=1048576 -javaagent:/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar=60882:/Applications/IntelliJ IDEA CE.app/Contents/bin -Dfile.encoding=UTF-8 -classpath /Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar:/Applications/IntelliJ IDEA CE.app/Contents/plugins/junit/lib/junit5-rt.jar:/Applications/IntelliJ IDEA CE.app/Contents/plugins/junit/lib/junit-rt.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre/lib/charsets.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre/lib/deploy.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre/lib/ext/cldrdata.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre/lib/ext/dnsns.jar:/Library/Java/JavaVirtualMachines/j
scala> df.write.format("hudi").
| option("hoodie.datasource.write.recordkey.field","col_str_0005").
| option("hoodie.datasource.write.keygenerator.class","org.apache.hudi.keygen.NonpartitionedKeyGenerator").
| option("hoodie.datasource.write.operation","bulk_insert").
| option("hoodie.table.name","hudi_binsert_base").
| mode(Overwrite).
| save(basePath)
21/11/10 21:59:31 WARN util.package: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
21/11/10 22:11:41 ERROR client.TransportClient: Failed to send RPC RPC 6262720062168238709 to /172.31.35.73:50926: java.nio.channels.ClosedChannelException
scala> df.write.format("hudi").
| option("hoodie.datasource.write.recordkey.field","col_str_0005").
| option("hoodie.datasource.write.keygenerator.class","org.apache.hudi.keygen.NonpartitionedKeyGenerator").
| option("hoodie.datasource.write.operation","bulk_insert").
| option("hoodie.table.name","hudi_binsert_base").
| mode(Overwrite).
| save(hudi_base)
21/11/10 22:47:34 WARN scheduler.TaskSetManager: Lost task 1504.0 in stage 12.0 (TID 4596, ip-172-31-35-206.us-east-2.compute.internal, executor 11): java.io.IOException: No space left on device
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60)
docker exec -it adhoc-1 /bin/bash
$SPARK_INSTALL/bin/spark-shell \
  --jars $HUDI_SPARK_BUNDLE \
  --master local[2] \
  --driver-class-path $HADOOP_CONF_DIR \
  --conf spark.sql.hive.convertMetastoreParquet=false \
  --deploy-mode client \
  --driver-memory 1G
import org.apache.spark.sql.SaveMode._
val df = Seq(
  (1, "key1", "abc"),
  (1, "key1", "def"),
  (2, "key2", "ghi"),
  (2, "key3", "jkl")
).toDF("typeId", "recordKey", "str")
org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve files in partition from metadata
    at org.apache.hudi.metadata.BaseTableMetadata.getAllFilesInPartitions(BaseTableMetadata.java:148)
    at org.apache.hudi.client.functional.TestHoodieBackedMetadata.validateMetadata(TestHoodieBackedMetadata.java:1155)
    at org.apache.hudi.client.functional.TestHoodieBackedMetadata.testCleaningArchivingAndCompaction(TestHoodieBackedMetadata.java:871)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:688)
nsivabalan / gist:f261c6d3e0e3a4178533e922272e162e
Created September 25, 2021 19:46
TestHoodieClientMultiWriter.testMultiWriterWithAsyncTableServicesWithConflictCOW
2021-09-25T18:58:59.3537396Z 259585 [main] WARN org.apache.hudi.client.transaction.lock.LockManager - Acquiring lock succeeded for 004
2021-09-25T18:58:59.3587949Z 259590 [main] WARN org.apache.hudi.metadata.HoodieTableMetadataUtil - MDT. applying commit 004
2021-09-25T18:58:59.3589640Z 259590 [main] WARN org.apache.hudi.metadata.HoodieTableMetadataUtil - for partition 2016/03/15
2021-09-25T18:58:59.3598936Z 259590 [main] WARN org.apache.hudi.metadata.HoodieTableMetadataUtil - new file .8f467155-779b-4a45-9b2a-bd4313afbbbb-1_001.log.2_0-73-97, size 14064
2021-09-25T18:58:59.3600180Z 259590 [main] WARN org.apache.hudi.metadata.HoodieTableMetadataUtil - for partition 2015/03/16
2021-09-25T18:58:59.3601164Z 259590 [main] WARN org.apache.hudi.metadata.HoodieTableMetadataUtil - new file .73acfaf0-1231-4db5-a47f-253e93567e06-0_001.log.2_1-73-98, size 11117
2021-09-25T18:58:59.3602057Z 259590 [main] WARN org.apache.hudi.metadata.HoodieTableMetadataUtil - for partition 2015/03/17
2021-09-25
nsivabalan / gist:29f3b094438e5f52af2c5e8c7608c701
Created September 25, 2021 19:38
TestHoodieDeltaStreamer.testAsyncClusteringService
2021-09-25T18:57:34.4371472Z 231792 [pool-1312-thread-1] WARN org.apache.hudi.client.AbstractHoodieWriteClient - ABWC. committing 20210925185728
2021-09-25T18:57:34.4372950Z 231792 [pool-1312-thread-1] WARN org.apache.hudi.client.AbstractHoodieWriteClient - for partition default
2021-09-25T18:57:34.4376569Z 231793 [pool-1312-thread-1] WARN org.apache.hudi.client.AbstractHoodieWriteClient - file info ff519e52-4b57-4b6f-a211-08fd3bc23451-0, path default/ff519e52-4b57-4b6f-a211-08fd3bc23451-0_0-72-110_20210925185728.parquet, total bytes written 671735
2021-09-25T18:57:35.3616803Z 232716 [dispatcher-event-loop-1] WARN org.apache.spark.scheduler.TaskSetManager - Stage 109 contains a task of very large size (200 KB). The maximum recommended task size is 100 KB.
2021-09-25T18:57:35.5429489Z 232898 [dispatcher-event-loop-0] WARN org.apache.spark.scheduler.TaskSetManager - Stage 110 contains a task of very large size (200 KB). The maximum recommended task size is 100 KB.
2021-09-25T18:57:35.7284484Z