spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer $HUDI_UTILITIES_BUNDLE \
  --table-type COPY_ON_WRITE \
  --source-class org.apache.hudi.utilities.sources.JsonKafkaSource \
  --source-ordering-field ts \
  --target-base-path /user/hive/warehouse/stock_ticks_cow \
  --target-table stock_ticks_cow \
  --transformer-class org.apache.hudi.utilities.transform.DeleteTransformer \
  --props /var/demo/config/kafka-source.properties \
  --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider
20/07/16 13:05:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/07/16 13:05:56 WARN deltastreamer.SchedulerConfGenerator: Job Scheduling Configs will not be in effect as spark.scheduler.mode is not set to FAIR at instantiation time. Continuing without scheduling configs
20/07/16 13:05:57 INFO spark.SparkContext: Running Spark version 2.4.4
20/07/16 13:05:57 INFO spark.SparkContext: Submitted application: delta-streamer-stock_ticks_cow
20/07/16 13:05:57 INFO spark.SecurityManager: Changing view acls to: root
20/07/16 13:05:57 INFO spark.SecurityManager: Changing modify acls to: root
20/07/16 13:05:57 INFO spark.SecurityManager: Changing view acls groups to:
20/07/16 13:05:57 INFO spark.SecurityManager: Changing modify acls groups to:
20/07/16 13:05:57 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
20/07/16 13:05:57 INFO Configuration.deprecation: mapred.output.compression.codec is deprecated. Instead, use mapreduce.output.fileoutputformat.compress.codec
20/07/16 13:05:57 INFO Configuration.deprecation: mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
20/07/16 13:05:57 INFO Configuration.deprecation: mapred.output.compression.type is deprecated. Instead, use mapreduce.output.fileoutputformat.compress.type
20/07/16 13:05:57 INFO util.Utils: Successfully started service 'sparkDriver' on port 41431.
20/07/16 13:05:57 INFO spark.SparkEnv: Registering MapOutputTracker
20/07/16 13:05:57 INFO spark.SparkEnv: Registering BlockManagerMaster
20/07/16 13:05:57 INFO storage.BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
20/07/16 13:05:57 INFO storage.BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
20/07/16 13:05:57 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-21cef692-afeb-45e9-9aa1-b8d08f366253
20/07/16 13:05:57 INFO memory.MemoryStore: MemoryStore started with capacity 366.3 MB
20/07/16 13:05:57 INFO spark.SparkEnv: Registering OutputCommitCoordinator
20/07/16 13:05:57 INFO util.log: Logging initialized @3195ms
20/07/16 13:05:58 INFO server.Server: jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
20/07/16 13:05:58 INFO server.Server: Started @3356ms
20/07/16 13:05:58 INFO server.AbstractConnector: Started ServerConnector@17ae98d7{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
20/07/16 13:05:58 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6979efad{/jobs,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4c432866{/jobs/json,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@12365c88{/jobs/job,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2237bada{/jobs/job/json,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@77e2a6e2{/stages,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5710768a{/stages/json,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@199e4c2b{/stages/stage,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@30272916{/stages/stage/json,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5bb3d42d{/stages/pool,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5bf61e67{/stages/pool/json,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2c1dc8e{/storage,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@b273a59{/storage/json,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4e7095ac{/storage/rdd,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@251ebf23{/storage/rdd/json,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@29b732a2{/environment,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1b70203f{/environment/json,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@51671b08{/executors,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@15051a0{/executors/json,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1162410a{/executors/threadDump,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@b09fac1{/executors/threadDump/json,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@62df0ff3{/static,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@22175d4f{/,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@9fecdf1{/api,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5c84624f{/jobs/job/kill,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@63034ed1{/stages/stage/kill,null,AVAILABLE,@Spark}
20/07/16 13:05:58 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://adhoc-2:4040
20/07/16 13:05:58 INFO spark.SparkContext: Added JAR file:/var/hoodie/ws/docker/hoodie/hadoop/hive_base/target/hoodie-utilities.jar at spark://adhoc-2:41431/jars/hoodie-utilities.jar with timestamp 1594904758298
20/07/16 13:05:58 INFO executor.Executor: Starting executor ID driver on host localhost
20/07/16 13:05:58 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 42749.
20/07/16 13:05:58 INFO netty.NettyBlockTransferService: Server created on adhoc-2:42749
20/07/16 13:05:58 INFO storage.BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
20/07/16 13:05:58 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, adhoc-2, 42749, None)
20/07/16 13:05:58 INFO storage.BlockManagerMasterEndpoint: Registering block manager adhoc-2:42749 with 366.3 MB RAM, BlockManagerId(driver, adhoc-2, 42749, None)
20/07/16 13:05:58 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, adhoc-2, 42749, None)
20/07/16 13:05:58 INFO storage.BlockManager: Initialized BlockManager: BlockManagerId(driver, adhoc-2, 42749, None)
20/07/16 13:05:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@78c1a023{/metrics/json,null,AVAILABLE,@Spark}
20/07/16 13:05:59 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://namenode:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1811196397_1, ugi=root (auth:SIMPLE)]]]
20/07/16 13:05:59 WARN spark.SparkContext: Using an existing SparkContext; some configuration may not take effect.
20/07/16 13:06:00 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hive/warehouse/stock_ticks_cow
20/07/16 13:06:00 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://namenode:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1811196397_1, ugi=root (auth:SIMPLE)]]]
20/07/16 13:06:00 INFO table.HoodieTableConfig: Loading table properties from /user/hive/warehouse/stock_ticks_cow/.hoodie/hoodie.properties
20/07/16 13:06:00 INFO table.HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /user/hive/warehouse/stock_ticks_cow
20/07/16 13:06:00 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://namenode:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1811196397_1, ugi=root (auth:SIMPLE)]]]
20/07/16 13:06:00 INFO deltastreamer.HoodieDeltaStreamer: Creating delta streamer with configs : {hoodie.datasource.write.partitionpath.field=date, hoodie.embed.timeline.server=true, auto.offset.reset=earliest, bootstrap.servers=kafkabroker:9092, hoodie.filesystem.view.type=EMBEDDED_KV_STORE, hoodie.bulkinsert.shuffle.parallelism=2, hoodie.upsert.shuffle.parallelism=2, hoodie.deltastreamer.schemaprovider.target.schema.file=/var/demo/config/schema.avsc, hoodie.datasource.write.recordkey.field=key, hoodie.deltastreamer.schemaprovider.source.schema.file=/var/demo/config/schema.avsc, hoodie.insert.shuffle.parallelism=2, hoodie.compact.inline=false, hoodie.deltastreamer.source.kafka.topic=stock_ticks}
20/07/16 13:06:00 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://namenode:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1811196397_1, ugi=root (auth:SIMPLE)]]]
20/07/16 13:06:00 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hive/warehouse/stock_ticks_cow
20/07/16 13:06:00 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://namenode:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1811196397_1, ugi=root (auth:SIMPLE)]]]
20/07/16 13:06:00 INFO table.HoodieTableConfig: Loading table properties from /user/hive/warehouse/stock_ticks_cow/.hoodie/hoodie.properties
20/07/16 13:06:00 INFO table.HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /user/hive/warehouse/stock_ticks_cow
20/07/16 13:06:00 INFO timeline.HoodieActiveTimeline: Loaded instants [[20200716130254__commit__COMPLETED]]
20/07/16 13:06:00 INFO deltastreamer.DeltaSync: Registering Schema :[{"type":"record","name":"stock_ticks","fields":[{"name":"volume","type":"long"},{"name":"ts","type":"string"},{"name":"symbol","type":"string"},{"name":"year","type":"int"},{"name":"month","type":"string"},{"name":"high","type":"double"},{"name":"low","type":"double"},{"name":"key","type":"string"},{"name":"date","type":"string"},{"name":"close","type":"double"},{"name":"open","type":"double"},{"name":"day","type":"string"},{"name":"_hoodie_is_deleted","type":"boolean","default":false}]}, {"type":"record","name":"stock_ticks","fields":[{"name":"volume","type":"long"},{"name":"ts","type":"string"},{"name":"symbol","type":"string"},{"name":"year","type":"int"},{"name":"month","type":"string"},{"name":"high","type":"double"},{"name":"low","type":"double"},{"name":"key","type":"string"},{"name":"date","type":"string"},{"name":"close","type":"double"},{"name":"open","type":"double"},{"name":"day","type":"string"},{"name":"_hoodie_is_deleted","type":"boolean","default":false}]}]
20/07/16 13:06:00 INFO deltastreamer.HoodieDeltaStreamer: Delta Streamer running only single round
20/07/16 13:06:00 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hive/warehouse/stock_ticks_cow
20/07/16 13:06:00 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://namenode:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1811196397_1, ugi=root (auth:SIMPLE)]]]
20/07/16 13:06:00 INFO table.HoodieTableConfig: Loading table properties from /user/hive/warehouse/stock_ticks_cow/.hoodie/hoodie.properties
20/07/16 13:06:00 INFO table.HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /user/hive/warehouse/stock_ticks_cow
20/07/16 13:06:00 INFO timeline.HoodieActiveTimeline: Loaded instants [[20200716130254__commit__COMPLETED]]
20/07/16 13:06:01 INFO deltastreamer.DeltaSync: Checkpoint to resume from : Option{val=stock_ticks,0:3482}
20/07/16 13:06:01 INFO consumer.ConsumerConfig: ConsumerConfig values:
auto.commit.interval.ms = 5000
auto.offset.reset = earliest
bootstrap.servers = [kafkabroker:9092]
check.crcs = true
client.id =
connections.max.idle.ms = 540000
default.api.timeout.ms = 60000
enable.auto.commit = true
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id =
heartbeat.interval.ms = 3000
interceptor.classes = []
internal.leave.group.on.close = true
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 300000
max.poll.records = 500
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
session.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = https
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
20/07/16 13:06:01 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.schemaprovider.source.schema.file' was supplied but isn't a known config.
20/07/16 13:06:01 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.schemaprovider.target.schema.file' was supplied but isn't a known config.
20/07/16 13:06:01 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.partitionpath.field' was supplied but isn't a known config.
20/07/16 13:06:01 WARN consumer.ConsumerConfig: The configuration 'hoodie.compact.inline' was supplied but isn't a known config.
20/07/16 13:06:01 WARN consumer.ConsumerConfig: The configuration 'hoodie.upsert.shuffle.parallelism' was supplied but isn't a known config.
20/07/16 13:06:01 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.recordkey.field' was supplied but isn't a known config.
20/07/16 13:06:01 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.source.kafka.topic' was supplied but isn't a known config.
20/07/16 13:06:01 WARN consumer.ConsumerConfig: The configuration 'hoodie.insert.shuffle.parallelism' was supplied but isn't a known config.
20/07/16 13:06:01 WARN consumer.ConsumerConfig: The configuration 'hoodie.embed.timeline.server' was supplied but isn't a known config.
20/07/16 13:06:01 WARN consumer.ConsumerConfig: The configuration 'hoodie.bulkinsert.shuffle.parallelism' was supplied but isn't a known config.
20/07/16 13:06:01 WARN consumer.ConsumerConfig: The configuration 'hoodie.filesystem.view.type' was supplied but isn't a known config.
20/07/16 13:06:01 INFO utils.AppInfoParser: Kafka version : 2.0.0
20/07/16 13:06:01 INFO utils.AppInfoParser: Kafka commitId : 3402a8361b734732
20/07/16 13:06:01 INFO clients.Metadata: Cluster ID: OrXWy2hESqa9Y-DgEbibaw
20/07/16 13:06:01 INFO helpers.KafkaOffsetGen: SourceLimit not configured, set numEvents to default value : 5000000
20/07/16 13:06:01 INFO sources.JsonKafkaSource: About to read 1668 from Kafka for topic :stock_ticks
20/07/16 13:06:01 WARN kafka010.KafkaUtils: overriding enable.auto.commit to false for executor
20/07/16 13:06:01 WARN kafka010.KafkaUtils: overriding auto.offset.reset to none for executor
20/07/16 13:06:01 ERROR kafka010.KafkaUtils: group.id is null, you should probably set it
20/07/16 13:06:01 WARN kafka010.KafkaUtils: overriding executor group.id to spark-executor-null
20/07/16 13:06:01 WARN kafka010.KafkaUtils: overriding receive.buffer.bytes to 65536 see KAFKA-3135
20/07/16 13:06:02 INFO deltastreamer.HoodieDeltaStreamer: Shut down delta streamer
20/07/16 13:06:02 INFO server.AbstractConnector: Stopped Spark@17ae98d7{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
20/07/16 13:06:02 INFO ui.SparkUI: Stopped Spark web UI at http://adhoc-2:4040
20/07/16 13:06:02 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
20/07/16 13:06:02 INFO memory.MemoryStore: MemoryStore cleared
20/07/16 13:06:02 INFO storage.BlockManager: BlockManager stopped
20/07/16 13:06:02 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
20/07/16 13:06:02 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
20/07/16 13:06:02 INFO spark.SparkContext: Successfully stopped SparkContext
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/avro/SchemaConverters$
at org.apache.hudi.AvroConversionUtils$.convertAvroSchemaToStructType(AvroConversionUtils.scala:79)
at org.apache.hudi.AvroConversionUtils.convertAvroSchemaToStructType(AvroConversionUtils.scala)
at org.apache.hudi.utilities.deltastreamer.SourceFormatAdapter.fetchNewDataInRowFormat(SourceFormatAdapter.java:109)
at org.apache.hudi.utilities.deltastreamer.DeltaSync.readFromSource(DeltaSync.java:292)
at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:223)
at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:133)
at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:321)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.avro.SchemaConverters$
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 19 more
20/07/16 13:06:02 INFO util.ShutdownHookManager: Shutdown hook called
20/07/16 13:06:02 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-16dcbcff-612c-451d-8262-51dfbee358f2
20/07/16 13:06:02 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-7486119b-ac0d-4308-b739-aa3614d43cfa
root@adhoc-2:/opt# exit
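
Note: the run above fails with java.lang.NoClassDefFoundError: org.apache.spark.sql.avro.SchemaConverters$, which typically means the external spark-avro module was not available on the classpath when the DeltaStreamer tried to convert the Avro schema to a Spark StructType. A minimal sketch of one way to supply it, assuming Spark 2.4.4 with Scala 2.11 as reported in the log; all other options are unchanged from the original command:

spark-submit \
  --packages org.apache.spark:spark-avro_2.11:2.4.4 \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer $HUDI_UTILITIES_BUNDLE \
  --table-type COPY_ON_WRITE \
  --source-class org.apache.hudi.utilities.sources.JsonKafkaSource \
  --source-ordering-field ts \
  --target-base-path /user/hive/warehouse/stock_ticks_cow \
  --target-table stock_ticks_cow \
  --transformer-class org.apache.hudi.utilities.transform.DeleteTransformer \
  --props /var/demo/config/kafka-source.properties \
  --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider

If the environment has no network access for --packages, passing a locally available spark-avro jar via --jars is an equivalent alternative.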