How to solve "readObject can't find class org.apache.hadoop.hive.conf.HiveConf"
Relevant excerpts from my Hadoop configuration (core-site.xml and mapred-site.xml):

<property>
  <name>fs.default.name</name>
  <value>hdfs://nameservice1:8020/</value>
</property>
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://nameservice1:8020/</value>
</property>
<property>
  <name>mapreduce.framework.name</name>
  <value>NONE</value>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>NONE</value>
</property>
My environment:
Spark 0.9.1
Shark 0.9.1 (in yarn-client mode)
Hadoop 2.2.0 (single-node cluster)
The YARN master and slave both run on localhost and work well.

Spark built from source:
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
mvn -Pyarn -Dhadoop.version=2.2.0 -Dyarn.version=2.2.0 -DskipTests clean package
SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true sbt/sbt assembly
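If the build succeeds, it helps to locate the assembly jar before referencing it in the steps below (the search directory is an assumption about the source-tree layout; adjust to your checkout):

# Find the assembly jar produced by the build; adjust the directory to your tree
find assembly/target -name 'spark-assembly*hadoop2.2.0*.jar'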
The Spark example SparkPi works well in both yarn-standalone and yarn-client mode.
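For reference, the yarn-standalone run used a launch along the lines of the Spark 0.9 YARN docs; the example-jar path and resource flags below are assumptions from memory of that doc, so adjust them to your build:

# Sketch of a yarn-standalone SparkPi launch (jar paths and flag values are assumptions)
SPARK_JAR=/absolute/path/here/spark-assembly_2.10-0.9.1-hadoop2.2.0.jar \
  ./bin/spark-class org.apache.spark.deploy.yarn.Client \
    --jar examples/target/scala-2.10/spark-examples-assembly-0.9.1.jar \
    --class org.apache.spark.examples.SparkPi \
    --args yarn-standalone \
    --num-workers 2 \
    --worker-memory 1g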
Shark built from source:
SHARK_HADOOP_VERSION=2.2.0 SHARK_YARN=true sbt/sbt clean package gen-idea
Set in shark-env.sh (HADOOP_HOME and SPARK_HOME must also be exported, pointing at the respective installation directories):
export MASTER="yarn-client"
export SHARK_EXEC_MODE=yarn
export SPARK_ASSEMBLY_JAR=/absolute/path/here/spark-assembly_2.10-0.9.1-hadoop2.2.0.jar
export SHARK_ASSEMBLY_JAR=/absolute/path/here/shark_2.10-0.9.1.jar
Remove all the Hadoop 1.0.4 jars found in the lib_managed folder.
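One way to do that removal (a sketch using GNU find; the filename pattern is an assumption, so list first and delete only once the matches look right):

# Show the stale Hadoop 1.0.4 jars pulled in by the build
find lib_managed -name '*hadoop*1.0.4*.jar' -print
# Delete them once the list looks correct
find lib_managed -name '*hadoop*1.0.4*.jar' -delete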
Then I start Shark as follows:
SPARK_JAR=/absolute/path/here/spark-assembly_2.10-0.9.1-hadoop2.2.0.jar bin/shark

In the Shark CLI, "SHOW TABLES" and "CREATE TABLE" work.
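For context, "src" below is the classic Hive example table; a sketch of how it could have been created (the data-file path is a placeholder):

-- Classic Hive example table; the load path is illustrative
CREATE TABLE src (key INT, value STRING);
LOAD DATA LOCAL INPATH '/path/to/kv1.txt' INTO TABLE src;
SHOW TABLES;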

When I run "select count(1) from src", the task fails with this stack trace:
org.apache.spark.SparkException: Job aborted: Task 1.0:1 failed 4 times (most recent failure: Exception failure: java.lang.RuntimeException: readObject can't find class org.apache.hadoop.hive.conf.HiveConf
When I attach a debugger to SharkCliDriver, the job fails when calling runJob() in SparkContext.

It seems to be a classpath error. If it is, how should I set the classpath to make it work?

xxx

I solved the problem by moving all the jars in Shark's lib_managed into the Hadoop lib directory and adding that directory to yarn.application.classpath in yarn-site.xml. Now it works.
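A sketch of that move, assuming SHARK_HOME and HADOOP_HOME are exported (lib_managed nests jars in subdirectories, hence find rather than a plain cp):

# Copy every jar under Shark's lib_managed into Hadoop's lib directory
find $SHARK_HOME/lib_managed -name '*.jar' -exec cp {} $HADOOP_HOME/lib/ \;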

xxx

<property>
  <name>yarn.application.classpath</name>
  <value>
    $HADOOP_CONF_DIR,
    $HADOOP_COMMON_HOME/share/hadoop/common/*,
    $HADOOP_COMMON_HOME/share/hadoop/common/lib/*,
    $HADOOP_HDFS_HOME/share/hadoop/hdfs/*,
    $HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,
    $HADOOP_YARN_HOME/share/hadoop/yarn/*,
    $HADOOP_YARN_HOME/share/hadoop/yarn/lib/*,
    $SHARK_HIVE/hive-hbase-handler/hive-hbase-handler-0.11.0-shark-0.9.1.jar,
    $SHARK_HIVE/hive-anttasks/hive-anttasks-0.11.0-shark-0.9.1.jar,
    $SHARK_HIVE/hive-service/hive-service-0.11.0-shark-0.9.1.jar,
    $SHARK_HIVE/hive-serde/hive-serde-0.11.0-shark-0.9.1.jar,
    $SHARK_HIVE/hive-metastore/hive-metastore-0.11.0-shark-0.9.1.jar,
    $SHARK_HIVE/hive-hwi/hive-hwi-0.11.0-shark-0.9.1.jar,
    $SHARK_HIVE/hive-exec/hive-exec-0.11.0-shark-0.9.1.jar,
    $SHARK_HIVE/hive-beeline/hive-beeline-0.11.0-shark-0.9.1.jar,
    $SHARK_HIVE/hive-shims/hive-shims-0.11.0-shark-0.9.1.jar,
    $SHARK_HIVE/hive-common/hive-common-0.11.0-shark-0.9.1.jar
  </value>
</property>
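A quick sanity check that every Hive jar referenced above actually exists on disk (assumes $SHARK_HIVE is exported in the current shell):

# The glob mirrors the hive-<module>/hive-<module>-0.11.0-shark-0.9.1.jar entries above
for j in $SHARK_HIVE/hive-*/hive-*-0.11.0-shark-0.9.1.jar; do
  ls -l "$j"
done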