Reported through private contact ([email protected])
On CDH 5.12.0, Oozie Hive tasks running on the MapReduce engine can fail intermittently when the data contains array field types. The stack trace of a failing task is below.
I believe the Oozie Hive shared lib provides a non-deterministic classpath. joda-time-2-1.jar exists explicitly, but hive-exec.jar is a fat jar that also includes joda-time, without relocating it to a different package. I believe the bundled copy is version 1.6, pulled in through the hive-common transitive dependency. Depending on which jar is loaded first, either version of the joda-time classes may be used.
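To check for the duplicate classes, here is a minimal Python sketch. It assumes the shared lib jars have been copied to a local directory. The function name `jars_containing_package` and the default package prefix are illustrative, not part of Oozie or Hive. It reports every jar that bundles class files under the `org/joda/time/` package path; more than one hit would be consistent with the conflict described above.

```python
import zipfile
from pathlib import Path


def jars_containing_package(jar_paths, package_prefix="org/joda/time/"):
    """Return the file names of jars that bundle .class files under package_prefix.

    jar_paths: iterable of paths to local jar files (jars are zip archives).
    package_prefix: package path inside the jar, using '/' separators.
    """
    matches = []
    for jar_path in jar_paths:
        with zipfile.ZipFile(jar_path) as jar:
            # A jar "contains" the package if any class entry lives under the prefix.
            if any(
                name.startswith(package_prefix) and name.endswith(".class")
                for name in jar.namelist()
            ):
                matches.append(Path(jar_path).name)
    return matches
```

For example, `jars_containing_package(sorted(Path("sharelib_copy").glob("*.jar")))` would list the offending jars, where `sharelib_copy` is a hypothetical local directory holding the downloaded shared lib.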
Our workaround has been to duplicate the standard Oozie shared lib installed by Cloudera Manager and remove joda-time-2-1.jar from the copy. I am unsure whether this will affect execution on Spark.