
@bentito
Last active December 10, 2020 21:50
Hadoop + Hive Build/Runtime issues for Metering Operator -- trying to get the tip of the Hadoop 3.3 branch and latest Hive master (Oct 15, 2020) to work, to see if it will fix a bug related to S3 bucket contents deletion
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:1382)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:1363)
    at org.apache.hadoop.mapred.JobConf.setJar(JobConf.java:536)
    at org.apache.hadoop.mapred.JobConf.setJarByClass(JobConf.java:554)
    at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:448)
    at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:4047)
    at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:4010)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:7002)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
stream closed

bentito commented Nov 11, 2020

Trying again:

a23509823c0f0df9cd9f04a33e8050297d922755 (HEAD -> from_rel-3.3.0, origin/from_rel-3.3.0) Update image names/tags in scripts
5531ef14e727b555cb99aa5d523512ada20c80d0 Add OWNERS file
a5c964d8e5cc0f43fba64430e8be08a9dc8b095a Switch to explicitly using COPY instead of whitelist based dockerignore
a5cfb0c1e8a1795641425a8301694e942a474d30 .dockerignore: Ignore everything not explicitly required for building
fb23be1eea06af901c8ae6e6923e12cb3511a01f Dockerfile*: Minor updates for image builder/clarity
ed19db17d057aaa9aeb5fe922f9983c2361bfc7a Dockerfile*: Remove existing networkaddress.cache property options from java.security config file before setting new value
f50cd8d38dcb703c8121ac186e1e5a03ac9216fc Dockerfile*: Pass --setopt=skip_missing_names_on_install=False to yum install
9fa80d32c30d6463af7986960da17e62b7a94427 Remove spaces from java.security networkaddress values
7323ee0c10ad4e5cc5bd4cef5ac165c42686f543 Dockerfile*: Install openjdk-devel for debugging tools
1523ce4cfef0259e93e9892f77cacd6d6c6da555 Dockerfile*: Tell JVM to not cache DNS results
b4abe2e23685ec6255df2580397779c042966501 Update docker-build-rhel.sh to tag image with short name also
785feb09695ea47384a35ee631112f5bc4dd8752 Create hadoop user in image
9252a92231d07492ea8a9dbaa4f44f7acccacb09 Fix syntax in rhel dockerfile
c5e9f9562b0aa5f9633a7d8f1ddfd773a5848b27 Update rhel Dockerfile to use rhel instead of centos and add build scripts
75fb6dc964f963bbf53f9ba0933e779f4fd60847 Rename Dockerfile.rhel7 to Dockerfile.rhel
*** 8b43fa7038a2733e3c9b7774b6a7dca3fae25544 Add rhel7 dockerfile
64263d4976899c51e390852180694a207040c2c8 make dockerignore recursively ignore target/ dirs
c289d42333906addca3e59076dc5612943a15dfb hadoop-dist: Set artifact classifier to bin
9cb8450d5e44746d4888a5e37b53faf8d5df0f2f hadoop-dist: Don't skip deploy for hadoop-dist project
*** 0242adc8c5f35aff7baec4187e7178293334e01a Exclude org.jsr-305:ri frmo shading in hadoop-client-runtime
   ^^^ broken ^^^ 
* e4476f6e108907a76c2fbbaa39544d1f75e994ce Add exclusion at top level hadoop-project on jersey-guice targeting javax.inject:javax.inject
* c75a270372026390b797389de57ea94b793e704d Dockerfile: Skip using tar and copy distribution directory directly
* b01be28d2e591506933d22b454e3503f62b3e84d Add Dockerfile
* 1772b677796c19d4a76b3f84124fdeba029969f3 Fix invalid xml caused by xlint compiler argument in pom.xml
* aa96f1871bfd858f9bac59cf2a81ec470da649af Updated the index as per 3.3.0 release
b064f09bd687cdecbbfc8af5db487d834182049f Preparing for 3.3.0 Release
a883752df1ca5b8f2523e77179033bcb889ab80c HDFS-15421. IBR leak causes standby NN to be stuck in safe mode.


bentito commented Nov 11, 2020

This patch allows 0242adc8c5f35aff7baec4187e7178293334e01a to build:

diff --git a/hadoop-client-modules/hadoop-client-runtime/pom.xml b/hadoop-client-modules/hadoop-client-runtime/pom.xml
index f8dda3db20..b9eb68feda 100644
--- a/hadoop-client-modules/hadoop-client-runtime/pom.xml
+++ b/hadoop-client-modules/hadoop-client-runtime/pom.xml
@@ -13,16 +13,16 @@
  limitations under the License. See accompanying LICENSE file.
 -->
 <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
-         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
   <modelVersion>4.0.0</modelVersion>
 <parent>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-project</artifactId>
-   <version>3.1.1</version>
+   <version>3.3.0</version>
    <relativePath>../../hadoop-project</relativePath>
 </parent>
   <artifactId>hadoop-client-runtime</artifactId>
-  <version>3.1.1</version>
+  <version>3.3.0</version>
   <packaging>jar</packaging>
 
   <description>Apache Hadoop Client</description>
@@ -157,7 +157,12 @@
                       <!-- Leave javax APIs that are stable -->
                       <!-- the jdk ships part of the javax.annotation namespace, so if we want to relocate this we'll have to care it out by class :( -->
                       <exclude>com.google.code.findbugs:jsr305</exclude>
-                      <exclude>org.jsr-305:ri</exclude>
+                      <exclude>io.dropwizard.metrics:metrics-core</exclude>
+                      <exclude>org.eclipse.jetty:jetty-servlet</exclude>
+                      <exclude>org.eclipse.jetty:jetty-security</exclude>
+                      <exclude>org.ow2.asm:*</exclude>
+                      <!-- Leave bouncycastle unshaded because it's signed with a special Oracle certificate so it can be a custom JCE security provider -->
+                      <exclude>org.bouncycastle:*</exclude>
                     </excludes>
                   </artifactSet>
                   <filters>

It also allows the metering hadoop master to build after checking out upstream/rel/release-3.3.0 and cherry-picking all metering-related commits onto the branch.


bentito commented Nov 11, 2020

Modified metering for the newer Hadoop/Hive, meant for the 3.3 branch tip, but hopefully it will also work with the rel/release-3.3.0 tagged version:

bin/deploy-metering install --repo quay.io/btofel/metering-operator --tag metering-operator-hadoop33-mods


bentito commented Nov 11, 2020

TO DO:

podman build -f Dockerfile -t quay.io/btofel/metering-hive:3.1.2-hadoop-rel-3.3.0 .


bentito commented Nov 12, 2020

The above was resolved by the pom changes around shading exclusions.


bentito commented Nov 12, 2020

With a Hadoop 3.3.0 + Hive 3.1.2 deploy, we have a startup problem:

20/11/12 19:03:37 [main]: INFO conf.MetastoreConf: Found configuration file file:/opt/hive/conf/hive-site.xml
20/11/12 19:03:37 [main]: INFO conf.MetastoreConf: Unable to find config file hivemetastore-site.xml
20/11/12 19:03:37 [main]: INFO conf.MetastoreConf: Found configuration file null
20/11/12 19:03:37 [main]: INFO conf.MetastoreConf: Unable to find config file metastore-site.xml
20/11/12 19:03:37 [main]: INFO conf.MetastoreConf: Found configuration file null
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:1380)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:1361)
    at org.apache.hadoop.hive.metastore.conf.MetastoreConf.lambda$newMetastoreConf$1(MetastoreConf.java:1191)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
    at org.apache.hadoop.hive.metastore.conf.MetastoreConf.newMetastoreConf(MetastoreConf.java:1188)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:8770)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:236)


bentito commented Nov 12, 2020

The above problem appears to be caused by multiple guava libraries on the classpath. From the Hive container:

bash-4.4$ find / -name "guava.*jar" -print
/opt/hive/lib/guava-19.0.jar
/opt/hadoop/share/hadoop/common/lib/guava-27.0-jre.jar
/opt/hadoop/share/hadoop/hdfs/lib/guava-27.0-jre.jar
/opt/hadoop/share/hadoop/yarn/csi/lib/guava-20.0.jar


bentito commented Nov 12, 2020

Suggested fix, along these lines (paths shown are from a 3.2.1 example, not this deploy):

$ rm /opt/shared/apache-hive-3.1.2-bin/lib/guava-19.0.jar
$ cp /opt/shared/hadoop-3.2.1/share/hadoop/hdfs/lib/guava-27.0-jre.jar /opt/shared/apache-hive-3.1.2-bin/lib/
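The same alignment step can be sketched end to end against a mock layout (the `$tmp` paths below are stand-ins for the real `/opt/hive/lib` and `/opt/hadoop/share/hadoop/*/lib` inside the containers): remove Hive's older guava, reuse Hadoop's, then confirm a single guava version remains.

```shell
# Mock the layout: Hive ships guava-19.0, Hadoop ships guava-27.0-jre.
tmp=$(mktemp -d)
mkdir -p "$tmp/hive/lib" "$tmp/hadoop/lib"
touch "$tmp/hive/lib/guava-19.0.jar" "$tmp/hadoop/lib/guava-27.0-jre.jar"

# The suggested fix: drop Hive's guava and copy in Hadoop's.
rm "$tmp/hive/lib"/guava-*.jar
cp "$tmp/hadoop/lib"/guava-*.jar "$tmp/hive/lib/"

# Sanity check: only one guava version should remain anywhere.
find "$tmp" -name 'guava-*.jar' -exec basename {} \; | sort -u
```

Whichever direction the copy goes, the point is that every JVM in the deploy must load exactly one guava version that provides the `checkArgument` overload from the stack trace.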


bentito commented Nov 17, 2020

Having a problem where the Hadoop version created by checking out tag rel/release-3.3.0 and cherry-picking all "our" commits produces a fully working Hadoop, and it tests okay with Hive 3.1.2 (verified through a Metering deploy and reportdatasources data being added).

But... the PR is tagged needs-rebase, likely because these are not the same:

hadoop from_rel-3.3.0 $ git show-ref --heads -s origin master
6ace76f403981964e7b6714530f5e01948c10b09
hadoop from_rel-3.3.0 $ git merge-base origin/master from_rel-3.3.0
49c747ab187d0650143205ba57ca19607ec4c6bd
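For context, a generic throwaway-repo illustration (not the metering repo) of why those two hashes differing means needs-rebase: the merge-base of the PR branch and master is no longer master's tip.

```shell
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git config user.email dev@example.com && git config user.name dev
git commit -q --allow-empty -m base
main=$(git symbolic-ref --short HEAD)     # default branch name varies
git checkout -qb feature
git commit -q --allow-empty -m feature-work
git checkout -q "$main"
git commit -q --allow-empty -m master-moves-on
# The merge-base is the old base commit, not master's tip -> needs rebase.
mb=$(git merge-base "$main" feature)
tip=$(git rev-parse "$main")
[ "$mb" != "$tip" ] && echo needs-rebase
```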


bentito commented Nov 18, 2020

Trying to rebase instead onto 3.2.1, but we run into:

Caused by: org.apache.maven.plugin.MojoExecutionException: org.apache.maven.plugin.MojoExecutionException: protoc version is 'libprotoc 3.7.1', expected version is '2.5.0'

and the whole point of this move is getting to a newer protobuf.


bentito commented Nov 18, 2020

But this rebase onto 3.2.1 still fails with the error I was also seeing on the bigger jump to 3.3.0:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hadoop-common: Compilation failure: Compilation failure:
[ERROR] /build/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/KMSClientProvider.java:[162,38] cannot find symbol
[ERROR] symbol:   class AbstractDelegationTokenSelector
[ERROR] location: class org.apache.hadoop.crypto.key.kms.KMSClientProvider
[ERROR] /build/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/KMSClientProvider.java:[1010,35] cannot find symbol
[ERROR] symbol:   method selectToken(org.apache.hadoop.io.Text,java.util.Collection<org.apache.hadoop.security.token.Token<? extends org.apache.hadoop.security.token.TokenIdentifier>
>)
[ERROR] location: variable INSTANCE of type org.apache.hadoop.crypto.key.kms.KMSClientProvider.TokenSelector


bentito commented Nov 18, 2020

The above is possibly related to this commit:

752092a8608 Revert "HADOOP-14445. Delegation tokens are not shared between KMS instances. Contributed by Xiao Chen and Rushabh S Shah."


bentito commented Dec 1, 2020

A New Beginning....
Successfully used roughly the kube/master Dockerfile and rebased from 3.1.1 to 3.1.4, just to see how that would go and to align more closely with what has to be the next stop for this change: 3.2.0.

https://github.com/bentito/hadoop/blob/yarb_rel_3.1.4/Dockerfile


bentito commented Dec 2, 2020

The rebase to the 3.1.4/3.2.0 common ancestor compiled successfully:

git pull --rebase upstream  a39296260f8c77f3808e27b43c623a0edebe4a17

This branch is: bentito/yarb_rel_3.1.4-3.2.0


bentito commented Dec 2, 2020

After the now-shorter rebase from the branch above to upstream/branch_3.2.0, seeing this error on build:

[ERROR] /build/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java:[529,54] cannot find symbol


bentito commented Dec 9, 2020

Built 3.3.0 with "our" Dockerfile, which bumps:
cmake -> 3.19.1;
mvn -> 3.3.9;
protobuf -> 3.7.1;
this branch is: hadoop-3.3.0-from-3.2.2


bentito commented Dec 10, 2020

So we still face a needs-rebase tag. Going with

git cherry-pick 2b9a8c1d3a2^..6ace76f4039

onto hadoop-3.3.0-from-3.2.2 to pick up everything of "ours" from kube-reporting/hadoop/master.


bentito commented Dec 10, 2020

Seeing this as before on this path:

Caused by: org.apache.hadoop.hive.metastore.api.MetaException: User hadoop is not allowed to perform this API call
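For what it's worth, this error typically comes from the Hive 3.x metastore's DB-notification API authorization check, which only permits configured proxy/superusers. One workaround sometimes used (an assumption here, not verified against this deploy; check it fits the deployment's security posture) is to disable that check in hive-site.xml:

```xml
<!-- Workaround sketch (assumption): disable metastore DB-notification API
     auth so a non-superuser such as "hadoop" can make these calls. -->
<property>
  <name>hive.metastore.event.db.notification.api.auth</name>
  <value>false</value>
</property>
```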
