@bentito
Last active December 10, 2020 21:50
Hadoop + Hive build/runtime issues for Metering Operator -- trying to get the tip of the Hadoop 3.3 branch and latest Hive master (Oct 15, 2020) to work, to see if it will fix a bug related to S3 bucket contents deletion
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:1382)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:1363)
    at org.apache.hadoop.mapred.JobConf.setJar(JobConf.java:536)
    at org.apache.hadoop.mapred.JobConf.setJarByClass(JobConf.java:554)
    at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:448)
    at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:4047)
    at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:4010)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:7002)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
stream closed

bentito commented Nov 11, 2020

TO DO:

podman build -f Dockerfile -t quay.io/btofel/metering-hive:3.1.2-hadoop-rel-3.3.0 .

bentito commented Nov 12, 2020

The above was resolved by changing the exclusions in the pom.

bentito commented Nov 12, 2020

With the Hadoop 3.3.0 + Hive 3.1.2 deploy, we have a startup problem:

20/11/12 19:03:37 [main]: INFO conf.MetastoreConf: Found configuration file file:/opt/hive/conf/hive-site.xml
20/11/12 19:03:37 [main]: INFO conf.MetastoreConf: Unable to find config file hivemetastore-site.xml
20/11/12 19:03:37 [main]: INFO conf.MetastoreConf: Found configuration file null
20/11/12 19:03:37 [main]: INFO conf.MetastoreConf: Unable to find config file metastore-site.xml
20/11/12 19:03:37 [main]: INFO conf.MetastoreConf: Found configuration file null
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/l
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:1380)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:1361)
    at org.apache.hadoop.hive.metastore.conf.MetastoreConf.lambda$newMetastoreConf$1(MetastoreConf.java:1191)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
    at org.apache.hadoop.hive.metastore.conf.MetastoreConf.newMetastoreConf(MetastoreConf.java:1188)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:8770)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:236)

bentito commented Nov 12, 2020

The above problem seems to be caused by multiple Guava library versions on the classpath. From the Hive container:

bash-4.4$ find / -name "guava*.jar" -print
/opt/hive/lib/guava-19.0.jar
/opt/hadoop/share/hadoop/common/lib/guava-27.0-jre.jar
/opt/hadoop/share/hadoop/hdfs/lib/guava-27.0-jre.jar
/opt/hadoop/share/hadoop/yarn/csi/lib/guava-20.0.jar
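
The listing above can be turned into a quick check. This is a minimal sketch, not something from the image itself: `list_guava_versions` is a made-up helper name, and the directories in the usage comment are simply the ones that show up in the listing. It prints each distinct Guava version found under the directories it is given.

```shell
# Sketch: print each distinct Guava version found under the given directories.
# list_guava_versions is a hypothetical helper, not part of Hive or Hadoop.
list_guava_versions() {
  find "$@" -type f -name 'guava-*.jar' 2>/dev/null \
    | sed -e 's|.*/guava-||' -e 's|\.jar$||' \
    | sort -u
}

# Usage (paths taken from the listing above):
# list_guava_versions /opt/hive/lib /opt/hadoop/share/hadoop
```

More than one line of output means more than one Guava version is present, which is the situation suspected here.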

bentito commented Nov 12, 2020

Suggested fix, along these lines:

$ rm /opt/shared/apache-hive-3.1.2-bin/lib/guava-19.0.jar
$ cp /opt/shared/hadoop-3.2.1/share/hadoop/hdfs/lib/guava-27.0-jre.jar /opt/shared/apache-hive-3.1.2-bin/lib/
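
The two commands above hard-code one install layout. A parameterized sketch of the same idea follows; `swap_hive_guava` is a hypothetical helper, and both directory arguments are assumptions that have to be adjusted to wherever Hive and Hadoop actually live in the image.

```shell
# Sketch: replace Hive's bundled Guava with the copy Hadoop ships.
# swap_hive_guava is a hypothetical helper; both directories are assumptions.
swap_hive_guava() {
  hive_lib=$1
  hadoop_lib=$2
  # Pick a Guava jar from the Hadoop lib directory (last in sort order if several).
  hadoop_guava=$(find "$hadoop_lib" -maxdepth 1 -type f -name 'guava-*.jar' | sort | tail -n 1)
  if [ -z "$hadoop_guava" ]; then
    echo "no guava jar found in $hadoop_lib" >&2
    return 1
  fi
  # Remove every Guava jar Hive bundles, then copy Hadoop's in.
  find "$hive_lib" -maxdepth 1 -type f -name 'guava-*.jar' -exec rm -f {} +
  cp "$hadoop_guava" "$hive_lib/"
}

# Usage, with the container paths seen earlier in this thread:
# swap_hive_guava /opt/hive/lib /opt/hadoop/share/hadoop/common/lib
```

The function bails out before deleting anything if Hadoop's directory has no Guava jar, so a wrong path does not leave Hive without one.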

bentito commented Nov 17, 2020

The Hadoop build created by checking out tag rel/release-3.3.0 and cherry-picking all of "our" commits is fully working and tests okay with Hive 3.1.2 (through Metering deploy and reportdatasources data being added).

But the PR is labeled needs-rebase, likely because these two commits are not the same:

hadoop from_rel-3.3.0 $ git show-ref --heads -s origin master
6ace76f403981964e7b6714530f5e01948c10b09
hadoop from_rel-3.3.0 $ git merge-base origin/master from_rel-3.3.0
49c747ab187d0650143205ba57ca19607ec4c6bd
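
Comparing the two hashes by hand amounts to asking whether the tip of the base branch is an ancestor of the work branch. A minimal sketch of that check, assuming `git merge-base --is-ancestor` is available (`is_rebased_onto` is a made-up helper name):

```shell
# Sketch: succeed only if the base ref is already contained in the branch,
# i.e. the branch sits on top of the current tip of the base.
# is_rebased_onto is a hypothetical helper name.
is_rebased_onto() {
  base=$1
  branch=$2
  git merge-base --is-ancestor "$base" "$branch"
}

# Usage, with the refs from the transcript above:
# is_rebased_onto origin/master from_rel-3.3.0 || echo "branch is not on top of origin/master"
```

This only says whether the branch is behind the base; whether that is exactly what triggers the needs-rebase label is an assumption.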

bentito commented Nov 18, 2020

Trying to rebase instead onto 3.2.1, but we run into:

Caused by: org.apache.maven.plugin.MojoExecutionException: org.apache.maven.plugin.MojoExecutionException: protoc version is 'libprotoc 3.7.1', expected version is '2.5.0'

and the whole point of this move is to get to a newer protobuf.
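
A mismatch like this can be caught before starting a long Maven build. The sketch below compares the string printed by `protoc --version` with the version the build expects; `check_protoc_version` is a hypothetical helper, not part of the Hadoop build, and it assumes the `libprotoc X.Y.Z` output format seen in the error above.

```shell
# Sketch: compare the output of `protoc --version` with the expected version.
# check_protoc_version is a hypothetical helper, not part of the Hadoop build.
check_protoc_version() {
  expected=$1
  reported=$2                     # e.g. "libprotoc 3.7.1"
  actual=${reported#libprotoc }   # strip the "libprotoc " prefix
  if [ "$actual" != "$expected" ]; then
    echo "protoc version is '$reported', expected version is '$expected'" >&2
    return 1
  fi
}

# Usage, assuming protoc is on PATH:
# check_protoc_version 3.7.1 "$(protoc --version)"
```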

bentito commented Nov 18, 2020

But the rebase onto 3.2.1 still fails with an error I was also seeing on the bigger jump to 3.3.0:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hadoop-common: Compilation failure: Compilation failure:
[ERROR] /build/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/KMSClientProvider.java:[162,38] cannot find symbol
[ERROR] symbol:   class AbstractDelegationTokenSelector
[ERROR] location: class org.apache.hadoop.crypto.key.kms.KMSClientProvider
[ERROR] /build/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/KMSClientProvider.java:[1010,35] cannot find symbol
[ERROR] symbol:   method selectToken(org.apache.hadoop.io.Text,java.util.Collection<org.apache.hadoop.security.token.Token<? extends org.apache.hadoop.security.token.TokenIdentifier>>)
[ERROR] location: variable INSTANCE of type org.apache.hadoop.crypto.key.kms.KMSClientProvider.TokenSelector

bentito commented Nov 18, 2020

The above is possibly related to this commit:

752092a8608 Revert "HADOOP-14445. Delegation tokens are not shared between KMS instances. Contributed by Xiao Chen and Rushabh S Shah."

bentito commented Dec 1, 2020

A New Beginning....
Successfully used roughly the kube/master Dockerfile and rebased from 3.1.1 to 3.1.4, just to see how that would go and to align more closely with what has to be the next stop for this change, 3.2.0.

https://github.com/bentito/hadoop/blob/yarb_rel_3.1.4/Dockerfile

bentito commented Dec 2, 2020

Rebase onto the common ancestor of 3.1.4 and 3.2.0 compiled successfully:

git pull --rebase upstream a39296260f8c77f3808e27b43c623a0edebe4a17

This branch is: bentito/yarb_rel_3.1.4-3.2.0

bentito commented Dec 2, 2020

After the now-shorter rebase from the branch above to upstream/branch_3.2.0, seeing this error on build:

[ERROR] /build/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java:[529,54] cannot find symbol

bentito commented Dec 9, 2020

Built 3.3.0 with "our" Dockerfile, which moves:
cmake -> 3.19.1;
mvn -> 3.3.9;
protobuf -> 3.7.1.
This branch is: hadoop-3.3.0-from-3.2.2

bentito commented Dec 10, 2020

So we still face a needs-rebase tag. Going with

git cherry-pick 2b9a8c1d3a2^..6ace76f4039

onto hadoop-3.3.0-from-3.2.2 to pick up everything of "ours" from kube-reporting/hadoop/master
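
The `^` in that range matters: `A..B` leaves out `A` itself, while `A^..B` starts from the parent of `A` and so includes it. A small sketch for previewing the range before cherry-picking (`preview_pick_range` is a made-up helper name):

```shell
# Sketch: list, oldest first, the commits in the range first^..last.
# preview_pick_range is a hypothetical helper name.
preview_pick_range() {
  first=$1
  last=$2
  # first^..last = reachable from last but not from first's parent,
  # so the range includes first itself.
  git log --reverse --format='%h %s' "$first^..$last"
}

# Usage, with the hashes from the command above:
# preview_pick_range 2b9a8c1d3a2 6ace76f4039
```

If the range contains merge commits they are listed too; a plain cherry-pick does not apply those without extra options, so the preview is worth reading before running the real command.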

bentito commented Dec 10, 2020

Seeing this error, as before, on this path:

Caused by: org.apache.hadoop.hive.metastore.api.MetaException: User hadoop is not allowed to perform this API call
