Notes to self while reviewing the PR for BEAM-5107.
The ElasticsearchIOITcommon Javadoc references mvn. To work around this quickly, I used the following hack(!).
I added this to elasticsearch-tests-common/build.gradle

    occurrenceCount,
    // verbatim fields in records
    v_kingdom,
    v_phylum,
    v_class,
    v_order,
    v_family,
    v_genus,
    v_scientificName,


    13:43:23.690 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://c4hivemetastore.gbif-uat.org:9083
    13:43:23.738 [main] INFO hive.metastore - Connected to metastore.
    13:43:23.742 [main] DEBUG org.apache.beam.sdk.Pipeline - Adding SqlTransform to Pipeline#2021601975
    Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Property 'org.apache.beam.sdk.extensions.sql.impl.planner.BeamRelDataTypeSystem' not valid for plugin type org.apache.calcite.rel.type.RelDataTypeSystem
        at org.apache.beam.repackaged.sql.org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:159)
        at org.apache.beam.repackaged.sql.org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:114)
        at org.apache.beam.repackaged.sql.org.apache.calcite.prepare.PlannerImpl.ready(PlannerImpl.java:143)
        at org.apache.beam.repackaged.sql.org.apache.calcite.prepare.PlannerImpl.parse(PlannerImpl.java:170)
        at org.apache.beam.repackaged.sql.org.apache.calcite.tools.Planner.parse(Planner.java


    @NoArgsConstructor(access = AccessLevel.PRIVATE)
    public class BackbonePreRelease {

      private static final String SELECT_SQL = "SELECT kingdom, count(*) AS c FROM `hive`.`%s`.`%s`";

      public static void main(String[] args) {
        PipelineOptionsFactory.register(BackbonePreReleaseOptions.class);
        BackbonePreReleaseOptions options = PipelineOptionsFactory.fromArgs(args).as(BackbonePreReleaseOptions.class);
        options.setRunner(SparkRunner.class);
        Pipeline p = Pipeline.create(options);

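The `SELECT_SQL` template in the snippet above carries two `%s` placeholders, presumably for the Hive database and table. The snippet is cut off before they are filled in, so the following is only a minimal sketch of that step, assuming plain `String.format` is used. The class name `SelectSqlSketch`, the helper `buildQuery`, and the names `example_db` and `example_table` are hypothetical and do not come from the original.

```java
// Sketch only: class, helper, and argument names are hypothetical.
final class SelectSqlSketch {

  // Same template string as SELECT_SQL in the snippet above.
  static final String SELECT_SQL = "SELECT kingdom, count(*) AS c FROM `hive`.`%s`.`%s`";

  // Fills the two %s placeholders with a database name and a table name.
  static String buildQuery(String database, String table) {
    return String.format(SELECT_SQL, database, table);
  }

  public static void main(String[] args) {
    // Prints the template with both placeholders substituted.
    System.out.println(buildQuery("example_db", "example_table"));
  }
}
```

As written, the template has no `GROUP BY` clause. An aggregate query that also selects `kingdom` would normally need `GROUP BY kingdom`; since the snippet is truncated, that clause may well be appended elsewhere.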

Actions
1) Verify the correct mailing list is in place (TR / DM)
2) Ensure the participants in the Kilkenny Accord are happy for it to be finalised (DM mail to list)
   - with the change that the 5,000€ is not a "hard limit"
3) Draft a communication to be sent to the mailing list covering (DM)
   - The Kilkenny Accord status, and share the accord


    ADD JAR /tmp/hadoop-compress-1.3-SNAPSHOT.jar;
    ADD JAR /tmp/occurrence-hive-0.89-20181017.084448-7.jar;
    ADD JAR /tmp/brickhouse-0.6.0.jar;
    ADD JAR /tmp/occurrence-common-0.89-20181017.084442-7.jar;
    ADD JAR /tmp/gbif-api-0.72-20181012.105547-3.jar;
    SET io.seqfile.compression.type=BLOCK;
    SET mapred.output.compression.codec=org.gbif.hadoop.compress.d2.D2Codec;
    SET io.compression.codecs=org.gbif.hadoop.compress.d2.D2Codec;

| taxonomicstatus | c |
|---|---|
| NULL | 987614012 |
| accepted | 21892669 |
| Aceptado | 3513846 |
| accepted name | 1213782 |
| Accepted | 956810 |
| valid | 675813 |
| ACCEPTED | 336521 |
| Temporal | 317255 |
| válido | 277952 |
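
Several rows in the table above differ only in letter case (`accepted`, `Accepted`, `ACCEPTED`). As a minimal sketch, assuming the goal is to see how the counts combine once case is ignored, the values could be folded before summing. The class `StatusFoldSketch` and its methods are hypothetical. Merging language variants such as `Aceptado` or `válido` would need an explicit mapping and is deliberately left out.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Sketch only: folds status strings by case and sums their counts.
final class StatusFoldSketch {

  // Trims and lower-cases a raw value; null stays null.
  static String fold(String raw) {
    return raw == null ? null : raw.trim().toLowerCase(Locale.ROOT);
  }

  // Sums the counts of all raw values that fold to the same key.
  static Map<String, Long> foldCounts(Map<String, Long> rawCounts) {
    Map<String, Long> folded = new LinkedHashMap<>();
    for (Map.Entry<String, Long> e : rawCounts.entrySet()) {
      folded.merge(fold(e.getKey()), e.getValue(), Long::sum);
    }
    return folded;
  }

  public static void main(String[] args) {
    // A few rows copied from the table above.
    Map<String, Long> raw = new LinkedHashMap<>();
    raw.put("accepted", 21892669L);
    raw.put("Accepted", 956810L);
    raw.put("ACCEPTED", 336521L);
    raw.put("valid", 675813L);
    // prints {accepted=23186000, valid=675813}
    System.out.println(foldCounts(raw));
  }
}
```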
| v_dynamicproperties | count |
|---|---|
| NULL | 967996798 |
| {"Activity":"Forage"} | 2922013 |
| "{'coverScaleCode':'+'}" | 2440492 |
| "{'coverScaleCode':'r'}" | 1456845 |
| "{'coverScaleCode':'1'}" | 1428278 |
| {} | 1075352 |
| {"Activity":"Display/Song"} | 870676 |
| {"Activity":"Resting"} | 674730 |
| "{'coverScaleCode':'3'}" | 648481 |

    /**
     * Decodes the protobuf bytes into {@link Operation} instances.
     *
     * <p>The encoded format is defined as follows:
     *
     * <ol>
     *   <li>"rows" is a byte array encoding of:
     *       <ol>
     *         <li>The operation type (e.g. Upsert) encoded as a byte
     *         <li>The "isSet" bitSet encoded as one or more bytes


    package com.opencore.demo;

    import com.google.common.collect.ImmutableList;
    import org.apache.beam.sdk.testing.PAssert;
    import org.apache.beam.sdk.testing.TestPipeline;
    import org.apache.beam.sdk.transforms.Create;
    import org.apache.beam.sdk.transforms.DoFn;
    import org.apache.beam.sdk.transforms.ParDo;
    import org.apache.beam.sdk.values.PCollection;
    import org.junit.Rule;