538's data on 45 years of Scrabble games turned into an Arrow file
$ python scrabble.py https://media.githubusercontent.com/media/fivethirtyeight/data/master/scrabble-games/scrabble_games.csv scrabble.arrow
> Task :sdks:java:io:hadoop-file-system:compileJava
/beam/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystemRegistrar.java:60: error: An unhandled exception was thrown by the Error Prone static analysis plugin.
checkArgument(
^
Please report this at https://github.com/google/error-prone/issues/new and include the following:
error-prone version: 2.10.0
BugPattern: ArgumentSelectionDefectChecker
Stack Trace:
java.lang.NoSuchMethodError: 'java.util.stream.Stream com.google.common.base.Splitter.splitToStream(java.lang.CharSequence)'
Linkage Check difference on beam-sdks-java-extensions-sql between master(00ed8a87) and datacatalog-client(9bd21c3a):
Lines starting with '<' mean the branch remedies the errors (good)
Lines starting with '>' mean the branch introduces new errors (bad)
9022a9023,9028
> Class com.fasterxml.jackson.core.TSFBuilder is not found;
> referenced by 1 class file
> com.fasterxml.jackson.dataformat.csv.CsvFactoryBuilder (jackson-dataformat-csv-2.10.0.jar)
> Class com.fasterxml.jackson.databind.cfg.MapperBuilder is not found;
> referenced by 1 class file
> com.fasterxml.jackson.dataformat.csv.CsvMapper (jackson-dataformat-csv-2.10.0.jar)
name: projects/apache-beam-testing/topics/java_mobile_gaming_topic
name: projects/apache-beam-testing/topics/testpipeline-jenkins-0208193512-b2c6d3ca
name: projects/apache-beam-testing/topics/testpipeline-jenkins-0208192737-e75f3cf5
name: projects/apache-beam-testing/topics/testpipeline-jenkins-0210041931-7dbd3392
name: projects/apache-beam-testing/topics/testpipeline-jenkins-0210041202-e68aa32b
name: projects/apache-beam-testing/topics/wc_topic_input1f7fc593-1fb1-4590-b806-c373d1f4d9fa
name: projects/apache-beam-testing/topics/wc_topic_output1f7fc593-1fb1-4590-b806-c373d1f4d9fa
name: projects/apache-beam-testing/topics/game_stats_it_input_topic3f311c11-e954-4628-8889-f8dac2c855e7
name: projects/apache-beam-testing/topics/game_stats_it_input_topiccb2205dd-2d68-4b55-a0dd-e8e72df6182f
name: projects/apache-beam-testing/topics/testpipeline-ajamato-0220012558-66bf781b
❯ cat /tmp/topics | grep PubsubJsonIT | cut -d'-' -f7-9 | sort | uniq -c
14 2019-10-03
32 2019-10-04
12 2019-10-05
8 2019-10-06
28 2019-10-07
16 2019-10-08
22 2019-10-09
20 2019-10-10
20 2019-10-11
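
The shell pipeline above can also be sketched in Python. This is only a rough equivalent under stated assumptions: the function name `count_by_date` and the sample topic names are hypothetical (the real contents of /tmp/topics are not shown), and the date is assumed to sit in dash-separated fields 7 to 9, as the `cut` call implies.

```python
from collections import Counter


def count_by_date(lines, pattern="PubsubJsonIT"):
    """Rough equivalent of: grep PATTERN | cut -d'-' -f7-9 | sort | uniq -c."""
    counts = Counter()
    for line in lines:
        if pattern not in line:  # grep PATTERN
            continue
        fields = line.rstrip("\n").split("-")
        # cut -f7-9 is 1-indexed, so it maps to the slice [6:9]
        counts["-".join(fields[6:9])] += 1
    # sort | uniq -c: one (key, count) pair per distinct key, in sorted order
    return sorted(counts.items())


# Hypothetical topic names, invented for illustration only
sample = [
    "name: projects/apache-beam-testing/topics/integ-test-PubsubJsonIT-testA-2019-10-03-01-02-03-1",
    "name: projects/apache-beam-testing/topics/integ-test-PubsubJsonIT-testB-2019-10-03-04-05-06-2",
    "name: projects/apache-beam-testing/topics/integ-test-PubsubJsonIT-testA-2019-10-04-01-02-03-3",
    "name: projects/apache-beam-testing/topics/java_mobile_gaming_topic",
]

for date, n in count_by_date(sample):
    print(f"{n:7d} {date}")
```

Unlike `cut`, this sketch yields an empty key for a matching line with fewer than seven dash-separated fields; that case does not arise in the sample.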
from timeit import timeit

N = int(1E6)  # iterations per measurement

def bench_conversion(int_size):
    # Time each conversion direction for np.int<int_size> (e.g. 32 or 64).
    np_to_int = timeit('int(i)', setup='import numpy as np; i=np.int%d(4528)' % int_size, number=N)
    int_to_np = timeit('np.int%d(i)' % int_size, setup='import numpy as np; i=int(4528)', number=N)
    np_to_np = timeit('np.int%d(i)' % int_size, setup='import numpy as np; i=np.int%d(4528)' % int_size, number=N)
    # timeit returns total seconds for N runs; convert to nanoseconds per operation.
    print("np.int%d to int:\t%.3f ns/op" % (int_size, np_to_int*1E9/N))
    print("int to np.int%d:\t%.3f ns/op" % (int_size, int_to_np*1E9/N))
    print("np.int%d to np.int%d:\t%.3f ns/op" % (int_size, int_size, np_to_np*1E9/N))
> [email protected] perf /home/hulettbh/working_dir/arrow/js
> node ./perf/index.js
Running apache-arrow performance tests...
Parse "tracks":
Table.from
x 6,199 ops/sec ±2.83% (81 runs sampled)
avg: 0.16ms
""" | |
===================
Label image regions
===================
This example shows how to segment an image with image labelling. The following
steps are applied:
1. Thresholding with automatic Otsu method
2. Close small holes with binary closing