| layout | title | description | tags |
|---|---|---|---|
| default | SQL Style Guide | A guide to writing clean, clear, and consistent SQL. | |
import math
import mmh3
from functools import partial
from itertools import izip  # Python 2 only; on Python 3 the built-in zip is already lazy

def estimateCardinality(self, significant_bits):
    '''
    Taken and slightly adapted from http://blog.notdot.net/2012/09/Dam-Cool-Algorithms-Cardinality-Estimation
    Estimates the number of unique elements in the input set values.
    significant_bits: The number of bits of hash to use as a bucket number; there will be 2**significant_bits buckets.
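The docstring above describes a bucketed estimator in which the low `significant_bits` bits of each hash select a bucket. The following is a minimal sketch of that idea in the style of the linked blog post (a LogLog-type estimator), not the body of `estimateCardinality` itself, which is not shown here. It assumes Python 3 and swaps in `hashlib.md5` for `mmh3` so that it runs without third-party packages; `hash32`, `trailing_zeroes`, and `estimate_cardinality` are hypothetical names chosen for illustration.

```python
import hashlib


def hash32(value):
    """Deterministic 32-bit hash of value (assumption: stand-in for mmh3)."""
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "little")


def trailing_zeroes(num, width):
    """Count trailing zero bits in num; return width if num is 0."""
    if num == 0:
        return width
    count = 0
    while (num >> count) & 1 == 0:
        count += 1
    return count


def estimate_cardinality(values, significant_bits):
    """Estimate the number of distinct items using 2**significant_bits buckets."""
    num_buckets = 2 ** significant_bits
    remaining_bits = 32 - significant_bits
    max_zeroes = [0] * num_buckets
    for value in values:
        h = hash32(value)
        bucket = h & (num_buckets - 1)   # low bits pick the bucket
        rest = h >> significant_bits     # remaining bits feed the zero count
        max_zeroes[bucket] = max(max_zeroes[bucket],
                                 trailing_zeroes(rest, remaining_bits))
    mean_zeroes = sum(max_zeroes) / num_buckets
    # 0.79402 is the bias-correction constant used in the linked blog post.
    return 2 ** mean_zeroes * num_buckets * 0.79402
```

Because each bucket only keeps a maximum, feeding the same values twice leaves the estimate unchanged, which is the property that makes this kind of sketch useful for counting distinct elements.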
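def error(estimate, size):
    # Relative error of an estimate against the true size.
    return abs(estimate - size) / float(size)

def uniform():
    # Yields 100,000 values that cycle through 100 distinct integers (0..99).
    for x in range(100000):
        yield x % 100

# `sc` is assumed to be an existing SparkContext (for example, from the PySpark shell).
rdd = sc.parallelize(list(uniform()))
assert error(rdd._jrdd.rdd().countApproxDistinct(4, 0), 100) < 0.4
assert error(rdd._jrdd.rdd().countApproxDistinct(8, 0), 100) < 0.1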
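The two thresholds in the assertions above can be compared against the usual HyperLogLog rule of thumb, where the relative standard error is roughly 1.04 / sqrt(m) for m = 2**p registers. The sketch below assumes that rule applies approximately to the `countApproxDistinct(p, sp)` call used here; the exact constant in Spark's implementation may differ slightly, and `expected_relative_error` is a hypothetical helper name.

```python
import math


def expected_relative_error(p):
    """Approximate HyperLogLog standard error for 2**p registers (rule of thumb)."""
    return 1.04 / math.sqrt(2 ** p)


for p in (4, 8):
    # p=4 gives about 0.26 and p=8 gives about 0.065
    print(p, round(expected_relative_error(p), 3))
```

Under that assumption, the 0.4 and 0.1 bounds in the assertions sit somewhat above one standard error for p=4 and p=8, which leaves some slack for run-to-run variation.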
| Java library | Debian package | Platform | Exact version match |
|---|---|---|---|
| metrics-core-2.2.0.jar | Could not find it | | |
| metrics-annotation-2.2.0.jar | Could not find it | | |
| zkclient-0.2.jar | Seems to have been deprecated at V0.1 | | |
| jopt-simple-3.2.jar | libjoptsimple-java 3.1-3 | precise/universe | No |
| scala-compiler.jar | scala | | Possibly |
| slf4j-api-1.7.2.jar | libslf4j-java 1.6.4-1 | precise/universe | No |
| snappy-java-1.0.4.1.jar | libsnappy-java (1.0.4.1~dfsg-1) | | Yes |