Skip to content

Instantly share code, notes, and snippets.

View janbenetka's full-sized avatar

Jan Benetka janbenetka

  • Unacast
  • Pilsen, Czech Republic
View GitHub Profile
@janbenetka
janbenetka / example_tracts.txt
Last active December 7, 2020 09:09
[Example Tracts] #geo
36061000202 -> New York, lower east side
01063060200 -> Census Tract 602, Greene County, Alabama
53033008100 -> Seattle, Census Tract 81, King County, Washington
53033032203 -> Seattle, Census Tract 322.03, King County, Washington
08035014202 -> Colorado, Census Tract 142.02, Douglas County, Colorado
08031002000 -> Denver, Colorado, Census Tract 20, Denver County, Colorado
@janbenetka
janbenetka / employee_count.sparql
Created January 31, 2021 17:23
[SPARQL for employee count on DBPedia] #sparql #dbpedia
PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT *
WHERE {
?s1 dbo:numberOfEmployees ?employee_count .
?s1 geo:long ?long .
?s1 geo:lat ?lat .
OPTIONAL {?s1 dbp:locationCountry ?country .}
@janbenetka
janbenetka / hist_bq.sql
Last active November 4, 2022 12:19
[Histogram in BigQuery] #bigquery
DECLARE HIST_STEPS INT64 DEFAULT 20;
WITH data AS (
SELECT green_fraction as value
FROM `uc-x-data.14_day_retention.green_counties`
)
, stats AS (
SELECT min+step*i min, min+step*(i+1)max
FROM (
SELECT max-min diff, min, max, (max-min)/HIST_STEPS step, GENERATE_ARRAY(0, HIST_STEPS, 1) i
@janbenetka
janbenetka / median.sql
Created February 12, 2021 02:31
[Median in BigQuery] #bigquery
SELECT
APPROX_QUANTILES(home_duration, 100)[OFFSET(50)] as approx_median
FROM `uc-prox-core-dev.21_day_retention.hw_test6`
@janbenetka
janbenetka / percentiles.sql
Created February 12, 2021 02:31
[Percentiles in BigQuery] #bigquery
SELECT
percentiles[offset(10)] as p10,
percentiles[offset(25)] as p25,
percentiles[offset(50)] as p50,
percentiles[offset(75)] as p75,
percentiles[offset(90)] as p90,
FROM (
SELECT APPROX_QUANTILES(home_duration, 100) as percentiles
FROM `uc-prox-core-dev.21_day_retention.hw_test6`
);
@janbenetka
janbenetka / geohash_to_polygon.py
Created March 22, 2021 11:47
[Geohash to polygon] #python #geo
import sys
!{sys.executable} -m pip install polygon-geohasher;
from polygon_geohasher.polygon_geohasher import *
from shapely import geometry
str(geohash_to_polygon('u2ce02j'))
@janbenetka
janbenetka / h3_in_bigquery.sql
Last active October 27, 2022 10:02
[H3 hexagon functions in BigQuery] #h3 #hex
SELECT jslibs.h3.ST_GEOGPOINTFROMH3("u2ce02j")
SELECT jslibs.h3.ST_H3_BOUNDARY(jslibs.h3.ST_H3(ST_GEOGPOINT(13.377534960188237, 49.747300576525554), 11))
SELECT jslibs.h3.h3GetResolution("u2ce02j");
# lat/lon to hexagon
SELECT jslibs.h3.ST_H3(ST_GEOGPOINT(statistics.coordinate.longitude, statistics.coordinate.latitude), 10) AS dwell_hex_id,
WITH polygon AS (
@janbenetka
janbenetka / bq_date_loop.sql
Last active November 20, 2022 12:46
[BQ Loop over days] Date loop for backfills #bigquery #sql #date #loop
DECLARE i INT64 DEFAULT 0;
DECLARE DATES ARRAY<DATE>;
DECLARE event_date DATE;
SET DATES = GENERATE_DATE_ARRAY(DATE(2019,1,1), DATE(2019, 12, 31), INTERVAL 1 DAY);
LOOP
SET i = i + 1;
IF i > ARRAY_LENGTH(DATES) THEN
@janbenetka
janbenetka / bq_geojson_load.sql
Created June 15, 2021 06:35
[GeoJson load into BQ] #bigquery #json #geo
bq load \
--source_format=NEWLINE_DELIMITED_JSON \
--json_extension=GEOJSON \
--project_id=uc-project-id\
--autodetect \
dataset.new_table \
Arkansas.geojson
@janbenetka
janbenetka / polar_plot.py
Created June 22, 2021 09:08
[Polar/Radar plot in plotly] #plotly
import plotly.graph_objects as go
categories = ['accommodation','financial','tourism',
'shopping', 'food&drinks']
fig = go.Figure()
fig.add_trace(go.Scatterpolar(
r=[1, 5, 2, 2, 3],
theta=categories,