Anthony Miyaguchi acmiyaguchi

🌎
View GitHub Profile
From a2d8c5fd36fd654dc7d7c324c93a6706fd7b66d0 Mon Sep 17 00:00:00 2001
From: Anthony Miyaguchi <[email protected]>
Date: Mon, 12 Oct 2020 10:53:19 -0700
Subject: [PATCH] Add storybook test for nested search (with css fix from #74
@linh)
---
.storybook/story.css | 1 +
stories/schemaviewer.stories.js | 46 +++++++++++++++++++++++++++++++++
2 files changed, 47 insertions(+)
acmiyaguchi / result.diff
Last active October 1, 2020 21:06
glam fenix etl changes
Only in glean_all_fenix_incremental_diff/bcb95bb449431aff13322a7bcf9e0a5c0cd240d7: glam_etl
diff -r glean_all_fenix_incremental_diff/bcb95bb449431aff13322a7bcf9e0a5c0cd240d7/org_mozilla_fenix_glam_beta__clients_histogram_aggregates_v1/init.sql glean_all_fenix_incremental_diff/cee5fb0cbba4ac61c87b9778d33157238fd1ab49/org_mozilla_fenix_glam_beta__clients_histogram_aggregates_v1/init.sql
13d12
< latest_version INT64,
diff -r glean_all_fenix_incremental_diff/bcb95bb449431aff13322a7bcf9e0a5c0cd240d7/org_mozilla_fenix_glam_beta__clients_histogram_aggregates_v1/query.sql glean_all_fenix_incremental_diff/cee5fb0cbba4ac61c87b9778d33157238fd1ab49/org_mozilla_fenix_glam_beta__clients_histogram_aggregates_v1/query.sql
5d4
< latest_version INT64,
22d20
< latest_version,
31,32d28
From a1c8ca16aad05292f747335ce6b8333392a8c6de Mon Sep 17 00:00:00 2001
From: Anthony Miyaguchi <[email protected]>
Date: Sat, 12 Sep 2020 02:13:08 +0000
Subject: [PATCH] Install full version of 3.0.1 of Spark
---
Dockerfile | 21 +++++++++++++++------
1 file changed, 15 insertions(+), 6 deletions(-)
diff --git a/Dockerfile b/Dockerfile
acmiyaguchi / stage_info.json
Last active August 24, 2020 19:43
Spark event logs
{
  "Event": "SparkListenerStageCompleted",
  "Stage Info": {
    "Stage ID": 2,
    "Stage Attempt ID": 2,
    "Stage Name": "json at NativeMethodAccessorImpl.java:0",
    "Number of Tasks": 2,
    "RDD Info": [
      {
        "RDD ID": 20,
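The record above is a truncated preview of one Spark event log entry. Spark event logs store one JSON object per line, each carrying an "Event" field. A minimal sketch of extracting completed stages from such a log, assuming the lines have already been read into a list of strings (the helper name `completed_stages` and the sample lines are hypothetical, and only fields visible in the preview are used):

```python
import json


def completed_stages(lines):
    """Yield (stage_id, attempt_id, num_tasks) for each completed-stage event."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        event = json.loads(raw)
        if event.get("Event") != "SparkListenerStageCompleted":
            continue  # ignore every other event type
        info = event["Stage Info"]
        yield info["Stage ID"], info["Stage Attempt ID"], info["Number of Tasks"]


# Hypothetical sample log: one completed stage and one unrelated event.
sample = [
    '{"Event":"SparkListenerStageCompleted","Stage Info":{"Stage ID":2,'
    '"Stage Attempt ID":2,"Stage Name":"json at NativeMethodAccessorImpl.java:0",'
    '"Number of Tasks":2}}',
    '{"Event":"SparkListenerJobEnd","Job ID":0}',
]

print(list(completed_stages(sample)))
```

Filtering on the "Event" field first keeps the sketch tolerant of event types whose payloads have a different shape.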
#!/bin/bash
set -ex

function ds_range {
    DS_START=$1 DS_END=$2 python3 - <<EOD
from datetime import date, timedelta, datetime
from os import environ

def parse(ds):
    return datetime.strptime(ds, "%Y-%m-%d")
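The script preview is cut off after `parse`, so the rest of the helper is not shown. A minimal sketch of what a date-range helper along these lines might look like, assuming the two arguments are inclusive `YYYY-MM-DD` bounds. The inclusive end, the one-day step, and the list return value are assumptions, not taken from the original:

```python
from datetime import datetime, timedelta


def parse(ds):
    return datetime.strptime(ds, "%Y-%m-%d")


def ds_range(start, end):
    """Return every date string from start to end, inclusive (assumed behaviour)."""
    current, last = parse(start), parse(end)
    out = []
    while current <= last:
        out.append(current.strftime("%Y-%m-%d"))
        current += timedelta(days=1)
    return out


print(ds_range("2020-04-19", "2020-04-21"))
```

An empty list comes back when the start date falls after the end date, which lets a calling shell loop run zero times instead of failing.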
DECLARE four_twenty DEFAULT DATE('2020-04-20');

WITH per_build_client_day AS (
  SELECT
    PARSE_DATETIME("%Y%m%d%H%M%S", application.build_id) AS build_id,
    client_id,
    `moz-fx-data-shared-prod`.udf.histogram_normalize(
      `moz-fx-data-shared-prod`.udf.histogram_merge(
        ARRAY_AGG(
          `moz-fx-data-shared-prod`.udf.json_extract_histogram(
{
  "my_payload": "hello world"
}
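from sklearn.neighbors import NearestNeighbors
import networkx as nx


def create_knn_graph(data, n_neighbors):
    # Build a nearest-neighbors index over the data using a ball tree.
    ann = NearestNeighbors(algorithm="ball_tree", n_jobs=-1)
    index = ann.fit(data)
    # Sparse adjacency matrix linking each point to its n_neighbors nearest points.
    smat = index.kneighbors_graph(data, n_neighbors)
    # from_scipy_sparse_matrix and nx.info require NetworkX < 3.0.
    g = nx.from_scipy_sparse_matrix(smat)
    print(nx.info(g))
    return g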
#!/usr/bin/env python3

data = []
with open("bigquery_etl/format_sql/tokenizer.py") as fp:
    for line in fp.readlines():
        if "class" in line and ":" in line:
            data.append(line)

print("graph LR")
lines = []
for line in data:
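The preview stops at the start of the loop, so the original edge-building logic is not shown. A minimal sketch of one way to turn class definition lines into a Mermaid `graph LR` diagram, assuming each edge should point from a base class to its subclass. The regular expression, the helper names `class_edges` and `to_mermaid`, and the sample class names are all hypothetical:

```python
import re

# Matches "class Name:" and "class Name(Base1, Base2):" at the start of a line.
CLASS_RE = re.compile(r"^\s*class\s+(\w+)\s*(?:\(([^)]*)\))?\s*:")


def class_edges(source_lines):
    """Return (base, subclass) pairs for every class line that names a base."""
    edges = []
    for line in source_lines:
        match = CLASS_RE.match(line)
        if not match:
            continue
        name, bases = match.group(1), match.group(2) or ""
        for base in (b.strip() for b in bases.split(",")):
            if base:
                edges.append((base, name))
    return edges


def to_mermaid(edges):
    """Render the edges as a left-to-right Mermaid graph."""
    lines = ["graph LR"]
    lines += [f"    {parent} --> {child}" for parent, child in edges]
    return "\n".join(lines)


# Hypothetical input; these class names are illustrative only.
sample = [
    "class Token:",
    "class Comment(Token):",
    "    def f(self): pass",
    "class LineComment(Comment):",
]

print(to_mermaid(class_edges(sample)))
```

Anchoring the pattern at the start of the line avoids the false positives that a plain `"class" in line` check picks up from comments and docstrings. Keyword arguments such as `metaclass=...` would still be treated as base names in this sketch.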
% ./dataproc.sh
Copying file://test.py [Content-Type=text/x-python]...
/ [1 files][  346.0 B/  346.0 B]
Operation completed over 1 objects/346.0 B.
Waiting on operation [projects/amiyaguchi-dev/regions/us-west1/operations/38876a55-8a25-30f3-9a33-b9bd960ea881].
Waiting for cluster creation operation...done.
Created [https://dataproc.googleapis.com/v1beta2/projects/amiyaguchi-dev/regions/us-west1/clusters/test-3364] Cluster placed in zone [us-west1-a].
Job [3b00983a98754c8b8a7d22aed68931eb] submitted.
Waiting for job output...