This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
-- s1 is not used | |
WITH s1 AS ( | |
SELECT | |
linkedin_slug | |
, count(*) OVER (PARTITION BY 1) | |
, max(updated_at) max_updated_at | |
FROM person | |
GROUP BY linkedin_slug | |
HAVING count(*) > 1 | |
) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# full process. Ingests data from postgres to tb, calculates duplicates, inserts them to a new datasource and removes the rows from that datasource that appear in the original one. | |
source $(pwd)/.env | |
# related, to ingest data from postgres: https://blog.tinybird.co/2019/10/14/the-one-cron-job-that-will-speed-up-your-analytical-queries-in-postgres-a-hundred-fold/ | |
echo 'Ingesting most recently update jobs' | |
psql $HEROKU_POSTGRES_URL -c "COPY (SELECT id, url, job_title, company, description, description_cleaned, date_posted, now() FROM job WHERE COALESCE(description, '') <> '' AND updated_at > now() - interval '70 minutes') TO STDOUT WITH (FORMAT CSV)" | curl -F csv=@- "https://api.tinybird.co/v0/datasources?name=jobs&mode=append&token=$TINYBIRD_ADMIN_TOKEN"; | |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import json | |
import io | |
from fastapi.responses import StreamingResponse | |
from fastapi import Request, Response | |
from fastapi import FastAPI | |
app = FastAPI() | |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"$schema": "https://vega.github.io/schema/vega-lite/v5.json", | |
"data": {"url": "data/stocks.csv"}, | |
"width": 400, | |
"height": 300, | |
"encoding": {"x": {"field": "date", "type": "temporal"}}, | |
"layer": [ | |
{ | |
"encoding": { | |
"color": {"field": "symbol", "type": "nominal"}, |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
DROP TABLE IF EXISTS events; | |
CREATE TABLE events | |
( | |
date timestamp, | |
product_id text, | |
user_id bigint, | |
event text, | |
extra_data json | |
); |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
CREATE OR REPLACE FUNCTION public.regexp_quote(IN TEXT) | |
RETURNS TEXT | |
LANGUAGE plpgsql | |
STABLE | |
AS $$ | |
/******************************************************************************* | |
* Function Name: regexp_quote | |
* In-coming Param: | |
* The string to decoded and convert into a set of text arrays. | |
* Returns: |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# loads the result of a query on Tinybird to a Pandas DataFrame | |
import pandas as pd | |
import urllib.parse | |
def build_tinybird_query_url(query: str, token: str) -> str: | |
url = f"https://api.tinybird.co/v0/sql?q={urllib.parse.quote(query + ' FORMAT CSVWithNames')}&token={token}" | |
return url | |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import requests | |
import urllib.parse | |
import json | |
def jprint(data: str): | |
"""Prints JSON-like string (data) nicely""" | |
print(json.dumps(data, indent=4, ensure_ascii=False)) | |
def ingest_data(datasource: str, |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
SELECT | |
date, | |
link, | |
tweet_text, | |
ifNull(url, '') url, | |
ifNull(title, '') title, | |
ifNull(description, '') description, | |
media, | |
format('https://twitter.com/{}', screenname) user_link, | |
format('{} (@{})', username, screenname) user_name_handle, |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
% | |
SELECT * FROM all_likes | |
WHERE match(lower(text), replaceRegexpAll({{String(search_term, ' ', description="Return tweets whose text matches this word (can include a regex pattern)", required=True)}}, ' ', '.?')) | |
AND ( | |
multiSearchAnyCaseInsensitive(screenname, splitByString(',', {{String(usernames, ',', description="User names or handles, comma-separated")}})) | |
OR multiSearchAnyCaseInsensitive(username, splitByString(',', {{String(usernames, ',', description="User names or handles, comma-separated")}})) | |
) | |
AND date >= toStartOfDay(toDate({{Date(since, '2000-09-09', description="Start date, YYYY-MM-DD")}})) | |
AND date <= addDays(toStartOfDay(toDate({{Date(until, '2100-09-09', description="End date, YYYY-MM-DD")}})), 1) | |
ORDER BY date desc |