Skip to content

Instantly share code, notes, and snippets.

View morsapaes's full-sized avatar
👹

Marta Paes morsapaes

👹
View GitHub Profile
@morsapaes
morsapaes / twitch_source_postgres.sql
Last active August 31, 2021 09:19
Analyzing Twitch streams with Materialize: Postgres source
CREATE MATERIALIZED SOURCE mz_source
FROM POSTGRES
CONNECTION 'host=postgres port=5432 user=postgres dbname=postgres password=postgres'
PUBLICATION 'mz_source';
CREATE VIEWS FROM SOURCE mz_source (stream_tag_ids);
@morsapaes
morsapaes / twitch_source_kafka.sql
Created August 31, 2021 09:06
Analyzing Twitch streams with Materialize: Kafka source
CREATE SOURCE kafka_twitch
FROM KAFKA BROKER 'kafka:9092' TOPIC 'twitch-streams'
KEY FORMAT BYTES
VALUE FORMAT BYTES
ENVELOPE UPSERT;
@morsapaes
morsapaes / twitch_lateral_join.sql
Created August 31, 2021 08:56
Analyzing Twitch streams with Materialize: who are the most popular streamers for each of the top10 games?
--Use a view to create a "shortcut"
--to the top10 games query
CREATE VIEW v_stream_game_top10 AS
SELECT game_id,
game_name,
agg_viewer_cnt
FROM mv_agg_stream_game
ORDER BY agg_viewer_cnt DESC
LIMIT 10;
@morsapaes
morsapaes / twitch_join_unnest.sql
Last active November 4, 2021 13:15
Analyzing Twitch streams with Materialize: what are the most used tags?
CREATE MATERIALIZED VIEW mv_agg_stream_tag AS
SELECT st.localization_name AS tag,
cnt_tag
FROM (
--Expand the tags array for each event
--and aggregate before joining
SELECT tg, COUNT(*) AS cnt_tag
FROM v_twitch_stream ts,
unnest(tag_ids) tg
WHERE game_id IS NOT NULL
@morsapaes
morsapaes / twitch_temporal_filter.sql
Last active August 31, 2021 09:04
Analyzing Twitch streams with Materialize: what gaming streams started in the last 15 min?
CREATE MATERIALIZED VIEW mv_stream_15min AS
SELECT title,
user_name,
game_name,
started_at
FROM v_twitch_stream
WHERE game_id IS NOT NULL
--Use temporal filters to filter events with
--a sliding window of 15 minutes (900000 miliseconds)
AND (mz_logical_timestamp() >= (extract('epoch' from started_at)*1000)::bigint
@morsapaes
morsapaes / twitch_aggregation.sql
Last active October 7, 2021 09:43
Analyzing Twitch streams with Materialize: what are the most popular games?
CREATE MATERIALIZED VIEW mv_agg_stream_game AS
SELECT game_id,
game_name,
COUNT(id) AS cnt_streams,
SUM(viewer_count) AS agg_viewer_cnt
FROM v_twitch_stream
WHERE game_id IS NOT NULL
GROUP BY game_id, game_name;