Goals: Add links to reasonable, good explanations of how things work. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

ChatGPT appeared like an explosion on all my social media timelines in early December 2022. While I keep up with machine learning as an industry, I wasn't focused so much on this particular corner, and all the screenshots seemed like they came out of nowhere. What was this model? How did the chat prompting work? What was the context of OpenAI doing this work and collecting my prompts for training data?
I decided to do a quick investigation. Here's all the information I've found so far. I'm aggregating and synthesizing it as I go, so it's currently changing pretty frequently.
This is inspired by "A half-hour to learn Rust" and "Zig in 30 minutes".
Your first Go program as a classical "Hello World" is pretty simple:
First we create a workspace for our project:
PostgreSQL Data Types | AWS DMS Data Types | Redshift Data Types
---|---|---
INTEGER | INT4 | INT4
SMALLINT | INT2 | INT2
BIGINT | INT8 | INT8
NUMERIC (p,s) | If precision is 39 or greater, then use STRING. | If the scale is >= 0 and <= 37, then: NUMERIC (p,s). If the scale is >= 38 and <= 127, then: VARCHAR (Length)
DECIMAL (p,s) | If precision is 39 or greater, then use STRING. | If the scale is >= 0 and <= 37, then: NUMERIC (p,s). If the scale is >= 38 and <= 127, then: VARCHAR (Length)
REAL | REAL4 | FLOAT4
DOUBLE | REAL8 | FLOAT8
SMALLSERIAL | INT2 | INT2
SERIAL | INT4 | INT4
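The mapping in the table above can be sketched as a small lookup. This is only an illustration built from the rows shown here: the function name `redshift_type` and the `FIXED_TYPES` dictionary are hypothetical, and because the table does not say how the VARCHAR length is chosen, the sketch returns a bare `VARCHAR`.

```python
# Hypothetical helper: maps a PostgreSQL type name to the Redshift type
# listed in the table above. Not part of AWS DMS or any library.
FIXED_TYPES = {
    "INTEGER": "INT4",
    "SMALLINT": "INT2",
    "BIGINT": "INT8",
    "REAL": "FLOAT4",
    "DOUBLE": "FLOAT8",
    "SMALLSERIAL": "INT2",
    "SERIAL": "INT4",
}


def redshift_type(pg_type, precision=None, scale=None):
    """Return the Redshift type for a PostgreSQL type, per the table rows."""
    name = pg_type.strip().upper()
    if name in ("NUMERIC", "DECIMAL"):
        if precision is None or scale is None:
            raise ValueError("NUMERIC/DECIMAL needs both precision and scale")
        if 0 <= scale <= 37:
            return f"NUMERIC({precision},{scale})"
        if 38 <= scale <= 127:
            # The table gives VARCHAR (Length) without saying how the
            # length is derived, so it is left unspecified here.
            return "VARCHAR"
        raise ValueError(f"scale {scale} is outside the range in the table")
    # Raises KeyError for types that do not appear in the table.
    return FIXED_TYPES[name]
```

For example, `redshift_type("NUMERIC", 10, 2)` follows the first NUMERIC rule, while a scale of 40 falls through to the VARCHAR rule.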
-- Table information like sortkeys, unsorted percentage
-- see http://docs.aws.amazon.com/redshift/latest/dg/r_SVV_TABLE_INFO.html
SELECT * FROM svv_table_info;
-- Table sizes in GB
SELECT t.name, COUNT(tbl) / 1000.0 AS gb
FROM (
  SELECT DISTINCT datname, id, name
  FROM stv_tbl_perm
  JOIN pg_database ON pg_database.oid = db_id
These are the Kickstarter Engineering and Data role definitions for both teams.
-- Gets all queries for a given date range
select starttime, endtime, trim(querytxt) as query
from stl_query
where starttime between '2014-11-04' and '2014-11-05'
order by starttime desc;
-- Gets all queries that have been aborted for a given date range
select starttime, endtime, trim(querytxt) as query, aborted
from stl_query
where aborted=1
# Slopegraphs in matplotlib
# Trey Causey (@treycausey)
# Problems arise when equal values occur
# within the same time unit
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
units = ['A', 'B', 'C', 'D']