Damon P. Cortesi dacort

@dacort
dacort / README.md
Created February 7, 2023 18:47
Reading Athena views from Spark
  1. Download the JDBC driver from here: https://docs.aws.amazon.com/athena/latest/ug/connect-with-jdbc.html - I used the JDBC driver with the Athena SDK, AthenaJDBC42-2.0.35.1000.jar.

  2. Start pyspark with the --jars option.

pyspark --jars AthenaJDBC42-2.0.35.1000.jar
  3. Use spark.read.jdbc to connect to Athena. You need to specify either a User/Password in the properties or set the AwsCredentialsProviderClass property.
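As a rough illustration of the last step, the sketch below builds the JDBC URL and properties and passes them to `spark.read.jdbc`. The driver class name, the credentials provider class, the URL keys, and the helper names are assumptions based on the 2.x Simba-based driver and are not taken from the gist; check them against the driver documentation for your version.

```python
# Hypothetical helpers; names and values are illustrative assumptions.

def athena_jdbc_url(region, s3_output):
    """Build a JDBC URL for the Athena driver (assumed 2.x URL format)."""
    return f"jdbc:awsathena://AwsRegion={region};S3OutputLocation={s3_output}"


def athena_jdbc_properties():
    """Connection properties using a credentials provider instead of User/Password."""
    return {
        # Assumed driver class for AthenaJDBC42 2.x.
        "driver": "com.simba.athena.jdbc.Driver",
        # Assumed provider class bundled with the "with SDK" driver jar.
        "AwsCredentialsProviderClass": (
            "com.simba.athena.amazonaws.auth.DefaultAWSCredentialsProviderChain"
        ),
    }


def read_athena_view(spark, view_name, region, s3_output):
    """Read an Athena view into a DataFrame through the JDBC driver."""
    return spark.read.jdbc(
        url=athena_jdbc_url(region, s3_output),
        table=view_name,
        properties=athena_jdbc_properties(),
    )
```

Inside the pyspark shell started with `--jars`, a call might look like `read_athena_view(spark, "my_db.my_view", "us-west-2", "s3://my-bucket/athena-results/")`, where the view name, region, and bucket are placeholders.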
@dacort
dacort / base36.py
Created October 31, 2022 18:36
Terraform base36 external data source
import json
import sys
BS = "0123456789abcdefghijklmnopqrstuvwxyz"
def parse_input():
    input = sys.stdin.read()
    return json.loads(input)
def to_base(n, b):
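
The preview above is cut off at `to_base`. The following is a separate, minimal sketch of how a base36 helper for a Terraform external data source could look, not the gist's actual code. It assumes the query arrives as a JSON object on stdin with a hypothetical key `value`, and it writes a JSON object of strings to stdout, which is the contract the external data source expects.

```python
import json

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_base36(n):
    """Convert a non-negative integer to a lowercase base36 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n:
        n, rem = divmod(n, 36)
        out.append(DIGITS[rem])
    return "".join(reversed(out))


def run(stdin, stdout):
    """Read a JSON query, encode query["value"], and write a JSON result."""
    query = json.loads(stdin.read())
    # "value" is a hypothetical key name chosen for this sketch.
    result = {"value": to_base36(int(query["value"]))}
    json.dump(result, stdout)
```

As a script, this would be driven with `run(sys.stdin, sys.stdout)`. Terraform passes query values as strings, hence the `int(...)` conversion.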
@dacort
dacort / README.md
Created August 30, 2022 05:13
in-the-zone

In The Zone

When you're in the zone, nothing can stop you. Let your coworkers know what tunes you were listening to in the moment.

Overview

There are two modes in life:

  • (R)ecord - when you want to make a mixtape for somebody
  • (P)layback - when you want to feel the mood
@dacort
dacort / Dockerfile
Created August 19, 2021 22:48
Using Anaconda for Spark dependencies on EMR on EKS
FROM 895885662937.dkr.ecr.us-west-2.amazonaws.com/spark/emr-6.3.0:latest
### Switch to root for installation
USER root
### setup for conda
RUN yum install -y wget
### Install Conda into shared location
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-py39_4.10.3-Linux-x86_64.sh \
@dacort
dacort / ctas_update.md
Created July 22, 2019 16:17
Athena CTAS Update

Athena CTAS Update

An approach for updating partitions in an existing table using CTAS queries.

Overview

With the release of CTAS functionality for Athena, you're now able to create derivative tables in Athena with different data formats or S3 locations.

Sometimes, though, you want to be able to add data to a partition of an existing table. As long as that partition doesn't already exist, you can do this with Athena by using CTAS with a temporary table.
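
One way to picture the temporary-table approach is as three statements: a CTAS that writes into the new partition's S3 prefix, an `ALTER TABLE ... ADD PARTITION` on the existing table, and a `DROP TABLE` for the temporary table. The sketch below only builds those SQL strings. The function name, parameters, and the Hive-style `col=value` path layout are assumptions made for illustration and may differ from the steps in the full gist.

```python
def ctas_partition_statements(target_table, temp_table, partition_col,
                              partition_val, target_location, select_sql,
                              fmt="PARQUET"):
    """Return the SQL statements for adding one new partition via CTAS."""
    # Assumed Hive-style layout: <table location>/<col>=<value>/
    partition_path = (
        f"{target_location.rstrip('/')}/{partition_col}={partition_val}/"
    )
    # 1. CTAS writes the query result directly under the partition prefix.
    ctas = (
        f"CREATE TABLE {temp_table}\n"
        f"WITH (format = '{fmt}', external_location = '{partition_path}')\n"
        f"AS {select_sql}"
    )
    # 2. Register the freshly written prefix as a partition of the target.
    add_partition = (
        f"ALTER TABLE {target_table} ADD PARTITION "
        f"({partition_col} = '{partition_val}') LOCATION '{partition_path}'"
    )
    # 3. Drop the temporary table's metadata; the S3 data stays in place.
    drop_temp = f"DROP TABLE {temp_table}"
    return [ctas, add_partition, drop_temp]
```

Athena requires the `external_location` of a CTAS query to be empty, which matches the condition above that the partition must not already exist.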

import sys
import boto3
from slacker import Slacker
import ConfigParser
import csv
from datetime import datetime
reload(sys)
sys.setdefaultencoding('utf8')
#!/bin/bash
#
#
# Poor man's version. Good 'ol bash and shell commands.
# Initialize stat counters
COPIES=0
FROM_APPS=()
TO_APPS=()
@dacort
dacort / gist:f480bb3b99817dbbefdb
Created April 23, 2015 06:19
Extract Github URL counts from Chrome history
sqlite3 ~/Library/Application\ Support/Google/Chrome/Default/History "SELECT date(visit_time/1000000-11644473600, 'unixepoch'),urls.url,count(*) FROM visits INNER JOIN urls ON visits.url = urls.id WHERE urls.url LIKE '%github.com/%' GROUP BY date(visit_time/1000000-11644473600, 'unixepoch'),urls.url ORDER BY date(visit_time/1000000-11644473600, 'unixepoch') DESC,count(*);"
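
The expression `visit_time/1000000-11644473600` in the query converts Chrome's timestamps, which count microseconds since 1601-01-01 UTC, into Unix seconds. A small sketch of the same arithmetic in Python, with a helper name chosen here for illustration:

```python
from datetime import datetime, timezone

# Seconds between 1601-01-01 and 1970-01-01 (the Unix epoch).
WEBKIT_EPOCH_OFFSET_SECONDS = 11644473600


def chrome_time_to_datetime(visit_time):
    """Convert a Chrome visit_time (microseconds since 1601) to a UTC datetime."""
    # Integer division mirrors SQLite's integer arithmetic in the query.
    unix_seconds = visit_time // 1_000_000 - WEBKIT_EPOCH_OFFSET_SECONDS
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
```

Calling `.date()` on the result gives the same day bucket that `date(..., 'unixepoch')` produces in the query.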
@dacort
dacort / cookiemonster.go
Created September 23, 2014 06:51
Simple script to extract (encrypted) cookies out of Chrome OS X cookie store. Usage: ./cookiemonster domain.com
package main
import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/sha1"
	"database/sql"
	"fmt"
	"log"

	"golang.org/x/crypto/pbkdf2"

Keybase proof

I hereby claim:

  • I am dacort on github.
  • I am dacort (https://keybase.io/dacort) on keybase.
  • I have a public key whose fingerprint is 0931 E57B 9A91 8338 5B93 201B F594 2FCD 9F8D 1C6D

To claim this, I am signing this object: