Matt Keranen kmatt

  • SE US
@kmatt
kmatt / df_to_sql_fast.py
Created November 14, 2024 23:07 — forked from bthaman/df_to_sql_fast.py
High-performance Pandas DataFrame to SQL Server - uses pyodbc executemany with fast_executemany = True. This is an alternative to the built-in Pandas DataFrame.to_sql, which is slow for larger DataFrames.
def df_to_sql_fast(df, table_name, numeric_columns, date_columns, append_or_replace, conn):
    """
    Appends or overwrites a SQL Server table
    using data from a Pandas DataFrame.
    Submits df records at once for faster performance
    compared to DataFrame.to_sql.
    Parameters:
        df (DataFrame): df used to create/append table
        table_name (str): Name of existing SQL Server table
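A minimal sketch of the technique the description names, pyodbc's executemany with fast_executemany = True. This is not the gist's implementation: the helper name `insert_rows_fast`, its parameters, and the `chunk_size` batching are hypothetical, and the table and column names are assumed to be trusted identifiers. `conn` is assumed to be an open pyodbc connection.

```python
def insert_rows_fast(conn, table_name, columns, rows, chunk_size=10_000):
    """Bulk-insert rows through a pyodbc connection (hypothetical helper).

    conn       -- assumed to be an open pyodbc connection
    table_name -- name of an existing table (trusted, not user input)
    columns    -- list of column names, in the same order as each row
    rows       -- iterable of tuples, one per record
    """
    rows = list(rows)
    column_list = ", ".join(f"[{name}]" for name in columns)
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table_name} ({column_list}) VALUES ({placeholders})"

    cursor = conn.cursor()
    # Ask pyodbc to send parameters as arrays instead of one round trip per row.
    cursor.fast_executemany = True
    try:
        # Send the rows in batches; an empty input sends nothing at all.
        for start in range(0, len(rows), chunk_size):
            cursor.executemany(sql, rows[start:start + chunk_size])
        conn.commit()
    finally:
        cursor.close()
    return len(rows)
```

With a DataFrame `df`, the inputs could be built as `columns=list(df.columns)` and `rows=df.itertuples(index=False, name=None)`. Missing values (NaN/NaT) would probably need converting to None first; that step is left out of this sketch.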
kmatt / ddb_rowhash.txt
Created November 11, 2024 15:21
DuckDB rowhash
SELECT tbl::TEXT, HASH(tbl::TEXT), MD5(tbl::TEXT) FROM tbl;
D create table tbl as (select 1 as a, 2 as b, 3 as c);
D select tbl::text, hash(tbl::text), md5(tbl::text) from tbl;
┌──────────────────────────┬────────────────────────────┬──────────────────────────────────┐
│   CAST(tbl AS VARCHAR)   │ hash(CAST(tbl AS VARCHAR)) │    md5(CAST(tbl AS VARCHAR))     │
│         varchar          │           uint64           │             varchar              │
├──────────────────────────┼────────────────────────────┼──────────────────────────────────┤
│ {'a': 1, 'b': 2, 'c': 3} │        6764392534128998287 │ e31681d6e7ab078c9679fcd4f50136eb │
└──────────────────────────┴────────────────────────────┴──────────────────────────────────┘
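The same row-fingerprint idea can be sketched outside the database, for example to compare rows already loaded into Python. This is an illustration only: `row_fingerprint` is a hypothetical helper, and it is not claimed to reproduce the digests DuckDB prints above, since that would require the text form of each row to match DuckDB's cast byte for byte.

```python
import hashlib


def row_fingerprint(row):
    """Return an MD5 hex digest of a row's text form (hypothetical helper).

    row -- a dict mapping column name to value, in column order
    """
    # Render the whole row as one string, then hash that string,
    # mirroring the "cast the row to text, then hash it" pattern.
    text = repr(row)
    return hashlib.md5(text.encode("utf-8")).hexdigest()


old = {"a": 1, "b": 2, "c": 3}
new = {"a": 1, "b": 2, "c": 4}

# Equal rows give equal digests; a changed value gives a different one.
print(row_fingerprint(old) == row_fingerprint(dict(old)))
print(row_fingerprint(old) == row_fingerprint(new))
```

The digest depends on column order and on how each value is rendered as text, so both sides of a comparison need to build the row the same way.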
kmatt / NPP_CLI.reg
Created August 19, 2024 16:18
Notepad++ CLI
Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\App Paths\npp.exe]
@="C:\\Program Files (x86)\\Notepad++\\notepad++.exe"
kmatt / tigrc
Created July 3, 2024 02:18
Delta diffs in Tig
# View diffs using delta
# Via https://github.com/jonas/tig/issues/26#issuecomment-1923835137
bind diff D >sh -c "git show %(commit) | delta --paging always"
bind diff S >sh -c "git show %(commit) | delta --paging always --side-by-side"
bind stage D >sh -c "git diff HEAD -- %(file) | delta --paging always"
bind stage S >sh -c "git diff HEAD -- %(file) | delta --paging always --side-by-side"
bind status D >sh -c "git diff HEAD -- %(file) | delta --paging always"
bind status S >sh -c "git diff HEAD -- %(file) | delta --paging always --side-by-side"
kmatt / tmux.conf
Created April 30, 2024 14:46
Multiple timezones in tmux status bar
set-option -g status-right '#{client_tty} (#(TZ=US/Mountain date +%%H:%%M)MT #(TZ=UTC date +%%H:%%M)Z) %Y-%m-%d %H:%M'
kmatt / TDM.cmake
Created March 1, 2024 01:42
TDM-GCC and cmake on Windows notes
# To configure for MinGW instead of nmake
#
# C:\TDM-GCC-64\mingwvars.bat
# cmake . -G "MinGW Makefiles"
#
# Makefile: cmake ... -DCMAKE_TOOLCHAIN_FILE=TDM.cmake
set(CMAKE_SYSTEM_NAME Windows)
set(CMAKE_C_COMPILER C:/TDM-GCC-64/bin/gcc.exe)
kmatt / SQLAgentScripter.ps1
Created January 9, 2024 16:52 — forked from tcartwright/SQLAgentScripter.ps1
POWERSHELL: Generates SQL Server Agent objects to sql files
[cmdletbinding()]
Param(
    [Parameter(Mandatory=$true)]
    [string[]]$servers,
    [ValidateScript({
        if(-Not ($_ | Test-Path )) {
            throw "Folder does not exist"
        }
        return $true
kmatt / zfsCommands.md
Created December 1, 2023 17:31 — forked from aaiezza/zfsCommands.md
ZFS

Here are some helpful commands for managing ZFS and ZPool on Ubuntu

VDEV

Useful for populating /etc/zfs/vdev_id.conf:

printDisks() {
    for i in /dev/sd[b-i]; do
        fdisk -l $i
kmatt / delta-jars.sh
Last active September 18, 2023 16:23
Spark Delta Jars
# Thrift is not finding Delta jars in Ivy2 cache, even when specified in spark-defaults.conf (spark.sql.catalog.spark_catalog) ?
# Probably a bad solution...
wget https://repo1.maven.org/maven2/io/delta/delta-core_2.12/2.4.0/delta-core_2.12-2.4.0.jar && \
mv delta-core_2.12-2.4.0.jar jars/
wget https://repo1.maven.org/maven2/io/delta/delta-storage/2.4.0/delta-storage-2.4.0.jar && \
mv delta-storage-2.4.0.jar jars/
kmatt / build-spark-pip.sh
Created September 1, 2023 21:13
Build Spark for Python Pip
#!/bin/bash
# build/build-spark-pip.sh
# https://spark.apache.org/docs/3.4.1/building-spark.html
export MAVEN_OPTS="-Xss64m -Xmx2g -XX:ReservedCodeCacheSize=1g"
#./build/mvn -DskipTests clean package
pushd ..