Can we all just agree it's pronounced "kubernetes"

Jason Thorpe jdthorpe

@jdthorpe
jdthorpe / Favorite Courses.md
Last active January 20, 2023 16:52
A few of my favorite courses

These are a few of my favorite courses on tech topics

General Coding:

Aside: when writing Python packages (not just scripts), consider having the phrase "If its name has no dots, you cannot use relative imports" tattooed someplace readily accessible.
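A minimal sketch of why that rule holds, assuming a regular module (not a package's `__init__.py`): Python resolves a relative import against the module's parent package, which is whatever precedes the last dot in the module's name. The names `mypkg`, `utils`, and `helpers` are made up for illustration, and `resolve_relative` is a hypothetical helper; only `importlib.util.resolve_name` comes from the standard library.

```python
import importlib.util


def resolve_relative(module_name, relative_name):
    """Return the absolute name a relative import would resolve to, or None.

    Assumes module_name belongs to a regular module, so its parent package
    is everything before the last dot (empty when there are no dots).
    """
    package = module_name.rpartition(".")[0]
    try:
        return importlib.util.resolve_name(relative_name, package)
    except (ImportError, ValueError):
        # No parent package (or the import climbs past the top-level package).
        # Older Python versions raise ValueError here, newer ones ImportError.
        return None


if __name__ == "__main__":
    # A module inside a package: the relative import can be resolved.
    print(resolve_relative("mypkg.utils", ".helpers"))
    # A script run directly is named "__main__": no dots, nothing to resolve.
    print(resolve_relative("__main__", ".helpers"))
```

This is also why running a file inside a package with `python path/to/file.py` breaks its relative imports, while `python -m mypkg.utils` keeps the dotted name and works.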

Aside 2: In Python multiprocessing, you cannot use an interactive teletype to interact with subprocesses, so if a subprocess reaches a call to pdb.set_trace(), it will simply hang waiting for a response, which can be a real pain to debug.
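One way to sidestep the hang, sketched under the assumption that the main process keeps multiprocessing's default name `"MainProcess"`: wrap the breakpoint in a guard that refuses to start pdb anywhere else. `in_main_process` and `guarded_set_trace` are hypothetical helper names, not part of `pdb` or `multiprocessing`.

```python
import multiprocessing
import pdb
import sys


def in_main_process():
    """True when running in the process that started the program."""
    # Assumption: nobody renamed the main process from its default name.
    return multiprocessing.current_process().name == "MainProcess"


def guarded_set_trace():
    """Start pdb only in the main process; return whether it was started."""
    if not in_main_process():
        name = multiprocessing.current_process().name
        # Report the skipped breakpoint instead of waiting on a terminal
        # that the worker cannot read from.
        print(f"breakpoint skipped in worker {name!r}", file=sys.stderr)
        return False
    # pdb stops inside this helper; step out with `next` or `up`
    # to reach the caller's frame.
    pdb.set_trace()
    return True
```

To debug the worker's logic itself, it is usually simpler to call the worker function directly in the main process with a single input, so the breakpoint has a terminal to talk to.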

jdthorpe / cloudSettings
Last active March 9, 2022 18:48
vscode settings
{"lastUpload":"2022-03-09T18:48:13.706Z","extensionVersion":"v3.4.3"}
jdthorpe / Data Dict.sql
Last active October 24, 2023 16:42
Code for generating data dictionaries for databases with and without schemas
-----------------------------------------
-- simple database
-----------------------------------------
SELECT CONVERT(VARCHAR(140),s.name) AS [Schema Name],
CONVERT(VARCHAR(140),o.name) AS [Object Name],
CONVERT(VARCHAR(140),c.name) AS [Field Name],
CONVERT(VARCHAR,t.name) AS [Field Type] ,
CONVERT( VARCHAR, CASE
WHEN o.type = 'U' THEN 'Table'
jdthorpe / keybindings.json
Last active January 10, 2020 01:48
vscode settings
[
{
"key": "ctrl+h",
"command": "workbench.action.focusPreviousGroup"
},
{
"key": "ctrl+l",
"command": "workbench.action.focusNextGroup"
},
{
jdthorpe / Shim for sklearn.impute.SimpleImputer.md
Last active September 27, 2019 20:08
Shim for sklearn.impute.SimpleImputer

When imputing with sklearn.impute.SimpleImputer and the option strategy="most_frequent", calling the .fit() method takes an absurdly long time. This happens because scipy.stats.mode is ridiculously inefficient for string variables. For example, imputing a single feature with half a million values takes ~15 minutes without the shim; with the shim, it takes less than one millisecond.

Usage:

import SimpleImputerShim  # that's it!
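The shim's contents are not shown here, so the following is not its implementation. It is only a sketch of the general idea of finding the most frequent value of a string column by counting in a hash table. `most_frequent` is a hypothetical name, missing values are assumed to be `None`, and breaking ties by taking the smallest value is an assumption made to keep the result deterministic.

```python
from collections import Counter


def most_frequent(values, missing=None):
    """Return the most common non-missing value, or None if there is none.

    Counting with a hash table takes a single pass over the data.
    """
    counts = Counter(v for v in values if v is not missing)
    if not counts:
        return None
    top = max(counts.values())
    # Assumption: ties are broken by taking the smallest value.
    return min(v for v, c in counts.items() if c == top)


if __name__ == "__main__":
    column = ["red", "blue", "blue", None, "red", "green", "blue"]
    fill = most_frequent(column)
    # Replace the missing entries with the most frequent value.
    print([fill if v is None else v for v in column])
```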
jdthorpe / README.md
Created July 3, 2019 05:52
Rationale for adding collection-wide typings

Having to add typings to each call is almost as useless as not having typings at all. For example,

import DB from "nedb"
const myCollection = new DB()
// insert one kind of document
myCollection.insert<{name:string}>({name:"nedb"})
// expect to find a different kind of document
myCollection.findOne<{name:number}>({},(err,doc) => console.log(typeof doc.name))
jdthorpe / kd.js
Created May 28, 2019 00:57
Kernel densities
function kernelDensityEstimator(kernel:{(x:number):number}, X:number[]) {
return function(V:number[]) {
return X.map(function(x) {
return [x, d3.mean(V, function(v:number) { return kernel(x - v); })];
});
};
}
function kernelEpanechnikov(k:number): {(x:number):number}{
jdthorpe / bundle.js
Last active April 18, 2019 14:49
code emitted by webpack when packing `importScripts('./require.js');requirejs.config()`
/******/ (function(modules) { // webpackBootstrap
/******/ // The module cache
/******/ var installedModules = {};
/******/
/******/ // The require function
/******/ function __webpack_require__(moduleId) {
/******/
/******/ // Check if module is in cache
/******/ if(installedModules[moduleId]) {
/******/ return installedModules[moduleId].exports;
jdthorpe / Gradient Descent Constrained to the Unit Simplex
Last active April 10, 2019 04:40
Take a step in the direction of the gradient restricted to the positive simplex
<svg height="210" width="400">
<path d="M150 0 L75 200 L225 200 Z" />
Sorry, your browser does not support inline SVG.
</svg>
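The gist's own code is not shown here, so this is not the author's method. As a rough sketch of one common way to keep a gradient step on the unit simplex, the code below takes an ordinary step and then applies the Euclidean projection back onto the set of non-negative vectors that sum to 1, using the standard sort-based projection. `project_to_simplex` and `simplex_step` are hypothetical names.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {x : x_i >= 0, sum(x) == 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        candidate = (cumulative - 1.0) / i
        # Keep the shift from the largest prefix whose entries stay positive.
        if ui - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def simplex_step(x, grad, step_size):
    """One projected gradient descent step; x is assumed to lie on the simplex.

    A negative step_size moves along the gradient (ascent) instead.
    """
    moved = [xi - step_size * gi for xi, gi in zip(x, grad)]
    return project_to_simplex(moved)


if __name__ == "__main__":
    x = [1 / 3, 1 / 3, 1 / 3]
    grad = [1.0, 0.0, 0.0]  # illustrative gradient, not from the gist
    print(simplex_step(x, grad, step_size=0.1))
```

The projection shifts every coordinate by the same amount and clips at zero, so weight removed from one coordinate is spread evenly over the coordinates that remain positive.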
jdthorpe / Extending python-docx.md
Last active November 4, 2023 15:41
Extending python-docx

Extending python-docx

Background

I was recently tasked with scraping data from a few thousand Word documents, each of which contained nested docx documents referenced by w:altChunk elements in the body of the main ./word/document.xml file, like so: