Peter Baumgartner pmbaumgartner

@pmbaumgartner
pmbaumgartner / softie.py
Last active July 12, 2024 13:50
Create a soft label classifier from any scikit-learn regressor object
from sklearn.base import BaseEstimator, ClassifierMixin
from scipy.special import expit, logit


class SoftLabelClassifier(BaseEstimator, ClassifierMixin):
    def __init__(self, regressor, eps=0.001):
        self.regressor = regressor
        self.eps = eps

    def fit(self, X, y=None):
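
The preview above is cut off at `fit`, so the gist's actual implementation is not shown here. As a rough, dependency-free sketch of one way such a wrapper could work, the code below assumes the soft labels are probabilities of the positive class, clips them by `eps`, fits the wrapped regressor on their logits, and maps predictions back through `expit`. The names `SoftLabelSketch` and `MeanRegressor` are hypothetical, and `logit`/`expit` are reimplemented with `math` in place of `scipy.special`.

```python
import math


def logit(p):
    """Map a probability in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def expit(z):
    """Inverse of logit, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


class MeanRegressor:
    """Hypothetical stand-in regressor: always predicts the mean training target."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class SoftLabelSketch:
    """Assumed design: regress on logit(soft label), predict through expit."""

    def __init__(self, regressor, eps=0.001):
        self.regressor = regressor
        self.eps = eps

    def fit(self, X, y):
        # Clip soft labels away from 0 and 1 so that logit stays finite.
        clipped = [min(max(p, self.eps), 1.0 - self.eps) for p in y]
        self.regressor.fit(X, [logit(p) for p in clipped])
        return self

    def predict_proba(self, X):
        # One [P(class 0), P(class 1)] pair per input row.
        positives = [expit(z) for z in self.regressor.predict(X)]
        return [[1.0 - p, p] for p in positives]

    def predict(self, X):
        # Hard labels from a 0.5 threshold on the positive-class probability.
        return [int(row[1] >= 0.5) for row in self.predict_proba(X)]
```

Any object exposing `fit(X, y)` and `predict(X)` can be passed as `regressor`; a scikit-learn regressor would normally receive arrays rather than the plain lists used here.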
pmbaumgartner / cloud-init.yaml
Last active February 28, 2025 09:17
Multipass & Docker Setup
#cloud-config
package_upgrade: true
ssh_authorized_keys:
- <your key>
packages:
- apt-transport-https
- ca-certificates
- curl
pmbaumgartner / docx-cli-search.md
Created July 19, 2021 15:17
Search the contents of Word docs via CLI

Search Contents of Word Documents from the Terminal

You'll need ripgrep and pandoc to get started. I use both of these frequently and find them quite helpful.

You can install both with Homebrew:

brew install pandoc ripgrep
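
The gist's own search commands are cut off in this listing. As a rough illustration of the underlying idea (convert each Word document to plain text, then search that text), the Python sketch below calls the `pandoc` CLI through `subprocess` and uses Python's `re` module in place of ripgrep. The function names are hypothetical, and the sketch assumes `pandoc` is on the `PATH`.

```python
import re
import subprocess
from pathlib import Path


def docx_to_text(path):
    """Convert one .docx file to plain text by calling the pandoc CLI."""
    result = subprocess.run(
        ["pandoc", "--to", "plain", str(path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def matching_lines(text, pattern):
    """Return (line_number, line) pairs for lines matching the regex pattern."""
    regex = re.compile(pattern)
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if regex.search(line)
    ]


def search_docx_tree(root, pattern):
    """Hypothetical helper: search every .docx file under root."""
    hits = {}
    for path in sorted(Path(root).rglob("*.docx")):
        found = matching_lines(docx_to_text(path), pattern)
        if found:
            hits[str(path)] = found
    return hits
```

Calling `search_docx_tree(".", "budget")` would return a dictionary keyed by file path, with the matching line numbers and lines for each document that contains the pattern.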
pmbaumgartner / tsss.py
Created January 13, 2021 22:43
TSSS Python
import numpy as np
import numba


@numba.njit()
def tsss(vec1, vec2):
    euclidean_distance = np.linalg.norm(vec1 - vec2)
    cosine_distance = np.dot(vec1, vec2.T) / (
        np.linalg.norm(vec1) * np.linalg.norm(vec2)
    )
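
The preview stops before the rest of the function, so the remaining steps of the gist are not shown. For orientation, here is a plain-Python sketch of the TS-SS measure as it is commonly formulated: a triangle-area term multiplied by a sector-area term, with the angle between the vectors padded by 10 degrees. The function name `tsss_sketch` and its variable names are hypothetical, and the sketch may differ from the gist in detail. Note that the dot product divided by the norms is the cosine similarity, which is the name used below.

```python
import math


def tsss_sketch(vec1, vec2):
    """TS-SS between two equal-length, non-zero vectors (assumed formulation)."""
    norm1 = math.sqrt(sum(a * a for a in vec1))
    norm2 = math.sqrt(sum(b * b for b in vec2))
    euclidean_distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(vec1, vec2)))

    # Cosine similarity, clamped so rounding error cannot push acos out of range.
    cosine_similarity = sum(a * b for a, b in zip(vec1, vec2)) / (norm1 * norm2)
    cosine_similarity = max(-1.0, min(1.0, cosine_similarity))

    # Angle in degrees, padded by 10 so parallel vectors still form a triangle.
    theta_deg = math.degrees(math.acos(cosine_similarity)) + 10.0

    # Triangle area spanned by the two vectors.
    triangle_area = norm1 * norm2 * math.sin(math.radians(theta_deg)) / 2.0

    # Sector area with radius = Euclidean distance + magnitude difference.
    magnitude_difference = abs(norm1 - norm2)
    radius = euclidean_distance + magnitude_difference
    sector_area = math.pi * radius ** 2 * theta_deg / 360.0

    return triangle_area * sector_area
```

Identical vectors give a sector radius of zero, so the score is zero; larger values indicate less similar vectors. A zero-length input would divide by zero and is not handled here.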
pmbaumgartner / conda-pack-win.md
Last active January 7, 2025 13:54
Conda-Pack Windows Instructions

Packing Conda Environments

You must be using conda for this approach, and conda must be installed on both the Source machine and the Target machine. The Source machine needs an internet connection; the Target does not. The operating system must match on both machines: you cannot go from macOS to Windows 10, for example.

1. (Source) Install conda-pack in your base Python environment.

conda install -c conda-forge conda-pack
pmbaumgartner / streamlit_refactor.py
Last active December 20, 2019 16:12
An example of refactoring a streamlit app
# ⛔️ BAD EXAMPLE: PRE-REFACTOR
## app.py
import streamlit as st
import pandas as pd
data = pd.read_csv("data.csv") # no function → no cache, requires pandas import: 👎,👎
sample = data.head(100) # not input into streamlit object: 👎
described_sample = sample.describe() # input into streamlit object: ✅
st.write(described_sample)
pmbaumgartner / mpl_altair_streamlit_timing.py
Created December 20, 2019 15:33
Comparison of matplotlib and altair scatterplots in streamlit
from time import time
import altair as alt
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import streamlit as st
def mpl_scatter(dataset, x, y):
pmbaumgartner / dynamic_widget.py
Last active December 20, 2019 14:45
An example of a dynamic widget with streamlit
import altair as alt
import streamlit as st
from vega_datasets import data
cars = data.cars()
quantitative_variables = [
    "Miles_per_Gallon",
    "Cylinders",
    "Displacement",
pmbaumgartner / cache_benchmarking.py
Created December 20, 2019 14:22
A Basic Caching Benchmark Example for Streamlit
import streamlit as st
from vega_datasets import data
from time import time
import pandas as pd
@st.cache
def load_data():
    return pd.concat((data.airports() for _ in range(100)))
pmbaumgartner / format_func_with_dict.py
Created December 20, 2019 14:11
How to use `format_func` with a streamlit widget and a dictionary
import streamlit as st
from vega_datasets import data
@st.cache
def load_data():
    return data.birdstrikes()
cols = {