Peter Baumgartner pmbaumgartner

@pmbaumgartner
pmbaumgartner / applied nlp.ipynb
Last active February 7, 2019 00:29
⚡️Applied Natural Language Processing in Python ⚡️
pmbaumgartner / dockerpredict.sh
Last active September 26, 2019 17:18
Getting Bert Working!
#!/bin/sh
# use this to get predictions on a test.csv located in BERT_DATA_DIR
export OUTCOME={classification_task_name}
export NOTEBOOK=/notebooks # don't change me
docker run --runtime=nvidia -it --rm \
    -v "$(pwd):$NOTEBOOK/" \
    -e "BERT_BASE_DIR=$NOTEBOOK/uncased_L-12_H-768_A-12" \
pmbaumgartner / performance_test.py
Last active November 10, 2022 17:24
Django/Postgres Data Load - Performance Comparison
from contextlib import closing
from io import StringIO
from time import time
import pandas as pd
from django.core.management.base import BaseCommand
from django.db import transaction
from faker import Faker
from core.models import Thing
pmbaumgartner / format_func_with_dict.py
Created December 20, 2019 14:11
How to use `format_func` with a streamlit widget and a dictionary
import streamlit as st
from vega_datasets import data
@st.cache
def load_data():
    return data.birdstrikes()
cols = {
pmbaumgartner / cache_benchmarking.py
Created December 20, 2019 14:22
A Basic Caching Benchmark Example for Streamlit
import streamlit as st
from vega_datasets import data
from time import time
import pandas as pd
@st.cache
def load_data():
    return pd.concat((data.airports() for _ in range(100)))
pmbaumgartner / dynamic_widget.py
Last active December 20, 2019 14:45
An example of a dynamic widget with streamlit
import altair as alt
import streamlit as st
from vega_datasets import data
cars = data.cars()
quantitative_variables = [
    "Miles_per_Gallon",
    "Cylinders",
    "Displacement",
pmbaumgartner / mpl_altair_streamlit_timing.py
Created December 20, 2019 15:33
Comparison of matplotlib and altair scatterplots in streamlit
from time import time
import altair as alt
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import streamlit as st
def mpl_scatter(dataset, x, y):
pmbaumgartner / streamlit_refactor.py
Last active December 20, 2019 16:12
An example of refactoring a streamlit app
# ⛔️ BAD EXAMPLE: PRE-REFACTOR
## app.py
import streamlit as st
import pandas as pd
data = pd.read_csv("data.csv") # no function → no cache, requires pandas import: 👎,👎
sample = data.head(100) # not input into streamlit object: 👎
described_sample = sample.describe() # input into streamlit object: ✅
st.write(described_sample)
pmbaumgartner / conda-pack-win.md
Last active January 7, 2025 13:54
Conda-Pack Windows Instructions

Packing Conda Environments

This approach requires conda. Conda must be installed on both the Source machine and the Target machine. The Source machine must have an internet connection; the Target machine does not need one. The OS on both machines must match: you cannot, for example, pack on macOS and unpack on Windows 10.

1. (Source) Install conda-pack in your base Python environment.

conda install -c conda-forge conda-pack
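
The requirement that both machines run the same OS can be checked before packing. The following is a minimal sketch and is not part of conda-pack: `platform_tag` and `platforms_match` are hypothetical helper names, and comparing the CPU architecture as well as the OS is an assumption that goes beyond the OS-only requirement stated above.

```python
import platform


def platform_tag(system=None, machine=None):
    """Return a coarse 'os-arch' tag such as 'windows-amd64'.

    Hypothetical helper: with no arguments it describes the current machine.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    return f"{system}-{machine}".lower()


def platforms_match(source_tag, target_tag):
    """Compare two tags, ignoring case and surrounding whitespace."""
    return source_tag.strip().lower() == target_tag.strip().lower()


if __name__ == "__main__":
    # Run this on the Source and on the Target, then compare the printed tags.
    print(platform_tag())
```

Running the script on each machine prints one tag per machine; if the two tags differ, an environment packed on the Source is unlikely to work on the Target.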
pmbaumgartner / tsss.py
Created January 13, 2021 22:43
TSSS Python
import numpy as np
import numba
@numba.njit()
def tsss(vec1, vec2):
    euclidean_distance = np.linalg.norm(vec1 - vec2)
    # Note: despite its name, this value is the cosine similarity of the two vectors.
    cosine_distance = np.dot(vec1, vec2.T) / (
        np.linalg.norm(vec1) * np.linalg.norm(vec2)
    )