Ryan Wesslen wesslen

@wesslen
wesslen / cfpb-complaints.Rmd
Created May 19, 2019 19:09
cfpb complaints - reticulate
---
author: "Ryan Wesslen"
title: "cfpb complaints/reticulate"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(reticulate); library(tidyverse)
```
wesslen / ctm-lda-experiment.Rmd
Created August 29, 2019 02:47
ctm-lda-experiment
---
title: "LDA vs CTM Experiment"
author: "Ryan Wesslen"
date: "Aug 28, 2019"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo=TRUE, warning=FALSE, message=FALSE)
```
wesslen / spaCy-entities.py
Created September 4, 2019 14:53
handling spaCy entities
import spacy
from spacy import displacy
path = "en_core_web_sm"
nlp = spacy.load(path)
path_folder = "/path/to/file/"
import pandas as pd
wesslen / boot-practice.R
Last active September 23, 2019 01:20
answers-to-boot-practice
# step 1: create 4 new columns in by_country (get freq stats)
# *Mean = donors_mean
# *Lower = donors_mean - 2*donors_stdev
# *Upper = donors_mean + 2*donors_stdev
# *Type = "Std Dev"
freq_df <- by_country %>%
  mutate(Lower = donors_mean - 2*donors_stdev,
         Upper = donors_mean + 2*donors_stdev,
         Mean = donors_mean) %>%
  select(-donors_mean, -donors_stdev) %>%
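The comments in step 1 describe a band of two standard deviations around each country's mean. As a rough illustration of that arithmetic only (not the gist's R pipeline), here is a dependency-free Python sketch. The helper name `std_dev_band`, the multiplier argument `k`, and the sample rows are assumptions made up for this example; only the column names come from the snippet above.

```python
def std_dev_band(mean, stdev, k=2):
    """Return the Mean/Lower/Upper/Type fields for a mean +/- k*stdev band."""
    return {
        "Mean": mean,
        "Lower": mean - k * stdev,
        "Upper": mean + k * stdev,
        "Type": "Std Dev",
    }

# Hypothetical per-country summary rows (values invented for illustration).
by_country = [
    {"country": "A", "donors_mean": 10.0, "donors_stdev": 1.5},
    {"country": "B", "donors_mean": 4.0, "donors_stdev": 0.5},
]

# Build one output row per country, dropping the raw mean/stdev fields.
freq_rows = [
    {"country": row["country"],
     **std_dev_band(row["donors_mean"], row["donors_stdev"])}
    for row in by_country
]

for row in freq_rows:
    print(row)
```

Keeping the multiplier as a parameter makes it easy to compare the two-standard-deviation band with narrower or wider ones.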
wesslen / keras.R
Last active November 30, 2019 15:41
rkeras hello worlds
# for more on keras/tf for R, see https://blogs.rstudio.com/tensorflow/
library(keras)
mnist <- dataset_mnist()
x_train <- mnist$train$x
y_train <- mnist$train$y
x_test <- mnist$test$x
y_test <- mnist$test$y
wesslen / textacy-kw.py
Last active January 21, 2020 00:47
textacy - keywords
import spacy
import textacy
import pandas as pd
# load flat file
df = pd.read_csv("data/vispapers.csv", engine = "python")
# texts + metadata
texts = {
    "text": df.Abstract,
    "Conference": df.Conference,
    "Year": df.Year
wesslen / spacy-sent-sim.py
Last active February 3, 2020 03:25
spacy sentence-sentence similarity and altair heatmap
import spacy
import numpy as np
import pandas as pd
import altair as alt
#alt.renderers.enable('default') # if in jupyter, need to activate
def cos_sim(t1, t2):
    # cosine similarity between two spaCy objects that expose .vector and .vector_norm
    return np.dot(t1.vector, t2.vector) / (t1.vector_norm * t2.vector_norm)
nlp = spacy.load("en_core_web_lg")
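The `cos_sim` helper above divides the dot product of two vectors by the product of their norms. The sketch below applies the same formula to plain Python lists, so it can be checked without loading a spaCy model. The function name `cosine_similarity`, the zero-vector guard, and the sample vectors are assumptions for illustration and are not part of the gist.

```python
import math

def cosine_similarity(v1, v2):
    """Cosine similarity of two equal-length numeric sequences."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    if norm1 == 0 or norm2 == 0:
        # A zero vector has no direction; return 0.0 instead of dividing by zero.
        return 0.0
    return dot / (norm1 * norm2)

# Vectors pointing the same way score close to 1; perpendicular ones score 0.
parallel = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(parallel, orthogonal)
```

The zero-norm guard matters in practice because tokens without a vector in the loaded model would otherwise trigger a division by zero.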
wesslen / vis-similiarity.py
Last active February 4, 2020 04:13
vis papers similarity
import pandas as pd
df = pd.read_csv("vispapers.csv", engine = "python")
df.shape
# count keywords
df['AuthorKeywords'] = df['AuthorKeywords'].apply(str)
raw = [_.lower() for _ in df['AuthorKeywords'] if _ != 'nan']
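The preview stops after building the lowercased `raw` list, so the counting step itself is not shown. One possible way to count individual keywords is sketched below with `collections.Counter`. The comma separator, the helper name `count_keywords`, and the sample strings are assumptions; the real `AuthorKeywords` column may use a different delimiter.

```python
from collections import Counter

def count_keywords(keyword_strings, sep=","):
    """Count individual keywords across delimiter-separated keyword strings."""
    counts = Counter()
    for entry in keyword_strings:
        for keyword in entry.split(sep):
            keyword = keyword.strip().lower()
            # Skip empty fragments and the 'nan' strings produced by apply(str).
            if keyword and keyword != "nan":
                counts[keyword] += 1
    return counts

# Hypothetical keyword strings standing in for df['AuthorKeywords'].
sample = ["Visualization, Graphs", "graphs, Topic Models", "nan"]
keyword_counts = count_keywords(sample)
print(keyword_counts.most_common(2))
```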
wesslen / Dockerfile
Last active March 30, 2025 06:13
streamlit-spacy-docker-container
FROM python:3.7
EXPOSE 8501
WORKDIR /app
COPY requirements.txt ./requirements.txt
RUN pip3 install -r requirements.txt
COPY . .
CMD streamlit run app.py
wesslen / get-returns.R
Created May 29, 2020 14:46
get-returns.R
library(tidyverse)
returns <- read_csv("data/returns.csv") %>%
  select(Year, equities_sp, treasury_10yr) %>%
  gather(key = "Asset", value = "Returns", -Year) %>%
  mutate(Asset = ifelse(Asset == "equities_sp",
                        "Asset A: High risk, high return",
                        "Asset B: Low risk, low return"))
set.seed(123)