Tyler Rinker trinker

@trinker
trinker / roc.R
Created March 23, 2019 16:42
Calculating AUC: the area under a ROC Curve
## https://blog.revolutionanalytics.com/2016/11/calculating-auc.html
## Load Dependency
library(numform)
##=======================================================
## Make some fake data
##=======================================================
set.seed(10)
actual <- sample(0:1, 100, replace = TRUE, prob = c(.8, .2))
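The preview stops before any AUC is computed. As a minimal sketch of one standard approach, the rank-sum (Mann-Whitney) identity gives the AUC directly from the scores. The helper name `auc_rank` and the toy vectors below are assumptions for illustration and are not taken from the gist.

```r
## Hypothetical helper (not from the gist): AUC via the rank-sum identity.
## AUC = (sum of ranks of positives - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
auc_rank <- function(actual, score) {
    stopifnot(length(actual) == length(score), all(actual %in% 0:1))
    n_pos <- sum(actual == 1)
    n_neg <- sum(actual == 0)
    r <- rank(score)  # average ranks, so tied scores count as half a win
    (sum(r[actual == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

## Tiny hand-checkable example (made-up labels and scores)
toy_actual <- c(0, 0, 1, 1)
toy_score  <- c(0.1, 0.4, 0.35, 0.8)
auc_rank(toy_actual, toy_score)  # 0.75
```

The result is the proportion of positive/negative pairs in which the positive case has the higher score; here three of the four pairs are ordered correctly.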
trinker / udpipeFormality.R
Created March 22, 2019 19:14
formality_with_udpipe
##==============================================================================
## Formality
##==============================================================================
## 1. tag parts of speech
## 2. convert to generic POS
## 3. Compute formality off POS
udmodel <- udpipe::udpipe_download_model(language = "english")
udmodel <- udpipe::udpipe_load_model(file = udmodel$file_model)
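Steps 2 and 3 above can be sketched without the udpipe model, starting from a vector of universal POS tags. The sketch assumes a Heylighen and Dewaele style F-score, where formal categories (nouns, adjectives, prepositions, articles) are weighed against contextual ones (pronouns, verbs, adverbs, interjections). The helper name, the tag grouping, and the choice of denominator are all assumptions; the gist's own computation is not visible in the preview.

```r
## Hypothetical helper (not from the gist): formality score from UPOS tags.
## Assumed grouping of universal POS tags into formal vs. contextual classes.
formality_score <- function(upos) {
    formal     <- c('NOUN', 'PROPN', 'ADJ', 'ADP', 'DET')
    contextual <- c('PRON', 'VERB', 'AUX', 'ADV', 'INTJ')
    counted <- upos[upos %in% c(formal, contextual)]  # drop punctuation, etc.
    n <- length(counted)
    if (n == 0) return(NA_real_)
    ## Assumption: the denominator counts only the categorized tokens
    50 * ((sum(counted %in% formal) - sum(counted %in% contextual)) / n + 1)
}

## Made-up tag sequence, e.g. for "The dog sat on the mat."
tags <- c('DET', 'NOUN', 'VERB', 'ADP', 'DET', 'NOUN', 'PUNCT')
formality_score(tags)  # roughly 83.3
```

Scores run from 0 (only contextual categories) to 100 (only formal categories), with 50 meaning an even split.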
trinker / datasci_install.R
Last active April 20, 2019 11:47
Install datasci packages
#' @param packages An optional vector of Campus Labs packages to install.
#' @param pattern An optional grep pattern of Campus Labs packages to install.
install_cl <- function(packages = NULL, pattern = '.', ...){
    try_install_cran <- function(package){
        if (!require(package, character.only = TRUE, quietly = TRUE)) {
            message(sprintf('The "%s" package is missing; do you want me to install it?', package))
            ans <- menu(c("Yes", "No"))
            ## menu() returns the integer index of the chosen item
            if (ans == 2L) {
trinker / topicmodeling_bit.R
Created February 13, 2019 17:21
Bit Topic Modeling
if (!require("pacman")) install.packages("pacman")
pacman::p_load(udpipe, BTM)
data("brussels_reviews_anno", package = "udpipe")
## Get and load the udpipe model
engmod <- udpipe_download_model(language = "english", udpipe_model_repo = "bnosac/udpipe.models.ud")
ud_engmod <- udpipe_load_model(engmod$file_model)
## Annotate the text data and merge back together
nr <- nrow(sentimentr::presidential_debates_2012)
trinker / EulerPlot.R
Created February 13, 2019 15:07
Euler Plot Demo
pacman::p_load(venneuler, tidyverse, ggforce)
## Make some fake data
set.seed(10)
dat <- data.frame(
    Person = paste0('Person_', 1:10),
    setNames(as.data.frame(matrix(rbinom(50, size = 1, prob = c(1/(1:5))), ncol = 5)), paste0('Attribute_', 1:5)),
    stringsAsFactors = FALSE
)
trinker / secret_message.R
Created February 3, 2019 22:22
Secret Message
library(tidyverse)
## tibble() replaces the deprecated data_frame()
map <- tibble(
    ins = c(LETTERS, letters),
    outs = c(rev(LETTERS), rev(letters))
)
map
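The table above pairs each letter with its mirror in the alphabet (A with Z, b with y, and so on). How the gist goes on to apply it is not shown in the preview; one possible way, sketched here with base R's `chartr()`, is to translate characters directly. The `encode` helper is hypothetical.

```r
## Illustrative only (not the gist's code): apply the reversed-alphabet map
## with base R's chartr(), which translates characters one-for-one.
ins  <- paste(c(LETTERS, letters), collapse = '')
outs <- paste(c(rev(LETTERS), rev(letters)), collapse = '')
encode <- function(x) chartr(ins, outs, x)

secret <- encode('Hello, World!')
secret          # "Svool, Dliow!"
encode(secret)  # applying the map twice restores "Hello, World!"
```

Because the mapping is its own inverse, the same function both encodes and decodes; characters outside the map, such as punctuation and spaces, pass through unchanged.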
trinker / scaling.R
Created February 1, 2019 12:21
Class demo scaling techniques
min_max <- function(x, ...) {
    m <- min(x, na.rm = TRUE)
    (x - m)/(max(x, na.rm = TRUE) - m)
}
standardization <- function(x, ...) {
    if (length(stats::na.omit(unique(x))) > 1) {
        scale(x)
    } else {
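To make the difference between the two techniques concrete, here is a small usage sketch. `min_max` is copied from the preview above; `z_score` is a hypothetical stand-in for plain standardization, since the rest of the gist's `standardization` function is cut off and is not reproduced here.

```r
## min_max is copied from the preview; z_score is a hypothetical stand-in
min_max <- function(x, ...) {
    m <- min(x, na.rm = TRUE)
    (x - m)/(max(x, na.rm = TRUE) - m)
}
z_score <- function(x) (x - mean(x, na.rm = TRUE)) / stats::sd(x, na.rm = TRUE)

x <- c(2, 4, 6, 8, 10)
min_max(x)  # 0.00 0.25 0.50 0.75 1.00 (rescaled to the [0, 1] range)
z <- z_score(x)
mean(z)     # 0 (up to floating point error)
sd(z)       # 1
```

Both helpers divide by zero when every value in `x` is identical, which is presumably why the gist checks for more than one unique value before scaling.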
trinker / install_github.R
Last active January 25, 2019 20:59
install_github
## Taken from: http://news.mrdwab.com/install_github.R
## source("https://gist.githubusercontent.com/trinker/8a385892fac7d14a1f861f61394a2621/raw/95bdc672c632eff8dcfcd1f19233a0ed3cf85045/install_github.R")
## install_github("trinker/sentimentr")
#' install_github installs R packages directly from GitHub
#'
#' install_github is a very lightweight function with no dependencies
#' that installs R packages from GitHub.
#'
trinker / fuzzy_map.R
Created January 10, 2019 13:54
Fuzzy Mapping Between Multi-Word Strings. Creates a map between 2 vectors of strings that name the same entities closely but not identically.
fuzzy_map <- function(x, y, distance = 'cosine', cutoff = .40,
    remove = c('the', 'a', 'an', 'at', 'of', 'in', '-', ' - CL', ' - OS'),
    substitution = data.frame(
        pattern = c('univ\\.', '\\bst\\.', '\\bmt\\.', '&'),
        replacement = c('university', 'saint', 'mount', ' and '),
        stringsAsFactors = FALSE
    ),
    ngrams = 1, # choose 1 or 2 (2 is 2 words)
    downweight.words = c('suny', 'cuny', 'college', 'university', 'state'),
    # words that are less important - make smaller to make the words LESS important
# Portraits
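The body of `fuzzy_map` is not visible in the preview, so the following is only a sketch of the general idea behind `distance = 'cosine'` with `ngrams = 1`: compare two names by the cosine similarity of their word counts. The helpers `tokenize` and `cosine_sim` are hypothetical, and the gist may well compute its distance differently (for example with a string-distance package).

```r
## Hypothetical helpers (not the gist's code): word-level cosine similarity
tokenize <- function(s) {
    words <- strsplit(tolower(s), '[^a-z0-9]+')[[1]]
    words[nzchar(words)]  # drop empty pieces left by leading separators
}

cosine_sim <- function(a, b) {
    ta <- tokenize(a)
    tb <- tokenize(b)
    vocab <- union(ta, tb)
    ## Count each vocabulary word in both strings (zeros included)
    va <- as.numeric(table(factor(ta, levels = vocab)))
    vb <- as.numeric(table(factor(tb, levels = vocab)))
    sum(va * vb) / sqrt(sum(va^2) * sum(vb^2))
}

a <- 'State University of New York'
b <- 'New York State University'
1 - cosine_sim(a, b)  # cosine distance, about 0.11
```

Under this sketch the two names fall well inside a cutoff of .40 even though the word order differs and one contains an extra "of". Removing stop words or down-weighting common words such as "university", as the arguments above suggest, would change the counts before the comparison.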