| library(igraph) | |
| # Experiment parameters | |
| n=10 # Primary class (i.e. subreddits) | |
| m=100 # Secondary class (i.e. users) | |
| threshold = .5 # edge threshold | |
| ###################################### | |
| set.seed(123) |
| import praw | |
| import string | |
| import re | |
| import nltk | |
| r = praw.Reddit('anger fuel comment monitor, by /u/shaggorama') | |
| targets = ['hillary', 'trump', 'hilary', 'election'] | |
| punc_pat = re.compile('[' + re.escape(string.punctuation) + ']')  # escape so the backslash is matched as well | |
| blacklist = ['AutoModerator', '2016VoteBot'] |
| #' Try to construct a dynamic graph object from an edgelist with sequential timestamps, to use render.d3movie per: | |
| #' https://rpubs.com/kateto/netviz | |
| #' | |
| #install.packages('statnet') | |
| #install.packages("ndtv") | |
| library(igraph) | |
| library(statnet) | |
| library(ndtv) |
| ''' | |
| Binomial coefficient algorithm tests | |
| Doing this in Python just for basic prototyping, but we'll ultimately need to | |
| port this to Oracle. | |
| ''' | |
| from __future__ import division | |
| import timeit | |
| import math |
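As a point of reference for tests of this kind, here is a minimal sketch comparing two common ways to compute a binomial coefficient: the factorial definition and a multiplicative loop. The function names `binom_factorial` and `binom_multiplicative`, and the timing inputs, are illustrative assumptions and do not come from the original script.

```python
import math
import timeit


def binom_factorial(n, k):
    """C(n, k) straight from the factorial definition."""
    if k < 0 or k > n:
        return 0
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))


def binom_multiplicative(n, k):
    """C(n, k) via the multiplicative formula, avoiding large factorials."""
    if k < 0 or k > n:
        return 0
    k = min(k, n - k)  # exploit symmetry: C(n, k) == C(n, n - k)
    result = 1
    for i in range(1, k + 1):
        # Before this step result == C(n - k + i - 1, i - 1),
        # so result * (n - k + i) is always divisible by i.
        result = result * (n - k + i) // i
    return result


if __name__ == "__main__":
    # Rough timing comparison on hypothetical inputs
    for fn in (binom_factorial, binom_multiplicative):
        elapsed = timeit.timeit(lambda: fn(200, 50), number=1000)
        print(fn.__name__, elapsed)
```

The multiplicative version uses only integer multiplication and division inside a loop, which may make it easier to port to a database procedure than one that relies on a factorial function.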
| def export(self, output_file_name): | |
| """Exports the current optimized pipeline as Python code. | |
| Parameters | |
| ---------- | |
| output_file_name: string | |
| String containing the path and file name of the desired output file | |
| Returns |
| # tpot/operators/KNNc.py | |
| from base import LearnerOperator | |
| from sklearn.neighbors import KNeighborsClassifier | |
| import pandas as pd | |
| class KNNc(LearnerOperator): | |
| def __init__(self): | |
| super(KNNc, self).__init__( | |
| func = KNeighborsClassifier, |
| #' # Constructing a naive Bayes classifier from scratch | |
| #' | |
| #' ## Background: Bayes' rule | |
| #' | |
| #' Recall Bayes' rule: | |
| #' | |
| #' $$P(\theta|X) = \frac{P(X|\theta)P(\theta)}{P(X)}$$ | |
| #' | |
| #' Each component of this formula has a name: | |
| #' |
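To make the formula concrete, here is a small numeric sketch in Python (the tutorial itself is written in R). For a binary hypothesis, the posterior $P(\theta|X)$ is the likelihood $P(X|\theta)$ times the prior $P(\theta)$, divided by the evidence $P(X)$, where the evidence is obtained from the law of total probability. The function name `posterior` and all numbers below are hypothetical, chosen only for illustration.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Posterior P(theta | X) for a binary hypothesis theta via Bayes' rule.

    prior                -- P(theta)
    likelihood           -- P(X | theta)
    likelihood_given_not -- P(X | not theta)
    """
    # Evidence P(X) by the law of total probability
    evidence = likelihood * prior + likelihood_given_not * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical inputs: a rare hypothesis and a fairly informative observation
p = posterior(prior=0.01, likelihood=0.9, likelihood_given_not=0.05)
print(round(p, 4))  # 0.1538
```

Even with a likelihood of 0.9, the posterior stays low here because the prior is small.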
| install.packages('caret') | |
| install.packages('ccd') | |
| install.packages('d3Network') | |
| install.packages('data.table') | |
| install.packages('dplyr') | |
| install.packages('DMwR') | |
| install.packages('e1071') | |
| install.packages('ergm') | |
| install.packages('ff') | |
| install.packages('foreach') |
| #' This is the right way | |
| dense_to_sparse = function(m, binary=FALSE){ | |
| library(Matrix) | |
| xy = which(abs(m)>0, arr.ind=TRUE) | |
| if(binary){ | |
| dense = sparseMatrix(i=xy[,1], j=xy[,2], x=1, dims=dim(m) ) | |
| } else { | |
| dense = sparseMatrix(i=xy[,1], j=xy[,2], x=m[xy], dims=dim(m) ) | |
| } |
| #' Arrival rate of new max values given some generating distribution | |
| #' | |
| #' To do: Wrapper function to perform repeated simulations from same generating distribution. | |
| #' After each iteration, convert output into pairs of (time last max observed, wait time to next max). | |
| #' Spaghetti plot to try to infer a conditional distribution of wait time to next max, given when the | |
| #' last max was observed. Some kind of extreme value distribution, probably Gumbel or Fréchet or something... | |
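The wrapper described in the to-do note could be sketched roughly as follows. This is a Python illustration, not the author's R code: it assumes standard normal draws as the generating distribution, and the names `record_times`, `wait_pairs`, and `simulate` are hypothetical.

```python
import random


def record_times(draws):
    """Return the 1-based indices at which a new running maximum is observed."""
    times = []
    current_max = float("-inf")
    for t, x in enumerate(draws, start=1):
        if x > current_max:
            current_max = x
            times.append(t)
    return times


def wait_pairs(times):
    """Convert record times into (time last max observed, wait to next max) pairs."""
    return [(a, b - a) for a, b in zip(times, times[1:])]


def simulate(n_steps, n_reps, seed=123):
    """Repeat the simulation n_reps times from the same generating distribution."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_reps):
        # Assumed generating distribution: standard normal
        draws = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]
        pairs.extend(wait_pairs(record_times(draws)))
    return pairs


if __name__ == "__main__":
    pairs = simulate(n_steps=1000, n_reps=100)
    print(len(pairs), pairs[:5])
```

As a sanity check on any such simulation: for independent draws from a continuous distribution, the probability that observation t sets a new maximum is 1/t, so the expected number of maxima in n draws is the harmonic number 1 + 1/2 + ... + 1/n.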
| ```{r} | |
| set.seed(123) |