Gabe gdbassett

  • Liberty Mutual
  • US
@gdbassett
gdbassett / MaximumEntropyGraph.py
Created January 8, 2018 00:19
A small set of functions to generate maximum entropy graph hierarchies for aggregating graphs
# Copyright Gabriel Bassett 2018
# Not licensed for reuse
import copy
import operator
import uuid
import networkx as nx
import logging
import pprint
import simplejson as json
gdbassett / crosswalk_github_issues.R
Created December 14, 2017 20:38
A function demonstrating tokenized pagination of GraphQL in R (querying the GitHub API to get vz-risk/VCDB issue IDs to compare to a verisr object)
#' A function to find github issues which may have been closed w/o being added
#'
#' NOTE: Many github issues have 'na' for their issue #, leading to false positives
#' NOTE: Requires ghql package, currently only at https://github.com/ropensci/ghql
#' `devtools::install_github("ropensci/ghql")`
#'
#' @param veris a verisr dataframe
#' @param gh_token a github user token for api access
#' @return a list of github issue numbers not in veris
#' @export
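The token-based pagination this gist describes can be sketched in Python. This is a minimal illustration, not the gist's R code: `paginate`, `make_fake_fetcher`, and the `known_issue_numbers` set are hypothetical names, and the in-memory fetcher stands in for a real HTTP call. The `pageInfo { hasNextPage endCursor }` fields and the `after:` argument follow GitHub's GraphQL connection model; the exact query text is an assumption.

```python
from typing import Callable, Dict, Iterator, List, Optional

# Assumed query shape: `after: $cursor` carries the page token between requests.
ISSUES_QUERY = """
query($cursor: String) {
  repository(owner: "vz-risk", name: "VCDB") {
    issues(first: 100, after: $cursor) {
      nodes { number }
      pageInfo { hasNextPage endCursor }
    }
  }
}
"""


def paginate(fetch_page: Callable[[Optional[str]], Dict]) -> Iterator[int]:
    """Yield issue numbers, following endCursor until hasNextPage is false."""
    cursor = None  # the first request sends no cursor
    while True:
        issues = fetch_page(cursor)
        for node in issues["nodes"]:
            yield node["number"]
        page_info = issues["pageInfo"]
        if not page_info["hasNextPage"]:
            break
        cursor = page_info["endCursor"]


def make_fake_fetcher(numbers: List[int], page_size: int):
    """Hypothetical stand-in for the API: serves `numbers` one page at a time."""
    def fetch(cursor: Optional[str]) -> Dict:
        start = 0 if cursor is None else int(cursor)
        end = start + page_size
        return {
            "nodes": [{"number": n} for n in numbers[start:end]],
            "pageInfo": {"hasNextPage": end < len(numbers), "endCursor": str(end)},
        }
    return fetch


all_numbers = list(paginate(make_fake_fetcher(list(range(1, 8)), page_size=3)))

# Hypothetical set of issue numbers already present in the VERIS data.
known_issue_numbers = {2, 5}
missing = sorted(set(all_numbers) - known_issue_numbers)
```

In a real client, `fetch_page` would POST `ISSUES_QUERY` with the cursor as a variable and return the `issues` object from the response.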
gdbassett / dbir_vega_bar_chart.json
Created December 8, 2017 15:17
A DBIR bar chart in Vega
{
"$schema": "https://vega.github.io/schema/vega/v3.0.json",
"width": 500,
"height": 309,
"autosize": "pad",
"data": [
{
"name": ".",
"format": {
"type": "csv",
gdbassett / flip.R
Created September 27, 2017 19:31
coord_flip for ggvis
#' Flip the x and y axis
#'
#' This is accomplished by updating the x & y marks, flipping the
#' scales, and updating the axis labels.
#'
#' WARNING: This currently works for rectangular layer figures. It may not work with
#' multiple-layer figures, other marks, or signals.
#'
#' WARNING: No tests currently exist for this function
#'
gdbassett / schema_to_graph.py
Last active September 21, 2017 20:55
function to convert
import networkx as nx # NOTE: written against dev networkx 2.0
import logging
import inspect
import json
import os
logger = logging.getLogger()
# logging.FileHandler does not expand "~", so expand it explicitly
fileLogger = logging.FileHandler(os.path.expanduser("~/Documents/Development/tmp/vega.log"))
fileLogger.setLevel(logging.DEBUG)
logger.addHandler(fileLogger)
gdbassett / two_barcharts.json
Last active September 15, 2017 23:30
Two bar charts with the goal of controlling one from another
{
"$schema": "https://vega.github.io/schema/vega-lite/v2.json",
"vconcat": [
{
"data": {
"values": [
{
"enum": "victim.industry2.52",
"x": 471,
"n": 1935,
gdbassett / bayesian_credible_intervals.R
Last active August 15, 2017 18:13
Bayesian credible intervals on VERIS data
# pick an enumeration
enum <- "action.*.variety"
# establish filter criteria (easier than a complex standard-eval filter_ line)
df <- vcdb %>%
dplyr::filter(plus.dbir_year == 2016, subset.2017dbir) %>%
dplyr::filter(attribute.confidentiality.data_disclosure.Yes) %>%
dplyr::filter(victim.industry2.92)
# establish priors from previous year
priors <- df %>%
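The preview cuts off before the interval calculation, so the gist's actual method is not visible here. One common way to get a credible interval for a proportion with a prior taken from earlier data is a beta-binomial update, sketched below in Python. Everything in it is an assumption for illustration: the function name `beta_credible_interval`, the counts, and the choice to estimate quantiles by sampling rather than with an exact inverse CDF.

```python
import random


def beta_credible_interval(successes, trials, prior_a=1.0, prior_b=1.0,
                           level=0.95, draws=20000, seed=0):
    """Equal-tailed credible interval for a proportion under a Beta prior.

    The posterior is Beta(prior_a + successes, prior_b + failures); its
    quantiles are estimated from sorted Monte Carlo draws.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    a = prior_a + successes
    b = prior_b + (trials - successes)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * draws)]
    upper = samples[int((1.0 - tail) * draws) - 1]
    return lower, upper


# Hypothetical counts: the previous year (30 of 100) sets the prior on top of
# a uniform Beta(1, 1); the current year (45 of 120) is the new evidence.
prior_a = 1.0 + 30
prior_b = 1.0 + 70
lower, upper = beta_credible_interval(45, 120, prior_a=prior_a, prior_b=prior_b)
posterior_mean = (prior_a + 45) / (prior_a + prior_b + 120)
```

Using last year's raw counts as the prior weights them equally with this year's data; scaling them down is a reasonable alternative when the older data should count for less.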
gdbassett / livesplit.R
Last active June 26, 2017 20:02
Basic R code to parse LiveSplit splits into a dataframe
speedrun <- XML::xmlParse("/livesplit.lss")
speedrun <- XML::xmlToList(speedrun)
chunk <- do.call(rbind, lapply(speedrun[['Segments']], function(segments) {
segments.df <- do.call(rbind, lapply(segments[['SegmentHistory']], function(segment) {
if ('RealTime' %in% names(segment))
data.frame(`attemptID` = segment$.attrs['id'], RealTime = segment$RealTime)
}))
segments.df$name <- rep(segments$Name, nrow(segments.df))
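The same traversal (each segment, then each history entry that has a `RealTime`, tagged with the segment name) can be sketched in Python with the standard library. This is a rough equivalent, not a port of the gist: the sample XML is made up, the `Run`, `Segment`, and `Time` tag names are assumptions, and only `Segments`, `SegmentHistory`, `RealTime`, `Name`, and the `id` attribute come from the R code above.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the shape the R code implies.
SAMPLE_LSS = """
<Run>
  <Segments>
    <Segment>
      <Name>Level 1</Name>
      <SegmentHistory>
        <Time id="1"><RealTime>00:01:02.5000000</RealTime></Time>
        <Time id="2"><GameTime>00:01:00.0000000</GameTime></Time>
      </SegmentHistory>
    </Segment>
    <Segment>
      <Name>Level 2</Name>
      <SegmentHistory>
        <Time id="1"><RealTime>00:02:10.0000000</RealTime></Time>
      </SegmentHistory>
    </Segment>
  </Segments>
</Run>
"""


def parse_splits(xml_text):
    """Return one row per history entry that recorded a RealTime."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for segment in root.find("Segments"):
        name = segment.findtext("Name")
        history = segment.find("SegmentHistory")
        if history is None:
            continue
        for attempt in history:
            real_time = attempt.findtext("RealTime")
            if real_time is None:
                continue  # skip entries without a RealTime, as the R code does
            rows.append({
                "attemptID": attempt.get("id"),
                "RealTime": real_time,
                "name": name,
            })
    return rows


rows = parse_splits(SAMPLE_LSS)
```

The rows are plain dictionaries, so they can be passed to a dataframe constructor if a tabular form is needed.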
---
title: "Test"
author: "Gabe"
date: "November 03, 2016"
output: html_document
params:
df: data.frame()
a: ""
b: ""
c: "FALSE"
gdbassett / linearKMeans.R
Last active February 27, 2016 16:45
A quick function to produce a k-means-like calculation, but using a line in place of the point centroid. Used to try to classify multiple linear relationships in a dataset.
#' @param df Dataframe with x and y columns. (Hopefully in the future this can be x)
#' @param nlines The number of clusters.
#' @param ab A dataframe with 'slopes' and 'intercepts' columns and one row per initial line. Dimensions must match nlines.
#' @param maxiter The maximum number of iterations to do.
#' @export
#' @examples
linearKMeans <- function(df, ab=NULL, nlines=0, maxiter=1000) {
# default number of lines
nlines_default <- 5
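The preview ends before the function body, so the following Python sketch only illustrates the general idea the description names: k-means with lines in place of point centroids. It assumes points are assigned to the line with the smallest vertical residual and that each line is refit by ordinary least squares; the gist may use a different distance or fit. The names `fit_line` and `linear_k_means` and the sample data are hypothetical.

```python
def fit_line(points):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        return 0.0, mean_y  # all x equal: fall back to a horizontal line
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def nearest_line(point, lines):
    """Index of the line with the smallest vertical residual to `point`."""
    x, y = point
    residuals = [abs(y - (slope * x + intercept)) for slope, intercept in lines]
    return residuals.index(min(residuals))


def linear_k_means(points, initial_lines, maxiter=1000):
    """Alternate between assigning points to lines and refitting the lines."""
    lines = list(initial_lines)
    labels = None
    for _ in range(maxiter):
        new_labels = [nearest_line(p, lines) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for k in range(len(lines)):
            members = [p for p, label in zip(points, labels) if label == k]
            if len(members) >= 2:  # need at least two points to fit a line
                lines[k] = fit_line(members)
    return lines, labels


# Hypothetical data: six points on y = 2x + 1 and six on y = -x + 20.
points = [(x, 2 * x + 1) for x in range(6)] + [(x, -x + 20) for x in range(6)]
lines, labels = linear_k_means(points, [(1.0, 0.0), (0.0, 18.0)])
```

As with ordinary k-means, the result depends on the initial lines, which is presumably why the function accepts starting slopes and intercepts through `ab`.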