
Matt Upson ivyleavedtoadflax

@ivyleavedtoadflax
ivyleavedtoadflax / pyenv.md
Last active November 21, 2017 17:05
How to use pyenv with virtualenvwrapper

Using pyenv with virtualenvwrapper

  • pyenv manages which version of Python you are using
  • virtualenv isolates each project's Python dependencies

They can interact nicely together if you do the following:

Install the required packages (this assumes that you have virtualenv and virtualenvwrapper installed).

Install pyenv and pyenv-virtualenvwrapper
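
Once both are installed, the shell has to load them at startup. The fragment below is a minimal sketch of what that could look like in `~/.zshrc` or `~/.bashrc`, not necessarily the setup this note goes on to describe. It assumes pyenv lives in `$HOME/.pyenv` (the usual default, but an assumption here) and that the pyenv-virtualenvwrapper plugin is installed, which is what provides the `pyenv virtualenvwrapper` subcommand.

```shell
# Assumed install location; adjust if pyenv lives elsewhere.
export PYENV_ROOT="${PYENV_ROOT:-$HOME/.pyenv}"
export PATH="$PYENV_ROOT/bin:$PATH"

# Only initialise when pyenv is actually on PATH, so the shell still starts without it.
if command -v pyenv >/dev/null 2>&1; then
  # Set up pyenv's shims and shell integration.
  eval "$(pyenv init -)"
  # Subcommand from the pyenv-virtualenvwrapper plugin:
  # loads virtualenvwrapper for the currently active Python.
  pyenv virtualenvwrapper
fi
```

With this in place, `mkvirtualenv` and `workon` should use whichever Python version pyenv has selected at the time they are called.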

ivyleavedtoadflax / .zshrc
Last active March 3, 2019 18:15
Ubuntu .zshrc
# If you come from bash you might have to change your $PATH.
# export PATH=$HOME/bin:/usr/local/bin:$PATH
# Add $HOME/bin to PATH for use with YouCompleteMe
# https://github.com/zchee/deoplete-jedi/wiki/Setting-up-Python-for-Neovim
export PATH=$PATH:$HOME/bin
export PATH=$PATH:$HOME/.local/bin
# Path to your oh-my-zsh installation.
ivyleavedtoadflax / ping-csv.sh
Created February 22, 2018 15:31 — forked from dansimau/ping-csv.sh
Ping a host and output each reply in CSV format
#!/bin/bash
#
# Do a ping and output results as CSV.
#
# [email protected]
# 2011-12-23
#
if [ $# -lt 1 ]; then
    echo "Usage: $0 [--add-timestamp] <ping host>"
ivyleavedtoadflax / aws_comprehend.py
Created May 2, 2018 16:01
AWS Comprehend example
# coding: utf-8
# In[5]:
import boto3
import os
ivyleavedtoadflax / init.vim
Last active April 4, 2020 23:25
NeoVim config
call plug#begin('~/.vim/plugged')
Plug 'Valloric/YouCompleteMe'
Plug 'jreybert/vimagit'
Plug 'tpope/vim-fugitive'
Plug 'tpope/vim-unimpaired'
Plug 'w0rp/ale'
Plug 'tpope/vim-sensible'
Plug 'dracula/vim'
ivyleavedtoadflax / customer_tokenizer.py
Last active November 9, 2018 13:28
Custom date tokenizer
from spacy.util import (compile_prefix_regex, compile_infix_regex, compile_suffix_regex)
def _custom_tokenizer(self, nlp, regex=[r"[-/,.\n\s]"]):
    """Custom tokenizer to split date formats like 05-05-2015
    and 05/05/2015
    """
    # Use the default prefixes and suffixes
    prefix_re = compile_prefix_regex(nlp.Defaults.prefixes)
    suffix_re = compile_suffix_regex(nlp.Defaults.suffixes)
ivyleavedtoadflax / spacy_doc_vectors.py
Last active July 7, 2019 10:35
Get document vectors from spacy
# Need to run:
# python -m spacy download en
# from console first to get the model
import spacy
import pandas as pd
nlp = spacy.load("en")
ivyleavedtoadflax / bash_file_iterate.sh
Created August 6, 2019 15:19
Iterate through a bash file and do some things
for i in raw/*.json;
do
    # Create new filename
    filename=$(basename -- "$i")
    extension="${filename##*.}"
    filename="${filename%.*}"
    new_filename=processed/refs_${filename}.txt
.DEFAULT_GOAL := files
MATCH_PATH := s3://datalabs-dev/reach-airflow/output/match_annotated_titles
EVAL_PATH := s3://datalabs-dev/reach-airflow/output/policy-test/evaluation/results
eval = evaluation-results.json
PRODIGY_PATH = s3://datalabs-data/reach_evaluation/data/sync
prodigy = 2019.10.8_valid_TITLE.jsonl \
ivyleavedtoadflax / find_overlapping_spans.py
Last active February 24, 2020 19:32
Find overlapping spans in prodigy documents
# coding: utf-8
import itertools
from itertools import groupby
from operator import itemgetter
from pprint import PrettyPrinter
import plac
from deep_reference_parser.io import read_jsonl, write_jsonl