@swayson
swayson / search_pandas.py
Created June 8, 2015 07:16
Basic snippet for searching items
import pandas as pd

def search_item(dataframe, name, query, na=False, case=False, regex=True):
    # Boolean mask aligned to the dataframe's index; starts with no rows selected
    idx = pd.Series(False, index=dataframe.index)
    # For each item in the query, look for the item and collect the document ids it pertains to
    for q in query:
        matches = dataframe[name].str.contains(q, na=na, case=case, regex=regex)
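The preview stops partway through the loop. A minimal sketch of how such a search might be completed, assuming that `name` is the column to search, that the per-query matches are combined with a logical OR, and that the matching rows are returned. The function name `search_item_sketch` and the sample data are hypothetical and not part of the gist.

```python
import pandas as pd


def search_item_sketch(dataframe, name, query, na=False, case=False, regex=True):
    """Return the rows whose `name` column matches any pattern in `query`."""
    # Start with no rows selected; keep the mask aligned to the dataframe's index.
    mask = pd.Series(False, index=dataframe.index)
    for q in query:
        matches = dataframe[name].str.contains(q, na=na, case=case, regex=regex)
        # Assumption: a row is kept if it matches at least one query item.
        mask = mask | matches
    return dataframe[mask]


# Hypothetical sample data for illustration only.
docs = pd.DataFrame({"text": ["Apple pie", "banana bread", None, "Cherry tart"]})
result = search_item_sketch(docs, "text", ["apple", "cherry"])
print(result)
```

With `case=False` the search ignores case, and `na=False` treats missing values as non-matches instead of propagating them into the mask.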
swayson / csvkit-eg.md
Last active February 8, 2024 20:06
CSVKit Examples

1. Ditch Excel (for real)

    in2csv file1.xls > file1.csv

2. Conquer fixed-width formats

    in2csv -f fixed -s schema.csv data.fixed > data.csv

3. Find cells matching a regular expression

    csvgrep -c phone_number -r "\d{3}-123-\d{4}" data.csv > matching.csv
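
For comparison, the same kind of column filter can be sketched with Python's standard library. This is only an illustration of the idea, not how csvkit works internally; the helper `grep_column` and the sample rows are hypothetical, and the pattern is assumed to match anywhere in the cell.

```python
import csv
import io
import re


def grep_column(rows, column, pattern):
    """Keep the rows whose `column` value contains a match for `pattern`."""
    regex = re.compile(pattern)
    return [row for row in rows if regex.search(row[column] or "")]


# Hypothetical sample data standing in for data.csv.
SAMPLE = """name,phone_number
Ann,555-123-4567
Bob,555-987-6543
Cy,212-123-0000
"""

reader = csv.DictReader(io.StringIO(SAMPLE))
matching = grep_column(reader, "phone_number", r"\d{3}-123-\d{4}")
for row in matching:
    print(row["name"], row["phone_number"])
```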

---
title: 'Going deeper with dplyr: New features in 0.3 and 0.4'
output: html_document
---
## Introduction
In August 2014, I created a [40-minute video tutorial](https://www.youtube.com/watch?v=jWjqLW-u3hc) introducing the key functionality of the dplyr package in R, using dplyr version 0.2. Since then, there have been two significant updates to dplyr (0.3 and 0.4), introducing a ton of new features.
This document (created in March 2015) covers the most useful new features in 0.3 and 0.4, as well as other functionality that I didn't cover last time (though it is not necessarily new). My [new video tutorial](https://www.youtube.com/watch?v=2mh1PqfsXVI) walks through the code below in detail.
---
title: "Introduction to dplyr for Faster Data Manipulation in R"
output: html_document
---
Note: There is a 40-minute [video tutorial](https://www.youtube.com/watch?v=jWjqLW-u3hc) on YouTube that walks through this document in detail.
## Why do I use dplyr?
* Great for data exploration and transformation
swayson / mount-vb.md
Last active August 29, 2015 14:17
How to mount a VirtualBox Shared Folder

OK, this was a little confusing for me, but I finally realized what was happening, so I decided to give my 2 cents in hopes that it will be clearer for others, and for me if I forget sometime in the future. :)

I was not using the name of the share I created in the VM. Instead I used share or vb_share when the name of my share was actually wd, which had me confused for a minute.

First, add your share directory in the VM Box.

Whatever you name your share here will be the name you need to use when mounting in the VM guest OS. For example, I named mine "wd" for my Western Digital Passport drive.

Next, on the guest OS, make a directory to use for your mount, preferably in your home directory.
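
A minimal sketch of this step and the mount that follows it, assuming the share is named "wd" as in this note and a hypothetical mount directory `~/share`. It relies on the `vboxsf` filesystem type, which is only available once the Guest Additions are installed in the guest. The helper names are made up for illustration, and by default the script only prints the command instead of running it.

```python
import subprocess
from pathlib import Path

SHARE_NAME = "wd"  # the share name chosen in the VirtualBox settings
MOUNT_DIR = Path.home() / "share"  # hypothetical mount point in the home directory


def build_mount_command(share_name, mount_dir):
    """Return the argv list that mounts a VirtualBox shared folder."""
    return ["sudo", "mount", "-t", "vboxsf", share_name, str(mount_dir)]


def mount_share(share_name=SHARE_NAME, mount_dir=MOUNT_DIR, dry_run=True):
    """Create the mount directory, then print (or run) the mount command."""
    mount_dir = Path(mount_dir)
    mount_dir.mkdir(parents=True, exist_ok=True)
    command = build_mount_command(share_name, mount_dir)
    if dry_run:
        print(" ".join(command))
    else:
        subprocess.run(command, check=True)
    return command
```

Calling `mount_share()` shows the command that would be run; `mount_share(dry_run=False)` actually mounts the share. The first argument after `vboxsf` must be the share name from the VirtualBox settings, not the host path.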

swayson / st3-project-settings.json
Created March 14, 2015 17:05
Sublime Text 3 Project configuration for Anaconda and alike
// (Project -> Edit Project)
{
    "build_systems":
    [
        {
            "name": "Anaconda Python Builder",
            "selector": "source.python",
            "shell_cmd": "python -u \"$file\""
        }
    ],
library(ggplot2)
library(gtable)
# create example data
set.seed(42)
dataset_names <- c("Human", "Mouse", "Fly", "Worm")
datasets <- data.frame(name = factor(dataset_names, levels=dataset_names), parity = factor(c(0, 0, 1, 0)), v50 = runif(4, max=0.5), y=1:4)
data <- data.frame( dataset1 = rep(datasets$name, 4), dataset2 = rep(datasets$name, each = 4), z = runif(16,min = 0, max = 0.5) )
pal <- c("#dddddd", "#aaaaaa")
import os
from multiprocessing import Pool
from PIL import Image

SIZE = (75, 75)
SAVE_DIRECTORY = 'thumbs'

def get_image_paths(folder):
    # Lazily yield the path of every file in `folder` whose name contains 'jpeg'
    return (os.path.join(folder, f)
            for f in os.listdir(folder)
            if 'jpeg' in f)
swayson / install-guest-additions.txt
Created March 6, 2015 20:30
Installing Guest Additions on Ubuntu
Follow these steps to install the Guest Additions on your Ubuntu virtual machine:
1. Log in as ubuntu;
2. Click on Applications/System/Terminal (or on Applications/Terminal, if you are using the 6.06.1 Dapper Drake release);
3. Update your APT database with sudo apt-get update, typing your password if requested, then install the latest security updates with sudo apt-get upgrade;
4. Install required packages with sudo apt-get install build-essential module-assistant;
5. Configure your system for building kernel modules by running sudo m-a prepare;
6. Click on Install Guest Additions… from the Devices menu, then choose to browse the content of the CD when requested.
7. Run sudo sh /media/cdrom/VBoxLinuxAdditions.run, and follow the instructions on screen.