tgjc’s gists

tgjc / useful_pandas_snippets.py

Created June 11, 2018 15:18 — forked from bsweger/useful_pandas_snippets.md

Useful Pandas Snippets

	# List unique values in a DataFrame column
	# h/t @makmanalp for the updated syntax!
	df['Column Name'].unique()

	# Convert Series datatype to numeric (will error if column has non-numeric values)
	# h/t @makmanalp
	pd.to_numeric(df['Column Name'])

	# Convert Series datatype to numeric, changing non-numeric values to NaN
	# h/t @makmanalp for the updated syntax!
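
The preview cuts off before the code for this last snippet. A minimal sketch of the conversion the comment describes, assuming it uses the `errors="coerce"` option of `pd.to_numeric`; the sample DataFrame is hypothetical:

```python
import pandas as pd

# Hypothetical sample data; 'Column Name' mirrors the placeholder used above.
df = pd.DataFrame({"Column Name": ["1", "2.5", "abc", None]})

# errors="coerce" turns any value that cannot be parsed into NaN
# instead of raising an exception.
df["Column Name"] = pd.to_numeric(df["Column Name"], errors="coerce")
```

After the call the column holds floats, with NaN wherever the original value was not numeric.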

tgjc / gist:3f62450ae0f7e54e115eb26a93a05fb3

Created August 17, 2016 04:46

R dplyr: rename variables using string functions

	DATA %>% rename_(.dots=setNames(names(.), tolower(gsub("FROM", "TO", names(.)))))

	## Taken from: http://stackoverflow.com/questions/30382908/r-dplyr-rename-variables-using-string-functions
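
For comparison, the same rename-by-string-function idea can be sketched in pandas by passing a function to `DataFrame.rename`. The frame and the helper name `retitle` are hypothetical, chosen only to mirror the R one-liner above:

```python
import pandas as pd

# Hypothetical frame standing in for DATA.
data = pd.DataFrame({"FROM_ID": [1], "FROM_NAME": ["a"], "Other": [2]})


def retitle(name: str) -> str:
    # Replace "FROM" with "TO", then lower-case the result,
    # mirroring tolower(gsub("FROM", "TO", names(.))).
    return name.replace("FROM", "TO").lower()


# rename() applies the function to every column label.
data = data.rename(columns=retitle)
```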

tgjc / not_in_lms.R

Created March 18, 2016 02:36

Script to read in HR data and filter using anti-joins


	# Notes:
	# 1. rename staff number field in each spreadsheet to "emp_id"
	# 2. When adding new spreadsheets, convert emp_id class to character to allow anti_join() to work
	# 3. if time allows, automate cleaning of column names and drop unused from hr-list.csv



	library(dplyr)
	library(readr)

tgjc / string-split.txt

Created September 2, 2015 22:40

Why I hate Excel

	##Split text in the form "surname, firstname" and concatenate
	##to (firstname surname)

	##Surname
	=LEFT(TEXT,(FIND(",",TEXT)-1))

	##First name
	=RIGHT(TEXT,LEN(TEXT)-FIND(",",TEXT)-1)

	##Concatenate
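
For contrast with the formulas above, the same split and reorder can be sketched in a few lines of Python. The function name `first_last` is hypothetical, and the sketch assumes every input contains a comma:

```python
def first_last(text: str) -> str:
    # Split once on the first comma, then trim whitespace around each part.
    # Raises ValueError if the text contains no comma.
    surname, first_name = (part.strip() for part in text.split(",", 1))
    return f"{first_name} {surname}"
```

For example, `first_last("Smith, Jane")` gives `"Jane Smith"`.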

tgjc / gantt.R

Last active August 29, 2015 14:23

Simple R script for parsing a vector of date-times and returning dates only

	# Load lubridate for dmy(), day(), month() and year()
	library(lubridate)

	# Read data and extract start / finish columns
	setwd("/Volumes/emr/Testing/")

	raw_file <- read.csv("theo_Epic GANTT 20150605.csv")
	raw_start <- raw_file[,"Start"]
	raw_finish <- raw_file[,"Finish"]

	# Parse date-time and remove times
	raw_start2 <- dmy(raw_start)
	new_start <- paste(day(raw_start2), month(raw_start2), year(raw_start2), sep = "/")
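
The parse-then-drop-the-time step could look roughly like this in pandas. The sample values and the `%d/%m/%Y %H:%M` input format are assumptions, since the contents of the CSV are not shown:

```python
import pandas as pd

# Hypothetical date-time strings; the real "Start" column format is not shown.
raw_start = pd.Series(["05/06/2015 09:30", "29/08/2015 14:23"])

# Parse as day-first date-times using an explicit (assumed) format.
parsed = pd.to_datetime(raw_start, format="%d/%m/%Y %H:%M")

# Rebuild day/month/year strings without zero padding, like paste(..., sep = "/").
new_start = (
    parsed.dt.day.astype(str)
    + "/"
    + parsed.dt.month.astype(str)
    + "/"
    + parsed.dt.year.astype(str)
)
```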