Nassim Haddad nassimhaddad

View GitHub Profile
@nassimhaddad
nassimhaddad / digest.R
Created January 19, 2013 08:49
Use the digest package to create a hash from any R object
library(digest)
test <- c("hobe", "jmjj", 1)
digest(test, algo = "md5")
digest(test, algo = "sha1")
digest(test, algo = "crc32") # checksum only, not collision resistant
digest(test, algo = "sha256")
digest(test, algo = "sha512")
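By default `digest()` hashes the serialized R object, not the raw characters, so its result will not match the hash another tool reports for the same string. A minimal sketch of the difference, assuming the digest package is installed (the variable names are illustrative):

```r
library(digest)

# Default: the R object is serialized first, then hashed
obj_hash <- digest("hello", algo = "md5")

# serialize = FALSE hashes the raw characters of the string instead
raw_hash <- digest("hello", algo = "md5", serialize = FALSE)

raw_hash              # "5d41402abc4b2a76b9719d911017c592", the standard MD5 of "hello"
obj_hash == raw_hash  # FALSE
```

Use `serialize = FALSE` when the hash has to agree with one computed outside R.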
nassimhaddad / import_json.R
Last active December 11, 2015 17:38
Import a JSON file
# install.packages("rjson")
library("rjson")
json_file <- "json_file.json"
json_data <- fromJSON(paste(readLines(json_file), collapse=""))
# optional: combine the list of records into a single data frame
library(plyr)
json_data <- lapply(json_data, as.data.frame)
json_data <- do.call(rbind.fill, json_data)
nassimhaddad / levenshtein.R
Created January 26, 2013 09:55
String matching, distance between two strings. Works particularly well to detect retweets or tweet variations.
### string matching
### metric to find the similarity between two strings
### some context in:
### http://en.wikipedia.org/wiki/String_metric
### testing levenshtein metric
library(RecordLinkage)
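The preview stops after loading RecordLinkage. As a rough illustration of the same idea without that package, base R's `adist()` returns the Levenshtein edit distance. The `lev_sim()` helper below is a hypothetical name, not part of the gist, and normalizes the distance by the longer string's length:

```r
# Levenshtein distance with base R (utils::adist), used here instead of RecordLinkage
d <- utils::adist("kitten", "sitting")[1, 1]
d  # 3

# Hypothetical helper: similarity in [0, 1], where 1 means identical strings
lev_sim <- function(s1, s2) {
  1 - utils::adist(s1, s2)[1, 1] / max(nchar(s1), nchar(s2))
}

lev_sim("hello", "hello")  # 1
# A tweet and a retweet of it differ only by the "RT @user: " prefix
lev_sim("Great talk on R today", "RT @user: Great talk on R today")
```

A threshold on the similarity (chosen by inspecting real data) can then flag likely retweets or near-duplicates.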
nassimhaddad / read.xls.R
Last active December 11, 2015 18:28
Read Excel sheets
#' best package to read Excel files is gdata,
#' which works with both .xls and .xlsx (read.xls needs Perl installed)
#' Windows: follow the instructions here:
#' http://cran.r-project.org/web/packages/gdata/INSTALL
library(gdata)
xlsx_file <- "myfile.xls"
sheet1 <- read.xls(xlsx_file,
sheet = "Sheet1",
stringsAsFactors = FALSE,
nassimhaddad / non-ascii.R
Created January 26, 2013 18:13
Remove non-ASCII characters
# remove non-ASCII characters (keeps only printable ASCII, \x20 to \x7E)
df$text <- gsub("[^\x20-\x7E]", "", df$text)
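A small sketch of what the pattern does, using a made-up `df` (the sample strings are assumptions for illustration). Note that the range starts at the space character, so tabs and newlines are stripped as well:

```r
# Hypothetical sample data; \u escapes keep this script itself ASCII-only
df <- data.frame(text = c("caf\u00e9 au lait", "a\tb"), stringsAsFactors = FALSE)

# Keep only characters from space (\x20) to tilde (\x7E)
df$text <- gsub("[^\x20-\x7E]", "", df$text)

df$text  # "caf au lait" "ab"
```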
nassimhaddad / branch_workflow.md
Last active December 11, 2015 20:39
Git: commands needed to create a new repository in RStudio and then upload it to Bitbucket
nassimhaddad / regex.R
Last active December 12, 2015 02:48
Tricks with regular expressions
# tricks with regular expressions ####
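# insert a character after a given pattern: capture the pattern in (...) and
# use \\1 in the replacement; here "\\15" means group 1 followed by a literal 5
sub("([[:digit:]]{4})", "\\15", "12346789") # "123456789"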
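A few more backreference patterns in the same spirit, as a sketch with made-up inputs (the variable names are illustrative only):

```r
# "\\1" is the first parenthesised group; text placed after it is inserted
with_dash <- sub("([[:digit:]]{4})", "\\1-", "12346789")
with_dash  # "1234-6789"

# gsub() applies the replacement to every match, not just the first
grouped <- gsub("([[:digit:]]{3})", "\\1 ", "123456789")
grouped    # "123 456 789 "

# Two groups can be swapped by reordering the backreferences
swapped <- sub("(\\w+) (\\w+)", "\\2 \\1", "hello world")
swapped    # "world hello"
```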
nassimhaddad / gist:4708636
Created February 4, 2013 18:41
Reorder factor levels: change the order of the levels without changing the values
df$g <- factor(df$g, levels = letters[4:1])
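A quick check, on a made-up `df`, that rebuilding the factor with `levels =` reorders the levels while every observation keeps its value. Assigning to `levels(df$g)` directly would instead relabel the data:

```r
# Hypothetical sample data
df <- data.frame(g = c("a", "c", "b", "d", "a"), stringsAsFactors = TRUE)
levels(df$g)  # "a" "b" "c" "d"

values_before <- as.character(df$g)

# Reverse the level order; the stored values are untouched
df$g <- factor(df$g, levels = letters[4:1])
levels(df$g)  # "d" "c" "b" "a"

identical(as.character(df$g), values_before)  # TRUE
```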
nassimhaddad / r2html_into_variable.R
Created February 7, 2013 12:01
Generate HTML code with R
library(R2HTML)
# set up a temporary file to store the code
fileName <- 'temp.html'
.HTML.file = file.path(getwd(), fileName)
# make a title
HTML(as.title("Title of my report"), append = FALSE)
# add space
HTMLhr()
nassimhaddad / CCustomCumsum.R
Created February 11, 2013 19:54
Custom "filter": a cumulative sum with a multiplier, written in inline C to avoid a slow R-level loop
# looping through vector ####
library(inline)
sign <- signature(x="numeric", n="integer", d="numeric")
code <- "
for (int i=1; i < *n; i++) {
x[i] = x[i-1]*d[0] + x[i];
}"
c_fn <- cfunction(sign,
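The preview is cut off at this point. For reference, the recurrence in the C loop, `x[i] = x[i-1]*d + x[i]`, can also be written in plain R. The sketch below is not the gist's code: it compares an explicit loop with `stats::filter(method = "recursive")`, which computes the same recurrence, on made-up inputs:

```r
x <- c(1, 2, 3, 4)  # hypothetical input vector
d <- 0.5            # hypothetical multiplier

# Explicit loop, mirroring the C code (R indices start at 1, so begin at 2)
y_loop <- x
for (i in seq_along(x)[-1]) {
  y_loop[i] <- y_loop[i - 1] * d + y_loop[i]
}

# Recursive filter: y[i] = x[i] + d * y[i - 1], with y[0] taken as 0
y_filter <- as.numeric(stats::filter(x, d, method = "recursive"))

y_loop                        # 1.000 2.500 4.250 6.125
all.equal(y_loop, y_filter)   # TRUE
```

Either version is handy for checking the output of the compiled function on small inputs.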