bigcorPar <- function(x, nblocks = 10, verbose = TRUE, ncore="all", ...){
library(ff, quietly = TRUE)
library(doMC)
## Use every available core unless a specific count is requested.
## detectCores() is taken from the parallel package, which superseded multicore.
if(ncore=="all"){
ncore <- parallel::detectCores()
}
registerDoMC(cores = ncore)
R <- c(2000, 5000, 10000, 20000, 40000)
## I hit the limit at ~50000: the ff function refuses to create the matrix.
# Error in if (length < 0 || length > .Machine$integer.max) stop("length must be between 1 and .Machine$integer.max") :
# missing value where TRUE/FALSE needed
# http://www.bytemining.com/2010/05/hitting-the-big-data-ceiling-in-r/
normal <- numeric(length=length(R))
for(i in seq_along(R)){
split <- if (R[i] <= 20000) 10 else 20
MAT <- matrix(rnorm(R[i] * 10), nrow = 10)
normal[i] <- system.time(res <- bigcor(MAT, nblocks = split, verbose=FALSE))[3]
library(RMySQL)
con <- dbConnect(MySQL(), group='toto', dbname="user_profile")
m <- dbGetQuery(con, "SELECT DISTINCT user_id FROM demographics WHERE gender='0'")
dbDisconnect(con)
bobthecat / Dice_Rcpp.Rmd
Created September 10, 2013 13:22
Rcpp implementation of the Dice coefficient
---
title: Dice coefficient with RcppEigen
author: David Ruau
license: GPL (>= 2)
tags: Rcpp RcppEigen
summary: Compute the Dice coefficient (1945) between columns of a matrix.
---
The Dice coefficient is a simple measure of similarity or dissimilarity, depending on how you read it. It is intended for comparing asymmetric binary vectors, meaning that one of the combinations (usually 0-0) is not important and agreements (1-1 pairs) carry more weight than disagreements (1-0 or 0-1 pairs). Imagine the following contingency table: