This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
---
title: "Data/Structure Validation"
author: "agstudy"
---
This post is an answer to an [SO question](http://stackoverflow.com/questions/30844363/data-structure-validation-for-r#comment49735091_30844363) about creating a well-typed data structure in R.

I think that in R the only way to define a typed data structure is to use an `S4 class`. I should note that even S4 classes are *not strongly typed*, since you can define a slot as `list`.
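
As a minimal sketch of that point, the example below defines an S4 class with typed slots and a validity function, then shows the `list` loophole. The class names `Person` and `Bag` and their slots are hypothetical, chosen only for illustration; they do not come from the original question.

```r
library(methods)  # S4 machinery (attached by default in interactive sessions)

# "Person" and its slots are hypothetical names used only for illustration.
setClass(
  "Person",
  representation(name = "character", age = "numeric"),
  validity = function(object) {
    if (length(object@name) != 1L) {
      return("'name' must be a single string")
    }
    if (length(object@age) != 1L || is.na(object@age) || object@age < 0) {
      return("'age' must be a single non-negative number")
    }
    TRUE
  }
)

# A correctly typed object is created without complaint.
p <- new("Person", name = "Ann", age = 30)

# A wrongly typed slot is rejected when the object is constructed.
bad_type <- tryCatch(new("Person", name = "Ann", age = "thirty"),
                     error = function(e) "rejected")

# A correctly typed value that fails the validity check is rejected too.
bad_value <- tryCatch(new("Person", name = "Ann", age = -1),
                      error = function(e) "rejected")

# A slot declared as "list" accepts elements of any type: the loophole.
setClass("Bag", representation(items = "list"))
b <- new("Bag", items = list(1, "a", TRUE))
```

Slot types are checked by `new()` itself, while the `validity` function adds constraints that a type alone cannot express. Nothing checks the contents of the `list` slot unless a validity function is written for it.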
sactter.grid <- function (x, y = NULL, z = NULL, color = par("col"), pch = NULL,
    main = NULL, sub = NULL, xlim = NULL, ylim = NULL, zlim = NULL,
    xlab = NULL, ylab = NULL, zlab = NULL, scale.y = 1, angle = 40,
    axis = TRUE, tick.marks = TRUE, label.tick.marks = TRUE,
    x.ticklabs = NULL, y.ticklabs = NULL, z.ticklabs = NULL,
    y.margin.add = 0, grid = TRUE, box = TRUE, lab = par("lab"),
    lab.z = mean(lab[1:2]), type = "p", highlight.3d = FALSE,
    mar = c(5, 3, 4, 3) + 0.1, bg = par("bg"), col.axis = par("col.axis"),
    col.grid = "grey", col.lab = par("col.lab"), cex.symbols = par("cex"),
    cex.axis = 0.8 * par("cex.axis"), cex.lab = par("cex.lab"),
<?xml version="1.0" encoding="utf-8" ?>
<root>
  <div class="BVRRReviewTitleContainer">
    <span class="BVRRLabel BVRRReviewTitlePrefix"></span>
    <h2>
      <span itemprop="name" class="BVRRValue BVRRReviewTitle">Perfect size for the kids and durable</span>
    </h2>
    <span class="BVRRLabel BVRRReviewTitleSuffix">, </span>
  </div>
  <div class="BVRRReviewDateContainer">
/* package whatever; // don't place package name! */
import java.util.*;
import java.lang.*;
import java.awt.Point;
import java.io.*;
/* Name of the class has to be "Main" only if the class is public. */
public class Maze
{
Item-based Collaborative Filtering
========================================================

The task is to use the 3003377s-offsides.tsv file of drugs and side effects (url below) with the R **recommenderlab library** or similar to write code for Item-based Collaborative Filtering, which means:

* we can identify drugs with similar side effects
* we can possibly predict unknown side effects by averaging the relative risk of side effects over the K nearest neighbors under some similarity metric.

Basically this means building a matrix whose cells are relative risk (the “rr” or “log2rr” column), whose columns are side effects and whose rows are drugs (use either the “stitch_id” or “drug” column). We can then randomly omit cell values and test which distance metric or transformation of relative risk predicts the omitted values most accurately.
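
The idea described above can be sketched in base R on a toy table. This is only an illustration under stated assumptions: the data values are invented and are not from the offsides file, the helpers `cosine` and `predict_cell` are hypothetical names, cosine similarity is just one possible metric, and base R stands in for recommenderlab.

```r
# Toy long-format data; the values are invented for illustration only.
long <- data.frame(
  drug   = c("A", "A", "A", "B", "B", "B", "C", "C", "C", "D", "D"),
  effect = c("nausea", "rash", "headache",
             "nausea", "rash", "headache",
             "nausea", "rash", "headache",
             "nausea", "rash"),
  log2rr = c(1.0, 2.0, 3.0,
             1.1, 2.1, 2.9,
             3.0, 0.5, 0.2,
             0.9, 2.2),
  stringsAsFactors = FALSE
)

# Rows are drugs, columns are side effects, cells are log2 relative risk.
# Drug/effect pairs that are absent from the table become NA.
m <- tapply(long$log2rr, list(long$drug, long$effect), mean)

# Cosine similarity computed over the side effects known for both drugs.
cosine <- function(x, y) {
  ok <- !is.na(x) & !is.na(y)
  if (!any(ok)) return(NA_real_)
  sum(x[ok] * y[ok]) / sqrt(sum(x[ok]^2) * sum(y[ok]^2))
}

# Predict one missing cell as the mean of that side effect over the
# k most similar drugs for which the side effect is known.
predict_cell <- function(m, drug, effect, k = 2) {
  others <- setdiff(rownames(m)[!is.na(m[, effect])], drug)
  sims <- sapply(others, function(o) cosine(m[drug, ], m[o, ]))
  nearest <- head(names(sort(sims, decreasing = TRUE)), k)
  mean(m[nearest, effect])
}

pred <- predict_cell(m, "D", "headache", k = 2)
```

To compare metrics or transformations, the same function can be run on cells that were deliberately set to `NA`, and the predictions scored against the held-out values.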
Reading data and levelplot
## read data
ll <- readLines(textConnection('28/03/13 FACTURE CARTE DU 240313 PHOTOBOX SARTROUVILLE CARTE -30
25/03/13 2EME TIERS SUR FACTURE DE 180 EUR DU 240113 CAROLL INTL PARI -60
25/03/13 RETRAIT DAB 24/03/13 11H15 166936 BNP PARIBAS PARIS -200.50
18/03/13 PRELEVEMENT TRESOR PUBLIC 94 IMPOT NUM 005002 ECH 18 -500'))
## convert data to a data.frame
dat <- do.call(rbind,strsplit(gsub('[ ]{2,}','|',ll),'\\|'))
## name columns
colnames(dat) <- c('date','category','desc','amount')
dat <- as.data.frame(dat)
##########################################
extra.calendarHeat <- function(dates,
                               values,
                               ncolors=99,
                               color="b2w",
                               pch.symbol = 15:20,
                               cex.symbol =2,
                               col.symbol ="#00000044",
                               pvalues=values,
transformdata <- function(dates, values, date.form = "%Y-%m-%d", ...) {
  require(lattice)
  require(grid)
  require(chron)
  ## class() can return more than one value, so test the type directly
  if (is.character(dates) || is.factor(dates)) {
    dates <- strptime(dates, date.form)
  }
  caldat <- data.frame(value = values, dates = dates)
  min.date <- as.Date(paste(format(min(dates), "%Y"),
calendar.division <- function(...)
{
  xyetc <- list(...)
  subs <- dat[xyetc$subscripts,]
  dates.fsubs <- dat[dat$yr == unique(subs$yr),]
  y.start <- dates.fsubs$dotw[1]
  y.end <- dates.fsubs$dotw[nrow(dates.fsubs)]
  dates.len <- nrow(dates.fsubs)
  adj.start <- dates.fsubs$woty[1]