load("paintings_train.Rdata")
load("paintings_test.Rdata")
load("paintings_validation.Rdata")
paintings_full <- rbind(paintings_train, paintings_test)
paint_full <- clean_data(paintings_full) %>% distinct()
paint_valid <- clean_data(paintings_validation)
main.lm <- lm(logprice ~ dealer + year + Interm + origin_cat + endbuyer +
library(xgboost)
library(BayesTree)
library(mice)
clean_data_impute <- function(df) {
  preds_remove <- c("sale", "author", "price", "authorstyle",
                    "count", "Surface_Rect", "Surface_Rnd",
                    "diff_origin", "singlefig", "lot")
  preds_num <- c("position", "year", "logprice", "Height_in",
train <- paint_train %>% select(-logprice)
train.y <- paint_train %>% pull(logprice)
data.new <- paint_test %>% select(-logprice) %>% data.matrix()
xgb.fit <- xgboost(data = data.matrix(train),
                   label = train.y,
                   # "reg:linear" is deprecated in recent xgboost releases
                   objective = "reg:squarederror",
                   eval_metric = "rmse",
                   max_depth = 10,
                   eta = 0.05,
                   nrounds = 75,
library(tidyverse)
set.seed(1)
# data frame of every (x, y) combination of 1:5, with random values
df <- data.frame(
  expand.grid(x = 1:5, y = 1:5),
  val = rnorm(25))
# spreading from long data to square (wide) data is straightforward
df %>% spread(y, val)
Butcher migas live-edge gentrify, pabst edison bulb pug. Edison bulb hot chicken iPhone, humblebrag slow-carb pitchfork hell of leggings umami viral. Paleo butcher kale chips kinfolk wolf offal. Flexitarian pour-over polaroid semiotics fam portland beard put a bird on it hoodie iceland bitters taxidermy direct trade master cleanse echo park. Wolf waistcoat mumblecore, pop-up helvetica sustainable leggings taiyaki sartorial. Seitan subway tile mustache marfa. Poke hell of vegan banh mi lomo bitters cred. Post-ironic yuccie cold-pressed, next level taiyaki wolf blog activated charcoal hella gochujang lumbersexual butcher kitsch banjo crucifix. Blog iPhone brunch chartreuse hell of taiyaki pabst chicharrones af taxidermy single-origin coffee umami selvage occupy. Green juice lyft franzen taxidermy. Yuccie quinoa pok pok banh mi. Truffaut scenester woke retro dreamcatcher venmo letterpress affogato celiac hot chicken portland. Hashtag irony chia, retro flannel glossier chambray crucifix single-origin coffee stree
for (i in x) {
  # Gaussian kernel weights centred on the current point i
  wt <- dnorm(x - i, 0, scale)
  plt <- ggplot(df, aes(x, y, col = wt)) +
    geom_point(size = pmax(100 * wt, 1)) +
    geom_line(data = smoother[x <= i, ], aes(x, y), col = "black") +
    geom_point(data = smoother[x == i, ], aes(x, y), size = 3,
               col = "black", shape = 21, fill = "white") +
    scale_color_gradient(low = "dark blue", high = "red") +
    theme(legend.position = "none")
  ggsave(plt, filename = paste0("kernel_", i, ".png"))
library(pracma)
# Convert the ksmooth bandwidth h into the standard deviation of the
# equivalent normal kernel (quartiles at +/- 0.25 * h)
scale <- abs((erfinv(-0.5) * sqrt(2) * 4 / h)^-1)
wt <- dnorm(x - 50, 0, scale)
ggplot(df, aes(x, y, col = wt)) +
  geom_point(size = pmax(100 * wt, 1)) +
  geom_line(data = smoother, aes(x, y), col = "black") +
  geom_point(data = smoother[x == 50, ], aes(x, y), size = 3,
             col = "black", shape = 21, fill = "white") +
library(ggplot2)
set.seed(1)
x <- 1:100
y <- x^2 * sin(2 * pi * x / 100) + 500 * rnorm(length(x))
df <- data.frame(x, y)
h <- 12
smoother <- data.frame(ksmooth(x, y, "normal", bandwidth = h, n.points = 100))
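
The snippet above smooths with `ksmooth()` and a bandwidth `h`. As a rough sketch of what that bandwidth means, the block below recomputes the smoother by hand as a weighted average with Gaussian weights. It assumes, following `?ksmooth`, that the "normal" kernel is scaled so its quartiles sit at ±0.25 × bandwidth. The names `kernel_sd`, `nw` and `fit` are illustrative and not part of the original code. The two curves should agree closely but need not match exactly, since `ksmooth()` may cut off the far tails of the kernel.

```r
# Same simulated data as in the snippet above
set.seed(1)
x <- 1:100
y <- x^2 * sin(2 * pi * x / 100) + 500 * rnorm(length(x))
h <- 12

# Assumption: kernel quartiles at +/- 0.25 * h, so the standard deviation
# of the equivalent normal density is 0.25 * h / qnorm(0.75)
kernel_sd <- 0.25 * h / qnorm(0.75)

# Hand-rolled kernel-weighted average (Nadaraya-Watson form) at each x
nw <- sapply(x, function(x0) {
  w <- dnorm(x - x0, mean = 0, sd = kernel_sd)
  sum(w * y) / sum(w)
})

# Reference fit from stats::ksmooth, evaluated at the same points
fit <- ksmooth(x, y, "normal", bandwidth = h, x.points = x)

# Largest absolute gap between the two curves
max(abs(nw - fit$y))
```

Working with `qnorm()` keeps the sketch in base R; it is meant as a cross-check on the bandwidth scaling, not as a replacement for `ksmooth()`.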
zoom$cherry <- grepl("cherry", zoom$spc_common)
zoom$dead <- zoom$status == "Dead"
table(zoom$hood, zoom$dead)
table(zoom$hood, zoom$cherry)
table(zoom$hood, zoom$brch_shoe)
library(ggmap)  # provides get_map() and ggmap()
zoom <- subset(trees, zipcode %in% c(11239, 11206, 11212, 11224, 11221,
                                     11201, 11215, 11217, 11231, 11234))
zoom$hood <- as.factor(ifelse(zoom$zipcode %in% c(11201, 11215, 11217, 11231, 11234), 1, 0))
map <- get_map(location = c(lon = -73.95, lat = 40.64), zoom = 12,
               maptype = "satellite", source = "google")
ggmap(map) + geom_point(data = zoom, aes(x = longitude, y = latitude, col = hood),
                        size = 0.5, shape = 16, alpha = 0.1, show.legend = FALSE)