Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save yuu-ito/9202769 to your computer and use it in GitHub Desktop.
Save yuu-ito/9202769 to your computer and use it in GitHub Desktop.

データ解析のための統計データモデリング ロジスティック回帰をggplotで

logistic <- function(z) 1/(1 + exp(-z))
z <- seq(-6, 6, 0.1)
# plot(z,logistic(z),type='l')
library("ggplot2")
qplot(x = z, y = logistic(z)) + geom_line()

plot of chunk unnamed-chunk-1

# http://hosho.ees.hokudai.ac.jp/~kubo/ce/IwanamiBook.html
d <- read.csv("./data4a.csv")
qplot(data = d, x = x, y = y, col = f)

plot of chunk unnamed-chunk-1

head(d)
##   N y     x f
## 1 8 1  9.76 C
## 2 8 6 10.48 C
## 3 8 5 10.83 C
## 4 8 6 10.94 C
## 5 8 1  9.37 C
## 6 8 1  8.81 C
head(cbind(d$y, d$N - d$y))
##      [,1] [,2]
## [1,]    1    7
## [2,]    6    2
## [3,]    5    3
## [4,]    6    2
## [5,]    1    7
## [6,]    1    7
glm(cbind(y, N - y) ~ x + f, data = d, family = binomial)
## 
## Call:  glm(formula = cbind(y, N - y) ~ x + f, family = binomial, data = d)
## 
## Coefficients:
## (Intercept)            x           fT  
##      -19.54         1.95         2.02  
## 
## Degrees of Freedom: 99 Total (i.e. Null);  97 Residual
## Null Deviance:	    499 
## Residual Deviance: 123 	AIC: 272
dim(d)
## [1] 100   4
train.row <- sample(100, 50)
d.train <- d[train.row, ]
d.test <- d[-train.row, ]

d.glm <- glm(cbind(y, N - y) ~ x + f, data = d.train, family = binomial)
print(d.glm)
## 
## Call:  glm(formula = cbind(y, N - y) ~ x + f, family = binomial, data = d.train)
## 
## Coefficients:
## (Intercept)            x           fT  
##      -19.71         1.97         1.87  
## 
## Degrees of Freedom: 49 Total (i.e. Null);  47 Residual
## Null Deviance:	    266 
## Residual Deviance: 80.6 	AIC: 157
# 残り半分のデータの予測。type='response'
# 付けないとデフォルトで線形予測値ななる模様 ?predict.glm
d.pred <- predict(d.glm, d.test[, c(3, 4)], type = "response", se = T)
d.res <- data.frame(d.test, pred = d.pred$fit * d$N, ucl = (d.pred$fit + d.pred$se.fit) * 
    d$N, lcl = (d.pred$fit - d.pred$se.fit) * d$N)
## Warning: row names were found from a short variable and have been
## discarded
library("ggplot2")
p <- qplot(y = y, x = x, col = f, data = d.res)
# p<-p+facet_grid(~f)
p <- p + geom_smooth(aes(ymax = ucl, ymin = lcl), data = d.res, stat = "identity")
# 勝手に線引かれてしまうのはなぜ?
p

plot of chunk unnamed-chunk-1

N y x f
8 1 9.76 C
8 6 10.48 C
8 5 10.83 C
8 6 10.94 C
8 1 9.37 C
8 1 8.81 C
8 3 9.49 C
8 6 11.02 C
8 0 7.97 C
8 8 11.55 C
8 0 9.46 C
8 2 9.47 C
8 0 8.71 C
8 5 10.42 C
8 3 10.06 C
8 6 11 C
8 3 9.95 C
8 4 9.52 C
8 5 10.26 C
8 8 11.33 C
8 5 9.77 C
8 8 10.59 C
8 1 9.35 C
8 4 10 C
8 1 9.53 C
8 8 12.06 C
8 4 9.68 C
8 7 11.32 C
8 5 10.48 C
8 5 10.37 C
8 8 11.33 C
8 1 9.42 C
8 7 10.68 C
8 1 7.91 C
8 3 9.39 C
8 8 11.65 C
8 6 10.66 C
8 7 11.23 C
8 7 10.57 C
8 4 10.42 C
8 7 11.73 C
8 8 12.02 C
8 8 11.55 C
8 0 8.58 C
8 6 11.08 C
8 5 10.49 C
8 8 11.12 C
8 3 8.99 C
8 8 10.08 C
8 8 10.8 C
8 0 7.83 T
8 5 8.88 T
8 5 9.74 T
8 8 9.98 T
8 3 8.46 T
8 2 7.96 T
8 7 9.78 T
8 8 11.93 T
8 3 9.04 T
8 5 10.14 T
8 8 11.01 T
8 3 8.88 T
8 6 9.68 T
8 7 9.8 T
8 8 10.76 T
8 6 9.81 T
8 5 8.37 T
8 5 9.38 T
8 0 7.68 T
8 8 10.23 T
8 8 9.83 T
8 0 7.66 T
8 6 9.33 T
8 1 8.2 T
8 8 9.54 T
8 8 10.55 T
8 6 9.88 T
8 6 9.34 T
8 6 10.38 T
8 6 9.63 T
8 8 12.44 T
8 8 10.17 T
8 7 9.29 T
8 8 11.17 T
8 6 9.13 T
8 2 8.79 T
8 0 8.19 T
8 7 10.25 T
8 8 11.3 T
8 8 10.84 T
8 8 10.97 T
8 3 8.6 T
8 7 9.91 T
8 8 11.38 T
8 8 10.39 T
8 7 10.45 T
8 0 8.94 T
8 5 8.94 T
8 8 10.14 T
8 1 8.5 T
@yuu-ito
Copy link
Author

yuu-ito commented Feb 25, 2014

rpubsにアップロードできなかったのでひとまずここに。

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment