Created
November 22, 2021 19:38
-
-
Save Lakens/553a0787873e8b7514aa55307e76de8c to your computer and use it in GitHub Desktop.
Why p-values are not measures of evidence
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
library(metafor) | |
g <- escalc( | |
measure = "SMD", | |
n1i = 43, # sample size in group 1 is 50 | |
m1i = 1.3, # observed mean in group 1 is 5.6 | |
sd1i = 1, # observed standard deviation in group 1 is 1.2 | |
n2i = 43, # sample size in group 2 is 53 | |
m2i = 1, # observed mean in group 1 is 4.9 | |
sd2i = 1 | |
) # observed standard deviation in group 2 is 1.3 | |
# print results | |
meta_data <- rbind(g, g, g) | |
# Perform the meta-analysis | |
res <- rma(yi, vi, data = meta_data) | |
res | |
forest(res) | |
jpeg(file="plot2.jpg",width=2000,height=1400, units = "px", res = 300) | |
forest(res) | |
dev.off() | |
# Lindley plot | |
N <- 150 | |
p <- 0.05 | |
p_upper <- 0.05 + 0.00000001 | |
p_lower <- 0.00000001 | |
ymax <- 50 # Maximum value y-scale (only for p-curve) | |
# Calculations | |
# p-value function | |
pdf2_t <- function(p) 0.5 * dt(qt(p / 2, 2 * N - 2, 0), 2 * N - 2, ncp) / dt(qt(p / 2, 2 * N - 2, 0), 2 * N - 2, 0) + dt(qt(1 - p / 2, 2 * N - 2, 0), 2 * N - 2, ncp) / dt(qt(1 - p / 2, 2 * N - 2, 0), 2 * N - 2, 0) | |
jpeg(file="plot1.jpg",width=3400,height=2000, units = "px", res = 300) | |
plot(-10, | |
xlab = "P-value", ylab = "", axes = FALSE, | |
main = "P-value distribution for d = 0, 50% power, and 99% power", xlim = c(0, 1), ylim = c(0, ymax), cex.lab = 1.5, cex.main = 1.5, cex.sub = 1 | |
) | |
axis(side = 1, at = seq(0, 1, 0.05), labels = formatC(seq(0, 1, 0.05),format="f", digits=2), cex.axis = 1) | |
#Draw null line | |
ncp <- (0 * sqrt(N / 2)) # Calculate non-centrality parameter d | |
curve(pdf2_t, 0, 1, n = 1000, col = "grey", lty = 1, lwd = 2, add = TRUE) | |
#Draw 50% low power line | |
N <- 146 | |
d <- 0.23 | |
se <- sqrt(2 / N) # standard error | |
ncp <- (d * sqrt(N / 2)) # Calculate non-centrality parameter d | |
curve(pdf2_t, 0, 1, n = 1000, col = "black", lwd = 3, add = TRUE) | |
#Draw 99% power line | |
N <- 150 | |
d <- 0.5 | |
se <- sqrt(2 / N) # standard error | |
ncp <- (d * sqrt(N / 2)) # Calculate non-centrality parameter d | |
curve(pdf2_t, 0, 1, n = 1000, col = "black", lwd = 3, lty = 3, add = TRUE) | |
dev.off() | |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi @Lakens!
I was reading your article and the Figure 1 about the Lindley paradox really got my attention. I ended up here because I wanted to replicate the plot and I was curious about the code.
Would you mind expanding on the formula for
pdf2_t
?I understand that we are making use of two distributions, one centered at
0
, and the other one atncp
; the former represents the distribution under the null hypothesis H_0, the latter under the alternative hypothesis H_A. Then we get the critical values foralpha/2
and1-alpha/2
in the null distribution. After that, we get the densities at these values for both distributions and we normalize the densities we got for H_A by those we got for H_0, finally we add them up. Why so? Why do we add them? And most importantly, why we multiply the first term by 0.5 but not the second one?Also, to get the
ncp
(lines 54 and 60), that is the effect size/mean difference, shouldn't we multiplyd
by the standard errorsqrt(2/N)
instead of dividing it?