@Lakens
Created November 22, 2021 19:38
Why p-values are not measures of evidence
library(metafor)

# Compute the standardized mean difference for a single study
g <- escalc(
  measure = "SMD",
  n1i = 43,  # sample size in group 1 is 43
  m1i = 1.3, # observed mean in group 1 is 1.3
  sd1i = 1,  # observed standard deviation in group 1 is 1
  n2i = 43,  # sample size in group 2 is 43
  m2i = 1,   # observed mean in group 2 is 1
  sd2i = 1   # observed standard deviation in group 2 is 1
)

# Stack three identical studies into one dataset
meta_data <- rbind(g, g, g)

# Perform the meta-analysis
res <- rma(yi, vi, data = meta_data)

# Print results
res
forest(res)

# Save the forest plot to file
jpeg(file = "plot2.jpg", width = 2000, height = 1400, units = "px", res = 300)
forest(res)
dev.off()
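# Sanity check (added for illustration; not part of the original gist):
# the two-sided p-value for a single study, assuming a pooled two-sample
# t-test on the summary statistics above. Each study alone is far from
# significant, while the meta-analysis of the three stacked copies is.
d_obs <- (1.3 - 1) / 1                      # observed standardized difference
t_stat <- d_obs * sqrt(43 * 43 / (43 + 43)) # t = d * sqrt(n1 * n2 / (n1 + n2))
2 * pt(-abs(t_stat), df = 43 + 43 - 2)      # two-sided p, approx. 0.17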
# Lindley plot
N <- 150 # sample size per group
p <- 0.05
p_upper <- 0.05 + 0.00000001
p_lower <- 0.00000001 # p, p_upper, and p_lower are not used further below
ymax <- 50 # maximum value of the y-axis for the p-value distribution plot
# Calculations
# p-value density function: the ratio of the noncentral t density (ncp set
# below) to the central t density, evaluated at the t-quantiles that
# correspond to a two-sided p-value of p. N and ncp are looked up in the
# global environment when the function is called.
pdf2_t <- function(p) {
  0.5 * dt(qt(p / 2, 2 * N - 2, 0), 2 * N - 2, ncp) /
    dt(qt(p / 2, 2 * N - 2, 0), 2 * N - 2, 0) +
    dt(qt(1 - p / 2, 2 * N - 2, 0), 2 * N - 2, ncp) /
      dt(qt(1 - p / 2, 2 * N - 2, 0), 2 * N - 2, 0)
}
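# Note (added for clarity; not part of the original gist): because pdf2_t
# reads N and ncp from the global environment at call time, the assignments
# below select which noncentral t-distribution each curve() call draws, e.g.:
#   N <- 150; ncp <- 0 # null hypothesis true
#   pdf2_t(0.05)       # height of the p-value distribution at p = .05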
# Save the p-value distribution plot to file
jpeg(file = "plot1.jpg", width = 3400, height = 2000, units = "px", res = 300)
# Set up an empty canvas (the single point at y = -10 falls below the y-range)
plot(-10,
  xlab = "P-value", ylab = "", axes = FALSE,
  main = "P-value distribution for d = 0, 50% power, and 99% power",
  xlim = c(0, 1), ylim = c(0, ymax), cex.lab = 1.5, cex.main = 1.5, cex.sub = 1
)
axis(side = 1, at = seq(0, 1, 0.05),
  labels = formatC(seq(0, 1, 0.05), format = "f", digits = 2), cex.axis = 1)
# Draw the null line (d = 0)
ncp <- (0 * sqrt(N / 2)) # non-centrality parameter for d = 0
curve(pdf2_t, 0, 1, n = 1000, col = "grey", lty = 1, lwd = 2, add = TRUE)
# Draw the 50% power line
N <- 146
d <- 0.23
se <- sqrt(2 / N) # standard error (not used below)
ncp <- (d * sqrt(N / 2)) # non-centrality parameter for d = 0.23
curve(pdf2_t, 0, 1, n = 1000, col = "black", lwd = 3, add = TRUE)
# Draw the 99% power line
N <- 150
d <- 0.5
se <- sqrt(2 / N) # standard error (not used below)
ncp <- (d * sqrt(N / 2)) # non-centrality parameter for d = 0.5
curve(pdf2_t, 0, 1, n = 1000, col = "black", lwd = 3, lty = 3, add = TRUE)
dev.off()
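# Check (added for illustration; not part of the original gist): the claimed
# power levels, assuming a two-sided two-sample t-test at alpha = .05
power.t.test(n = 146, delta = 0.23, sd = 1)$power # approx. 0.50
power.t.test(n = 150, delta = 0.5, sd = 1)$power  # approx. 0.99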
@rusiano commented Nov 10, 2024

Hi @Lakens!

I was reading your article, and Figure 1 about the Lindley paradox really caught my attention. I ended up here because I wanted to replicate the plot and was curious about the code.

Would you mind expanding on the formula for pdf2_t?

pdf2_t <- function(p) 0.5 * dt(qt(p / 2, 2 * N - 2, 0), 2 * N - 2, ncp) / dt(qt(p / 2, 2 * N - 2, 0), 2 * N - 2, 0) + dt(qt(1 - p / 2, 2 * N - 2, 0), 2 * N - 2, ncp) / dt(qt(1 - p / 2, 2 * N - 2, 0), 2 * N - 2, 0)

I understand that we are making use of two distributions, one centered at 0 and the other at ncp; the former represents the distribution of the test statistic under the null hypothesis H_0, the latter under the alternative hypothesis H_A. We then take the quantiles of the null distribution at p/2 and 1 - p/2, evaluate the densities of both distributions at these quantiles, normalize the densities under H_A by those under H_0, and finally add the two terms. Why so? Why do we add them? And most importantly, why do we multiply the first term by 0.5 but not the second one?
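For concreteness, here is a quick numerical check of that reading (evaluating under the null, ncp = 0, where both density ratios equal 1):

N <- 150
ncp <- 0
pdf2_t(0.5) # returns 0.5 * 1 + 1 = 1.5, although a uniform p-value density would be 1 everywhere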

Also, to get the ncp (the two lines computing ncp <- (d * sqrt(N / 2))) from the effect size d, shouldn't we multiply d by the standard error sqrt(2/N) instead of dividing by it?
