@vhbui02
Last active May 31, 2023 04:02
[Statistics Cheat Sheet] #statistic

95% CI for one categorical (CAT) variable

prop.test(x, n, p)
binom.test(x, n, p)
p = 1 / (number of levels)
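A minimal sketch of the two calls above in base R. The counts x and n and the number of levels k are made up for illustration.

```r
# Hypothetical data: 18 of 60 observations fall in one level of a 3-level factor
x <- 18
n <- 60
k <- 3

pt <- prop.test(x, n, p = 1 / k)    # chi-squared approximation, continuity-corrected
bt <- binom.test(x, n, p = 1 / k)   # exact binomial test

pt$conf.int   # confidence interval for the proportion (95% by default)
bt$conf.int
```

Both functions accept conf.level if a level other than 95% is needed.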

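Compare two proportions (ratio check)

prop.test(c(x1, x2), c(n1, n2))
tab <- matrix(c(x1, n1 - x1, x2, n2 - x2), nrow = 2, byrow = TRUE)    # rows: groups; columns: successes, failures
prop.test(tab)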
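A small sketch of a two-proportion comparison with made-up counts (x1, n1, x2, n2 are hypothetical). prop.test accepts either vectors of successes and trials, or a two-column matrix of successes and failures; the two forms should give the same result.

```r
# Hypothetical data: 45/100 successes in group 1, 30/100 in group 2
x1 <- 45; n1 <- 100
x2 <- 30; n2 <- 100

# Form 1: vectors of successes and trials
res_vec <- prop.test(c(x1, x2), c(n1, n2))

# Form 2: matrix with one row per group, columns = successes, failures
tab <- matrix(c(x1, n1 - x1,
                x2, n2 - x2),
              nrow = 2, byrow = TRUE,
              dimnames = list(group = c("g1", "g2"),
                              outcome = c("success", "failure")))
res_tab <- prop.test(tab)

res_vec$p.value
res_tab$conf.int   # CI for the difference in proportions
```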

Z-test

z.test       # BSDA package, raw data
zsum.test    # BSDA package, summary statistics
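Base R has no z.test, so as a rough illustration here is the one-sample z statistic computed by hand. The data, the null mean mu0, and the "known" sigma are all assumptions made up for this sketch.

```r
# Hypothetical sample; sigma is assumed known (that is what makes it a z-test)
x     <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.2)
mu0   <- 5
sigma <- 0.5

z       <- (mean(x) - mu0) / (sigma / sqrt(length(x)))
p_value <- 2 * pnorm(-abs(z))   # two-sided p-value

z
p_value
```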

1-sample T-test (parametric, used when NORMAL)

t.test
tsum.test    # BSDA package, summary statistics
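A minimal one-sample t.test call, using a made-up sample and a hypothetical null mean of 5.

```r
# Hypothetical sample
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.2)

res <- t.test(x, mu = 5)   # H0: true mean equals 5

res$statistic   # t value
res$p.value
res$conf.int    # 95% CI for the mean
```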

Wilcoxon (non-parametric, used when not NORMAL)

wilcox.test(y ~ x, data = ?)    # y: numeric (NUM), x: categorical (CAT/factor)
wilcox_test(y ~ x, data = ?)    # coin package
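A short sketch of the formula interface using the built-in ToothGrowth data set (len is numeric, supp is a two-level factor). The choice of data set is only for illustration.

```r
# len: numeric response, supp: factor with levels OJ and VC
res <- wilcox.test(len ~ supp, data = ToothGrowth, exact = FALSE)

res$statistic   # W
res$p.value
```

exact = FALSE is set here because the data contain ties, where an exact p-value cannot be computed.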

2-sample Fisher-Pitman Permutation Test (non-parametric)

oneway_test(y ~ x)    # coin package; y: numeric (NUM), x: categorical (CAT/factor)
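To show the idea behind the permutation test without the coin package, here is a rough Monte Carlo sketch in base R. It is not what oneway_test does internally; it only approximates the same kind of test by shuffling group labels. The helper mean_diff and the number of resamples are choices made for this example.

```r
set.seed(1)

y <- ToothGrowth$len
g <- ToothGrowth$supp

# Hypothetical helper: difference in group means (second level minus first)
mean_diff <- function(y, g) {
  m <- tapply(y, g, mean)
  unname(m[2] - m[1])
}

obs  <- mean_diff(y, g)
perm <- replicate(5000, mean_diff(y, sample(g)))   # shuffle labels, recompute

# Two-sided Monte Carlo p-value
p_value <- mean(abs(perm) >= abs(obs))
p_value
```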

Check Normality (p-value < 0.1: not NORMAL, > 0.1: NORMAL)

shapiro.test
lillie.test    # nortest package
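A quick sketch applying the 0.1 threshold from this cheat sheet to simulated data. The samples and the seed are arbitrary; only shapiro.test is used since it ships with base R.

```r
set.seed(42)
x_norm <- rnorm(100)   # drawn from a normal distribution
x_skew <- rexp(100)    # strongly right-skewed

alpha <- 0.1           # threshold used in this cheat sheet

p_norm <- shapiro.test(x_norm)$p.value
p_skew <- shapiro.test(x_skew)$p.value

p_norm < alpha   # TRUE would mean "treat as not NORMAL"
p_skew < alpha
```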

Check Independence between CAT groups

chisq.test(contingency_table)    # or chisq.test(c(x1, x2), p = p) for a goodness-of-fit test
p-value < 0.05: reject H0 ("no relationship").
binom.test assumes each trial of the binomial distribution is independent, so check independence before using it.
fisher.test(xtabs(~ X1 + X2))
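A small sketch of both tests on a made-up 2x2 contingency table (the counts and dimension names are hypothetical).

```r
# Hypothetical counts: rows = X1 levels, columns = X2 levels
tab <- matrix(c(30, 20,
                15, 35),
              nrow = 2,
              dimnames = list(X1 = c("a", "b"), X2 = c("yes", "no")))

chi <- chisq.test(tab)    # Yates continuity correction is applied for 2x2 tables
fis <- fisher.test(tab)   # exact test, useful when expected counts are small

chi$p.value
fis$p.value
chi$expected              # expected counts under independence
```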

Correction Method (multiple comparisons)

p.adjust(c(df1$p.value, df2$p.value), method='holm')
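A worked example of the Holm adjustment with three made-up p-values. Holm multiplies the smallest p-value by 3, the next by 2, the largest by 1, then enforces that the adjusted values never decrease in that order.

```r
# Hypothetical raw p-values from three separate tests
p_raw <- c(0.01, 0.04, 0.03)

p_holm <- p.adjust(p_raw, method = "holm")
p_holm   # 0.03 0.06 0.06 (returned in the original order)
```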

Check Equal Variance

var.test()
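A minimal var.test sketch (F test for equal variances between two groups), using the built-in ToothGrowth data set purely as an example.

```r
# Compare the variance of len between the two supp groups
vt <- var.test(len ~ supp, data = ToothGrowth)

vt$statistic   # F = ratio of the two sample variances
vt$p.value
vt$conf.int    # CI for the ratio of variances
```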

Linear Regression

model <- lm(y ~ x, data = ?)
anova(model)
summary(model)
aov_fit <- aov(y ~ x, data = ?)
TukeyHSD(aov_fit)
plot(TukeyHSD(aov_fit))
confint(aov_fit)
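The same sequence on a concrete data set, as a sketch. PlantGrowth (weight by a three-level group factor) is chosen only for illustration.

```r
# Linear model and its ANOVA table
model <- lm(weight ~ group, data = PlantGrowth)
anova(model)
summary(model)

# One-way ANOVA followed by Tukey's pairwise comparisons
aov_fit <- aov(weight ~ group, data = PlantGrowth)
tuk <- TukeyHSD(aov_fit)
tuk$group          # one row per pair of groups: diff, lwr, upr, p adj

confint(aov_fit)   # CIs for the model coefficients
```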

Test ANOVA assumptions

lillie.test(residuals(model))     # p < 0.1: not NORMAL (nortest package)
shapiro.test(residuals(model))    # p < 0.1: not NORMAL
fligner.test(y ~ x, data = ?)     # p > 0.05: equal variance
leveneTest(y ~ x, data = ?)       # p > 0.05: equal variance (car package)
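A sketch of the two checks that are available in base R, using the thresholds from this cheat sheet. The PlantGrowth data set and the variable names are illustrative choices.

```r
model <- lm(weight ~ group, data = PlantGrowth)

# Normality of residuals (cheat-sheet threshold: 0.1)
p_normal <- shapiro.test(residuals(model))$p.value

# Homogeneity of variance across groups (cheat-sheet threshold: 0.05)
fk <- fligner.test(weight ~ group, data = PlantGrowth)
p_equal_var <- fk$p.value

assumptions_ok <- (p_normal > 0.1) && (p_equal_var > 0.05)
assumptions_ok
```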

If the ANOVA assumptions fail, Kruskal-Wallis to the rescue

kruskal.test(y ~ x, data = ?)
kruskalmc(y ~ x, data = ?)    # pgirmess package
oneway_test(y ~ x)            # also a non-parametric method (coin package)
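A base R sketch of the Kruskal-Wallis test on the built-in PlantGrowth data. For the follow-up comparisons this example swaps in pairwise.wilcox.test with a Holm correction instead of kruskalmc, since kruskalmc needs the pgirmess package; the two procedures are not equivalent.

```r
kw <- kruskal.test(weight ~ group, data = PlantGrowth)
kw$statistic
kw$p.value

# Substitute post-hoc step (not kruskalmc): pairwise Wilcoxon tests, Holm-adjusted
pw <- pairwise.wilcox.test(PlantGrowth$weight, PlantGrowth$group,
                           p.adjust.method = "holm", exact = FALSE)
pw$p.value
```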

Some helpful functions:

densityplot()                              # lattice package
qqnorm(), qqline()
str()
subset(dataframe, subset = conditions)
xtabs(~ X, data = dataframe)               # produces a contingency table
table(dataframe$X)                         # also produces a contingency table
runif(1000)                                # generates 1000 uniform random values
rnorm(1000)                                # generates 1000 normal random values
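A few of these helpers on the built-in mtcars data set, as a quick illustration (the data set and the condition cyl == 4 are arbitrary choices).

```r
str(mtcars)                              # structure of the data frame

four_cyl <- subset(mtcars, subset = cyl == 4)
nrow(four_cyl)

tab_xtabs <- xtabs(~ cyl, data = mtcars) # counts per cyl level
tab_table <- table(mtcars$cyl)           # same counts

set.seed(123)
u <- runif(1000)                         # uniform on [0, 1]
z <- rnorm(1000)                         # standard normal
qqnorm(z); qqline(z)                     # Q-Q plot against the normal distribution
```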
