@explodecomputer
explodecomputer / power1.pdf
Last active September 4, 2020 11:17
power_interactions
# simulation
n <- 10000
a <- rbinom(n, 2, 0.5)
b <- rbinom(n, 2, 0.49)
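# Fisher's exact test on a 2x2 table of allele counts, giving the odds ratio
# for the allele frequencies in a and b being different
cont <- matrix(
  c(
    sum(a==0) * 2 + sum(a==1),  # copies of the uncounted allele in a
    sum(a==2) * 2 + sum(a==1),  # copies of the counted allele in a
    sum(b==0) * 2 + sum(b==1),  # copies of the uncounted allele in b
    sum(b==2) * 2 + sum(b==1)   # copies of the counted allele in b
  ), 2, 2)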
fisher.test(cont)
# Number of individuals in the population
npop <- 100000
# Distribution of variable in the population
x <- rnorm(npop)
# Individuals selected into sample with high x value
s <- rbinom(npop, 1, plogis(x * 0.4))
# Population mean of x
library(dplyr)
library(data.table)
library(TwoSampleMR)
bmi_id <- "ukb-b-19953"
chd_id <- "ukb-b-3983"
datadir <- "/mnt/storage/private/mrcieu/research/mr-eve/UKBB_replication/replication/results"
explodecomputer / covid_collider_prediction.rmd
Last active May 28, 2020 22:53
Predictions when training and testing subsets are selected based on a collider
---
title: Prediction when testing and training are stratified by a collider
---
## Background
Menni et al. (2020) look for risk factors associated with testing positive, then use those risk factors to build a model that predicts test status in untested individuals.
Both the risk factors and being positive influence whether an individual is tested. Associations estimated in the tested sample are therefore likely to be affected by collider bias, and may not be transportable to the untested sample.
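The mechanism described above can be illustrated with a small simulation. This is only a sketch under assumed values: the variable names (`risk`, `positive`, `tested`) and all effect sizes are hypothetical and are not taken from Menni et al. The risk factor and true status are generated independently, but both raise the probability of being tested, so an association appears once the analysis is restricted to tested individuals.

```r
set.seed(1)
n <- 100000

# Hypothetical risk factor and true positive status, generated independently
risk <- rnorm(n)
positive <- rbinom(n, 1, 0.1)

# Both increase the probability of being tested, so 'tested' is a collider
# (intercept and effect sizes are arbitrary assumptions)
tested <- rbinom(n, 1, plogis(-2 + risk + 2 * positive))

# Log odds ratio in the whole population: expected to be close to zero
beta_all <- coef(glm(positive ~ risk, family = binomial))["risk"]

# Log odds ratio among tested individuals only: conditioning on the collider
# induces an association that does not exist in the population
beta_tested <- coef(
  glm(positive ~ risk, family = binomial, subset = tested == 1)
)["risk"]

print(c(all = unname(beta_all), tested = unname(beta_tested)))
```

Under these assumptions the estimate in the tested subset is clearly non-zero even though `risk` has no effect on `positive`, which is why a model trained in the tested sample may not carry over to the untested one.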
explodecomputer / mr_dgp.rmd
Last active March 18, 2020 18:45
Data generating process underlying causal inference using Mendelian randomization
---
title: Data generating process underlying causal inference using Mendelian randomization
author: Gibran Hemani
date: '`r format(Sys.Date())`'
---
## Background
The causal effect of an exposure ($x$) on an outcome ($y$) can be inferred from the associations of genetic variants ($g$) with $x$ and $y$. This method is known as Mendelian randomization (MR), a special case of instrumental variable (IV) analysis in which the instrument is a genetic variant. Assume the following causal structure:
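As a rough illustration, one structure of this kind can be simulated directly. The sketch below assumes a single variant `g`, an unmeasured confounder `u` of `x` and `y`, and arbitrary effect sizes `b_gx` and `b_xy`; these names and values are assumptions made for the example and may differ from the structure the document goes on to define. The causal effect is estimated with the Wald ratio (the `g`-`y` association divided by the `g`-`x` association) and compared against the confounded observational regression.

```r
set.seed(1)
n <- 100000

# Assumed data generating process (all values hypothetical)
b_gx <- 0.3                  # effect of the variant on the exposure
b_xy <- 0.5                  # true causal effect of the exposure on the outcome
g <- rbinom(n, 2, 0.3)       # genotype coded as 0/1/2 allele copies
u <- rnorm(n)                # unmeasured confounder of x and y
x <- b_gx * g + u + rnorm(n)
y <- b_xy * x + u + rnorm(n)

# Associations of the variant with exposure and outcome
bhat_gx <- coef(lm(x ~ g))["g"]
bhat_gy <- coef(lm(y ~ g))["g"]

# Wald ratio estimate of the causal effect of x on y
wald <- unname(bhat_gy / bhat_gx)

# Observational estimate, biased upwards here by the confounder u
ols <- unname(coef(lm(y ~ x))["x"])

print(c(true = b_xy, wald = wald, ols = ols))
```

Because `g` is generated independently of `u`, the Wald ratio should land near `b_xy`, while the regression of `y` on `x` absorbs the confounding through `u`.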
explodecomputer / google_drive_image.md
Last active November 7, 2019 10:59
Embed image from google drive
explodecomputer / tetrachoric.r
Created September 19, 2019 15:08
tetrachoric under different scenarios
library(mvtnorm)
library(psych)
n <- 100000
dn <- rmvnorm(n, c(0,0), matrix(c(1,0.5, 0.5,1), 2))
# 50% prevalence
d <- dn
d[,1] <- rbinom(n, 1, pnorm(d[,1]))
explodecomputer / hwe.r
Created August 20, 2019 16:34
Merged SNP HWE
maf <- function(x) { sum(x) / (length(x) * 2) }
hwe <- function(x) {
  observed <- table(x)
  m <- maf(x)
  expected <- c(
    (1-m)^2, 2 * m * (1-m), m^2
  ) * length(x)
  chisq.test(rbind(observed, round(expected)))
  print(rbind(observed, round(expected)))
explodecomputer / infant_weight_height.r
Created July 18, 2019 13:53
Infant weight and later height
library(ggplot2)
library(tidyr)
library(alspac)
data(current)
vars <- c("cf040", "cf041", "cf042", "cf043", "fh3000")
b <- extractVars(subset(current, name %in% vars))
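# Keep the identifiers and selected variables, then drop rows with any missing
# value (base R, so this does not depend on dplyr::filter being attached)
b1 <- subset(b, select=c("aln", "qlet", vars))
b1 <- b1[complete.cases(b1), ]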
for(i in vars)