Daniel Padfield padpadpadpad

# difference in Phil McNulty rating and public votes
# https://www.bbc.co.uk/sport/football/44595146
library(tibble)
library(ggplot2)
library(tidyr)
library(dplyr)
player <- c('Jordan Pickford', 'Kieran Trippier', 'Harry Maguire', 'John Stones', 'Kyle Walker', 'Jordan Henderson', 'Ashley Young', 'Ruben Loftus-Cheek', 'Raheem Sterling', 'Jesse Lingard', 'Harry Kane')
@padpadpadpad
padpadpadpad / TeamParallel.R
Last active October 16, 2020 07:49
Example of parallelising R code using furrr and multidplyr
# load packages
library(multidplyr)
library(dplyr)
library(tidyr)
library(purrr)
library(furrr)
library(tibble)
library(tictoc)
# I want to know which gene each of a set of SNPs falls in. This shows how to pass different inputs to a function and how to parallelise it.
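A minimal sketch of the idea, not the gist's own code: the `genes` and `snps` tables and the `find_gene()` helper below are hypothetical stand-ins. It passes each SNP's chromosome and position to a function row by row with `furrr::future_pmap_chr()`, so the calls can run on parallel workers.

```r
library(future)
library(furrr)
library(tibble)

# hypothetical gene table: one row per gene with its chromosome and coordinates
genes <- tibble(gene = c('geneA', 'geneB', 'geneC'),
                chromosome = c('chr1', 'chr1', 'chr2'),
                start = c(1, 501, 1),
                end = c(500, 1000, 800))

# hypothetical SNP table: column names match the arguments of find_gene()
snps <- tibble(chromosome = c('chr1', 'chr1', 'chr2', 'chr2'),
               pos = c(250, 750, 400, 900))

# return the first gene containing the position, or NA if there is none
find_gene <- function(chromosome, pos, genes){
  hit <- genes$gene[genes$chromosome == chromosome &
                      genes$start <= pos &
                      genes$end >= pos]
  if (length(hit) == 0) NA_character_ else hit[1]
}

# run on two background R sessions
plan(multisession, workers = 2)

# each row of snps supplies one set of inputs; genes is passed on unchanged
snps$gene <- future_pmap_chr(snps, find_gene, genes = genes)

# return to sequential processing
plan(sequential)
```

For a table this small the cost of starting workers outweighs any gain; parallelising only pays off when each call is slow or there are many rows.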
padpadpadpad / dynamic_df_filtering.R
Last active April 20, 2018 20:11
Filtering a data frame based on a dynamic number of columns and conditions
# code problem
# want a function that filters the values of a dataframe based on values from a master dataframe
# inspiration - the rate of false positives for SNPs (mutations) in a gene is influenced by the threshold set for the proportion of the population carrying the variant. For example, if 0.01 of the population has it, it's less likely to be a true SNP than if 0.9 has it. Where to set this threshold is unknown.
# However, if you have the list of actual SNPs (from sequencing individual clones), you can try different thresholds for each variable to see which values give the fewest false positives and find the most true SNPs
# load packages
library(ggplot2)
library(dplyr)
library(tidyr)
library(tibble)
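One possible way to approach the problem described above, sketched with made-up data: the `d` and `thresholds` tables and the `filter_by_thresholds()` helper are hypothetical, and only a "greater than or equal to" condition is handled. Each row of the master data frame names a column and a minimum value, so the number of conditions can change without touching the function.

```r
library(tibble)
library(purrr)

# hypothetical candidate SNPs
d <- tibble(snp = c('a', 'b', 'c', 'd'),
            prop = c(0.01, 0.5, 0.9, 0.3),
            depth = c(10, 40, 80, 5))

# hypothetical master data frame: one row per column to filter on
thresholds <- tibble(column = c('prop', 'depth'),
                     min_value = c(0.2, 8))

filter_by_thresholds <- function(data, thresholds){
  # one logical vector per threshold row
  keep <- map2(thresholds$column, thresholds$min_value,
               function(col, min_value) data[[col]] >= min_value)
  # a row is kept only if it passes every condition
  data[reduce(keep, `&`), ]
}

filtered <- filter_by_thresholds(d, thresholds)
```

Comparing `filtered$snp` against a list of known SNPs for a grid of threshold tables would then give the false positive rate for each combination. The helper assumes `thresholds` has at least one row.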
# load / install packages
library(dplyr)
library(tibble)
# devtools::install_github('padpadpadpad/TeamPhytoplankton')
# make dummy data
temp = (5:30) + 273.15
rate = TeamPhytoplankton::boltzmann(15, 1, temp, Tc = 15, log = 'N')
d <- tibble(rate, temp)
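If the TeamPhytoplankton package is not installed, similar dummy data can be made with a stand-in function. The sketch below assumes the common Boltzmann-Arrhenius form with a log rate at a reference temperature and an activation energy in eV; `boltzmann_sketch()` and its argument names are hypothetical and may not match what `TeamPhytoplankton::boltzmann()` computes.

```r
library(tibble)

# Boltzmann constant in eV per Kelvin
k <- 8.62e-05

# hypothetical stand-in: lnc is the log rate at the reference temperature Tc
# (in Celsius), E is the activation energy (eV), temp is in Kelvin
boltzmann_sketch <- function(lnc, E, temp, Tc){
  Tc_K <- Tc + 273.15
  exp(lnc + E * (1 / (k * Tc_K) - 1 / (k * temp)))
}

# make dummy data
temp <- (5:30) + 273.15
rate <- boltzmann_sketch(lnc = 15, E = 1, temp = temp, Tc = 15)
d_sketch <- tibble(rate, temp)
```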
padpadpadpad / position_jitter_points_line.R
Last active April 4, 2018 13:39
Trying to get points and lines to align using position_jitter()
# load packages
library(ggplot2)
library(tibble)
# make data frame
d <- tibble(x = rep(c('a', 'b', 'c'), each = 10),
            y = rnorm(30, mean = 10, sd = 1),
            group = rep(1:10, times = 3))
# create jitter position with seeding
pos <- position_jitter(width = 0.1, height = 0, seed = 42)
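If sharing one seeded `position_jitter()` object across layers still leaves points and lines misaligned, one workaround is to skip the position adjustment and jitter the x values once in the data. This is a different technique from the one above, sketched under the assumption that the categories can be placed at integer positions; the `x_jit` column is hypothetical.

```r
library(ggplot2)
library(tibble)

set.seed(42)
d <- tibble(x = rep(c('a', 'b', 'c'), each = 10),
            y = rnorm(30, mean = 10, sd = 1),
            group = rep(1:10, times = 3))

# jitter once, by hand, so every layer sees exactly the same x positions
d$x_jit <- as.numeric(factor(d$x)) + runif(nrow(d), min = -0.1, max = 0.1)

# both layers use x_jit, so each line passes through its own points
p <- ggplot(d, aes(x_jit, y, group = group)) +
  geom_line(alpha = 0.5) +
  geom_point() +
  scale_x_continuous(breaks = 1:3, labels = c('a', 'b', 'c')) +
  labs(x = 'x')
```

The axis is continuous here, so the original category labels are restored through `scale_x_continuous()`.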
padpadpadpad / an_alternate_bootstrap.Rmd
Last active August 6, 2018 13:51
An alternative bootstrapping method for non-linear regression.
---
title: "Non-linear bootstrapping approach 2.0"
author: "Daniel Padfield"
date: "15/03/2018"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
padpadpadpad / assign_gene_from_pos.R
Last active February 8, 2022 09:21
Assign a gene to each position in the genome, taking into account chromosome and gene position
# example
# putting the SNP pos in the correct gene by its start and end position
# load packages
library(tibble)
library(dplyr)
library(tidyr)
library(purrr)
# dataframe of SNP positions and proportions
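A small illustration of the matching step, using made-up data rather than the gist's own: the `snps` and `genes` tables and their column names are assumptions. Every SNP is paired with every gene on the same chromosome, and only the pairs where the position lies between the gene's start and end are kept.

```r
library(tibble)
library(dplyr)

# hypothetical SNP positions and proportions
snps <- tibble(chromosome = c('chr1', 'chr1', 'chr2', 'chr2'),
               pos = c(250, 750, 400, 900),
               prop = c(0.1, 0.9, 0.5, 0.3))

# hypothetical gene coordinates
genes <- tibble(gene = c('geneA', 'geneB', 'geneC'),
                chromosome = c('chr1', 'chr1', 'chr2'),
                start = c(1, 501, 1),
                end = c(500, 1000, 800))

# pair SNPs with all genes on the same chromosome, then keep true overlaps
snps_in_genes <- merge(snps, genes, by = 'chromosome') %>%
  as_tibble() %>%
  filter(pos >= start, pos <= end) %>%
  select(chromosome, pos, prop, gene) %>%
  arrange(chromosome, pos)
```

SNPs that fall outside every gene (position 900 on chr2 here) are dropped by the filter; a `left_join()` back onto `snps` would keep them with `NA` in the gene column.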
# cheeky late night bootstrapped mean
# load packages
library(broom)
library(tidyr)
library(purrr)
library(modelr)
library(dplyr)
# load dataset
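A sketch of how a bootstrapped mean could be computed with the packages loaded above. The dataset is not shown in the preview, so the built-in `mtcars` data and its `mpg` column are used here as an assumption.

```r
library(modelr)
library(purrr)
library(dplyr)

set.seed(42)

# assumed dataset: built-in mtcars
d <- mtcars

# draw 1000 resamples with replacement and take the mean of each
boots <- modelr::bootstrap(d, n = 1000) %>%
  mutate(mean_mpg = map_dbl(strap, ~ mean(as.data.frame(.x)$mpg)))

# percentile confidence interval around the mean
ci <- quantile(boots$mean_mpg, probs = c(0.025, 0.975))
```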
# smatr bootstrap example
# load packages
library(smatr)
library(dplyr)
library(purrr)
library(tidyr)
library(ggplot2)
# load data
# example of trying to subset a matrix ####
# I am trying to do Mantel tests, which require distance matrices
# However, I do not care about every pairwise distance in the matrix. Instead, I want to set some of the distances to NA to remove them
# I am struggling to work out how to subset a matrix
# load packages
library(tibble)
library(dplyr)
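One base R way to blank out unwanted pairwise distances, sketched with made-up data: the `samples` table and the rule "keep only within-site comparisons" are assumptions chosen for illustration. A logical matrix of the same shape as the distance matrix marks which pairs to keep, and everything else is set to NA.

```r
set.seed(1)

# hypothetical samples: two per site, with two measured variables
samples <- data.frame(id = paste0('s', 1:6),
                      site = rep(c('A', 'B', 'C'), each = 2),
                      x = rnorm(6),
                      y = rnorm(6))

# full pairwise distance matrix, labelled by sample id
dist_mat <- as.matrix(dist(samples[, c('x', 'y')]))
dimnames(dist_mat) <- list(samples$id, samples$id)

# TRUE where the row sample and column sample come from the same site
same_site <- outer(samples$site, samples$site, `==`)

# keep within-site distances, set all between-site distances to NA
dist_sub <- dist_mat
dist_sub[!same_site] <- NA
```

Any rule that can be written as a comparison between row and column labels can be swapped into `outer()`. Whether NA values are accepted depends on the Mantel implementation used, so that is worth checking before running the test.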