Levi Waldron lwaldron

@lwaldron
lwaldron / cmdErr.R
Created November 23, 2022 15:26
identify cMD datasets producing error
suppressPackageStartupMessages(library(curatedMetagenomicData))
sampleMetadata[sampleMetadata$study_name == "FengQ_2015", ] |>
returnSamples("relative_abundance", rownames = "NCBI")
allstudies <- unique(sampleMetadata$study_name)
allres <- sapply(allstudies, function(study) {
message(study)
try(
suppressMessages(sampleMetadata[sampleMetadata$study_name == study,] |>
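The preview of cmdErr.R stops here. The general pattern it starts (loop over names, wrap each call in `try()`, then see which ones failed) can be sketched in isolation. This is a minimal sketch, not the rest of the gist: `risky()` and `items` are hypothetical stand-ins for the per-study call and for `allstudies`.

```r
# Hypothetical stand-in for a call that fails on some inputs
risky <- function(x) {
  if (x == "b") stop("cannot process ", x)
  toupper(x)
}

items <- c("a", "b", "c")  # hypothetical; plays the role of `allstudies`

# Keep results as a list so try-error objects are not coerced away
res <- sapply(items, function(item) try(risky(item), silent = TRUE),
              simplify = FALSE)

# Names of the inputs whose call raised an error
failed <- names(res)[vapply(res, inherits, logical(1), what = "try-error")]
failed
```

Using `simplify = FALSE` keeps each `try-error` object intact, so `inherits()` can tell failures from successes.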
lwaldron / NYC_CHS20.Rmd
Last active September 14, 2022 12:55
Demo of importing, recoding, creating survey, and logistic regression on CHS20 survey data
---
title: "NYC CHS import"
author: "Levi Waldron"
date: "`r Sys.Date()`"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, message = FALSE)
```
lwaldron / benchmark_heatmapTable.R
Last active July 15, 2022 14:59
benchmark_heatmapTable.R
req.packages <- c("GenomicSuperSignature", "bcellViper", "microbenchmark", "reprex")
BiocManager::install(req.packages, ask=FALSE, force = FALSE)
reprex::reprex({
BiocManager::version()
BiocManager::valid()
suppressPackageStartupMessages({
library(GenomicSuperSignature)
library(bcellViper)
lwaldron / anscombe_residuals.Rmd
Created June 20, 2022 12:28
Residuals plots of the Anscombe datasets
---
title: "Anscombe residuals plots"
author: "Levi Waldron"
date: "`r Sys.Date()`"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
lwaldron / microsud_cmd_reprex.R
Last active June 11, 2022 10:45
Comparing assays from two different downloads
# installed curatedMetagenomicData from GitHub
suppressPackageStartupMessages({
library(curatedMetagenomicData)
library(dplyr)
})
# I download the data in two ways: one selects a few CRC studies, and the other uses multiple available studies.
# only specific CRC studies downloaded
suppressMessages({
lwaldron / diabimmune.R
Created February 2, 2022 16:54
Download DIABIMMUNE antibiotics cohort .fna.gz files
# See https://diabimmune.broadinstitute.org/diabimmune/antibiotics-cohort/resources/16s-sequence-data
# The provided command `wget -r -np -nd https://pubs.broadinstitute.org/diabimmune/data/15` does not work because files are listed in an html page
library(dplyr)
library(rvest)
url <- "https://diabimmune.broadinstitute.org/diabimmune/data/15/"
url %>%
read_html() %>%
html_elements("a") %>%
html_attr("href") %>%
download.file(., destfile = basename(.))
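If the index page contains relative links or links to things other than sequence files, the scraped `href` values may need filtering and resolving before download. The sketch below shows one way to do that on a made-up page, so it runs without network access. The HTML snippet and `base_url` are placeholders, not the real DIABIMMUNE listing, and the assumption that only `.fna.gz` links are wanted comes from the gist description.

```r
library(rvest)  # minimal_html(), html_elements(), html_attr()
library(xml2)   # url_absolute()

# Made-up listing page standing in for the real index (assumption)
page <- minimal_html('
  <a href="../">Parent directory</a>
  <a href="sample1.fna.gz">sample1.fna.gz</a>
  <a href="sample2.fna.gz">sample2.fna.gz</a>
')
base_url <- "https://example.org/data/15/"  # placeholder base URL

hrefs <- page |> html_elements("a") |> html_attr("href")
fna   <- hrefs[grepl("\\.fna\\.gz$", hrefs)]  # keep only sequence files
links <- url_absolute(fna, base_url)          # resolve relative links

# One destination per URL; left commented out because the URLs are placeholders
# download.file(links, destfile = basename(links), method = "libcurl")
```

Passing a vector of URLs to `download.file()` is documented for `method = "libcurl"`, which is why the method is named explicitly in the commented call.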
lwaldron / NYC-COVID_ACS_merge
Created September 13, 2021 01:57
NYC-COVID data merged with ACS community-level data
##### Import COVID-19 data from the NYC DOHMH GitHub repository (https://github.com/nychealth/coronavirus-data) and merge it with ACS data of interest
# In order to get the URL of a table of your interest, go to the table and click on 'History' on the top right corner.
# You will see the upload history for the table on this page. Choose a time point of interest and click on the second
# to the last button on the right (if you hover over the button it should say 'View at this point in the history').
# You will be directed to view the table. Then click on 'Raw' and copy the URL.
covid <- read.csv("https://raw.githubusercontent.com/nychealth/coronavirus-data/7ce1b84610232be9c3f780484865a51f73b8c469/recent/recent-4-week-by-modzcta.csv")
head(covid)
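The URL copied from the 'Raw' button has a regular shape: repository, commit hash, then file path. A small helper can assemble it, which makes the pinned commit explicit and easy to change. `raw_github_url()` is a hypothetical helper, not part of the gist; the repository, commit, and path are taken from the `read.csv()` call above.

```r
# Hypothetical helper (not in the gist): build a raw.githubusercontent.com
# URL pinned to a specific commit
raw_github_url <- function(repo, commit, path) {
  sprintf("https://raw.githubusercontent.com/%s/%s/%s", repo, commit, path)
}

covid_url <- raw_github_url(
  repo   = "nychealth/coronavirus-data",
  commit = "7ce1b84610232be9c3f780484865a51f73b8c469",
  path   = "recent/recent-4-week-by-modzcta.csv"
)
covid_url
# covid <- read.csv(covid_url)  # same file as above; needs network access
```

Pinning to a commit hash rather than a branch name means the file contents do not change between runs.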
lwaldron / framingham.R
Created September 13, 2021 01:50
Framingham Heart Study access and recoding
##### Import Framingham Heart Study data from a GitHub repository (https://github.com/GauravPadawe/Framingham-Heart-Study)
library(tidyverse)
#importing the dataset
chddata <- read.csv("https://raw.githubusercontent.com/GauravPadawe/Framingham-Heart-Study/adcc828b8a5b3ddbd8d5b8b98e2b27cf60538db6/framingham.csv")
#some recoding
chddataclean <- chddata %>%
mutate(TenYearCHD = if_else (TenYearCHD=='1',"CHD", "No-CHD"),
lwaldron / scMultiome.R
Created June 15, 2021 12:16
Object serialization and sizes of SingleCellMultiModal::scMultiome dataset
library(SingleCellMultiModal)
library(MultiAssayExperiment)
suppressMessages(scmm <- scMultiome(dry.run = FALSE))
format(object.size(scmm), units="Mb") #31Mb in memory
saveHDF5MultiAssayExperiment(scmm)
dir("h5_mae", full.names=TRUE) |> file.info() # ~193MB on disk
suppressMessages(scmm_sparse <- scMultiome(format = "MTX", dry.run = FALSE))
lwaldron / TCGA_re.R
Created June 1, 2021 06:48
CPU and memory footprints of a few operations on RaggedExperiment objects from TCGA
## ---------------------------------------------------------------------------------------------------------------------------------
library(curatedTCGAData)
library(TCGAutils)
library(RaggedExperiment)
## -----------------------------------------------------------------------------------------------------------------------------------------
cnvdry <-
curatedTCGAData(assays = "CNVSNP",