Thomas Sandmann tomsing1

@tomsing1
tomsing1 / image_placeholder.R
Created January 8, 2024 19:09
Create an image tag pointing to a random picture from Lorem Picsum
#' Create an image tag with an example image
#'
#' @param width Scalar integer, the width of the image
#' @param height Scalar integer, the height of the image
#' @param title Scalar character, the title of the image
#' @return A `shiny.tag` with the URL to a random image from
#' [Lorem Picsum](https://picsum.photos/)
#' @export
#' @importFrom htmltools tags
#' @importFrom checkmate assert_count assert_character
@tomsing1
tomsing1 / pyenv.md
Last active December 28, 2023 00:16
Using pyenv to manage multiple python versions & virtual environments

There is a lot of confusing information about Python virtual environments out there, in part because the toolchain has evolved over many years.

I decided to follow the advice of Real Python and The Hitchhiker's Guide to Python and manage both multiple Python versions and multiple virtual environments with pyenv.

@tomsing1
tomsing1 / quarto_webr_example.qmd
Created November 19, 2023 17:32
Quarto markdown file that uses the quarto-webr extension to run R code in the browser
---
title: "Embedding R into Quarto documents with quarto-webr"
subtitle: "Example: intersecting differential expression results"
author: "Thomas Sandmann"
date: '2023/11/18'
format: html
engine: knitr
webr:
  show-startup-message: true
  packages: ['ggvenn', 'huxtable']
@tomsing1
tomsing1 / fgsea.md
Created September 29, 2023 21:15
fgsea output

The fgsea::fgsea function returns a data.frame with the following columns:

  • pathway – name of the pathway as in ‘names(pathway)’;
  • pval – an enrichment p-value;
  • padj – a BH-adjusted p-value;
  • ES – enrichment score

    GSEA calculates the ES by walking down the ranked list of genes, increasing a running-sum statistic when a gene is in the gene set and decreasing it when it is not. The magnitude of the increment depends on the correlation of the gene with the phenotype. The ES is the maximum deviation from zero encountered in walking the list.
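
    The running-sum walk described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the fgsea implementation: the function name `enrichment_score`, its arguments, and the `weight` exponent are hypothetical, and the per-gene ranking statistic is assumed to be supplied already sorted from highest to lowest.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Return the signed maximum deviation from zero of a GSEA-style running sum.

    ranked_genes: gene identifiers, already sorted by their ranking statistic
    scores:       the ranking statistic for each gene, in the same order
    gene_set:     collection of gene identifiers that form the pathway
    weight:       exponent applied to |score| for genes inside the set (assumed)
    """
    members = set(gene_set)
    in_set = [gene in members for gene in ranked_genes]

    # Increment for a hit is proportional to the magnitude of its score.
    hit_weights = [abs(s) ** weight if hit else 0.0
                   for s, hit in zip(scores, in_set)]
    total_hit = sum(hit_weights)
    n_miss = len(ranked_genes) - sum(in_set)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")

    running = 0.0
    es = 0.0
    for w, hit in zip(hit_weights, in_set):
        if hit:
            running += w / total_hit   # step up when the gene is in the set
        else:
            running -= 1.0 / n_miss    # step down when it is not
        if abs(running) > abs(es):     # track the largest deviation from zero
            es = running
    return es


genes = ["a", "b", "c", "d", "e", "f"]
stats = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
top = enrichment_score(genes, stats, {"a", "b"})      # set at the top of the list
bottom = enrichment_score(genes, stats, {"e", "f"})   # set at the bottom of the list
```

    In this toy input a gene set concentrated at the top of the ranked list gives a positive ES and one concentrated at the bottom gives a negative ES, which is why the sign of the ES column is informative.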

@tomsing1
tomsing1 / postgres_json_duckdb.R
Last active September 10, 2023 18:07
duckdb: Retrieving a Postgres table with a json column and unpacking it
library(duckdb)
library(jsonlite)
library(purrr)
con <- dbConnect(duckdb())
# duckdb: install and load postgres extension to connect
# to a Postgres database via duckdb
dbExecute(con, "INSTALL postgres;")
dbExecute(con, "LOAD postgres;")
@tomsing1
tomsing1 / gorilla.csv
Created August 28, 2023 17:06
CSV file with simulated steps & bmi data, similar to Yanai and Lercher, bioRxiv, 2020
bmi steps group
29.96 145.631067961165 F
29.8981818181818 10048.5436893204 M
23.4690909090909 3859.22330097087 M
26.0345454545455 7718.44660194175 M
19.5127272727273 10776.6990291262 M
29.6509090909091 3932.03883495146 M
30.3 2730.58252427184 M
25.1381818181818 7245.14563106796 F
29.4654545454545 11504.854368932 M
@tomsing1
tomsing1 / gorilla.R
Created August 28, 2023 16:26
R script to create a dataset similar to Yanai and Lercher, Selective attention in hypothesis-driven data analysis, bioRxiv, 2020
# source: Matt Dray's blog: https://www.rostrum.blog/posts/2021-10-05-gorilla/
library(magick)
# download and read the image
img_file <- tempfile(fileext = ".jpg")
download.file(
  paste0(
    "https://classroomclipart.com/images/gallery/",
    "Clipart/Black_and_White_Clipart/Animals/",
@tomsing1
tomsing1 / datalad.md
Created August 22, 2023 15:49
Datalad tutorial, based on its amazing documentation

Installation on Mac OS X

brew install datalad wget

Creating a new dataset

First, we create a new, empty dataset. (Reminder: a dataset refers to a folder of files, not any single file.)

@tomsing1
tomsing1 / statistical_misunderstandings.md
Created August 21, 2023 19:49
Statistical misunderstandings

Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations

Excerpted from Greenland et al, Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations, 2016

What P values, confidence intervals, and power calculations don't tell us

Common misinterpretations of single P values

  1. The P value is the probability that the test hypothesis is true; for example, if a test of the null hypothesis gave P = 0.01, the null hypothesis has only a 1 % chance of being true; if instead it gave P = 0.40, the null hypothesis has a 40 % chance of being true.
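
A small calculation may help show why this reading fails. The sketch below is not taken from Greenland et al.; it is a simplified illustration that applies Bayes' rule to the event "the test rejects at level alpha" rather than to a single observed P value. The function name and the prior and power values are hypothetical assumptions chosen for the example.

```python
def prob_null_given_rejection(prior_null, alpha, power):
    """Probability that the null is true given p <= alpha, by Bayes' rule.

    prior_null: assumed fraction of tested hypotheses where the null is true
    alpha:      significance threshold (rejection probability under the null)
    power:      assumed rejection probability when the null is false
    """
    false_positive = prior_null * alpha          # null true and rejected
    true_positive = (1.0 - prior_null) * power   # null false and rejected
    return false_positive / (false_positive + true_positive)


# Same threshold and power, three different (assumed) priors.
for prior in (0.5, 0.9, 0.99):
    posterior = prob_null_given_rejection(prior, alpha=0.01, power=0.8)
    print(f"prior P(null) = {prior:.2f} -> P(null | p <= 0.01) = {posterior:.3f}")
```

With the threshold held at 0.01, the probability that the null is true after a rejection ranges from about 1 % to over 50 % depending on the assumed prior. The P value is computed under the assumption that the test hypothesis holds, so on its own it cannot supply that probability.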
@tomsing1
tomsing1 / duckdb_notes.txt
Created May 1, 2023 17:05
Notes on first steps with duckdb
brew install duckdb
duckdb -c "INSTALL httpfs"
REMOTE_FILE="https://raw.githubusercontent.com/mwaskom/seaborn-data/master/penguins.csv"
# read a remote CSV file into an in-memory duckdb database
duckdb -c "SELECT * FROM '${REMOTE_FILE}';"
# By default, the CLI will open a temporary in-memory database.