Alexander Bender ATpoint

@ATpoint
ATpoint / getKEGG.R
Created November 6, 2024 14:41
Retrieve human-readable KEGG pathway name to Ensembl gene ID mapping table
pkgs <- c("KEGGREST", "org.Mm.eg.db", "tidyverse", "AnnotationDbi")
invisible(lapply(pkgs, function(x) suppressPackageStartupMessages(library(x, character.only = TRUE))))
pathway_id_2_name <-
keggList("pathway", "mmu") %>%
enframe(name = "pathway_id", value = "pathway_name") %>%
mutate(pathway_id = gsub("mmu", "", pathway_id),
pathway_name = gsub(" - .*", "", pathway_name))
pathway_2_entrez <-
@ATpoint
ATpoint / compare_md5.sh
Last active October 9, 2024 09:00
Compare md5sums in a file (output of md5sum) with current md5 of these files
#!/bin/bash
# Read existing md5sums and use to confirm integrity of fastq files
#SBATCH --nodes=1
#SBATCH --cpus-per-task=36
#SBATCH --partition=normal
#SBATCH --time=08:00:00
#SBATCH [email protected]
#SBATCH --job-name=md5checker
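The preview above stops before the actual check. As a rough illustration of the idea in the description (comparing stored md5sums with the current checksums of the same files), here is a minimal sketch. It is not the gist's code: it assumes GNU coreutils `md5sum`, and the file names `sample_1.fastq`, `sample_2.fastq` and `md5sums.txt` are made up for the demo.

```shell
#!/bin/bash
# Minimal sketch, not the original gist: verify files against a saved md5sum listing.
# Assumes GNU coreutils md5sum; all file names below are hypothetical demo data.
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Create two small stand-in "fastq" files and record their checksums,
# mimicking an md5sum output file shipped alongside the data.
printf 'ACGT\n' > sample_1.fastq
printf 'TTGA\n' > sample_2.fastq
md5sum sample_*.fastq > md5sums.txt

# --check recomputes each checksum listed in the file; --quiet reports only mismatches.
if md5sum --check --quiet md5sums.txt; then
  status_clean="OK"
else
  status_clean="FAILED"
fi

# Simulate a corrupted transfer by changing one file, then check again.
printf 'corrupted\n' >> sample_2.fastq
if md5sum --check --quiet md5sums.txt; then
  status_modified="OK"
else
  status_modified="FAILED"
fi

echo "before modification: $status_clean"
echo "after modification:  $status_modified"
```

Because `md5sum --check` exits non-zero when any file mismatches, the same call can be used directly as the pass/fail condition of a batch job.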
@ATpoint
ATpoint / velocity_windows.R
Created October 8, 2024 08:35
Get scvelo going on Windows using reticulate and conda
# Install R infrastructure
pkg_install <- c("reticulate", "zellkonverter")
BiocManager::install(pkg_install, update = FALSE)
# Install conda itself via the reticulate package, takes time...
reticulate::install_miniconda()
# Create an environment for velocity with a specific python version that is required
# to install a downgraded matplotlib, see below
reticulate::conda_create(envname = "velocity", python_version = "3.8.19")
geom_segment(data=metadata(sce)$grid.df,
mapping=aes(x=start.1, y=start.2, xend=end.1, yend=end.2),
size=1,
arrow=arrow(length=unit(0.1, "inches"), type="closed"), inherit.aes = FALSE)
@ATpoint
ATpoint / readLoomMatrices.R
Created January 23, 2024 08:17
Standalone function to read loom files from velocyto into R as CsparseMatrix, adapted from velocyto.R.
#' Read loom files into R as CsparseMatrix, depending on Matrix and hdf5r
#' @param file path to the loom file on disk
#' @param include.ambiguous logical, whether to also read the "ambiguous" counts from velocyto, default FALSE
#'
readLoomMatrices <- function(file, include.ambiguous=FALSE) {
require(Matrix)
require(hdf5r)
engine='hdf5r'
@ATpoint
ATpoint / BundleAndValidate.sh
Last active January 16, 2024 20:02
Tar or zip a directory and validate integrity by comparison of file list md5sums
#!/bin/bash
# BundleAndValidate <FOLDER_TO_BUNDLE> <tar/tarReplaceLink/zip> <Destination>
#
# Tar a folder and validate integrity of the tarball. For this, compare the file listing of the input folder
# with the file listing of the tarball via md5sum. If identical assume tarball is fine.
# Option tar makes a tarball, tarReplaceLink uses tar -h to replace symlinks with the files they point to, and zip makes a zipball.
# Don't be surprised if it takes some time before you see the tarball at the destination after submitting the job.
# The first step is a full listing of the origin folder, and with many folders and files that can take a while.
# Uses mbuffer internally to buffer incoming tar stream.
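The header comments describe the validation idea: hash the file listing of the input folder, hash the file listing of the tarball, and compare the two. The sketch below illustrates that idea only and is not the gist's implementation. It leaves out `mbuffer`, the `zip` and `tarReplaceLink` modes, and job submission. The demo folder and file names are hypothetical.

```shell
#!/bin/bash
# Minimal sketch, not the original script: tar a folder and compare the md5 of
# the folder's file listing with the md5 of the tarball's file listing.
# Demo paths are hypothetical; mbuffer and the zip/tarReplaceLink modes are omitted.
set -euo pipefail

workdir=$(mktemp -d)
mkdir -p "$workdir/input/sub"
printf 'a\n' > "$workdir/input/a.txt"
printf 'b\n' > "$workdir/input/sub/b.txt"

# Bundle the folder; -C makes the archived paths relative to $workdir.
tar -C "$workdir" -cf "$workdir/bundle.tar" input

# Listing of the source folder, as sorted paths relative to its parent.
src_md5=$(cd "$workdir" && find input | LC_ALL=C sort | md5sum | cut -d' ' -f1)

# Listing of the tarball; tar prints directories with a trailing slash,
# so strip it to make both listings comparable.
tar_md5=$(tar -tf "$workdir/bundle.tar" | sed 's|/$||' | LC_ALL=C sort | md5sum | cut -d' ' -f1)

if [ "$src_md5" = "$tar_md5" ]; then
  result="listing matches"
else
  result="listing differs"
fi
echo "$result"
```

Note that this compares path names only, as the comments above describe. It detects missing or extra entries but not changed file contents.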
#' Create unique combinations
#'
#' To a given vector of numeric values add or subtract a constant value,
#' and return all possible combinations of that as a matrix
#'
#' @param values a vector of numeric values
#' @param change_value a numeric value to add or subtract
#'
#' @details
#' There are 2^length(values) possible combinations
reprex::reprex({
d <- data.frame(a=1)
# Works
suppressMessages({library(AnnotationDbi); library(dplyr)})
d %>% select(a)
detach("package:AnnotationDbi", unload=TRUE)
detach("package:dplyr", unload=TRUE)
@ATpoint
ATpoint / getREACTOME.R
Created July 7, 2023 12:22
Get REACTOME terms, translate to mouse
# Retrieve REACTOME terms for human directly from the website, then map to mouse.
# Pull human to mouse mappings via biomaRt.
library(biomaRt)
library(data.table)
library(magrittr)
library(rtracklayer)
library(tidyverse)
options(timeout=999)
@ATpoint
ATpoint / retrieveKEGG.R
Last active June 26, 2023 11:15
Query biomaRt and the KEGG API to produce a table connecting KEGG pathways with gene names for mouse and human
library(biomaRt)
library(data.table)
library(dplyr)
library(magrittr)
kegg <- sapply(c("mouse", "human"), function(x){
if(x=="mouse"){
dataset <- "mmusculus_gene_ensembl"