Anand Mayakonda PoisonAlien

category path
Machine_Learning https://raw.githubusercontent.com/duerrsimon/bioicons/refs/heads/main/static/icons/bsd/Machine_Learning/Facebook/PyTorch.svg
Machine_Learning https://raw.githubusercontent.com/duerrsimon/bioicons/refs/heads/main/static/icons/bsd/Machine_Learning/Google/Tensorflow.svg
Machine_Learning https://raw.githubusercontent.com/duerrsimon/bioicons/refs/heads/main/static/icons/bsd/Machine_Learning/Jupyter/Jupyter.svg
Amino-Acids https://raw.githubusercontent.com/duerrsimon/bioicons/refs/heads/main/static/icons/cc-0/Amino-Acids/B--Gideon-Bergheim/alanine_chem.svg
Amino-Acids https://raw.githubusercontent.com/duerrsimon/bioicons/refs/heads/main/static/icons/cc-0/Amino-Acids/B--Gideon-Bergheim/alanine_noH.svg
Amino-Acids https://raw.githubusercontent.com/duerrsimon/bioicons/refs/heads/main/static/icons/cc-0/Amino-Acids/B--Gideon-Bergheim/alanine.svg
Amino-Acids https://raw.githubusercontent.com/duerrsimon/bioicons/refs/heads/main/static/icons/cc-0/Amino-Acids/B--Gideon-Bergheim/amino_acid_back
category path
Intracellular_components https://raw.githubusercontent.com/duerrsimon/bioicons/refs/heads/main/static/icons/cc-0/Intracellular_components/Simon_Dürr/zincfinger.svg
Intracellular_components https://raw.githubusercontent.com/duerrsimon/bioicons/refs/heads/main/static/icons/cc-0/Intracellular_components/ASE/histone_complex.svg
Intracellular_components https://raw.githubusercontent.com/duerrsimon/bioicons/refs/heads/main/static/icons/cc-0/Intracellular_components/ASE/histone_complex_acetylated.svg
Intracellular_components https://raw.githubusercontent.com/duerrsimon/bioicons/refs/heads/main/static/icons/cc-0/Intracellular_components/jaiganesh/mitochondria.svg
Intracellular_components https://raw.githubusercontent.com/duerrsimon/bioicons/refs/heads/main/static/icons/cc-0/Intracellular_components/jaiganesh/proteasome.svg
Intracellular_components https://raw.githubusercontent.com/duerrsimon/bioicons/refs/heads/main/static/icons/cc-0/Intracellular_components/jaiganesh/Endoplasmic_Reticulum.svg
Intracell
---
title: "Sequencing run summary"
date: "Generated on: `r Sys.Date()`"
output:
  html_document:
    toc: true
    toc_depth: 3
    toc_float: true
    self_contained: yes
    theme: sandstone
#!/usr/bin/env python3
#A simple script to predict gnomAD ancestry using PCA loadings trained on gnomAD V3 datasets
#See here for details: https://gnomad.broadinstitute.org/news/2021-09-using-the-gnomad-ancestry-principal-components-analysis-loadings-and-random-forest-classifier-on-your-dataset/
#Author: Anand Mayakonda
import sys
import os.path
import shutil
import argparse
PoisonAlien / bwview.sh
Created July 27, 2023 09:13
subset a bigWig file
#!/usr/bin/env bash
#Script to subset a bigWig file to user-specified loci
#MIT License (Anand Mayakonda; [email protected])
function usage (){
echo "Subset a bigWig file for genomic loci.
Requires UCSC kentutils bigWigToBedGraph and bedGraphToBigWig to be installed
Binaries available from: https://hgdownload.soe.ucsc.edu/admin/exe/
pipeline_dir="./"
echo "Downloading VEP cache.." 1>&2
mkdir -p "${pipeline_dir}/resources/vep_cache/"
cd "${pipeline_dir}/resources/vep_cache/"
curl -O https://ftp.ensembl.org/pub/release-107/variation/indexed_vep_cache/homo_sapiens_vep_107_GRCh38.tar.gz
tar -xzf homo_sapiens_vep_107_GRCh38.tar.gz -C ./
wget https://ftp.ensembl.org/pub/release-107/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
gunzip -c Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz | bgzip > Homo_sapiens.GRCh38.dna.primary_assembly.fa.bgzip
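The two downloads above differ only in their path under the Ensembl release directory. As a rough illustration, the sketch below rebuilds both URLs from a release number and an assembly name. The function name `vep_resource_urls` and its parameters are hypothetical and not part of the original script, and the assumption that other releases follow the same directory layout is not verified here.

```python
# Hypothetical helper: rebuild the two Ensembl URLs used in the shell snippet.
# Only release 107 / GRCh38 is confirmed by the source; other values are assumptions.
ENSEMBL_FTP = "https://ftp.ensembl.org/pub"


def vep_resource_urls(release: int = 107, assembly: str = "GRCh38") -> dict:
    """Return the VEP cache and primary-assembly FASTA URLs for one release."""
    base = f"{ENSEMBL_FTP}/release-{release}"
    cache = (
        f"{base}/variation/indexed_vep_cache/"
        f"homo_sapiens_vep_{release}_{assembly}.tar.gz"
    )
    fasta = (
        f"{base}/fasta/homo_sapiens/dna/"
        f"Homo_sapiens.{assembly}.dna.primary_assembly.fa.gz"
    )
    return {"cache": cache, "fasta": fasta}


if __name__ == "__main__":
    # Print both URLs so they can be passed to curl or wget.
    for name, url in vep_resource_urls().items():
        print(name, url)
```

Keeping the release number in one place avoids the cache and the reference FASTA drifting to different releases.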
PoisonAlien / createproject.sh
Last active November 28, 2021 11:41
A minimal project template directory structure that I use for my Bioinformatic projects
#!/usr/bin/env bash
#A minimal project template structure that I use for my Bioinformatic projects
#MIT License (Anand Mayakonda; [email protected])
function usage() {
echo "createproject.sh - Create a project template directory structure
Usage: createproject.sh [option] <project_name>
#Wrapper around goseq
#'@param assayedGenes total gene IDs that were measured
#'@param deGenes DE gene IDs
#'@param source_id Can be `ensGene` or `geneSymbol`
#'@param hyperGeo Default TRUE. Set to FALSE for RNA-seq data
goseq_wrapper = function(assayedGenes, deGenes, source_id = "ensGene", hyperGeo = TRUE){
  gene_vector = as.integer(assayedGenes %in% deGenes)
  names(gene_vector) = assayedGenes
  pwf = suppressWarnings(suppressMessages(goseq::nullp(DEgenes = gene_vector, genome = "hg19", id = source_id, plot.fit = FALSE)))
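The wrapper above starts by encoding the assayed genes as a 0/1 vector marking which ones are differentially expressed. The Python sketch below mirrors that encoding and, as an assumption about what a plain hypergeometric test would compute for a single gene category, adds an upper-tail p-value. The function names are hypothetical, and this is not goseq's implementation (the rest of the R function is not shown in the source).

```python
from math import comb


def indicator_vector(assayed_genes, de_genes):
    """Map each assayed gene to 1 if it is differentially expressed, else 0."""
    de_set = set(de_genes)
    return {gene: int(gene in de_set) for gene in assayed_genes}


def hypergeom_upper_tail(n_assayed, n_in_category, n_de, n_overlap):
    """P(X >= n_overlap) when n_de genes are drawn without replacement from
    n_assayed genes, of which n_in_category belong to the category."""
    upper = min(n_in_category, n_de)
    total = comb(n_assayed, n_de)
    hits = sum(
        comb(n_in_category, i) * comb(n_assayed - n_in_category, n_de - i)
        for i in range(n_overlap, upper + 1)
    )
    return hits / total


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    assayed = ["G1", "G2", "G3", "G4", "G5"]
    de = ["G2", "G5"]
    category = {"G2", "G3"}  # hypothetical gene set

    vec = indicator_vector(assayed, de)
    overlap = sum(vec[g] for g in category)
    p = hypergeom_upper_tail(len(assayed), len(category), sum(vec.values()), overlap)
    print(vec, overlap, p)
```

This test treats every gene as equally likely to be called differentially expressed, which is why a length-bias correction is usually preferred for RNA-seq data.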
####################################################################################
#
# Best-practice 450k/EPIC QC and preprocessing workflow for the PPCG project
#
# creator: Pavlo Lutsik
#
# 30.01.2021
####################################################################################
library(RnBeads)