Thanh Lee thanhleviet

thanhleviet / print_sample_sheet.nf
Created October 6, 2021 12:24
Parse and print records in a csv with nextflow
#!/usr/bin/env nextflow
nextflow.enable.dsl=2
ch_pilon = Channel.fromPath(params.sample_sheet)
    .splitCsv(header: true)
    .map { row -> tuple(row.sample_id, [row.sr1, row.sr2], row.contigs) }
ch_pilon.view()
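For readers less familiar with Nextflow channels, the record shape produced by the `.map` step above can be sketched in plain Python. This is only an illustration, not part of the gist: the function name `parse_sample_sheet` and the example rows are hypothetical, and the only thing taken from the source is the set of column names (`sample_id`, `sr1`, `sr2`, `contigs`).

```python
import csv
import io


def parse_sample_sheet(handle):
    """Yield (sample_id, [sr1, sr2], contigs) for each row of a CSV with a header."""
    for row in csv.DictReader(handle):
        yield (row["sample_id"], [row["sr1"], row["sr2"]], row["contigs"])


# Hypothetical sample sheet contents, for illustration only.
example = io.StringIO(
    "sample_id,sr1,sr2,contigs\n"
    "s1,s1_R1.fastq.gz,s1_R2.fastq.gz,s1.fasta\n"
)
records = list(parse_sample_sheet(example))
print(records)
```

Each record groups the two short-read files into one list, mirroring the `[row.sr1, row.sr2]` element of the Nextflow tuple.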
thanhleviet / eben.r
Created July 22, 2021 17:28
R script to manipulate data
library(tidyverse)
library(data.table)
library(janitor)
csv <- fread("Ebenn_code_data_21Jul21_18.08.csv")
features <- names(csv)[-c(1,2,4)]
sample_names <- csv$Name %>%
  gsub("_flye_[a-z\\_]*|_hybrid", "", .) %>%
thanhleviet / generate_barcode_nbd96.r.S
Last active July 27, 2021 18:43
Simple R code to generate barcode indexes for mapping to sample names, since biologists sometimes send only the well index rather than the barcode name
#Generate barcode indexes for NBD96 kit
rm(list = ls())
library(dplyr)
library(data.table)
index <- sapply(seq(1:6), function(x) paste0(LETTERS[1:8],x)) %>% t() %>% as.vector()
bc <- list()
letter_length <- 8
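The well ordering built by the `index` line above (rows A to H, columns 1 to 6, flattened row by row) can be sketched in Python as follows. The preview is truncated before the barcode assignment, so the sequential `barcodeNN` naming below is purely an assumption for illustration, not the gist's actual mapping.

```python
import string

# Rows A-H and columns 1-6, flattened row by row: A1..A6, B1..B6, ..., H1..H6.
# This mirrors the transpose-then-flatten step in the R preview.
rows = string.ascii_uppercase[:8]
wells = [f"{row}{col}" for row in rows for col in range(1, 7)]

# ASSUMPTION: barcodes are numbered sequentially in well order.
# The truncated source does not show how barcodes are really assigned.
well_to_barcode = {well: f"barcode{i:02d}" for i, well in enumerate(wells, start=1)}

print(wells[:7])
print(well_to_barcode["B1"])
```

A lookup table of this kind lets a sample sheet keyed by well index be joined to demultiplexed barcode folders, provided the assumed numbering matches the kit layout.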
thanhleviet / snakemake-pure-python.py
Created June 24, 2021 22:44 — forked from marcelm/snakemake-pure-python.py
pure Python module that uses snakemake to construct and run a workflow
#!/usr/bin/env python3
"""
Running this script is (intended to be) equivalent to running the following Snakefile:
include: "pipeline.conf" # Should be an empty file
shell.prefix("set -euo pipefail;")
rule all:
    input:
Bootstrap: docker
From: ubuntu:xenial

%labels
    Author: Thanh Le Viet
    Software: pangolin
    Description: "Pangolin: Software package for assigning SARS-CoV-2 genome sequences to global lineages"
    Notes: "This singularity definition is based on https://github.com/StaPH-B/docker-builds"
thanhleviet / README.md
Created July 28, 2020 11:12 — forked from hkwi/README.md
libvirt + prometheus + grafana
thanhleviet / find.sh
Created July 10, 2020 10:32 — forked from gr1ev0us/find.sh
Cheatsheet for the Linux find command
# Cheatsheet for the Linux find command.
# Taken from here http://alvinalexander.com/unix/edu/examples/find.shtml
# basic 'find file' commands
# --------------------------
find / -name foo.txt -type f -print # full command
find / -name foo.txt -type f # -print isn't necessary
find / -name foo.txt # don't have to specify "type==file"
find . -name foo.txt # search under the current dir
find . -name "foo.*" # wildcard
#Required libraries
library(tidyverse)
library(rgdal)
library(rgeos)
library(broom)
library(maps)
#Postcode spatial data
#The postcode map can be downloaded from https://www.opendoorlogistics.com/wp-content/uploads/Data/UK-postcode-boundaries-Jan-2015.zip
england <- readOGR(
  dsn = "./osm/postcode/Distribution/",
thanhleviet / watch_and_basecalling.sh
Last active September 1, 2020 12:53
Watch a folder, split fast5 into 150-file chunks and run guppy basecalling from a chunk list.
#!/usr/bin/env bash
# Author: Thanh Le Viet
# This script splits consecutive fast5 files into batch file lists for basecalling with guppy.
# It is used for "live" basecalling while sequencing is still running on another machine.
# Command: bash ./run_basecalling.sh
# summary_file="sequencing_summary_FAO15487_23198198.txt"
# Usage: run watch_and_basecalling.sh sequencing_summary_FAO15487_23198198.txt
# Note: each run has a different summary_file name.
summary_file=$1
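The batching idea in the description (150 fast5 files per chunk) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the gist's bash implementation: the helper name `chunk` and the file names are hypothetical, and the sketch only groups names in memory without watching a folder or calling guppy.

```python
from itertools import islice


def chunk(names, size=150):
    """Yield consecutive lists of at most `size` items from `names`."""
    iterator = iter(names)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Hypothetical fast5 file names, for illustration only.
fast5_files = [f"read_{i}.fast5" for i in range(320)]
batches = list(chunk(fast5_files))
print([len(batch) for batch in batches])
```

In a live run the final, incomplete batch would probably be held back until more files arrive or the run ends; that policy is a design choice and is not shown in the preview above.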
thanhleviet / merge_lanes.nf
Last active April 9, 2020 23:33
nextflow script to merge 4 fastq lanes for Illumina data
params.reads = "/beegfs/sars-cov2/Test_060420"
ch_reads = Channel.fromFilePairs(params.reads + "/" + "*_R{1,2}_001.fastq.gz", flat: true)
ch_reads
    // Strip the lane suffix (_L001 to _L004) so that all lanes of a sample share one key
    .map { it -> [it[0].replaceAll(~/_L00[1-4]/, ""), it[1], it[2]] }
    .groupTuple(by: 0)
    .into { ch_reads_in; ch_test }
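The grouping logic above (strip the lane suffix from the pair ID, then collect R1 and R2 files per sample) can be sketched in Python. The function name `group_lanes` and the example file names are hypothetical. The sketch only shows how per-lane pairs collapse onto one sample key, and it leaves out the actual concatenation of the fastq files.

```python
import re
from collections import defaultdict

# Lane suffix as it appears in Illumina file names, e.g. _L001 to _L004.
LANE = re.compile(r"_L00[1-4]")


def group_lanes(pairs):
    """Group (pair_id, r1, r2) records by pair_id with the lane suffix removed."""
    grouped = defaultdict(lambda: ([], []))
    for pair_id, r1, r2 in pairs:
        key = LANE.sub("", pair_id)
        grouped[key][0].append(r1)
        grouped[key][1].append(r2)
    return dict(grouped)


# Hypothetical per-lane read pairs, for illustration only.
pairs = [
    ("A_S1_L001", "A_S1_L001_R1_001.fastq.gz", "A_S1_L001_R2_001.fastq.gz"),
    ("A_S1_L002", "A_S1_L002_R1_001.fastq.gz", "A_S1_L002_R2_001.fastq.gz"),
]
merged = group_lanes(pairs)
print(merged)
```

Each value holds the R1 files and the R2 files as two parallel lists, which is the same shape `groupTuple(by: 0)` yields for the two file columns.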