@sephraim
sephraim / tab2csv.sh
Created March 2, 2015 16:29
Convert a TSV (tab-separated values) file to a CSV (comma-separated values) file
#!/bin/sh
# Convert a TSV (tab) file to a CSV (comma) file
#
# Please note that this will surround all values with
# double-quotes. All other double-quotes will be escaped.
#
# Example usage:
# ./tab2csv.sh myfile.tsv > myfile.csv
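The body of tab2csv.sh is not shown in this preview. The sketch below is one possible way to get the behaviour the header describes (every value wrapped in double-quotes, existing double-quotes escaped). The function name `tsv_to_csv` is hypothetical, and escaping by doubling the quote character is an assumption; the original may escape differently.

```shell
#!/bin/sh
# Hypothetical helper, not the gist's actual body.
# Reads TSV on stdin and writes CSV on stdout.
tsv_to_csv() {
  awk 'BEGIN { FS = "\t"; OFS = "\",\"" }
       {
         gsub(/"/, "\"\"")   # assumption: escape embedded quotes by doubling them
         $1 = $1             # force awk to rebuild the record using OFS
         print "\"" $0 "\""  # add the opening and closing quote
       }'
}
```

With this sketch, usage would look like `tsv_to_csv < myfile.tsv > myfile.csv`.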
sephraim / sort_variants.rb
Last active March 28, 2017 22:24
Sort a list of genetic variants by chromosomal position
# Sorts a file by chromosomal position
#
# Input file must have the following format:
# - Column 1: chromosome (e.g. chr1, chr10 *OR* 1, 10)
# - Column 2: start position (e.g. 4325484)
# - All other columns can be ordered in any way
#
# Input file sample:
# chr17:5432542:G>A .........
# 17:5432542:G>A .........
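The Ruby body of sort_variants.rb is not shown in this preview. As a rough shell equivalent, the sketch below sorts tab-separated rows by chromosome and then by start position. It assumes the two-column layout described in the header (not the colon-joined sample), and the function name `sort_variants` and the numeric ranks given to X, Y and M are illustrative choices, not taken from the gist.

```shell
#!/bin/sh
# Hypothetical helper: sort TSV rows on stdin by chromosome, then position.
sort_variants() {
  awk 'BEGIN { FS = OFS = "\t" }
       {
         c = $1
         sub(/^chr/, "", c)               # accept both "chr17" and "17"
         if (c == "X") k = 23             # assumed ranks for human chromosomes
         else if (c == "Y") k = 24
         else if (c == "M" || c == "MT") k = 25
         else k = c + 0                   # other non-numeric names become 0 and sort first
         print k, $0                      # prepend the numeric sort key
       }' |
    sort -t "$(printf '\t')" -k1,1n -k3,3n |  # key column, then original column 2
    cut -f2-                                  # drop the helper key again
}
```

With this sketch, usage would look like `sort_variants < variants.tsv > variants.sorted.tsv`.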
sephraim / swap_strand.rb
Created March 17, 2015 17:16
Swap nucleotide sequence to opposite chromosome strand
# Swaps the nucleotide sequence to the opposite strand
#
# This means if the sequence is on the forward strand,
# then the function will return the corresponding sequence
# on the reverse strand, and vice versa.
#
# For example:
# Original: TCCAGACAC
# Swapped: GTGTCTGGA
#
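The Ruby function itself is not shown in this preview. Swapping strands as described amounts to taking the reverse complement, which can be sketched in shell as follows. The function name `swap_strand` is hypothetical, and only the bases A, C, G and T (upper or lower case) are handled; any other character is passed through unchanged.

```shell
#!/bin/sh
# Hypothetical helper: print the reverse complement of the sequence in $1.
swap_strand() {
  printf '%s\n' "$1" |
    tr 'ACGTacgt' 'TGCAtgca' |    # complement each base
    awk '{
           out = ""
           for (i = length($0); i > 0; i--)   # walk the string backwards
             out = out substr($0, i, 1)
           print out
         }'
}
```

Calling `swap_strand TCCAGACAC` reproduces the example above and prints `GTGTCTGGA`.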
#!/bin/bash
# Split a file into N parts
#
# Each resulting file will have a .ptXX suffix
#
# Example - split file into 30 parts:
# ./parts.sh myfile.txt 30
# Lines per part = ceil(total lines / N), so at most N parts are produced
split -d -l "$(( ($(wc -l < "$1") + $2 - 1) / $2 ))" "$1" "$1.pt"
#!/bin/bash
##
# Remove Duplicates in a VCF
#
# A duplicate variant is when multiple records have the same
# CHROM, POS, REF, and ALT. This script will pick *one* of the
# duplicate variants and discard the rest. The record that is
# picked is the one that comes first in sorting order.
#
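The script body is cut off in this preview. A minimal sketch of the rule described above could look like the following. It assumes a tab-separated, uncompressed VCF, reads "sorting order" as a plain byte-wise `sort` of the whole record line, and uses the hypothetical function name `dedupe_vcf`; the original script may sort differently.

```shell
#!/bin/sh
# Hypothetical helper: print the VCF in $1 with duplicate variants removed.
dedupe_vcf() {
  vcf=$1
  # Header lines (starting with "#") are passed through unchanged.
  grep '^#' "$vcf"
  # Records are sorted, then only the first record per
  # CHROM (col 1), POS (col 2), REF (col 4), ALT (col 5) is kept.
  grep -v '^#' "$vcf" |
    LC_ALL=C sort |
    awk -F '\t' '!seen[$1 FS $2 FS $4 FS $5]++'
}
```

Note that the records come out in byte-wise sorted order, not in chromosomal order, so a position sort may be needed afterwards.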
sephraim / rename_HGMD_INFO_tags.sh
Created April 28, 2016 00:02
The following script will capitalize all INFO tags and add the prefix "HGMD_" to each tag.
#!/bin/bash
##
# Rename HGMD INFO tags
#
# Example usage:
# ./rename_HGMD_INFO_tags.sh hgmd-hg19.vcf > hg19_HGMD_2015r4.MORL.vcf
##
tags=(
sephraim / 0_reuse_code.js
Created November 10, 2016 21:27
Here are some things you can do with Gists in GistBox.
// Use Gists to store code you would like to remember later on
console.log(window); // log the "window" object to the console
sephraim / create_db.sql
Last active November 11, 2016 22:46
Join multiple CSVs / TSVs using SQLite
/**
* STEP 1: Write queries to import files into an SQLite database
**/
/* Set file input mode */
.mode tabs
/* Import tables from TSV files */
.import file1.tsv table1
.import file2.tsv table2
sephraim / find_duplicates_first_N_cols.sh
Last active November 11, 2016 22:45
Print duplicate lines based on first N columns
grep -wFf <(cut -f__COL1__-__COL2__ __FILE__ | sort | uniq -d) __FILE__
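The one-liner above matches each duplicated key as a fixed string anywhere in a line, so a key that also happens to appear in a later column would be reported too. A stricter variant, sketched here under the assumption of tab-separated input, compares only the first N columns. The function name `find_duplicates` and its argument order are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print every line of file $2 whose first $1 columns
# also occur on at least one other line.
find_duplicates() {
  n=$1
  file=$2
  # The file is read twice: pass 1 counts keys, pass 2 prints repeated ones.
  awk -F '\t' -v n="$n" '
    { key = $1; for (i = 2; i <= n; i++) key = key FS $i }  # build the key
    NR == FNR { count[key]++; next }                        # pass 1: count
    count[key] > 1                                          # pass 2: print
  ' "$file" "$file"
}
```

With this sketch, `find_duplicates 2 myfile.tsv` would print all lines that share their first two columns with another line, in their original order.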
sephraim / add_column.sql
Last active November 11, 2016 22:43
Add column to MySQL database
ALTER TABLE `__TABLE__`
ADD `__COL__`
VARCHAR(255) NULL
DEFAULT NULL
COMMENT '__COMMENT__'
AFTER `__ANOTHER_COL__` ;