@ConradStack
ConradStack / genome_links.md
Created November 1, 2016 14:43
Links to publicly available genomes
ConradStack / append.pl
Created January 9, 2017 23:22 — forked from jimhester/append.pl
Parsing FASTA files in Perl, Ruby, Python, and Go
#!/usr/bin/env perl
use warnings;use strict;
my ($header,$sequence);
$header = <>;
chomp $header;
while(my $line = <>){
chomp $line;
if($line =~ /^>/){
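The Perl preview above is cut off at the header test. For comparison, a minimal Python sketch of the same idea (accumulate sequence lines until the next `>` header) could look like the following. The function name `read_fasta` and the sample records are illustrative assumptions, not taken from the gist.

```python
import io

def read_fasta(handle):
    """Yield (header, sequence) tuples from an open FASTA handle."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped; join them into one string.
            chunks.append(line.strip())
    # Emit the final record once the input is exhausted.
    if header is not None:
        yield header, "".join(chunks)

# Hypothetical in-memory input, used only for illustration.
sample = io.StringIO(">seq1 first\nACGT\nTTGA\n>seq2\nGGCC\n")
records = list(read_fasta(sample))
```

Because `read_fasta` is a generator, it holds only one record in memory at a time, which matters for genome-sized files.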
ConradStack / bamfilter_oneliners.md
Created February 17, 2017 19:59 — forked from davfre/bamfilter_oneliners.md
SAM and BAM filtering oneliners
ConradStack / clear_pagecache.sh
Created September 28, 2017 02:19
Clear the Linux page cache
# From [this tutorial](https://www.tecmint.com/clear-ram-memory-cache-buffer-and-swap-space-on-linux/)
# Requires root. Writing 1 drops the page cache only (2 = dentries and inodes, 3 = both).
sync; echo 1 > /proc/sys/vm/drop_caches
ConradStack / pyfaidx.extract_by_name.py
Created September 30, 2017 02:06
Extract sequences from a FASTA file, preserving read name comments
# Create a new FASTA file given a FASTA file and a list of sequence names
# - outputting the long_name does/did not seem to work properly in the faidx script that is packaged with pyfaidx
from pyfaidx import *
# read fasta file
fa = Fasta('test.fa')
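The preview stops after the file is opened, so the rest of the gist's approach is not visible here. As a stand-in for the idea only, the sketch below filters records by name while keeping the full header line (ID plus comment). It deliberately uses the standard library rather than pyfaidx; `extract_by_name` and the sample data are hypothetical names introduced for illustration.

```python
import io

def extract_by_name(handle, wanted):
    """Return FASTA text for records whose ID (first word of the header) is in `wanted`.

    The whole header line, including any comment after the ID, is kept as-is.
    """
    wanted = set(wanted)
    out, keep = [], False
    for line in handle:
        if line.startswith(">"):
            # The ID is the text between '>' and the first whitespace.
            fields = line[1:].split(None, 1)
            keep = bool(fields) and fields[0] in wanted
        if keep:
            out.append(line.rstrip("\r\n"))
    return "\n".join(out) + ("\n" if out else "")

# Hypothetical input, used only for illustration.
fasta_text = ">read1 sample=A\nACGT\n>read2 sample=B\nGGCC\nTTAA\n>read3\nAAAA\n"
subset = extract_by_name(io.StringIO(fasta_text), ["read2", "read3"])
```

This streams the file once and needs no index, which is simpler than an indexed lookup but slower when only a few records are wanted from a very large file.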
ConradStack / get_columns.sql
Created February 20, 2019 04:40
SQL server query to get the list of columns in a table along with Data types, NOT NULL, and PRIMARY KEY constraints
/*
From [this Stack Overflow post](https://stackoverflow.com/questions/2418527/sql-server-query-to-get-the-list-of-columns-in-a-table-along-with-data-types-no)
*/
SELECT
c.name 'Column Name',
t.Name 'Data type',
c.max_length 'Max Length',
c.precision ,
c.scale ,
# Derived from https://towardsdatascience.com/how-to-fine-tune-gpt-2-for-text-generation-ae2ea53bc272
import os
import pandas as pd
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import numpy as np
import random
import torch
from torch.utils.data import Dataset, DataLoader