Last active
August 29, 2015 14:16
-
-
Save crazyhottommy/6360523051f63fe56cc8 to your computer and use it in GitHub Desktop.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#! /usr/bin | |
# put the coordinates in a bed file | |
infile=$1 | |
while read chr start end | |
do | |
samtools faidx ref.fasta $chr:$start-$end >> test.fa | |
done <$infile |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# Use the cruzdb python package https://github.com/brentp/cruzdb | |
from cruzdb import sequence | |
print sequence.sequence(‘hg19’,’chr1’, 100000, 100010) | |
# cactaagcaca | |
# 1 based! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# download twoBitToFa here http://hgdownload.cse.ucsc.edu/admin/exe/ | |
twoBitToFa http://hgdownload.cse.ucsc.edu/gbdb/hg19/hg19.2bit test.fa -seq=chr1 -start=100000 -end=100010 | |
cat test.fa | |
>chr1:100000-100010 | |
actaagcaca | |
# Note that it is 0 based cooridnates system | |
# Use betools get fasta http://bedtools.readthedocs.org/en/latest/content/tools/getfasta.html | |
# you need a fasta file of the whole genome, and a bed file containing the cooridnates | |
bedtools getfasta -fi UCSC_hg19_genome.fa -bed input.bed -fo out.fa | |
cat out.fa | |
>chr1:100000-100010 | |
actaagcaca | |
# it is also 0 based | |
# Use samtools faidx | |
# First, index your fasta file (only have to do this once) | |
samtools faidx reference.fa | |
# then extract the sequences you want | |
samtools faidx ref.fasta chr1:100000-100010 | |
>chr1:100000-100010 | |
cactaagcaca | |
# note that it is 1 based ! |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment