Last active
August 29, 2015 14:16
To add sequences as a "contaminants database" in a FastQC analysis, provide a contaminant file with one record per line in the format header[tab]sequence. Usage: fasta2oneline.sh example.fa > contaminants.txt
#!/bin/bash
# fasta2oneline.sh
# Convert a FASTA file to one line per record: header[tab]sequence (useful for a FastQC contaminant file, for example)
# Writes to stdout; redirect it to a file
# Usage: fasta2oneline.sh x_example.fa > z_contaminants.txt
INPUT_FILE=$1
# Drop blank lines, then join each header with its sequence lines (\t in the replacement assumes GNU sed)
sed '/^$/d' "${INPUT_FILE}" | sed -n '/^>/!{H;$!b};s/$/ \t/;x;1b;s/\n//g;p'
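The same conversion can be sketched with awk, which may be easier to follow than the sed one-liner. This is an alternative sketch, not the method used in this gist; the function name `fasta_to_oneline` is hypothetical, and the sketch assumes Unix line endings.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original gist): produces the same
# header[tab]sequence layout, written with awk instead of sed.
fasta_to_oneline() {
    awk '
        /^[[:space:]]*$/ { next }                 # skip blank lines
        /^>/ {                                    # a header line starts a new record
            if (header != "") print header "\t" seq
            header = $0
            seq = ""
            next
        }
        { seq = seq $0 }                          # join wrapped sequence lines
        END { if (header != "") print header "\t" seq }
    ' "$1"
}
```

After sourcing the function, it could be called the same way as the script: `fasta_to_oneline example.fa > contaminants.txt`.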
>seq1
catcgatcgtacgatcgtacgtacgtagc
>seq2
acgtacgtcatgcatgatactgtagctacgtacgtacgt
>seq3
agctagtcgatcgatcgatcgatcgatcgatcgatcgtacgtacg
>seq1 catcgatcgtacgatcgtacgtacgtagc
>seq2 acgtacgtcatgcatgatactgtagctacgtacgtacgt
>seq3 agctagtcgatcgatcgatcgatcgatcgatcgatcgtacgtacg
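Because a tab and a space look alike on screen, it may be worth checking that the generated file really uses header[tab]sequence on every line. The following is a minimal sketch of such a check; the function name `check_oneline` is hypothetical and not part of this gist.

```shell
#!/bin/bash
# Hypothetical check (not part of the original gist): succeeds only if every
# line has exactly two non-empty tab-separated fields.
check_oneline() {
    awk -F '\t' '
        NF != 2 || $1 == "" || $2 == "" {
            # Report the offending line number on stderr
            printf "line %d: expected header<TAB>sequence\n", NR > "/dev/stderr"
            bad = 1
        }
        END { exit bad }                          # exit status 0 when no line was flagged
    ' "$1"
}
```

For example, `check_oneline contaminants.txt && echo "format OK"` prints the message only when all lines pass.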