Created
March 4, 2022 13:52
Perl script to reformat the file of within-species paralogs into the format that my pipeline expects
#!/usr/bin/perl
use strict;
use warnings;

# Input file of within-species paralogs from BioMart (tab-separated).
my $file = $ARGV[0];
die "Usage: $0 <paralogs_file>\n" unless defined $file;
open(my $fh, '<', $file) || die "ERROR: cannot open $file\n";
while (my $line = <$fh>)
{
    chomp $line;
    my @temp = split(/\t+/, $line);
    # Expected columns and example rows:
    # Genome project Gene stable ID Paralogue gene stable ID
    # schistosoma_mansoni_prjea36577 Smp_000020 Smp_333560
    # schistosoma_mansoni_prjea36577 Smp_000030 Smp_346350
    if ($line =~ /schistosoma_mansoni_prjea36577/)
    {
        my $para1 = $temp[1]; # gene stable ID
        my $para2 = $temp[2]; # paralogue gene stable ID
        print "within_species_paralog (schistosoma_mansoni) $para1 (schistosoma_mansoni) $para2 (schistosoma_mansoni)\n";
    }
}
close($fh);
print STDERR "FINISHED\n";