@mtw
Created February 7, 2018 14:13
A minimalistic Perl6 Stockholm format parser
#!/usr/bin/env perl6
# A minimal Stockholm alignment format parser that reads
# ONLY single-line (non-interleaved) alignments and writes each sequence,
# with gap characters ('-') removed, to its own file named <id>.fa
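# Debugging aid: prints a trace of every rule the grammar tries.
# Comment this line out for quiet runs.
use Grammar::Tracer;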
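grammar StockholmParser {
    # Whole file: header lines, alignment block, consensus line, '//' terminator
    token TOP       { <header>+ <alignment> <consensus> <sep>? <.eol> }
    # Any line starting with '#', e.g. '# STOCKHOLM 1.0' or '#=GF ...'
    token header    { ['#'|'#='] \N+ <.eol> }
    token alignment { <aln>+ }
    # One aligned sequence per line: identifier, whitespace, sequence
    token aln       { <id> \s+ <seq> <.eol> }
    token id        { \S+ }
    # Consensus secondary structure in dot-bracket notation
    token consensus { '#=GC SS_cons' <ws> <form> <.eol> }
    token form      { <[ ( ) . ]>+ }
    # Residues plus '-' gaps; other gap characters such as '.' are not accepted
    token seq       { <[\w\-]>+ }
    # A newline followed by any number of blank lines
    token eol       { \n [\h*\n]* }
    token sep       { '//' }
}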
class Stk-actions {
    # Attach the gap-free sequence to each <seq> match
    method seq($/) { make (~$/).subst('-', '', :g) }
}
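sub MAIN ( $filename ) {
    say "processing $filename";
    my $result = StockholmParser.parsefile($filename, actions => Stk-actions.new);
    # Stop with a clear message instead of failing later on an empty result
    die "could not parse $filename as a single-line Stockholm alignment"
        unless $result;
    # note $result;
    # Write each ungapped sequence to a file named after its identifier
    for $result<alignment><aln>.flat -> $a {
        my $fh = open "$a<id>.fa", :w;
        $fh.say($a<seq>.made);
        $fh.close;
    }
}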