@MattOates
Last active January 1, 2016 08:29
Newick grammar in Perl 6. The unquoted token is the fragile part: a character class that does not exclude "," and ";" lets a label swallow the separators and the parse fails, whereas <[A..Z]>* works for the example Newick string.
#!/usr/bin/env perl6
use v6;
grammar BioInfo::Parser::Tree::Newick {
    rule TOP {
        <tree>+
    }
    rule tree {
        <node> ";"
    }
    rule node {
        <children> <label>? <distance>? | <children>? <label> <distance>?
    }
    rule children {
        "(" <node> ("," <node>)* ")"
    }
    rule label {
        <quoted> | <unquoted>
    }
    rule distance {
        ":" <number>
    }
    token quoted { "'" ( <-[']> | "''" )* "'" }
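    # Unquoted labels must also exclude "," and ";". Tokens do not backtrack, so a
    # label that swallows a separator leaves nothing for <children> or <tree> to match.
    token unquoted { <-[\[\]()':;,\s]>* }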
    # Optional sign, integer part, optional fraction, optional signed exponent.
    token number { <[+\-]>? \d+ [ '.' \d* ]? [ <[eE]> <[+\-]>? \d+ ]? }
}
my $quoted = "(a,('Darwin''s Bulldog (Huxley)',c):-1.92e19)'The ''Root''':5;";
my $unquoted = "(A:0.1,B:0.2,(C:0.3,D:0.4):0.5);";
say "Quoted Newick";
say BioInfo::Parser::Tree::Newick.parse($quoted);
say "Unquoted Newick";
say BioInfo::Parser::Tree::Newick.parse($unquoted);
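
# A minimal sketch of why the choice of excluded characters matters. It assumes
# the failure comes from ratcheting: a token keeps everything it matched and is
# never asked to give characters back. The names `greedy` and `strict` are
# hypothetical helpers for illustration only, not part of the grammar above.
my token greedy { <-[\[\]()':\s]>* }     # "," and ";" are allowed inside a label
my token strict { <-[\[\]()':;,\s]>* }   # "," and ";" end the label

# With `greedy`, the label is expected to consume the ";" as well, so the
# literal ';' that follows has nothing left to match and the check is false.
say ("A;" ~~ / ^ <greedy> ';' $ /).so;
# With `strict`, the label stops at "A" and the terminator can still match.
say ("A;" ~~ / ^ <strict> ';' $ /).so;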