@benui-dev
Created January 25, 2012 06:55
Simple example of how to parse an Edict XML file (JMdict) with Perl and XML::LibXML
#!/usr/local/bin/perl
use strict;
use warnings;
use XML::LibXML;
# Encode printed text as UTF-8 (to_literal returns decoded character strings)
binmode(STDOUT, ':encoding(UTF-8)');
my $file = $ARGV[0] or die "Usage: $0 <JMdict XML file>\n";
my $parser = XML::LibXML->new();
my $doc = $parser->parse_file($file);
# Prints one line per entry: kanji, reading(s), glosses (tab-separated). Example:
# 明白 めいはく obvious, clear, plain, evident, apparent
# ...
foreach my $entry ($doc->findnodes('./JMdict/entry')) {
    # Kanji spellings of the entry
    foreach my $keb ($entry->findnodes('./k_ele/keb')) {
        print $keb->to_literal . "\t";
    }
    # Kana readings of the entry
    foreach my $reb ($entry->findnodes('./r_ele/reb')) {
        print $reb->to_literal . "\t";
    }
    # Glosses: comma-separated within a sense, senses separated by "; "
    my @senses;
    foreach my $sense ($entry->findnodes('./sense')) {
        my @glosses = map { $_->to_literal } $sense->findnodes('./gloss');
        push @senses, join(", ", @glosses);
    }
    print join("; ", @senses), "\n";
}