@carlosmcevilly
Created December 27, 2012 00:33
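Given a reference file, output every line from STDIN that does not appear in the reference file, preserving the order of the lines in STDIN. Input lines are compared after trimming surrounding whitespace; if a line begins with a number followed by whitespace, only the text after the number is compared.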
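#!/usr/bin/perl
use strict;
use warnings;

# Usage: <this-script> <ref-file> < input
# Prints each input line whose text is not a line of <ref-file>, in input order.
my $in = shift || die "usage: $0 <ref-file>\n";

# Keys are the lines of the reference file, with the newline removed.
my %unwanted = %{ load_words($in) };

while (my $line = <>) {
    chomp($line);

    # Trim leading and trailing whitespace before comparing.
    $line =~ s/^\s+//;
    $line =~ s/\s+$//;

    # If the line starts with a number followed by whitespace (for example,
    # a count prefix like the one `uniq -c` emits), compare only the rest.
    my $word = $line =~ /^\d+\s+(.*)$/ ? $1 : $line;

    # Print the trimmed line, number prefix included, unless it is unwanted.
    print "$line\n" unless exists $unwanted{$word};
}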
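
# Read the reference file into a hash keyed by line; return a hash reference.
sub load_words {
    my $file = shift;
    my %words;

    # Three-argument open with a lexical filehandle, so that characters in
    # $file are never interpreted as an open mode.
    open(my $fh, '<', $file) or die "$0 error: cannot open $file: $!\n";
    while (my $entry = <$fh>) {
        chomp($entry);
        $words{$entry} = 1;
    }
    close($fh);

    return \%words;
}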