@leejarvis
Created December 10, 2012 22:08
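# Base class for token tags. Each subclass declares what it matches via
# `matches` and is registered automatically when it is defined.
class Tag
  class << self
    # Per-class state: the matcher for this tag class and, on Tag itself,
    # the list of registered subclasses.
    attr_accessor :match
    attr_accessor :children
  end

  self.children = []

  # Register every subclass so Token can run all taggers.
  def self.inherited(klass)
    super
    Tag.children << klass
  end

  # DSL: declare the matcher (Regexp, String, Numeric, Array or Range).
  def self.matches(obj)
    self.match = obj
  end

  # Append an instance of this tag to the token if the value matches.
  def self.scan(token, value)
    token.tags << new if matches?(value)
  end

  # Dispatch on the matcher type; unknown matcher types never match.
  def self.matches?(value)
    case match
    when Regexp          then match =~ value
    when String, Numeric then match == value
    when Array, Range    then match.include?(value)
    else false
    end
  end

  # Snake-cased class name, e.g. SingleNumber -> "single_number".
  def to_s
    self.class.to_s.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  end
end
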
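# Tags any token that contains a run of digits (the pattern is unanchored).
class Number < Tag
  # Parentheses avoid Ruby's "ambiguous first argument" warning for a regexp literal.
  matches(/\d+/)
end
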
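# Tags the strings "0" through "9". Range#include? iterates this all-digit
# string range numerically, and the exclusive end leaves out "10".
class SingleNumber < Tag
  matches('0'...'10')
end
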
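# A single word of input plus the tags that matched it.
class Token
  attr_reader :value, :tags

  def initialize(value)
    @value = value
    @tags = []
    # Run every registered Tag subclass against this token.
    Tag.children.each { |tagger| tagger.scan(self, value) }
  end
end
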
class Tokenizer
  # Split on whitespace and wrap each word in a Token.
  def self.tokenize(text)
    text.split(/\s+/).map { |token| Token.new(token) }
  end
end

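# DATA reads the text that follows __END__ at the bottom of this file.
tokens = Tokenizer.tokenize(DATA.read)
# Print each token's value alongside the names of the tags it received.
p tokens.map { |t| [t.value, t.tags.map(&:to_s)] }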
__END__
foo bar 10 two 3
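
The matcher dispatch in `Tag.matches?` can be tried in isolation. The sketch below is not part of the gist: `matches_value?` is a hypothetical standalone helper that mirrors the same `case` logic, with the one difference that it normalises every result to `true` or `false`. It is meant to show why the string range `'0'...'10'` tags `"3"` but not `"10"`, assuming `Range#include?` iterates the all-digit string range numerically, as `String#upto` does.

```ruby
# Hypothetical helper (not in the original gist) mirroring Tag.matches?.
def matches_value?(matcher, value)
  case matcher
  when Regexp          then !(matcher =~ value).nil?
  when String, Numeric then matcher == value
  when Array, Range    then matcher.include?(value)
  else false
  end
end

# A few probes covering each matcher type.
puts matches_value?(/\d+/, "10")           # Regexp matcher
puts matches_value?("0"..."10", "3")       # String range, walked via include?
puts matches_value?("0"..."10", "10")      # exclusive end is left out
puts matches_value?(%w[one two], "two")    # Array matcher
puts matches_value?(nil, "foo")            # no matcher declared
```

Note that `Range#cover?` would behave differently for the same range, because it compares strings lexicographically instead of iterating.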