@ympbyc
Forked from pasberth/html.rb
Created June 25, 2012 17:18
require 'regparsec'

module HTMLParsers
  extend RegParsec::Regparsers

  # The allowed tag names come from the parser state (the valid_tags: option passed to parse).
  ValidTags = ->(state) { one_of(*state.valid_tags) }
  OpenTag = between('<', '>', ValidTags)
  CloseTag = between('</', '>', ValidTags)
  # A run of text up to the next '<' or newline.
  Line = try(/[^\<\n]+/, &:to_s)
  # The lambda defers the reference to Tag, which is defined just below.
  TagBody = many one_of(Line, ->(s) { Tag })
  Tag = apply(OpenTag, TagBody, CloseTag) { |open, body, close|
    { open.to_s.to_sym => body }
  }

  p Tag.parse(input: '<html><p>aaa</p></html>', valid_tags: ["html", "p"])
  p Tag.parse(input: '<foo>invalid</foo>', valid_tags: ["bar"])
  # => nil
  p Tag.parse(input: '<foo>valid</foo>', valid_tags: ["foo"])
  # => {:foo=>["valid"]}
  p Tag.regparse(input: '<foo>valid</foo>', valid_tags: ["foo"])
end