Created
February 11, 2010 12:54
IPv6 address RE written in an intention-revealing way
# IPv6 address regular expression done right
# [email protected] 2010-02-11
# from a NANOG thread:
# (corrected version of) http://gist.github.com/294476
# Use the tests in that gist if you actually want to change this
ORIGINAL_IPV6_REGEX = /^\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?\s*$/
# Now let's write this down in an intention-revealing way instead
# Yes, I could have chosen better names.
H = '[0-9A-Fa-f]{1,4}' # 1-4 hex digits
D = '(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)' # 0..255, no leading zeros
E = '(D(\.D){3})'.gsub(/D/, D) # IPv4 address
IPV6CORE = %w{
  (H:){7}(H|:)
  (H:){6}(:H|E|:)
  (H:){5}(((:H){1,2})|:E|:)
  (H:){4}(((:H){1,3})|((:H)?:E)|:)
  (H:){3}(((:H){1,4})|((:H){0,2}:E)|:)
  (H:){2}(((:H){1,5})|((:H){0,3}:E)|:)
  (H:){1}(((:H){1,6})|((:H){0,4}:E)|:)
  :(((:H){1,7})|((:H){0,5}:E)|:)
}.map { |l| "(#{l})"}.join("|").gsub(/H/, H).gsub(/E/, E)
IPV6_REGEX =
  Regexp.new('^\s*(IPV6CORE)(%.+)?\s*$'.gsub(/IPV6CORE/, IPV6CORE))
# This RE could be fixed in many ways, e.g., by not allowing just any
# junk after the % character, or by using \A and \z, or by fixing its
# numerous performance shortcomings. But the point here was to show
# how to write down the original expression properly.
# And that's all we test here, therefore:
if __FILE__ == $0
  require 'test/unit'
  class TC_MyTest < Test::Unit::TestCase
    def test_same
      assert_equal ORIGINAL_IPV6_REGEX, IPV6_REGEX
    end
  end
end
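
The comments above name two possible fixes: anchoring with \A and \z, and not accepting arbitrary junk after the % character. The sketch below shows roughly what that could look like. It is not part of the original gist. The module name `StrictIPv6Sketch`, the `ZONE` character class, and the `LOOSE`/`STRICT` names are assumptions made for illustration, and the building blocks are repeated inside the module only so the example runs on its own.

```ruby
# Hypothetical sketch, not from the original gist.
module StrictIPv6Sketch
  H = '[0-9A-Fa-f]{1,4}'                     # 1-4 hex digits
  D = '(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)'    # 0..255, no leading zeros
  E = '(D(\.D){3})'.gsub(/D/, D)             # IPv4 address
  IPV6CORE = %w{
    (H:){7}(H|:)
    (H:){6}(:H|E|:)
    (H:){5}(((:H){1,2})|:E|:)
    (H:){4}(((:H){1,3})|((:H)?:E)|:)
    (H:){3}(((:H){1,4})|((:H){0,2}:E)|:)
    (H:){2}(((:H){1,5})|((:H){0,3}:E)|:)
    (H:){1}(((:H){1,6})|((:H){0,4}:E)|:)
    :(((:H){1,7})|((:H){0,5}:E)|:)
  }.map { |l| "(#{l})" }.join("|").gsub(/H/, H).gsub(/E/, E)

  # Assumed zone ID alphabet (letters, digits, and a few punctuation marks).
  # The gist only says "not just any junk"; this particular set is a guess.
  ZONE = '[0-9A-Za-z._~-]+'

  # Same shape as the gist: ^ and $ match at line boundaries in Ruby.
  LOOSE = Regexp.new('^\s*(' + IPV6CORE + ')(%.+)?\s*$')

  # \A and \z anchor to the whole string; the zone ID is restricted to ZONE.
  STRICT = Regexp.new('\A\s*(' + IPV6CORE + ')(%' + ZONE + ')?\s*\z')
end

["::1", "fe80::1%eth0", "junk\n::1", "fe80::1%eth0 junk!"].each do |s|
  loose  = StrictIPv6Sketch::LOOSE.match?(s)
  strict = StrictIPv6Sketch::STRICT.match?(s)
  puts "#{s.inspect}: loose=#{loose} strict=#{strict}"
end
```

The first two inputs should be accepted by both expressions. The last two should be accepted only by `LOOSE`: one because ^ and $ match at any line boundary, the other because `%.+` swallows everything after the percent sign. String concatenation is used for the final step instead of `gsub` purely to keep the sketch short; it does nothing about the performance shortcomings also mentioned above.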