jt · August 22, 2011 18:41
diff --git a/regext.txt b/regext.txt
 - any character matches itself literally, except the special characters
 ^ $ ? . / \ [ ] { } ( ) + * | - all the special characters, escape them with \ if you don't want them to be special
 // - regexp literal, creates an instance of ruby's Regexp class

 Common Patterns (I authored)
 /[\w+\.~-]+@[\w~-]+\.[\w\.]+/ - loosely match emails, an approximation rather than a full implementation of the addr-spec grammar (RFC 5322 section 3.4.1)
 /\+?(\d)?[-.\s]?\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})/ - match phone numbers (separators: -, . or whitespace), https://gist.github.com/1009331
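
A quick check of the phone pattern, as a sketch only. It assumes the separator class is written as [-.\s], since inside [] a | is a literal pipe and not alternation. The variable names and sample numbers are made up for illustration.

```ruby
# phone is a hypothetical name; the separator class is assumed to be [-.\s]
phone = /\+?(\d)?[-.\s]?\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})/

full  = phone.match("+1 (555) 123-4567")
short = phone.match("555.123.4567")

full_parts  = full.captures    # country digit, area code, prefix, line number
short_parts = short.captures   # the country digit capture is nil when absent
none        = phone.match("no digits here")  # nil, nothing to match
```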

 Strategies

 foo(?!.*foo) - negative lookahead, find the foo that does not have a foo following it. use to find the last match in a string.
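
A small sketch of the last-match strategy. The sample strings are invented. Note that . does not match newlines by default, so multi-line input likely needs the /m modifier for the lookahead to see past the current line.

```ruby
text = "foo bar foo baz foo end"

last_foo       = /foo(?!.*foo)/
last_offset    = text =~ last_foo      # index where the final foo starts
same_as_rindex = text.rindex("foo")    # plain-string equivalent, for comparison

# without /m the lookahead stops at the newline, so the first foo qualifies
multi     = "foo\nfoo\n"
without_m = multi =~ /foo(?!.*foo)/
with_m    = multi =~ /foo(?!.*foo)/m   # only the foo on the second line qualifies
```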

 Anchors

 ^      - start of line
 \A     - start of string
 $      - end of line
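 \Z     - end of string, or just before a trailing newline (\z is the absolute end)
 \b     - word boundary, the zero-width position between a word and a non-word character
 \B     - non-word boundary, any position that is not a word boundary
 \<     - start of word in GNU grep/vim, not supported in ruby (matches a literal <), use \b
 \>     - end of word in GNU grep/vim, not supported in ruby (matches a literal >), use \b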
 /^apple/.match 'pear apple' # no match, ^ only matches apple at the beginning of a line
 \A - like ^ but only at the start of the whole string, /\Aapple/ and /^apple/ differ when the string contains newlines
 /apple$/.match 'apple pear' # no match, $ only matches apple at the end of a line
 \Z - like $ but only at the end of the whole string, /apple\Z/ and /apple$/ differ when the string contains newlines
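
The line anchors and string anchors only behave differently once the string contains a newline. A minimal sketch with an invented two-line string; \z (lowercase) is not listed above and is included here for contrast.

```ruby
text = "pear\napple\n"

caret    = text =~ /^apple/    # ^ matches at the start of any line
big_a    = text =~ /\Aapple/   # \A matches only at the start of the string
dollar   = text =~ /apple$/    # $ matches at the end of any line
big_z    = text =~ /apple\Z/   # \Z tolerates one trailing newline
little_z = text =~ /apple\z/   # \z is the absolute end, the newline blocks it

first_line_dollar = "apple\npear" =~ /apple$/    # end of the first line counts
first_line_big_z  = "apple\npear" =~ /apple\Z/   # but it is not the end of the string
```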

 Character Classes

 []       - character class, a quasi-wildcard, matches only characters specified
 [abc]    - match a single character, a, b, or c
 [^abc]   - match a single character except for a, b, or c
 [a-zA-Z] - match single character in the range a-z or A-Z
 .        - any character except newline (unless /m is set)
 \cx      - control character, e.g. \cM is ctrl-M (carriage return)
 \s       - any whitespace character
 \S       - any non-whitespace character
 \d       - any digit, shorthand for [0-9]
 \D       - any non-digit
 \w       - any word character, shorthand for [a-zA-Z0-9_]
 \W       - any non-word character
 \xhh     - character with hexadecimal code hh @expand
 \X       - extended grapheme cluster (ruby 2.0+)
 \nnn     - character with octal code nnn (digits, there is no \O escape) @expand

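 Also Note: Most special characters within a character class become literal characters, so they
 don't need escaping (e.g. [.] and [\.] both match only a period). The exceptions are ] \ ^ and -,
 which must be escaped or positioned (^ not first, - first or last) to be literal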
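
A few character classes in action, as a sketch with made-up strings. It shows that a period inside [] is literal with or without the backslash.

```ruby
dot_plain   = "a.b" =~ /[.]/     # literal period inside a class
dot_escaped = "a.b" =~ /[\.]/    # the backslash changes nothing here
no_dot      = "abc" =~ /[.]/     # nil, [.] is not a wildcard

words   = "hello_world 42".scan(/\w+/)   # \w covers letters, digits and _
digits  = "hello_world 42".scan(/\d/)
punct   = "a-b^c".scan(/[\^-]/)          # escaped ^ and a trailing - are literal
not_abc = "abcde".scan(/[^abc]/)         # negated class
```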

 Quantifiers

 a?     - zero or one of a, ? marks the previous character as optional
 a*     - zero or more of a
 a+     - one or more of a
 a{3}   - exactly 3 of a
 a{3,}  - 3 or more of a
 a{3,6} - 3 to 6 of a
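
The quantifiers can be tried with String#[] which returns the matched text or nil. A sketch with invented strings; quantifiers are greedy, so a{3,6} takes as many as it can.

```ruby
optional  = "ct"[/ca?t/]           # a is optional, still matches
star      = "caaat"[/ca*t/]        # zero or more
plus_miss = "ct"[/ca+t/]           # nil, + needs at least one a
exact     = "caaat" =~ /a{3}/      # offset of the run of three
at_least  = "aa"[/a{3,}/]          # nil, only two available
bounded   = "aaaaaaaa"[/a{3,6}/]   # greedy, stops at six
```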

 Groups

 (a|b)   - a or b
 (...)   - contents are captured
 (?:...) - passive (non-capturing) group. gain the benefits of using parens but without having to capture its match.
 \1, \2  - backreference to the 1st, 2nd, ... captured group/subpattern
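
A sketch of capturing groups, a non-capturing group written as (?:...), and a backreference. The patterns and strings are illustrative only.

```ruby
m = /(a|b)(?:c|d)(e)/.match("xbdey")

whole    = m[0]        # everything the pattern matched
captures = m.captures  # the (?:c|d) group is not captured

# \1 must repeat whatever the first group captured
doubled = /(\w)\1/.match("letter")[0]
```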

 Ruby Matching

 - there are 2 components to a ruby regexp, the pattern and the modifiers. modifiers are optional, example
  /something/i # something is the pattern, i is the modifier
 - every match operation either succeeds or fails, a failed match always returns nil
 "an interesting ruby string".match(/ruby/) # returns a MatchData object
 "test this".match(/banana/)                # returns nil
 /ruby/.match("an interesting ruby string") # returns a MatchData object
 "test this" =~ /this/                      # returns 5, the beginning location of the match
 /this/ =~ "test this"                      # also returns 5
 - a MatchData object is truthy and nil is falsy, making match results useful for logic operations
 - a MatchData object also stores information about the match
 "before after before".scan(/before/) - returns an array of all matches, if the pattern contains captures, you'll get an array of arrays
 "before after before".split(/before/) - returns an array of everything except the matches

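The matching methods above side by side, as a rough sketch. The key=value string is an invented example to show scan with captures.

```ruby
line = "before after before"

# MatchData is truthy and nil is falsy, so match works directly in conditions
result = line.match(/after/) ? "found" : "missing"
offset = line =~ /after/                       # index of the match

all   = line.scan(/before/)                    # flat array, no captures in the pattern
pairs = "k1=v1 k2=v2".scan(/(\w+)=(\w+)/)      # array of arrays, one per match
rest  = line.split(/before/)                   # trailing empty strings are dropped
```
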
 MatchData, example methods:
  match = /(\w)(ject)ed/.match 'ejected'
  match.string                      # "ejected", the string we matched against
  match[0]                          # "ejected", the entire part of the string matched
  match[1]                          # "e", first capture
  match[2]                          # "ject", second capture
  match.captures[0]                 # "e", first capture
  match.captures[1]                 # "ject", second capture
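
A few more MatchData methods that are not listed above (pre_match, post_match, begin, to_a), shown as a sketch on an invented string.

```ruby
m = /(\d+)-(\d+)/.match("range 10-20 units")

whole, first, second = m[0], m[1], m[2]
before  = m.pre_match    # text left of the match
after   = m.post_match   # text right of the match
start   = m.begin(0)     # offset where the whole match starts
as_list = m.to_a         # whole match followed by the captures
missing = m[3]           # nil, there is no third group
```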

 Modifiers

 /i - case insensitive
 /m - makes wildcard, . , match newlines
 /x - ignore whitespace and comments in pattern
 /o - perform #{...} substitutions only once
 /s - perl's single-line mode. in ruby use /m for that, /s instead selects Windows-31J encoding
 /[rd]ejected/imxo - chain multiple modifiers
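
A sketch of the /i, /m and /x modifiers. The multi-line pattern and its comments are made up to show what /x allows.

```ruby
insensitive = "RUBY" =~ /ruby/i    # case is ignored
dot_plain   = "a\nb" =~ /a.b/      # nil, . stops at the newline
dot_multi   = "a\nb" =~ /a.b/m     # /m lets . match the newline

# /x ignores whitespace and allows comments inside the pattern
extended = /
  (\d{3})   # exchange
  -         # separator
  (\d{4})   # line number
/x
parts = extended.match("555-1234").captures

chained = "REJECTED" =~ /[rd]ejected/imxo
```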

 Substitution

 "after it all".gsub(/after/, "before") # "before it all"
 "after it all".gsub(/after/, "before \\0") # "before after it all", \\0 reinserts the entire match. use \\1, \\2, ... for captures

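Substitution with backreferences and with a block, as a sketch. The names and strings are invented. sub is the single-replacement counterpart of gsub.

```ruby
whole      = "after it all".gsub(/after/, "before \\0")       # \0 is the entire match
swapped    = "John Smith".gsub(/(\w+) (\w+)/, '\2, \1')        # captures by number
doubled    = "a1b22".gsub(/\d+/) { |d| (d.to_i * 2).to_s }     # block computes each replacement
first_only = "aaa".sub(/a/, "b")                               # sub replaces only the first match
```
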
 Special Chars

 \  - escape char
 \n - newline
 \r - carriage return
 \t - tab
 \v - vertical tab
 \f - form feed