Last active
November 21, 2016 03:49
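local string_gmatch = string.gmatch

-- Splits txt into groups of four captures: the word before the dash, the dash,
-- the rest of the word, and the trailing whitespace.
-- Results are written into tbl (or a new table) starting at index 1,
-- so existing entries in a passed-in tbl are overwritten.
local function getWordDashSpaceTable(txt, tbl)
    tbl = tbl or {}
    local numTbl = 0
    for word1, dash, word2, space in string_gmatch(txt, "([^%-%s]*)(%-)(%S*)(%s*)") do
        tbl[numTbl + 1] = word1
        tbl[numTbl + 2] = dash
        tbl[numTbl + 3] = word2
        tbl[numTbl + 4] = space
        numTbl = numTbl + 4
    end
    return tbl
end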
---
local testCases = {
    "John-dash summer-wind",
    "-Noname Two-space",
    "Accept-the-first-dash ignored",
    "word-word ignored word2-word2",
    "- empty-person",
    "many-results in-one string-are right-here"
}
local testCase, testCaseResult
for index = 1, #testCases do
    testCase = testCases[index]
    print("\nTest Case: "..testCase)
    testCaseResult = getWordDashSpaceTable(testCase)
    for testCaseIndex = 1, #testCaseResult do
        print(testCaseResult[testCaseIndex])
    end
end
---
--[[ Explanation
http://wiki.roblox.com/index.php?title=Global_namespace/String_manipulation#string.match
http://wiki.roblox.com/index.php?title=Global_namespace/String_manipulation#string.gmatch
https://www.lua.org/pil/7.1.html (iterators)
http://wiki.roblox.com/index.php?title=String_pattern
string.gmatch finds matches the same way :match does, but instead of returning only
the first match it returns an iterator over all of them.
It is used much like `pairs(table)` or `next, table` in a for loop, except that each
iteration yields the captures of the next pattern match, so you can iterate through
every match in the string.
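    -- A minimal sketch of that loop (not part of the original code; the
    -- sample string and the name `words` are made up for illustration):
    local words = {}
    for word in string.gmatch("one two  three", "%S+") do
        words[#words + 1] = word
    end
    -- words now holds "one", "two", "three"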
The pattern:
    ([^%-%s]*)(%-)(%S*)(%s*)
([^%-%s]*)
    () captures a section, meaning the matched text is returned as a value.
    [] denotes a set. A set defines which characters may be matched.
    [^] denotes a complement set: it matches every character except the ones listed.
    %- means the literal `-` character. It is 'pattern escaped' here.
        - is normally a 'magic character' that matches as few repetitions of a
        character or set as possible. Putting % in front makes it mean the literal `-`
        character instead of the 'magic' `-`.
    %s is a character class. It matches any whitespace character, such as space, tab, newline and carriage return.
    * is a magic character and quantifier that matches 0 or more, and as many as possible, of
        a character or set of characters.
    All together: capture 0 or more, and as many as possible, of any characters that are neither `-` nor whitespace.
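    -- A short sketch (not part of the original code; the sample string is
    -- made up) contrasting the greedy `*` with the lazy 'magic' `-`:
    local greedy = string.match("<a><b>", "<(.*)>") -- "a><b", as many as possible
    local lazy = string.match("<a><b>", "<(.-)>")   -- "a", as few as possible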
(%-)
    These were all explained previously.
    Capture a literal `-`.
(%S*)
    %s is a character class. It matches any whitespace character, such as space, tab, newline and carriage return.
    %S is a complement character class, just like a complement set. It matches
        all characters except whitespace characters.
    Capture 0 or more, and as many as possible, of any characters except whitespace (further `-` characters included).
(%s*)
    Capture 0 or more, and as many as possible, of whitespace characters.
To put it all together:
    Capture 0 or more, and as many as possible, of any characters that are neither `-` nor whitespace
    Capture a literal `-`
    Capture 0 or more, and as many as possible, of any characters except whitespace
    Capture 0 or more, and as many as possible, of whitespace characters
]]
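-- A small sanity check of the walkthrough above. This is a sketch, not part of
-- the original code: the input is borrowed from the test cases and the local
-- names are chosen here for illustration. string.match returns the four
-- captures of the first match only, which is enough to see each piece.
local word1, dash, word2, space = string.match(
    "Accept-the-first-dash ignored",
    "([^%-%s]*)(%-)(%S*)(%s*)"
)
assert(word1 == "Accept")         -- everything before the first dash
assert(dash == "-")               -- the literal dash
assert(word2 == "the-first-dash") -- %S* also consumes the later dashes
assert(space == " ")              -- the whitespace after the word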