Created July 13, 2011 03:32
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

///<summary>
/// Helper class to get the word counts from text
///</summary>
public class WordCount {
    ///<summary>
    /// Takes in a string of text, cuts it up, and builds a list of the unique words and
    /// the number of times each occurs in the text. Optionally takes a regular
    /// expression to use as the delimiter. By default the delimiter is whitespace.
    ///</summary>
    public IDictionary<string, int> GetUniqueWords(string text, Regex delimiterPattern = null) {
        //implement me
        throw new NotImplementedException();
    }

    ///<summary>
    /// Takes in a StringReader to use to retrieve text. Then cuts the text up and builds
    /// a list of the unique words and the number of times each occurs in the text.
    /// Optionally takes a regular expression to use as the delimiter. By default the
    /// delimiter is whitespace.
    ///</summary>
    public IDictionary<string, int> GetUniqueWords(StringReader reader,
                                                   Regex delimiterPattern = null) {
        //implement me
        throw new NotImplementedException();
    }
}
Implement the class, then use the first method to tell me the unique word counts for this text:
"The dog jumped over the moon, but that's because the dog had wings and the moon had fallen"
Then open a file and read it in as a stream, so the text is pumped into the second method efficiently. You may need to tweak the interface of the class to do this; if you are confused, just skip it. Use the regular expression "\|" to cut the text up. The pipe has to be escaped: a bare "|" is the alternation operator and matches the empty string, not a pipe character.
"dog|dog|moon|over|jump|dog|jump"