{-
The MIT License (MIT)
Copyright (c) 2103 Matthew Smith [email protected]
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
{-
Step 2
(m>0) ATIONAL -> ATE
...
(m>0) BILITI -> BLE
-}
step2 :: String -> String
step2 str =
  if (not $ null sfxs') && (measure $ snd $ wordDesc $ stm (fst sfx)) > 0
    then swapsfx str (fst sfx) (snd sfx)
getstem :: String -> String -> String
getstem str sfx = take (length str - length sfx) str

swapsfx :: String -> String -> String -> String
swapsfx str sfx [] = take (length str - length sfx) str
swapsfx str sfx sfx' = take (length str - length sfx) str ++ sfx'

step1a :: (String, String) -> String
step1a (str,_) = swapsfx str (fst sfxs') (snd sfxs')
  where sfxs' = head $ dropWhile (\ss -> not $ fst ss `isSuffixOf` str) sfxs
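One possible way to avoid the partial `head` in `step1a` is to look up the first matching rule with `find` and leave the word unchanged when nothing matches. The sketch below is a minimal, self-contained illustration rather than the post's implementation: `applyFirstRule` and `step1aRules` are hypothetical names, and the rule list is assumed from Porter's published step 1a (SSES -> SS, IES -> I, SS -> SS, S -> nothing), since the post's own `sfxs` table is not shown here.

```haskell
import Data.List (find, isSuffixOf)

-- Assumed step 1a rules as (old suffix, replacement), longest suffix first.
step1aRules :: [(String, String)]
step1aRules = [("sses", "ss"), ("ies", "i"), ("ss", "ss"), ("s", "")]

-- Apply the first rule whose suffix matches; return the word unchanged
-- when no rule applies (hypothetical helper, not from the original code).
applyFirstRule :: [(String, String)] -> String -> String
applyFirstRule rules str =
  case find (\(old, _) -> old `isSuffixOf` str) rules of
    Just (old, new) -> take (length str - length old) str ++ new
    Nothing         -> str
```

With these rules, `applyFirstRule step1aRules "caresses"` gives `"caress"`, and a word with no matching suffix, such as `"tree"`, comes back unchanged instead of raising an exception.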
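-- Porter's measure m: the number of vowel-consonant (VC) sequences in the
-- pattern produced by wordDesc. Requires `group` from Data.List.
measure :: (String, String) -> Int
measure (_,ds) =
  length $ filter (=='c') ds'
  where ds' = dropWhile (=='c') [head a | a <- group ds]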
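To make the measure easier to follow, here is a small standalone sketch that works directly on a consonant/vowel pattern string instead of the `(word, pattern)` pair. `measurePattern` is a hypothetical name introduced for illustration. It collapses each run of identical letters to one letter, drops a leading consonant run, and counts the remaining `c` entries, each of which closes one VC pair.

```haskell
import Data.List (group)

-- Count the VC pairs in a pattern such as "ccvvccv" (the pattern for "trouble").
measurePattern :: String -> Int
measurePattern ds = length (filter (== 'c') collapsed)
  where
    -- "ccvvccv" -> "cvcv" -> "vcv": one letter per run, leading 'c' dropped
    collapsed = dropWhile (== 'c') [c | (c:_) <- group ds]
```

Under this sketch, the pattern for "tree" (`"ccvv"`) has measure 0, "trouble" (`"ccvvccv"`) has measure 1, and "troubles" (`"ccvvccvc"`) has measure 2, which agrees with the examples in Porter's description of the algorithm.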
# Vivake Gupta ([email protected])
# http://tartarus.org/martin/PorterStemmer/python.txt
def m(self):
    n = 0
    i = self.k0
    while 1:
        if i > self.j:
            return n
        if not self.cons(i):
-- Rotate a list one position to the right, e.g. "abc" -> "cab".
rotateR :: [a] -> [a]
rotateR [] = []
rotateR [x] = [x]
rotateR xs = last xs : init xs

-- Build the consonant/vowel pattern of a word. A consonant is any letter other
-- than a, e, i, o, u, and other than y preceded by a consonant,
-- e.g., "happy" -> ("happy", "cvccv")
wordDesc :: String -> (String, String)
wordDesc str =
  (str, [corv (a,b,i) | (a,b,i) <- zip3 str (rotateR str) [0..len]])
  where len = length str - 1
-- Build the consonant/vowel pattern of a word. A consonant is any letter other
-- than a, e, i, o, u, and other than y preceded by a consonant,
-- e.g., "happy" -> ("happy", "cvccv")
wordDesc :: String -> (String, String)
wordDesc str = (str, [corv (a,b,i) | (a,b,i) <- zip3 str (rotateR str) [0..len]])
  where len = length str - 1
        corv (a,b,i)
          | a == 'y' && i /= 0 && b `notElem` vs = 'v'
          | a `elem` vs = 'v'
          | otherwise = 'c'
          where vs = "aeiou"
def cons(self, i):
    """cons(i) is TRUE <=> b[i] is a consonant."""
    if self.b[i] == 'a' or self.b[i] == 'e' or self.b[i] == 'i' or self.b[i] == 'o' or self.b[i] == 'u':
        return 0
    if self.b[i] == 'y':
        if i == self.k0:
            return 1
        else:
            return (not self.cons(i - 1))
    return 1
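The Python `cons` decides whether a `y` is a consonant by asking how the previous letter was classified, not merely which letter it was. A rough Haskell counterpart, offered as a sketch and not as the post's code, can carry the previous classification along while walking the word left to right. The names `cvPattern` and `classifyLetter` are hypothetical.

```haskell
-- Classify each letter as 'c' (consonant) or 'v' (vowel), left to right,
-- passing the previous letter's classification to the next step.
cvPattern :: String -> String
cvPattern = go Nothing
  where
    go _ [] = []
    go prev (ch:rest) =
      let cur = classifyLetter prev ch
      in cur : go (Just cur) rest

-- 'y' is a vowel only when the previous letter was classified as a consonant;
-- at the start of the word, or after a vowel, it is a consonant.
classifyLetter :: Maybe Char -> Char -> Char
classifyLetter prev ch
  | ch `elem` "aeiou" = 'v'
  | ch == 'y' = case prev of
      Just 'c' -> 'v'
      _        -> 'c'
  | otherwise = 'c'
```

For ordinary words this should produce the same pattern as a check against the previous letter itself, for example `cvPattern "happy"` is `"cvccv"`. The two approaches can only disagree on a run of consecutive `y` characters following a consonant, where the second `y` sits after a `y` that already counts as a vowel.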