@kowey
Created April 9, 2010 14:40
-- | Tokenise a line of text about a hill. Numbers are replaced by placeholder
-- tokens; everything else is kept, with punctuation split off from words.
-- (CleanHillEntry, MaybeJSON, heightMatches, isWord and the chi* accessors
-- are not defined in this snippet.)
tokenise :: CleanHillEntry -> String -> String
tokenise hill str = unwords (render toks)
 where
  toks = case parse parseChunks "" str of
           Left e   -> error $ "bug (chunking should always work): " ++ show e
           Right cs -> cs
  --
  render [] = []
  render (Word x  : ts) = x   : render ts
  render (Desig x : ts) = x   : render ts
  render (Punct x : ts) = [x] : render ts
  render (Num x   : ts) = num : render ts
   where
    num = fromMaybe "_OTHER_NUMBER_" (possibilities ts)
    -- use the word right after the number (its unit) to pick a placeholder
    possibilities (Word w : _)
      | w `elem` [ "ft", "feet" ] && matchH chiHeightFeet x = Just "_HEIGHT_FEET_"
      | "m" `isPrefixOf` w =
          case () of
            _ | any (isWord "col") toks  && matchMH chiColHeight x -> Just "_COL_HEIGHT_"
              | any (isWord "drop") toks && matchMH chiColDrop x   -> Just "_COL_DROP_"
              | matchH chiHeightMetres x                           -> Just "_HEIGHT_METRES_"
              | otherwise                                          -> Nothing
    possibilities _ = Nothing
  --
  -- like matchH, but for fields that may be missing from the hill entry
  matchMH f x =
    case f hill of
      MaybeJSON (Nothing) -> False
      MaybeJSON (Just h)  -> heightMatches h x
  matchH f = heightMatches (f hill)

-- ----------------------------------------------------------------------
-- parsers
-- ----------------------------------------------------------------------
-- should try to preserve the text as much as possible
data Chunk = Num String
           | Desig String -- ^ eg. A10, BN2
           | Word String
           | Punct Char
  deriving (Show, Eq, Ord)

-- chunks may be separated by at most one whitespace character
parseChunks = (parseChunk `sepEndBy` optional space) <* eof

parseChunk = nonStr <|> str
 where
  nonStr = desig <|> num <|> punct
  --
  -- two or more digits/capitals, with at least one of each
  desig = try $ do
    ds <- (:) <$> dchar <*> many1 dchar
    if any isDigit ds && any isUpper ds
       then return (Desig ds)
       else fail "designation (at least 1 digit and 1 upper)"
   where
    dchar = digit <|> satisfy isUpper
  --
  -- a digit followed by any run of digits, '.' and ','
  num = try $ do
    ds <- (:) <$> digit <*> many (digit <|> oneOf ".,")
    return (Num ds)
  punct = Punct <$> satisfy isPunctuation
  -- a word runs until whitespace or until a non-word chunk could start
  -- (notFollowedBy' is not defined in this snippet)
  str = Word <$> many1 (notFollowedBy' nonStr >> satisfy (not . isSpace))

Ben Nevis (Scottish Gaelic: Beinn Nibheis, Scottish Gaelic pronounciation: peˈɲivəʃ) is the highest mountain in the British Isles.
It is located at the western end of the Grampian Mountains in the Lochaber area of Scotland, close to the town of Fort William.
A route popular with experienced hillwalkers starts at Torlundy, a few miles north-east of Fort William on the A82 road, and follows the path alongside the Allt a' Mhuilinn.
In the late 1990s, Lochaber Mountain Rescue Team erected two posts on the summit plateau, in order to assist walkers attempting the descent in foggy conditions.
Ben Nevis ( Scottish Gaelic : Beinn Nibheis , Scottish Gaelic pronounciation : peˈɲivəʃ ) is the highest mountain in the British Isles .
It is located at the western end of the Grampian Mountains in the Lochaber area of Scotland , close to the town of Fort William .
A route popular with experienced hillwalkers starts at Torlundy , a few miles north - east of Fort William on the A82 road , and follows the path alongside the Allt a ' Mhuilinn .
In the late _OTHER_NUMBER_ s , Lochaber Mountain Rescue Team erected two posts on the summit plateau , in order to assist walkers attempting the descent in foggy conditions .
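
The sample output above can be reproduced approximately without Parsec or the hill data. The following is a minimal sketch using only `base`, not the gist's implementation: `nonWord`, `chunks`, `wordChunk` and `tokeniseNoHill` are hypothetical names, every number is rendered as `_OTHER_NUMBER_` because no hill entry is consulted, and runs of whitespace are skipped (the Parsec version accepts at most one whitespace character between chunks).

```haskell
import Data.Char (isDigit, isPunctuation, isSpace, isUpper)
import Data.Maybe (isJust)

data Chunk = Num String | Desig String | Word String | Punct Char
  deriving (Show, Eq)

-- Try the non-word chunks in the same order as the parser:
-- designation, then number, then punctuation.
nonWord :: String -> Maybe (Chunk, String)
nonWord s
  | length ds >= 2, any isDigit ds, any isUpper ds = Just (Desig ds, afterDs)
  | (d:_) <- s, isDigit d =
      let (ns, rest) = span isNumChar s in Just (Num ns, rest)
  | (p:rest) <- s, isPunctuation p = Just (Punct p, rest)
  | otherwise = Nothing
  where
    (ds, afterDs) = span isDChar s
    isDChar c   = isDigit c || isUpper c
    isNumChar c = isDigit c || c `elem` ".,"

-- Split a string into chunks, skipping whitespace between them.
chunks :: String -> [Chunk]
chunks [] = []
chunks s@(c:cs)
  | isSpace c                    = chunks cs
  | Just (ch, rest) <- nonWord s = ch : chunks rest
  | otherwise                    = let (w, rest) = wordChunk s
                                   in Word w : chunks rest

-- A word runs until whitespace or until a non-word chunk could start.
wordChunk :: String -> (String, String)
wordChunk [] = ([], [])
wordChunk t@(x:xs)
  | isSpace x || isJust (nonWord t) = ([], t)
  | otherwise                       = let (w, r) = wordChunk xs in (x : w, r)

-- Render chunks back to text; all numbers get the generic placeholder
-- (assumption: no hill entry is available to match heights against).
tokeniseNoHill :: String -> String
tokeniseNoHill = unwords . map one . chunks
  where
    one (Word w)  = w
    one (Desig d) = d
    one (Punct p) = [p]
    one (Num _)   = "_OTHER_NUMBER_"
```

Applied to `"In the late 1990s, Lochaber"`, this splits `1990s,` into a number, the word `s` and a comma, which matches the `_OTHER_NUMBER_ s ,` sequence in the last output line.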