@sw17ch
Created March 15, 2012 01:00
This example depends on parsec >= 3.0 and the indents package >= 0.3.2.
To install indents, run the following command:
cabal install indents

> module Main where
> import Text.Parsec
> import Text.Parsec.Indent
> import qualified Control.Monad.State as S

First, let's define the input data we want to parse. Each list starts with its
name, which is aligned with the left-most column and ends with a colon. The
name may contain letters and digits. The items of the list follow on separate
lines, indented beneath the name.
In this example, we have two lists defined.

> test_input :: String
> test_input = unlines [
>     "foo:",
>     " bar",
>     " baz",
>     " boop",
>     "",
>     "foop:",
>     " 123",
>     " 45234kdskfjs34",
>     " 9999999"
>   ]

Rather than using raw strings everywhere, we define the two different kinds of
fields as type synonyms. In this case, we have Names and Items, which are both
represented as strings.

> type Name = String
> type Item = String

Finally, we define our goal type. We're calling it a named list. A named list
has a name and a list of items.

> data NamedList = NamedList Name [Item]
>   deriving (Show)

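For orientation, here is a small sketch of the value we are aiming for: the
first list of the test input, written out by hand. The name expectedFoo is
made up for this illustration and is not part of the program.

```haskell
type Name = String
type Item = String

data NamedList = NamedList Name [Item]
  deriving (Show)

-- Hand-written value for the "foo:" list from the test input.
-- (expectedFoo is an illustrative name only.)
expectedFoo :: NamedList
expectedFoo = NamedList "foo" ["bar", "baz", "boop"]

-- show expectedFoo gives: NamedList "foo" ["bar","baz","boop"]
```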
Here's where things get a little tricky. The Parser type synonym that ships
with Parsec assumes a few things:
1) The input stream is of type String.
2) The user state is ().
3) The underlying monad is the Identity monad.
Only the third assumption is a problem for us: the indentation parser needs a
different underlying monad. Instead of Identity, it uses (State SourcePos)
from Control.Monad.State, in which it keeps the position that indentation is
measured against. We therefore define our own parser type.

> type IParser a = ParsecT String () (S.State SourcePos) a

Since we've had to redefine Parser because of the change in the underlying monad,
we'll also redefine the 'parse' convenience function to handle our new type.
Our definition handles both the new underlying monad and the extra step needed
by the indentation parser.

> iParse :: IParser a -> String -> String -> Either ParseError a
> iParse parser file_name input =
>   runIndent file_name $ runParserT parser () file_name input

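To see how the two steps fit together: runParserT yields a computation in the
underlying State monad, and that computation still has to be run before we can
get at the Either. The sketch below does this by hand with evalState, seeded
with the initial position of the named source. It is an assumption about what
the extra step amounts to, not a copy of the library's code, and the names
runFromStart, iParse' and example are invented here. It needs only parsec and
mtl.

```haskell
import Text.Parsec
import Text.Parsec.Pos (initialPos)
import qualified Control.Monad.State as S

type IParser a = ParsecT String () (S.State SourcePos) a

-- Run the underlying State computation, starting from line 1, column 1
-- of the named source. (Hypothetical stand-in for runIndent.)
runFromStart :: SourceName -> S.State SourcePos a -> a
runFromStart name st = S.evalState st (initialPos name)

-- Same shape as iParse, built on the stand-in above.
iParse' :: IParser a -> SourceName -> String -> Either ParseError a
iParse' parser name input =
  runFromStart name (runParserT parser () name input)

-- A parser that uses no indentation combinators runs unchanged.
example :: Either ParseError String
example = iParse' (many1 alphaNum) "example" "abc123"
```

Control.Monad.State is imported qualified because Text.Parsec exports its own,
unrelated State type.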
Now, we need to define what we mean by 'a Name'. In this case, we want a name
to be defined by a non-empty string of alpha-numeric characters followed by a
colon.

> aName :: IParser String
> aName = do
>   n <- many1 alphaNum
>   _ <- char ':'
>   spaces
>   return n

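As a quick check of that definition, here is a sketch that runs the same
parser body on the stock Parser type from Text.Parsec.String, so it can be
tried without the indents package. The names nameP and accepts are
illustrative only.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Same body as aName, but on the stock Parser type.
nameP :: Parser String
nameP = do
  n <- many1 alphaNum
  _ <- char ':'
  spaces
  return n

-- True when nameP succeeds on the given input.
accepts :: String -> Bool
accepts = either (const False) (const True) . parse nameP "example"

-- accepts "foo:"  is True
-- accepts "foo"   is False (missing colon)
-- accepts ":"     is False (empty name)
-- accepts "a b:"  is False (a space is not alpha-numeric)
```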
An Item is defined similarly to a name, but we don't expect the trailing colon.

> anItem :: IParser String
> anItem = do
>   i <- many1 alphaNum
>   spaces
>   return i

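One consequence is worth spelling out: 'spaces' skips newlines as well as
blanks, so after an item the parser is already sitting on the first character
of the next line. The sketch below, again on the stock Parser type and with
the made-up names itemP and swallowed, shows that without an indentation check
the name of the following list would simply be read as one more item.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Same body as anItem, but on the stock Parser type.
itemP :: Parser String
itemP = do
  i <- many1 alphaNum
  spaces
  return i

-- No indentation check: "foop" is consumed as an item, and parsing
-- stops only at the colon.
swallowed :: Either ParseError [String]
swallowed = parse (many1 itemP) "example" "bar\n baz\nfoop:"
-- Right ["bar","baz","foop"]
```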
This parser is where we actually parse an indented block, using the
'withBlock' combinator. 'withBlock' parses a list name followed by an indented
block of items, and combines the two results with a function. Here, we pass
'NamedList' as the function that combines the name and the items.

> aNamedList :: IParser NamedList
> aNamedList = do
>   b <- withBlock NamedList aName anItem
>   spaces
>   return b

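To make the column bookkeeping concrete, here is a rough stand-alone
approximation of the check described above, written against plain Parsec with
getPosition. It is a sketch of the idea, not the indents implementation, and
column, atColumn, blockOf, nameP and itemP are all hypothetical names.

```haskell
import Control.Monad (when)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Column of the next unconsumed character.
column :: Parser Column
column = sourceColumn <$> getPosition

-- Succeed only if the next character sits at the given column.
atColumn :: Column -> Parser ()
atColumn c = do
  col <- column
  when (col /= c) (unexpected "indentation")

-- Parse a header, then items that all start at one column which is
-- deeper than the header's column. No such items means an empty list.
blockOf :: (a -> [b] -> c) -> Parser a -> Parser b -> Parser c
blockOf f header item = do
  ref <- column
  h <- header
  items <- option [] $ do
    col <- column
    when (col <= ref) (unexpected "unindented line")
    many1 (atColumn col >> item)
  return (f h items)

nameP :: Parser String
nameP = many1 alphaNum <* char ':' <* spaces

itemP :: Parser String
itemP = many1 alphaNum <* spaces

example :: Either ParseError (String, [String])
example = parse (blockOf (,) nameP itemP) "example" "foo:\n bar\n baz\nnext:"
-- Right ("foo",["bar","baz"])
```

Here "next" is left alone because it starts in column 1 rather than in the
column of the first item.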
We might want to parse more than one list out of the same stream, so that's
exactly what we do here. We clear any preceding spaces out of the input
stream and then attempt to parse at least one named list.

> someNamedLists :: IParser [NamedList]
> someNamedLists = do
>   spaces
>   lists <- many1 aNamedList
>   return lists

Finally, we invoke iParse on our parser, a dummy file name, and the test input
stream. If we get a valid result, we'll just print the Haskell representation
of the parsed NamedList values. If we get a parse error, we'll print the error
instead.

> main :: IO ()
> main = do
>   let res = iParse someNamedLists "some_source.txt" test_input
>
>   putStrLn "Indentation Parser Example"
>   case res of
>     Left err -> putStrLn $ "Error: " ++ show err
>     Right v -> print v