Skip to content

Instantly share code, notes, and snippets.

@neongreen
Forked from zearen/JSONParser.hs
Last active October 12, 2015 22:57
Show Gist options
  • Save neongreen/f157ff6f83a3c03ec2dc to your computer and use it in GitHub Desktop.
Save neongreen/f157ff6f83a3c03ec2dc to your computer and use it in GitHub Desktop.
A JSON parser/printer
{-# LANGUAGE NoMonomorphismRestriction, FlexibleContexts #-}
import Data.Functor
import Control.Monad
import Data.List
import qualified Data.Map as M
import Text.Megaparsec
import Text.Megaparsec.Lexer (signed, float, integer)
import qualified Text.Megaparsec.Lexer as L
-- This is our data structure for representing JSON. “Either a number, or a
-- string, or an array, or...”
data JSON = JNumber Double
| JString String
| JArray [JSON]
| JObject (M.Map String JSON)
| JBool Bool
| JNull
-- This converts a JSON document to a string.
instance Show JSON where
show (JNumber x) = show x
show (JString x) = show x
show (JArray x) = "[" ++ intercalate ", " (map show x) ++ "]"
show (JObject x) = "{" ++ intercalate ", " (map showPair (M.toList x)) ++ "}"
where showPair (key, value) = show key ++ ": " ++ show value
show (JBool x) = if x then "true" else "false"
show JNull = "null"
-- Here goes a specification of JSON. First of all, a JSON value is either a
-- number, or a string, or an array, etc:
jsonP = lexeme $ choice [numberP, stringP, arrayP, objectP, boolP, nullP]
-- Null is simply string "null":
nullP = symbol "null" $> JNull
-- A boolean is either "true" or "false":
boolP = symbol "true" $> JBool True
<|> symbol "false" $> JBool False
-- A number is either a float or an integer, and it has a sign. “pure ()”
-- means that no space is allowed between the sign and the
-- number. “fromInteger” converts an integer to a float (our JSON
-- representation uses floats for everything).
numberP = JNumber <$>
signed (pure ()) (try float <|> (fromInteger <$> integer))
-- Strings are more complicated. First of all, let's define an escaped char:
escapedChar = do
char '\\'
choice [
char '\"' $> '\"', -- A boring list of hardcoded values from the spec.
char '\\' $> '\\',
char '/' $> '/' ,
char 'n' $> '\n',
char 'r' $> '\r',
char 'f' $> '\f',
char 't' $> '\t',
char 'b' $> '\b',
unicodeEscape ]
-- Where a Unicode escape is something like “u2001”:
unicodeEscape = do
char 'u'
code <- count 4 hexDigitChar
return $ toEnum (read ("0x" ++ code))
-- Now, a string is just some ordinary or escaped chars between double quotes:
stringLiteral = lexeme $ do
char '\"'
(escapedChar <|> anyChar) `manyTill` char '\"'
stringP = JString <$> stringLiteral
-- An array is a list of comma-separated values between brackets:
arrayP = JArray <$> brackets (jsonP `sepBy` comma)
-- An object is a list of comma-separated association pairs between curly
-- braces, and an association pair is just “key: value”.
pairP = do
key <- stringLiteral
symbol ":"
value <- jsonP
return (key, value)
objectP = JObject . M.fromList <$> braces (pairP `sepBy` comma)
-- Okay, that's all. Here's a small wrapper that reads JSON line by line and
-- echoes everything:
main = forever $ do
line <- getLine
parseTest (jsonP <* eof) line
-- Some utils.
symbol = L.symbol space
lexeme = L.lexeme space
brackets = between (symbol "[") (symbol "]")
braces = between (symbol "{") (symbol "}")
comma = symbol ","
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment