Skip to content

Instantly share code, notes, and snippets.

@bos
Last active October 18, 2016 13:56
Show Gist options
  • Save bos/6986451 to your computer and use it in GitHub Desktop.
Save bos/6986451 to your computer and use it in GitHub Desktop.
A proof-of-concept of a new approach to encoding JSON values for aeson.
{-# LANGUAGE GeneralizedNewtypeDeriving, FlexibleInstances,
OverloadedStrings #-}
import Data.Monoid (Monoid(..), (<>))
import Data.String (IsString(..))
import Data.Text (Text)
import Data.Text.Lazy.Builder (Builder, singleton)
import qualified Data.Text.Lazy.Builder as Bld
import qualified Data.Text.Lazy.Builder.Int as Bld
-- The phantom type here allows us to say "I am encoding a value of
-- type x".
data Build a = Build {
_count :: {-# UNPACK #-} !Int
, run :: Builder
}
instance Show (Build a) where
show = show . run
data Object
data Array
data Mixed
object :: Build Object -> Build Object
object (Build 0 _) = build "{}"
object (Build _ kvs) = build $ singleton '{' <> kvs <> singleton '}'
array :: Build a -> Build Array
array (Build 0 _) = build "[]"
array (Build _ vs) = build $ singleton '[' <> vs <> singleton ']'
instance Monoid (Build a) where
mempty = Build 0 mempty
mappend (Build i a) (Build j b)
| ij > 1 = Build ij (a <> singleton ',' <> b)
| otherwise = Build ij (a <> b)
where ij = i + j
instance IsString (Build Text) where
fromString = string
instance IsString (Build Mixed) where
fromString = build . Bld.fromString
(<:>) :: Build Text -> Build a -> Build Object
k <:> v = Build 1 (run k <> ":" <> run v)
int :: Integral a => a -> Build a
int = build . Bld.decimal
text :: Text -> Build Text
text = build . Bld.fromText
string :: String -> Build Text
string = build . Bld.fromString
build :: Builder -> Build a
build = Build 1
mixed :: Build a -> Build Mixed
mixed (Build a b) = Build a b
@lpsmith
Copy link

lpsmith commented Oct 16, 2013

That's a pretty interesting attempt. I certainly see a lot of value in being able to write one definition and be able to get either an aeson value, a json builder, or potentially even other structures and syntaxes.

Some potential issues that I see:

  1. Differing run-time complexities from JsonValue instances, e.g. it's O(1) to mappend two json builder arrays, but O(m+n) to mappend two aeson value arrays. How common would it be to want to use one definition to get a value but another to get a builder? Or is there enough common ground in the interface that typical use cases can be nearly optimal to both types of output?
  2. Execution speed and code size. My intuition with this type of code is that without careful use of JsonValue instances, with known types and inlined/specialized across modules, you will lose a fair bit of performance to indirect calls to small functions. But numbers are better than speculation.

Still, I think this approach is worthy of more exploration and experimentation. (And reminds me that sometimes I really wish GHC had closed type classes/type families, because you could use them here to avoid my second concern, at least in principle, although you would be giving up the potential polymorphism in the process.)

There might be some interesting ideas lurking in Bryan's attempt, but one thing that sticks out to me is that you can use the proof-of-concept to produce syntactically invalid json, by say mappending two text values.

@basvandijk
Copy link

Hi @lpsmith

it's O(1) to mappend two json builder arrays, but O(m+n) to mappend two aeson value arrays.

For that reason I introduced the elements :: Vector json -> array method.

How common would it be to want to use one definition to get a value but another to get a builder?

One use-case that I see is for post-processing a JSON Value. This is from some code at work for example. Here the Identified type can be used to add an identifier to some other Haskell type:

data Identified id a = Identified
                       { identifiedId      :: id
                       , identifiedContent :: a
                       } deriving (Show, Typeable, Generic)

instance (ToJSON id, ToJSON a) => ToJSON (Identified id a) where
    toJSON (Identified iden x) =
        case toJSON x of
          Object obj -> Object $ H.insert "id" (toJSON iden) obj
          v -> object [ "id"      .= toJSON iden
                      , "content" .= v
                      ]

Here you can see I use toJSON internally and do case analysis on the resulting JSON Value.

My intuition with this type of code is that without careful use of JsonValue instances, with known types and inlined/specialized across modules, you will lose a fair bit of performance to indirect calls to small functions. But numbers are better than speculation.

I agree, we need some benchmarks here. Users probably need to add SPECIALIZE pragma's to their toJSON definitions.

@lpsmith
Copy link

lpsmith commented Oct 16, 2013

Err, I meant, how common would it be to want two definitions, one to get a value, and the other to get a builder, due to the differing time/space complexities of the implementations?

@basvandijk
Copy link

@lpsmith FYI: I began experimenting with this approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment