Skip to content

Instantly share code, notes, and snippets.

@begriffs
Created June 10, 2014 04:02
Show Gist options
  • Save begriffs/3b35c7b646a94695929d to your computer and use it in GitHub Desktop.
Save begriffs/3b35c7b646a94695929d to your computer and use it in GitHub Desktop.
Opaleye Preview
This is a preview of Tom Ellis' database query library for Haskell.
> {-# LANGUAGE Arrows, FlexibleContexts #-}
> -- TODO: Get rid of FlexibleContexts if we ever move the definition of s
> -- elsewhere.
> {-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}
> {-# LANGUAGE TemplateHaskell #-}
>
> module Karamaan.Opaleye.Examples where
>
> import Prelude hiding (sum)
> import Karamaan.Opaleye.Unpackspec (Unpackspec)
> import Karamaan.Opaleye.Table (makeTableDef)
> import Karamaan.Opaleye.QueryArr (Query, QueryArr)
> import Karamaan.Opaleye.Nullable (Nullable)
> import Karamaan.Opaleye.Aggregate (aggregate, groupBy, sum, avg, count)
> import qualified Karamaan.Opaleye.Operators2 as Op2
> import qualified Karamaan.Opaleye.Predicates as P
> import qualified Karamaan.Opaleye.Operators.Numeric as N
> import Karamaan.Opaleye.Wire (Wire)
> import Karamaan.Opaleye.SQL (showSqlForPostgresDefault)
> import Control.Category ((<<<))
> import Control.Arrow (arr, (&&&), returnA)
> import Data.Time.Calendar (Day)
>
> import Data.Profunctor.Product.TH (makeAdaptorAndInstance)
> import Data.Profunctor.Product (PPOfContravariant, ProductProfunctor, p2, p5)
> import Data.Profunctor (dimap)
> import Data.Profunctor.Product.Default (Default, def)
> import qualified Database.PostgreSQL.Simple as SQL
> import Karamaan.Opaleye.RunQuery as RQ
Introduction
============
In this example file I'll give you a brief introduction to the Opaleye
relational query EDSL. I'll show you how to define tables in Opaleye;
use them to generate selects, joins and filters; use the API of
Opaleye to make your queries more composable; and finally run the
queries on Postgres.
Opaleye uses HaskellDB's SQL generator. You should really use the
latest version from https://github.com/m4dc4p/haskelldb because it
patches some important bugs present in 2.2.2.
Schema
======
A Query that returns the contents of a table is defined with
'makeTableDef'. The syntax is simple. You specify the types of the
columns and the names of the columns. 'Wire X' essentially means "a
column of type X".
There is some typeclass magic which matches the names of the columns
with a field in the query result type. In Opaleye you never *have* to
use typeclasses. All the magic that typeclasses do is also available
by explicitly passing in the "typeclass dictionary". However, for
this example file we will always use the typeclass versions because
they are simpler to read.
For tuples, the typeclass instances are already defined, so you can
just go ahead and define your table specification.
> personTable :: Query (Wire String, Wire Int, Wire String)
> personTable = makeTableDef ("name", "age", "address") "personTable"
If we generate the SQL for this we see the following:
ghci> sh personTable
SELECT name as name_1,
age as age_1,
address as address_1
FROM personTable as T1
('sh' is just a conveniently named utility function for the purposes
of this example file)
Opaleye can use user defined types in queries. It will save you a lot
of headaches if you define your typeclasses to be polymorphic in all
their fields. If you want to use concrete types in particular places,
as you almost always will, you can use type synonyms. For example:
> data Birthday' a b = Birthday { bdName :: a, bdDay :: b }
> type Birthday = Birthday' String Day
> type WireBirthday = Birthday' (Wire String) (Wire Day)
To get user defined types to work with the typeclass magic they must
have instances defined for them. The instances are derivable with
Template Haskell.
> $(makeAdaptorAndInstance "pBirthday" ''Birthday')
Then we can use 'makeTableDef' to make a table that returns our record
type in exactly the same way as before. This usage makes it clear why
the fields should be polymorphic. The argument to 'makeTableDef' has
type 'Birthday' String String', whilst in the return value it is
'Birthday' (Wire String) (Wire Day)'. In practice you don't need to
worry about this, just make all your fields polymorphic!
> birthdayTable :: Query WireBirthday
> birthdayTable = makeTableDef (Birthday { bdName = "name"
> , bdDay = "birthday" })
> "birthdayTable"
ghci> sh birthdayTable
SELECT name as name_1,
birthday as birthday_1
FROM birthdayTable as T1
Projection
==========
The type 'Query a' is a type synonym for 'QueryArr () a', and
'QueryArr' is an instance of 'Arrow'. For the purposes of projection
you need to know that this means there is a combinator
arr :: (a -> b) -> QueryArr a b
and a combinator
<<< :: QueryArr b c -> QueryArr a b -> QueryArr a c
This allows us to use a function of type 'a -> b' to project columns
of type 'b' from a 'Query a'. For example, if we want just the first
and second columns of 'personTable'
> nameAge :: Query (Wire String, Wire Int)
> nameAge = arr (\(x, y, _) -> (x, y)) <<< personTable
ghci> sh nameAge
SELECT name as name_1,
age as age_1
FROM personTable as T1
Product
=======
You can take the cartesian product of two queries by using the arrow
'(&&&)' combinator. Specialised to 'Query' the type is
(&&&) :: Query a -> Query b -> Query (a, b)
For example, to take the product of the 'personTable' and the
'birthdayTable' we do
> personBirthdayProduct :: Query ((Wire String, Wire Int, Wire String),
> WireBirthday)
> personBirthdayProduct = personTable &&& birthdayTable
ghci> sh personBirthdayProduct
SELECT name_1,
age_1,
address_1,
name_2,
birthday_2
FROM (SELECT name as name_1,
age as age_1,
address as address_1
FROM personTable as T1) as T1,
(SELECT name as name_2,
birthday as birthday_2
FROM birthdayTable as T1) as T2
Note that in 'personBirthdayProduct' we end up with a nested tuple.
If you prefer to have a flattened tuple, one way of flattening it
would be with the technique we learned above in "Projection". Note
that in practice, however, you probably won't end up writing queries
in this more "low level" way which is somewhat fiddly, but use higher
level combinators. Still, this is a useful example.
> personBirthdayProduct' :: Query (Wire String, Wire Int, Wire String,
> WireBirthday)
> personBirthdayProduct' = arr (\((x, y, z), b) -> (x, y, z, b))
> <<< personBirthdayProduct
The generated SQL will be exactly the same as before.
There is a further way to do product and unflattening in one go, and
that is to use Arrow notation. Arrow notation is a GHCi extension
which is *extremely* useful when writing Opaleye queries. This is the
previous example rewritten with Arrow notation.
> personBirthdayProduct'' :: Query (Wire String, Wire Int, Wire String,
> WireBirthday)
> personBirthdayProduct'' = proc () -> do
> (x, y, z) <- personTable -< ()
> b <- birthdayTable -< ()
> returnA -< (x, y, z, b)
Again the same SQL is generated.
Arrow notation is much more convenient than Arrow combinators when
writing involved queries. As a simple introduction we'll see how to
how to add two numeric fields. This query returns all pairs of
people, and the sum of their ages.
> totalAge :: Query (Wire String, Wire String, Wire Int)
> totalAge = proc () -> do
> (name1, age1, _) <- personTable -< ()
> (name2, age2, _) <- personTable -< ()
>
> sumAge <- N.plus -< (age1, age2)
>
> returnA -< (name1, name2, sumAge)
ghci>
SELECT name_1,
name_2,
age_1 + age_2 as age_1_plus_age_2_3
FROM (SELECT name as name_1,
age as age_1
FROM personTable as T1) as T1,
(SELECT name as name_2,
age as age_2
FROM personTable as T1) as T2
Join
====
In Opaleye you express a join by taking the product of two tables and
then restricting the result to the case where an equality holds
between two columns. This approach is very general and extends not
just to joining based on equality of columns but on inequalities and
all sorts of other predicates.
If we want to join 'personTable' to 'birthdayTable' in order to look
up the birthday of every person we can do that with the following,
which is essentially a product followed by a restriction.
> personAndBirthday :: Query (Wire String, Wire Int, Wire String, Wire Day)
> personAndBirthday = proc () -> do
> (name, age, address) <- personTable -< ()
> birthday <- birthdayTable -< ()
>
> P.restrict <<< Op2.eq -< (name, bdName birthday)
>
> returnA -< (name, age, address, bdDay birthday)
ghci> sh personAndBirthday
SELECT name_1,
age_1,
address_1,
birthday_2
FROM (SELECT name as name_1,
age as age_1,
address as address_1
FROM personTable as T1) as T1,
(SELECT name as name_2,
birthday as birthday_2
FROM birthdayTable as T1) as T2
WHERE (name_1 = name_2)
Composability
-------------
This query gives us our first opportunity for a glimpse at the
enormous composability that Opaleye offers.
The relationship that we are trying to express between a person's name
and their birthday is a process which "takes in" their name and "gives
out" their birthday. 'birthdayOfPerson' encodes this this directly in
Opaleye. It "takes in" a name and returns all rows of 'birthdayTable'
which match that name (which, if the birthday table is defined
properly in our DBMS, should be only one row!). Using Arrow notation
makes this informal description of the behaviour simple to implement.
> birthdayOfPerson :: QueryArr (Wire String) (Wire Day)
> birthdayOfPerson = proc name -> do
> birthday <- birthdayTable -< ()
>
> P.restrict <<< Op2.eq -< (name, bdName birthday)
>
> returnA -< bdDay birthday
We can't generate "the SQL of" birthdayOfPerson. Since it's not a
'Query' it doesn't have any SQL! What we do with it is use it to
reimplement 'personAndBirthday'' in a more neatly-factored way.
> personAndBirthday' :: Query (Wire String, Wire Int, Wire String, Wire Day)
> personAndBirthday' = proc () -> do
> (name, age, address) <- personTable -< ()
> birthday <- birthdayOfPerson -< name
>
> returnA -< (name, age, address, birthday)
ghci> sh personAndBirthday'
SELECT name_1,
age_1,
address_1,
birthday_2
FROM (SELECT name as name_1,
age as age_1,
address as address_1
FROM personTable as T1) as T1,
(SELECT name as name_2,
birthday as birthday_2
FROM birthdayTable as T1) as T2
WHERE (name_1 = name_2)
Note that the generated SQL is exactly the same as before. Pulling
out 'birthdayOfPerson' was merely a refactoring. It didn't change
behaviour.
More joins
----------
Here's an example of restricting with an inequality condition. This
query finds everyone whose age is less than 18.
> children :: Query (Wire String, Wire Int, Wire String)
> children = proc () -> do
> row@(_, age, _) <- personTable -< ()
> -- Note: having to pull out the 'constant 18' explicitly
> -- is a bit messy. A better syntax for this is an
> -- active research project!
> eighteen <- Op2.constant 18 -< ()
> P.restrict <<< N.lt -< (age, eighteen)
>
> returnA -< row
ghci> sh children
SELECT name as name_1,
age as age_1,
address as address_1
FROM personTable as T1
WHERE (age < 18)
Here's an example with more restrictions. It returns everyone who is
not in their twenties, and lives at a specific address.
> notTwentiesAtAddress :: Query (Wire String, Wire Int, Wire String)
> notTwentiesAtAddress = proc () -> do
> row@(_, age, address) <- personTable -< ()
> twenty <- Op2.constant 20 -< ()
> thirty <- Op2.constant 30 -< ()
>
> ltTwenty <- N.lt -< (age, twenty)
> gteThirty <- N.gte -< (age, thirty)
>
> P.restrict <<< Op2.or -< (ltTwenty, gteThirty)
>
> myAddress <- Op2.constant "1 My Street, My Town" -< ()
>
> P.restrict <<< Op2.eq -< (address, myAddress)
>
> returnA -< row
ghci> sh notTwentiesAtAddress
SELECT name as name_1,
age as age_1,
address as address_1
FROM personTable as T1
WHERE (address = '1 My Street, My Town') AND (age < 20 OR age >= 30)
More composability
------------------
We can factor out some parts of the 'notTwentiesAtAddress' query. For
example we can pull out the check for being 'notTwenties' and the
check 'addressIs1MyStreet'.
> notTwenties :: QueryArr (Wire Int) (Wire Bool)
> notTwenties = proc age -> do
> twenty <- Op2.constant 20 -< ()
> thirty <- Op2.constant 30 -< ()
> ltTwenty <- N.lt -< (age, twenty)
> gteThirty <- N.gte -< (age, thirty)
> Op2.or -< (ltTwenty, gteThirty)
>
> addressIs1MyStreet :: QueryArr (Wire String) (Wire Bool)
> addressIs1MyStreet = proc address -> do
> myAddress <- Op2.constant "1 My Street, My Town" -< ()
> Op2.eq -< (address, myAddress)
>
> notTwentiesAtAddress' :: Query (Wire String, Wire Int, Wire String)
> notTwentiesAtAddress' = proc () -> do
> row@(_, age, address) <- personTable -< ()
>
> P.restrict <<< notTwenties -< age
> P.restrict <<< addressIs1MyStreet -< address
>
> returnA -< row
The generated SQL is again exactly the same as before.
> sh :: Default (PPOfContravariant Unpackspec) a a
> => Query a -> IO ()
> sh = putStrLn . showSqlForPostgresDefault
Aggregation
===========
Type safe aggregation is the jewel in the crown of Opaleye. Both
HaskellDB and Esqueleto have aggregation implementations that allow
the application programmer to produce an invalid SQL query. By
contrast, every Opaleye expression you can write generates well formed
SQL. Of course there may be bugs in the implemenation, but the idea
is that there are no bugs in the API!
By way of example, suppose we have a widget table which contains the
style, color, location, quantity and radius of widgets. We can model
this information with the following datatype.
> data Widget a b c d e = Widget' { style :: a
> , color :: b
> , location :: c
> , quantity :: d
> , radius :: e }
For the purposes of this example the style, color and location will be
strings, but in practice you'll probably want to use an abstract data
type for them.
> widgetTable :: Query (Widget (Wire String) (Wire String) (Wire String)
> (Wire Int) (Wire Double))
> widgetTable = makeTableDef (Widget' { style = "style"
> , color = "color"
> , location = "location"
> , quantity = "quantity"
> , radius = "radius" })
> "widgetTable"
>
> $(makeAdaptorAndInstance "pWidget" ''Widget)
Here we see the first explict use of our Template Haskell derived
code. We use the 'pWidget' "adaptor" to specify how columns are
aggregated. Note that this is yet another example of avoiding a
headache by keeping your datatype fully polymorphic, because the
'count' aggregator changes a 'Wire String' into a 'Wire Int'.
'aggregateWidgets' groups by the style and color of widgets,
calculating how many (possibly duplicated) locations there are, the
total number of such widgets and their average radius.
> aggregateWidgets :: Query (Widget (Wire String) (Wire String) (Wire Int)
> (Wire Int) (Wire Double))
> aggregateWidgets = aggregate (pWidget (Widget' { style = groupBy
> , color = groupBy
> , location = count
> , quantity = sum
> , radius = avg }))
> widgetTable
The Opaleye corresponds closely to the generated SQL.
ghci> sh widgetTable
SELECT style as style_1_2,
color as color_1_2,
COUNT(location) as location_1_2,
SUM(quantity) as quantity_1_2,
AVG(radius) as radius_1_2
FROM widgetTable as T1
GROUP BY style,
color
Running queries on Postgres
===========================
Opaleye provides simple facilities for running queries on Postgres.
Other DBMSes are not forbidden, but have just not been tried!
For example we can run the 'notTwentiesAtAddress' query as below.
Note that this particular formulation uses typeclasses so please put
type signatures on everything in sight to minimize the number of
confusing error messages!
> notTwentiesQuery :: SQL.ConnectInfo -> IO [(String, Int, String)]
> notTwentiesQuery connectInfo = RQ.runQueryDefaultConnectInfo connectInfo
> notTwentiesAtAddress
Note that nullable columns are indicated with the Nullable type
constructor, and these are converted to Maybe when executed. If we
have a table with a nullable column like the following
> widgets :: Query (Wire String, Wire (Nullable Int))
> widgets = makeTableDef ("widget_location", "widget_quantity") "widgets_table"
then when we run it the nullable columns turns into a column of Maybes
> widgetsQuery :: SQL.ConnectInfo -> IO [(String, Maybe Int)]
> widgetsQuery connectInfo = RQ.runQueryDefaultConnectInfo connectInfo widgets
Conclusion
==========
There ends the Opaleye introductions module. Please send me your questions!
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment