Skip to content

Instantly share code, notes, and snippets.

@erantapaa
Created October 20, 2015 16:54
Show Gist options
  • Save erantapaa/1a8b316601ae409a4b3d to your computer and use it in GitHub Desktop.
Save erantapaa/1a8b316601ae409a4b3d to your computer and use it in GitHub Desktop.
pretty printing HTML with Conduit
{-# LANGUAGE NoMonomorphismRestriction #-}
-- build-depends: base >= 4.7 && < 5, xml-conduit, text, conduit, transformers, resourcet, html-conduit, conduit-extra
module Lib
where
import qualified Text.XML.Stream.Parse as P
import qualified Text.XML.Stream.Render as R
import qualified Data.Conduit.List as CL
import qualified Data.Text as T
import qualified Data.Text.IO as T
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Resource (runResourceT)
import Data.Conduit
import Data.Conduit.Binary (sourceFile)
import Text.HTML.DOM (eventConduit)
-- this uses the XML parser; you'll get staircasing because entities
-- like <img>, <link> and <meta> are not auto-closed.
test3 :: IO ()
test3 = (runResourceT . runConduit)
$ P.parseFile (P.def { P.psDecodeEntities = P.decodeHtmlEntities }) "input.html"
=$= R.prettify
=$= R.renderText R.def
=$= CL.mapM_ (\x -> liftIO (T.putStr x))
-- uses `eventConduit` which uses HTML auto-closing conventions.
test4 :: IO ()
test4 = (runResourceT . runConduit)
$ sourceFile "input.html"
=$= eventConduit
=$= R.prettify
=$= R.renderText R.def
=$= CL.mapM_ (\x -> liftIO (T.putStr x))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment