Created
March 21, 2012 17:46
A script I wrote in Literate Haskell using Shelly
#!/bin/sh
# pandoc -f markdown+lhs -t html5 --smart --css https://raw.github.com/richleland/pygments-css/master/default.css s5topdf.lhs
# pandoc -f markdown+lhs -t html5 --smart --css s5topdf.css s5topdf.lhs
pandoc -f markdown+lhs -t html5 --smart s5topdf.lhs
nmap \h :w<CR>:call Send_to_Tmux("./generate-html.sh > index.html\n")<CR>
nmap \c :w<CR>:call Send_to_Tmux("./generate-html.sh \| pbcopy\n")<CR>
#!/bin/bash
runhaskell ./s5topdf.lhs "$@"
table.sourceCode, tr.sourceCode, td.lineNumbers, td.sourceCode {
  margin: 0; padding: 0; vertical-align: baseline; border: none; }
table.sourceCode { width: 100%; }
td.lineNumbers { text-align: right; padding-right: 4px; padding-left: 4px; color: #aaaaaa; border-right: 1px solid #aaaaaa; }
td.sourceCode { padding-left: 5px; }
code > span.kw { color: #007020; font-weight: bold; }
code > span.dt { color: #902000; }
code > span.dv { color: #40a070; }
code > span.bn { color: #40a070; }
code > span.fl { color: #40a070; }
code > span.ch { color: #4070a0; }
code > span.st { color: #4070a0; }
code > span.co { color: #60a0b0; font-style: italic; }
code > span.ot { color: #007020; }
code > span.al { color: #ff0000; font-weight: bold; }
code > span.fu { color: #06287e; }
code > span.er { color: #ff0000; font-weight: bold; }
pre.sourceCode {
    margin-left: 1cm;
    margin-right: 1cm;
    padding: 0.5em 1em;
    border: 1px solid #888;
    -moz-border-radius: 0.5em;
    -webkit-border-radius: 0.5em;
    border-radius: 0.5em;
    -moz-box-shadow: 5px 5px 5px #888;
    -webkit-box-shadow: 5px 5px 5px #888;
    box-shadow: 5px 5px 5px #888;
}
Shell Programming in Haskell: Converting S5 Slides to PDF
=========================================================

Recently, I gave an introduction to Python for Chris' and Kelly's [GIS
Workshop][s12gis]. It was a really great experience, and we had a lot of fun
learning about Python and how to use it with ArcGIS.

I did [my slides][slides] for it in Markdown, using [S5][s5]. Others around the
Scholars' Lab have used [Showoff][showoff] to compose slideshows in Markdown,
but I wanted something a little simpler, and it had been a while since I'd
looked at S5, so I used that instead.

Then Kelly asked me for a PDF version of the slideshow. Heh.

At first I thought I might have to convert it to Showoff or (worse yet)
PowerPoint. But I Googled around and found that converting it wouldn't be too
difficult. The process itself would be simple, and a small shell script would
make it even easier.

<img src="http://www.scholarslab.org/wp-content/uploads/2012/03/philosoraptor.jpg" style="float: left; padding-right: 0.5em;">
And then my infallible instinct to make any project ten times more interesting
(i.e., *complicated*) kicked in.

I remembered that I'd just read Greg Weber's post about [Shelly][shelly], a
library to make shell scripting a bit easier in Haskell. I've been seriously
playing with Haskell for almost a year now, using it for most of my
side projects and for anything that no one else will have to maintain. The
thought of using Haskell for shell scripting was intriguing, just because it
would be another way for me to wrap my head around this very different computer
language.

But I was skeptical. At first glance, Haskell doesn't seem like a good
candidate for shell programming. Typically, these scripts are quick, one-off
programs, often written in anger, that need to be created quickly and nimbly
(dare I say, *agily*?). However, Haskell is statically typed, and its type
system is not given to making quick changes. (Well, I've found that not to be
quite accurate, but it is the perception.) Generally, I think that languages
like Haskell are more suited to larger systems, because their power and
concision really only become apparent when working with large bodies of code.

Whatever my reaction, though, a small script like this, with limited scope,
seemed perfect.

The Process
-----------

The process I found to handle the conversion was fairly simple.

1. Get a PNG screenshot of each slide using [webkit2png][webkit2png];
2. Concatenate all of the PNGs into a PDF using the [ImageMagick][magick] tool
   `convert`;
3. Clean up the PNGs.

With that laid out, let's jump in.

Preface
-------

First, some bookkeeping: I have to let Haskell know that I'm going to use
string literals in places that require [Data.Text.Text][text] values:

\begin{code}
{-# LANGUAGE OverloadedStrings #-}
\end{code}

Also, we have to import the [Shelly][shelly] module.

\begin{code}
import Shelly
\end{code}

And we need some other modules for working with characters, text, file paths,
and command-line arguments.

\begin{code}
import           Control.Monad (forM_)
import qualified Data.Char as C
import qualified Data.Text.Lazy as T
import           Filesystem.Path
import           Prelude hiding (FilePath)
import           System.Environment
\end{code}

Converting to PNGs
------------------

The first step is taking screenshots of each slide. To do that, I used the
[webkit2png][webkit2png] script.

For most things, I'm using Python 2.7, but I haven't bothered installing
`pyobjc` for it. `webkit2png` uses `pyobjc`, though, so I have to run that
program with Python 2.6, which is the default Python shipped with Mac OS X 10.6.

I only generate the full-sized screenshot, and I output it to a filename that
includes the slide number. In Bash, that would look like this:

```bash
python2.6 $(which webkit2png) \
    --fullsize \
    --filename pythongis-000 \
    http://people.virginia.edu/~err8n/pythongis/#slide0
```

First, let's create a generic function to run commands in Python 2.6. In
Shelly, the convention is to add an underscore to functions that throw away
their output:

\begin{code}
python26_ script args = run_ "python2.6" (script:args)
\end{code}

This is kind of interesting because I wouldn't abstract this out if I were
writing this in Bash, Python, or Ruby. But adding this function felt quite
natural in Haskell, which tends to encourage smaller, more generic, yet more
focused, functions.

Now I'll build on that to create a command to look for the program
`webkit2png`, and if it finds it, pass it to Python 2.6:

\begin{code}
webkit2png_ filename url = do
    script <- which "webkit2png"
    case script of
        Nothing      -> echo "ERROR: webkit2png not installed."
        Just script' -> do
            s <- toTextWarn script'
            python26_ s [ "--fullsize"
                        , "--filename", filename
                        , url
                        ]
\end{code}

This could be better. The command prints an error message if `webkit2png`
isn't available, but the script then carries on regardless; it should probably
also short-circuit everything that follows. The way to do this in Haskell would
be to return a [Maybe][maybe] value, which is what the `which` function above
does. In this case, I know that the program is installed and on the `PATH`,
so I'm being a little sloppy.
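
As a rough illustration of that short-circuiting idea, here is a minimal sketch
in plain Haskell, outside of Shelly. The names `Lookup`, `screenshotCommand`,
and `planAll` are hypothetical, and the lookup function is only a stand-in for
something like `which`. It uses the Prelude's `FilePath`, not the one the script
imports.

```haskell
-- Hypothetical names throughout; none of this is part of Shelly.

-- | Stand-in for looking a program up on the PATH.
type Lookup = String -> Maybe FilePath

-- | Build the command line for one slide, or Nothing if the
-- screenshot tool cannot be found.
screenshotCommand :: Lookup -> String -> String -> Maybe [String]
screenshotCommand lookupTool filename url = do
    script <- lookupTool "webkit2png"   -- a Nothing here ends the block early
    return ["python2.6", script, "--fullsize", "--filename", filename, url]

-- | Plan every (filename, url) pair; one Nothing makes the whole
-- result Nothing, so no later step would run.
planAll :: Lookup -> [(String, String)] -> Maybe [[String]]
planAll lookupTool = mapM (uncurry (screenshotCommand lookupTool))
```

Because `Maybe` is a monad, the missing tool propagates automatically: the
caller only has to check one result before doing any work.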

Converting to PDF
-----------------

The next step is to concatenate all the PNGs into one PDF. I'm using the
`convert` program from [ImageMagick][magick] to do this. This takes a list of
PNG files to convert, the name of the PDF file, and generates the output.

\begin{code}
convert :: FilePath -> [FilePath] -> ShIO ()
convert pdf pngs = run_ "convert" =<< mapM toTextWarn (pngs ++ [pdf])
\end{code}

Working on Multiple Files
-------------------------

Right now, `webkit2png_` (the function to download the slides as PNGs) operates
on a single slide. But we'll need to do this for every slide in the show.
`downloadSlides` takes the number of slides and the base URL, and it calls
`webkit2png_` for each slide. It returns a list of file names for the
downloaded PNGs.

\begin{code}
downloadSlides :: Int -> String -> ShIO [FilePath]
downloadSlides slideCount baseUrl = do
    forM_ inputs $ \(url, file) -> webkit2png_ file url
    return files'
    where
        baseUrl' = T.pack $ baseUrl ++ "#slide"
        range    = map (T.pack . show) [0..slideCount]
        urls     = map (T.append baseUrl') range
        files    = map (T.append "slide-") range
        files'   = map (fromText . flip T.append "-full.png") files
        inputs   = zip urls files
\end{code}

The only wrinkle here is that the file names that are passed to `webkit2png`
aren't the ones that are output. Instead, the program appends the size of the
image (thumbnail, full, etc.) and the ".png" extension. Since I want to operate
on those files later, I have to create both the file name prefix to pass to
`webkit2png` and the full file name to process later. This is unfortunate and
brittle, because if `webkit2png` ever changes how it names the output files, my
script will break.
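
To make the two kinds of names concrete, here is a small sketch using plain
`String`s instead of `Text`. The helper names (`slidePrefix`, `fullSizeName`,
`slideJobs`) are hypothetical, and the `-full.png` suffix is simply the naming
behavior described above, taken on trust.

```haskell
-- Hypothetical helpers; the "-full.png" suffix is an assumption about
-- how webkit2png names its full-size output.

-- | Prefix handed to webkit2png with --filename.
slidePrefix :: Int -> String
slidePrefix n = "slide-" ++ show n

-- | File name webkit2png is assumed to write for a full-size capture.
fullSizeName :: String -> String
fullSizeName prefix = prefix ++ "-full.png"

-- | (URL, prefix, expected output file) for slides 0 through lastSlide.
slideJobs :: Int -> String -> [(String, String, String)]
slideJobs lastSlide baseUrl =
    [ (baseUrl ++ "#slide" ++ show n, prefix, fullSizeName prefix)
    | n <- [0 .. lastSlide]
    , let prefix = slidePrefix n
    ]
```

Keeping the suffix in one function means there is a single place to change if
the naming scheme ever does. Note that a range like `[0 .. lastSlide]` is
inclusive at both ends, so `slideJobs 2 url` describes three slides.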

This is also shell-script sloppy in another way. I should really create a
temporary directory and download the PNGs there. Maybe someday.

Putting it all Together and Getting the Inputs
----------------------------------------------

All the pieces are in place. The only things left are to parse the command-line
arguments, call `downloadSlides` and `convert`, and delete the downloaded PNGs.

The `main` function is the entry point for the script. It picks three
parameters from the command line and tries to read the first as an `Int`. If
that can't happen for any reason, it prints the usage message and exits. If the
command line is right, the script continues processing.

\begin{code}
main :: IO ()
main = shelly $ verbosely $ do
    args <- liftIO getArgs
    case args of
        -- Require a non-empty, all-digit slide count so that `read` cannot fail.
        [slides, url, pdf] | not (null slides) && all C.isDigit slides -> do
            pngs <- downloadSlides (read slides) url
            convert (fromText $ T.pack pdf) pngs
            echo . T.pack $ "Wrote PDF to " ++ pdf
            mapM_ rm_f pngs
        _ -> echo usage
\end{code}
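
The same argument check can also be sketched as a pure function, which makes it
easy to try out in GHCi. This version is only an illustration: `Options` and
`parseArgs` are hypothetical names, and it relies on `readMaybe` from
`Text.Read` in `base` rather than on inspecting the characters by hand.

```haskell
import Text.Read (readMaybe)

-- | Hypothetical record holding the three command-line arguments.
data Options = Options
    { optSlides :: Int
    , optUrl    :: String
    , optOutput :: String
    } deriving (Eq, Show)

-- | Accept exactly three arguments whose first parses as a
-- non-negative Int; anything else yields Nothing.
parseArgs :: [String] -> Maybe Options
parseArgs [slides, url, pdf] = do
    n <- readMaybe slides
    if n >= 0 then Just (Options n url pdf) else Nothing
parseArgs _ = Nothing
```

A caller would print the usage message on `Nothing` and carry on with the
parsed `Options` otherwise.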

This is the usage/help message.

\begin{code}
usage :: T.Text
usage = "\
    \usage: s5topdf.lhs [slides] [url] [output] \n\
    \ \n\
    \ slides is the number of slides in the slideshow.\n\
    \ url is the URL to access the slideshow at.\n\
    \ output is the filename of the PDF file to create.\n"
\end{code}

Running
-------

To run this script, pass it to `runhaskell` with the right command-line
arguments. For example, here's a small [wrapper script][wrapper].

Conclusion
----------

Using Haskell for shell programming hasn't been bad, but it's not as fast as
shell programming usually is, either. This is still more verbose than the Bash,
Python, or Ruby versions would be, and it took me (a little) longer to write.
(Of course, I was unfamiliar with several of these libraries, and that slowed
me down.)

However, I needed to do almost no debugging. Once I got the types to line up
and `runghc` to stop complaining, it just worked. There were no bugs hiding in
parts that hadn't run yet. Based on experience with other languages, I'd
expected to have to tweak the `convert` function (the second stage of
processing) once I got the `webkit2png` part working (the first stage). But
that wasn't necessary. After I coaxed the complete script into printing the
usage message, everything else worked flawlessly.

The bottom line: For very short one-off scripts, this seems like overkill. For
scripts that you expect to grow, Haskell plus Shelly might be more attractive.

Second Conclusion
-----------------

One of the things that attracts me to Haskell is its history of using
[literate programming][literate]. In fact, I'm using it right now. This post
was generated from the script itself. I've posted the raw version to a
[gist][gist], so you can compare them.

Using literate Haskell was a success. I really liked being able to interleave
extended commentary with the code and to have both be part of the final
product. I think it changed the nature of both the script and the post. This
might not work as well for larger projects with more lines of code and multiple
modules, but for a small script, it was very comfortable. I can see doing this
again for descriptions of small algorithms, projects, or demos.

Also, having this file double as a script *and* the post is kind of neat, at
least for the moment.

[markdown]: http://daringfireball.net/projects/markdown/ "Markdown"
[s12gis]: http://tinyurl.com/s12gis "GIS Workshop"
[slides]: http://people.virginia.edu/~err8n/pythongis/ "The Slides in Question"
[s5]: http://meyerweb.com/eric/tools/s5/ "S5: A Simple Standards-Based Slide Show System"
[showoff]: https://github.com/schacon/showoff "showoff"
[haskell]: http://www.haskell.org/haskellwiki/Haskell "Haskell"
[shelly]: http://www.yesodweb.com/blog/2012/03/shelly-for-shell-scripts "Shelly for Shell Scripts"
[literate]: http://en.wikipedia.org/wiki/Literate_programming "Wikipedia: Literate Programming"
[gist]: https://gist.github.com/2150126 "The raw script"
[wrapper]: https://gist.github.com/2150126#file_s5topdf "A wrapper script"
[text]: http://hackage.haskell.org/package/text "Data.Text package"
[webkit2png]: http://www.paulhammond.org/webkit2png/ "webkit2png"
[maybe]: http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Maybe.html "Data.Maybe package"
[magick]: http://www.imagemagick.org/script/index.php "ImageMagick"

\begin{code}
-- vim: set filetype=lhaskell:
\end{code}