So imagine you have the following file structure:

```
.
├── _plugins
├── _posts
│   ├── 2013-01-01-test-post.md
│   └── 2013-10-02-blah.md
├── _templates
│   ├── base.html
│   └── post.html
└── config.yml

3 directories, 5 files
```
The idea (of static site generators) is that you can run `foo-ssg build` and get the following:
```
.
├── index.html
└── posts
    └── 2013
        ├── index.html
        ├── 01
        │   ├── index.html
        │   └── 01
        │       ├── index.html
        │       └── test-post.html
        └── 10
            ├── index.html
            └── 02
                ├── index.html
                └── blah.html
```
Or something similar, at least. (In this case, I'm imagining the index.html files under the date directories to be indexes per day/week/month/whatever.)
So to accomplish this, we could use Jekyll/Hyde/Pelican/pandoc/whatever. BUT THAT'S NO FUN. Those systems are super complicated, especially to extend. (Jekyll is the best, in my experience, and it still fails in counterintuitive ways.)
So since I'm apparently obsessed with stream processing and functional programming, we need three kinds of nodes/functions/stages to transform the former file structure into the latter: converters, renderers, and filters. The graph ends up looking like:
```dot
digraph ssg {
    # push filenames (optionally with content) to converters
    input -> converter;
    # converters can also listen to other converters
    converter -> converter;
    # all converter output gets filtered
    converter -> filter;
    # the last stage is the renderer - this would probably just be
    # putting rendered templates into HTML
    filter -> renderer;
}
```
Converters take *n* files' worth of input and produce *m* (*m* ≥ *n*) objects' worth of structured data. They can depend on other converters (for example, needing all the posts available to produce archive pages).
For example, the following markdown file:

```markdown
---
title: Blah
date: 2013-10-02
---
# Test!

First paragraph

Second paragraph
```
could produce this JSON object:

```js
{
  "title": "Blah",
  "date": [2013, 10, 2], // punting on the date format here; probably would be a unix timestamp
  "summary": "<p>First paragraph</p>",
  "content": "<p>First paragraph</p><p>Second paragraph</p>",
  "type": "post",
  "source": ["_posts", "2013-10-02-blah.md"]
}
```
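Concretely, such a converter might look like the following sketch. It hand-rolls the front-matter parse and paragraph wrapping where a real converter would use proper YAML and Markdown libraries, and it drops headings rather than rendering them, since the example JSON above only keeps the paragraphs:

```python
def convert_post(source_path, text):
    """Split front matter from the body and emit structured data.

    A minimal sketch: no real YAML or Markdown parsing, just enough
    to handle the example post above.
    """
    _, front, body = text.split("---\n", 2)
    meta = dict(line.split(": ", 1) for line in front.strip().splitlines())
    # Treat blank-line-separated chunks as paragraphs; skip '#' headings.
    paragraphs = [p.strip() for p in body.split("\n\n")
                  if p.strip() and not p.strip().startswith("#")]
    html = ["<p>%s</p>" % p for p in paragraphs]
    return {
        "title": meta["title"],
        "date": [int(n) for n in meta["date"].split("-")],
        "summary": html[0],
        "content": "".join(html),
        "type": "post",
        "source": source_path,
    }
```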
Converters could handle any task that requires conversion of one form to another (PNG crushing and SCSS/LESS compilation come to mind).
Filters take *n* items of structured data and produce *m* (*m* ≤ *n*) unchanged items of structured data.
So given:

```js
{
  // snip
  "draft": true
}
```
A draft filter would not produce anything, except maybe a log message.
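A sketch of that draft filter, assuming the `draft` key from the snippet above (the function name is invented for illustration):

```python
def drop_drafts(items):
    """Filter stage: pass items through unchanged, dropping drafts.

    Emits a log line for each dropped item instead of producing output.
    """
    for item in items:
        if item.get("draft"):
            print("skipping draft: %s" % item.get("title", "?"))
            continue
        yield item
```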
Finally, a renderer takes *n* items of structured data and produces *n* HTML (or whatever) pages, plus a path to each. Ideally, each converter has a renderer. So let's take the example of an Atom feed creator, in pseudo-Python code:
`converters/atom.py`:

```python
return sorted(blogposts, key=lambda p: p['date'], reverse=True)[:10]
```

`renderers/atom.py`:

```python
return jinja2.render('atom.xml', posts=atom_posts), 'atom.xml'
```
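Fleshed out just enough to run, with plain string formatting standing in for the Jinja2 template (the feed markup here is deliberately oversimplified), the pair might look like:

```python
def atom_converter(blogposts):
    """Pick the ten newest posts for the feed."""
    return sorted(blogposts, key=lambda p: p["date"], reverse=True)[:10]

def atom_renderer(atom_posts):
    """Render the feed body, plus the path it should be written to."""
    entries = "".join("<entry><title>%s</title></entry>" % p["title"]
                      for p in atom_posts)
    return "<feed>%s</feed>" % entries, "atom.xml"
```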
My thought is that since none of these things are necessarily Python-specific, it would be easiest to allow implementing them in a diverse body of languages. If BoingBoing or someone wants to write a Go program because their enormous blog is taking too long to render from markdown, they should be able to. So we need a two-stage build process for each "node":
- **Discovery**
  - The controller system finds built-in and site-specific nodes in specified folders, and runs each with a JSON input describing the site's configuration.
  - The node responds with a JSON output describing which streams it wants to subscribe to.
- **Rendering**
  - The controller sends each known converter node all the content that matches its subscriptions.
  - The converter nodes respond with structured data.
  - The controller passes the structured data to the filters (again matching subscriptions), paying attention to their filter-or-not signals.
  - The controller sends all content to the specified renderers.
  - The renderers return final rendered versions of each file, plus filenames.
  - The controller creates the necessary files and folders, filled with content.
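Ignoring the cross-language JSON protocol and running everything in-process, the rendering steps above amount to something like this sketch (subscriptions are elided, so every stage just sees everything):

```python
import os

def build(inputs, converters, filters, renderers, out_dir="_site"):
    """Wire the stages together: converters fan inputs out into
    structured data, filters thin it, renderers emit (body, path)
    pairs that get written under out_dir.

    A minimal in-process sketch of the controller's rendering phase.
    """
    items = []
    for convert in converters:
        items.extend(convert(inputs))
    for filt in filters:
        items = list(filt(items))
    for render in renderers:
        for body, path in render(items):
            full = os.path.join(out_dir, path)
            os.makedirs(os.path.dirname(full), exist_ok=True)
            with open(full, "w") as f:
                f.write(body)
```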
... actually, on second thought this is gonna be a ton easier to just write Python classes for. We'll start there.