Literate programming is a technique introduced by Donald Knuth many years ago. Nowadays literate programming is almost dead, which is really sad in my opinion. This little application is designed to bring the literate programming approach to almost any programming language.
I strongly recommend reading Knuth's original paper on Literate Programming (http://www.literateprogramming.com/knuthweb.pdf).
Noweb.py by Jonathan Aquino was the inspiration for this humble piece of code.
It was surprisingly easy to implement this tool. The main idea is to parse the file in a single pass, line by line, detecting chunks and using a map to store their names and bodies.
In the second part of processing we recursively 'expand' the chunk bodies, replacing references to other chunks with their bodies to get the full program.
To process files this application uses the os, io, bufio and regexp packages. The flag package is used to parse command line parameters. It's a bit clunky, but it's OK.
Right after start the application tries to parse the command line parameters. If some vital data is not defined, the application shows usage and exits. There are 4 parameters overall:
- --src-out: File name for code output (tangle output)
- --doc-out: File name for document output (weave output)
- --default-chunk: Default chunk name. The chunk with this name is considered to hold the main program code. By default its name is "*"
- The first argument after all options is taken as the name of the file to parse
As I mentioned above, we use the flag package to parse the command line. For every command line argument there is a variable defined. The default value for the src-out and doc-out parameters is the empty string.
In Go this is the "zero value" for a string, so we can catch the situation when the user omits one or the other parameter. The default value for default-chunk is always "*".
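The flag setup described above could look roughly like the sketch below. Only the flag names and their defaults come from the text; the options struct, its field names and the parseArgs helper are hypothetical, and a separate FlagSet is used here (instead of the package-level flags) purely so the function can be called more than once.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options holds the parsed command line. The struct and its field names are
// hypothetical; only the flag names and defaults come from the description.
type options struct {
	srcOut       string // --src-out, "" when omitted
	docOut       string // --doc-out, "" when omitted
	defaultChunk string // --default-chunk, "*" when omitted
	input        string // first positional argument, "" when omitted
}

// parseArgs parses args (without the program name) using its own FlagSet,
// so it can be called repeatedly.
func parseArgs(args []string) (options, error) {
	var opts options
	fs := flag.NewFlagSet("lit", flag.ContinueOnError)
	fs.StringVar(&opts.srcOut, "src-out", "", "file name for code output (tangle output)")
	fs.StringVar(&opts.docOut, "doc-out", "", "file name for document output (weave output)")
	fs.StringVar(&opts.defaultChunk, "default-chunk", "*", "name of the chunk holding the main program code")
	if err := fs.Parse(args); err != nil {
		return opts, err
	}
	// Arg returns "" when the requested positional argument does not exist.
	opts.input = fs.Arg(0)
	return opts, nil
}

func main() {
	opts, err := parseArgs(os.Args[1:])
	if err != nil {
		// With ContinueOnError the FlagSet has already reported the problem.
		os.Exit(2)
	}
	fmt.Printf("%+v\n", opts)
}
```

The flag package treats one and two leading dashes the same, so -src-out and --src-out are equivalent.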
If there is no file to parse we can't do anything except show usage. Another case is when both src-out and doc-out are missing. In this situation the application shows usage too, because it can't do anything useful with the given file.
But if /only one/ of them is missing, the application can dump the source code or the documentation without dumping the other part.
For example, if you want to generate both documentation and source from some file source.w, you should run:
lit --src-out source.c --doc-out source.tex source.w
But if you need only the source, you can omit the doc-out parameter:
lit --src-out source.c source.w
The same works the other way round: omit src-out to get only the documentation.
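The rule for when the tool can do useful work is small enough to sketch as one function. The name validate and the error messages are hypothetical; the two conditions are the ones described above.

```go
package main

import (
	"errors"
	"fmt"
)

// validate reports why nothing useful can be done with the given arguments,
// or nil when at least one output can be produced.
func validate(input, srcOut, docOut string) error {
	if input == "" {
		return errors.New("no file to parse")
	}
	if srcOut == "" && docOut == "" {
		return errors.New("neither src-out nor doc-out given")
	}
	return nil
}

func main() {
	fmt.Println(validate("source.w", "source.c", "source.tex")) // both outputs
	fmt.Println(validate("source.w", "source.c", ""))           // source only
	fmt.Println(validate("source.w", "", ""))                   // nothing to write
	fmt.Println(validate("", "source.c", ""))                   // no input file
}
```

A caller would show usage and exit whenever validate returns a non-nil error.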
The file parsing process is extremely straightforward. After the file is opened we read it line by line, trying to match each line against one of two regular expressions.
The expression "<<([^>]+)>>=" is used to match the beginning of a chunk, and "@" the end of a chunk.
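As a rough illustration of how the two expressions classify a line, here is a minimal sketch. The classify helper and its return values are hypothetical. The patterns are also anchored with ^ and $, which is an assumption on my part: the text gives them unanchored, but an unanchored "@" would also match a line that merely contains an e-mail address.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Beginning of a chunk: <<name>>= on a line of its own (anchoring assumed).
	chunkStart = regexp.MustCompile(`^<<([^>]+)>>=\s*$`)
	// End of a chunk: a lone @ (anchoring assumed).
	chunkEnd = regexp.MustCompile(`^@\s*$`)
)

// classify returns "start" plus the chunk name, "end", or "text".
func classify(line string) (kind, name string) {
	if m := chunkStart.FindStringSubmatch(line); m != nil {
		return "start", m[1] // m[1] is the first submatch: the chunk name
	}
	if chunkEnd.MatchString(line) {
		return "end", ""
	}
	return "text", ""
}

func main() {
	for _, line := range []string{"<<main>>=", "x := 1", "@", "mail me at a@b.c"} {
		kind, name := classify(line)
		fmt.Printf("%-20q %s %q\n", line, kind, name)
	}
}
```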
After a chunk beginning is found we extract its name from the submatches and store it in the variable chunkName. After that, any line not matched by either regular expression is appended to the map named chunks, with the value of chunkName as the key.
If a line matches the end-of-chunk expression, chunkName is set to its zero value. If neither expression matches a line and chunkName is set to its zero value, that line is appended to the document string variable.
As a result of execution, the parseFile function returns the document string and the chunks map.
To simplify the processing of every line, a closure named processLine is defined. This closure decides where the line currently being processed goes: to a chunk body or to the documentation.
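Putting the parsing steps together, a single-pass parser might be sketched as follows. This is not the actual parseFile: it reads from an io.Reader instead of opening a file, the function names parse and parseString are hypothetical, and the anchored patterns are an assumption. The names chunkName, chunks and processLine follow the description.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"regexp"
	"strings"
)

var (
	chunkStart = regexp.MustCompile(`^<<([^>]+)>>=\s*$`) // anchoring assumed
	chunkEnd   = regexp.MustCompile(`^@\s*$`)            // anchoring assumed
)

// parse splits literate source into documentation text and named chunks.
func parse(r io.Reader) (string, map[string]string, error) {
	chunks := make(map[string]string)
	var document strings.Builder
	chunkName := "" // zero value means "outside any chunk"

	// processLine decides where the current line goes.
	processLine := func(line string) {
		if m := chunkStart.FindStringSubmatch(line); m != nil {
			chunkName = m[1]
			return
		}
		if chunkEnd.MatchString(line) {
			chunkName = ""
			return
		}
		if chunkName != "" {
			chunks[chunkName] += line + "\n" // part of a chunk body
			return
		}
		document.WriteString(line + "\n") // part of the documentation
	}

	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		processLine(scanner.Text())
	}
	return document.String(), chunks, scanner.Err()
}

// parseString is a small convenience wrapper around parse.
func parseString(src string) (string, map[string]string, error) {
	return parse(strings.NewReader(src))
}

func main() {
	doc, chunks, err := parseString("Intro\n<<*>>=\nline one\n<<helper>>\n@\nOutro\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("document: %q\nchunks: %q\n", doc, chunks)
}
```

Because the body is appended with +=, a chunk that is defined more than once under the same name simply accumulates all of its parts in order.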
Every chunk body can contain any number of links to other chunks. To build the whole program from the literate source we need to "expand" every chunk body by replacing links to other chunks with their bodies.
First of all we define the data structure for the "final" expanded chunks, expandedChunks. After that we define a regular expression that matches "links" to other chunks.
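The text does not spell out the link expression, so the sketch below assumes the noweb-style form <<name>> and the pattern "<<([^>]+)>>". The names chunkLink and linkedChunks are hypothetical. It only lists the chunks a body refers to, which is the first half of the expansion step.

```go
package main

import (
	"fmt"
	"regexp"
)

// chunkLink is an assumed pattern for references of the form <<name>>.
var chunkLink = regexp.MustCompile(`<<([^>]+)>>`)

// linkedChunks returns the names of all chunks referenced in body, in order.
func linkedChunks(body string) []string {
	var names []string
	for _, m := range chunkLink.FindAllStringSubmatch(body, -1) {
		names = append(names, m[1]) // m[1] is the name between << and >>
	}
	return names
}

func main() {
	body := "int main() {\n    <<read input>>\n    <<print result>>\n}\n"
	fmt.Printf("%q\n", linkedChunks(body))
}
```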
The expand-body closure, defined inside the expandChunks function, takes a body as an argument and searches it for links to other chunks. It then takes every linked chunk name and replaces it with the result of a recursive self-invocation on the linked chunk's body.
If there are no linked chunks, the closure just returns the given body. Maybe I should check whether expandedChunks already has an expanded body for the linked chunk, to avoid extra work.
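One way the recursive expansion could look, including the expandedChunks lookup contemplated above, is sketched here. It differs from the description in that the closure takes a chunk name rather than a body, since the cache is keyed by name. The link pattern is assumed to be "<<([^>]+)>>", links to unknown chunks are left untouched, and chunks are assumed not to reference each other in a cycle (a cycle would recurse forever).

```go
package main

import (
	"fmt"
	"regexp"
)

// chunkLink is an assumed pattern for references of the form <<name>>.
var chunkLink = regexp.MustCompile(`<<([^>]+)>>`)

// expandChunks returns a map in which every chunk body is fully expanded.
func expandChunks(chunks map[string]string) map[string]string {
	expandedChunks := make(map[string]string)

	var expand func(name string) string
	expand = func(name string) string {
		if body, ok := expandedChunks[name]; ok {
			return body // already expanded earlier, reuse the result
		}
		body := chunkLink.ReplaceAllStringFunc(chunks[name], func(link string) string {
			linked := chunkLink.FindStringSubmatch(link)[1]
			if _, ok := chunks[linked]; !ok {
				return link // unknown chunk: leave the link as written
			}
			return expand(linked)
		})
		expandedChunks[name] = body
		return body
	}

	for name := range chunks {
		expand(name)
	}
	return expandedChunks
}

func main() {
	chunks := map[string]string{
		"*":     "begin <<inner>> end",
		"inner": "x <<leaf>>",
		"leaf":  "y",
	}
	fmt.Println(expandChunks(chunks)["*"])
}
```

With the cache in place each chunk is expanded once, no matter how many other chunks link to it.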