[Important terminology: roc is short for roxygen comment block, and rocblock for the combination of a roc and the object that's being documented.]

Roxygen design

The roccer is the object in charge of processing a tag. It is made up of two components:

A parser, which converts the raw text string into an intermediate object that other roccers can use and that is eventually turned into output
A rocout object, which takes the intermediate format (after it has potentially been modified by other roccers) and turns it into an output format

Each of these is described in turn below.

Each roccer has a name and a set of dependencies (currently stored in base_prereqs). Before the roccers are called, a topological sort is performed to ensure that they are run in the correct order. This makes sure, for example, that @title is processed before the intro paragraphs, and that @param is processed before @inheritParams.
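The ordering step can be sketched as follows. This is only an illustration under stated assumptions: each roccer is modelled as a list with hypothetical `name` and `prereqs` fields, and `topo_sort` is a made-up helper, not the actual roxygen code (the text above only says that dependencies currently live in base_prereqs).

```r
# Hypothetical roccers: each lists the roccers that must run before it.
roccers <- list(
  list(name = "inheritParams", prereqs = "param"),
  list(name = "intro",         prereqs = "title"),
  list(name = "param",         prereqs = character()),
  list(name = "title",         prereqs = character())
)

# Depth-first topological sort: a roccer is emitted only after all of its
# prerequisites have been emitted. Errors on unknown names and on cycles.
topo_sort <- function(roccers) {
  names(roccers) <- vapply(roccers, function(r) r$name, character(1))
  done <- character()
  visiting <- character()

  visit <- function(name) {
    if (name %in% done) return(invisible())
    if (is.null(roccers[[name]])) stop("Unknown roccer: ", name)
    if (name %in% visiting) stop("Circular dependency involving: ", name)
    visiting <<- c(visiting, name)
    for (dep in roccers[[name]]$prereqs) visit(dep)
    done <<- c(done, name)
    invisible()
  }

  for (name in names(roccers)) visit(name)
  done
}

run_order <- topo_sort(roccers)
```

With this input, "param" always lands before "inheritParams" and "title" before "intro", whatever order the roccers were registered in.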

In the future, roccers may gain some sort of keyword or tag field to make it easier to select subsets of the roccers to run.

Parser

The parser implements one method: parse_rocblocks, which takes a list of rocblocks as input and returns a list of rocblocks as output.

There are currently three types of parser:

null_parser: does nothing (useful when the text is used as is)
roc_parser: modifies only the roc component of the rocblock. It has two arguments, tag and one, both of which are functions: tag is called with just a single tag, and one is called with the entire rocblock (broken down using do.call into roc, obj etc.). Both return lists, which are combined with the original roc using modifyList.
rocblock_parser: has only a single argument, all, which is a function that is called with the list of all rocblocks and should return a list of rocblocks. This means it can modify anything: it can add or delete rocblocks, or modify any component of the rocblock.
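The three parser types might look roughly like this. It is a sketch, not the real implementation: it assumes a rocblock is a list with `roc` and `obj` components, and the extra `tag_name` argument (used to pick which tag the `tag` function sees) is a hypothetical addition for illustration.

```r
# Assumption: a rocblock is list(roc = <named list of tags>, obj = <object>).

# null_parser: returns the rocblocks untouched.
null_parser <- function() {
  list(parse_rocblocks = function(rocblocks) rocblocks)
}

# roc_parser: only the roc component changes. `tag` sees a single tag's
# value, `one` sees the whole rocblock spread into arguments via do.call.
# Whatever they return is merged into the roc with modifyList.
roc_parser <- function(tag_name, tag = NULL, one = NULL) {
  parse_one <- function(rocblock) {
    if (!is.null(tag) && !is.null(rocblock$roc[[tag_name]])) {
      rocblock$roc <- modifyList(rocblock$roc, tag(rocblock$roc[[tag_name]]))
    }
    if (!is.null(one)) {
      rocblock$roc <- modifyList(rocblock$roc, do.call(one, rocblock))
    }
    rocblock
  }
  list(parse_rocblocks = function(rocblocks) lapply(rocblocks, parse_one))
}

# rocblock_parser: `all` receives every rocblock and may add, drop or edit any.
rocblock_parser <- function(all) {
  list(parse_rocblocks = function(rocblocks) all(rocblocks))
}

# Usage: trim the title, and fall back to the object's name if @name is absent.
title_parser <- roc_parser(
  "title",
  tag = function(text) list(title = trimws(text)),
  one = function(roc, obj, ...) {
    if (is.null(roc$name)) list(name = obj$name) else list()
  }
)

blocks <- list(
  list(roc = list(title = "  Add two numbers  "), obj = list(name = "add")),
  list(roc = list(), obj = list(name = "undocumented"))
)
parsed <- title_parser$parse_rocblocks(blocks)

# A global parser that drops rocblocks without a title.
drop_untitled <- rocblock_parser(function(rocblocks) {
  Filter(function(b) !is.null(b$roc$title), rocblocks)
})
kept <- drop_untitled$parse_rocblocks(parsed)
```

All three expose the same parse_rocblocks method, so the caller does not need to know how local or global a given parser is.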

This makes it possible for tags to be very flexible: in roxygen2 they were basically limited to local action, but many tags (like @include, @family, and @inheritParams) need a more global perspective. It also makes it possible to write other specialised parsers that operate on particular types of object and extract more information to add to the roc.

The other advantage of multiple parsers is for caching: the more locally a parser operates the easier it is to cache between subsequent runs. Globally operating tags depend on all roccers, so will generally need to be recomputed every time, but roccers that work with a single tag should only need to be recomputed if that tag changes.
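To illustrate why tag-level parsers are easy to cache, here is a minimal in-memory sketch. `cached_tag_parser` is a hypothetical wrapper, not part of roxygen; caching between runs as described above would additionally need the cache to be persisted somewhere, which this sketch does not attempt.

```r
# Wrap a tag-level function so that it is only recomputed when the tag's
# text changes. The cache is an environment keyed on the raw tag text.
cached_tag_parser <- function(tag) {
  cache <- new.env(parent = emptyenv())
  n_computed <- 0L

  list(
    parse = function(text) {
      key <- paste0("tag:", text)  # prefix avoids an empty-string key
      if (!exists(key, envir = cache, inherits = FALSE)) {
        n_computed <<- n_computed + 1L
        assign(key, tag(text), envir = cache)
      }
      get(key, envir = cache, inherits = FALSE)
    },
    n_computed = function() n_computed
  )
}

# Usage: the second call with identical text is served from the cache.
title <- cached_tag_parser(function(text) list(title = trimws(text)))
first  <- title$parse("  Add two numbers  ")
second <- title$parse("  Add two numbers  ")
third  <- title$parse("Subtract two numbers")
```

A parser that sees all rocblocks has no such simple key, since any change anywhere in the input can affect its result.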

Rocout

The rocout object has three methods:

output_build: this is given a single tag from the rocblock, and should return an output object which describes where and what to write.
output_postproc: this combines multiple tags and does any other postprocessing of the data before the final output. (It is separated out from output_write mainly to make it easier to test.)
output_write: this basically calls format on the output object representing each file and then write_if_different, which provides an informative message.
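The three methods could fit together roughly as below, using an export-style tag as the example. Everything here is an assumption made for illustration: the `base_path` argument, the shape of the output list, and the body of `write_if_different` (the text names that function but does not show it). For brevity the output already holds character lines, so the format step is skipped.

```r
# Hypothetical helper: write only when the content changed, and say so.
write_if_different <- function(path, contents) {
  if (file.exists(path) && identical(readLines(path), contents)) {
    return(invisible(FALSE))
  }
  message("Writing ", basename(path))
  writeLines(contents, path)
  invisible(TRUE)
}

export_out <- list(
  # One tag in, one description of where and what to write out.
  output_build = function(tag, base_path) {
    list(
      path  = file.path(base_path, "NAMESPACE"),
      lines = paste0("export(", tag, ")")
    )
  },
  # Combine all outputs that target the same file; de-duplicate and sort.
  output_postproc = function(outputs) {
    paths <- vapply(outputs, function(o) o$path, character(1))
    lapply(unique(paths), function(p) {
      lines <- unlist(lapply(outputs[paths == p], function(o) o$lines))
      list(path = p, lines = sort(unique(lines)))
    })
  },
  # Write each file, returning whether it actually changed.
  output_write = function(outputs) {
    vapply(outputs, function(o) write_if_different(o$path, o$lines), logical(1))
  }
)

# Usage against a temporary directory.
base <- tempfile("pkg")
dir.create(base)
built  <- lapply(c("foo", "bar", "foo"), export_out$output_build, base_path = base)
merged <- export_out$output_postproc(built)
changed_first  <- export_out$output_write(merged)
changed_second <- export_out$output_write(merged)
```

Because output_postproc is a pure function of the built outputs, it can be tested without touching the file system, which matches the reason given above for separating it from output_write.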

Output objects

Output objects represent the various possible types of output. There are currently 3:

rd commands in an rd file
lines in the NAMESPACE file
fields in the DESCRIPTION file
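One way to model these three output types is as small S3 classes, each with its own format method. The constructor names and fields below are hypothetical; only the three kinds of output come from the list above.

```r
# An Rd command such as \title{...}
rd_command <- function(command, value) {
  structure(list(command = command, value = value), class = "rd_command")
}
format.rd_command <- function(x, ...) {
  paste0("\\", x$command, "{", x$value, "}")
}

# A NAMESPACE directive such as export(foo)
namespace_line <- function(directive, name) {
  structure(list(directive = directive, name = name), class = "namespace_line")
}
format.namespace_line <- function(x, ...) {
  paste0(x$directive, "(", x$name, ")")
}

# A DESCRIPTION field such as Collate: 'a.R' 'b.R'
description_field <- function(field, value) {
  structure(list(field = field, value = value), class = "description_field")
}
format.description_field <- function(x, ...) {
  paste0(x$field, ": ", x$value)
}

# Usage: format() dispatches on the class of each output object.
outputs <- list(
  rd_command("title", "Add two numbers"),
  namespace_line("export", "add"),
  description_field("Collate", "'add.R' 'utils.R'")
)
formatted <- vapply(outputs, format, character(1))
```

Keeping the file-specific syntax inside format methods means the code that builds and combines outputs never has to assemble strings itself.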

hadley/design-principles.md
