Current find & replace features work well for names and single-line patterns that happen to map well to regular expressions. For more complex manipulations, something more comprehensive and purpose-built is needed.
In this article I introduce Codex, a flexible and expressive language for specifying code modifications.
Codex works like traditional find & replace, but with some extensions to make dealing with multiple lines and indentation intuitive, and to allow matching language-specific syntax elements with Tree-sitter queries.
A Codex expression consists of one or more of the following:
- Plain text, which matches itself.
- A newline, which matches one or more newlines, skipping over whitespace-only lines.
- An increase or decrease in indentation, which matches exactly that and is relative to the current context.
- On its own line, `*` or `+`, which matches zero or more lines or one or more lines respectively, followed by an optional capture label (see below), e.g. `* @someLines`.
- A regular expression (in JavaScript literal syntax) followed by an optional capture label, e.g. `/\w+/@functionName`.
- A Tree-sitter query, which matches the text of the corresponding nodes, followed by an optional capture label, e.g. `(function_declaration) @fn`.
- `[` and `]`, which mark the start and end of the text to replace, respectively. Either or both can be omitted, defaulting to the start and end of the match.
A capture label consists of an `@` followed by an alphabetic name for the capture, and makes the associated match available to use in the replacement (see Replacement expressions).
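Two of the rules above are easy to picture as ordinary JavaScript regular expressions. The following is a minimal sketch, not Codex's implementation: `NEWLINE_ELEMENT`, `CAPTURE_LABEL`, and `captureName` are hypothetical names, and "alphabetic" is assumed to mean ASCII letters.

```javascript
// Sketch only: Codex's real implementation is not shown in this article.

// A newline element: one or more newlines, skipping whitespace-only lines.
const NEWLINE_ELEMENT = /\n(?:[ \t]*\n)*/;

// A capture label: "@" followed by an alphabetic name
// (assumption: ASCII letters only).
const CAPTURE_LABEL = /^@([A-Za-z]+)$/;

// Hypothetical helper: return the capture name, or null if the label
// does not have the "@name" shape.
function captureName(label) {
  const match = CAPTURE_LABEL.exec(label);
  return match ? match[1] : null;
}

// The single newline element spans the whole gap between the two words,
// including the whitespace-only line.
const text = "first\n   \n\nsecond";
const gap = NEWLINE_ELEMENT.exec(text)[0];
console.log(JSON.stringify(gap));
console.log(captureName("@someLines"));
```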
- Combining literals, regular expressions, and line quantifiers to match a JavaScript function:

      function /\w+/@name\(/[^)]*/@args) {
          * @body
      }

- Matching one or more JavaScript functions with a Tree-sitter query:

      (function_declaration)+ @fns
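For comparison, here is roughly what the first expression asks for when written by hand as a single JavaScript regular expression. This is an approximation for illustration, not how Codex works internally: `functionPattern` and `NEWLINE` are hypothetical names, and the indentation handling is simplified to "any leading whitespace".

```javascript
// Hand-written approximation only; this is not how Codex is implemented.
// Codex expression being approximated:
//   function /\w+/@name\(/[^)]*/@args) {
//       * @body
//   }

// A Codex newline: one or more newlines, skipping whitespace-only lines.
const NEWLINE = String.raw`\n(?:[ \t]*\n)*`;

const functionPattern = new RegExp(
  String.raw`function (?<name>\w+)\((?<args>[^)]*)\) \{` +
    NEWLINE +
    // Zero or more indented lines, standing in for "* @body".
    // Assumption: any leading whitespace counts as increased indentation,
    // and the body contains no blank lines.
    String.raw`(?<body>(?:[ \t]+.*\n)*)` +
    String.raw`\}`
);

const source = [
  "function add(a, b) {",
  "    const sum = a + b;",
  "    return sum;",
  "}",
].join("\n");

const match = functionPattern.exec(source);
console.log(match.groups.name, "|", match.groups.args);
console.log(match.groups.body);
```

The regular expression has to spell out newline and indentation handling explicitly, which is the bookkeeping the Codex form leaves implicit.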
The following characters must be escaped with a backslash in literals:
- `\`, `/`, `[`, `]`, and `(`.
- `@` if preceded by a regular expression or Tree-sitter query.
- `*` and `+` if at the start of a line.
- `*`, `+`, and `?` if preceded by a Tree-sitter query as in the example above.
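The context-free part of these escaping rules can be sketched as a small helper that prepares arbitrary text for use as a literal. `escapeLiteral` is a hypothetical function, not part of Codex, and it assumes that "start of a line" means the first non-whitespace character. The rules that depend on a preceding regular expression or Tree-sitter query cannot be decided from the literal alone, so they are left out.

```javascript
// Hypothetical helper, not part of Codex: escape plain text so that it can
// be used as a literal in a find expression.
function escapeLiteral(text) {
  return text
    .split("\n")
    .map((line) =>
      line
        // Always escaped: backslash, slash, square brackets, opening parenthesis.
        .replace(/[\\\/\[\]\(]/g, "\\$&")
        // Escaped only at the start of a line: * and +.
        // Assumption: leading indentation does not count as content.
        .replace(/^([ \t]*)([*+])/, "$1\\$2")
    )
    .join("\n");
}

console.log(escapeLiteral("a[0] / f(x)"));
console.log(escapeLiteral("* item"));
console.log(escapeLiteral("a * b"));
```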
Captured nodes within Tree-sitter queries are available in the replacement. A capture name can be prefixed with a dash (e.g. `@-name`) to delete that node from the result, i.e. it will not be present when a surrounding capture is inserted into the replacement.
Deleted nodes can still be used elsewhere in the replacement without the prefix, e.g. `@name`.
A replacement expression consists of one or more of the following:
- Plain text, which produces itself.
- A newline, which produces a newline and preserves the current indentation.
- An increase or decrease in indentation, which indents or dedents relative to the current context.
- A capture reference, e.g. `@captureName`, which produces the corresponding regular expression match, lines, or syntax nodes. Multi-line captures are re-indented to the current context.
To insert a literal `@` in the replacement, two `@`s are used (`@@`).
When performing the replacement (including when removing nodes prefixed with `@-`), blank lines are inserted or deleted according to context and preferences.
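The capture-reference rules can be pictured with a small template renderer. This is a sketch under stated assumptions rather than Codex's actual behaviour: `render` is a hypothetical function, captures are assumed to be plain strings stored without their original indentation, and blank-line handling is ignored.

```javascript
// Hypothetical renderer, not Codex's implementation.
// Assumption: each capture is a string stored without its original indentation.
function render(template, captures) {
  return template
    .split("\n")
    .map((line) => {
      // Indentation of the template line that contains the reference.
      const indent = /^[ \t]*/.exec(line)[0];
      return line.replace(/@@|@([A-Za-z]+)/g, (token, name) => {
        if (token === "@@") return "@"; // "@@" produces a literal "@"
        if (!Object.prototype.hasOwnProperty.call(captures, name)) {
          throw new Error(`Unknown capture: ${name}`);
        }
        // Re-indent the continuation lines of a multi-line capture so that
        // they line up with the first line.
        return captures[name].split("\n").join("\n" + indent);
      });
    })
    .join("\n");
}

const template = [
  "function @name() {",
  "    @body",
  "}",
  "// contact: dev@@example.com",
].join("\n");

const output = render(template, {
  name: "init",
  body: "setup();\nstart();",
});
console.log(output);
```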
- Converting a JavaScript module that exports an object with an `init` method to a function that performs the body of the `init` method and returns the original object with the `init` method removed.

  Find expression:

      module.exports = (object (method_definition (property_identifier) @p (statement_block "{" (_)+ @initBody "}") ) @-init . "," @-c (#eq? @p "init") ) @obj/;?/

  Replacement expression:

      module.exports = function() {
          @initBody
          return @obj;
      }
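To make the intent of this conversion concrete, here is a hand-written before and after. The module contents (`log`, `greet`) are invented for illustration, and the "after" form shows the intended effect only; the exact whitespace Codex would emit is not shown in this article. Plain constants stand in for `module.exports` so that both forms can sit in one file.

```javascript
const log = [];

// Before (assumed example input): the module exports an object with init.
const before = {
  init() {
    log.push("initialised");
  },
  greet() {
    return "hello";
  },
};

// After (hand-written intended result): calling the function runs the old
// init body and returns the original object without the init method.
const after = function () {
  log.push("initialised");
  return {
    greet() {
      return "hello";
    },
  };
};

const instance = after();
console.log(log);
console.log("init" in before, "init" in instance);
console.log(instance.greet());
```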