Summary of meeting between Tidyverse members and Luke Tierney at useR! 2024.
Frontends and low level tools need to know what kind of bindings they are dealing with. Objectives include:
- Avoiding side effects such as triggering a promise or causing a missing argument error. Low level tools often can't afford to protect against those for every variable lookup. Figuring out what happened by inspecting errors is also ambiguous, and sometimes impossible (promises may cause longjumps in a variety of ways).
- Transparency in debugging/development settings. Providing context to the user about what's going to happen if they attempt to retrieve the value of a binding (e.g. an active binding invocation, a promise forcing leading to the evaluation of such and such expression, etc).
- Completeness of the API to inspect and manipulate bindings. It should be possible to write an environment cloner using these tools: iterate over bindings and retrieve the type; given the type, retrieve the components (prexpr, prenv, active binding function, etc); given the components, create a duplicate binding in the new environment.
- tidyeval (the NSE framework for the tidyverse) needs to obtain both the expression and the original frame environment of substituted dots.
Existing API:
Rboolean R_existsVarInFrame(SEXP env, SEXP sym); // Unfortunate inconsistency in param order
Rboolean R_BindingIsActive(SEXP sym, SEXP env);
New API:
typedef enum {
    R_BindingTypeUnbound = 0, /* Unbound in this environment */
    R_BindingTypeValue = 1,   /* Direct value binding */
    R_BindingTypeMissing = 2, /* Missing argument */
    R_BindingTypeDelayed = 3, /* Delayed promise */
    R_BindingTypeForced = 4,  /* Forced promise */
    R_BindingTypeActive = 5,  /* Active binding */
} R_BindingType;
R_BindingType R_GetBindingType(SEXP sym, SEXP env);
Existing:
SEXP R_ActiveBindingFunction(SEXP sym, SEXP env);
New:
SEXP R_DelayedBindingExpression(SEXP sym, SEXP env);
SEXP R_DelayedBindingEnvironment(SEXP sym, SEXP env);
SEXP R_ForcedBindingExpression(SEXP sym, SEXP env);
Existing:
void R_MakeActiveBinding(SEXP sym, SEXP fun, SEXP env);
void Rf_setVar(SEXP sym, SEXP value, SEXP env); // Value
void R_removeVarFromFrame(SEXP sym, SEXP env); // Unbound
New:
void R_MakeDelayedBinding(SEXP sym, SEXP expr, SEXP evalEnv, SEXP env);
void R_MakeForcedBinding(SEXP sym, SEXP expr, SEXP value, SEXP env);
void R_MakeMissingBinding(SEXP sym, SEXP env);
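The environment cloner mentioned in the objectives can be sketched as a toy model. This is not R code and does not use the R API: the `Toy*` types and `toy_*` functions are hypothetical stand-ins, with plain strings where R would use `SEXP`s. It only illustrates the intended flow: look up the type without side effects, read the components that match that type, and recreate the binding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the proposed API; in R every field would be a SEXP. */
typedef enum {
    ToyUnbound = 0, /* Unbound in this environment */
    ToyValue,       /* Direct value binding */
    ToyMissing,     /* Missing argument */
    ToyDelayed,     /* Delayed promise */
    ToyForced,      /* Forced promise */
    ToyActive       /* Active binding */
} ToyBindingType;

typedef struct {
    const char *sym;
    ToyBindingType type;
    const char *expr;     /* Delayed and forced: promise expression */
    const char *eval_env; /* Delayed only: environment the promise evaluates in */
    const char *value;    /* Value and forced: the bound value */
    const char *fun;      /* Active only: binding function */
} ToyBinding;

enum { TOY_ENV_CAPACITY = 16 };

typedef struct {
    ToyBinding bindings[TOY_ENV_CAPACITY];
    size_t n;
} ToyEnv;

/* Append a binding; stands in for the family of R_Make*Binding() constructors. */
void toy_bind(ToyEnv *env, ToyBinding b) {
    assert(env->n < TOY_ENV_CAPACITY);
    env->bindings[env->n++] = b;
}

/* Analogue of R_GetBindingType(sym, env): a pure lookup, nothing is evaluated. */
ToyBindingType toy_get_binding_type(const char *sym, const ToyEnv *env) {
    for (size_t i = 0; i < env->n; i++) {
        if (strcmp(env->bindings[i].sym, sym) == 0) return env->bindings[i].type;
    }
    return ToyUnbound;
}

/* Clone by dispatching on the type and copying only the components that the
   matching accessor would expose. Returns the number of bindings recreated. */
size_t toy_clone_env(const ToyEnv *src, ToyEnv *dst) {
    size_t copied = 0;
    for (size_t i = 0; i < src->n; i++) {
        const ToyBinding *b = &src->bindings[i];
        ToyBinding out = { b->sym, b->type, NULL, NULL, NULL, NULL };
        switch (b->type) {
        case ToyUnbound: continue; /* Listed for completeness; nothing to recreate */
        case ToyValue:   out.value = b->value; break;
        case ToyMissing: break;
        case ToyDelayed: out.expr = b->expr; out.eval_env = b->eval_env; break;
        case ToyForced:  out.expr = b->expr; out.value = b->value; break;
        case ToyActive:  out.fun = b->fun; break;
        }
        toy_bind(dst, out);
        copied++;
    }
    return copied;
}
```

Each `case` corresponds to one accessor/constructor pair in the lists above, which is one way to check that the API is complete: if a binding type had no case that could be filled in, the cloner could not be written.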
We need a way to create forced promises that work with `substitute()`. This could be achieved by passing a `NULL` environment or by splitting the constructor into two variants.
Edit: We've decided against this and went for the explicit predicates and accessors.
If we use a `NULL` environment as an indicator for forced promises, we can simplify the API by sharing the type, accessors, and constructor:
typedef enum {
    R_BindingTypeUnbound = 0, /* Unbound in this environment */
    R_BindingTypeValue = 1,   /* Direct value binding */
    R_BindingTypeMissing = 2, /* Missing argument */
    R_BindingTypePromise = 3, /* Delayed or forced promise */
    R_BindingTypeActive = 4,  /* Active binding */
} R_BindingType;
SEXP R_PromiseBindingExpression(SEXP sym, SEXP env);
SEXP R_PromiseBindingEnvironment(SEXP sym, SEXP env);
void R_MakePromiseBinding(SEXP sym, SEXP promiseExpr, SEXP promiseEnv, SEXP env);
Edit: We've discussed another API for this with Luke at the R dev day, see Davis' comment below.
Useful to do at C level for two things:
- Fast dots checkers, i.e. https://rlang.r-lib.org/reference/check_dots_unnamed.html and https://rlang.r-lib.org/reference/check_dots_used.html
- Capturing environments of unforced arguments passed through multiple levels of dots. Necessary for hygienic evaluations of captured expressions.
typedef enum {
    R_DotsBindingTypeValue = 0,   /* Direct value binding */
    R_DotsBindingTypePromise = 1, /* Delayed or forced promise */
} R_DotsBindingType;
typedef struct {
    R_DotsBindingType type;
    SEXP name;
} R_DotsIteratorItem;
/* Returns a private LISTSXP containing: the iterator state as a RAWSXP in the
CAR, a protecting container in the CDR (for extra safety we might want to
protect the current binding), and a type identifier in the TAG (for runtime
error checking). The caller must protect this object and consider it opaque.
The behaviour in case `env` does not contain a DOTSEXP could be an error
(check the binding type for `...` beforehand) or an empty iterator. */
SEXP R_MakeDotsIterator(SEXP env);
/* Returns true if advanced, in which case `item` is safely readable. */
Rboolean R_DotsNext(SEXP dotsIterator, R_DotsIteratorItem *item);
SEXP R_DotsPromiseBindingExpression(SEXP dotsIterator);
SEXP R_DotsPromiseBindingEnvironment(SEXP dotsIterator);
SEXP R_DotsValueBinding(SEXP dotsIterator);
SEXP iter = PROTECT(R_MakeDotsIterator(env)); /* The caller must protect the iterator */
R_DotsIteratorItem item;
while (R_DotsNext(iter, &item)) {
    switch (item.type) {
    case R_DotsBindingTypeValue: Rf_PrintValue(R_DotsValueBinding(iter)); break;
    case R_DotsBindingTypePromise: Rf_PrintValue(R_DotsPromiseBindingExpression(iter)); break;
    }
}
UNPROTECT(1);
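To make the iterator protocol and the "fast dots checker" use case concrete, here is a self-contained toy model. It does not use the R API: `ToyDotsItem`, `ToyDotsIterator`, `toy_dots_next()` and `toy_dots_all_unnamed()` are hypothetical names, and names are plain C strings rather than `SEXP`s. The sketch assumes the same contract as the proposed `R_DotsNext()`: the call returns true if it advanced, in which case the item is readable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    ToyDotsValue = 0,  /* Direct value binding */
    ToyDotsPromise = 1 /* Delayed or forced promise */
} ToyDotsType;

typedef struct {
    ToyDotsType type;
    const char *name; /* NULL or "" when the argument is unnamed */
} ToyDotsItem;

typedef struct {
    const ToyDotsItem *items;
    size_t n;
    size_t pos; /* Index of the next element to visit */
} ToyDotsIterator;

/* Same contract as the proposed R_DotsNext(): returns true if advanced,
   in which case `item` is safely readable. */
bool toy_dots_next(ToyDotsIterator *it, ToyDotsItem *item) {
    assert(it != NULL && item != NULL);
    if (it->pos >= it->n) return false;
    *item = it->items[it->pos++];
    return true;
}

/* A check_dots_unnamed()-style predicate. It reads names only, so no promise
   is forced. The iterator is taken by value, so the caller's copy is not advanced. */
bool toy_dots_all_unnamed(ToyDotsIterator it) {
    ToyDotsItem item;
    while (toy_dots_next(&it, &item)) {
        if (item.name != NULL && item.name[0] != '\0') return false;
    }
    return true;
}
```

The point of the design is that a checker like this only needs the item's type and name, so it can run over `...` without evaluating any argument.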
Currently our main concern is to avoid materialising row names. In the future, `getAttrib()` should return an altrep string sequence for automatic row names. In the meantime, if an object already has altrep row names, `getAttrib()` should not materialise them, which currently happens via `INTEGER()`.
It might be useful to have a way of getting and setting a list of attributes, but we'll first try to manage without that.
Discussed the "Iterating over dots" section at r-dev-day on Aug 11, 2025 at useR! in Durham, NC with Luke. We settled on a simpler scheme that:
`...names()` and `...length()`
To implement these we should tap into and refactor the existing dots tooling here:
https://github.com/wch/r-source/blob/503d9e0e8af0b394fb483fa604310ed077ff73b9/src/main/envir.c#L1426-L1507