Guaranteed copy elision for named return values

This proposal aims to provide guaranteed copy elision for common cases of local variables being returned from a function.

Motivation

The accepted P0135 proposal already provides guaranteed copy elision when a prvalue is returned from a function, by stating that the result object of that prvalue (and not a temporary) is directly copy-initialized. It de facto mandates what was known as Return Value Optimization (RVO) and allows non-movable objects to be returned in this way.
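As a small illustration of what P0135 already guarantees, the sketch below returns a type with deleted copy operations by value. The names pinned, make_pinned and make_pinned_twice are hypothetical and not part of either proposal; the sketch assumes a C++17 compiler.

```cpp
#include <cassert>

// Hypothetical non-copyable, non-movable type.
struct pinned {
  explicit pinned(int v) : value(v) {}
  pinned(const pinned&) = delete;             // also suppresses the implicit move constructor
  pinned& operator=(const pinned&) = delete;
  int value;
};

// The operand is a prvalue, so the result object is initialized directly.
pinned make_pinned(int v) { return pinned(v); }

// A call that returns by value is itself a prvalue, so the guarantee composes.
pinned make_pinned_twice(int v) { return make_pinned(v * 2); }

void example_usage() {
  pinned p = make_pinned_twice(21);  // no copy or move constructor is required
  assert(p.value == 42);
}
```

Replacing either return operand with the name of a local variable would make the function ill-formed today, which is the gap this proposal addresses.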

Meanwhile, other cases of copy elision are still optional. For example, sometimes we want to create an object, set it up and return it.

widget setup_widget(int x) {
  int y = process(x);
  auto w = widget(x);
  w.set_y(y);
  return w;
}

setup_widget will copy, or at least move, w out. Compilers often perform Named Return Value Optimization (NRVO) in such cases, but it is not guaranteed. We cannot take a pointer to w in setup_widget and store it somewhere, because if the compiler chooses not to apply NRVO, that pointer will become dangling as soon as setup_widget returns. And if widget is neither copyable nor movable, the code is ill-formed.
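The difference between the guaranteed and the optional case can be observed by recording the address at which an object was constructed. The following is a minimal sketch, not part of the proposal: last_constructed, move_count, address_of and the make_* functions are hypothetical helpers, and the addresses are stored as integers so that a possibly stale address is never used as a pointer.

```cpp
#include <cassert>
#include <cstdint>

// Address of the most recently constructed (not moved-to) widget.
std::uintptr_t last_constructed = 0;
int move_count = 0;

struct widget {
  explicit widget(int v) : value(v) {
    last_constructed = reinterpret_cast<std::uintptr_t>(this);
  }
  widget(widget&& other) noexcept : value(other.value) { ++move_count; }
  int value;
};

std::uintptr_t address_of(const widget& w) {
  return reinterpret_cast<std::uintptr_t>(&w);
}

widget make_prvalue(int x) { return widget(x); }  // elision guaranteed by P0135

widget make_named(int x) {
  widget w(x);
  return w;  // NRVO is permitted here, but not required
}

bool prvalue_built_in_place() {
  widget w = make_prvalue(1);
  return last_constructed == address_of(w);
}

bool named_built_in_place() {
  widget w = make_named(1);
  return last_constructed == address_of(w);
}

void example_usage() {
  assert(prvalue_built_in_place());  // holds on any conforming C++17 compiler
  move_count = 0;
  bool in_place = named_built_in_place();
  // Either NRVO happened (same address, no move) or it did not (one move).
  assert(in_place == (move_count == 0));
}
```

Whether named_built_in_place returns true depends on the compiler and its settings, so code that stores the address of w inside make_named cannot rely on it today.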

In practice, the workaround is one of the following:

Two-stage initialization, where a local variable is constructed in its destination (e.g. using a default constructor) and is then immediately passed to one or more functions by reference in order to complete the setup of the object
Always storing the object on the heap, e.g. by returning std::unique_ptr<widget> instead of widget from factory functions

Both "solutions" are often viewed as anti-patterns.
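For concreteness, the two workarounds might look as follows. This is only a sketch: the widget members, the function names setup_widget_in_place and setup_widget_on_heap, and the body of process are assumptions made for illustration.

```cpp
#include <cassert>
#include <memory>

struct widget {
  widget() = default;  // exists only to enable two-stage initialization
  explicit widget(int x) : x_(x) {}
  widget(const widget&) = delete;
  widget& operator=(const widget&) = delete;

  void set_x(int x) { x_ = x; }
  void set_y(int y) { y_ = y; }
  int x() const { return x_; }
  int y() const { return y_; }

private:
  int x_ = 0;
  int y_ = 0;
};

int process(int x) { return x * 2; }  // hypothetical stand-in for real work

// Workaround 1: the caller default-constructs, the function finishes the setup.
void setup_widget_in_place(widget& w, int x) {
  w.set_x(x);
  w.set_y(process(x));
}

// Workaround 2: the object lives on the heap and only the pointer is moved.
std::unique_ptr<widget> setup_widget_on_heap(int x) {
  auto w = std::make_unique<widget>(x);
  w->set_y(process(x));
  return w;
}

void example_usage() {
  widget a;  // observable in a not-yet-set-up state
  setup_widget_in_place(a, 3);
  assert(a.x() == 3 && a.y() == 6);

  auto b = setup_widget_on_heap(3);  // costs an allocation and an indirection
  assert(b->x() == 3 && b->y() == 6);
}
```

The first form forces widget to have a "not yet set up" state, and the second pays for a heap allocation that the by-value interface would not need.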

Proposed solution

Informally

Copy elision for a local variable x of automatic storage duration is required when the only return statements between its point of declaration and the end of its scope are of the form return x;.

Formally

It is said that a return statement returns a variable when its operand is a (possibly parenthesized) id-expression that names the variable.

A variable with automatic storage duration is called a named return value when all of the following conditions are satisfied:

the variable names a non-volatile complete object (other than a function parameter or a catch clause parameter) with the same type (ignoring cv-qualification) as the function return type; [ Note: Either the variable is of a non-reference type or it is a reference participating in lifetime extension. — end note ]
all the return statements in its potential scope, of which there is at least one, return the variable. [ Note: this implies that the enclosing function cannot be a coroutine. — end note ]

The object denoted by a named return value is constructed directly into the function call's return object. Statements that return a named return value perform no copy-initialization and do not lead to destruction of the object.
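The sketch below applies the conditions above to a few return forms. The classification in the comments reflects one reading of the proposed wording rather than the behavior of any existing compiler, and all function names are hypothetical. The code is valid C++ today because this widget is copyable and movable.

```cpp
#include <cassert>
#include <utility>

struct widget { int id; };

// Operand is an id-expression naming w: w would be a named return value.
widget returns_variable() { widget w{1}; return w; }

// A parenthesized id-expression still "returns the variable".
widget returns_parenthesized() { widget w{2}; return (w); }

// The operand is a call expression, not an id-expression: no guarantee.
widget returns_moved() { widget w{3}; return std::move(w); }

// Function parameters are explicitly excluded.
widget returns_parameter(widget w) { return w; }

// Static storage duration: not an automatic variable, so excluded.
widget returns_static() { static widget w{5}; return w; }

// One return statement in w's potential scope does not return w: excluded.
widget returns_mixed(bool flag) {
  widget w{6};
  if (flag) return w;
  return widget{7};
}

void example_usage() {
  assert(returns_variable().id == 1);
  assert(returns_parenthesized().id == 2);
  assert(returns_moved().id == 3);
  assert(returns_parameter(widget{4}).id == 4);
  assert(returns_static().id == 5);
  assert(returns_mixed(true).id == 6);
  assert(returns_mixed(false).id == 7);
}
```

Under the proposal, only the first two functions would be required to construct w directly in the return object; the others would keep today's semantics.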

Examples

Constructing, "cooking" and returning a non-copyable, non-movable widget:

widget setup_widget(int x) {
  int y = process(x);
  auto w = widget(x);
  w.set_y(y);
  return w;
}

A more contrived example where guaranteed copy elision applies:

widget setup_widget() {
  while (true) {
    auto w = widget(1);
    if (…) return w;
    if (…) break;
    if (…) throw …;
    if (…) return w;
  }
  return widget(2);
}

An example where guaranteed copy elision does not apply:

widget setup_widget() {
  auto w = widget(1);
  if (…) {
    return w;
  } else {
    return widget(2);
  }
}

The example above can be "fixed" so that guaranteed copy elision does apply:

widget setup_widget() {
  if (…) {
    auto w = widget(1);
    return w;
  } else {
    return widget(2);
  }
}

Constructing, setting up and passing an object as a parameter using an immediately invoked lambda expression (consume_widget's parameter is directly initialized):

void consume_widget(widget);

void test(int x) {
  int y = process(x);
  consume_widget([&] {
    auto w = widget(x);
    w.set_y(y);
    return w;
  }());
}

Q&A

Is the proposed change source or ABI breaking?

The proposal is not source-breaking, because it only mandates copy elision in some of the cases where it is currently optional.

The proposal is not ABI-breaking, because, in all known implementations, whether NRVO is applied for a function does not impact its calling convention.
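To see why the calling convention is unaffected, it helps to model how a by-value return of a non-trivial class is lowered on ABIs such as the Itanium C++ ABI, where the caller passes the address of the result storage. The sketch below is a hand-written model under that assumption, not actual compiler output; result_slot, lowered_fn and the function names are hypothetical.

```cpp
#include <cassert>
#include <new>
#include <utility>

int move_count = 0;

struct widget {
  explicit widget(int x) : x_(x) {}
  widget(widget&& o) noexcept : x_(o.x_), y_(o.y_) { ++move_count; }
  void set_y(int y) { y_ = y; }
  int x_;
  int y_ = 0;
};

// Without NRVO: w lives in the callee's frame and is moved into the slot.
void setup_widget_without_nrvo(void* result_slot, int x) {
  widget w(x);
  w.set_y(x + 1);
  ::new (result_slot) widget(std::move(w));
}

// With NRVO: w is constructed directly in the caller-provided slot.
void setup_widget_with_nrvo(void* result_slot, int x) {
  widget* w = ::new (result_slot) widget(x);
  w->set_y(x + 1);
}

// Both lowerings share one signature, so the caller cannot tell them apart.
using lowered_fn = void (*)(void*, int);

int call_and_read_y(lowered_fn f, int x) {
  alignas(widget) unsigned char slot[sizeof(widget)];
  f(slot, x);
  widget* w = std::launder(reinterpret_cast<widget*>(slot));
  int y = w->y_;
  w->~widget();  // the caller owns the result object
  return y;
}

void example_usage() {
  move_count = 0;
  assert(call_and_read_y(setup_widget_with_nrvo, 4) == 5);
  assert(move_count == 0);
  assert(call_and_read_y(setup_widget_without_nrvo, 4) == 5);
  assert(move_count == 1);
}
```

The only difference is inside the callee body, which is why requiring the second form changes no function signatures or caller code.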

What is the implementation cost of this feature and the impact on compilation time?

The proposal will make declarations of local variables with automatic storage duration context-dependent: the storage of a variable will depend on the statements and expressions in its potential scope. However, this analysis is simple and purely syntactic.

Whether a variable is a named return value only affects the storage of that variable, so determining it can be postponed to later stages of semantic analysis, when a tree of statements and scopes is already built. The implementation cost and the impact on compilation speed are thus deemed to be minimal.

Compilers that already do NRVO will enable it (or at least the required part of it) in all compilation modes. The proposal might even have a positive impact on compilation time, because such implementations will not have to check whether copy-initialization on the return type can be performed.

Can we implement similar functionality in C++ today?

In some cases, yes, with cooperation from the returned object type.

Suppose the widget class defines the following constructor, among others:

template <typename... Args, std::invocable<widget&> Func>
widget(Args&&... args, Func&& func)
  : widget(std::forward<Args>(args)...)
  { std::invoke(std::forward<Func>(func), *this); }

We can then use it to observe the result object of a prvalue through a reference before returning it:

widget setup_widget(int x) {
  int y = process(x);

  return widget(x, [&](widget& w) {
    w.set_y(y);
  });
}

However, it requires cooperation from widget and breaks when some of its other constructors accept an invocable parameter. We cannot implement this functionality in general.
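A function parameter pack that is not last in the parameter list is a non-deduced context, so one way to make this idea compile is to pass the callable before the forwarded arguments. The sketch below does that. It is a variant written for illustration, not the constructor from the text: the with_setup tag, the argument order and the body of process are assumptions, and C++17 type traits are used in place of the std::invocable concept.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>
#include <utility>

// Hypothetical tag that keeps this constructor apart from the others.
struct with_setup_t { explicit with_setup_t() = default; };
constexpr with_setup_t with_setup{};

class widget {
public:
  explicit widget(int x) : x_(x) {}

  // Delegates to an ordinary constructor, then lets func observe *this.
  template <typename Func, typename... Args,
            typename = std::enable_if_t<std::is_invocable_v<Func, widget&>>>
  widget(with_setup_t, Func&& func, Args&&... args)
      : widget(std::forward<Args>(args)...) {
    std::invoke(std::forward<Func>(func), *this);
  }

  widget(const widget&) = delete;
  widget& operator=(const widget&) = delete;

  void set_y(int y) { y_ = y; }
  int x() const { return x_; }
  int y() const { return y_; }

private:
  int x_;
  int y_ = 0;
};

static_assert(!std::is_move_constructible_v<widget>, "widget is non-movable");

int process(int x) { return x * 2; }  // hypothetical stand-in for real work

widget setup_widget(int x) {
  int y = process(x);
  // The operand is a prvalue, so P0135 guarantees in-place construction.
  return widget(with_setup, [&](widget& w) { w.set_y(y); }, x);
}

void example_usage() {
  widget w = setup_widget(3);
  assert(w.x() == 3 && w.y() == 6);
}
```

The tag removes the ambiguity with other constructors that accept an invocable, but the approach still depends on widget providing such a constructor.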

Can we make other copy elision cases mandatory?

This proposal covers the most common case where copy elision allowed by class.copy.elision (1.1) is feasible to require.

class.copy.elision (1.2) leads to an extra allocation in case the control flow escapes the scope before the throw-expression is executed. Requiring it would thus only be possible in highly specific cases and is generally infeasible.

class.copy.elision (1.3) seems to make the declarations context-dependent in a way that would require deeper, semantic analysis of the context. Requiring copy elision in this case seems infeasible.

class.copy.elision (1.4) covers two cases:

When the throw-expression and the try-block reside in different functions. In this case requiring copy elision is infeasible.
When they reside in the same function. Such code can be refactored so that this copy elision is unnecessary, unlike (1.1). This use case, if found beneficial enough, can be tackled in a separate proposal.

Requiring copy elision in more cases than currently allowed would be backwards incompatible. P0889 tries to find more cases in which copy elision can be allowed — if accepted, more useful cases of guaranteed copy elision could be found.

What about the relocation proposals?

This proposal is related to N4158 and P1144 in that they, too, can guarantee in more cases than today that moves won't occur. However, those proposals instead suggest that relocation occurs (in N4158, a relocation constructor will be called). This proposal suggests that not even relocation will happen, which is an improvement over them in the cases where copy elision becomes guaranteed.

What about the lazy parameters proposal?

P0927 also "guarantees copy elision". That proposal requires that the lazy parameter is only used once (to forward it to another lazy function or to its final destination), while in some cases it may be desirable to acquire and use it for some time before forwarding. This proposal would allow doing this in a clean way.
