@Morwenn
Last active July 20, 2019 13:48
State of C arrays in C++

Fixing C arrays in the language and making them the vocabulary type they should always have been.

TODO: more motivation

TODO: mention that fixing std::array is also a noble goal, but not one that this proposal pursues, that said it's also worth noting that fixes applied to C arrays might also apply to std::array

TODO: make it only about stack arrays, somewhat disregard dynamic ones

Motivation: C arrays vs. std::array

The general wisdom since std::array was introduced in C++11 is that C arrays are comparatively unsafe and that std::array is strictly superior and should be used whenever possible. We propose an overview of the differences between C arrays and std::array, exposing the advantages and shortcomings of both, illustrating why someone might want to use one over the other.

Advantages of C arrays

Compared to std::array:

  • no templates to instantiate
  • better error messages (TODO)
  • better vectorization (see Annex B)
  • syntactic sugar, including multidimensional arrays
  • automatic deduction of the size of the first dimension from the initializer
  • arrays of unknown bounds
  • non-array variables can be converted to arrays of size 1 (CWG1596)
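A few of the points above can be illustrated with a minimal sketch. The names grid, grid_std, table, first_of_table and sum_grid are hypothetical and only serve the example; the std::array version is shown for contrast.

```cpp
#include <array>
#include <cassert>
#include <type_traits>

// C array: the bound of the first dimension is deduced from the initializer.
int grid[][3] = {
    {1, 2, 3},
    {4, 5, 6},
};
static_assert(std::is_same<decltype(grid), int[2][3]>::value,
              "the first bound is deduced as 2");

// Equivalent std::array: nested templates, bounds listed in reverse order,
// every bound spelled out, and an extra level of braces per array.
std::array<std::array<int, 3>, 2> grid_std = {{
    {{1, 2, 3}},
    {{4, 5, 6}},
}};

// Array of unknown bound: the type is incomplete at this point and is
// completed by the definition further down.
extern int table[];
int first_of_table() { return table[0]; }
int table[] = {7, 8, 9};

// Range-based for loops work on each dimension of a C array.
int sum_grid() {
    int total = 0;
    for (const auto& row : grid)
        for (int value : row)
            total += value;
    return total;
}
```

Both spellings describe the same 2x3 layout, so grid[1][2] and grid_std[1][2] name the same logical element.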

Shortcomings of C arrays

Compared to std::array:

  • decays to pointers
  • compares pointers
  • no lexicographical comparisons
  • can't be returned from functions -> annoying with traits taking function types
  • can't have zero-sized arrays
  • no copy semantics
  • could/should be a regular type
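The first two shortcomings can be observed directly in today's language. The following is a small sketch; first, second, takes_array and same_address are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <type_traits>

int first[3] = {1, 2, 3};
int second[3] = {1, 2, 3};

// An array function parameter is adjusted to a pointer: the bound 8 is
// not part of the function type.
void takes_array(int values[8]);
static_assert(std::is_same<decltype(takes_array), void(int*)>::value,
              "array parameters are adjusted to pointers");

// Decay turns an array type into a pointer to its first element.
static_assert(std::is_same<std::decay<int[3]>::type, int*>::value,
              "int[3] decays to int*");

// Writes out by hand what `lhs == rhs` does on two arrays: both operands
// decay, so the addresses are compared, not the elements.
bool same_address(const int (&lhs)[3], const int (&rhs)[3]) {
    return &lhs[0] == &rhs[0];
}
```

With these definitions, same_address(first, second) is false even though both arrays hold the same elements, because they are distinct objects.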

C arrays are worth fixing

TODO: motivate that despite their shortcomings they have value and could be the powerful vocabulary type they should have always been

TODO: mention mdspan for multi-dimensional arrays

Fixing C arrays in C++

Although arrays are often considered obsolete, or at least types that shouldn't be used, they have received some love and several attempts at fixing them over the years:

  • Functions can take references and pointers to arrays of unknown bounds (C++14)
  • Overload resolution was improved to overload on array size with list initialization (C++14)
  • Qualification conversions in arrays of pointers were improved (C++17)
  • operator<=> can't compare C arrays (C++20)
  • Comparisons between C arrays are now deprecated (C++20)
  • Array size can now be deduced in new expressions (C++20)
  • Arrays can now be constructed with parentheses (C++20)
  • Pointers and references to array of unknown bound can now bind to an array of known bound (P0388)
    void f(int(&)[]);
    int arr[1];
    f(arr);          // Error today, works with P0388
    int(&r)[] = arr; // Error today, works with P0388
  • Overload resolution rules for arrays of known bound with list initialization were improved (P0388)
  • Countless CWG and LWG issues: CWG337, CWG619, LWG2061, LWG2118, LWG2280, LWG3005, etc.
  • The standard library also regularly adds/improves support for C arrays:
    • std::swap (C++11, C++17, C++20, LWG2554)
    • std::[c][r]begin, std::[c][r]end (C++11, C++14)
    • std::[s]size, std::data, std::empty (C++17, C++20)
    • std::unique_ptr, std::shared_ptr and factory functions (C++11, C++14, C++17, C++20)
    • std::is_bounded_array, std::is_unbounded_array (C++20)
  • TODO: check ranges:: support for C arrays
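As an illustration of the library support listed above, here is a minimal sketch restricted to facilities available since C++11. sum_all and swap_arrays are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <type_traits>
#include <utility>

// std::begin and std::end accept C arrays, so standard algorithms apply.
template<typename T, std::size_t N>
T sum_all(const T (&values)[N]) {
    return std::accumulate(std::begin(values), std::end(values), T{});
}

// std::swap has an overload for arrays that swaps element by element.
template<typename T, std::size_t N>
void swap_arrays(T (&lhs)[N], T (&rhs)[N]) {
    std::swap(lhs, rhs);
}

// Type traits can query array bounds at compile time.
static_assert(std::extent<int[4]>::value == 4, "extent reports the bound");
```

Taking the arrays by reference keeps the bound N in the type, which is what lets these helpers work without a separate size argument.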

Some additional proposals in-flight to improve C arrays even more:

  • Relaxed incomplete multidimensional array type declaration (P0332, rejected?)
  • TODO: more?

This proposal is orthogonal to those and improves C++ support for C arrays in different ways.

Zero-sized arrays

TODO: add motivation, I have examples in cpp-sort where 0-sized buffers simplify the handling of optionally empty buffers

TODO: can probably copy the semantics of std::array, they seem to work well
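For reference, the std::array semantics mentioned above can be sketched as follows. count_elements is a hypothetical helper; whether zero-sized C arrays should behave exactly like this remains an open question of this proposal.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Counts elements by iterating, to show that a zero-sized std::array is a
// valid, empty range.
template<typename T, std::size_t N>
std::size_t count_elements(const std::array<T, N>& buffer) {
    std::size_t count = 0;
    for (const T& element : buffer) {
        (void)element; // only the number of iterations matters here
        ++count;
    }
    return count;
}

// int empty[0]; // ill-formed in standard C++ today
```

A std::array<int, 0> reports size() == 0, empty() == true and begin() == end(), so generic code needs no special case for it.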

TODO: zero-size arrays seem to be allowed in new expressions, C VLAs and the then proposed arrays of runtime bound, investigate (CWG1768)

TODO: mention non-standard compiler extensions that allow declaring 0-sized arrays and how they behave

TODO issues:

  • Can be used to trigger errors (ex: pre-C++11 STATIC_ASSERT trick)
  • Can be used in SFINAE conditions
  • Some compilers (which?) use them for flexible array members

Comparisons

operator<=> already addressed part of the issue: it cannot be used on C arrays, and the proposal that introduced it also deprecated C array comparisons altogether. When a defaulted operator<=> compares two classes with array members, however, the arrays are compared lexicographically, which hints at the committee's willingness to make arrays proper regular types.

We could fix this with a long-term plan: keep the status quo for a few years, then make array comparisons ill-formed altogether for a few more years, then introduce the new behaviour that compares arrays element by element lexicographically, allowing operator<=> to be used on C arrays at the same time. While they are removed, the comparison operators should be defined as = delete to avoid the decaying behaviour that would otherwise lead to a comparison of pointers.

One interesting property of fixing comparisons is that string literals could then be compared directly with == and give the expected result.
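The element-wise behaviour described in this section can be emulated today with standard algorithms. The sketch below uses two hypothetical helpers, arrays_equal and arrays_less, and restricts itself to arrays of the same size, since the handling of different sizes is still an open question here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>

// Element-by-element equality, standing in for a future array ==.
template<typename T, std::size_t N>
bool arrays_equal(const T (&lhs)[N], const T (&rhs)[N]) {
    return std::equal(std::begin(lhs), std::end(lhs), std::begin(rhs));
}

// Lexicographical ordering, standing in for a future array <.
template<typename T, std::size_t N>
bool arrays_less(const T (&lhs)[N], const T (&rhs)[N]) {
    return std::lexicographical_compare(std::begin(lhs), std::end(lhs),
                                        std::begin(rhs), std::end(rhs));
}
```

Because string literals are arrays of const char, arrays_equal("abc", "abc") compares the characters, including the terminating null, rather than the addresses of the two literals.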

TODO: comparison of T[] and T*, make it ill-formed and require explicit cast?

TODO: comparison of arrays of different sizes should return false, right? Check what is the behaviour when an array appears as a class member.

Copy, move & decay

Copying arrays and returning arrays from functions are currently ill-formed, but when an array is passed to a function, it implicitly decays to a pointer to its first element. It would therefore be easy to define array copy as element-by-element (as it already is when the array is wrapped in a structure) for normal copy operations and for return types, but difficult to change the existing behaviour for function parameters. One way could be to deprecate array function parameters, then make them ill-formed, then change the behaviour to element-by-element copy some time later, possibly with a cycle matching the one used for comparisons. However, that would still change in surprising ways the behaviour of arrays passed to function templates expecting a parameter by copy. A deprecation warning could catch these cases and make the transition easier.

Move operations would be implemented by moving elements one-by-one to the target array.
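The structure-wrapping behaviour and the proposed element-wise operations can be sketched with what the language offers today. Wrapped, make_wrapped, copy_array and move_array are hypothetical names; the two helpers only approximate what built-in array copy and move might do.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// An array member is copied element by element by the implicit copy
// operations, and the wrapper can be returned from a function.
struct Wrapped {
    std::string names[2];
};

Wrapped make_wrapped() {
    Wrapped result = {{"left", "right"}};
    return result;
}

// Stand-in for element-by-element array copy.
template<typename T, std::size_t N>
void copy_array(const T (&source)[N], T (&target)[N]) {
    std::copy(std::begin(source), std::end(source), std::begin(target));
}

// Stand-in for array move: each element is moved to the target one by one.
template<typename T, std::size_t N>
void move_array(T (&source)[N], T (&target)[N]) {
    std::move(std::begin(source), std::end(source), std::begin(target));
}
```

After move_array, the elements of source are left in a valid but unspecified state, as with any moved-from standard library object.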

Allowing functions to return C arrays would also make some template metaprogramming easier: some traits such as std::result_of have issues handling arrays because they take a function type as a parameter, and when that function type's return type is an array, the program is ill-formed. It's among the reasons why std::invoke_result was introduced in C++17, and not having to separately handle array types in metaprogramming would make everybody's life easier.

It is worth noting that at the end of the deprecation, removal and new behaviour cycle for both comparisons and copy operations, array types would be regular and could thus benefit from everything regular types benefit from.

TODO: how to handle copy/move of arrays of different size? Make them ill-formed or truncate? Check the rules for arrays as class members, we want to match them.

Current Core & Library issues about arrays

There are still more core issues about improving arrays, which could be considered in a holistic way:

Impact on the standard

We described the core language changes required to improve C arrays, but once implemented, the impact on the standard library is also quite broad:

  • Every function expecting a regular type can take an array
  • Template parameters expecting a function type now handle functions returning arrays out-of-the-box
  • Allow std::variant to hold array types
  • Allow std::optional to hold array types (LWG3196)
  • TODO: more miscellaneous improvements

Annex A: stats in real world code

Stats about array function parameter declarations, array comparisons and implicit decay.

TODO: check whether a proposal already has stats

TODO: gather stats somehow...

Annex B: better vectorization for C arrays

TODO: find issues, questions, examples where C arrays are better vectorized than std::array
