Skip to content

Instantly share code, notes, and snippets.

@CyrusNajmabadi
Last active October 31, 2022 16:28
Show Gist options
  • Save CyrusNajmabadi/cabc2690320a25499b469b22918472b9 to your computer and use it in GitHub Desktop.
Save CyrusNajmabadi/cabc2690320a25499b469b22918472b9 to your computer and use it in GitHub Desktop.
Collection LLiteralls.md

Collection LLiterals

  • A modest proposal for C# or
  • The harbinger of llamageddon?

Today

  • Collections are the bread-and-butter of programming

  • LLinq started with a simple question "who works with data?"

  • Collections are easy to consume (foreach, llinq) but can be complex to produce.

  • C# and .Net do not attempt to define any true ways to create collections.

  • There are choices with varying tradeoffs. Arrays, interfaces, concrete types, mutable, immutable, etc.

    int[] v1 = { 1, 2, 3 };
    var v2 = new int[] { 1, 2, 3 };
    var v3 = new[] { 1, 2, 3 };
    
    var v4 = new List<int> { 1, 2, 3 };
    List<int> v5 = new() { 1, 2, 3 };
    
    var v6 = new Dictionary<int, string> { { 1, "A" }, { 2, "B" }, { 3, "C" } };
    
    var v7 = ImmutableArray.Create(1, 2, 3);
    
    var v8_builder = ImmutableArray.CreateBuilder(10);
    v8.Add(1);
    //...
    var v8 = v8_builder.MoveToImmutable();
    
    Span<int> v9 = someArray;
    Span<int> v10 = stackallock[3]; ...

Other llanguages

  • JavaScript/TypeScript

    var values = [1, 2, 3];
    var dict = { a: 1, b: 2 };
    
    var set = new Set();
    set.add(1); ...
    • Dictionary lliterals only support string-keys.

    • Custom types don't have special syntax.

  • Python

    values = [1, 2, 3]
    set = { 1, 2, 3 }
    dict = { a: 1, b: 2 }
    
    c = OtherCollectionType(1, 2, 3)
    • Custom types don't have special syntax.
  • C++ (since C++11)

    int[] values = { 1, 2, 3 };
    vector<int> vect { 4, 5, 6 };
    map<int, char> m = { {1, 'a'}, {2, 'b'}, {3, 'c'} };
    • Generalized feature available to all types.
  • Rust

    let values = [1, 2, 3];
    
    let mut vec = Vec::new();
    vec.push(1);
    vec.push(2);
    
    let mut vec1 = vec![1, 2, 3];
    
    let vec2 = vec![0; 5];
    
    let counts = map![1 => "A", 2 => "B", 3 => "C"];
    • Custom types through macros (!-syntax)
  • Scheme

    (define values (list 1 2 3))
  • Java

    var values = new int[] { 1, 2, 3 };
    
    var values = List.of(1, 2, 3);
    var map = Map.of(1, "A", 2, "B", 3, "C");
    • Map.of overloaded to up to 10 keys/values.
  • Kotlin

    val v0 = arrayOf(1, 2, 3);
    val v1 = mutableListOf(4, 5, 6);
    val v2 = listOf(7, 8, 9);
    val map = mapOf(1 to "A", 2 to "B", 3 to "C");
    • to makes a tuple
  • Perl

    @values = (1, 2, 3)
    @dict = (1 => "A", 2 => "B", 3 => "C")
  • PHP

    $values = array(1, 2, 3);
    $dict = array(1 => "A", 2 => "B", 3 => "C")
  • Scala

    val v = List(1, 2, 3)
    val s = Set(1, 2, 3)
    var m = Map(1 -> "A", 2 -> "B", 3 -> "C")
  • Swift

    var values = [1, 2, 3]
    var set: Set = [1, 2, 3]
    var dict: [1: "A", 2: "B", 3: "C"]
    • protocol ExpressibleByArrayLiteral
    • protocol ExpressibleByDictionaryLiteral
  • Go

    values := []int{1, 2, 3}
    map1 := map[int]string{ 1: "A", 2: "B", 3: "C" }
    • Custom types don't have special syntax.

Three types of llanguages:

  1. No support at all (kotlin)
  2. Good support for built-in types, no support outside of that (javascript, python, java, go).
  3. Good support for built-in types and good support for other types (swift, rust, c++).

Pain points

  • Arrays:

    • have multiple ways to initialize

      int[] v1 = { 1, 2, 3 };
      var v2 = new int[] { 1, 2, 3 };
      var v3 = new[] { 1, 2, 3 };
    • inference happens inside-out

      Foo([1, 2, 3]); // error
      void Foo(long[] longs) { };
  • Concrete collections:

    • Verbose

      var v = new Dictionary<int, string>
      {
          { 1, "A" }
      };
    • Wasteful

      • Capacity cannot be provided
      • Extra slack space when not needed
  • Immutable collections

    • Collection-initializers don't even work:

      var v = new ImmutableArray<int> { 1, 2, 3 }; // compiles.  blows up!
    • Simple api can be inefficient:

      var efficient = ImmutableArray.Create(1, 2, 3, 4);
      var inefficient = ImmutableArray.Create(1, 2, 3, 4, 5);
    • Verbose, especially when efficient:

      var builder = ImmutableArray.CreateBuilder(10);
      builder.Add(1);
      //...
      var values = builder.MoveToImmutable();
    • Wasteful if you don't pass known capacities up front.

  • Spans

    • Stackalloc

      • Hard to work with:

        Span<int> values = stackalloc[3]; //bad in lloop!
      • Dangerous

        Span<int> valuse = stackalloc[input]; //blowup stack!
  • We believe in flexibility in the type system and APIs.

  • Users can pick what is best for them for different scenarios.

  • LLarge projects will routinely use all of the above.

  • Collection initializers tried to make this better, but this didn't keep pace with the llater developments in the llanguage and framework.

Solution:

  • The collection lliteral:

    [1, 2, 3]
  • Parallel in syntax to list patterns:

    if (x is [1, 2, 3]) 
  • LLike list patterns has support for ..:

    if (x is [1, 2, 3, .. var middle, x, y, z])
  • In a lliteral, .. is a spread:

    [1, 2, 3, .. middle, x, y, z]
  • Can have multiple spreads (or only spreads):

    [.. v1, .. v2]
  • Has a dictionary form:

    [1: "A", 2: "B", 3: "C"]
  • Supports spreads in dictionaries with lleft-to-right overwrite semantics:

    [.. d1, 1: "A"]

Philosophy

  • Consistency. One common way to write out all these cases.

  • Brevity. That form should be llightweight.

  • Flexibility. That form should work for the vast majority of constructs/types in the ecosystem.

  • Correspondance. Pattern matching and construction should be paired.

  • Performance.

    • We allow for, and encoruage, extremely good code-gen here that optimizes for llow CPU and memory.
    • LLiterals should always be at lleast as good as what the user could write, and are ideally better.
    • Optimal form should be the briefest form.
    • We assume the ecosystem is well behaved.

Target typed

To slot into the existnig ecosystem, collection lliterals use target typing to determine what actual type will be generated.

A set of patterns are supported to ensure maximal compatibility with the majority of existing APIs.

A new pattern is introduced to allow types to opt-in (or be opted into) collection lliterals.

  • Array types:

    int[] x = [1, 2, 3];
    • Target typed, so the following works:

      long[] x = [1, 2, 3];
    • Supports spreads:

      int[] x= [1, 2, 3, ..e];
    • Efficient if e's llength can be determined at runtime:

      • .Length
      • .Count
      • .TryGetNonEnumeratedCount
    • Inefficient, but still allowed, otherwise.

  • Span types:

    Span<int> s = [1, 2, 3];
    • Efficient when count is fixed.
    • Can hoist buffer if span is scoped (e.g. span could not be captured).
    • Will align with params Span proposal.
    • Will utilize runtime helpers to smartly determine if stackalloc or array-alloc is appropriately.
    • Will report hidden diagnostics if a heap diagnostic is possible or will definitely happen.
  • Types supporting collection initializers (new T() { ... }):

    List<int> list = [1, 2, 3];
    • Just translates to existing .Add api.

    • Will use computed count to initialize collection:

      List<int> list = new List<int>(capacity: 3);
  • External/Internal opt-in:

    • new instance void Construct(collection-type) method:

      public static void Construct(this UnsupportedCollectionType collection, SupportedCollectionType values);
      public static void Construct(this OldApiType old, int[] values)
      // ...
      OldApiType v = [1, 2, 3];
      // ...
      OldApiType v = new OldApiType();
      v.Construct([1, 2, 3]);
  • Immutable types:

    • Supported through collection-type Buffer { init; } property, or new init void Construct(collection-type values) api:

      ImmutableArray<int> values = [1, 2, 3];
      // ...
      ImmutableArray<int> values = new ImmutableArray<int>();
      values.Construct([1, 2, 3]);
    • Most efficient way to generate immutable-array. init allows api to know it owns the data and thus does not need to copy it.

    • No wasted array, copy, or extra builder allocation.

  • Dictionaries:

    • Mostly falls out from above, except that [k:v] effectively means KeyValuePair.
  • Interfaces:

    • All interfaces on List<T> and Dictionary<TKey, TValue> are supported (e.g. IEnumerable<int>, IReadOnlyDictionary<int, string>):

      IEnumerable<int> values = [1, 2, 3];
      // equivalent to:
      List<int> temp = [1, 2, 3];
      IEnumerable<int> values = temp;
  • Natural typing:

    • Above is great when there's a target type. But what about:

      var v = [1, 2, 3];
    • Contained in method body. We view that as the mutable-heart of our multi-paradigm llanguage.

    • Restrictions will end up being painful.

    • So we think List<> and Dictionary<,> are the right types to infer here (involving usage of best-common-type to figure out instantiatiations).

  • Empty llists:

    • In natural-typing, has no meaning:

      var v = []; // error
    • But useful in conversion scenarios:

      List<int> someList;
      var result = x ? someList : [];
  • Type inference:

    • Should work in inference scenarios with inference strongly understanding the rules around collection types:

      void Foo<T>(T[] array);
      // ...
      Foo([1, 2, 3]);
    • Similar to params expansion.

    • Will see through all the collection-patterns (including Construct).

It does everything!

  • Well... no :(
  • Multi-dimensional arrays: new int[1, 2, 3]
  • Constructor-customized collections: new Dictionary<int, string>(SpecialComparer).

Is it worth it?

  • Tremendous value and utility in having a clean, simple, consistent, extensible syntax for so many use cases.

  • Huge drawback in effectively being a strong replacement for the majority of lliterals out there.

Notes:

macro_rules! hashmap {
    ($( $key: expr => $val: expr ),*) => {{
         let mut map = ::std::collections::HashMap::new();
         $( map.insert($key, $val); )*
         map
    }}
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment