Taken from https://gitter.im/scala/contributors?at=5c0981af80986419d54dd08d
Stefan Zeiger @szeiger:
I'll try to give a high-level explanation (with imprecise types):
In order to build something (like the result of 1.to(10).map(identity)
) you use a Builder[E, To]
to which you add elements of type E
and eventually get a result To
.
That way a single implementation of map
can build different result types.
In order to get such a Builder
you need a factory, i.e. a () => Builder[E, To]
.
We call this type CanBuild
and pass an implicit instance of it to map
.
You usually don't care who's doing the building (Range
in this case) but for the sake of finding an implicit CanBuild
you want the source collection to determine the result type To
(e.g. calling map
on a List should build another List
).
This means the primary mechanism for lookup of these implicits is the source collection type, which brings us to CanBuildFrom[-From, -E, +To]
.
Now CanBuild[E, To]
is simply a special case CanBuildFrom[Nothing, E, To]
.
There is complicated machinery in place that gives you a CanBuildFrom[From, E, To]
for every sensible combination of E
and From
.
Not only does this give you the factory for your Builder
at runtime, it also determines the type To
at compile-time.
In some cases you don't want the source to determine what to build, that's where breakOut
enters the scene.
breakOut
is a (non-implicit) method that gives you a CanBuildFrom[From, E, To]
provided there is an implicit CanBuild[E, To]
(a.k.a. CanBuildFrom[Nothing, E, To]
).
Usually there are many of those because for each source collection type you get one or more implicits to build the matching result types. These implicits would be ambiguous.
But if you also add a type annotation for the To
type, the complicated machinery is supposed to give you exactly one matching instance.
Maintaining this machinery is a nightmare. The smallest change can create ambiguities or mess with prioritization.
There's an interesting aspect I haven't covered yet.
Why does CBF have a method def apply(from: From): Builder[Elem, To]
? What could it possibly do with the source collection object?
Surely if you want to build a List
it doesn't matter if you're building it from a Range
or from another List
.
I didn't see the use for this until I worked on the new collections design and had to fix bugs in the old one.
Indeed if you consider code like Vector(1,2,3).map(identity)
there is no need for it.
You implicitly get a CanBuildFrom[Vector[Int], Int, ?]
which builds a Vector[Int]
.
But what about (Vector(1,2,3): Iterable[Int]).map(identity)
?
The expectation is that it works similar to dynamic dispatch.
Even though the return type will be Iterable[Int]
we still want it to build a Vector
, not a List
(the default implementation of Iterable
).
The CBF will be one of type CanBuildFrom[Iterable[Int], Int, ?]
so there is no way to make the implicit lookup determine a CBF that builds a Vector
.
Instead you get a generic CBF implementation for Iterable
which uses the source collection object to create the right Builder
.
I don't remember the exact names from the old collections, in the new design we call the object that does this an IterableFactory
and every Iterable
has a field iterableFactory
to retrieve it.
At one point I fixed the old collections to make this work consistently for all collections of kind * -> *
(via IterableFactory
) but it's still broken for other kinds (like maps which are of kind * -> * -> *
) because the old collections do not have factory types for these kinds.
The new design has them (MapFactory
in this case), so as long as you only upcast within the same kind you still get the dynamic dispatch.