@alabamenhu
Last active January 9, 2022 04:29
Strongly Typed Raku (WIP)
=begin pod
Raku is a gradually typed language, allowing you to completely ignore typing for quick mock-ups,
or enforce very strict typing to ensure reliability in critical programs. This is a guide for
certain issues that can come up when writing strongly typed Raku code.
=head1 Basic typing
By default, Raku doesn't care what you store in variables.
my $foo;
$foo = 1;
$foo = "one";
$foo = { $_ + 1 };
In a perfect world, we would always remember exactly what is being passed where
and make no mistakes whatsoever. But sometimes as programmers we can screw up:
my $number;
$number = "one";
say $number + 2;
In this case, it's quite easy to ensure that C<$number> always contains something
numeric. Compare the results of the following assignments.
my Numeric $number;
$number = 1; # ✓
$number = 1.11; # ✓
$number = 1 / 10; # ✓
$number = "one"; # error
The type of a variable can be set as either a class or a role. In Raku, classes
and roles function fairly independently of each other thanks to mixins, so don't
worry too much about whether you're using a class or a role: just ensure it locks
values into what you want. For numbers, note the difference:
# Classes
my Int $a; # only accepts 1, 2, 3, etc.
my Rat $b; # only accepts rationals such as 1/2, 3/1, or 1.5
my Num $c; # only accepts floating-point values such as 1e0 or 2.5e0
# Roles
my Numeric $d; # accepts an Int, Rat, Num, or any other Numeric
Be advised that a subclass will always be considered a valid value for its parent
class, so if you ask for C<ParentClass>, you might also get C<ChildClass> unless
you specifically disallow it (which can be done using subsets, see below).
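As a rough illustration of the difference between the numeric classes, the sketch below prints the type of a few literals and then shows a typed variable rejecting a string. The variable name C<$number> mirrors the earlier example; the exception name in the comment is what Rakudo reports for a failed assignment.

```raku
# The literal form determines the numeric class.
say 1.WHAT;      # (Int)
say 1.5.WHAT;    # (Rat)
say (1/3).WHAT;  # (Rat)
say 1e0.WHAT;    # (Num)

# All of them do the Numeric role, but they are distinct classes.
say 1e0 ~~ Numeric;  # True
say 1.5 ~~ Int;      # False

# A typed variable rejects a value of the wrong type at runtime.
my Numeric $number = 1.11;
try { $number = "one" };
say $!.^name;   # X::TypeCheck::Assignment
say $number;    # 1.11
```

Note that a decimal literal such as C<1.5> is a C<Rat>, not a C<Num>; a C<Num> literal needs an exponent.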
All typing information can I<also> be included on signatures:
sub foo (Num $float, Int $integer, Numeric $any-number) { … }
One extra feature of signatures that is currently not available elsewhere is the ability to autocoerce.
The two blocks are functionally equivalent:
sub foo (Str() $text) { … }
sub foo ($text is copy) {
    $text = $text.Str unless $text ~~ Str;
    …
}
Either version works fine and will fail if the value passed cannot coerce itself into
C<Str>. Autocoercion is great in situations like logging functions that will ultimately
use the string value anyway. Be very judicious with it, however, because many objects may
have unexpected or undesirable coercions (an C<Array> coerces to a number equal to its
number of elements, so doing math with that value, other than to calculate offsets or
loops, likely represents an error).
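To make the risk concrete, here is a minimal sketch of coercing parameters. The subs C<shout> and C<double> are hypothetical names used only for illustration.

```raku
# Str() accepts any value and calls .Str on it before binding.
sub shout(Str() $text) { $text.uc ~ '!' }
say shout('hi');   # HI!
say shout(42);     # 42!

# Int() on an Array silently yields the number of elements,
# which is rarely what the caller meant.
sub double(Int() $n) { $n * 2 }
say double(21);            # 42
say double([10, 20, 30]);  # 6
```

The last call succeeds without complaint, which is exactly the kind of surprise that makes manual coercion at the call site the safer default.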
=head2 Summary
Whenever you declare any variable, always ensure that you have the type that
best fits the variable. (Double- and triple-check whether autocoercion is useful;
when in doubt, have the calling code coerce manually.)
=head1 Defined vs Types
Generally, it's okay for most variables to go undefined and still be passed around.
However, there are several subtle ways that undefined values can accidentally come up
and not cause problems until sometime later. For instance, consider this code:
sub add(Int $a, Int $b) {
    ...
    $a + $b
}
my Int @array = 1, 2, 3;
my Int $value = @array[3]; # oops, we forgot Raku is 0-based.
...
add $value, 1;
The problem will only surface at the end of C<add()>, even though the
mistake happened far earlier. This can make debugging more difficult. One
possibility could be to say:
sub add(Int $a, Int $b) {
die unless $a.defined;
die unless $b.defined;
...
$a + $b
}
That's an improvement: even though C<$a> isn't actually used until the end of the sub,
we'll catch the problem at the very beginning. But we can move the check into the sub's
signature:
sub add(Int $a where *.defined, Int $b where *.defined) { … }
By putting it in the signature, the error can be caught as soon as the
sub is called. Using C<where *.defined> is a bit line-noise-y, so there's
a syntactical shortcut for it:
sub add(Int:D $a, Int:D $b) { … }
The C<:D> constraint doesn't actually have to have a type defined (although
we're talking about strictly typing here so let's avoid that, mmkay?), in
which case it's implied to be C<Any:D>
sub add(:D $a, :D $b) { … }
Sometimes, instead of desiring a defined object, you might want to
explicitly work with undefined variables. This is useful when
you want to pass types as types (for example, in parameterization).
Undefinedness can be enforced simply with the C<:U> constraint.
sub foo(:U $a) { … }
=head2 Summary
For strongly typed Raku, you should always ensure that every parameter
in a signature has a definedness constraint. Here are the interpretations:
Any:D $defined-only
Any:U $undefined-only
Any:_ $either-DANGER
Any $also-either-DANGER
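A small sketch of the C<:D> constraint in action, reusing the C<add> sub and the off-by-one array lookup from this section. The variable C<$result> is introduced only for illustration.

```raku
sub add(Int:D $a, Int:D $b) { $a + $b }

my Int @array = 1, 2, 3;
my Int $value = @array[3];   # out of range: yields the Int type object
say $value.defined;          # False

say add(1, 2);               # 3

# The call fails before the body runs because $value is not defined.
my $result = try add($value, 1);
say $result.defined;         # False
```

Without the C<:D> on the parameters, the undefined C<$value> would be accepted by the signature and the problem would only show up inside the body.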
=head1 Subsets and Constraints
When you want to restrict the values, rather than the type, you'll want to use a subset.
For instance, dividing a number can be done by anything but zero, so let's imagine a
C<divide> function that takes two numbers and returns the result:
sub divide(Numeric $dividend, Numeric $divisor) {
    die "Division by zero is impossible" if $divisor == 0;
    return $dividend / $divisor;
}
There are better ways to handle this. One way is to explicitly state the limitations:
sub divide(Numeric $dividend, Numeric $divisor where * != 0) { … }
This is great for single ad-hoc constraints. But let's consider a function that wants
a CSS color value as a string.
sub pretty(Str $color where /^ <[0..9a..fA..F]> ** 6 $/) { … }
The problem here is twofold: (1) color is something we'll probably use over and over
again and we'd have to keep rewriting the constraint and (2) a CSS color is more than
just six hexadecimal digits. It could also be three hex digits, or a name, or use an
explicit color type format like C<rgba(123,45,67,.89)>. What can be done instead is
create a C<CSSColor> subset which is a C<Str> by type, but whose values are limited to those
that are valid in CSS:
subset CSSColor of Str where {
       $_ ~~ /^ <[0..9a..fA..F]> ** 6 $/
    || $_ ~~ /^ <[0..9a..fA..F]> ** 3 $/
    || $_ ∈ <red green blue yellow>
    || …
}
sub pretty(CSSColor $color) { … }
Now in the sub we can safely print out C<$color>, assured that it is a valid CSS color, as
well as use the subset in any other sub. If CSS changes its definition of colors, we can modify just
the subset and it will apply to every instance.
Subsets don't have to work on just the values. For instance, imagine I want a sub
to accept objects that are I<both> Positional I<and> Associative. I can't include
that in the type information itself, only via where clauses:
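Here is a runnable sketch of the same idea using a deliberately simplified, hypothetical subset named C<HexColor> that only covers the six-digit case. A real C<CSSColor> would need the additional alternatives described above.

```raku
# Simplified, hypothetical subset: exactly six hex digits and nothing else.
subset HexColor of Str where /^ <[0..9 a..f A..F]> ** 6 $/;

sub pretty(HexColor $color) { "color: #$color;" }

say 'ff8800' ~~ HexColor;   # True
say 'ff88'   ~~ HexColor;   # False
say pretty('ff8800');       # color: #ff8800;
```

Because the regex is anchored with C<^> and C<$>, longer strings that merely contain six hex digits are rejected too.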
sub double-duty($foo where Positional & Associative) { … }
But with a subset, I can:
subset TwoWays where Positional & Associative;
sub double-duty(TwoWays $foo) { … }
You can make very complex subsets:
subset Overkill where $_ ~~ Positional
&& .[0] == 0
&& .[3] == 3
&& .all ~~ Int
&& .elems < 10;
my Overkill $a = (0,1,2,3,4); # perfect
my Overkill $b = (1,1,2,3,4); # error: first element must be 0
my Overkill $c = (0..10).List; # error: more than ten elements
A really cool thing about subsets is that, written in a particular way, you can give
some very useful error messages.
subset Overkill where ($_ ~~ Positional || die "Must give a Positional value")
&& (.[0] == 0 || die "The first element must be zero")
&& (.[3] == 3 || die "The fourth element must be 3, but it was ", .[3])
&& (.all ~~ Int || die "All elements must be Ints")
&& (.elems < 10 || die "Must have fewer than 10 elements but found ", .elems);
my Overkill $a = (0,1,2,2,4); # die output: "The fourth element must be 3, but it was 2"
my Overkill $b = (1,1,2,3,4); # die output: "The first element must be zero"
my Overkill $c = (0..10).List; # die output: "Must have fewer than 10 elements but found 11"
Usually C<die> is what you want, but you can also create specific exception types and throw them
instead if you intend to handle them in C<CATCH> blocks regularly.
=head2 Summary
Define a subset by using a where clause followed by a single value (which will be smartmatched) or one
or more operations.
subset EightBit where 0..255; # ranges' smart match checks values
subset SmallOrBig where (0..10) | (100..1000); # junctions are allowed
subset UnderTen where * < 10; # whatevers are valid (only use one)
subset UnderTenEven where $_ < 10 # have to use topic variable
&& $_ %% 2; # to reference value more than once
subset ThreeItems where .elems == 3; # since it's topicalized, you can use .method
# without the $_ reference.
Use a subset whenever there are bad values or complicated restrictions for a type.
You can also use them for complicated types (that mix two or more). Subsets are fairly
fundamental to both strongly typed and safe Raku coding.
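As a quick check of how a subset behaves once it is attached to a variable, here is a sketch using a block-form C<UnderTenEven>. Adding C<of Int> is an assumption made here to also pin down the base type.

```raku
subset UnderTenEven of Int where { $_ < 10 && $_ %% 2 };

say 4  ~~ UnderTenEven;   # True
say 5  ~~ UnderTenEven;   # False
say 12 ~~ UnderTenEven;   # False

# The constraint is re-checked on every assignment.
my UnderTenEven $n = 4;
try { $n = 5 };           # rejected: 5 is odd
say $n;                   # 4
```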
=head1 Arrays and Hashes
Arrays and hashes present some problems when trying to strongly type because of the way that
parameterization can be defined. Consider an array:
my @integers = 1,2,3,'4';
This assignment is considered valid, but clearly contains a C<Str> that we don't want to
allow. Positional objects stored with a C<@> sigil can be quickly typed by simply placing
the type in front (remember that the C<@> sigil I<implies> the Positional type).
my Int @integers = …
And now it would catch accidentally putting in a Str. This is fairly straightforward. But
what if you want to have an array of arrays? The basic type definition is
my Array @arrays
But we can't say C<my Int Array @arrays>, because the inner C<Array> isn't itself declared with a C<@> sigil.
We can use another format, though (which also works for C<$>-sigiled positionals):
my Array[Int] @arrays;
my Array[Array[Int]] $arrays-in-scalar;
Strongly typed Raku is very strict: the above arrays require extra effort to work with:
my Array[Int] @arrays;
@arrays[0] = (1, 2, 3); # error! (1, 2, 3) is a plain List
@arrays[0] = Array[Int].new(1,2,3); # correct
@arrays = (1,2,3), (4,5,6), (7,8,9); # error! each value is a plain List
@arrays = Array[Int].new(1,2,3),
Array[Int].new(4,5,6),
Array[Int].new(7,8,9); # correct, each value is an Array[Int]
Note the use of the C<Class[Type]> in the object constructor to enforce the correct
object type.
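The sketch below exercises these rules. It also relies on one thing not stated above: a variable declared as C<my Int @row> is itself an C<Array[Int]>, so it can be stored directly. The name C<@row> is hypothetical.

```raku
my Array[Int] @arrays;

# An explicitly parameterized Array is accepted.
@arrays[0] = Array[Int].new(1, 2, 3);
say @arrays[0].sum;   # 6

# A variable declared with a typed @ sigil is itself an Array[Int].
my Int @row = 4, 5, 6;
say @row.WHAT;        # (Array[Int])
@arrays[1] = @row;
say @arrays[1].sum;   # 15

# A plain, untyped Array is rejected.
try { @arrays[2] = [7, 8, 9] };
say $!.defined;       # True
```

Declaring intermediate rows with a typed sigil can be less noisy than spelling out C<Array[Int].new(…)> each time.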
The same principle exists for Associatives like Hashes or Maps. You can almost think of
Positionals as hashes whose keys are integers. By default, the keys of Associatives
are C<Str>, and the values are defined similarly to Positional value types:
my Int %integer-values = …
Sometimes, however, you may want to use a key type other than Str, for instance,
if you are creating a mapping between various objects. There is a special syntax
that uses curly braces to do this:
my Int %foo{Any} = …
This enables any kind of object to be used as a key. You don't actually need
the value to be defined to constrain the key, so to allow for any value, but
requiring integer keys, you can say:
my %any-value{Int}
For an Associative with keys
restricted to things like C<Rat> with values that are strings, you could say:
my Str %foo{Rat} = 1/2 => 'one half', 1.0 => 'one', 3/2 => 'one and a half';
Just like with positionals, it is possible to define type constraints along
with the main class when using a scalar value:
my Hash[Str,Rat] $foo = …
In this format, note the order of the values: C<Associative[Value,Key]>. If
you want to only define the key with this format, you will need to explicitly
use the value type C<Any>.
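A minimal sketch of a hash with typed keys and values follows. The name C<%names> is hypothetical, and the final check only assumes that a key of the wrong type raises an exception.

```raku
# Str values, Rat keys.
my Str %names{Rat} = 1/2 => 'one half', 3/2 => 'one and a half';

say %names{1/2};                        # one half
say %names.keys.map(*.^name).unique;    # (Rat)

# An Int key does not satisfy the Rat key constraint.
try { %names{2} = 'two' };
say $!.defined;                         # True
```

Unlike a default hash, the keys here keep their original type instead of being turned into strings.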
=head2 Summary
Here's a quick review of the different ways to define type/value relationships:
Positionals (remember, the @ sigil implies Positional, NOT Array or List)
#######################################
my Str @array; # a Positional containing Str values
my Array[Int] $array-in-scalar; # an Array containing Int values, stored in a scalar
my Array[Rat] @array; # a Positional containing Arrays that contain Rat values
my Array[Array[Str]] @array; # a Positional containing Arrays that contain Arrays that contain Str values
Associatives (remember, the % sigil implies Associative, NOT Hash or Map, and defaults to Str keys)
#######################################
my Rat %hash; # an Associative containing Rat values with Str keys
my Rat %hash{Int}; # an Associative containing Rat values with Int keys
my %hash{Int}; # an Associative containing Any values with Int keys
my Hash[Rat] $hash-in-scalar; # a Hash containing Str keys and Rat values, stored in a scalar
my Hash[Rat,Int] $hash-in-scalar; # a Hash containing Int keys and Rat values, stored in a scalar
my Hash[Any,Int] $hash-in-scalar; # a Hash containing Int keys and Any values, stored in a scalar
While you can go crazy and define a value like C<my Hash[Array[Hash[Callable]],Str] %hash{Str}>,
it's really overkill. If the structure is known in advance, you are almost I<always> better off
making small classes in such a case. In particular, attribute access is faster than hash access,
classes can help with code completion in IDEs, and, very importantly, accessing a non-existent
attribute will error (both Hashes and Arrays will return type objects, which might not error
immediately; see Defined vs Types above).
=head1 Other Parameterization
To create your own parameterization, define your role with extra brackets:
role Positional[::Value] { … }
It is possible to parameterize classes, but it is also a bit more difficult because it
involves making a special inner-role and is currently Rakudo-specific (at the moment
not a problem, as Rakudo is the only Raku compiler). For more information, see
L<this SO post|https://stackoverflow.com/questions/57554660>.
class DescriptiveName { 
my role R[::T] {
has T $.value; # our parameterized value
# All "actual" methods go between here…
method new(DescriptiveName: T $value) { self.bless: :$value }
# … and end here
}
# This handles the mixin process that combines the outer class with its inner role
method ^parameterize(Mu:U \this, Mu \T) {
my $type := this.^mixin: R[T];
$type.^set_name: this.^name ~ '[' ~ T.^name ~ ']';
$type
}
}
The C<::Identifier> syntax defines a type capture. You can then use the captured type
anywhere else you would use a type. For example, if you have an array-like class, you
may want to make sure you only add things of the same type.
role ArrayWrapper[::TypeCapture] {
has TypeCapture @internal;
method push(TypeCapture $elem) { @internal.push: $elem }
method pop( --> TypeCapture) { @internal.pop  }
}
It's somewhat traditional to use a single letter for the type capture, but you're free to use
whatever name you want. You can also use type captures in signatures:
sub same-type-only(::T $foo, T $bar) { 
# dies unless $foo and $bar have matching types 
}
Unfortunately, this isn't available for slurpies because they don't (currently) allow typing.
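A short sketch of a type capture in a sub signature. The body and the variables C<$text> and C<$result> are made up for illustration.

```raku
# T is bound to the type of the first argument; the second must match it.
sub same-type-only(::T $foo, T $bar) { "both are {T.^name}" }

say same-type-only(1, 2);        # both are Int

# A Str second argument does not satisfy T (here: Int).
my $text   = 'two';
my $result = try same-type-only(1, $text);
say $result.defined;             # False
```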
=head2 Summary
=head1 Slurpies
Slurpies present a major problem for strongly typed Raku programming: they don't allow typing
(yet, at least). In general, just don't use them. But if you need to, there are a few
workarounds. The easiest way is to add a where clause:
sub only-strings(*@slurpy where .all ~~ Str) { … }
Hashes are far more complicated because if you use .all, you can only match with a Pair, which
is not parameterizable. So we need to do two checks:
sub only-string-key-int-val(
*%slurpy where .keys.all ~~ Str
&& .values.all ~~ Int
) { … }
There is a I<huge> caveat with this method for both Positionals and Associatives. Your slurpy
will I<not> be typed. To ensure further type matching, you'll need to create an entirely
new variable first.
sub only-strings(*@slurpy where .all ~~ Str) {
my Str @typed-slurpy = Array.new: @slurpy;
}
sub only-string-key-int-val(
*%slurpy where .keys.all ~~ Str
&& .values.all ~~ Int
) { 
my Int %typed-slurpy{Str} = Hash[Int,Str].new(%slurpy);
}
This is a complicated bit of boilerplate and fraught with many places where you can mess
up. While there may be a use case for it, it is probably much safer to just require a
typed array to be passed. Calling C<only-takes-str-array(Array[Str].new(…))> might be
annoying, but it helps to reinforce across the code base that we only want certain values.
Furthermore, IDEs and compilers are more likely to catch problems sooner with the typical
typed array/hash, than they are to catch C<where> clauses which by definition are checked
at runtime (maybe some day simple ones can be caught, but that's a long way off, and will
never be fully accurate).
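For completeness, here is a runnable sketch of the where-clause workaround with a typed copy. It uses a block-form C<where>, which is an assumption made here for clarity, and a plain assignment for the copy; C<@mixed> and C<$result> are hypothetical names.

```raku
sub only-strings(*@slurpy where { .all ~~ Str }) {
    my Str @typed = @slurpy;   # copy into a typed array for later checks
    @typed.join('-')
}

say only-strings('a', 'b', 'c');   # a-b-c

# A non-Str element fails the where clause when the sub is called.
my @mixed  = 'a', 2;
my $result = try only-strings(|@mixed);
say $result.defined;               # False
```

The check happens only at runtime, which is the weakness this section warns about.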
=head2 Summary
Don't use slurpies in strongly typed Raku. Just. Don't. Do. It.
=end pod
sdondley commented Jan 9, 2022

$foo = { * + 1 }; throws an error.
