This is a work-in-progress based on this class tutorial. It covers features that are not yet implemented.
perlclasstut - Object-Oriented Programming via the class keyword
This tutorial is a rough work-in-progress and only covers features of the new object syntax that are well-defined. The implementation is ongoing and while most of the basics are mapped out, there are some edge cases still being nailed down. We will not discuss those here.
With the release of Perl version X, the Perl language has added a new object system. Originally code-named Corinna, the new object system is a result of years of collaboration between the Corinna team and the Perl community to bring a modern, easy-to-use object system that still feels like Perl. If you're looking for information about Perl's original object system, see perlobj (and later, perlootut to see object systems built on top of the legacy system).
Note that the following assumes you understand Perl and have a beginner's knowledge of object-oriented programming. You should understand classes, instances, inheritance, and methods. We'll cover the rest. Also, this is only an introduction, not a full manual.
Further, for simplicity, we'll often refer to "object-oriented programming" as OOP.
The legacy object system in Perl remains and no code will be broken by this change.
There are many ways to describe object systems. Some focus on the implementation ("structs with behavior"), but we'll focus on the purpose. Objects are experts about a problem domain. You construct them with the information they need to do their job. For example, consider the case of an LRU cache. An LRU cache is a type of cache that keeps cache size down by deleting the least-recently-used cache entry. Let's construct a hypothetical cache:
my $cache = Cache::LRU->new( max_size => 20 );
In the above example, we will assume that max_size is the maximum number of entries in the cache. Adding a 21st unique entry will cause the "least recently used" entry to be ejected from the cache.
And then you can tell the object to do something, by calling "methods" on the object. Let's save an item in the cache and retrieve it.
$cache->set( $customer_id => $customer );
my $cached_customer = $cache->get($customer_id);
How does it work internally? You don't care. You should trust the object to do the right thing. Read the docs. That's the published interface.
In this tutorial, we'll build the Cache::LRU class so you can see how this works, but only after we have described a few fundamentals.
use feature 'class';
When you use the class feature, four new keywords are introduced into the current scope.
- class: Declare a class.
- method: Declare a method.
- field: Declare a field (data for the class).
- role: Declare a role.
Note the use of the word "declare" for all of those definitions. Use of the class feature allows a declarative way of writing OOP code in Perl. It's both concise and expressive. Because you're declaring your intent instead of manually wiring all of the bits together, there are fewer opportunities for bugs.
The general syntax for each of these keywords is:
KEYWORD IDENTIFIER MODIFIERS? DEFINITION?
For example:
class Employee :isa(Person) {
    ...
}
In the above, class is the KEYWORD, Employee is the IDENTIFIER (the unique name of the thing), :isa(Person) is an optional MODIFIER that assigns additional properties to the thing you've identified (in this case, Employee inherits from Person), and the postfix block is the DEFINITION of the class.
Note that, like the package declarator, class does not require a postfix block, even though we'll show some examples using it.
Also, modifiers are almost always regular Perl attributes, with an exception made for declaring the class version.
The class keyword declares a class and the namespace for that class. In future versions of Perl, it's possible we'll have private classes which are lexically bound, so do not make assumptions about the implementation.
Let's get started on our Cache::LRU class.
use feature 'class'; # from now on, this will be assumed
class Cache::LRU {}
# or
class Cache::LRU;
The above shows declaring a class and you can now make a new instance of it:
my $cache = Cache::LRU->new;
if ( $cache->isa('Cache::LRU') ) { # true
    ...
}
else {
    # we never get to here
}
Of course, that's all you can do. It's kinda useless, but we'll cover more in a bit.
Note that the new method is provided for you automatically. Do not declare your own new method in the class.
Any valid v-string may be used to declare the class version. This should be after the identifier:
class Cache::LRU v0.1;
my $cache = Cache::LRU->new;
say $cache->VERSION; # prints v0.1
Note: due to how the Perl grammar works, the version declaration must come before any attributes.
In OOP, sometimes you want a class to inherit from another class. This means that your class will extend the behavior of the parent class (er, that's the simple explanation. We'll keep it simple).
For example, a Cat might inherit from Mammal. In OOP, we often say that a Cat isa Mammal. You do this with the :isa(...) modifier.
class Cat :isa(Mammal);
Note that classes declared with class are single-inheritance only. As an alternative to multiple inheritance, we provide roles. More on that later.
In OOP, an abstract class is a class that cannot be instantiated. Instead, another class must inherit from the abstract class and provide the full functionality. In the "Inheritance" example above, the Mammal class might be abstract, so we declare it with the :abstract modifier.
class Mammal :abstract {
    ...
}
Any attempt to instantiate an abstract class is a fatal error.
my $mammal = Mammal->new; # boom
Methods declared with a forward declaration (i.e. any method whose name is declared, but without any corresponding code block) must be provided by a subclass, either via direct implementation or via a role. At present, forward declarations of methods do not take signatures, because more work is needed to make signatures introspectable.
class Mammal :abstract {
    method eat; # must be declared in a subclass at compile-time
}
Note that modifiers may not be duplicated, but the order in which they're specified does not matter.
class Mammal v1.0 :abstract :isa(Animalia);
class Mammal v1.0 :isa(Animalia) :abstract; # same thing
(With apologies to the biology fans who know that biological taxonomy is both misrepresented here and more complex than this simple hierarchy).
The field keyword allows you to create data storage for your class. You can create instance data and class data. This data is stored in normal Perl variables, but with special syntax to bind them to the class.
Classes are not very useful without data. In our Cache::LRU class, we have a max_size field to indicate how many cache entries we can have. Let's declare that field, provide a "reader" for that field, and a default value of 20.
Underneath the hood, we'll also use the Hash::Ordered module to provide the actual caching. Note that Hash::Ordered is written using legacy Perl, but you shouldn't (and don't) have to care about that.
class Cache::LRU {
    use Hash::Ordered;
    field $cache { Hash::Ordered->new };
    field $max_size :reader { 20 };
}
my $cache = Cache::LRU->new;
say $cache->max_size; # 20
In the above example, both $cache and $max_size are instance variables, which are unique to every instance of the class. They are never available outside the class. For each of them, we have an optional postfix block to assign a default value to those fields. If you omit the block, those fields will contain the value undef unless your class assigns a value to them.
Unlike Perl's legacy OOP system, you cannot use $cache->{cache}, $cache->{'$class'} or any other tricks to get at this data. It's completely encapsulated. However, in case of emergency, the meta-object protocol (MOP) will allow access to this data (but that's beyond the scope of this tutorial).
So how can we read the max_size data? Through the :reader attribute (also called a "modifier"). By default, the :reader modifier removes the $ sigil from the variable name and that becomes the name of a read-only method. So declaring field $foo :reader will create a foo method that will return the value contained in $foo. However, you can change the name of the method:
field $max_size :reader(max_entries);
Naturally, we provide a corresponding :writer modifier:
field $rank :reader :writer;
By default, the :writer modifier will prepend a set_ to the method name, so the above allows:
say $object->rank; # returns the value of $rank
$object->set_rank('General'); # sets the value of $rank.
Important: being able to mutate an object (i.e. change the values of its fields via writer methods) is often a dangerous thing, as other code using that object may have already made decisions or assumptions based on the previous value of that field. If that previous value is no longer valid, those decisions or assumptions may now be inconsistent or incorrect.
Each writer method returns its own invocant to allow chaining:
$object->set_rank('General')
->set_name('Toussaint Louverture');
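For comparison, here is a rough legacy-Perl sketch of the kind of accessors that :reader and :writer spare you from writing by hand. The Soldier package and its hash-based storage are hypothetical and only illustrate the "writer returns its invocant" convention; this is not how the class feature is implemented.

```perl
use strict;
use warnings;

# Hypothetical legacy-Perl class: hand-written stand-ins for what
# `field $rank :reader :writer;` and `field $name :reader :writer;` declare.
package Soldier {
    sub new {
        my ( $class, %args ) = @_;
        return bless { rank => $args{rank}, name => $args{name} }, $class;
    }

    # readers: return the stored value
    sub rank { $_[0]{rank} }
    sub name { $_[0]{name} }

    # writers: store the value, then return the invocant so calls can chain
    sub set_rank { my ( $self, $rank ) = @_; $self->{rank} = $rank; return $self }
    sub set_name { my ( $self, $name ) = @_; $self->{name} = $name; return $self }
}

my $object = Soldier->new;
$object->set_rank('General')
       ->set_name('Toussaint Louverture');

print $object->rank, ' ', $object->name, "\n";    # General Toussaint Louverture
```

Unlike real fields, the data in this sketch lives in a plain hash, so nothing stops outside code from reaching into $object->{rank} directly.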
Though it's discouraged, you can set the name of the writer to the same name as the reader:
field $rank :writer(rank) :reader;
This allows for a common Perl convention of creating a single reader/writer method by overloading the behaviour of the method based on whether or not it is passed an argument:
say $object->rank; # returns the value of $rank
$object->rank('General'); # sets the value of $rank.
Obviously, the rank method now does two entirely separate things, which can be confusing and error-prone, but this technique is ingrained in Perl OOP culture, so we support this edge case.
Having a default of 20 for max_size is useful, but we need to allow the programmer to say what the max size is. We do this with the :param modifier.
field $max_size :reader :param { 20 };
This tells the class that this value may be passed as a named parameter to the constructor.
my $cache = Cache::LRU->new( max_size => 100 );
say $cache->max_size; # 100
It's important to remember that every constructor parameter is required to be passed to the constructor if a default is not provided. Thus, if we have this:
class NamedPoint {
    field ( $x, $y ) :param :reader {0};
    field $name :param :reader;
}
The above would allow you to do any of these:
my $point = NamedPoint->new( name => 'Origin' );
my $point = NamedPoint->new( name => 'Origin', x => 3 );
my $point = NamedPoint->new( name => 'Origin', x => 3, y => 3.14 );
But not this:
my $point = NamedPoint->new( x => 23, y => 42 ); # Missing 'name' initializer
If a field is required, but not passed to the constructor, you will get a fatal runtime error.
Now that we know how to construct a basic object, we probably want to do things with it. To do that, we write methods. Methods use the method keyword instead of sub. They also take argument lists. Let's look at a "transposable" point class (i.e. X,Y --> Y,X).
class Point {
    field ( $x, $y ) :reader :param;
    method invert () {
        ( $x, $y ) = ( $y, $x );
    }
    method to_string () {
        return sprintf "(%d, %d)" => $x, $y;
    }
}
my $point = Point->new( x => 23, y => 42 );
say $point->to_string; # (23, 42)
$point->invert;
say $point->to_string; # (42, 23)
In the above, you can see that methods have direct access to field variables. However, they also have $self injected in them. So you could also write invert as follows:
method invert () {
    ( $x, $y ) = ( $self->y, $self->x );
}
However, method calls are not only slower than direct variable access, they're also more typing. Plus, if we don't use :reader for a given field, we have no method to call.
Putting all of this together, we get the following as a very basic Cache::LRU class:
use feature 'class';
class Cache::LRU {
    use Hash::Ordered;
    field $cache { Hash::Ordered->new };
    field $max_size :param :reader { 20 };
    method set( $key, $value ) {
        $cache->unshift( $key, $value ); # new values in front
        if ( $cache->keys > $max_size ) {
            $cache->pop; # drop the least recently used entry
        }
    }
    method get($key) {
        return unless $cache->exists($key);
        my $value = $cache->get($key);
        $cache->unshift( $key, $value ); # put it at the front
        return $value;
    }
}
With the above, we have a working LRU cache. It doesn't have a lot of features, but it shows you the core of writing OOP code with the class feature. We have a powerful, well-encapsulated declarative means of writing objects without having to wire together all of the various bits and pieces.
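To make the "wiring" concrete, here is a rough sketch of a comparable cache written in legacy Perl. The Legacy::Cache::LRU name, the _touch helper, and the hash-plus-array storage (standing in for Hash::Ordered so the sketch has no CPAN dependencies) are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical legacy-Perl counterpart of the Cache::LRU class above.
# A plain hash plus an array of keys (most recently used first) stands
# in for Hash::Ordered.
package Legacy::Cache::LRU {
    sub new {
        my ( $class, %args ) = @_;
        return bless {
            max_size => $args{max_size} // 20,    # hand-written default
            values   => {},
            order    => [],
        }, $class;
    }

    sub max_size { $_[0]{max_size} }              # hand-written reader

    # move (or add) a key to the front of the recency list
    sub _touch {
        my ( $self, $key ) = @_;
        @{ $self->{order} } = ( $key, grep { $_ ne $key } @{ $self->{order} } );
        return;
    }

    sub set {
        my ( $self, $key, $value ) = @_;
        $self->{values}{$key} = $value;
        $self->_touch($key);
        if ( @{ $self->{order} } > $self->{max_size} ) {
            my $evicted = pop @{ $self->{order} };    # least recently used
            delete $self->{values}{$evicted};
        }
        return;
    }

    sub get {
        my ( $self, $key ) = @_;
        return unless exists $self->{values}{$key};
        $self->_touch($key);
        return $self->{values}{$key};
    }
}

my $lru = Legacy::Cache::LRU->new( max_size => 2 );
$lru->set( a => 1 );
$lru->set( b => 2 );
$lru->get('a');         # 'a' is now the most recently used entry
$lru->set( c => 3 );    # over the limit: evicts 'b'
print defined $lru->get('b') ? "b kept\n" : "b evicted\n";    # b evicted
```

Everything the declarative version gets for free (the constructor, the default, the reader, and the encapsulation) has to be written and maintained by hand here, and the internal hash is still reachable from outside the class.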
The new class syntax only provides for single inheritance. Sometimes you need additional behavior that you would like to "transparently" provide. For example, you might want two or more unrelated classes to be able to serialize themselves to JSON, even though each class itself has nothing to do with JSON. Let's do that with our Cache::LRU class.
To provide functionality shared across unrelated classes, we use the role keyword. A role is similar to a class, but it cannot be instantiated. Instead, it is "consumed" by a class and the class provides the specifics of the role behavior. Roles can both provide methods and require methods. For our JSON role, it might look like this:
use feature 'class';
role Role::Serializable::JSON {
    use JSON::PP 'encode_json'; # provided in core Perl since v5.13.9
    method to_hash; # forward declaration: the class must provide this
    method to_json () {
        encode_json( $self->to_hash );
    }
}
And you can use this in your class with the :does attribute.
class Cache::LRU :does(Role::Serializable::JSON) {
    ...
}
But our class fails at compile-time because it doesn't have a to_hash method. So let's write one.
class Cache::LRU v0.1.0 :does(Role::Serializable::JSON) {
    use Hash::Ordered;
    use Carp 'croak';
    field $cache { Hash::Ordered->new };
    field $max_size :param :reader { 20 };
    method set ( $key, $value ) {...}
    method get($key) {...}
    method to_hash () {
        my %entries;
        foreach my $key ($cache->keys) {
            my $value = $cache->get($key);
            my $ref = defined $value ? (ref $value || 'SCALAR') : 'UNDEF';
            $entries{$key} = $ref;
        }
        return {
            max_size => $max_size,
            entries => \%entries,
        }
    }
}
In the above, the method to_hash; forward declaration defines a method that the Role::Serializable::JSON role requires the consuming class to provide. It can do so by either having the method defined in the class or consuming it from another role.
The method to_json provided by the role will be "flattened" into the Cache::LRU class almost as if it had been written there. However, fields defined in the class are always lexically scoped (like a my or state variable) and so are not directly accessible to the role method.
With that, we can do this:
my $cache = Cache::LRU->new(max_size => 5);
$cache->set( first => undef );
$cache->set( second => 'bob' );
$cache->set( third => { foo => 'bar' } );
say $cache->to_json;
And we should get output similar to the following:
{"max_size":5,"entries":{"first":"UNDEF","third":"HASH","second":"SCALAR"}}
You can also consume multiple roles:
class Foo :does(Role1) :does(Role2) {
    ...
}
Roles may declare fields, but those field variables are private to that role. This protects against the case where a class and a role might both define field $x.
If any method defined directly in the class has the same name as a method provided by a role, a compile-time error will result. If two roles have duplicate method names, this will also cause a compile-time failure if they're consumed together. Traditionally, roles have syntax for "excluding" or "aliasing" methods, but this is not (yet) provided by the new mechanism. In practice, we find this is rarely an issue, but as roles are more widely shared, this will need to be addressed.
As a workaround, you can create a new object that consumes the role and store that object in a field, or you can use interstitial base classes that consume the role. Neither solution is great.
You can declare arrays and hashes as fields, with or without defaults:
field @colors { qw/green yellow red/ };
field %seen;
However, array and hash fields cannot have modifiers:
field %seen :reader; # compile-time error
field @array :param; # compile-time error
field %hash :writer; # compile-time error
Class data and methods are shared by all instances of a given class. They are declared with the :common attribute. For example, let's say you're making a game and you only allow 20 point objects to be created. How do you track how many are created? You don't. That's the responsibility of the class. Let's use class data for this.
class Point {
    field $num_points :common :reader { 0 }; # all instances share this
    field ( $x, $y ) :param :reader;
    ADJUST {
        $num_points++;
        if ( $num_points > 20 ) {
            die "No more than 20 points may be created at any time";
        }
    }
    DESTRUCT { $num_points-- }
}
In the above, ADJUST is a phaser (like BEGIN or END), which is called every time a class is instantiated. You can have multiple ADJUST phasers and they are called in the order declared. So you could also write the above ADJUST as follows:
ADJUST { $num_points++ }
ADJUST {
    if ( $num_points > 20 ) {
        die "No more than 20 points may be created at any time";
    }
}
The DESTRUCT phaser behaves similarly to ADJUST, but only fires when the reference count of the object drops to zero (typically when the last variable holding the object goes out of scope).
We can now do this:
say Point->num_points; # 0
my $point1 = Point->new( x => 2, y => 4 );
say Point->num_points; # 1
# or
say $point1->num_points; # 1
my $point2 = Point->new( x => 3, y => 5 ); # x and y have no defaults, so both are required
say Point->num_points; # 2
undef $point1; # triggers DESTRUCT
say Point->num_points; # 1 ($point1 is now undef, so call it on the class)
There's a lot more to say about ADJUST and DESTRUCT, but some of the finer points are still being nailed down.
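For readers who know the legacy system, class data of this kind is commonly approximated with a lexical variable shared by the class's subs, with DESTROY playing the part of the destruction hook. The sketch below assumes a hypothetical Legacy::Point package and is only an analogy for the behaviour described above, not a description of how :common or DESTRUCT are implemented.

```perl
use strict;
use warnings;

# Hypothetical legacy-Perl sketch of the counting Point class.
package Legacy::Point {
    my $num_points = 0;    # lexical private to this block, shared by every instance

    sub num_points { $num_points }    # callable on the class or on an instance

    sub new {
        my ( $class, %args ) = @_;
        die "No more than 20 points may be created at any time"
            if $num_points >= 20;
        $num_points++;
        return bless { x => $args{x}, y => $args{y} }, $class;
    }

    # legacy destruction hook: runs when the last reference goes away
    sub DESTROY { $num_points-- }
}

print Legacy::Point->num_points, "\n";    # 0
my $p1 = Legacy::Point->new( x => 2, y => 4 );
my $p2 = Legacy::Point->new( x => 3, y => 5 );
print $p1->num_points, "\n";              # 2
undef $p1;                                # last reference gone: DESTROY runs
print Legacy::Point->num_points, "\n";    # 1
```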
In Moose, you can declare attributes like this:
has limit => (
    is  => 'rw',
    isa => 'Int',
);
With that, you cannot pass anything but an integer to the constructor, nor can you later do $object->limit('unlimited'). Sadly, we do not have this at the present time for the class syntax, but there is a workaround: Types::Standard and ADJUST.
Note that this workaround is only safe for immutable objects. Mutable objects will (for the time being) have to jump through more hoops to ensure type safety.
The following trivial example shows the potential, but obviously, there's a lot more you could do with Types::Standard to make this more robust.
class Point {
    use Types::Standard qw(is_Int);
    field ( $x, $y ) :reader :param;
    ADJUST {
        my @errors;
        is_Int($x) or push @errors => "x must be an integer, not $x.";
        is_Int($y) or push @errors => "y must be an integer, not $y.";
        if (@errors) {
            die join ', ' => @errors;
        }
    }
}
With the above, you can guarantee that your Point object only has integer values for $x and $y.
use feature 'class';
role Role::Serializable::JSON {
    use JSON::PP 'encode_json'; # provided in core Perl since v5.13.9
    method to_hash; # the class must provide this
    method to_json () {
        encode_json($self->to_hash);
    }
}
class Cache::LRU v0.1.0 :does(Role::Serializable::JSON) {
    use Hash::Ordered;
    use Carp 'croak';
    field $num_caches :common :reader { 0 };
    field $cache { Hash::Ordered->new };
    field $max_size :param :reader { 20 };
    field $created :reader { time };
    ADJUST { # called after new()
        $num_caches++;
        if ( $max_size < 1 ) {
            croak(...);
        }
    }
    DESTRUCT { $num_caches-- }
    method set( $key, $value ) {
        $cache->unshift( $key, $value ); # new values in front
        if ( $cache->keys > $max_size ) {
            $cache->pop; # drop the least recently used entry
        }
    }
    method get($key) {
        return unless $cache->exists($key);
        my $value = $cache->get($key);
        $cache->unshift( $key, $value ); # put it at the front
        return $value;
    }
    method to_hash () {
        my %entries;
        foreach my $key ($cache->keys) {
            my $value = $cache->get($key);
            my $ref = defined $value ? (ref $value || 'SCALAR') : 'UNDEF';
            $entries{$key} = $ref;
        }
        return {
            max_size => $max_size,
            entries => \%entries,
            created => $created,
            num_caches => $num_caches,
        }
    }
}
I hope you've enjoyed this far-too-brief introduction to the new class keyword. This has been the result of years of design effort from the Corinna design team and the Perl community at large.
This work is dedicated to the memory of Jeff Goff and David Adler, two prominent members of the Perl community who were wonderful people and left this life far too soon.
Paul "LeoNerd" Evans and Damian Conway both were kind enough to help with some of my silly mistakes.