Skip to content

Instantly share code, notes, and snippets.

@dagolden
Created October 27, 2010 10:15
Show Gist options
  • Save dagolden/648782 to your computer and use it in GitHub Desktop.
Save dagolden/648782 to your computer and use it in GitHub Desktop.
Proposed commit message for push/pop/keys/etc patch
Allow push/pop/keys/etc to act on references
This patch updates all functions that operate directly on array or hash
containers to also accept references to arrays or hashes:
push $arrayref, @stuff;
unshift $arrayref, @stuff;
pop $arrayref;
shift $arrayref;
splice $arrayref, 0, 2;
keys $hashref;
values $hashref
($k,$v) = each $hashref
Because this patch is a substantial new feature and will likely merit
some discussion, I have organized this commit message as follows:
* Summary of benefits
* Summary of the patch
* Rationale
_Summary of benefits_
As I see it, the primary benefit is visually de-cluttering common
operations in situations where the context (acting on a "container") is
clear.
As a side effect of the implementation of this change, calling
keys/values directly on a reference roughly doubles performance.
keys $hashref; # ~twice as fast as "keys %$hashref"
(Prior to 5.12, keys %$hashref was similarly fast, but the optimization
was lost when keys/values added support for arrays)
I suspect that, if accepted, similar optimizations might be possible for
push/pop/etc ops.
_Summary of the patch_
For push/pop/shift/unshift/splice ops, I added a new 'ck_push' check
routine. If the child op is not an array or array dereference, the child
op is wrapped in an array dereference.
For keys/value/each, I added three new opcodes to handle the case where
the argument is not a hash or array (or explicit dereference of them).
All three ops are handled by one function that dereferences the argument
(including any magic) and then dispatches to the corresponding hash or array
implementation of the original function. If the argument is not a hash or
array, a runtime error is thrown. If there is ambiguous magic, %{}
magic is preferred over @{} (and explicit dereferencing is always a
fallback for such overloaded objects).
For all affected built-ins, the prototype has been changed to use the new
'single-term' prototype '+'. E.g. for push, the prototype is "+@".
Documentation and tests are included.
_Rationale_
Perl is a language that give great respect to the idea of context. I
use that term below in the linguistic sense in which the pattern of
usage guides interpretation (rather than in the programming sense of
"scalar" or "list" context).
There are a number of built-in functions that act directly on
"containers" (i.e. hashes and arrays) and that pass their first argument
by reference rather than flattening the container into a list. E.g.
because of the linguistic context of a function like "push" in
"push @array, @list", perl passes @array by reference.
In various venues and forums, I've heard people express the desire for
perl do the "obvious thing" when providing a reference directly to such
functions that act on containers. E.g. "push $arrayref, @list". The
linguistic context provided by "push" is exactly the same -- it is clear
that the goal is to act on the container itself.
In contrast, one can't reach the same conclusion about, for example,
"foo($_) for $arrayref". The "for" doesn't give any clear linguistic
context for whether $arrayref is intended to be flattened or used as a
single element list aliased to $_.
Therefore, this patch allows perl to use a reference to a container
directly for built in functions only where the linguistic context *already*
allows perl to pass the container by reference instead of flattening it.
In discussing this patch or variations of it on email and IRC, I've
heard lots of praise for the idea. The justifications I've seen are
various combinations of "easier to read", "less to type" and/or "less
annoying".
Those who dislike the idea of this patch generally seem to be of the
opinion that dereferencing "should be explicit" or that the patch
"isn't necessary" (which may just be a variation of the first
objection).
To the first objection, I say that this is a matter of style and
reasonable people already disagree on the "best" Perl style. This patch
delivers a desired feature to those who do want it without interfering
with the ability of others to explicitly dereference everything as a
matter of personal (or corporate) style.
I dismiss the second objection out of hand. By its very nature as a
community-driven project, nothing in the development of Perl is
"necessary" and the very definition of "necessary" will vary from person
to person. In this case, the work is desired and it is already done.
I also note that I view this patch as orthogonal to other proposals
regarding post-fix dereferencing. This patch scratches a lot of my
personal itch for post-fix dereferencing, but it may still be of value
to others if people should ever agree on a good style for it.
As mentioned in the summary, the changes to keys/values/each to accept
references happened to lead to a significant performance improvement,
recovering some of what was lost when 5.12 changed the implementation to
allow keys/values/each to act on arrays in the first place.
The changes to push/pop/etc did not require the same degree of changes
(i.e. I didn't add new opcodes for them) and I kept the patch as simple as
possible. However, if the patch is accepted, I suspect that the same
technique that allows optimization of "keys $hashref" could be explored
later to optimize "push $arrayref, @list" over "push @$arrayref, @list".
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment