Skip to content

Instantly share code, notes, and snippets.

@x-yuri
Last active November 14, 2024 16:39
Show Gist options
  • Save x-yuri/50c4a9ce309ac41fe3e81798b0076745 to your computer and use it in GitHub Desktop.
Save x-yuri/50c4a9ce309ac41fe3e81798b0076745 to your computer and use it in GitHub Desktop.
perl: pack()

perl: pack()

I'm going to ignore the j, J, f, F, d, D, p, P, u, w template characters.

I'm running this on a system with x86_64 architecture (little endian) with perl-5.38-2, sizeof(int) == 4.

pack TEMPLATE, LIST takes LIST and packs it into a string according to TEMPLATE. E.g. pack('ac', 'a', 1) returns "a\x01". There are two template charaters here: a and c. Each template character tells pack() what to do with the next argument: a to take one character from the first argument ('a'), and put it into the resulting string, c to take the next argument (the 8-bit signed integer 1), and add chr(1) to the resulting string.

Template characters differ in what values they take:

  • a, A, Z, b, B, h, H (string template characters) take characters from a string (pack('a', 'a'))
  • c, C, W, s, S, l, L, q, Q, i, I, n, N, v, V, U (integer template characters) take integers (pack('c', 1))
  • x, X, @ take nothing
  • . takes an integer (pack('.', 1)), but it's rather similar to the previous group in what it does

Template characters can be followed by a repeat count, which tells pack() how many values a template character takes:

  • string template characters take values from one argument (pack('a2', 'aa') == 'aa'), but each next template character takes values from the next argument (pack('aa', 'a', 'a') == "aa")
  • for integer template characters each argument is a value (pack('c2', 1, 1) == "\x01\x01")

Some even say, that repeat count is length (string length) in the case of string template characters.

Then:

  • for a, A, Z a value is a character
  • for b, B, h, H a value is a string representation of a digit
  • a, Z pad values (if there are not enough characters) with \x00 (pack('a2', 'a') == "a\x00"), A pads with spaces (pack('A2', 'a') == 'a ')
  • Z is like a, but takes one character less and adds \x00 (pack('Z2', 'a') == "a\x00")
  • in the case of b, B a digit is 0..1, for h, H it's 0..f
  • b, B fill the resulting string with bits, h, H with nybbles
  • b, h start with LSB/low nybble (pack('b', '1') == "\x01"), B, H with MSB/high nybble (pack('B', '1') == "\x80")
  • c takes an integer in the range -128..127, C in the range 0..255, W in the Unicode range (0..0x10ffff), each argument produces one character in the resulting string (pack('c', 1) == "\x01")
  • s takes an integer in the range -0x8000..0x7fff and produces 2 characters in the system's native byte order (pack('s', 1) == "\x01\x00" in the case of a little endian system)
  • S takes an integer in the range 0..0xffff and produces 2 characters in the system's native byte order
  • l takes an integer in the range -0x8000_0000..0x7fff_ffff and produces 4 characters in the system's native byte order (pack('l', 1) == "\x01\x00\x00\x00" in the case of a little endian system)
  • L takes an integer in the range 0..0xffff_ffff and produces 4 characters in the system's native byte order
  • q takes an integer in the range -0x8000_0000_0000_0000..0x7fff_ffff_ffff_ffff and produces 8 characters in the system's native byte order (pack('q', 1) == "\x01\x00\x00\x00\x00\x00\x00\x00" in the case of a little endian system)
  • Q takes an integer in the range 0..0xffff_ffff_ffff_ffff and produces 8 characters in the system's native byte order
  • the range and the number of produced characters is system-dependent for i and I, in my case they're -0x8000_0000..0x7fff_ffff/4 (l) and 0..0xffff_ffff/4 (L) respectively
  • n takes an integer in the range 0..0xffff and produces 2 characters in the big-endian byte order (pack('n', 1) == "\x00\x01")
  • N takes an integer in the range 0..0xffff_ffff and produces 4 characters in the big-endian byte order (pack('N', 1) == "\x00\x00\x00\x01")
  • v takes an integer in the range 0..0xffff and produces 2 characters in the little-endian byte order (see S in the case of a little endian system)
  • V takes an integer in the range 0..0xffff_ffff and produces 4 characters in the little-endian byte order (see L in the case of a little endian system)

To put it briefly:

  • a a character of a null-padded string
  • A a character of a space-padded string
  • Z a character of a null-terminated string (null-padded)
  • b a bit of a string (LSB first)
  • B a bit of a string (MSB first)
  • h a nybble of a string (low nybble first)
  • H a nybble of a string (high nybble first)
  • c signed char (8-bit)
  • C unsigned char (8-bit)
  • W a Unicode character
  • s signed short (16-bit)
  • S unsigned short (16-bit)
  • l signed long (32-bit)
  • L unsigned long (32-bit)
  • q signed quad (64-bit)
  • Q unsigned quad (64-bit)
  • i signed int (native)
  • I unsigned int (native)
  • n unsigned short (16-bit, big endian)
  • N unsigned long (32-bit, big endian)
  • v unsigned short (16-bit, little endian)
  • V unsigned long (32-bit, little endian)
  • U UTF-8 representation of a Unicode code point

By default pack() operates in C0 (character) mode. In this mode values are added to the resulting string as characters:

  • pack('a', 'a') == 'a'
  • pack('A', 'a') == 'a'
  • pack('Z2', 'a') == "a\x00"
  • pack('b8', '11111111') == "\xff" (a value is added as soon as there's a byte)
  • pack('B8', '11111111') == "\xff"
  • pack('h2', '11') == "\x11"
  • pack('H2', '11') == "\x11"
  • pack('c', 1) == "\x01"
  • pack('C', 1) == "\x01"
  • pack('W', 1) == "\x01"
  • pack('s', 1) == "\x01\x00" (each byte is added as a separate character)
  • pack('S', 1) == "\x01\x00"
  • pack('l', 1) == "\x01\x00\x00\x00"
  • pack('L', 1) == "\x01\x00\x00\x00"
  • pack('q', 1) == "\x01\x00\x00\x00\x00\x00\x00\x00"
  • pack('Q', 1) == "\x01\x00\x00\x00\x00\x00\x00\x00"
  • pack('i', 1) == "\x01\x00\x00\x00" (i == l in my case)
  • pack('I', 1) == "\x01\x00\x00\x00" (I == L in my case)
  • pack('n', 1) == "\x00\x01"
  • pack('N', 1) == "\x00\x00\x00\x01"
  • pack('v', 1) == "\x01\x00" (v == S in the case of a little endian system)
  • pack('V', 1) == "\x01\x00\x00\x00" (V == L in the case of a little endian system)

In U0 (UTF-8 byte) mode, which is turned on with, well, U0 (pack('U0...', ...)), values are added to a sequence of bytes. Before returning from pack() the sequence of bytes is typecasted or becomes the resulting string (or so it looks). As such the resulting sequence of bytes should be valid UTF-8. Also in this mode the ranges of a, A, Z and W reduced to 0..0xff:

  • pack('U0a2', "\xdf\xbf") == "\x{7ff}" ("\xdf\xbf" is the UTF-8 representation of "\x{7ff}")
  • pack('U0A2', "\xdf\xbf") == "\x{7ff}"
  • pack('U0Z3', "\xdf\xbf") == "\x{7ff}\x00"
  • pack('U0b16', '1111' . '1011' . '1111' . '1101') == "\x{7ff}" (what b takes in hex is 'fbfd')
  • pack('U0B16', '1101' . '1111' . '1011' . '1111') == "\x{7ff}" (waht B takes in hex is 'dfbf')
  • pack('U0h4', 'fdfb') == "\x{7ff}"
  • pack('U0H4', 'dfbf') == "\x{7ff}"
  • pack('U0c2', 0xdf - 0x100, 0xbf - 0x100) == "\x{7ff}" (the c range is -0x80..0x7f, so we need to adjust the values)
  • pack('U0C2', 0xdf, 0xbf) == "\x{7ff}"
  • pack('U0W2', 0xdf, 0xbf) == "\x{7ff}"
  • pack('U0s', 0xbfdf - 0x10000) == "\x{7ff}"
  • pack('U0S', 0xbfdf) == "\x{7ff}"
  • pack('U0l', 0xbfdf) == "\x{7ff}\x00\x00"
  • pack('U0L', 0xbfdf) == "\x{7ff}\x00\x00"
  • pack('U0q', 0xbfdf) == "\x{7ff}\x00\x00\x00\x00\x00\x00"
  • pack('U0Q', 0xbfdf) == "\x{7ff}\x00\x00\x00\x00\x00\x00"
  • pack('U0i', 0xbfdf) == "\x{7ff}\x00\x00" (i == l in my case)
  • pack('U0I', 0xbfdf) == "\x{7ff}\x00\x00" (I == L in my case)
  • pack('U0n', 0xdfbf) == "\x{7ff}"
  • pack('U0N', 0xdfbf) == "\x00\x00\x{7ff}"
  • pack('U0v', 0xbfdf) == "\x{7ff}" (v == S in the case of a little endian system)
  • pack('U0V', 0xbfdf) == "\x{7ff}\x00\x00" (V == L in the case of a little endian system)
  • pack('U0U', 0x7ff) == "\x{7ff}" (U adds to the sequence of bytes the UTF-8 representation of its argument)

Do note that the sequence of bytes doesn't have to be valid UTF-8 at any intermediate step (pack('U0aXac', "\x80", "\xdf", 0xbf - 0x100) == "\x{7ff}", X erases the last byte).

In addition to turning on the U0 mode explicitly, it's turned on implicitly when TEMPLATE starts with U (pack('Ua2', 0x7ff, "\xdf\xbf") == "\x{7ff}\x{7ff}"). In C0 mode U produces UTF-8 representation of its argument (pack('C0U', 0x7ff) == "\xdf\xbf", pack('aU', "\x80", 0x7ff) == "\x80\xdf\xbf"). You can always switch the mode midway explicitly (pack('...U0...C0...', ...)).

Or in other words, generally W and U take a code point and produce a character (pack('W', 1) == "\x01", pack('U', 1) == "\x01"). But in U0 mode W takes UTF-8 representation and produces a character (pack('U0W2', 0xdf, 0xbf) == "\x{7ff}"), and in C0 mode U takes a code point and produces UTF-8 representation (pack('C0U', 0x7ff) == "\xdf\xbf").

x produces a null (pack('x') == "\x00").

X takes a step back, removing the characters in the process (pack('aX', 'a') == '').

@ moves the current position in the resulting string, truncating or null-filling it in the process (pack('a@0', 'a') == '', pack('@1') == "\x00"). The repeat count is an absolute position counted from the beginning of the resulting string.

. is like @ that takes its argument not from a repeat count (pack('a.', 'a', 0) == '', pack('.', 1) == "\x00").

Template characters might be grouped with parenthesis. In this case the @/.'s arguments are counted from the start of the innermost group (pack('a(a@0)', 'a', 'a') == 'a').

To add to the brief list:

  • x a null
  • X a step back
  • @ moving the cursor
  • . moving the cursor
$ docker run --rm -itv "$PWD":/host alpine:3.20
/ # apk add perl perl-test2-suite perl-utils
/ # for f in host/.*.pl host/*.pl; prove "$f" || break; done

perlpacktut
pack
unpack
Pack/Unpack Tutorial

What follows is my experiments:

use Test2::V0;
# produces \x00 if not enough arguments
is pack('a'), "\x00", 'a: no args';
is pack('a', ''), "\x00", 'a: empty arg';
# takes a character
is pack('a', "\x00"), "\x00", 'a: \x00';
is pack('a', "\x01"), "\x01", 'a: non-printable';
is pack('a', 'a'), 'a', 'a: ascii';
is pack('a', '雪'), '雪', 'a: non-ascii';
# ignores extra characters
is pack('a', 'aa'), 'a', 'a: extra char';
# pads with \x00
is pack('a2', 'a'), "a\x00", 'a: padding';
# takes characters only from one argument
is pack('a*', 'a', 'a'), 'a', 'a: extra arg';
# a* takes all the characters
is pack('a*', 'aa'), 'aa', 'a*';
done_testing;
use Test2::V0;
# produces a space if not enough arguments
is pack('A'), ' ', 'A: no args';
is pack('A', ''), ' ', 'A: empty arg';
# takes a character
is pack('A', "\x00"), "\x00", 'A: \x00';
is pack('A', "\x01"), "\x01", 'A: non-printable';
is pack('A', 'a'), 'a', 'A: ascii';
is pack('A', '雪'), '雪', 'A: non-ascii';
# ignores extra characters
is pack('A', 'aa'), 'a', 'A: extra char';
# pads with spaces
is pack('A2', 'a'), 'a ', 'A: padding';
# takes characters only from one argument
is pack('A*', 'a', 'a'), 'a', 'A: extra arg';
# A* takes all the characters
is pack('A*', 'aa'), 'aa', 'A*';
done_testing;
use Test2::V0;
# always produces \x00
is pack('Z'), "\x00", 'Z: no arg';
is pack('Z', ''), "\x00", 'Z: empty arg';
# takes a character and adds \x00
is pack('Z2', "\x00"), "\x00\x00", 'Z: \x00';
is pack('Z2', "\x01"), "\x01\x00", 'Z: non-printable';
is pack('Z2', 'a'), "a\x00", 'Z: ascii';
is pack('Z2', '雪'), "雪\x00", 'Z: non-ascii';
# ignores extra characters
is pack('Z2', 'aa'), "a\x00", 'Z: extra char';
# pads with \x00
is pack('Z3', 'a'), "a\x00\x00", 'Z: padding';
# takes characters only from one argument
is pack('Z*', 'a', 'a'), "a\x00", 'Z: extra arg';
# Z* takes all the characters and adds \x00
is pack('Z*', 'aa'), "aa\x00", 'Z*';
done_testing;
use Test2::V0;
# produces \x00 if not enough arguments
is pack('b'), "\x00", 'b: no args';
is pack('b', ''), "\x00", 'b: empty arg';
# takes a digit and adds it to LSB
is pack('b', '0'), "\x00", 'b: 0';
is pack('b', '1'), "\x01", 'b: 1';
# takes chr(x) as 1 if x & 1 else as 0
# is pack('b', $_),
# ord($_) & 1 ? "\x01" : "\x00",
# 'b: ' . sprintf '0x%04x', ord $_
# for map chr, 0..65535;
# ignores extra digits
is pack('b', '11'), "\x01", 'b: extra digit';
# takes digits only from one argument
is pack('b*', '1', '1'), "\x01", 'b: extra arg';
# b* takes all the digits
is pack('b*', '11'), "\x03", 'b*';
done_testing;
use Test2::V0;
# produces \x00 if not enough arguments
is pack('B'), "\x00", 'B: no args';
is pack('B', ''), "\x00", 'B: empty arg';
# takes a digit and adds it to MSB
is pack('B', '0'), "\x00", 'B: 0';
is pack('B', '1'), "\x80", 'B: 1';
# takes chr(x) as 1 if x & 1 else as 0
# is pack('B', $_),
# ord($_) & 1 ? "\x80" : "\x00",
# 'B: ' . sprintf '0x%04x', ord $_
# for map chr, 0..65535;
# ignores extra digits
is pack('B', '11'), "\x80", 'B: extra digit';
# takes digits only from one argument
is pack('B*', '1', '1'), "\x80", 'B: extra arg';
# b* takes all the digits
is pack('B*', '11'), "\xc0", 'B*';
done_testing;
use Test2::V0;
# produces \x00 if not enough arguments
is pack('h'), "\x00", 'h: no args';
is pack('h', ''), "\x00", 'h: empty arg';
# takes a digit and adds it to the low nybble
is pack('h', sprintf '%x', $_),
chr($_),
'h: '. sprintf '%x', $_
for 0..15;
# takes chr(x) as x % 16, but there are 2 special cases
# is pack('h', $_),
# chr(
# ord('a') <= ord($_) && ord($_) <= ord('z') ? (ord($_) - ord('a') + 10) % 16
# : ord('A') <= ord($_) && ord($_) <= ord('Z') ? (ord($_) - ord('A') + 10) % 16
# : ord($_) % 16),
# 'h: '. sprintf '0x%04x', ord $_
# for map chr, 0..65535;
# ignores extra digits
is pack('h', '11'), "\x01", 'h: extra digit';
# takes digits only from one argument
is pack('h*', '1', '1'), "\x01", 'h: extra arg';
# h* takes all the digits
is pack('h*', '11'), "\x11", 'h*';
done_testing;
use Test2::V0;
# produces \x00 if not enough arguments
is pack('H'), "\x00", 'H: no args';
is pack('H', ''), "\x00", 'H: empty arg';
# takes a digit and adds it to the high nybble
is pack('H', sprintf '%x', $_),
chr($_ << 4),
'H: '. sprintf '%x', $_
for 0..15;
# takes chr(x) as x % 16, but there are 2 special cases
# is pack('H', $_),
# chr((
# ord('a') <= ord($_) && ord($_) <= ord('z') ? (ord($_) - ord('a') + 10) % 16
# : ord('A') <= ord($_) && ord($_) <= ord('Z') ? (ord($_) - ord('A') + 10) % 16
# : ord($_) % 16
# ) << 4),
# 'H: '. sprintf '0x%04x', ord $_
# for map chr, 0..65535;
# ignores extra digits
is pack('H', '11'), "\x10", 'H: extra digit';
# takes digits only from one argument
is pack('H*', '1', '1'), "\x10", 'H: extra arg';
# H* takes all the digits
is pack('H*', '11'), "\x11", 'H*';
done_testing;
use Test2::V0;
# produces \x00 if no args
is pack('c'), "\x00", 'c: no args';
# takes a signed char
is pack('c', $_), chr($_ & 0xff), "c: $_" for -0x80 + int rand 0x80;
is pack('c', $_), chr($_), "c: $_" for int rand 0x80;
# c* takes all the args
is pack('c*', 1, 1), "\x01\x01", 'c*';
done_testing;
use Test2::V0;
# produces \x00 if no args
is pack('C'), "\x00", 'C: no args';
# takes an unsigned char
is pack('C', $_), chr($_), "C: $_" for int rand 0x100;
# C* takes all the args
is pack('C*', 1, 1), "\x01\x01", 'C*';
# c(-1) == C(255)
is pack('c', -1), pack('C', 255), 'c vs C';
done_testing;
use Test2::V0;
# produces \x00 if no args
is pack('W'), "\x00", 'W: no args';
# takes a code point
is pack('W', $_), chr($_), "W: $_" for int rand 0x110000;
# W* takes all the args
is pack('W*', 1, 1), "\x01\x01", 'W*';
done_testing;
use Test2::V0;
# produces \x00\x00 if no args
is pack('s'), "\x00\x00", 's: no args';
# takes a signed short
is pack('s', $_),
chr($_ & 0xff)
. chr($_ >> 8 & 0xff),
"s: $_"
for -0x8000 + int rand 0x8000;
is pack('s', $_),
chr($_ & 0xff)
. chr($_ >> 8 ),
"s: $_"
for int rand 0x8000;
# s* takes all the args
is pack('s*', 1, 1), "\x01\x00\x01\x00", 's*';
done_testing;
use Test2::V0;
# produces \x00\x00 if no args
is pack('S'), "\x00\x00", 'S: no args';
# takes an unsigned short
is pack('S', $_),
chr($_ & 0xff)
. chr($_ >> 8 ),
"S: $_"
for int rand 0x10000;
# S* takes all the args
is pack('S*', 1, 1), "\x01\x00\x01\x00", 'S*';
# s(-1) == S(0xffff)
is pack('s', -1), pack('S', 0xffff), 's vs S';
done_testing;
use Test2::V0;
# produces \x00\x00\x00\x00 if no args
is pack('l'), "\x00\x00\x00\x00", 'l: no args';
# takes a signed long
is pack('l', $_),
chr($_ & 0xff)
. chr($_ >> 8 & 0xff)
. chr($_ >> 16 & 0xff)
. chr($_ >> 24 & 0xff),
"l: $_"
for -0x8000_0000 + int rand 0x8000_0000;
is pack('l', $_),
chr($_ & 0xff)
. chr($_ >> 8 & 0xff)
. chr($_ >> 16 & 0xff)
. chr($_ >> 24 ),
"l: $_"
for int rand 0x8000_0000;
# l* takes all the args
is pack('l*', 1, 1), "\x01\x00\x00\x00\x01\x00\x00\x00", 'l*';
done_testing;
use Test2::V0;
# produces \x00\x00\x00\x00 if no args
is pack('L'), "\x00\x00\x00\x00", 'L: no args';
# takes an unsigned long
is pack('L', $_),
chr($_ & 0xff)
. chr($_ >> 8 & 0xff)
. chr($_ >> 16 & 0xff)
. chr($_ >> 24 ),
"L: $_"
for int rand 1 << 32;
# L* takes all the args
is pack('L*', 1, 1), "\x01\x00\x00\x00\x01\x00\x00\x00", 'L*';
# l(-1) == L(0xffff_ffff)
is pack('l', -1), pack('L', 0xffff_ffff), 'l vs L';
done_testing;
use Test2::V0;
# produces \x00\x00\x00\x00\x00\x00\x00\x00 if no args
is pack('q'), "\x00\x00\x00\x00\x00\x00\x00\x00", 'q: no args';
# takes a signed quad
is pack('q', $_),
chr($_ & 0xff)
. chr($_ >> 8 & 0xff)
. chr($_ >> 16 & 0xff)
. chr($_ >> 24 & 0xff)
. chr($_ >> 32 & 0xff)
. chr($_ >> 40 & 0xff)
. chr($_ >> 48 & 0xff)
. chr($_ >> 56 ),
"q: $_"
for -(1 << 63) + int rand 1 << 63;
is pack('q', $_),
chr($_ & 0xff)
. chr($_ >> 8 & 0xff)
. chr($_ >> 16 & 0xff)
. chr($_ >> 24 & 0xff)
. chr($_ >> 32 & 0xff)
. chr($_ >> 40 & 0xff)
. chr($_ >> 48 & 0xff)
. chr($_ >> 56 ),
"q: $_"
for int rand 1 << 63;
# q* takes all the args
is pack('q*', 1, 1), "\x01\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00", 'q*';
done_testing;
use Test2::V0;
# produces \x00\x00\x00\x00\x00\x00\x00\x00 if no args
is pack('Q'), "\x00\x00\x00\x00\x00\x00\x00\x00", 'Q: no args';
# takes an unsigned quad
is pack('Q', $_),
chr($_ & 0xff)
. chr($_ >> 8 & 0xff)
. chr($_ >> 16 & 0xff)
. chr($_ >> 24 & 0xff)
. chr($_ >> 32 & 0xff)
. chr($_ >> 40 & 0xff)
. chr($_ >> 48 & 0xff)
. chr($_ >> 56 ),
"Q: $_"
for int rand ~0 + 1;
# Q* takes all the args
is pack('Q*', 1, 1), "\x01\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00", 'Q*';
# q(-1) == Q(0xffff_ffff_ffff_ffff)
is pack('q', -1), pack('Q', ~0), 'q vs Q';
done_testing;
use Test2::V0;
# i == l
done_testing;
use Test2::V0;
# I == L
done_testing;
use Test2::V0;
# produces \x00\x00 if no args
is pack('n'), "\x00\x00", 'n: no args';
# takes an unsigned short (big-endian)
is pack('n', $_),
chr($_ >> 8 )
. chr($_ & 0xff),
"n: $_"
for int rand 0x10000;
# n* takes all the args
is pack('n*', 1, 1), "\x00\x01\x00\x01", 'n*';
done_testing;
use Test2::V0;
# produces \x00\x00\x00\x00 if no args
is pack('N'), "\x00\x00\x00\x00", 'N: no args';
# takes an unsigned long (big-endian)
is pack('N', $_),
chr($_ >> 24 )
. chr($_ >> 16 & 0xff)
. chr($_ >> 8 & 0xff)
. chr($_ & 0xff),
"N: $_"
for int rand 1 << 32;
# N* takes all the args
is pack('N*', 1, 1), "\x00\x00\x00\x01\x00\x00\x00\x01", 'N*';
done_testing;
use Test2::V0;
# v == S
done_testing;
use Test2::V0;
# V == L
done_testing;
use Test2::V0;
use Encode;
# produces \x00 if no args
is pack('U'), "\x00", 'U: no args';
# takes a code point
is pack('U', $_),
chr($_),
'U: ' . sprintf '%x', $_
for int rand 0x110000;
# C0U encodes some code points as is, which encode() encodes as 0xfffd (replacement character):
# * surrogates (0xd800..0xdfff)
# * 0xfdd0..0xfdef
# * ($i << 16) + 0xfffe..($i << 16) + 0xffff, where $i in [0..16]
is pack('C0U', $_),
encode('UTF-8', chr($_)),
'U: ' . sprintf '%x', $_
for sub {
my $n = shift;
return $n if $n < 0xd800;
$n += 0xe000 - 0xd800;
return $n if $n < 0xfdd0;
$n += 0xfdf0 - 0xfdd0;
return $n if $n < 0xfffe;
$n += 0x10000 - 0xfffe;
for my $p (0..16) {
return $n if $n < ($p << 16) + 0xfffe;
$n += ($p << 16) - (($p << 16) + 0xfffe);
}
}->(int rand 0x110000 - (0xe000 - 0xd800 + 0xfdf0 - 0xfdd0 + (0x10000 - 0xfffe) * 17));
# U* takes all the args
is pack('U*', 1, 1), chr(1) . chr(1), 'U*';
done_testing;
use Test2::V0;
# produces \x00
is pack('x'), "\x00", 'x';
is pack('x2'), "\x00\x00", 'x2';
# x* produces nothing
is pack('x*'), '', 'x*';
done_testing;
use Test2::V0;
# can't X at position 0
like dies { pack 'X' }, qr('X' outside of string), 'X: outside of string';
# truncates back to a relative position
is pack('aX', 'a'), '', 'X';
# X0 and X* do nothing
is pack('X0'), '', 'X0';
is pack('X*'), '', 'X*';
done_testing;
use Test2::V0;
# produces \x00 with no repeat count
is pack('@'), "\x00", '@: no repeat count';
# truncates to an absolute position
is pack('a@0', 'a'), '', '@0';
is pack('a@*', 'a'), '', '@*';
# null-fills to an absolute position
is pack('@1'), "\x00", '@1';
# the position is relative to the innermost (
is pack('a(a@0)', 'a', 'a'), 'a', '(@0)';
done_testing;
use Test2::V0;
# does nothing if no args
is pack('.'), '', '.: no args';
# truncates to an absolute position
is pack('a.', 'a', 0), '', '.0';
# null-fills to an absolute position
is pack('.', 1), "\x00", '.1';
# the position is relative to the innermost (
is pack('a(a.)', 'a', 'a', 0), 'a', '(.0)';
done_testing;
use Test2::V0;
sub px {
my ($n, $f, $l) = @_;
$f //= '';
$l //= 'x';
$n < 0 ? sprintf("-0${l}%${f}${l}", -$n) : sprintf("0${l}%${f}${l}", $n);
}
# sub pb {
# px @_, 'b';
# }
sub resilient_warnings(&) {
my $code = shift;
warnings { my $r = dies { $code->() } };
}
sub resilient_warning(&) {
my $code = shift;
warning { my $r = dies { $code->() } };
}
# sub reverse_bits {
# oct '0b' . join '', reverse split '', sprintf '%08b', shift;
# }
#
# sub reverse_nybbles {
# my $b = shift;
# ($b << 4 & 0xff) + ($b >> 4);
# }
# 110xxxyy 10yyzzzz
# 11000010 10000000 = c2 80
# 000 1000 0000 = 0x80
# 11011111 10111111 = df bf
# 111 1111 1111 = 0x7ff
#
# 1110wwww 10xxxxyy 10yyzzzz
# 11100000 10100000 10000000 = e0 a0 80
# 0000 1000 0000 0000 = 0x800
# 11101111 10111111 10111111 = ef bf bf
# 1111 1111 1111 1111 = 0xffff
#
# 11110uvv 10vvwwww 10xxxxyy 10yyzzzz
# 11110010 10000000 10000000 10000000 = f2 80 80 80
# 0 1000 0000 0000 0000 0000 = 0x10000
# 11110100 10001111 10111111 10111111 = f4 8f bf bf
# 1 0000 1111 1111 1111 1111 = 0x10ffff
# U0a
is resilient_warning { pack 'U0a', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0a: invalid: ' . px ord $_, '02'
for "\x80";
is pack('U0a*', $_),
"\x{7ff}",
'U0a: valid: ' . join ' ', map px(ord $_, '02'), split '', $_
for "\xdf\xbf";
# U0A
is resilient_warning { pack 'U0A', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0A: invalid: ' . px ord $_, '02'
for "\x80";
is pack('U0A*', $_),
"\x{7ff}",
'U0A: valid: ' . join ' ', map px(ord $_, '02'), split '', $_
for "\xdf\xbf";
# U0Z
is resilient_warning { pack 'U0Z*', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0Z: invalid: ' . px ord $_, '02'
for "\x80";
is pack('U0Z*', $_),
"\x{7ff}\x00",
'U0Z: valid: ' . join ' ', map px(ord $_, '02'), split '', $_
for "\xdf\xbf";
# U0b
is resilient_warning { pack 'U0b*', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0b: invalid: ' . $_
for '0000' . '0001';
is pack('U0b*', $_),
"\x{7ff}",
'U0b: valid: ' . $_
for '1111' . '1011' . '1111' . '1101';
# U0B
is resilient_warning { pack 'U0B*', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0B: invalid: ' . $_
for '1000' . '0000';
is pack('U0B*', $_),
"\x{7ff}",
'U0B: valid: ' . $_
for '1101' . '1111' . '1011' . '1111';
# U0h
is resilient_warning { pack 'U0h*', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0h: invalid: ' . $_
for '08';
is pack('U0h*', $_),
"\x{7ff}",
'U0h: valid: ' . $_
for 'fdfb';
# U0H
is resilient_warning { pack 'U0H*', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0H: invalid: ' . $_
for '80';
is pack('U0H*', $_),
"\x{7ff}",
'U0H: valid: ' . $_
for 'dfbf';
# U0c
is resilient_warning { pack 'U0c', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0c: invalid: ' . px $_ + 0x100, '02'
for 0x80 - 0x100;
is pack('U0c*', @$_),
"\x{7ff}",
'U0c: valid: ' . join ' ', map px($_ + 0x100, '02'), @$_
for [0xdf - 0x100, 0xbf - 0x100];
# U0C
is resilient_warning { pack 'U0C', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0C: invalid: ' . px $_, '02'
for 0x80;
is pack('U0C*', @$_),
"\x{7ff}",
'U0C: valid: ' . join ' ', map px($_, '02'), @$_
for [0xdf, 0xbf];
# U0W
is resilient_warning { pack 'U0W', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0W: invalid: ' . px $_, '02'
for 0x80;
is pack('U0W*', @$_),
"\x{7ff}",
'U0W: valid: ' . join ' ', map px($_, '02'), @$_
for [0xdf, 0xbf];
# U0s
is resilient_warning { pack 'U0s', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0s: invalid: ' . px $_, '02'
for 0x80;
is pack('U0s', $_),
"\x{7ff}",
'U0s: valid: ' . px $_ + 0x10000, '04'
for 0xbfdf - 0x10000;
# U0S
is resilient_warning { pack 'U0S', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0S: invalid: ' . px $_, '02'
for 0x80;
is pack('U0S', $_),
"\x{7ff}",
'U0S: valid: ' . px $_, '04'
for 0xbfdf;
# U0l
is resilient_warning { pack 'U0l', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0l: invalid: ' . px $_, '02'
for 0x80;
is pack('U0l', $_),
"\x{7ff}\x00\x00",
'U0l: valid: ' . px $_, '04'
for 0xbfdf;
# U0L
is resilient_warning { pack 'U0L', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0L: invalid: ' . px $_, '02'
for 0x80;
is pack('U0L', $_),
"\x{7ff}\x00\x00",
'U0L: valid: ' . px $_, '04'
for 0xbfdf;
# U0q
is resilient_warning { pack 'U0q', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0q: invalid: ' . px $_, '02'
for 0x80;
is pack('U0q', $_),
"\x{7ff}\x00\x00\x00\x00\x00\x00",
'U0q: valid: ' . px $_, '04'
for 0xbfdf;
# U0Q
is resilient_warning { pack 'U0Q', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0Q: invalid: ' . px $_, '02'
for 0x80;
is pack('U0Q', $_),
"\x{7ff}\x00\x00\x00\x00\x00\x00",
'U0Q: valid: ' . px $_, '04'
for 0xbfdf;
# i == l
# I == L
# U0n
is resilient_warning { pack 'U0n', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0n: invalid: ' . px $_, '02'
for 0x80;
is pack('U0n', $_),
"\x{7ff}",
'U0n: valid: ' . px $_, '04'
for 0xdfbf;
# U0N
is resilient_warning { pack 'U0N', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0N: invalid: ' . px $_, '02'
for 0x80;
is pack('U0N', $_),
"\x00\x00\x{7ff}",
'U0N: valid: ' . px $_, '04'
for 0xdfbf;
# v == S
# V == L
# the resulting string must be valid utf8
is pack('U0aXac', "\x80", "\xdf", 0xbf - 0x100), "\x{7ff}", 'U0: what matters is the end state';
# U0 mode is selected implicitly when TEMPLATE starts with U
is pack('Ua2', 0x7ff, "\xdf\xbf"), "\x{7ff}\x{7ff}", 'implicit U0 mode';
is pack('aU', "\x80", 0x7ff), "\x80\xdf\xbf", 'implicit C0 mode';
done_testing;
use Test2::V0;
sub px {
my ($n, $f, $l) = @_;
$f //= '';
$l //= 'x';
$n < 0 ? sprintf("-0${l}%${f}${l}", -$n) : sprintf("0${l}%${f}${l}", $n);
}
sub resilient_warnings(&) {
my $code = shift;
warnings { my $r = dies { $code->() } };
}
sub resilient_warning(&) {
my $code = shift;
warning { my $r = dies { $code->() } };
}
# U0a wraps values not in the range 0..0xff
is pack('U0a', $_),
$_,
'U0a: wrap: ' . px ord($_), '02'
for "\x00";
is resilient_warning { pack 'U0a', $_ },
match qr(Malformed UTF-8 character: \\xff),
'U0a: wrap: ' . px ord($_), '02'
for "\xff";
is warning { pack 'U0a', $_ },
match qr(Character\(s\) in 'a' format wrapped),
'U0a: wrap: ' . px ord($_), '02'
for "\x{100}";
# U0A wraps values not in the range 0..0xff
is pack('U0A', $_),
$_,
'U0A: wrap: ' . px ord($_), '02'
for "\x00";
is resilient_warning { pack 'U0A', $_ },
match qr(Malformed UTF-8 character: \\xff),
'U0A: wrap: ' . px ord($_), '02'
for "\xff";
is warning { pack 'U0A', $_ },
match qr(Character\(s\) in 'A' format wrapped),
'U0A: wrap: ' . px ord($_), '02'
for "\x{100}";
# U0Z wraps values not in the range 0..0xff
is pack('U0Z2', $_),
$_ . "\x00",
'U0Z: wrap: ' . px ord($_), '02'
for "\x00";
is resilient_warnings { pack 'U0Z2', $_ }, [
match qr(Malformed UTF-8 character: \\xff\\x00),
match qr(Malformed UTF-8 character: \\xff\\x00),
],
'U0Z: wrap: ' . px ord($_), '02'
for "\xff";
is warning { pack 'U0Z2', $_ },
match qr(Character\(s\) in 'Z' format wrapped),
'U0Z: wrap: ' . px ord($_), '02'
for "\x{100}";
# c wraps values not in the range -0x80..0x7f
is warning { pack 'c', $_ },
match qr(Character in 'c' format wrapped),
'c: wrap: ' . px $_, '02'
for -0x81;
is pack('c', $_), "\x80", 'c: wrap: ' . px $_, '02' for -0x80;
is pack('c', $_), "\x7f", 'c: wrap: ' . px $_, '02' for 0x7f;
is warning { pack 'c', $_ },
match qr(Character in 'c' format wrapped),
'c: wrap: ' . px $_, '02'
for 0x80;
# U0c wraps values not in the range -0x80..0x7f
is warning { pack 'U0c', $_ },
match qr(Character in 'c' format wrapped),
'U0c: wrap: ' . px $_, '02'
for -0x81;
is resilient_warning { pack 'U0c', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0c: wrap: ' . px $_, '02'
for -0x80;
is pack('U0c', $_), "\x7f", 'U0c: wrap: ' . px $_, '02' for 0x7f;
is resilient_warnings { pack 'U0c', $_ }, [
match qr(Character in 'c' format wrapped),
match qr(Malformed UTF-8 character: \\x80),
],
'U0c: wrap: ' . px $_, '02'
for 0x80;
# C wraps values not in the range 0..0xff
is warning { pack 'C', $_ },
match qr(Character in 'C' format wrapped),
'C: wrap: ' . px $_, '02'
for -0x01;
is pack('C', $_), "\x00", 'C: wrap: ' . px $_, '02' for 0x00;
is pack('C', $_), "\xff", 'C: wrap: ' . px $_, '02' for 0xff;
is warning { pack 'C', $_ },
match qr(Character in 'C' format wrapped),
'C: wrap: ' . px $_, '02'
for 0x100;
# U0C wraps values not in the range 0..0xff
is resilient_warnings { pack 'U0C', $_ }, [
match qr(Character in 'C' format wrapped),
match qr(Malformed UTF-8 character: \\xff),
],
'U0C: wrap: ' . px $_, '02'
for -0x01;
is pack('U0C', $_), "\x00", 'U0C: wrap: ' . px $_, '02' for 0x00;
is resilient_warning { pack 'U0C', $_ },
match qr(Malformed UTF-8 character: \\xff),
'U0C: wrap: ' . px $_, '02'
for 0xff;
is warning { pack 'U0C', $_ },
match qr(Character in 'C' format wrapped),
'U0C: wrap: ' . px $_, '02'
for 0x100;
# W doesn't produce wrap messages
is dies { pack 'W', $_ },
match qr(Use of code point 0xFFFFFFFFFFFFFFFF is not allowed; the permissible max is 0x7FFFFFFFFFFFFFFF),
'W: wrap: ' . px $_, '06'
for -0x000001;
is pack('W', $_), "\x00", 'W: wrap: ' . px $_, '06' for 0x000000;
is pack('W', $_), "\x{10ffff}", 'W: wrap: ' . px $_, '06' for 0x10ffff;
is pack('W', $_), "\x{110000}", 'W: wrap: ' . px $_, '06' for 0x110000;
# U0W wraps values not in the range 0..0xff
is resilient_warnings { pack 'U0W', $_ }, [
match qr(Character in 'W' format wrapped),
match qr(Malformed UTF-8 character: \\xff),
],
'U0W: wrap: ' . px $_, '02'
for -0x01;
is pack('U0W', $_), "\x00", 'U0W: wrap: ' . px $_, '02' for 0x00;
is resilient_warnings { pack 'U0W', $_ }, [
match qr(Malformed UTF-8 character: \\xff),
],
'U0W: wrap: ' . px $_, '02'
for 0xff;
is warning { pack 'U0W', $_ },
match qr(Character in 'W' format wrapped),
'U0W: wrap: ' . px $_, '02'
for 0x100;
# s doesn't produce wrap messages
is pack('s', $_), "\xff\x7f", 's: wrap: ' . px $_, '04' for -0x8001;
is pack('s', $_), "\x00\x80", 's: wrap: ' . px $_, '04' for -0x8000;
is pack('s', $_), "\xff\x7f", 's: wrap: ' . px $_, '04' for 0x7fff;
is pack('s', $_), "\x00\x80", 's: wrap: ' . px $_, '04' for 0x8000;
# U0s doesn't produce wrap messages
is resilient_warnings { pack 'U0s', $_ }, [
match qr(Malformed UTF-8 character: \\xff\\x7f),
match qr(Malformed UTF-8 character: \\xff\\x7f),
],
'U0s: wrap: ' . px $_, '04'
for -0x8001;
is resilient_warning { pack 'U0s', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0s: wrap: ' . px $_, '04'
for -0x8000;
is resilient_warnings { pack 'U0s', $_ }, [
match qr(Malformed UTF-8 character: \\xff\\x7f),
match qr(Malformed UTF-8 character: \\xff\\x7f),
],
'U0s: wrap: ' . px $_, '04'
for 0x7fff;
is resilient_warning { pack 'U0s', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0s: wrap: ' . px $_, '04'
for 0x8000;
# S doesn't produce wrap messages
is pack('S', $_), "\xff\xff", 'S: wrap: ' . px $_, '04' for -0x0001;
is pack('S', $_), "\x00\x00", 'S: wrap: ' . px $_, '04' for 0x0000;
is pack('S', $_), "\xff\xff", 'S: wrap: ' . px $_, '04' for 0xffff;
is pack('S', $_), "\x00\x00", 'S: wrap: ' . px $_, '04' for 0x10000;
# U0S doesn't produce wrap messages
is resilient_warnings { pack 'U0S', $_ }, [
match qr(Malformed UTF-8 character: \\xff\\xff),
match qr(Malformed UTF-8 character: \\xff\\xff),
],
'U0S: wrap: ' . px $_, '04'
for -0x0001;
is pack('U0S', $_), "\x00\x00", 'U0S: wrap: ' . px $_, '04' for 0x0000;
is resilient_warnings { pack 'U0S', $_ }, [
match qr(Malformed UTF-8 character: \\xff\\xff),
match qr(Malformed UTF-8 character: \\xff\\xff),
],
'U0S: wrap: ' . px $_, '04'
for 0xffff;
is pack('U0S', $_), "\x00\x00", 'U0S: wrap: ' . px $_, '04' for 0x10000;
# l doesn't produce wrap messages
is pack('l', $_), "\xff\xff\xff\x7f", 'l: wrap: ' . px $_, '08' for -0x8000_0001;
is pack('l', $_), "\x00\x00\x00\x80", 'l: wrap: ' . px $_, '08' for -0x8000_0000;
is pack('l', $_), "\xff\xff\xff\x7f", 'l: wrap: ' . px $_, '08' for 0x7fff_ffff;
is pack('l', $_), "\x00\x00\x00\x80", 'l: wrap: ' . px $_, '08' for 0x8000_0000;
# U0l doesn't produce wrap messages
is resilient_warnings { pack 'U0l', $_ }, [
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\x7f),
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\x7f),
],
'U0l: wrap: ' . px $_, '08'
for -0x8000_0001;
is resilient_warning { pack 'U0l', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0l: wrap: ' . px $_, '08'
for -0x8000_0000;
is resilient_warnings { pack 'U0l', $_ }, [
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\x7f),
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\x7f),
],
'U0l: wrap: ' . px $_, '08'
for 0x7fff_ffff;
is resilient_warning { pack 'U0l', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0l: wrap: ' . px $_, '08'
for 0x8000_0000;
# L doesn't produce wrap messages
is pack('L', $_), "\xff\xff\xff\xff", 'L: wrap: ' . px $_, '08' for -0x0000_0001;
is pack('L', $_), "\x00\x00\x00\x00", 'L: wrap: ' . px $_, '08' for 0x0000_0000;
is pack('L', $_), "\xff\xff\xff\xff", 'L: wrap: ' . px $_, '08' for 0xffff_ffff;
is pack('L', $_), "\x00\x00\x00\x00", 'L: wrap: ' . px $_, '08' for 1 << 32;
# U0L doesn't produce wrap messages
is resilient_warnings { pack 'U0L', $_ }, [
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\xff),
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\xff),
],
'U0L: wrap: ' . px $_, '08'
for -0x0000_0001;
is pack('U0L', $_), "\x00\x00\x00\x00", 'U0L: wrap: ' . px $_, '08' for 0x0000_0000;
is resilient_warnings { pack 'U0L', $_ }, [
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\xff),
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\xff),
],
'U0L: wrap: ' . px $_, '08'
for 0xffff_ffff;
is pack('U0L', $_), "\x00\x00\x00\x00", 'U0L: wrap: ' . px $_, '08' for 1 << 32;
# q doesn't produce wrap messages
is pack('q', $_),
"\x00\x00\x00\x00\x00\x00\x00\x80",
'q: wrap: ' . px $_, '016'
for -(1 << 63); # -0x8000_0000_0000_0000
is pack('q', $_),
"\xff\xff\xff\xff\xff\xff\xff\x7f",
'q: wrap: ' . px $_, '016'
for (1 << 63) - 1; # 0x7fff_ffff_ffff_ffff
is pack('q', $_),
"\x00\x00\x00\x00\x00\x00\x00\x80",
'q: wrap: ' . px $_, '016'
for 1 << 63; # 0x8000_0000_0000_0000
# U0q doesn't produce wrap messages
is resilient_warning { pack 'U0q', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0q: wrap: ' . px $_, '016'
for -(1 << 63); # -0x8000_0000_0000_0000
is resilient_warnings { pack 'U0q', $_ }, [
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\xff\\xff\\xff\\xff\\x7f),
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\xff\\xff\\xff\\xff\\x7f),
],
'U0q: wrap: ' . px $_, '016'
for (1 << 63) - 1; # 0x7fff_ffff_ffff_ffff
is resilient_warning { pack 'U0q', $_ },
match qr(Malformed UTF-8 character: \\x80),
'U0q: wrap: ' . px $_, '016'
for 1 << 63; # 0x8000_0000_0000_0000
# Q doesn't produce wrap messages
is pack('Q', $_),
"\xff\xff\xff\xff\xff\xff\xff\xff",
'Q: wrap: ' . px $_, '016'
for -1; # -0x0000_0000_0000_0001
is pack('Q', $_),
"\x00\x00\x00\x00\x00\x00\x00\x00",
'Q: wrap: ' . px $_, '016'
for 0; # 0x0000_0000_0000_0000
is pack('Q', $_),
"\xff\xff\xff\xff\xff\xff\xff\xff",
'Q: wrap: ' . px $_, '016'
for ~0; # 0xffff_ffff_ffff_ffff
# U0Q doesn't produce wrap messages
is resilient_warnings { pack 'U0Q', $_ }, [
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\xff\\xff\\xff\\xff\\xff),
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\xff\\xff\\xff\\xff\\xff),
],
'Q: wrap: ' . px $_, '016'
for -1;
is pack('U0Q', $_),
"\x00\x00\x00\x00\x00\x00\x00\x00",
'Q: wrap: ' . px $_, '016'
for 0;
is resilient_warnings { pack 'U0Q', $_ }, [
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\xff\\xff\\xff\\xff\\xff),
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\xff\\xff\\xff\\xff\\xff),
],
'Q: wrap: ' . px $_, '016'
for ~0;
# i == l
# I == L
# n doesn't produce wrap messages
is pack('n', $_), "\xff\xff", 'n: wrap: ' . px $_, '04' for -0x0001;
is pack('n', $_), "\x00\x00", 'n: wrap: ' . px $_, '04' for 0x0000;
is pack('n', $_), "\xff\xff", 'n: wrap: ' . px $_, '04' for 0xffff;
is pack('n', $_), "\x00\x00", 'n: wrap: ' . px $_, '04' for 0x10000;
# U0n doesn't produce wrap messages
is resilient_warnings { pack 'U0n', $_ }, [
match qr(Malformed UTF-8 character: \\xff\\xff),
match qr(Malformed UTF-8 character: \\xff\\xff),
],
'U0n: wrap: ' . px $_, '04'
for -0x0001;
is pack('U0n', $_), "\x00\x00", 'U0n: wrap: ' . px $_, '04' for 0x0000;
is resilient_warnings { pack 'U0n', $_ }, [
match qr(Malformed UTF-8 character: \\xff\\xff),
match qr(Malformed UTF-8 character: \\xff\\xff),
],
'U0n: wrap: ' . px $_, '04'
for 0xffff;
is pack('U0n', $_), "\x00\x00", 'U0n: wrap: ' . px $_, '04' for 0x10000;
# N doesn't produce wrap messages
is pack('N', $_), "\xff\xff\xff\xff", 'N: wrap: ' . px $_, '08' for -0x0000_0001;
is pack('N', $_), "\x00\x00\x00\x00", 'N: wrap: ' . px $_, '08' for 0x0000_0000;
is pack('N', $_), "\xff\xff\xff\xff", 'N: wrap: ' . px $_, '08' for 0xffff_ffff;
is pack('N', $_), "\x00\x00\x00\x00", 'N: wrap: ' . px $_, '08' for 1 << 32;
# U0N doesn't produce wrap messages
is resilient_warnings { pack 'U0N', $_ }, [
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\xff),
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\xff),
],
'U0N: wrap: ' . px $_, '08'
for -0x0000_0001;
is pack('U0N', $_), "\x00\x00\x00\x00", 'U0N: wrap: ' . px $_, '08' for 0x0000_0000;
is resilient_warnings { pack 'U0N', $_ }, [
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\xff),
match qr(Malformed UTF-8 character: \\xff\\xff\\xff\\xff),
],
'U0N: wrap: ' . px $_, '08'
for 0xffff_ffff;
is pack('U0N', $_), "\x00\x00\x00\x00", 'U0N: wrap: ' . px $_, '08' for 1 << 32;
# v == S
# V == L
# U doesn't produce wrap messages
is dies { pack 'U', $_ },
match qr(Use of code point 0xFFFFFFFFFFFFFFFF is not allowed; the permissible max is 0x7FFFFFFFFFFFFFFF),
'U: wrap: ' . px $_, '06'
for -1;
is pack('U', $_), "\x00", 'U: wrap: ' . px $_, '06' for 0x000000;
is pack('U', $_), "\x{10ffff}", 'U: wrap: ' . px $_, '06' for 0x10ffff;
is pack('U', $_), "\x{110000}", 'U: wrap: ' . px $_, '06' for 0x110000;
# C0U doesn't produce wrap messages
is dies { pack 'C0U', $_ },
match qr(Use of code point 0xFFFFFFFFFFFFFFFF is not allowed; the permissible max is 0x7FFFFFFFFFFFFFFF),
'C0U: wrap: ' . px $_, '06'
for -1;
is pack('C0U', $_), "\x00", 'C0U: wrap: ' . px $_, '06' for 0x000000;
is pack('C0U', $_), "\xf4\x8f\xbf\xbf", 'C0U: wrap: ' . px $_, '06' for 0x10ffff;
is pack('C0U', $_), "\xf4\x90\x80\x80", 'C0U: wrap: ' . px $_, '06' for 0x110000;
done_testing;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment