✅ PHP 8.4.11, Debian package, Debian Linux 13:
$ docker run --rm -it debian:13
temp# apt-get update -qq && apt-get install -qy php locales
temp# locale -a
C
C.UTF-8
POSIX
temp# php --version; php -a
PHP 8.4.11 (cli)
php > var_dump(setlocale(LC_ALL, 'C.UTF-8'));
string(7) "C.UTF-8"
php > var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
int(0) array(0) {}
❌ PHP 8.4.11, Homebrew package, macOS 15:
$ uname -sr; which php; php --version
Darwin 24.4.0 # macOS 15 Sequoia
/opt/homebrew/bin/php
PHP 8.4.11 (cli)
$ php -a
php > var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
int(0) array(0) {}
php > var_dump(setlocale(LC_ALL, 'C.UTF-8'));
string(7) "C.UTF-8"
php > var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
int(1) array(1) { [0]=> string(1) "?" }
✅ PHP 8.5.0-dev, compile (plainly, with readline), macOS 15:
$ git log --oneline -1
359f4420a4 (origin/master, origin/HEAD, master) Merge branch 'PHP-8.4'
$ READLINE_DIR=/opt/homebrew/opt/readline ./configure --enable-debug --with-readline && make -j8
$ ./sapi/cli/php --version
PHP 8.5.0-dev…
$ ./sapi/cli/php -i | grep -A10 pcre
pcre
PCRE (Perl Compatible Regular Expressions) Support => enabled
PCRE Library Version => 10.45 2025-02-05
PCRE Unicode Version => 16.0.0
PCRE JIT Support => enabled
PCRE JIT Target => ARM-64 64bit (little endian + unaligned)
$ ./sapi/cli/php -a
php > var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
int(0) array(0) {}
php > var_dump(setlocale(LC_ALL, 'C.UTF-8'));
string(7) "C.UTF-8"
php > var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
int(0) array(0) {}
Start reducing the differences from the Homebrew build, based on its formula: https://github.com/Homebrew/homebrew-core/blob/7a7679884bcce97cc378e733417631bcb02e2678/Formula/p/php.rb
✅ PHP 8.5.0-dev, compile (with external pcre), macOS 15:
$ ./configure --enable-debug --with-external-pcre --with-libedit && make -j8
$ ./sapi/cli/php --version
PHP 8.5.0-dev …
$ ./sapi/cli/php -i | grep -A10 pcre
pcre
PCRE (Perl Compatible Regular Expressions) Support => enabled
PCRE Library Version => 10.45 2025-02-05
PCRE Unicode Version => 16.0.0
PCRE JIT Support => enabled
PCRE JIT Target => ARM-64 64bit (little endian + unaligned)
…
$ ./sapi/cli/php -a
php > var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
int(0) array(0) {}
php > var_dump(setlocale(LC_ALL, 'C.UTF-8'));
string(7) "C.UTF-8"
php > var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
int(0) array(0) {}
❌ PHP 8.5.0-dev, compile (with intl, mbstring, and external pcre), macOS 15:
$ ./configure --enable-debug --enable-intl --enable-mbregex --enable-mbstring --with-external-pcre --with-libedit && make -j8
$ ./sapi/cli/php --version
PHP 8.5.0-dev …
$ ./sapi/cli/php -i | grep -A10 pcre
pcre
PCRE (Perl Compatible Regular Expressions) Support => enabled
PCRE Library Version => 10.45 2025-02-05
PCRE Unicode Version => 16.0.0
PCRE JIT Support => enabled
PCRE JIT Target => ARM-64 64bit (little endian + unaligned)
…
$ ./sapi/cli/php -a
php > var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
int(0) array(0) {}
php > var_dump(setlocale(LC_ALL, 'C.UTF-8'));
string(7) "C.UTF-8"
php > var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
int(1) array(1) { [0]=> string(1) "?" }
❌ PHP 8.5.0-dev, compile (with intl), macOS 15:
$ ./configure --enable-debug --enable-intl --with-libedit && make -j8
$ ./sapi/cli/php --version
PHP 8.5.0-dev …
$ ./sapi/cli/php -i | grep -A10 pcre
pcre
PCRE (Perl Compatible Regular Expressions) Support => enabled
PCRE Library Version => 10.45 2025-02-05
PCRE Unicode Version => 16.0.0
PCRE JIT Support => enabled
PCRE JIT Target => ARM-64 64bit (little endian + unaligned)
…
$ ./sapi/cli/php -a
php > var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
int(0) array(0) {}
php > var_dump(setlocale(LC_ALL, 'C.UTF-8'));
string(7) "C.UTF-8"
php > var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
int(1) array(1) { [0]=> string(1) "?" }
❌ PHP 8.5.0-dev, compile (plainly, with libedit), macOS 15:
$ ./configure --enable-debug --with-libedit && make -j8
$ ./sapi/cli/php --version
PHP 8.5.0-dev …
$ ./sapi/cli/php -i | grep -A10 pcre
pcre
PCRE (Perl Compatible Regular Expressions) Support => enabled
PCRE Library Version => 10.45 2025-02-05
PCRE Unicode Version => 16.0.0
PCRE JIT Support => enabled
PCRE JIT Target => ARM-64 64bit (little endian + unaligned)
…
$ ./sapi/cli/php -a
php > var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
int(0) array(0) {}
php > var_dump(setlocale(LC_ALL, 'C.UTF-8'));
string(7) "C.UTF-8"
php > var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
int(1) array(1) { [0]=> string(1) "?" }
✅ PHP 8.5.0-dev, compile (plainly with readline instead of libedit), macOS 15:
$ READLINE_DIR=/opt/homebrew/opt/readline ./configure --enable-debug --with-readline && make -j8
$ ./sapi/cli/php --version
PHP 8.5.0-dev …
$ ./sapi/cli/php -i | grep -A10 pcre
pcre
PCRE (Perl Compatible Regular Expressions) Support => enabled
PCRE Library Version => 10.45 2025-02-05
PCRE Unicode Version => 16.0.0
PCRE JIT Support => enabled
PCRE JIT Target => ARM-64 64bit (little endian + unaligned)
$ ./sapi/cli/php -a
php > var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
int(0) array(0) {}
php > var_dump(setlocale(LC_ALL, 'C.UTF-8'));
string(7) "C.UTF-8"
php > var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
int(0) array(0) {}
Great. So both libedit and readline touch setlocale: the readline builds avoid the bug, while the libedit builds trigger it. To take the line-editing library out of the picture, let's switch to a regular script file and drop the php -a approach.
readline: https://github.com/search?q=repo%3Agnu-mirror-unofficial%2Freadline%20setlocale&type=code
libedit: https://github.com/search?q=repo%3Acdesjardins%2Flibedit%20setlocale&type=code
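To see whether something linked into a binary has already changed the locale, the active locale can be queried without modifying it. A minimal sketch in C: current_ctype_locale and print_ctype_locale are hypothetical helper names, and calling them around the initialisation of a line-editing library is an assumption about how one might use them, not something that was done above.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Copy the name of the active LC_CTYPE locale into buf without changing it.
 * Passing NULL as the locale argument makes setlocale() a pure query.
 * Returns buf, or NULL if the query fails or the name does not fit. */
const char *current_ctype_locale(char *buf, size_t size)
{
    assert(buf != NULL);
    const char *name = setlocale(LC_CTYPE, NULL);
    if (name == NULL || strlen(name) >= size) {
        return NULL;
    }
    strcpy(buf, name);
    return buf;
}

/* Print the active LC_CTYPE locale with a label (hypothetical usage:
 * once at program start, once after a library has been initialised). */
void print_ctype_locale(const char *label)
{
    char buf[256];
    const char *name = current_ctype_locale(buf, sizeof buf);
    printf("%s: LC_CTYPE=%s\n", label, name ? name : "(unknown)");
}
```

Calling print_ctype_locale("start") at the top of main() and again after the library under suspicion has been initialised would show whether that library switched the process away from the startup locale.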
$ cat /tmp/nbsp.php
<?php
var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
var_dump(setlocale(LC_ALL, 'C.UTF-8'));
var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
❌ PHP 8.5.0-dev, compile (actually plain, without readline or libedit), macOS 15:
$ ./configure --enable-debug && make -j8
$ ./sapi/cli/php /tmp/nbsp.php
int(0)
array(0) {
}
string(7) "C.UTF-8"
int(1)
array(1) {
[0]=>
string(1) "?"
}
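The match is a single byte printed as "?", so presumably one of the two UTF-8 bytes of U+00A0 (0xC2 0xA0) is being treated as whitespace after setlocale. A possible next step, independent of PHP, is to ask the C library directly how it classifies those bytes. This is only a sketch, assuming the \s tables in play are derived from the C library's isspace() for the active locale; nothing above confirms that yet. The helper names are made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <stdio.h>

/* Switch the process to `locale` and ask the C library whether it
 * classifies `byte` as whitespace.
 * Returns 1 if isspace() accepts the byte, 0 if not, and -1 if the
 * locale could not be set. The process locale is left at `locale`. */
int byte_is_space_in_locale(const char *locale, unsigned char byte)
{
    assert(locale != NULL);
    if (setlocale(LC_ALL, locale) == NULL) {
        return -1;
    }
    return isspace(byte) ? 1 : 0;
}

/* Report the classification of both UTF-8 bytes of U+00A0 under `locale`. */
void report_nbsp_bytes(const char *locale)
{
    static const unsigned char nbsp_utf8[] = { 0xC2, 0xA0 };
    for (size_t i = 0; i < sizeof nbsp_utf8; i++) {
        int result = byte_is_space_in_locale(locale, nbsp_utf8[i]);
        printf("%-8s isspace(0x%02X) = %d\n", locale, nbsp_utf8[i], result);
    }
}
```

Calling report_nbsp_bytes("C") and then report_nbsp_bytes("C.UTF-8") from a main() on both macOS and Debian would show whether the two platforms disagree on either byte. No expected output is given here because this was not run on those systems.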
✅ PHP 8.5.0-dev, compile (actually plain), Debian Linux 13:
$ docker run --rm -it debian:13
temp# apt-get update -qq && apt-get install -qy git locales pkg-config build-essential autoconf bison re2c libxml2-dev libsqlite3-dev vim
temp# locale -a
C
C.UTF-8
POSIX
temp# cd /root && git clone --depth=2 https://github.com/php/php-src.git && cd php-src
temp:~/php-src# git log --oneline -1
359f4420a4 (origin/master, origin/HEAD, master) Merge branch 'PHP-8.4'
temp:~/php-src# ./buildconf && ./configure --enable-debug && make -j8
temp:~/php-src# ./sapi/cli/php --version
PHP 8.5.0-dev …
temp:~/php-src# ./sapi/cli/php -i | grep -A10 pcre
pcre
PCRE (Perl Compatible Regular Expressions) Support => enabled
PCRE Library Version => 10.45 2025-02-05
PCRE Unicode Version => 16.0.0
PCRE JIT Support => enabled
PCRE JIT Target => ARM-64 64bit (little endian + unaligned)
…
temp:~/php-src# cat /tmp/nbsp.php
<?php
var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
var_dump(setlocale(LC_ALL, 'C.UTF-8'));
var_dump(preg_match("/\s+/", "foo\u{00A0}bar", $m), $m);
temp:~/php-src# ./sapi/cli/php /tmp/nbsp.php
int(0)
array(0) {
}
string(7) "C.UTF-8"
int(0)
array(0) {
}