Skip to content

Instantly share code, notes, and snippets.

@karenetheridge
Last active February 18, 2025 19:10
Show Gist options
  • Save karenetheridge/0bb61034f757be7f822d37432ef5ca90 to your computer and use it in GitHub Desktop.
Save karenetheridge/0bb61034f757be7f822d37432ef5ca90 to your computer and use it in GitHub Desktop.
mysterious bug in HTTP::Headers::Fast::XS (?)
# vim: set ts=8 sts=2 sw=2 tw=100 et :
use strict;
use warnings;
use 5.020;
use experimental 'signatures';
use Data::Dumper;
use List::Util 'pairs';
use HTTP::Request;
use URI;
use Plack::Request;
use HTTP::Message::PSGI;
use HTTP::Headers::Fast 0.21;
use HTTP::Headers::Fast::XS; # use of this is required to see the error!
sub request ($method, $uri_string, $headers = [], $body_content = undef) {
my $uri = URI->new($uri_string);
my $host = $uri->host;
my $req = HTTP::Request->new($method => $uri, [], $body_content);
$req->headers->push_header(@$_) foreach pairs @$headers, $host ? (Host => $host) : ();
$req->headers->header('Content-Length' => length($body_content))
if defined $body_content and not defined $req->headers->header('Content-Length')
and not defined $req->headers->header('Transfer-Encoding');
$req->protocol('HTTP/1.1'); # required, but not added by HTTP::Request constructor
$req = Plack::Request->new($req->to_psgi);
# Plack is unable to distinguish between %2F and /, so the raw (undecoded) uri can be passed
# here. see PSGI::FAQ
$req->env->{REQUEST_URI} = $uri . '';
$req->env->{'psgi.url_scheme'} = $uri->scheme;
return $req;
}
my $request = request('POST', 'http://example.com/foo', [ 'Content-Type' => 'text/plain' ], 'hello');
# XXX this is the important line to not remove?! is this bleeding over into the next request?
$request->headers->remove_header('Content-Length');
$request = request('POST', 'http://example.com/foo', [ 'Content-Type' => 'text/bloop' ], 'plain text');
# look, CONTENT_LENGTH is set here!
print STDERR "### request env is ", Dumper($request->env);
# good: CONTENT_LENGTH is 10. bad: CONTENT_LENGTH is missing!
print STDERR "### env CONTENT_LENGTH is ", ($request->env->{CONTENT_LENGTH}//'<undef>'), "\n";
__END__
The bug is where the CONTENT_LENGTH field in the psgi environment is not being
returned when queried for directly, even though a Data::Dumper does show it to
be set.
several things need to be true in order to see this bug:
- the remove_header line above must be present, even though it is acting on a
completely different object than the subject of our test.
- HTTP::Headers::Fast::XS must be loaded
- installed version of Plack must be 1.0042 or older. I have bisected this to
commit 1c0bdb0e153: "lowercase headers before passing to HTTP::Headers::Fast"
when commenting out the change to lc() from this commit, the problem
manifests.
ALTHOUGH, when Data::Dumpering the HTTP::Headers::Fast object constructed in
Plack::Request::headers, the object looks identical whether or not the lc()
change is in effect... so something in HTTP::Headers::Fast::_header_set is
doing an equivalent change (at least as far as we can tell from looking via
pure-perl). And this is also true when the XS library is loaded, although in
that case, something is going very wrong with the Content-Length header.
So my theory is that something in HTTP::Headers::Fast::XS::_header_set is
wrong. Also note that HTTP::Headers::Fast::remove_header doesn't have an XS
implementation, but it still needs to be called (on a different object!) to
make this problem show itself.
@karenetheridge
Copy link
Author

karenetheridge commented Feb 10, 2025

@karenetheridge
Copy link
Author

karenetheridge commented Feb 11, 2025

from #p5p:

15:53 < ether> posting here too, in case someone wants to be nerd-swiped by it: https://gist.github.com/karenetheridge/0bb61034f757be7f822d37432ef5ca90
15:53 < ether> "the case of the mysteriously disappearing hash entry"
16:15 < mauke> has to do with literal hash keys
16:15 < mauke> my $tmp = 'CONTENT_LENGTH'; $env->{$tmp} works
16:20 < ether> mauke: are you aware of an issue similar to this?
16:20 < mauke> no
16:20 < haarg> perhaps HTTP::Headers::Fast::XS should be left to die
16:21 < mauke> I'm looking at the C code and I wouldn't use it
16:21 < ether> it's really weird though how several things need to be in place for this to happen, even things which seem totally unrelated
16:21 < mauke> maybe perl is working around it, but it's using tolower/toupper wrong
16:21 < mauke> oof, alloca()
16:21 < ether> I spent about 4 hours boiling this down from the original while watching Star Treks
16:22 < haarg> it overrides several internal methods from HTTP::Headers::Fast, it isn't like most other ::XS modules that have some cooperation from the module using its routines
16:22 < haarg> also you appear to be the only user of it on CPAN
16:23 < mauke> hah. "This is still EXPERIMENTAL and in development. Try it out, enjoy, and use at your own risk."
16:25 < khw> for some definition of 'enjoy'
16:25 < ether> indeed :D
16:25 < ether> FWIW this was the problem that made me whine to srezic about him not setting AUTOMATED_TESTING=1 on his smokers
16:26 < ether> I was getting cryptic http message mismatch errors and needed a ton more diagnostics to see where things were going wrong
16:26 < ether> and then was able to find that $request->body was coming up empty when it shouldn't be
16:28 < mauke> this code doesn't look like it handles COW strings
16:29 < mauke> it just writes into random scalars
16:45 < mauke> oh look, if I blindly change all SvPV in HTTP::Headers::Fast::XS to SvPV_force, the tests fail with "Modification of a read-only value attempted ..."
16:46 < mauke> my recommendation: avoid this module
16:47 < ether> lolsob
16:48 < haarg> 10 year old, single release, experimental module, used by nobody
16:49 < mauke> the lc thing sort of makes sense now
16:50 < haarg> checked to make sure - we don't use it at booking
16:50 < mauke> because even for read operations, the module wants to normalize (i.e. rewrite in-place) field names to lowercase
16:50 < mauke> which is kind of bad if those field names come from read-only or shared scalars
16:52 < mauke> (also, I'm not sure what this CXT stuff is about, but it doesn't smell thread safe)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment