Skip to content

Instantly share code, notes, and snippets.

@kentfredric
Last active February 23, 2020 05:10
Show Gist options
  • Save kentfredric/8f6ed343f4a16a34b08a to your computer and use it in GitHub Desktop.
Save kentfredric/8f6ed343f4a16a34b08a to your computer and use it in GitHub Desktop.
  1. CGI.pm is shit
  2. CGI is shit
  3. <"ARGV"> shouldn't work under use strict because thats a string dereferencing a symbolic ref.
  4. Hash Keys can't retain tainting and so can be used to propagate un-vetted data into safe spaces:
  my $hash = unsafe_thing_that_returns_a_hash();
  $dbh->query(join keys %{$hash}); # data will be untained regardless of what it is.
  1. CGI.pm should probably do something smarter than simply returning the first param when >1 params are passed and you only expected there to be one.
  2. ^ Doubly so when one of those params is a file
  3. CGI.pm should probably not unify GET parameters and uploaded files like that.
  4. CGI should not introduce bugs when this data is seen as the effective params set:
[
  file => "ARGV",
  file => \$handle,
]

because clearly, when you say "Is that a file"

if ( $cgi->upload('file') ) { # true
  my $file = $cgi->param('file'); # this should not be a string, but it is

This is not good.

CGI Params return list debacle.

Yep. CGI.pm made a bad choice. CGI is not Perl, it is written in perl. Programmers can and do make dumb decisions. Perl does not stop you from making dumb decisions, because Perl doesn't want to stop you making good decisions.

CGI is not recommended, and has not been for some time, and is no longer shipped in the Perl distribution to reflect this.

If you have something to bitch about with CGI: File a bug. Stop Using CGI. Tell people who are using CGI not to.

Software doesn't magically stop being shit because you complain about it on a stage.

That's literally all your options. Being written in Perl instead of Ruby doesnt magically make it more shit. Writing it in some other language wouldnt make it inherently non-shit.

But, you know, go on, hold up the shittiest modules as a representation of the language itself and then present the problem as an intractable one.

Obviously, because Internet Explorer is shit, that whatever language it was written in is shit, and that language will continue to be shit simply because people continue to use Internet Explorer.

Stop Using C++ guys, it has bugs!.

K

"The Community"

Perl Monks is not "The Perl Community", it is one face of a subset of "The Perl Community". There's also RT, IRC, MetaCPAN, StackOverflow, blogs.perl.org, twitter, Facebook, and undoubtedly many more places that "The Perl Community" congregate in.

The community can't really be considered "Gated" in this manner, you've just chosen to find one semi-gated community and decide "That's the Community". That'd be like me referring to a pub as being "The General Public" and then calling "The General Public" a "Gated Community" because pubs have bouncers.

WAT

"You can declare variables without specifying a data type"

Yes. This is true to an extent. And as you said, other languages have this.

But the slide given demonstrates 2 exceptions, because it demonstrates the use of firstclass "Array" data types, and "Hash" data types, which require specific syntax to manipiulate.

This is not your typical "FooType variable" type constraints, but it does clearly seperate "Hash" and "Array" "types" from "Scalar" "types"

For contrast, without the "@" and "%" prefixes, you'd only have the options of

    $int = 0
    $str = "hello";
    $array = [ ];
    $hash  = { };

If this was our reality, we'd be closer to PHP's

   $int = array( )

Except that array is sort of like Schrödingers Array where its a quantum superposition of array and hash, where it behaves like the opposite of the one you wanted at least some of the time.

   $int = array( "an", "array" );
   $int = array( "key" => "value" );            # WAT
   $int = (object)(array( "key" => "value" ));  # WAT 

Its clear we have a little more granularity in types than that.

And there's JavaScript of course that similarly blurs the lines (in useful ways)

   x = { "0": "hello", "1": "world" }; 
   y = [ "hello", "world" ];
   x[0] == y[0] && x[1] == y[1]    // Wat? 
   y["foo"] = "bar";               // WAT 
   JSON.stringify(y);              // WAT? Where did "foo" go? 

Data is an unknown data type and you can't force it to be.

Yes. And this is a standard feature of Dynamic programming languages.

JavaScript, Ruby, Python, etc. They all do this.

Multiple paths for different data types.

Again, Yes, you can have different handling for different types of "detected" types.

This is also a function of most dynamic programming languages.

Languages without this flexibility may support mutliple dispatch instead.

This is why things like

   $("Query String");
   $(function(jQuery) {   });

Are supportable in jQuery.

Because its often more convenient to have an API that allows multiple signatures for a common function, than it is to have to define multiple function names for the different argument options.

"A Perl Standard" ... well, I don't know about that. Its a common convention you see in many places. And how applicable it is to a situation depends on a whole bunch of design trade-offs.

You can do it, and I would even argue, you should!

Because that's exactly the behaviour that is implied by default values for arguments.

Because:

   object.function(argument)

and

   object.function(argument,secondargument)

is essentially asking the question "if second argument is of magical type NOTSPECIFICED, substitute it with an alternative value"

And that's not a whole lot different from "Type Undefined"

And "do something different if the user forgot to pass an argument" is pretty much a standard in every language.

There's similarly no way to tell JavaScript "hey, this argument must not be an array", so you have to detect that at runtime.

So obviously JavaScript sucks and everyone should stop using it.

Incidentally, Perl6 supports these if you want them, so ... yay for Perl6!

"The problem is when Hashes and Arrays are considered as Secure Data types"

And that's just the Crux of it. They aren't. Anyone who writes code with an API knows this.

Anyone making this mistake is the idiot.

Neither the language itself or the culture surrounding it claims that structured data is inherently secure as a function of being structured.

Because wether or not a data structure is secure is clearly a function of where that data came from.

For instance, PHP permits forms to send POST/GET requests which result in arbitrarily structured arrays.

And people who are stupid take that data verbatim as trustworthy.

And that's just it: They're stupid for doing that.

If you don't sanitize your input at the border prior to passing it to something that requires that input to be "Safe", then your data is not "Safe".

You didn't sanitize it, there's no way it can be safe.

And some layers must assume the caller did due dilligence.

For instance if you have a function called "rm" and it takes a string of a file to delete, its not rm's job to work out whether or not the user calling it was permitted, and its not rms job to work out if the string passed is a file-that-should-be-deleted or not.

You fix that or pass it to RM, and if rm deletes it, rm did its job, and did exactly what you told it to do. If you told it to do the wrong thing, that's all on you.

[ACTUAL POSSIBLE ISSUE] Hash Keys are Not Tainted.

This is the only place I've seen a justifiable complaint about taint mode.

Given the point of taint mode is to make things explode if you didn't sanitize things correctly, it would be a problem indeed if somebody passed json like:

   { "DROP TABLES *": "Harmless value" }

And somebody, somehow, managed to sanitize the values properly, but managed to pass those keys unsanitized to a query.

Because:

   ->query( join keys %HASH )

Would silently work, regardles of wether or not the hash key itself was ever sanitized.

The downside is that this problem is possibly too hard to solve, because hash keys are heavily used in the VM itself, and underlies things like Package stashes, and so any taint checks here could slow down perl in entirely unacceptable ways.

Additionally, people have relied on the taintless behaviour of hash keys on purpose, and have copied variables into hashes simply to remove taint flags, which makes changing that behaviour without breaking a lot of code difficult.

I would still maintain that your coding ethics SHOULD NOT rely on taint mode to keep you safe, and you should exercise sufficient levels of dilligence.

But taint mode should still be reliable enough to catch where you've left a barn door open by accident.

But your criticism focused wrongly on hash values as being the attack vector, when its hash keys.

Bugzilla treats hash values as secure.

Yes. This is a bug in Bugzilla.

No, this is not something Perl community promotes or encourages.

All data passed to a function must be treated as equally suspicious, structured or not.

Given the syntax of the _load_from_db function based on the way the slide presents it, the _load_from_db function is intended as a lower level function that just "does what its told", which is why it allows a passing a free-from component of an SQL-Query in verbatim in the first place.

I would posit that the error here is not that it takes a hash as an argument and treats it as "Secure", its that the second method with "detaint" is detainting at the wrong level.

_load_from_db should also be treating $param as "Secure", where any quoting/escaping for passing the value to the database should only be considered an API convenience, not a security feature.

And _load_from_db should be barfing if any tainted data is passed to it, not transparently detainting it.

"On all 3 CGI Modules"

You cite one CGI module.

The other two are Web Platforms, which happen to support the CGI interface under the hood, but are otherwise not related to CGI.pm

And the other two are generally NOT used with the CGI interface, because CGI is shit, and CGI.pm is shit.

The rest of this part of your analysis seems to be complaining that different web platforms do different things, and I'm struggling to understand how that could be construed to be a point, let alone, a point that demonizes Perl itself as being "Bad".

These are all design decisions of the independent modules in question, which all just happen to be written in Perl.

Any language can make such a decision, and Any dynamic language could be equally vulnerable to these sorts of issues.

I imagine if you compared the exact same metrics with different languages and competing web frameworks, you'd again, get a variety of different results.

And I would naturally expect something to happen automatically server-side as a result of extra parameters, because that's a common idiom on the web

( PHP might be more consistent here, because unlike the other languages where parsing GET requests requires dedicated modules that implement parts of the HTTP Specification, PHP implements those parts natively in the language itself and magically pumps the values into global variables. )

"You can't tell what your datatype is made of!"

Wait. You just argued you could, earlier, and complained that it wasn't automatic.

The example used ref, which indicates the type of data you have.

This is, again, a common pattern in dynamic programming languages, and you've used this false claim with other false claims to construct an argument that paints Perl as being radically different from other weakly typed programming languages.

You just have to apply dilligence and write code that handles the different cases.

Its when you don't care what your data is and then you try to do things with that data that require you to care, that things go amok.

The trick is to not do that.

CGI.pm Exploit.

Yes. CGI.pm is shit, stop using it. Stop running conferences where you tell people how not to do things with software we're telling people not to use.

Tell people not to use CGI.pm, because it is shit.

CGI.pm is not "Perl".

[ACTUAL POSSIBLE ISSUE] <"ARGV"> Expolit

You can demonstrate this bug much more clearly with

   use strict;
   use warnings;
   
   @ARGV = ('echo broken|');
   
   while( <"ARGV"> ) {
      print $_;
   }

This problem is again exacerbated because CGI is shit, as CGI created the enviroment wherein you have a list of arbitrary user data in @ARGV, and CGI.pm made things worse by its ->params() bullshit.

But your understanding is wrong.

   my $file = $cgi->param('file');

This does NOT return a list of files in your situation.

   use strict;
   use warnings;
   
   sub print_context {
      print defined wantarray
      ? ( wantarray ? 'list' : 'scalar' )
      : 'void';
   }
   
   my $file = print_context; # scalar

Because CGI.pm now spews a nice big warning if you're dumb enough to use it in List Context.

CGI::param called in list context from {...}, this can lead to vulnerabilities. See the warning in "Fetching the value or values of a single named parameter" at {...}

Instead, CGI.pm does what you'd expect any other thing to do in a situation where the user passed two variables: It returns only one of them.

So your demon about list context is nowhere to be seen here.

However, <"ARGV"> being communicated to is clearly a bug that should be fixed, because @ARGV is a symbol, and strings dereferencing to symbols transparently like that is typically something strictures prohibits.

2-Arg Open And Friends Suck

Well. Yes, that's why we've been recommending againt using them in code since 5.8 shipped.

while(<>) implying 2-arg open is also a thing we've recognised as "bad", so we now have while(<<>>).

But its worth mentioning that if you are writing a Command Line application, the ability to pass a filename, or a string that can be executed to produce content can be useful.

Yes, in a Server context, this is very much not what you want, because you generally don't want users to pass in executables to invoke.

Its kinda useful being able to have

   while(<>) {
   }

In a CLI app and having that transparently support:

   echo foo | perl ./script.pl -
   perl ./script.pl ./sourcefile_1 ./source_file_2 
   perl ./script.pl ./source_file_1 ./source_file_2 'gzcat ./source_file_3|'

But you of course don't want magic data in ARGV when you're running a server.

But ... that's just an argument to "STOP USING CGI"

Because CGI stupidly makes the CLI and the HTTP request a unified interface, so you get the worst of both worlds.

So, again, you've got an exploit that fundementally relies on ancient technology that we're setting on fire and trying to kill for ages now.

Misleading Documentation

What did you do, step in a time machine? You're giving presentations in December using a screenshot of an example dated 2009, that was updated back in April.

https://github.com/leejo/CGI.pm/blob/58dd6be3899d2cb24a2dcd873729b8be0cc43052/examples/file_upload.cgi#L16

I can't confirm if the example has been updated to accommodate this, and I still maintain CGI is shit and you should expect bad things to happen if you use it.

Either way, that's what we call "A bug in CGI.pm", and I was under the impression the right approach is to file a bug when they're found, so that they can be fixed.

Unless of course you want to be a 1337 hacker geek and keep that to yourself for 8 months so you can grand-stand on a stage and feel popular while making it out like its some programming language to blame for bad software existing.

golf clap

"I'm not blaming CGI developers"

But you should. All these problems are design decisions made by programmers.

They could have easily made decisions that avoided these pitfalls.

The language holding your hand wouldn't magically make this go away.

Yes, there are bugs in the language, and CGI's poor design choices make those bug bigger.

But lets not fool ourselves that CGI.pm is the shining standard of decent perl modules on CPAN, like you seem to have a hardon for trying to argue.

And you should blame people who use CGI.pm for using CGI.pm, because...

... CGI.pm is no longer considered good practice for developing web applications, including quick prototyping and small web scripts. There are far better, cleaner, quicker, easier, safer, more scalable, more extensible, more modern alternatives available at this point in time.

And it has said that for over a year now ( May 2014 )

That's right up there in The Manual, and anyone who is part of "The Perl Community" will gladly tell you that CGI.pm is shit, and you shouldn't use it.

And if you're not going to listen to either of those very loud signals, you very much should be considered part of the problem.

I mean, what else do we have to do to convince you not to use CGI.pm?

Maybe we should be adding security holes instead of removing them to discourage its use?

CLOSING.

In short, your diagnosis revolves mostly on needing CGI.pm, and revolves around mistakes that CGI.pm encourages you to make that don't make sense in a Web Context, but do make sense in a CLI context.

And then you conflate intepretations of fundementals that aren't true, and you think Perl is the problem.

But maybe, you should just stop using CGI.pm, and tell people "Stop using CGI.pm"

And maybe, you could consider filing bugs about things and getting them fixed, not bitching that something that is broken and wondering why nobody has fixed it yet.

But hey, at least you know how Perl5 and Perl6 are fundementally different ;), so you got something right :p

AFTERTHOUGHTS on Questions

Why do Perl people bitch when its attacked?

Well, its become a bit of an annoying trope.

Its an "in thing" to go around complaining about Perl, and at least half the people who are "attacking it" are using arguments about it that are false.

So attacking perl with stupid arguments has become a Meme in the programming community, and it is largely perpetuated by people who don't know the language, who see its syntax, never invest the effort to properly understand it, and then critise it, not objectively, but out of ignorance.

So there's naturally a level of kickback.

And we also try to be better than have debates where we're constantly insulting languages for non-objective reasons, becuase we'd rather shut the fuck up and write some code ;)

Just if people could learn to criticise it objectively ( and file fucking bugs plz ), then maybe we could consume that objective criticism and use it to improve the language.

@dr-kd
Copy link

dr-kd commented Jan 1, 2016

Thumbsup from here. @D3Ve1inE: yeah please stop this, it's boring. To reiterate briefly from my experience. CGI.pm is shit. Do not write new code using it. Port old code away from it where possible. Use Plack/PSGI and modules like CGI::Compile/Plack::App::CGI to port away. While I would love it if there was a backcompat CGI.pmButNotDreadful module, it's not going to happen. There are too many poor choices in API design to make that practical to achieve. Why don't you do a talk next year on "proper, reasonably secure web programming in Perl using the modern perl stack rather than the ancient and broken CGI.pm module" rather than continuing to tilt at windmills? I've been doing reasonably big things in perl for a few years now, and generally the security audits have gone fine with only a tiny little extra effort on the team's part. You appear to be arguing that this is not possible?

@pwr22
Copy link

pwr22 commented Jan 1, 2016

Good to see an informed write up of the guy's talk

I stopped watching after about fifteen minutes because it became pretty clear it was a waste of time and was mostly about the olympic sport of "Perl Bashing". Which I totally used to do, early in my second year of uni because I was parroting off what one of my lecturers said...

Thankfully I at some point became more informed. I hope the talk giver does at some point too and they don't feel too negatively about themself when they realise how they came across. To a layman they might be hilarious but to someone familiar with the field he's wading into he sounds roughly like a five year old preaching issues with quantum theory to experienced physicists

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment