<?php
// located at: http://another.tld/api.php
$api_callback = $_REQUEST["callback"];
if ($_COOKIE["token_1"] == "abcd1234" && $_GET["token_2"] == "efgh5678") {
    $msg = "Yes, your API call was successful!";
}
else {
    $msg = "API call not authorized.";
}
?>
// this is a JSON-P style response from the API
<?=$api_callback?>({"msg": "<?=$msg?>"});

<?php
// located at: http://another.tld/auth.php
$token_1 = "abcd1234";
$token_2 = "efgh5678";
$auth_callback = $_REQUEST["callback"];
setcookie("token_1",$token_1);
?>
// in JS, document.domain is not settable or spoofable so it's
// reliable to protect a cross-domain JSON-P call
if (document.domain == "something.tld") {
    <?=$auth_callback?>({"token_2": "<?=$token_2?>"});
}

// this file is loaded and run on http://something.tld/index.html
function make_jsonp_call(url) {
    var script = document.createElement("script");
    script.src = url;
    script.type = "text/javascript";
    document.getElementsByTagName("head")[0].appendChild(script);
}
function api_done(resp) {
    alert(resp.msg);
}
function get_auth(auth) {
    var token_2 = auth.token_2;
    // not only do we have token_2 by way of the auth parameter,
    // but token_1 is stored in a browser cookie now. together,
    // these two tokens will authorize our API call.
    make_jsonp_call("http://another.tld/api.php?token_2="+token_2+"&callback=api_done");
}
make_jsonp_call("http://another.tld/auth.php?key=987654321&callback=get_auth");

I liken this to the classic CS problem of trying to take any given matched string and find the exact regular expression that matched it.
Since infinitely many different regexes could match a given string, you can't recover the exact regex a match came from just by looking at the string. The regex match process is a lossy one-way street. There's the easy base regex /thestring/, but what if I had originally started with some other crazy regex that matched the same string? You have no way of knowing which regex I started with, short of starting with the base regex, exhaustively brute-forcing every possible variation the regex grammar allows, and asking each time whether that's the regex I started with.
You could probably do this by brute force, but it's not gonna be pretty, easy, or trivial.
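To make the analogy a bit more concrete, here's a minimal sketch. The candidate patterns and the `makeVariant` helper are made up purely for illustration; the point is only that the matched string alone doesn't tell you which pattern produced the match.

```javascript
// Illustration only: these patterns are hypothetical examples.
var target = "thestring";

// A handful of unrelated regexes that all match the same string.
var candidates = [
  /thestring/,
  /^the\w+$/,
  /t.e.t.i.g/,
  /^[a-z]{9}$/,
  /(the)(string)/,
  /^t.*g$/
];

var matching = candidates.filter(function (re) {
  return re.test(target);
});

// An endless family: for every n >= 1 this builds a different regex
// that still matches the target, so the candidate space never runs out.
function makeVariant(n) {
  return new RegExp("^thestring$|^x{" + n + "}$");
}

console.log(matching.length + " of " + candidates.length + " candidates match");
console.log(makeVariant(1).test(target), makeVariant(500).test(target));
```

Given only "thestring", nothing distinguishes any of these from the one I "really" started with.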
Things that Rhino needs some help with in order to decode a "noalnum" string:
- The global object needs to be referenced by a variable named "window" (duh)
- It needs "atob" and "btoa" functions
- The "toString" function on the global/window object needs to return "[object Window]" instead of "[object Global]"
- The Array prototype needs a "filter" that doesn't have to do anything in particular other than be a function, and that has (on the function object itself) a toString method returning the sort of string Firefox returns ("function filter () {\n [native code]\n}")
- The String prototype needs a working "fontcolor" function (trivial)
- The global/window "Date" function has to be replaced by a function that just returns a random JavaScript-style date string (this is due to a NullPointerException bug in the Rhino Date() function)
I think that's pretty much it.
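Roughly, that list could be sketched as the shim below. This is a guess at what such a shim might look like, not the code anyone actually ran: `installRhinoShims`, `atobShim` and `btoaShim` are hypothetical names, the base64 helpers assume 8-bit characters and well-formed input, and the date string is an arbitrary fixed value.

```javascript
// Hypothetical shim sketch; none of these names come from Rhino or env.js.
var B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Base64-encode a string of 8-bit characters.
function btoaShim(input) {
  var str = String(input), out = "";
  for (var i = 0; i < str.length; i += 3) {
    var a = str.charCodeAt(i);
    var b = i + 1 < str.length ? str.charCodeAt(i + 1) : NaN;
    var c = i + 2 < str.length ? str.charCodeAt(i + 2) : NaN;
    var n = (a << 16) | ((b || 0) << 8) | (c || 0);
    out += B64.charAt((n >> 18) & 63) + B64.charAt((n >> 12) & 63) +
           (isNaN(b) ? "=" : B64.charAt((n >> 6) & 63)) +
           (isNaN(c) ? "=" : B64.charAt(n & 63));
  }
  return out;
}

// Decode a base64 string back into 8-bit characters.
function atobShim(input) {
  var str = String(input).replace(/=+$/, ""), out = "", bits = 0, acc = 0;
  for (var i = 0; i < str.length; i++) {
    acc = (acc << 6) | B64.indexOf(str.charAt(i));
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out += String.fromCharCode((acc >> bits) & 255);
    }
  }
  return out;
}

// Install the shims on a global object and on the given constructors.
function installRhinoShims(globalObj, ArrayCtor, StringCtor) {
  globalObj.window = globalObj;
  globalObj.atob = atobShim;
  globalObj.btoa = btoaShim;
  globalObj.toString = function () { return "[object Window]"; };

  // filter only has to exist and stringify like a native function.
  var filter = function filter() {};
  filter.toString = function () {
    return "function filter () {\n [native code]\n}";
  };
  ArrayCtor.prototype.filter = filter;

  StringCtor.prototype.fontcolor = function (color) {
    return '<font color="' + color + '">' + this + "</font>";
  };

  // Stand-in for Date: always returns the same date-like string.
  globalObj.Date = function () {
    return "Thu Jan 01 2009 00:00:00 GMT-0600 (CST)";
  };
  return globalObj;
}
```

In Rhino, the call would presumably be something like `installRhinoShims(this, Array, String)` at the top level. Note that this clobbers any real `filter` and `Date`, so it only makes sense in a throwaway environment set up just for decoding.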
@fearphage -- I'm not sure if you realize how noalnum works... but it's not just a single-pass pattern replacement. In that demo URL above, you can "step" through the algorithm one step at a time and see that what it's doing is a bit more complicated than that.
You must know that not all algorithms are trivially reversible. I can't prove mathematically whether this algorithm is or isn't trivial to reverse, but I can assert that I could easily "perturb" this predictable algorithm by randomly choosing among a thousand different possible replacements for the "s" character... which means that a single API key string even a few characters long could be randomly converted into a hundred thousand or more different noalnum sequences. The particular demo URL may be predictable/deterministic, in that "s" may always be the same noalnum sequence, but it'd be trivial to severely complicate the reversing by just adding some randomness in there.
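Here's a minimal sketch of that kind of perturbation. To be clear, this is not the noalnum algorithm: it's a hypothetical JSFuck-style stand-in that can only encode the letters of "false", and every name in it (`encodeRandom`, `countVariants`, and so on) is made up for illustration. It just shows how a random choice among equivalent spellings multiplies the number of possible outputs.

```javascript
// Hypothetical stand-in encoder, not the actual noalnum tool.
var SOURCE = "false";                               // letters we can index into
var FALSE_FORMS = ["![]+[]", "[![]]+[]", "![]+[[]]"]; // each evaluates to "false"
var TRUE_ATOMS = ["!+[]", "!![]"];                  // each evaluates to true

function pick(list, rng) {
  return list[Math.floor(rng() * list.length)];
}

function indexFor(ch) {
  var idx = SOURCE.indexOf(ch);
  if (idx === -1) throw new Error("no encoding for: " + ch);
  return idx;
}

// Spell the number n as a sum of randomly chosen "true" atoms.
function encodeIndex(n, rng) {
  if (n === 0) return "+[]";
  var atoms = [];
  for (var i = 0; i < n; i++) atoms.push(pick(TRUE_ATOMS, rng));
  return "+" + atoms.join("+");
}

function encodeChar(ch, rng) {
  return "(" + pick(FALSE_FORMS, rng) + ")[" + encodeIndex(indexFor(ch), rng) + "]";
}

// Encode text with a fresh random choice at every step.
function encodeRandom(text, rng) {
  rng = rng || Math.random;
  var parts = [];
  for (var i = 0; i < text.length; i++) {
    parts.push(encodeChar(text.charAt(i), rng));
  }
  return parts.join("+");
}

// How many distinct encodings this scheme can produce for text.
function countVariants(text) {
  var total = 1;
  for (var i = 0; i < text.length; i++) {
    total *= FALSE_FORMS.length * Math.pow(TRUE_ATOMS.length, indexFor(text.charAt(i)));
  }
  return total;
}

console.log(encodeRandom("sea"));
console.log(countVariants("sea"));
```

Every output still evaluates back to the original text, but two runs rarely produce the same sequence, so a fixed lookup table of "this sequence means s" stops being enough. A real encoder would have far more equivalent spellings to choose from than this toy does.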
I think it's quite clear (and provable) that the noalnum string would provide a pretty decent barrier against breaking it by brute-force pattern recognition/replacement. Again, it's not perfect and unbreakable, but I suspect it would take a lot of effort at least.
It remains to be proven, one way or the other, whether it's trivial to load in a server-side DOM emulation like env.js and have the noalnums just magically work on the server. I don't have Rhino so I can't verify at the moment. env.js doesn't work in V8, so I couldn't test it there. But I did try just defining some common "window" and "alert" objects/functions, and even the noalnum for the "." character couldn't be evaluated directly.
If someone can prove that "env.js" loaded into Rhino is all it takes to reverse any possibly complicated noalnum, or even most of them, you still haven't proven that this approach has no value. You've simply proved that Rhino+env.js is just one feasible spoofing vector. Anyone who doesn't have that environment (Java for Rhino, right?) still may be dead in the water, or rather, facing an uphill battle at best.
Also, I bet this approach + [something else yet to be devised] could be combined to thwart the Rhino/env.js crowd, or at least get it to the point where it wasn't trivially easy to overcome.
I should also say that while browser (not JS engine) automation, like in a VM, would be something I would like to prevent, the spirit of this question was just to ensure that someone couldn't easily set up a proxy server script against a particular Ajax type request. If it turns out that browser automation in a VM is the only viable way to get around this "security", then we've actually accomplished pretty much what I wanted: make it so that a browser has to interpret an Ajax request.
I'm not concerned with trying to enforce that a browser isn't being automated... there are CAPTCHAs and other such approaches that can be used to ensure a human is sitting at the console.