UPD: just released a library implementing all the features described below: https://github.com/asvd/jailed
When there is a need to run untrusted code in JavaScript, one may jail it within a web-worker. But what makes it secure also makes it restricted: one may only send messages and transfer JSON-serialized data, but it is not possible to directly call a function, or use an object of the opposite side.
This text describes an approach which may be used to simulate exporting a set of functions into a worker scope, so that those functions are usable almost in the same way as if they were actually provided to the worker.
The concept is not new: in the world of Big Serious Programming Languages it is known as a remote procedure call, which is, roughly speaking, an invocation of a piece of code of an application at a remote site by sending a message over the network.
In the case of workers in JavaScript everything is a bit simpler: a worker is not so remote, the communication channel is already built in, and a function signature may only consist of the function name. Moreover, the dynamic nature of JavaScript makes it possible to create a set of wrappers on the worker's side, so that those wrappers look and behave exactly like the actual functions provided by the application, which gives the impression that those functions have actually been exported into the worker. Under the hood of those wrappers, a message is sent to the application side, which finally leads to the actual function invocation.
Potentially insecure code (which should be restricted and is therefore launched in a worker) is referred to in this text as a plugin. Additionally there is another piece of code (also loaded into the worker and residing in a special file named shovel.js), which prepares the wrappers representing the exported functions on the worker's side, as described above.
When referring to the related files in the code snippets below, absolute paths should be provided. Assuming that all scripts reside in the same directory, a special variable localPath is used, which is a string containing the absolute path of that directory. It may be defined as follows:
var scripts = document.getElementsByTagName('script');
var localPath = scripts[scripts.length-1].src
    .split('/').slice(0, -1).join('/');
The kind of connection between an application and a plugin described above can be initialized with the following steps:

- Create a worker;
- Load shovel.js into the worker;
- Send a message to the worker containing a list of exported function names, along with the path of the plugin to be loaded;
- Handle the message on the worker's side by the code of shovel.js, prepare a set of wrapper functions (each representing an exported function), and finally load the plugin;
- The code of the plugin may now use the wrappers created by shovel.js as if those were the functions actually exported by the application. Since the plugin runs in the restricted environment of a worker, any other routines (not exported explicitly when initializing the plugin) are not accessible.
Translating everything written above into JavaScript, here is a function which creates a worker, sends it a list of exported function names and the path of the plugin to be loaded, and handles the messages coming from the worker:
var loadPlugin = function(path, api) {
// creating a worker as a Blob enables import of local files
var code = 'importScripts("'+localPath+'/shovel.js");';
var worker = new Worker(
window.URL.createObjectURL(new Blob([code]))
);
var names = [];
for (var i in api) {
if (api.hasOwnProperty(i)) {
names.push(i);
}
}
worker.postMessage({path: path, api: names});
// message treated as a remote request to invoke a function
worker.addEventListener('message', function(e) {
api[e.data.name].apply(null, e.data.args);
});
};
The shovel.js is loaded on the opposite side. It should create the wrappers for the exported functions, and finally load the plugin source:
// application only sends an initialization message
self.addEventListener('message', function(e){
for (var i = 0; i < e.data.api.length; i++) {
setMethod(e.data.api[i]);
}
importScripts(e.data.path);
});
// stores the wrappers for the exported functions
var remote = {};
// creates a single wrapper function
var setMethod = function(name) {
remote[name] = function() {
// arguments should be a pure array for proper serialization
var args = [];
for (var i = 0; i < arguments.length; i++) {
args.push(arguments[i]);
}
// requesting the actual function invocation
self.postMessage({name: name, args: args});
};
}
Upon being loaded, a plugin runs in the worker, and may use the remote object containing the exported functions. Now it is possible to perform something like this:
// exporting the alert method
loadPlugin(localPath+'/plugin.js', {alert: alert});
// runs in the worker, cannot access the main application, except
// for the explicitly exported alert() method
remote.alert('Hello from the plugin!');
(exporting the alert() method is probably not the best idea)
There is an online demo of another example demonstrating this concept, used to let a 3rd-party library from a partner manage a simple advertising banner.
There are several ways of how this approach can be extended:
Since the arguments are serialized into a JSON string upon being sent as a message, the implementation above does not support using a callback as an argument. But it may be important to send a result of the function execution back to the worker.
Callbacks may be supported in a similar way as the exported functions: they should be stored on the worker's side when a wrapper is executed, and identifiers should be sent in the message instead, along with the other arguments.
When the application handles a message requesting an exported function invocation, it should replace the callback identifiers with fake callbacks, which would actually send a message back to the worker with the identifier of the callback to invoke. That message is then handled by shovel.js and finally leads to the actual callback invocation.
The tricky thing about the callbacks is that they should be cleared after being called, otherwise they will consume memory forever. This means that a callback can only be executed once. Something similar applies to the case when several callbacks are provided to a function (for instance, a success and a failure callback). In most cases only one of them will be used, which implies that all the callbacks related to a single invocation should be cleared upon the first usage of any of them.
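The clearing logic described above can be sketched as a small callback registry on the worker's side (the names storeCallbacks and invokeCallback are illustrative, not part of the actual implementation): all callbacks of a single invocation share a group, and the whole group is dropped as soon as any of them fires.

```javascript
// A sketch of a callback registry on the worker's side (names are
// illustrative): callbacks belonging to one function invocation
// share a group, and the whole group is cleared as soon as any of
// them fires, so that unused callbacks do not leak memory
var callbacks = {};   // callback id -> {fn, group}
var groups = {};      // group id -> list of callback ids
var nextId = 0;

// stores all callbacks of a single invocation, returns their ids
// (the ids are sent in the message instead of the functions)
var storeCallbacks = function(fns) {
    var group = 'group' + (nextId++);
    groups[group] = [];
    return fns.map(function(fn) {
        var id = 'cb' + (nextId++);
        callbacks[id] = {fn: fn, group: group};
        groups[group].push(id);
        return id;
    });
};

// runs when a message with a callback id arrives from the
// application; clears the whole group before the actual call
var invokeCallback = function(id, args) {
    var entry = callbacks[id];
    if (!entry) {
        return;  // already cleared, nothing to do
    }
    groups[entry.group].forEach(function(cbId) {
        delete callbacks[cbId];
    });
    delete groups[entry.group];
    entry.fn.apply(null, args);
};
```

After the first invocation of any callback from a group, the remaining ones silently become no-ops, which matches the restriction described above.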
Nevertheless, such restrictions do not break the convenience of using exported functions in most cases.
Upon initialization, a plugin may perform a similar export of its own functions in order to make them accessible to the application. This could be useful for a plugin which performs some heavy calculations: the worker runs in a separate thread and does not slow down the application. A direct usage of a function exported by a plugin may be more convenient compared to manually sending and handling messages.
Since a plugin should report the result somehow, this only makes sense along with the callbacks described in the previous section.
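On the application's side, the wrappers for the plugin-exported functions can be built the same way shovel.js builds them, only in the opposite direction. A minimal sketch (makeRemoteWrapper is a hypothetical helper, not part of the code above):

```javascript
// Hypothetical helper: builds a wrapper on the application's side
// for a function exported by the plugin; calling the wrapper posts
// an invocation request to the worker, mirroring what shovel.js
// does in the opposite direction
var makeRemoteWrapper = function(worker, name) {
    return function() {
        // arguments should be a pure array for proper serialization
        var args = Array.prototype.slice.call(arguments);
        worker.postMessage({type: 'invoke', name: name, args: args});
    };
};
```

In a real setup the args would also carry the callback identifiers from the previous section, so that the plugin could deliver the calculation result back.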
In order to implement the same thing in a Node.js environment, a subprocess should be used instead of a worker. The process created by the fork() method of the child_process module also provides a built-in communication channel, so the plugin initialization code is nearly the same.
The only difference is that additional measures should be taken to place the insecure code into a jailed environment: a subprocess is not restricted by default and may perform unwanted actions, like accessing the filesystem.
In order to jail the code, it should be executed using the runInNewContext() method of the vm module. The sandbox object provided to that method should only contain the set of wrappers for the exported functions, and perhaps a couple of service functions like setTimeout() and setInterval().
Additionally, the jailed code should be executed in strict mode (otherwise it could break out of the sandbox using arguments.callee.caller). To ensure strict mode, 'use strict' should be added at the beginning of the script content before execution.
This means that the script should first be loaded as a string with a function like this:
// loads a file with the given path; provides its contents
// to the success callback sCb, or calls fCb upon a failure
var loadContents = function(path, sCb, fCb) {
    if (path.substr(0,7).toLowerCase() == 'http://' ||
        path.substr(0,8).toLowerCase() == 'https://') {
        // loading the remote file
        var receive = function(res) {
            if (res.statusCode != 200) {
                fCb();
            } else {
                var content = '';
                res.on(
                    'readable',
                    function() {
                        var chunk = res.read();
                        if (chunk) {
                            content += chunk.toString();
                        }
                    }
                );
                res.on('end', function(){ sCb(content); });
            }
        };

        try {
            require('http').get(path, receive).on('error', fCb);
        } catch (e) {
            fCb();
        }
    } else {
        // loading the local file
        try {
            sCb(require("fs").readFileSync(path).toString());
        } catch(e) {
            fCb();
        }
    }
};
After the script content is loaded and prefixed with 'use strict', it may be evaluated by the runInNewContext() method.
--
You can find me on twitter: https://twitter.com/asvd0
Also check out some of my projects on github (ordered by my impression of their significance):
Helios Kernel: isomorphic javascript module loader
Jailed: a library for sandboxed execution of untrusted code
Lighttest: isomorphic unit-testing library
@linuxenko It creates a blob with code as its contents in order to load it into a Worker. Alternatively it is possible to create a file (say, worker.js) with those contents and load it with new Worker('worker.js'), but that would result in an extra request.