During the past several years, the way of managing JavaScript dependencies has evolved, bringing some advanced solutions. One of the concepts which has become very popular today is the module pattern. The beginning of this article explains the idea pretty well. This concept was later reused in many modern dependency-management solutions, and was finally formalized as the AMD API specification, the best-known implementation of which is probably RequireJS.
The exporting feature, whose problems are discussed in this text, is a part of the module pattern. Strictly speaking, the module pattern itself has no relation to dependency resolution; it is rather designed for managing and storing data in a particular way. But it naturally serves as a base for dependency-management libraries.
The module pattern is based upon the concept of a function which takes arguments and returns a value, where the arguments stand for the module dependencies, and the returned value is the object provided by the module. The core part of the module pattern is a function expression which may look like this:
// objects created by dependencies are provided as arguments
function( dep1, dep2, dep3 ) {
    // perform the needed actions to build up some new library objects
    var routines = ...

    // export the created objects
    return routines;
}
This function will be called by a dependency-management library as soon as the module's dependencies are ready, and the objects created by the dependencies will be provided as arguments. Inside the function body, the module builds up its routines and returns the created object. This object will then be handled by the dependency-management library, and will later be provided in a similar way to other modules which require this module as a dependency.
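As a rough sketch (not any particular loader's actual API — the define() and registry names here are illustrative, and real loaders also handle asynchronous loading and ordering), a dependency manager could wire such factory functions together like this:

```javascript
// a toy registry mapping module names to the objects their factories returned
var registry = {};

function define(name, deps, factory) {
    // resolve each dependency name to the object its module exported
    var args = deps.map(function(dep) { return registry[dep]; });

    // call the factory with the resolved objects and store whatever it returns
    registry[name] = factory.apply(null, args);
}

// a module producing an object
define('dep1', [], function() {
    return { greet: function() { return 'hi'; } };
});

// a module consuming that object through its argument list
define('main', ['dep1'], function(dep1) {
    return { run: function() { return dep1.greet(); } };
});

console.log(registry['main'].run()); // 'hi'
```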
The main point of exporting is that the exported objects never leak outside: they do not mess up the global namespace, and are provided exactly to the modules which demand them.
In the module pattern described above, exporting stands for a data transfer from one module to another, explicitly specifying for each object the module which should provide it. A bonus feature is the opportunity to use the local scope of the function to keep some private data.
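For instance, the privacy provided by the function's local scope could be sketched like this (the counter module below is purely illustrative):

```javascript
// an immediately-invoked factory: 'count' lives in the function's local
// scope and is only reachable through the returned methods
var counter = (function() {
    var count = 0; // private data, never exported

    return {
        increment: function() { count++; },
        value: function() { return count; }
    };
})();

counter.increment();
counter.increment();
console.log(counter.value());      // 2
console.log(typeof counter.count); // 'undefined' - the private variable is not exposed
```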
A similar approach of transferring objects between modules is used in the CommonJS specification implemented in Node.js. There is no factory function wrapping each module, but the logic of exporting is the same: the module providing the needed object should be explicitly specified.
This idea applied to a module dependency-management system gives a nice picture: each module is stored in a separate file, the needed objects are provided directly by its dependencies, and the module itself defines which objects it will provide. And it looks pretty good until it is put into real-life conditions.
First of all, it appears that such an approach is not very scalable: it takes an effort to split a module which has grown too big into several pieces. If a part of the module logic goes to a new module, all the dependent modules should be updated to properly import the detached routines from the new module, and therefore the links between the exported and imported objects have to be set up again.
A similar problem appears when we need to make a common module for loading several other modules which are often used at once. Because of the exporting, such a common module should first import the objects from those modules, put them into a common object, and export this object further. For instance, in Node.js such a common module could look like this:
var common = {
    dep1: require('dep1'),
    dep2: require('dep2'),
    dep3: require('dep3')
};

module.exports = common;
And it would be fine if we just needed to include this common module instead of the three original dependencies, but the usage of the imported objects now also has to be updated. So if previously a dependency was used like this:
var dep1 = require('dep1');
dep1.doSomething();
now it should be used in a new way:
var common = require('common');
common.dep1.doSomething();
And this has to be updated for every use-case of the imported object.
Here we can also point out another inconvenience brought by the exporting: the API of a library depends on how the library is organised (because the exported objects are hard-linked to the module structure). This complicates refactoring: if you wish to remake the module structure, you will also have to make an effort to keep the library API stable. In fact it is not too much work, but as a result people often implement a library as a single huge module instead. This still works for resolving dependencies, but the idea of splitting big code into smaller parts is already ruined at this point.
Another problem is that with this approach one has to write a lot of subsidiary stuff for each module. In addition to the factory function listed in the module-pattern example above, a dependency-management system also needs some way to identify the modules which provide the objects substituted as arguments (a list of their paths or some kind of module ids). Additionally, we need to declare the module itself in a special way, so that it can be recognized by the dependency-management system and later reused by the modules which need it as a dependency.
This can be illustrated by how dependencies are specified in RequireJS. A declaration of a module with dependencies could look like this:
define(
    ['dep1', 'dep2', 'dep3'],
    function( dep1, dep2, dep3 ) {
        ...
    }
);
The first argument of the define() function is a list of module identifiers, and the objects exported by those modules are mapped to the arguments. The code obviously becomes more complicated when there are more dependencies:
define(
    ['dep1', 'dep2', 'dep3', 'dep4', 'dep5', 'dep6', 'dep7', 'dep8'],
    function( dep1, dep2, dep3, dep4, dep5, dep6, dep7, dep8 ) {
        ...
    }
);
Now there is a much greater chance to make a mistake. To solve this, the creators of RequireJS invented another way of listing dependencies and mapping them to the exported objects (this solution is called the 'simplified CommonJS wrapper'):
define(
    function (require) {
        var dep1 = require('dep1'),
            dep2 = require('dep2'),
            dep3 = require('dep3'),
            dep4 = require('dep4'),
            dep5 = require('dep5'),
            dep6 = require('dep6'),
            dep7 = require('dep7'),
            dep8 = require('dep8');

        ...
    }
);
This way of specifying the dependencies is easier to read and more convenient to use. But now we have a second way to do the same thing, and the amount and structure of the code needed to set up the dependencies is just outstanding. Why can't we simply list the dependencies? The only reason is that we also need to specify a correspondence between each dependency and the object it exports.
These complications are only brought by the exporting feature, particularly by the fact that the exported objects are always linked to the modules which export them. Other aspects of the module pattern cause no problems: putting the code into a function still provides a convenient way of managing private data not intended to be exported (by using function-local variables), and this function can also be used by a dependency-management solution to be called at the appropriate time (when all the dependencies are ready).
Thus exporting forces programmers to pass the data through each module, and this has to be done for each exported object. This rule is actually a limitation, because it implies that a module always results in an object.
The modular structure of a library is a matter of internals, and it should be arranged according to the library's architecture. On the other hand, the API of a library should be designed according to how the library is used. With the export approach these two things are linked together, and therefore the internal structure of a library influences the API upon each refactoring, as shown above. If we break this link, the issue is solved. How could this be achieved?
In the existing export-based solutions, a module (along with the exported object) is identified by its name (a file path, or some kind of module id which is then resolved by a module-loading system). This identifier is a string which refers to a module in some kind of globally accessible registry (the filesystem, or an external config defining the module ids).
We could create another similar but independent registry for storing the library objects, and let the modules decide themselves what and when they wish to create on that registry. Such an approach would mean switching the modules' behaviour from 'producing objects' to 'performing actions' (where an action could also mean producing an object, but not necessarily).
Now, to get a library object, we ask the objects registry. The dependency-declaration code is simplified: it is enough to simply list the needed dependencies in the module head, and all these workarounds for setting up the correspondence between modules and objects are not needed anymore.
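With such a registry, a module head could hypothetically reduce to a plain list of dependencies. The include() and init notation below is an illustrative sketch, not a definitive API:

```javascript
// dependencies are simply listed, no mapping to exported objects needed
include('dep1.js');
include('dep2.js');

init = function() {
    // runs after the dependencies have performed their actions,
    // e.g. registered their objects on a shared registry
    LIB.dep1.doSomething();
};
```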
Let us try to figure out what kind of registry it should be. A module should be able to create an object on that registry, and this object should be accessible by its identifier from any part of the code. I guess you have already spotted what I am implying: we already have this kind of registry. It is the global scope. But everyone knows that using the global scope is a bad practice, isn't it?
Well, not exactly. Globals are bad when used without any control, by creating a global whenever a variable is needed. But if there were a single globally accessible registry, conventionally named something like LIB, containing one object per library, each named after the library it belongs to, that would be similar to referring to an exported object by the module path or identifier (which are also global, as explained above).
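A minimal sketch of such a convention (the LIB name and the module contents here are illustrative, not an actual API):

```javascript
// a single conventional global registry, one entry per library
var LIB = LIB || {};

// the 'mylib' module performs an action: it registers its routines
LIB.mylib = {
    doSomething: function() { return 'done'; }
};

// any dependent module refers to the object through the registry,
// without an explicit import mapping
LIB.mylib.doSomething(); // returns 'done'
```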
In the export approach there is a convention according to which a module provides an object to export. Following this convention is what makes the export approach work. But if a module for some reason does not export an object, the import will result in no object. Storing the library object in this kind of registry under the name of the library is a similar kind of convention.
The upcoming ES6 standard includes a native module concept which also provides the exporting feature. It is a bit more advanced (compared to the simple exporting reviewed in this text), in the sense that it allows a module to export several objects at once, and then to specify a particular object to be imported from a module. Nevertheless, the library objects are still linked to and identified by the module, which means that the discussed issues also apply to that approach.
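For reference, the ES6 syntax looks roughly like this (two hypothetical files shown together as a sketch):

```javascript
// math.js - several objects exported at once
export function add(a, b) { return a + b; }
export function mul(a, b) { return a * b; }

// main.js - a particular object is picked at import time,
// but it is still identified through the providing module
import { add } from './math';
add(2, 3); // 5
```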
Moreover, instead of simplifying the task of refactoring a library and splitting it into smaller modules, the feature of importing a particular object provided by a module implies that the whole library is located inside a single module. In fact, this simply legalizes the huge single-module libraries!
Prohibiting implicit globals is just great, but should we really treat this new kind of exporting as a 'good practice', or rather as yet another anti-pattern which in fact only consumes the developer's effort to support itself?
This text is an attempt to explain the decision on the module format for the Helios Kernel loader. After the library release, I sometimes received feedback with complaints that its module format does not allow exporting the objects created by modules. In fact, this is not the case: it is still possible to implement any approach for managing the created objects on top of Helios Kernel (just like the existing solutions are implemented on top of the lower-level browser API). But this is not necessary: instead it is suggested to follow a more flexible approach and treat the modules not as 'producing objects', but rather as 'performing actions' (so that modules can do, not only make), as described in this text.
Comments and suggestions are welcome
--
You can find me on twitter: https://twitter.com/asvd0
Also check out some of my projects on github (ordered by my impression of their significance):
Helios Kernel: isomorphic javascript module loader
Jailed: a library for sandboxed execution of untrusted code
Lighttest: isomorphic unit-testing library