Thursday, 17 December 2009

GET information, POST changes

A micro-post, but this isn't said often enough: Don't use GET requests to change data server-side, use POST (for example, when using Ajax to modify things server-side). GETs are supposed to be safe (they shouldn't do anything other than retrieve information) and idempotent, which is a fancy way of saying that doing them repeatedly has the same effect as doing them once. So, GET the contents of a message on a message board, but POST new messages to the board.

More (in incredibly turgid prose!) in Section 9.1.2 of the HTTP spec.
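To make the distinction concrete, here is a minimal sketch in plain JavaScript. The `handle` function and the in-memory `messages` array are hypothetical stand-ins for a real server, not part of any library; the only point is that repeating the GET changes nothing, while each POST adds a message.

```javascript
// Hypothetical in-memory message board, for illustration only
var messages = ["First post"];

// Hypothetical request handler: GET reads, POST changes
function handle(method, body) {
    if (method === "GET") {
        // Safe to repeat: returns a copy and modifies nothing
        return messages.slice();
    }
    if (method === "POST") {
        // Changes server-side state, so it should never be a GET
        messages.push(body);
        return messages.length;
    }
    throw new Error("Unsupported method: " + method);
}

handle("GET");            // read
handle("GET");            // same effect as doing it once
handle("POST", "Hello");  // the board now has two messages
```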

Tuesday, 13 October 2009

A better way to hide .htaccess, WEB-INF

If you use Apache as your web server of choice, you may wish to have files or directories that Apache pretends are not there. For me, this is because I like to have Apache proxy a servlet container backend, but I'm too lazy to separate out the files, and so I just point Apache and the servlet container at the same directory and tell Apache to pass on the relevant requests to the servlet container (e.g. JSPs, servlets, etc.). The only problem is, that common directory will tend to contain things like the WEB-INF directory and its subdirectories, which I kind of don't want Apache to serve up to the public! If you use .htaccess files, you'll have the same sort of situation.

The usual answer is to simply deny access to the file or directory in question, as in this default rule from the Apache2 config:
<Files ~ "^\.ht">
Order allow,deny
Deny from all
</Files>
That sends a 403 (Forbidden) reply when someone tries to access .htaccess. And the same sort of thing can be applied for WEB-INF using the DirectoryMatch directive:
<DirectoryMatch "(^|/)WEB-INF($|/)">
Deny from all
</DirectoryMatch>
And that's great, but I'm a bit paranoid -- why does the outside world need to know that the thing is there at all? I'd much rather it sent a 404. And huzzah, mod_rewrite to the rescue!
<DirectoryMatch "(^|/)WEB-INF($|/)">
RewriteEngine on
RewriteRule .* - [L,R=404]
</DirectoryMatch>
Since the RewriteRule is already inside the DirectoryMatch that will only match what we want, its own regex can just match everything. The L flag says this is the last rule, but the R flag is the magic: It forces a redirect, and if you use the R=xyz form, it redirects with the given code; in this case, a 404. This does the right thing even if you have a custom error document (and you have custom error documents, right?).

Voilà! As far as the outside world is concerned, there just isn't a WEB-INF directory there at all.

If you like, you can have a general rule that works whether mod_rewrite is loaded or not:
<DirectoryMatch "(^|/)WEB-INF($|/)">
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteRule .* - [L,R=404]
</IfModule>
<IfModule !mod_rewrite.c>
Deny from all
</IfModule>
</DirectoryMatch>
This will do a 404 if mod_rewrite is loaded, but fall back to a 403 if not.

Final note: If you're letting Apache generate directory listings for you by not including a directory file, the above won't hide the WEB-INF directory (or whatever you're hiding) in those listings. It's easy enough to do it, though: Inside the relevant directive for the directory being listed, just use IndexIgnore to tell the mod_autoindex module to ignore it:
IndexIgnore WEB-INF
In my case, since I want WEB-INF to be hidden everywhere, I can just include it inside my main Directory directive. I never let Apache generate listings for me anyway, but it's nice to have a backstop if I blow away my index file.

Sunday, 13 September 2009

Private Methods in JavaScript

The question of truly private methods comes up intermittently in JavaScript mailing lists. Most respondents point to Crockford's solution, but then also point out how much memory it takes (each instance gets its own copy of methods; but on the up side, it also gives you truly private member variables as well as methods).

Recently I've been switching to the module pattern for each class, thanks in part to kicking around the Prototype source code a fair bit (and also having read Juriy Zaytsev's article on named function expressions). Since private scope is an intrinsic part of the module pattern, I wondered whether it could help us define truly private methods. And the answer is: Of course it can.

The Pattern


For those not familiar with it, the module pattern is basically:
  • Use an enclosing function to provide scope
  • Use named functions within your function
  • Export the ones you want to have as public methods by returning them from the function (usually as properties on an object)
E.g., something like this:
var CoolModule = (function() {
function foo() { ... }
function bar() { ... }
function cool() { ... }

return {foo: foo, bar: bar};
})();
The anonymous function is the scoping function, which we define and then run immediately, assigning its return value to the CoolModule variable. We end up with a CoolModule object with the properties foo and bar that refer to named functions with the same names. Note that in this example, I did not include cool in the returned object's properties. So what's cool for?

The cool function is accessible anywhere within the scoping function, but nowhere outside of it. So foo can use it, as can bar, but nothing else can. It's a truly private function for the CoolModule module. That's already really useful, but we can also apply it to classes:
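Before moving on to classes, here is a minimal runnable sketch of the module above. The function bodies are assumptions filled in purely for illustration (the original leaves them elided); what matters is that foo and bar can reach cool while outside code cannot.

```javascript
var CoolModule = (function() {
    // Public: exported below
    function foo() {
        return "foo says " + cool();
    }

    // Public: exported below
    function bar() {
        return "bar says " + cool();
    }

    // Private: visible only inside the scoping function
    function cool() {
        return "cool";
    }

    return {foo: foo, bar: bar};
})();

CoolModule.foo();        // "foo says cool"
typeof CoolModule.cool;  // "undefined", since cool was never exported
```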

Defining Classes


Okay, so let's look at building a class that way. Most toolkits provide some means of creating a "class" based on an object hash of methods. For instance, using Prototype's Class.create, we can define a class like so:
var Thingy = Class.create({

initialize: function(name) {
this.name = name;
},

doSomething: function() {
// ...
}

});
That passes an object literal with named properties for each of our "methods", which Class.create uses to produce a "class" for us (well, more technically a constructor function with the given methods as members on the prototype; JavaScript doesn't have classes, of course, it's a prototype-based language, but that's a different subject).

Defining Thingy that way has a problem: All of its functions are anonymous. They're assigned to properties, and those properties have names, but our functions do not. This is a problem any time any tool wants to help us out, for instance if the browser wants to tell us what function an error occurred in, or if a debugger wants to show us the call stack. So we're better off using the module pattern here, like so:
var Thingy = Class.create((function(){

function initialize(name) {
this.name = name;
}

function doSomething() {
// ...
}

return {initialize: initialize, doSomething: doSomething};
})());
Now our functions have proper names. (They can have different names from the properties that reference them, which is sometimes handy, but let's keep them the same for now.)

Doing that gives us a truly private scope (our anonymous function), which surely helps with giving us truly private methods, right?

Right.

Class Methods


You remember that cool function from earlier, the one we defined but didn't export. The easiest and most obvious benefit we get from our scoping function is that we immediately have class methods with no need for any other work: We just define a function and don't export it in the return value. It's available to all of our other functions, but it's completely closed off from the outside. It's not tied to any instance, but class methods are quite useful things.
var Thingy = Class.create((function(){

function initialize(name) {
this.name = name;
}

function doSomething(param) {
var transmogrified = transmogrify(param);
// ...
}

function transmogrify(str) {
return str.toUpperCase();
}

return {initialize: initialize, doSomething: doSomething};
})());
Our public method doSomething uses our private class method transmogrify to do something to its parameter, presumably something other methods will also need to do but which shouldn't be a public feature of the class. Very handy!
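The same idea can be tried without Prototype loaded. The sketch below is a variant under stated assumptions: it swaps Class.create for a plain constructor function returned from the scoping function, and has doSomething return a value (the original body is elided) so the result is easy to check.

```javascript
var Thingy = (function() {

    // Constructor (plays the role of initialize)
    function Thingy(name) {
        this.name = name;
    }

    // Public method: exported via the prototype below
    function doSomething(param) {
        return this.name + " says " + transmogrify(param);
    }

    // Private class method: never exported
    function transmogrify(str) {
        return str.toUpperCase();
    }

    Thingy.prototype.doSomething = doSomething;
    return Thingy;
})();

var t = new Thingy("Fred");
t.doSomething("hello");   // "Fred says HELLO"
typeof t.transmogrify;    // "undefined"
```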

But can this help us get private instance methods, too, without massive memory overhead?

Yup. Remember that JavaScript doesn't have methods, it has functions (very powerful ones). And how a function falls into our somewhat inappropriate class-based terminology ("function", "instance method", "class method") depends entirely on how it's called; there's nothing intrinsic to the function that makes it one of those things. So if we have truly private class methods, that means we also have truly private instance methods, it's just how we call them. In particular, to get an instance method, all we do is call it in a way that we set its context (its "this" value). JavaScript provides three ways to do that:
  1. By assigning it to a property on an object, and then calling it via method notation: obj.method()
  2. By using Function#call
  3. By using Function#apply
The only difference between the last two is how they accept arguments to pass to the function.
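A short sketch of the three options side by side; `describe` and `fred` are made-up names for illustration. The same function object is used each time, and only the way it is called decides what "this" is.

```javascript
function describe(greeting, punctuation) {
    return greeting + ", " + this.name + punctuation;
}

var fred = {name: "Fred"};

// 1. Method notation: assign to a property, then call through the object
fred.describe = describe;
var viaMethod = fred.describe("Hello", "!");

// 2. Function#call: context first, then the arguments one by one
var viaCall = describe.call(fred, "Hello", "!");

// 3. Function#apply: context first, then the arguments as an array
var viaApply = describe.apply(fred, ["Hello", "!"]);

// All three produce "Hello, Fred!"
```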

Based on the above, so far, I've come up with four ways to get truly private instance methods (at least two of which I know I've seen elsewhere [one after I came up with it, one before]; and I can't imagine the other two are really original, either, even if I haven't seen them). They have varying trade-offs between speed, convenience, and maintainability.

Calling All Instances


The first way is really quite obvious: Using Function#call or Function#apply. Here's an example:
var Thingy = Class.create((function(){

function initialize(name) {
this.name = name;
}

function doPublic() {
doPrivate.call(this);
}

function doPrivate() {
alert(this.name);
}

return {
initialize: initialize,
doPublic: doPublic
};
})());
var t = new Thingy("Fred");
t.doPublic(); // Alerts "Fred" via private instance method
Our doPublic function calls our doPrivate function via Function#call; our truly private instance method then shows the name of the Thingy. doPrivate's context ("this") is set to the instance, and so it has normal access to the instance's properties, including name. Voilà! (Sort of.)

Going Procedural


If, like me, you thought "doPrivate.call(this)" looked vaguely familiar, we may have something in common: A background in procedural languages. Using Function#call to call a function and set its context looks very much like just calling a function and passing it a parameter. So even though it's cheating a bit, let's just give a nod to procedural programming here:
var Thingy = Class.create((function(){

function initialize(name) {
this.name = name;
}

function doPublic() {
doPrivate(this);
}

function doPrivate(self) {
alert(self.name);
}

return {
initialize: initialize,
doPublic: doPublic
};
})());
var t = new Thingy("Fred");
t.doPublic(); // Alerts "Fred"
Here, we just have doPublic call doPrivate in a procedural fashion, passing in the "this" reference. Since there are no private properties in JavaScript, doPrivate has access to just as much information as it did when called in a more instance-like way.
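For comparison, here are the Function#call and procedural styles in one runnable sketch. It assumes a plain constructor instead of Class.create, and the method names (shoutViaCall, shoutViaParam, and the two private helpers) are invented for this example; the private functions return values rather than calling alert.

```javascript
var Thingy = (function() {
    function Thingy(name) {
        this.name = name;
    }

    // Public: reaches the private function with Function#call
    function shoutViaCall() {
        return shoutWithThis.call(this);
    }

    // Public: reaches the private function procedurally
    function shoutViaParam() {
        return shoutWithSelf(this);
    }

    // Private, written as an instance method (uses "this")
    function shoutWithThis() {
        return this.name.toUpperCase();
    }

    // Private, written procedurally (instance passed as an argument)
    function shoutWithSelf(self) {
        return self.name.toUpperCase();
    }

    Thingy.prototype.shoutViaCall = shoutViaCall;
    Thingy.prototype.shoutViaParam = shoutViaParam;
    return Thingy;
})();

var t = new Thingy("Fred");
t.shoutViaCall();   // "FRED"
t.shoutViaParam();  // "FRED"
```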

Okay, but really the holy grail has to be calling the method in the "normal" way: this.method()

Leveraging Prototypes


As you probably know, when an object is created via a constructor function, its prototype gets set to the prototype property of the constructor function. All of the members on that prototype are inherited by the object. They're not copied, they're inherited. That opens the door to a means of getting the holy grail syntax with truly private methods, but at a cost: What if we create a new object with "this" as its prototype? Then we can put the private methods on it, and no one else can see them. Well, this does work, but the costs are pretty high. Let's walk through it.

First, how do we create a new object and assign "this" as its prototype? Well that part's actually quite easy: We just have a constructor function lying around, set its prototype property to the instance we want to be the prototype, create a new instance with it, and then restore the constructor's prototype (just to be tidy, we shouldn't keep a reference to the other instance anywhere). (Some browsers let us directly manipulate the prototype of an actual object via a __proto__ property, but others don't and it hasn't been standardized, so we'll stick to the constructor way.) Here's what that looks like:
var Helper = (function() {

// A factory for setting up our private instances
function factory() {
}

// Our public makePrivate function
function makePrivate(instance, privateMethods) {
var rv, name;

// Set the prototype of our factory
factory.prototype = instance;

// Get our private instance
rv = new factory();

// Reset the factory prototype (just to be tidy)
factory.prototype = Object.prototype;

// Copy over the private methods
for (name in privateMethods) {
rv[name] = privateMethods[name];
}

// Done
return rv;
}

// Publish the "makePrivate" function
return {makePrivate: makePrivate};
})();
And we can use it like this:
var Thingy = Class.create((function() {
var privateMethods;

// The private methods for our class
privateMethods = {
doPrivate: doPrivate
};

function initialize(name) {
this.name = name;
}

function doPublic() {
var self;

// Get a version of "this" that has our private stuff
self = Helper.makePrivate(this, privateMethods);

// Use it
self.doPrivate();
}

function doPrivate() {
alert(this.name);
}
return {
initialize: initialize,
doPublic: doPublic
};

})());
var t = new Thingy("Fred");
t.doPublic(); // Alerts "Fred"
Leaving aside the setup overhead, we've reached the holy grail, right?

Hmmm, not quite. This is all very well if we only read properties, but if we write them, we run into a bit of a problem: Setting a property on our private instance will only set it there, not on its prototype, so when we discard our private instance (as we do at the end of doPublic), any changes are lost with it, e.g., if doPrivate changes anything:
    function doPrivate() {
alert(this.name);
this.name = this.name.toUpperCase();
}
...the changes are lost:
var t = new Thingy("Fred");
t.doPublic(); // Alerts "Fred"
alert(t.name); // Also alerts "Fred", not "FRED"
Not good. Now, your first thought (if you're like me) is that all we need is a method that sets the property on the underlying instance, and all private methods need to be sure to use it. But no, if you think about it a little longer, you realize that not only the private methods but also the public ones have to use it (since if we call them, this will refer to our private instance). Worse, not only your own methods, but all superclass methods also have to use it. Now, if you're an adherent to the philosophy that all property access should be through getters and setters anyway, that will work, and your setter would look like this:
function setName(name) {
var inst = this.hasOwnProperty('_original')
? this._original
: this;
inst.name = name;
}
...assuming you've modified makePrivate to store the original instance on the private instance as the property _original. If you're not an adherent of that getter/setter philosophy, though (and I've never seen any substantial JavaScript written by anyone who was), it's a problem.
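The modified makePrivate isn't shown above, so here is a minimal sketch of what it might look like, assuming the only change needed is one extra line that stores the instance as _original. The plain `original` object stands in for a real Thingy instance.

```javascript
var Helper = (function() {
    function factory() {
    }

    function makePrivate(instance, privateMethods) {
        var rv, name;

        factory.prototype = instance;
        rv = new factory();
        factory.prototype = Object.prototype;

        // Assumed modification: remember the underlying instance
        rv._original = instance;

        for (name in privateMethods) {
            rv[name] = privateMethods[name];
        }
        return rv;
    }

    return {makePrivate: makePrivate};
})();

// The setter from above; works on both real and private instances
function setName(name) {
    var inst = this.hasOwnProperty('_original')
        ? this._original
        : this;
    inst.name = name;
}

var original = {name: "Fred", setName: setName};
var priv = Helper.makePrivate(original, {});

priv.setName("FRED");   // writes through to the original
original.name;          // "FRED"
priv.name;              // "FRED" too, inherited from the original
```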

There's another way: Just as we have to do something when we "go private", we just add a step for when we're "done" with the private thing. The "done" function takes any properties explicitly assigned to the private instance and copies them to the original. Like so:
var Helper = (function() {

// A factory for setting up our private instances
function factory() {
}

// Our public enterPrivate function
function enterPrivate(instance, methods) {
var rv, name;

// Set the prototype of our factory
factory.prototype = instance;

// Get our private instance
rv = new factory();

// Restore the factory prototype (just to be tidy)
factory.prototype = Object.prototype;

// Remember the private methods and the original
// object, we'll want them for later
rv._privateMethods = methods;
rv._original = instance;

// Copy over the private methods
for (name in methods) {
rv[name] = methods[name];
}

// Done
return rv;
}

// Our public exitPrivate function
function exitPrivate(instance) {
var name, original;

original = instance._original;
for (name in instance) {
if (name != '_original' &&
instance.hasOwnProperty(name) &&
!(name in instance._privateMethods)
) {
original[name] = instance[name];
}
}
return original;
}

// Export our public functions
return {enterPrivate: enterPrivate, exitPrivate: exitPrivate};
})();
...and then we can use it like this:
var Thingy = Class.create((function() {
var privateMethods;

// The private methods for our class
privateMethods = {
doPrivate: doPrivate
};

function initialize(name) {
this.name = name;
}

function doPrivate() {
alert(this.name);
this.name = this.name.toUpperCase();
}

function doPublic(name) {
var self;

// Public->Private boundary, need to "enter" private mode
self = Helper.enterPrivate(this, privateMethods);

// ...do things...
self.doPrivate();

// Private->Public boundary, need to "exit" private mode
Helper.exitPrivate(self);
}

return {
initialize: initialize,
doPublic: doPublic
};

})());
var t = new Thingy("Fred");
t.doPublic(); // Alerts "Fred"
alert(t.name); // Alerts "FRED"
Now we're safe from losing our updated properties when we're done with the private instance (provided we don't forget to call exitPrivate!).

Naturally the enterPrivate and exitPrivate functions mean we incur a lot of runtime overhead; but at least once we're "in," we're able to do things with completely normal syntax and it Just Works. And using prototypes in this way is very cool. Unfortunately, though, all of that copying back and forth and object instantiation really does kill performance.

Enh, Forget Prototypes


If we're going to be doing a lot of copying of things when "entering" and "exiting" private code, we can try to improve performance by ditching object instantiation and fancy prototype stuff and just mucking about with the original instance directly:
var Helper = (function() {

// Our public enterPrivate function
function enterPrivate(instance, methods) {
var old, name;

old = {};
for (name in methods) {
if (instance.hasOwnProperty(name)) {
old[name] = instance[name];
}
instance[name] = methods[name];
}
if (instance.hasOwnProperty('_oldStuff') && !('_oldStuff' in old)) {
old._oldStuff = instance._oldStuff;
}
if (instance.hasOwnProperty('_methods') && !('_methods' in old)) {
old._methods = instance._methods;
}
instance._methods = methods;
instance._oldStuff = old;
return instance;
}

// Our public exitPrivate function
function exitPrivate(instance) {
var methods, old, name;

methods = instance._methods;
delete instance._methods;
old = instance._oldStuff;
delete instance._oldStuff;

for (name in methods) {
delete instance[name];
}
for (name in old) {
instance[name] = old[name];
}
return instance;
}

// Export our public functions
return {enterPrivate: enterPrivate, exitPrivate: exitPrivate};
})();
And then we use it like so:
var Thingy = Class.create((function() {
var privateMethods;

// Our private methods
privateMethods = {
doPrivate: doPrivate
};

function initialize(name) {
this.name = name;
}

function doPrivate() {
alert(this.name);
this.name = this.name.toUpperCase();
}

function doPublic(name) {

// Public->Private boundary, need to "enter" private mode
Helper.enterPrivate(this, privateMethods);

// Do some stuff
this.doPrivate();

// Private->Public boundary, need to "exit" private mode
Helper.exitPrivate(this);
}

return {
initialize: initialize,
doPublic: doPublic
};

})());
var t = new Thingy("Fred");
t.doPublic(); // Alerts "Fred"
alert(t.name); // Alerts "FRED"
Well, like the prototype solution, it works, but it has some pretty ugly performance.

I should also mention that both the prototype and the overlay solutions will wreak havoc if you have a subclass with a private method that has the same name as a public method on a base class. You Have Been Warned.
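To see why, here is a deliberately simplified sketch. It does the overlay by hand on a single instance rather than through enterPrivate and exitPrivate, and the names (Base, label, describe, privateLabel) are invented for illustration. While the private function is in place, an inherited public method that expects the public label gets the private one instead.

```javascript
// Base "class" with two public methods
function Base(name) {
    this.name = name;
}
Base.prototype.label = function() {
    return "Base label for " + this.name;
};
Base.prototype.describe = function() {
    // Expects the public Base#label
    return "[" + this.label() + "]";
};

// Hypothetical subclass private method that happens to share the name
function privateLabel() {
    return "PRIVATE label for " + this.name;
}

var b = new Base("Fred");
var normal = b.describe();        // "[Base label for Fred]"

// Overlay the private method, as enterPrivate would
b.label = privateLabel;
var duringPrivate = b.describe(); // "[PRIVATE label for Fred]"

// Remove it again, as exitPrivate would
delete b.label;
var afterExit = b.describe();     // "[Base label for Fred]"
```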

Performance Matters


Okay, so bringing it all together, we've looked at four mechanisms (one with two variations) that avoid the problem of creating a copy of every method for every instance. We've mentioned that a couple of them incur quite a bit of overhead. But...how much overhead, really? I mean, do they cut speed by 10%, 20%, 30%? Enquiring minds want to know!

Well, I put together a basic test for six options:
  1. Don't worry, just use public methods and tell people not to call them (e.g., by naming convention or some such).
  2. Use the Function#call variety.
  3. Use procedural programming.
  4. Use the prototype trick, with an "exit" that finds and copies back properties set since going private.
  5. Use the prototype trick, being careful to always use "set" methods.
  6. Use the overlay trick (updating the instance in place).
You can grab the source or run it for yourself here; here's the skinny from some popular browsers on Windows:

[Result charts, not reproduced here: Firefox; IE7 (I don't have IE8 on my little netbook); Chrome; Safari; Opera.]

As you can see, our "holy grail" solutions (the last three in each set) perform abysmally. It varies by browser quite a bit, but across the board they're really dramatically slow. Mind you, my implementations of them may well be sub-optimal (I dashed them off), but the implementation would have to improve hugely to make performance acceptable. Combine that with the inevitable bugs related to forgetting an "enter" or "exit" call, and I think you'd be hard-pressed to find a good use case for them.

The results for the first three are interesting. Chrome and IE7 are apparently faster at finding a function object by looking at the properties of the instance than they are by looking on the scope chain; Firefox, Safari, and Opera are faster when we ask for the function from the scope chain. The theme that definitely emerges, though, is that if performance matters to you, just use public methods people are supposed to leave alone, or make your truly-private functions procedural (although the cost of calling them via Function#call instead isn't very high, if you really prefer that).

As a side-note, although I didn't test it, I would expect Crockford's pattern (closures for each instance, each instance gets its own copy of all functions) to have call times in line with the procedural timings above, as that's effectively what it does, just at the instance rather than class level.
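If you'd like to try a comparison of your own, a timing loop can be sketched along these lines. This is not the test used for the results above; `timePerCall` and the two sample functions are hypothetical, and the wrapper function adds its own overhead to every call, so treat the numbers as rough.

```javascript
// Hypothetical helper: average seconds per call over `iterations` calls
function timePerCall(fn, iterations) {
    var start, elapsed, i;

    start = new Date().getTime();
    for (i = 0; i < iterations; ++i) {
        fn();
    }
    elapsed = new Date().getTime() - start;   // milliseconds
    return (elapsed / 1000) / iterations;     // seconds per call
}

var thing = {
    name: "Fred",
    viaPublic: function() { return this.name.length; }
};
function viaProcedural(self) { return self.name.length; }

var publicTime = timePerCall(function() { thing.viaPublic(); }, 100000);
var proceduralTime = timePerCall(function() { viaProcedural(thing); }, 100000);
```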

Does Performance Matter?


So the next question is, okay, so some of the mechanisms impact call time by huge amounts, more than an order of magnitude in some cases. But...does it matter? Computers are crazy fast these days, right?

Chrome, the fastest in this test (and many others), varied from performing a private call in 0.000003418396717245264495027600135 seconds (about 3.4 microseconds) at its fastest to doing it in 0.00013514974591847767326197426749 seconds (about 0.14 milliseconds) at its slowest. IE7, the slowest in this test (and many others), varied from 0.000092537755404204915605567071365 seconds (just under a tenth of a millisecond) per call down to 0.48923679060665362035225048924 seconds (just under half a second) per call. And these tests were done on a netbook-class machine (wonderful little unit, don't let the picture in the WP article put you off, it's a color-balance problem in the photo).

Everyone has to make their own decision about this. My stake in the ground is that it does matter, yes, particularly as long as the majority of web users are still using the slowest web browser out of the big players: IE. In our modern web applications, a click by a user might kick off 10-20 underlying JavaScript function calls (at least!). If we say 10 and half of those are "private," it makes the difference between IE responding in roughly half a millisecond and taking at least 2.4 seconds. That's a big drop in user-perceived performance. So for my money, within an application I probably won't worry and will just make things public; in library work (either internal libraries used by multiple apps, or publicly-released libraries) I'll probably go with truly private class methods and pass in instances when necessary (e.g., procedural use).

Whatever you end up doing, I hope this roundup of private method options is useful.

Happy coding,

—— T.J.

Wednesday, 9 September 2009

Simple, Efficient Supercalls in JavaScript

In this post, I'll be outlining a simple means of doing supercalls in JavaScript without causing unnecessary runtime overhead, running into maintenance issues, or falling afoul of the new "strict" mode of ECMAScript 5.

Update 2012/04/04: A few weeks back I had one of those "ah hah" moments that you sometimes have, which made me realize that while the below is indeed efficient and simple enough to use, there's another way that's just as efficient and lots simpler, both to use and in terms of how much support code is required. And so I created the Lineage toolkit, see this announcement post for more.

The Short Version


The mechanism in brief is:
  • Have a "make a class" function (most libraries do).
  • In that function, detect that a subclass is overriding a superclass function and, if so, put a reference to the superclass function on the subclass function instance as a property.
  • To call a superclass function from a subclass function, authors use Function#call or Function#apply on the $super property of the subclass function instance (being sure to get the function directly, not via this). This is simplest and fastest with named functions, but possible with anonymous ones too.
  • Allow for mixins, if you're into that sort of thing (and why not, they're cool).
That's it. Dead simple, and dramatically more efficient than wrapping subclass functions in closures, doing function decompilation (which has never been standardized), etc. It leads to more maintainable (and readable) code than having subclasses refer to their parents by name, and even encourages the use of named functions, which is a good idea for lots of other reasons.
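As a rough preview, the first three steps can be sketched in a few lines. This is a stripped-down illustration, not the full sample implementation: this `makeClass` is a hypothetical minimal helper that takes the base as an explicit first argument, skips initialize handling, and ignores mixins.

```javascript
// Hypothetical, stripped-down "make a class" helper
function makeClass(base, members) {
    var name, proto;

    function ctor() {
    }

    // Inherit from the base's prototype without calling the base constructor
    function protoCtor() {
    }
    if (base) {
        protoCtor.prototype = base.prototype;
        ctor.prototype = new protoCtor();
        ctor.prototype.constructor = ctor;
    }
    proto = ctor.prototype;

    for (name in members) {
        // Overriding a superclass function? Remember it on the new function.
        if (base && typeof members[name] == 'function' &&
            typeof base.prototype[name] == 'function') {
            members[name].$super = base.prototype[name];
        }
        proto[name] = members[name];
    }
    return ctor;
}

var Parent = makeClass(null, {
    nifty: function() {
        return "Nifty!";
    }
});

var Child = makeClass(Parent, (function() {
    function nifty() {
        // Get $super from the function itself, not via "this"
        return nifty.$super.call(this).toUpperCase();
    }
    return {nifty: nifty};
})());

new Child().nifty();   // "NIFTY!"
```

Going through the named function matters once the hierarchy gets deeper: this.nifty always resolves to the most-derived override, so looking up $super through it from a middle class would call the wrong function.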

Okay, on to the details, including a full sample implementation, and discussion around the various choices and issues.

What Led To This


Let me step back a moment and explain how this came about. I have a project requiring a fair bit of server-side JavaScript and also object hierarchy. Not wanting to use "raw" JavaScript hierarchy or roll my own, I naturally reached for my favorite JavaScript library, Prototype, since this is one of the many things it does. Now, Prototype is currently tied to web browsers (something the core team are fixing even as I write this), so I figured I'd just copy the relevant parts (the Class class) for now and switch to Prototype 2 when it comes out.

So I started doing that, but then remembered that I'm not a big fan of how Prototype does supercalls (you know, when a subclass overrides a superclass method but then wants to call the superclass's version). My initial issue was mostly with the API it provides (the superclass's function is a special parameter to the subclass function, which seems odd to me), but as I was copying Class over and reading through the code, I realized that what Prototype does under the covers to make supercalls work (which includes function decompilation, wrapping in closures, and creating a new function on every call to a supercall-enabled method) is an awful lot of runtime overhead.

So I thought: "Surely I could just tweak this a bit to..." Yes, that's right, I fell right into the anti-pattern of reinventing the wheel. But I'm glad I did, because it taught me a lot I didn't know before, and led me (with some help from my friends) to come up with this dead-simple mechanism for supercalls. (Caveat: I can't imagine this mechanism is unique, but it was new to me.)

Hierarchy In JavaScript


I won't go into a thorough discussion of hierarchy in JavaScript, but suffice to say that whether you use Prototype or something else, it probably provides something that lets you create "classes" by calling a special helper function, passing in an object that has properties referencing the instance methods you want in your class, and getting back a constructor you can call:
// Defining the class
var SpiffyClass = Helper.makeClass({
initialize: function() {
// ...initialize an instance...
},
nifty: function() {
// ...do something nifty...
}
});

// Creating an instance
var spiffy = new SpiffyClass();
This pattern — passing an object literal into a helper function — has been very successful in recent years, and for good reason: It's terse but expressive. (There's a problem with it we'll come back to later, but it gets the job done.)

Inside The Helper


Let's look into the helper to see what it's doing. Here's a first-cut on what Helper.makeClass might look like — the code comments basically tell the story (download it here if the below is awkward to read; I hate Blogger!):
// Take I, doesn't help with supercalls.
// Inspired by Prototype's Class class (http://prototypejs.org)
// Copyright (C) 2009-2010 by T.J. Crowder
// Licensed under the Creative Commons Attribution License 2.0 (UK)
// http://creativecommons.org/licenses/by/2.0/uk/
var Helper = (function(){

// This function is used to create the prototype object for our generated
// constructors if the class has a parent class. See makeConstructor for details.
function protoCtor() { }

// Build and return a constructor; we do this with a separate function
// to minimize what the new constructor (a closure) closes over.
function makeConstructor(base) {

// Here's our basic constructor function (each class gets its own, a
// new one of these is created every time makeConstructor is called).
function ctor() {
// Call the initialize method
this.initialize.apply(this, arguments);
}

// If there's a base class, hook it up. We go indirectly through `protoCtor`
// rather than simply doing "new base()" because calling `base` will call the base
// class's `initialize` function, which we don't want to execute. We just want the
// prototype.
if (base) {
protoCtor.prototype = base.prototype;
ctor.prototype = new protoCtor();
protoCtor.prototype = {}; // Don't leave a dangling reference
}

// Set the prototype's constructor property so `this.constructor` resolves
// correctly
ctor.prototype.constructor = ctor;

// Return the newly-constructed constructor
return ctor;
}

// This function is used when a class doesn't have its own initialize
// function; since it does nothing and can only appear on base classes,
// all instances can share it.
function defaultInitialize() {
}

// makeClass: Our public "make a class" function.
// Arguments:
// - base: An optional constructor for the base class.
// - ...: One or more specification objects containing properties to
// put on our class as members. If a property is defined by more
// than one specification object, the last in the list wins.
// Returns:
// A constructor function for instances of the class.
//
// Typical use will be just one specification object, but allow for more
// in case the author is drawing members from multiple locations.
function makeClass() {
var base, // Our base class (constructor function), if any
argsIndex, // Index of first unused argument in 'arguments'
ctor, // The constructor function we create and return
members, // Each members specification object
name; // Each name in 'members'

// We use this index to keep track of the arguments we've consumed
argsIndex = 0;

// Do we have a base?
if (typeof arguments[argsIndex] == 'function') {
// Yes
base = arguments[argsIndex++];
}

// Get our constructor; this will hook up the base class's prototype
// if there's a base class
ctor = makeConstructor(base);

// Assign the members from the specification object(s) to the prototype
// Again, typically there's only one spec object, but allow for more
while (argsIndex < arguments.length) {
// Get this specification object
members = arguments[argsIndex++];

// Copy its members
for (name in members) {
ctor.prototype[name] = members[name];
}
}

// If there's no initialize function, provide one
if (!('initialize' in ctor.prototype)) {
// Note that this can only happen in base classes; in a derived
// class, the check above will find the base class's version if the
// subclass didn't define one.
ctor.prototype.initialize = defaultInitialize;
}

// Return the constructor
return ctor;
}

// Return our public members
return {makeClass: makeClass};
})();
So far, nothing new; this is just the kind of thing you'll find in most libraries.

Supercalls


The helper as written is already really useful:
var Parent = Helper.makeClass({
nifty: function() {
return "Nifty!";
}
});
var Child = Helper.makeClass(Parent, {
spiffy: function() {
return "Spiffy!";
}
});
var c = new Child();
alert(c.nifty()); // Alerts "Nifty!" using Parent#nifty
alert(c.spiffy()); // Alerts "Spiffy!" using Child#spiffy
But what if we want to override #nifty in Child to do some processing on Parent's return value? Well, we can do that by going direct to the parent and using Function#call (or Function#apply), but the syntax is quite awkward:
var Child = Helper.makeClass(Parent, {
nifty: function() {
var rv = Parent.prototype.nifty.call(this);
return rv.toUpperCase();
},
spiffy: function() {
return "Spiffy!";
}
});
var c = new Child();
alert(c.nifty()); // Alerts "NIFTY!"
That works, but there are a few good reasons not to do it:
  • It's long-winded, inviting irritating mistakes.
  • It makes re-basing a class difficult (every method that does this has that superclass name in it).
  • It makes moving code between classes difficult (ditto).
In short, it's a maintenance problem. Manageable, but a problem. And this is why most libraries try to address it in various ways, often involving adding closures and decompilation and indirection and whatnot. On the "up" side, the above is quite direct and performs very quickly. But what if we could get that directness and speed without getting all of that other stuff, and with brevity and maintainability?

We can.

With a Little Help(er) From My Friends


Our makeClass function has all of the information it needs to help us out: It can easily tell when we're overriding a base class function, and it can give us a reference to that function without our having to know the base class name. How? By putting it on our function instance. Remember that functions are first-class objects, so we can put properties on them.

This turns out to be really easy. In makeClass, when copying the members from the specification object(s) to the new constructor's prototype, if a member we're copying is a function and we have a base and the base value is also a function, we stick it on the override function as $super. E.g., we replace these lines from earlier:
for (name in members) {
ctor.prototype[name] = members[name];
}
...with:
for (name in members) {
value = members[name];
if (base && typeof value == 'function') {
baseValue = base.prototype[name];
if (typeof baseValue == 'function') {
value.$super = baseValue;
}
}
ctor.prototype[name] = value;
}
(Also add the 'value' and 'baseValue' declarations at the top.)

So now we can get at our superclass's function by looking at the $super property on our own function. But how do we get that? In our Child#nifty function earlier, say, how do we get to Parent#nifty? If your first thought (like mine) is this.nifty.$super, I'm afraid you're not thinking it through — that will work only for one level of hierarchy, it will fail if anything (say, GrandChild) derives from Child and has its own #nifty function, because with a GrandChild instance, this.nifty is always GrandChild#nifty. If Child#nifty called this.nifty.$super in that situation, it would be calling itself, not Parent#nifty, recursing until the engine gave up on it. So this is out.
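To see that failure concretely, here's a minimal sketch that wires everything up by hand. The derive helper and the manual $super assignments are hypothetical stand-ins for what makeClass does, and the "!!" suffix is just for illustration:

```javascript
// Hypothetical stand-in for makeClass's prototype hookup
function derive(base) {
    function ctor() {}
    function protoCtor() {}
    protoCtor.prototype = base.prototype;
    ctor.prototype = new protoCtor();
    ctor.prototype.constructor = ctor;
    return ctor;
}

function Parent() {}
Parent.prototype.nifty = function() {
    return "Nifty!";
};

var Child = derive(Parent);
Child.prototype.nifty = function() {
    // The tempting-but-wrong lookup via 'this'
    return this.nifty.$super.call(this).toUpperCase();
};
Child.prototype.nifty.$super = Parent.prototype.nifty;

var GrandChild = derive(Child);
GrandChild.prototype.nifty = function() {
    return this.nifty.$super.call(this) + "!!";
};
GrandChild.prototype.nifty.$super = Child.prototype.nifty;

// One level deep, the lookup happens to work
var childResult = new Child().nifty(); // "NIFTY!"

// Two levels deep, Child#nifty finds GrandChild#nifty on 'this', whose
// $super is Child#nifty itself, so it recurses until the stack runs out
var grandChildFailed = false;
try {
    new GrandChild().nifty();
} catch (e) {
    grandChildFailed = true;
}
```

The Child instance behaves, which is exactly why this mistake is easy to miss until someone adds another level.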

And here we run into an issue. We want to get at our current function object, but so far we haven't been giving our functions names. They're just anonymous functions we've assigned to properties on objects. The properties have names, but the functions do not, and we've just concluded we can't use the property (because we can only get to it via this, and that means it'll always point to the bottommost — GrandChild — function). So now what?

Well, there is a way, but it's a bit ugly (on a couple of fronts): arguments.callee. The arguments object, I'm sure you know, is an array-like object defined within functions that contains the arguments passed into the function. That object also has a property on it called callee that is a reference to the actual function object. So we can get at our super function using arguments.callee.$super. And like all other functions, we can call it and pass in the value to use as this by using the Function#call function. So:
var Child = Helper.makeClass(Parent, {
nifty: function() {
var rv = arguments.callee.$super.call(this); // Blech
return rv.toUpperCase();
}
});
var c = new Child();
alert(c.nifty()); // Alerts "NIFTY!"
That works, and is more efficient than most other schemes, but it has some issues:
  • It's long-winded, inviting irritating mistakes (wait, where have I heard that before?)
  • The new ECMAScript standard, 5th edition, has a "strict" mode in which arguments.callee (amongst other things) is disallowed (many thanks to John-David Dalton and Juriy Zaytsev — kangax — for pointing this out to me)
  • It's slow (although faster than many other ways of doing this): most browsers don't set up arguments.callee unless you use it, and they're pretty slow at setting it up (thanks again, kangax, for that one)
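The strict-mode restriction is easy to check for yourself. This is just a small sketch (the function name whoAmI is made up for the example); it assumes an engine that implements ES5 strict mode:

```javascript
function whoAmI() {
    "use strict";
    return arguments.callee; // Not allowed in strict mode
}

var calleeError = null;
try {
    whoAmI();
} catch (e) {
    calleeError = e; // A TypeError in engines that implement strict mode
}
```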
Still, it is functional (outside of strict mode). But what if we could get there another way without those problems?

You guessed it, we can. And in fact, we get a lot of other benefits when we do.

Named Functions


The problem is anonymous functions, and it's not the only problem with them either. Anonymous functions are tools-hostile. Our browsers can't tell us the name of the function in which an error occurred if it's anonymous. Our debuggers can't show us meaningful call stacks with anonymous functions (we just see a lot of question marks, usually). Etc.

So if the problem is anonymous functions, well, let's just give our functions names! That way, we can use the name directly, since a function's name is defined throughout the scope in which the function is defined (including within the function itself); that's how we usually write recursive functions, after all.
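As a quick reminder of what that looks like, here's a throwaway sketch (countdown is just an illustrative name) of a function declaration referring to itself by name:

```javascript
// A named function can refer to itself by name, here to recurse
function countdown(n) {
    if (n <= 0) {
        return "liftoff";
    }
    return n + " " + countdown(n - 1);
}

var result = countdown(3); // "3 2 1 liftoff"
```

The same name lookup that makes the recursion work will also let a method reach properties stored on its own function object.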

So when defining classes, how can we give our functions names? One's first thought might be to simply do this:
var Child = Helper.makeClass(Parent, {
nifty: function nifty() { // <= PROBLEMATIC
// ...
}
});
In theory, that should work: per the specification, the name of a named function expression is visible only inside the function itself. In practice it has ramifications: some engines (notably JScript) also define the name "nifty" in the enclosing scope, which in our case above is global (not a good idea!), and it flat out doesn't work with Internet Explorer (or anywhere you're using JScript) or Safari. (For the full story, check out Juriy's excellent article on the subject; it's a good read.)

No, it would be better to contain the scope a bit (not to mention working around bugs), which is easy enough: Wrap it up in a function:
var Child = Helper.makeClass(Parent, (function(){
function nifty() {
// ...
}
function spiffy() {
// ...
}
return {
nifty: nifty,
spiffy: spiffy
};
})());
That probably wants breaking down a bit:
  • We define an anonymous function for scoping
  • Within that scoping function, we define our various methods for Child — as named functions
  • Our function names are in scope throughout the scoping function
  • Our scoping function returns an object with the properties mapped to our named functions (I added a "spiffy" to this example just to make clear how it works with multiple functions)
  • We execute the scoping function immediately and pass the result (the object) into makeClass
No scope bleed, no issues with IE or Safari, we're all set.

So okay, but so far I've been leaving out the good bit — the supercall in Child#nifty! Here we go:
var Child = Helper.makeClass(Parent, (function(){
function nifty() {
var rv = nifty.$super.call(this);
return rv.toUpperCase();
}
function spiffy() {
// ...
}
return {
nifty: nifty,
spiffy: spiffy
};
})());
var c = new Child();
alert(c.nifty()); // Alerts "NIFTY!"
Within #nifty, we refer to the function by name — no this, that's very important! — which always refers to the right one. We get its $super property, and use Function#call on it. (We could use Function#apply if we wanted.) We pass in this as the context parameter so that this is correct within the supercall.
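For methods that take arguments, Function#apply is the natural fit. Here's a rough sketch of that variation; the greet method is invented for the example, and $super is assigned by hand where makeClass would normally do it:

```javascript
function Parent() {}
Parent.prototype.greet = function(greeting, name) {
    return greeting + ", " + name + "!";
};

// Named override; $super is assigned by hand here, standing in for makeClass
function greet() {
    // Pass every argument we received straight through to the superclass
    var rv = greet.$super.apply(this, arguments);
    return rv.toUpperCase();
}
greet.$super = Parent.prototype.greet;

function Child() {}
Child.prototype = new Parent();
Child.prototype.constructor = Child;
Child.prototype.greet = greet;

var greeting = new Child().greet("Hello", "world"); // "HELLO, WORLD!"
```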

Revisiting makeClass


I floated this supercalls idea on the Prototype Core development mailing list, and Allen Madsen pointed out that if we're going to be using functions to create enclosing scope when defining classes, makeClass should just support our passing those functions in directly rather than making us execute them first. It simplifies the syntax markedly:
var Child = Helper.makeClass(Parent, function(){
function nifty() {
var rv = nifty.$super.call(this);
return rv.toUpperCase();
}
return {nifty: nifty};
});
var c = new Child();
alert(c.nifty()); // Alerts "NIFTY!"
We just define the function rather than defining and calling it; makeClass will call it for us. So let's go support that in makeClass (we'll still support providing the specification object directly) by marking constructor functions (so we know whether we have a base class, since everything might be a function now — this was also Allen's idea) and then just invoking any functions we find to get their specification objects. (Code below under "Bringing It All Together".)
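In isolation, the argument handling might look something like the following. This is only a sketch: splitArgs and FakeBase are hypothetical names for illustration, and the _isConstructor flag is assumed to be set by makeClass on every constructor it creates:

```javascript
// Hypothetical helper: sort makeClass-style arguments into an optional base
// constructor and a list of specification objects
function splitArgs(args) {
    var index = 0,
        base,
        specs = [],
        spec;

    // Only a function carrying the marker counts as a base class
    if (typeof args[index] == 'function' && args[index]._isConstructor) {
        base = args[index++];
    }

    // Any other function is a scoping function; call it to get its spec
    while (index < args.length) {
        spec = args[index++];
        if (typeof spec == 'function') {
            spec = spec();
        }
        specs.push(spec);
    }
    return {base: base, specs: specs};
}

function FakeBase() {}
FakeBase._isConstructor = true; // Stands in for a makeClass-built constructor

var withBase = splitArgs([FakeBase, function() { return {a: 1}; }, {b: 2}]);
var withoutBase = splitArgs([function() { return {c: 3}; }]);
```

Without the marker, there would be no way to tell a base class from a scoping function in the first argument position, since both are plain functions.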

Mixins


A fairly common pattern these days is to have "mixins" that can be mixed into classes. A "mixin" is a collection of functions that coordinate to provide some functionality, intended to be added to a class rather than to be a class in its own right. One mixin might be used by several different classes. Mixins are sometimes seen as a replacement for derivation, but to me they're an adjunct — there are times you derive, and other times you mix in.

Mixins are the reason that Helper.makeClass allows multiple specification objects. Suppose we have a mixin that provides semantics around getting and setting a "cool" property:
var CoolMixin = (function(){
function setCool(cool) {
if (typeof cool != 'string') {
throw "Hey, man, that ain't cool.";
}
this.cool = cool;
}

function getCool() {
return this.cool;
}

return {getCool: getCool, setCool: setCool};
})();
Now we can mix that into any class:
var CoolChild = Helper.makeClass(Parent, CoolMixin, function() {
function spiffy() {
// ...
}
return {spiffy: spiffy};
});
var cc = new CoolChild();
alert(cc.nifty()); // Alerts "Nifty!" via Parent#nifty
cc.setCool(1); // throws the uncool exception
CoolChild inherits from Parent, but also gets the cool stuff from CoolMixin. By convention (and in order to ensure that a subclass's own members take precedence), mixins are always included after the parent and before the child's own members.
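The ordering matters because of the "last in the list wins" rule. A stripped-down sketch of the copy loop shows why (copyMembers and the sample objects are hypothetical, standing in for what makeClass does with its specification objects):

```javascript
// Hypothetical stand-in for makeClass's copy loop: later specs overwrite
// earlier ones
function copyMembers(target) {
    var i, spec, name;
    for (i = 1; i < arguments.length; ++i) {
        spec = arguments[i];
        for (name in spec) {
            target[name] = spec[name];
        }
    }
    return target;
}

var mixin = {
    describe: function() { return "from mixin"; },
    extra: function() { return "extra"; }
};
var own = {
    describe: function() { return "from class"; }
};

// Mixin first, the class's own members last
var proto = copyMembers({}, mixin, own);
var described = proto.describe(); // "from class"
var extra = proto.extra();        // "extra"
```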

Mixins raise an issue for makeClass: Normally, it modifies function instances, setting a $super property on them if they override a base class function (it leaves them alone if they don't). But mixins are used in multiple classes, so we shouldn't touch them. While in practice it would be fairly harmless to set $super on the mixin function if the mixin function didn't call $super (which is definitely an edge case), it's still not right and in certain very edgy edge cases could cause class cross-talk, which is a Bad Thing(tm).
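Here's a contrived sketch of that cross-talk. All of the names are invented for the example, and the two $super assignments stand in for what an unguarded makeClass would do when the same mixin function lands in two different hierarchies:

```javascript
function baseADescribe() { return "A"; }
function baseBDescribe() { return "B"; }

// A single function object, shared by every class that mixes it in
function describe() {
    return describe.$super.call(this) + " (mixed in)";
}

// What an unguarded makeClass would do while building the first class...
describe.$super = baseADescribe;
var firstClassBefore = describe.call({}); // "A (mixed in)"

// ...and then while building the second class, clobbering the first link
describe.$super = baseBDescribe;
var secondClass = describe.call({});     // "B (mixed in)"
var firstClassAfter = describe.call({}); // also "B (mixed in)": cross-talk
```

There is only one $super slot on the shared function, so whichever class was built last silently wins for everyone.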

To handle this, we can provide a means of marking methods as being mixin methods and telling makeClass to leave them alone. (I actually mark the methods rather than doing it by context because some implementations — like Prototype — allow modifying hierarchies after initial derivation, in which case it's really important to know which functions are mixin functions and which aren't). It may seem to be pandering to an edge case, but there's effectively no runtime cost and the code is trivial, so let's do it — see Helper.makeMixin below.

Alternately, we can take a page from previous implementations and — in this one very specific case — wrap the mixin function so we can set the $super property on the wrapper. I haven't done that here, but it's a viable solution if you think mixins calling $super is a common enough use case for you.

Anonymouses Anonymous


Not everyone will want to use named functions; perhaps they have a large amount of legacy code using specification objects directly. Rather than leave them with the ugly syntax:
arguments.callee.$super.call(this, ...);
...let's give them a helper method to take on some of that complexity:
this.callSuper(arguments, ...);
But we don't want to do that all the time. Hey, that's the case for a mixin! See the CallSuperMixin below.

Bringing It All Together


Okay, taking all of the above, mixing in an IE workaround (thank you, Prototype!), and adding in Helper.makeMixin, we end up with this:
// Take IV: Explicitly handle mixins, provide a mixin for calling super when
// working with anonymous functions.
// Inspired by Prototype's Class class (http://prototypejs.org)
// Copyright (C) 2009-2010 by T.J. Crowder
// Licensed under the Creative Commons Attribution License 2.0 (UK)
// http://creativecommons.org/licenses/by/2.0/uk/
var Helper = (function(){
var toStringProblematic, // true if 'toString' may be missing from for..in
valueOfProblematic; // true if 'valueOf' may be missing from for..in

// IE doesn't enumerate toString or valueOf; detect that (once) and
// remember so makeClass can deal with it. We do this with an anonymous
// function we don't keep a reference to, to minimize what we keep
// around when we're done.
(function(){
var name;

toStringProblematic = valueOfProblematic = true;
for (name in {toString: true, valueOf: true}) {
if (name == 'toString') {
toStringProblematic = false;
}
if (name == 'valueOf') {
valueOfProblematic = false;
}
}
})();

// This function is used to create the prototype object for our generated
// constructors if the class has a parent class. See makeConstructor for details.
function protoCtor() { }

// Build and return a constructor; we do this with a separate function
// to minimize what the new constructor (a closure) closes over.
function makeConstructor(base) {

// Here's our basic constructor function (each class gets its own, a
// new one of these is created every time makeConstructor is called).
function ctor() {
// Call the initialize method
this.initialize.apply(this, arguments);
}

// If there's a base class, hook it up. We go indirectly through `protoCtor`
// rather than simply doing "new base()" because calling `base` will call the base
// class's `initialize` function, which we don't want to execute. We just want the
// prototype.
if (base) {
protoCtor.prototype = base.prototype;
ctor.prototype = new protoCtor();
protoCtor.prototype = {}; // Don't leave a dangling reference
}

// Set the prototype's constructor property so `this.constructor` resolves
// correctly
ctor.prototype.constructor = ctor;

// Flag up that this is a constructor (for mixin support)
ctor._isConstructor = true;

// Return the newly-constructed constructor
return ctor;
}

// This function is used when a class doesn't have its own initialize
// function; since it does nothing and can only appear on base classes,
// all instances can share it.
function defaultInitialize() {
}

// Get the names in a specification object, allowing for toString and
// valueOf issues
function getNames(members) {
var names, // The names of the properties in 'members'
name, // Each name
nameIndex; // Index into 'names'

names = [];
nameIndex = 0;
for (name in members) {
names[nameIndex++] = name;
}
if (toStringProblematic && typeof members.toString != 'undefined') {
names[nameIndex++] = 'toString';
}
if (valueOfProblematic && typeof members.valueOf != 'undefined') {
names[nameIndex++] = 'valueOf';
}
return names;
}

// makeClass: Our public "make a class" function.
// Arguments:
// - base: An optional constructor for the base class.
// - ...: One or more specification objects containing properties to
// put on our class as members; or functions that return
// specification objects. If a property is defined by more than one
// specification object, the last in the list wins.
// Returns:
// A constructor function for instances of the class.
//
// Typical use will be just one specification object, but allow for more
// in case the author is drawing members from multiple locations.
function makeClass() {
var base, // Our base class (constructor function), if any
argsIndex, // Index of first unused argument in 'arguments'
ctor, // The constructor function we create and return
members, // Each members specification object
names, // The names of the properties in 'members'
nameIndex, // Index into 'names'
name, // Each name in 'names'
value, // The value for each name
baseValue; // The base class's value for the name

// We use this index to keep track of the arguments we've consumed
argsIndex = 0;

// Do we have a base?
if (typeof arguments[argsIndex] == 'function' &&
arguments[argsIndex]._isConstructor) {
// Yes
base = arguments[argsIndex++];
}

// Get our constructor; this will hook up the base class's prototype
// if there's a base class, and mark the new constructor as a constructor
ctor = makeConstructor(base);

// Assign the members from the specification object(s) to the prototype
// Again, typically there's only one spec object, but allow for more
while (argsIndex < arguments.length) {
// Get this specification object
members = arguments[argsIndex++];
if (typeof members == 'function') {
members = members();
}

// Get all of its names
names = getNames(members);

// Copy the members
for (nameIndex = names.length - 1; nameIndex >= 0; --nameIndex) {
name = names[nameIndex];
value = members[name];
if (base && typeof value == 'function' && !value._isMixinFunction) {
baseValue = base.prototype[name];
if (typeof baseValue == 'function') {
value.$super = baseValue;
}
}
ctor.prototype[name] = value;
}
}

// If there's no initialize function, provide one
if (!('initialize' in ctor.prototype)) {
// Note that this can only happen in base classes; in a derived
// class, the check above will find the base class's version if the
// subclass didn't define one.
ctor.prototype.initialize = defaultInitialize;
}

// Return the constructor
return ctor;
}

// makeMixin: Our public "make a mixin" function.
// Arguments:
// - ...: One or more specification objects containing properties to
// put on our class as members; or functions that return
// specification objects. If a property is defined by more than one
// specification object, the last in the list wins.
// Returns:
// A specification object containing all of the members, flagged as
// mixin members.
function makeMixin() {
var rv, // Our return value
argsIndex, // Index of first unused argument in 'arguments'
members, // Each members specification object
names, // The names in each 'members'
nameIndex, // Index into 'names'
name, // Each name in 'names'
value; // Each value as we copy it

// Set up our return object
rv = {};

// Loop through the args (usually just one, but...)
argsIndex = 0;
while (argsIndex < arguments.length) {
// Get this members specification object
members = arguments[argsIndex++];
if (typeof members == 'function') {
members = members();
}

// Get its names
names = getNames(members);

// Copy its members, marking them as we go
for (nameIndex = names.length - 1; nameIndex >= 0; --nameIndex) {
name = names[nameIndex];
value = members[name];
if (typeof value == 'function') {
value._isMixinFunction = true;
}
rv[name] = value;
}
}

// Return the consolidated, marked specification object
return rv;
}

// Return our public members
return {
makeClass: makeClass,
makeMixin: makeMixin
};
})();
Which lets us do things like this:
var Parent = Helper.makeClass(function(){
function hierarchy() {
return "P";
}
return {hierarchy: hierarchy};
});
var Child = Helper.makeClass(Parent, function(){
function hierarchy() {
return hierarchy.$super.call(this) + " < C";
}
return {hierarchy: hierarchy};
});
var GrandChild = Helper.makeClass(Child, function(){
function hierarchy() {
return hierarchy.$super.call(this) + " < GC";
}
return {hierarchy: hierarchy};
});
var gc = new GrandChild();
alert(gc.hierarchy()); // Alerts "P < C < GC"
And I believe I promised a sample mixin to help those anonymouses:
// Define our CallSuper mixin
Helper.CallSuperMixin = Helper.makeMixin(function() {
function callSuper(ref) {
var f, // The function to call
args, // Arguments to pass it, if we have any
len, // Length of args to pass
srcIndex, // When copying, the index into 'arguments'
destIndex, // When copying args, the index into 'args'
rv; // Our return value

// Get the function to call: If they pass in a function, it's the
// subclass's version so look on $super; otherwise, they've passed
// in 'arguments' and it's on arguments.callee.$super.
f = typeof ref == 'function' ? ref.$super : ref.callee.$super;

// Only proceed if we have 'f'
if (f) {
// If there are no args to pass on, use Function#call
if (arguments.length == 1) {
rv = f.call(this);
} else {
// We have args to pass on, build them up.
// Note that doing this ourselves is more efficient on most
// implementations than applying Array.prototype.slice to
// 'arguments', even though it's built in; the call to it
// is expensive (dramatically, on some platforms).
len = arguments.length - 1;
args = new Array(len);
srcIndex = 1;
destIndex = 0;
while (destIndex < len) {
args[destIndex++] = arguments[srcIndex++];
}

// Use Function#apply
rv = f.apply(this, args);
}
}

// Done
return rv; // Will be undefined if there was no 'f' to call
}

return {callSuper: callSuper};
});


Conclusion


The goal of this post is not for people to run off and start using the Helper provided here. The goal is to demonstrate a really simple, highly efficient means of implementing supercalls in these sorts of libraries, and to discuss some edge cases around doing so. If you use Prototype, you might want to take a look at my patch for Prototype 1.6.1 (I have no idea whether that will make it into Prototype), which handles dynamic redefinition and such (although that patch doesn't yet handle mixins).

We've talked about a lot of things, so let's remind ourselves of what the mechanism is at its roots:
  • Have a "make a class" function.
  • In that function, detect that a subclass function (Child#nifty) is overriding a superclass function (Parent#nifty) and, if so, put a reference to the superclass function on the subclass function instance as a property.
  • To call a superclass function from a subclass function, authors use Function#call or Function#apply on the $super property of the subclass function instance (being sure to get the function directly, not via this). This is simplest and fastest with named functions, but possible with anonymous ones too.
  • Allow for mixins, if you're into that sort of thing (and why not, they're cool)
    Oh, and of course:
  • Profit!
Many thanks again to JDD, kangax, Allen, and others who've helped me hammer this out!

Happy Coding,

-- T.J.