Sunday, 13 September 2009

Private Methods in JavaScript

The question of truly private methods comes up intermittently in JavaScript mailing lists. Most respondents point to Crockford's solution, but then also point out how much memory it takes (each instance gets its own copy of methods; but on the up side, it also gives you truly private member variables as well as methods).
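For reference, here is a rough sketch of the closure-per-instance approach usually attributed to Crockford. The Counter constructor and its members are made-up names for illustration, not code from his article; the point is only that every instance carries its own copies of the functions.

```javascript
// Sketch of the closure-per-instance approach: private state and private
// functions live in the constructor's scope, and each instance gets its own
// copies of the functions that need to reach them.
function Counter(start) {
    var count = start;              // truly private variable

    function bump() {               // truly private function
        count += 1;
    }

    // "Privileged" method: public, but created anew for every instance
    this.increment = function increment() {
        bump();
        return count;
    };
}

var a = new Counter(0);
var b = new Counter(10);

// Each instance has its own increment function object, which is where
// the memory cost comes from.
var sharesFunction = (a.increment === b.increment);   // false
```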

Recently I've been switching to the module pattern for each class, thanks in part to kicking around the Prototype source code a fair bit (and also having read Juriy Zaytsev's article on named function expressions). Since private scope is an intrinsic part of the module pattern, I wondered whether it could help us define truly private methods. And the answer is: Of course it can.

The Pattern


For those not familiar with it, the module pattern is basically:
  • Use an enclosing function to provide scope
  • Use named functions within your function
  • Export the ones you want to have as public methods by returning them from the function (usually as properties on an object)
E.g., something like this:
var CoolModule = (function() {
    function foo() { /* ... */ }
    function bar() { /* ... */ }
    function cool() { /* ... */ }

    return {foo: foo, bar: bar};
})();
The anonymous function is the scoping function, which we define and then run immediately, assigning its return value to the CoolModule variable. We end up with a CoolModule object with the properties foo and bar that refer to named functions with the same names. Note that in this example, I did not include cool in the returned object's properties. So what's cool for?

The cool function is accessible anywhere within the scoping function, but nowhere outside of it. So foo can use it, as can bar, but nothing else can. It's a truly private function for the CoolModule module. That's already really useful, but we can also apply it to classes:
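Before we do, here is a filled-in version of the same idea that can actually be run. Greeter, hello, goodbye, and decorate are hypothetical names chosen just for this sketch.

```javascript
var Greeter = (function() {
    // Private: not exported, so only code inside this scoping function sees it
    function decorate(text) {
        return "** " + text + " **";
    }

    function hello(name) {
        return decorate("Hello, " + name);
    }

    function goodbye(name) {
        return decorate("Goodbye, " + name);
    }

    // Only hello and goodbye are published
    return {hello: hello, goodbye: goodbye};
})();

var greeting = Greeter.hello("Fred");            // uses decorate internally
var exportedType = typeof Greeter.decorate;      // "undefined": never exported
```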

Defining Classes


Okay, so let's look at building a class that way. Most toolkits provide some means of creating a "class" based on an object hash of methods. For instance, using Prototype's Class.create, we can define a class like so:
var Thingy = Class.create({

    initialize: function(name) {
        this.name = name;
    },

    doSomething: function() {
        // ...
    }

});
That passes an object literal with named properties for each of our "methods", which Class.create uses to produce a "class" for us (well, more technically a constructor function with the given methods as members on the prototype; JavaScript doesn't have classes, of course, it's a prototypal language, but that's a different subject).

Defining Thingy that way has a problem: All of its functions are anonymous. They're assigned to properties, and those properties have names, but our functions do not. This is a problem any time any tool wants to help us out, for instance if the browser wants to tell us what function an error occurred in, or if a debugger wants to show us the call stack. So we're better off using the module pattern here, like so:
var Thingy = Class.create((function(){

    function initialize(name) {
        this.name = name;
    }

    function doSomething() {
        // ...
    }

    return {initialize: initialize, doSomething: doSomething};
})());
Now our functions have proper names. (They can have different names from the properties that reference them, which is sometimes handy, but let's keep them the same for now.)

Doing that gives us a truly private scope (our anonymous function), which surely helps with giving us truly private methods, right?

Right.

Class Methods


You remember that cool function from earlier, the one we defined but didn't export. The easiest and most obvious benefit we get from our scoping function is that we immediately have class methods with no need for any other work: We just define a function and don't export it in the return value. It's available to all of our other functions, but it's completely closed off from the outside. It's not tied to any instance, but class methods are quite useful things.
var Thingy = Class.create((function(){

    function initialize(name) {
        this.name = name;
    }

    function doSomething(param) {
        var transmogrified = transmogrify(param);
        // ...
    }

    function transmogrify(str) {
        return str.toUpperCase();
    }

    return {initialize: initialize, doSomething: doSomething};
})());
Our public method doSomething uses our private class method transmogrify to do something to its parameter, presumably something other methods will also need to do but which shouldn't be a public feature of the class. Very handy!

But can this help us get private instance methods, too, without massive memory overhead?

Yup. Remember that JavaScript doesn't have methods, it has functions (very powerful ones). And how a function falls into our somewhat inappropriate class-based terminology ("function", "instance method", "class method") depends entirely on how it's called; there's nothing intrinsic to the function that makes it one of those things. So if we have truly private class methods, that means we also have truly private instance methods, it's just how we call them. In particular, to get an instance method, all we do is call it in a way that we set its context (its "this" value). JavaScript provides three ways to do that:
  1. By assigning it to a property on an object, and then calling it via method notation: obj.method()
  2. By using Function#call
  3. By using Function#apply
The only difference between the last two is how they accept arguments to pass to the function.
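A quick sketch of all three, using a made-up describe function and a plain object (neither is part of the Thingy examples):

```javascript
function describe(greeting, punctuation) {
    return greeting + ", " + this.name + punctuation;
}

var fred = {name: "Fred"};

// 1. Method notation: assign to a property, then call through the object
fred.describe = describe;
var viaMethod = fred.describe("Hello", "!");

// 2. Function#call: context first, then the arguments one by one
var viaCall = describe.call(fred, "Hello", "!");

// 3. Function#apply: context first, then the arguments as an array
var viaApply = describe.apply(fred, ["Hello", "!"]);

// All three produce "Hello, Fred!"
```

The function itself is identical in each case; only the call site decides what "this" is.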

Based on the above, so far, I've come up with four ways to get truly private instance methods (at least two of which I know I've seen elsewhere [one after I came up with it, one before]; and I can't imagine the other two are really original, either, even if I haven't seen them). They have varying trade-offs between speed, convenience, and maintainability.

Calling All Instances


The first way is really quite obvious: Using Function#call or Function#apply. Here's an example:
var Thingy = Class.create((function(){

    function initialize(name) {
        this.name = name;
    }

    function doPublic() {
        doPrivate.call(this);
    }

    function doPrivate() {
        alert(this.name);
    }

    return {
        initialize: initialize,
        doPublic: doPublic
    };
})());

var t = new Thingy("Fred");
t.doPublic(); // Alerts "Fred" via private instance method
Our doPublic function calls our doPrivate function via Function#call; our truly private instance method then shows the name of the Thingy. doPrivate's context ("this") is set to the instance, and so it has normal access to the instance's properties, including name. Voilà! (Sort of.)

Going Procedural


If, like me, you thought "doPrivate.call(this)" looked vaguely familiar, we may have something in common: A background in procedural languages. Using Function#call to call a function and set its context looks very much like just calling a function and passing it a parameter. So even though it's cheating a bit, let's just give a nod to procedural programming here:
var Thingy = Class.create((function(){

    function initialize(name) {
        this.name = name;
    }

    function doPublic() {
        doPrivate(this);
    }

    function doPrivate(self) {
        alert(self.name);
    }

    return {
        initialize: initialize,
        doPublic: doPublic
    };
})());

var t = new Thingy("Fred");
t.doPublic(); // Alerts "Fred"
Here, we just have doPublic call doPrivate in a procedural fashion, passing in the "this" reference. Since there are no private properties in JavaScript, doPrivate has access to just as much information as it did when called in a more instance-like way.

Okay, but really the holy grail has to be calling the method in the "normal" way: this.method()

Leveraging Prototypes


As you probably know, when an object is created via a constructor function, its prototype gets set to the prototype property of the constructor function. All of the members on that prototype are inherited by the object. They're not copied, they're inherited. That opens the door to a means of getting the holy grail syntax with truly private methods, but at a cost: What if we create a new object with "this" as its prototype? Then we can put the private methods on it, and no one else can see them. Well, this does work, but the costs are pretty high. Let's walk through it.

First, how do we create a new object and assign "this" as its prototype? Well that part's actually quite easy: We just have a constructor function lying around, set its prototype property to the instance we want to be the prototype, create a new instance with it, and then restore the constructor's prototype (just to be tidy, we shouldn't keep a reference to the other instance anywhere). (Some browsers let us directly manipulate the prototype of an actual object via a __proto__ property, but others don't and it hasn't been standardized, so we'll stick to the constructor way.) Here's what that looks like:
var Helper = (function() {

    // A factory for setting up our private instances
    function factory() {
    }

    // Our public makePrivate function
    function makePrivate(instance, privateMethods) {
        var rv, name;

        // Set the prototype of our factory
        factory.prototype = instance;

        // Get our private instance
        rv = new factory();

        // Reset the factory prototype (just to be tidy)
        factory.prototype = Object.prototype;

        // Copy over the private methods
        for (name in privateMethods) {
            rv[name] = privateMethods[name];
        }

        // Done
        return rv;
    }

    // Publish the "makePrivate" function
    return {makePrivate: makePrivate};
})();
And we can use it like this:
var Thingy = Class.create((function() {
    var privateMethods;

    // The private methods for our class
    privateMethods = {
        doPrivate: doPrivate
    };

    function initialize(name) {
        this.name = name;
    }

    function doPublic() {
        var self;

        // Get a version of "this" that has our private stuff
        self = Helper.makePrivate(this, privateMethods);

        // Use it
        self.doPrivate();
    }

    function doPrivate() {
        alert(this.name);
    }

    return {
        initialize: initialize,
        doPublic: doPublic
    };

})());

var t = new Thingy("Fred");
t.doPublic(); // Alerts "Fred"
Leaving aside the setup overhead, we've reached the holy grail, right?

Hmmm, not quite. This is all very well if we only read properties, but if we write them, we run into a bit of a problem: Setting a property on our private instance will only set it there, not on its prototype, so when we discard our private instance (as we do at the end of doPublic), any changes are lost with it. E.g., if doPrivate changes anything:
function doPrivate() {
    alert(this.name);
    this.name = this.name.toUpperCase();
}
...the changes are lost:
var t = new Thingy("Fred");
t.doPublic(); // Alerts "Fred"
alert(t.name); // Also alerts "Fred", not "FRED"
Not good. Now, your first thought (if you're like me) is that all we need is a method that sets the property on the underlying instance, and all private methods need to be sure to use it. But no, if you think about it a little longer, you realize that not only the private methods but also the public ones have to use it (since if we call them, this will refer to our private instance). Worse, not only your own methods, but all superclass methods also have to use it. Now, if you're an adherent to the philosophy that all property access should be through getters and setters anyway, that will work, and your setter would look like this:
function setName(name) {
    var inst = this.hasOwnProperty('_original')
        ? this._original
        : this;
    inst.name = name;
}
...assuming you've modified makePrivate to store the original instance on the private instance as the property _original. If you're not an adherent of that getter/setter philosophy, though (and I've never seen any substantial JavaScript written by anyone who was), it's a problem.
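To make that concrete, here is one way the modified makePrivate and the setter might fit together. This is only a sketch: a plain constructor stands in for Class.create so it runs without Prototype, and storing the instance under _original is the one change to the helper.

```javascript
var Helper = (function() {
    function factory() {
    }

    function makePrivate(instance, privateMethods) {
        var rv, name;

        factory.prototype = instance;
        rv = new factory();
        factory.prototype = Object.prototype;

        // The modification: remember the underlying instance
        rv._original = instance;

        for (name in privateMethods) {
            rv[name] = privateMethods[name];
        }
        return rv;
    }

    return {makePrivate: makePrivate};
})();

// Plain constructor standing in for Class.create (an assumption of this sketch)
function Thingy(name) {
    this.name = name;
}

// The setter writes through to the original when called on a private instance
Thingy.prototype.setName = function setName(name) {
    var inst = this.hasOwnProperty('_original') ? this._original : this;
    inst.name = name;
};

Thingy.prototype.doPublic = (function() {
    var privateMethods = {doPrivate: doPrivate};

    function doPrivate() {
        // Goes through the setter instead of assigning this.name directly
        this.setName(this.name.toUpperCase());
    }

    return function doPublic() {
        var self = Helper.makePrivate(this, privateMethods);
        self.doPrivate();
    };
})();

var t = new Thingy("Fred");
t.doPublic();
// t.name is now "FRED", because setName wrote to the underlying instance
```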

There's another way: Just as we have to do something when we "go private", we just add a step for when we're "done" with the private thing. The "done" function takes any properties explicitly assigned to the private instance and copies them to the original. Like so:
var Helper = (function() {

    // A factory for setting up our private instances
    function factory() {
    }

    // Our public enterPrivate function
    function enterPrivate(instance, methods) {
        var rv, name;

        // Set the prototype of our factory
        factory.prototype = instance;

        // Get our private instance
        rv = new factory();

        // Restore the factory prototype (just to be tidy)
        factory.prototype = Object.prototype;

        // Remember the private methods and the original
        // object, we'll want them for later
        rv._privateMethods = methods;
        rv._original = instance;

        // Copy over the private methods
        for (name in methods) {
            rv[name] = methods[name];
        }

        // Done
        return rv;
    }

    // Our public exitPrivate function
    function exitPrivate(instance) {
        var name, original;

        original = instance._original;
        for (name in instance) {
            // Skip our bookkeeping properties and the private methods
            // themselves; copying _privateMethods back would expose
            // the private functions on the original
            if (name != '_original' &&
                name != '_privateMethods' &&
                instance.hasOwnProperty(name) &&
                !(name in instance._privateMethods)
            ) {
                original[name] = instance[name];
            }
        }
        return original;
    }

    // Export our public functions
    return {enterPrivate: enterPrivate, exitPrivate: exitPrivate};
})();
...and then we can use it like this:
var Thingy = Class.create((function() {
    var privateMethods;

    // The private methods for our class
    privateMethods = {
        doPrivate: doPrivate
    };

    function initialize(name) {
        this.name = name;
    }

    function doPrivate() {
        alert(this.name);
        this.name = this.name.toUpperCase();
    }

    function doPublic(name) {
        var self;

        // Public->Private boundary, need to "enter" private mode
        self = Helper.enterPrivate(this, privateMethods);

        // ...do things...
        self.doPrivate();

        // Private->Public boundary, need to "exit" private mode
        Helper.exitPrivate(self);
    }

    return {
        initialize: initialize,
        doPublic: doPublic
    };

})());

var t = new Thingy("Fred");
t.doPublic(); // Alerts "Fred"
alert(t.name); // Alerts "FRED"
Now we're safe from losing our updated properties when we're done with the private instance (provided we don't forget to call exitPrivate!).
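One way to make the "exit" harder to forget is to wrap the enter/exit pair in a function that uses try/finally. The withPrivate wrapper below is a hypothetical addition, not part of the helper above, and the tiny Helper in the sketch is a simplified stand-in so the example runs on its own.

```javascript
// Simplified stand-in for a helper with enterPrivate/exitPrivate
// (it copies the methods onto the instance and removes them again)
var Helper = {
    enterPrivate: function enterPrivate(instance, methods) {
        for (var name in methods) {
            instance[name] = methods[name];
        }
        instance._methods = methods;
        return instance;
    },
    exitPrivate: function exitPrivate(instance) {
        for (var name in instance._methods) {
            delete instance[name];
        }
        delete instance._methods;
        return instance;
    }
};

// Hypothetical wrapper: runs body in private mode and always exits,
// even if body throws
function withPrivate(helper, instance, privateMethods, body) {
    var self = helper.enterPrivate(instance, privateMethods);
    try {
        return body.call(self);
    } finally {
        helper.exitPrivate(self);
    }
}

var thing = {name: "Fred"};
var privateMethods = {
    shout: function shout() { this.name = this.name.toUpperCase(); },
    fail: function fail() { throw new Error("oops"); }
};

withPrivate(Helper, thing, privateMethods, function() {
    this.shout();
});

var caught = false;
try {
    withPrivate(Helper, thing, privateMethods, function() {
        this.fail();
    });
} catch (e) {
    caught = true;   // the error still propagates, but exit has already run
}
```

It does nothing for the runtime cost, of course; it only removes one class of bug.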

Naturally the enterPrivate and exitPrivate functions mean we incur a lot of runtime overhead; but at least once we're "in," we're able to do things with completely normal syntax and it Just Works. And using prototypes in this way is very cool. Unfortunately, though, all of that copying back and forth and object instantiation really does kill performance.

Enh, Forget Prototypes


If we're going to be doing a lot of copying of things when "entering" and "exiting" private code, we can try to improve performance by ditching object instantiation and fancy prototype stuff and just mucking about with the original instance directly:
var Helper = (function() {

    // Our public enterPrivate function
    function enterPrivate(instance, methods) {
        var old, name;

        old = {};
        for (name in methods) {
            if (instance.hasOwnProperty(name)) {
                old[name] = instance[name];
            }
            instance[name] = methods[name];
        }
        if (instance.hasOwnProperty('_oldStuff') && !('_oldStuff' in old)) {
            old._oldStuff = instance._oldStuff;
        }
        if (instance.hasOwnProperty('_methods') && !('_methods' in old)) {
            old._methods = instance._methods;
        }
        instance._methods = methods;
        instance._oldStuff = old;
        return instance;
    }

    // Our public exitPrivate function
    function exitPrivate(instance) {
        var methods, old, name;

        methods = instance._methods;
        delete instance._methods;
        old = instance._oldStuff;
        delete instance._oldStuff;

        for (name in methods) {
            delete instance[name];
        }
        for (name in old) {
            instance[name] = old[name];
        }
        return instance;
    }

    // Export our public functions
    return {enterPrivate: enterPrivate, exitPrivate: exitPrivate};
})();
And then we use it like so:
var Thingy = Class.create((function() {
    var privateMethods;

    // Our private methods
    privateMethods = {
        doPrivate: doPrivate
    };

    function initialize(name) {
        this.name = name;
    }

    function doPrivate() {
        alert(this.name);
        this.name = this.name.toUpperCase();
    }

    function doPublic(name) {

        // Public->Private boundary, need to "enter" private mode
        Helper.enterPrivate(this, privateMethods);

        // Do some stuff
        this.doPrivate();

        // Private->Public boundary, need to "exit" private mode
        Helper.exitPrivate(this);
    }

    return {
        initialize: initialize,
        doPublic: doPublic
    };

})());

var t = new Thingy("Fred");
t.doPublic(); // Alerts "Fred"
alert(t.name); // Alerts "FRED"
Well, like the prototype solution, it works, but it has some pretty ugly performance.

I should also mention that both the prototype and the overlay solutions will wreak havoc if you have a subclass with a private method that has the same name as a public method on a base class. You Have Been Warned.
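Here is a small sketch of that collision, using the overlay style. Base, Derived, and the stripped-down enterPrivate/exitPrivate below are stand-ins invented for the example (the real helpers do more bookkeeping), but the failure mode is the same.

```javascript
// Minimal stand-ins for the overlay-style helpers
function enterPrivate(instance, methods) {
    for (var name in methods) {
        instance[name] = methods[name];
    }
}
function exitPrivate(instance, methods) {
    for (var name in methods) {
        delete instance[name];
    }
}

// Base class with a public describe, used by its own public report
function Base(name) {
    this.name = name;
}
Base.prototype.describe = function describe() {
    return "Base " + this.name;
};
Base.prototype.report = function report() {
    return "[" + this.describe() + "]";
};

// Subclass with a PRIVATE method that happens to be called describe too
function Derived(name) {
    Base.call(this, name);
}
Derived.prototype = new Base();
Derived.prototype.constructor = Derived;

var derivedPrivate = {
    describe: function describePrivately() {
        return "secret details of " + this.name;
    }
};

Derived.prototype.run = function run() {
    var result;
    enterPrivate(this, derivedPrivate);
    result = this.report();   // base-class code now finds the private describe
    exitPrivate(this, derivedPrivate);
    return result;
};

var d = new Derived("Fred");
var outside = d.report();   // "[Base Fred]"
var inside = d.run();       // "[secret details of Fred]"
```

The base class's report has no idea it is running in "private mode", so it silently picks up the subclass's private describe instead of its own public one.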

Performance Matters


Okay, so bringing it all together, we've looked at four mechanisms (one with two variations) that avoid the problem of creating a copy of every method for every instance. We've mentioned that a couple of them incur quite a bit of overhead. But...how much overhead, really? I mean, do they cut speed by 10%, 20%, 30%? Enquiring minds want to know!

Well, I put together a basic test for six options:
  1. Don't worry, just use public methods and tell people not to call them (e.g., by naming convention or some such).
  2. Use the Function#call variety.
  3. Use procedural programming.
  4. Use the prototype trick, with an "exit" that finds and copies back properties set since going private.
  5. Use the prototype trick, being careful to always use "set" methods.
  6. Use the overlay trick (updating the instance in place).
You can grab the source or run it for yourself here; here's the skinny from some popular browsers on Windows:

Firefox: [chart]

IE7 (I don't have IE8 on my little netbook): [chart]

Chrome: [chart]

Safari: [chart]

Opera: [chart]

As you can see, our "holy grail" solutions (the last three in each set) perform abysmally. It varies by browser quite a bit, but across the board they're really dramatically slow. Mind you, my implementations of them may well be sub-optimal (I dashed them off), but the implementation would have to improve hugely to make performance acceptable. Combine that with the inevitable bugs related to forgetting an "enter" or "exit" call, and I think you'd be hard-pressed to find a good use case for them.

The results for the first three are interesting. Chrome and IE7 are apparently faster at finding a function object by looking at the properties of the instance than they are by looking on the scope chain; Firefox, Safari, and Opera are faster when we ask for the function from the scope chain. The theme that definitely emerges, though, is that if performance matters to you, just use public methods people are supposed to leave alone, or make your truly-private functions procedural (although the cost of calling them via Function#call instead isn't very high, if you really prefer that).

As a side-note, although I didn't test it, I would expect Crockford's pattern (closures for each instance, each instance gets its own copy of all functions) to have call times in line with the procedural timings above, as that's effectively what it does, just at the instance rather than class level.
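If you'd like to repeat a rough comparison of the first three options yourself, a minimal timing harness might look like the sketch below. It is not the test page used for the numbers above; the Thingy here, the method names, and the timeIt function are all made up for the sketch, and the absolute timings will depend entirely on your engine and machine.

```javascript
function Thingy(name) {
    this.name = name;
    this.count = 0;
}

// Option 1: "private" by naming convention only
Thingy.prototype._bump = function _bump() {
    this.count += 1;
};
Thingy.prototype.viaPublic = function viaPublic() {
    this._bump();
};

(function() {
    // Truly private functions, visible only inside this scoping function
    function bumpWithThis() {
        this.count += 1;
    }
    function bumpProcedural(self) {
        self.count += 1;
    }

    // Option 2: Function#call
    Thingy.prototype.viaCall = function viaCall() {
        bumpWithThis.call(this);
    };

    // Option 3: procedural
    Thingy.prototype.viaProcedural = function viaProcedural() {
        bumpProcedural(this);
    };
})();

// Calls the named method repeatedly and reports elapsed wall-clock time
function timeIt(methodName, iterations) {
    var t = new Thingy("Fred"), start, i;

    start = new Date().getTime();
    for (i = 0; i < iterations; i++) {
        t[methodName]();
    }
    return {elapsedMs: new Date().getTime() - start, count: t.count};
}

var results = {
    viaPublic: timeIt("viaPublic", 100000),
    viaCall: timeIt("viaCall", 100000),
    viaProcedural: timeIt("viaProcedural", 100000)
};
```

Dividing elapsedMs by the iteration count gives a per-call figure, though a loop this simple is easily distorted by engine optimizations, so treat the results as a rough guide at best.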

Does Performance Matter?


So the next question is, okay, so some of the mechanisms impact call time by huge amounts, more than an order of magnitude in some cases. But...does it matter? Computers are crazy fast these days, right?

Chrome, the fastest in this test (and many others), varied from performing a private call in 0.000003418396717245264495027600135 seconds (about three thousandths of a millisecond) at its fastest to doing it in 0.00013514974591847767326197426749 seconds (about 0.14 milliseconds) at its slowest. IE7, the slowest in this test (and many others), varied from 0.000092537755404204915605567071365 seconds (about 9/100ths of a millisecond) per call down to 0.48923679060665362035225048924 seconds (just under half a second) per call. And these tests were done on a netbook-class machine (wonderful little unit, don't let the picture in the WP article put you off, it's a color-balance problem in the photo).

Everyone has to make their own decision about this. My stake in the ground is that it does matter, yes, particularly as long as the majority of web users are still using the slowest web browser out of the big players: IE. In our modern web applications, a click by a user might kick off 10-20 underlying JavaScript function calls (at least!). If we say 10 and half of those are "private," it makes the difference between IE responding in about half a millisecond (5 x 0.0000925 seconds) and taking at least 2.4 seconds (5 x 0.489 seconds). That's a big drop in user-perceived performance. So for my money, within an application I probably won't worry and will just make things public; in library work (either internal libraries used by multiple apps, or publicly-released libraries) I'll probably go with truly private class methods and pass in instances when necessary (e.g., procedural use).

Whatever you end up doing, I hope this roundup of private method options is useful.

Happy coding,

—— T.J.
