Saturday, 15 March 2008

Closures by example

In an earlier post, I promised a follow-up with a few examples of closures. Before we get to the examples, a quick note: To avoid cluttering things up with browser-specific stuff, I'm using the Event.observe() method from Prototype. Event.observe() hooks up an event handler, allowing for differences between browser implementations (see the Prototype documentation for more detail).

I have to say that I found this post to be a real challenge, and it took me a while to figure out why: I kept trying to boil things down to their essence, and the fact is that if you do that, you end up with a single example, one that demonstrates that a closure function has access to intrinsic data. And let's face it, that's not very interesting. :-)

So rather than boil things down, I'll give a few examples of where closures frequently get used, even though they all pretty much demonstrate the same fundamental point (about the functions having intrinsic data).

Okay, to the examples:

  1. A Simple Bound Event Handler
  2. Enduring References
  3. Enduring References to What?
  4. Private Properties
  5. Callbacks
  6. The Inadvertent Closure (an anti-pattern)
  7. Your Examples

#1: A Simple Bound Event Handler

One of the most common uses of closures is to bind a function and some data to an event handler:
    function wireUpMessage(element, msg)
    {
        Event.observe(
            element,
            'click',
            function()
            {
                alert(msg);
            }
        );
    }
This wireUpMessage function creates a closure (the anonymous function passed to Event.observe) and hooks it up to the "click" event of the given element, having it show the given message. The closure keeps a reference to the arguments and variables in scope where it was defined, so it has access to the msg argument we passed to the wireUpMessage function, even though that function has returned.

Assuming we have a button with the ID "btnSayHey", we can use this as follows:
    wireUpMessage('btnSayHey', 'Hey there');
Try it out:
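If you don't have a page with that button handy, here's a rough browser-free sketch of the same idea. Note that makeMessageHandler is a made-up stand-in for wireUpMessage (it isn't part of Prototype or of the code above): rather than hooking the closure up to a click event, it hands the closure back so we can call it directly.

```javascript
// Hypothetical stand-in for wireUpMessage: instead of hooking a DOM event,
// it returns the handler so we can call it ourselves.
function makeMessageHandler(msg)
{
    return function()
    {
        // 'msg' is still reachable here, even though
        // makeMessageHandler has already returned.
        return msg;
    };
}

var handler = makeMessageHandler('Hey there');
console.log(handler()); // Hey there
```

By the time handler runs, makeMessageHandler has long since returned, yet the closure can still read msg.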



#2: Enduring References

You could easily get the idea from the previous example that the closure we created somehow had the text "Hey there" bound into it literally. That's not true. It's the reference to the context containing the msg argument that's bound to the closure, not the value of that argument. The msg argument's value is evaluated when the closure is executed, not when it's defined. This is important, powerful, and frequently misunderstood. ;-)
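A tiny sketch of that point, runnable outside a browser (makeReporter and report are hypothetical names used only for illustration): the variable is changed after the closure is defined, and the closure sees the later value when it runs.

```javascript
function makeReporter()
{
    var msg;

    msg = 'first';
    function report()
    {
        // Reads 'msg' when called, not when defined
        return msg;
    }
    msg = 'second'; // Changed after 'report' was defined

    return report;
}

var report = makeReporter();
console.log(report()); // second
```

If the text had been bound in literally, this would have printed "first".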

Consider this:
    function setupCounterButtons(startBtnName, showBtnName, stopBtnName)
    {
        var counter;
        var intid;

        counter = 0;
        intid = undefined;

        function startCounter()
        {
            stopCounter();
            counter = 0;
            intid = window.setInterval(function() { ++counter; }, 200);
        }
        function stopCounter()
        {
            if (intid !== undefined)
            {
                window.clearInterval(intid);
                intid = undefined;
            }
        }
        function showCounter()
        {
            var msg;

            msg = "Counter = " + counter + " (";
            if (intid === undefined)
            {
                msg += "not ";
            }
            msg += "running)";
            alert(msg);
        }

        Event.observe(
            document.getElementById(startBtnName),
            'click',
            startCounter
        );
        Event.observe(
            document.getElementById(showBtnName),
            'click',
            showCounter
        );
        Event.observe(
            document.getElementById(stopBtnName),
            'click',
            stopCounter
        );
    }
This function has the local variables counter and intid, three named functions (which are closures), and a fourth anonymous closure we've passed into window.setInterval. The named functions are: startCounter, which starts a 200ms repeating update of the counter local variable; showCounter, which displays the current counter value; and stopCounter, which stops the 200ms update. setupCounterButtons hooks these functions up to the buttons whose IDs we pass into it. So if we hook up the buttons 'btnStartCounter1', 'btnShowCounter1', and 'btnStopCounter1' with this code:
    setupCounterButtons('btnStartCounter1', 'btnShowCounter1', 'btnStopCounter1');
...we get to try it out:



This demonstrates two important things: Firstly, the closures have an ongoing reference to the counter variable, not its literal value where they're defined. Secondly, all of the closures are referencing the same context: When startCounter sets the counter and intid values, showCounter sees those changes; when the closure we've enabled via window.setInterval updates counter, we see that update. All closures defined in the same context share access to that same context.
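The sharing can be sketched without buttons or timers. This is a simplified, hypothetical variant (makeCounter is not part of the example above): two closures are created in the same call, one changes counter, and the other sees the change.

```javascript
function makeCounter()
{
    var counter;

    counter = 0;

    // Both functions are defined in the same context,
    // so they share the same 'counter' variable.
    return {
        increment: function()
        {
            ++counter;
        },
        show: function()
        {
            return "Counter = " + counter;
        }
    };
}

var c1 = makeCounter();
c1.increment();
c1.increment();
console.log(c1.show()); // Counter = 2
```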

#3: Enduring References to What?

Look again at the previous example and think for a moment about what it is that the closures have access to, which endures even after the setup function has completed. Is it something unique attached to the setup function? Well, if that were true, what would happen if I called the function again and hooked it up to different buttons; would those buttons control and access the same counter that the first set of buttons do? If your impulse is to say "yes," consider that counter is a local variable within the function. Local variables aren't specific to the function for all eternity, right? They only relate to a specific time you've called that function. So if we call it again...yup, that's it, we get a new context, and the closures within that context access that new context, not the previous one.

And so if we hook up a new set of buttons:
    setupCounterButtons('btnStartCounter2', 'btnShowCounter2', 'btnStopCounter2');
We get a second counter. Try it out and see whether there's any interaction between the counter controlled by these buttons:



...and the one controlled by the buttons in the previous example.

Right, there isn't any, because they're referencing different counters.
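Here's a minimal sketch of the same thing without the buttons, assuming a made-up setupCounter function that just returns a closure: each call creates a fresh context, so the two closures count independently.

```javascript
// Hypothetical, stripped-down counterpart of setupCounterButtons
function setupCounter()
{
    var counter;

    counter = 0;
    return function()
    {
        return ++counter;
    };
}

var first = setupCounter();   // First call: first context
var second = setupCounter();  // Second call: a new, separate context

first();
first();
first();

var firstResult = first();    // Fourth call on the first counter
var secondResult = second();  // First call on the second counter

console.log(firstResult);  // 4
console.log(secondResult); // 1
```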

#4: Private Properties

Lots of JavaScript apps these days make extensive use of object orientation. One key OOP principle is information hiding. There are several good reasons for information hiding. One example is to defend against an object instance having an invalid state. Frequently this is done by having private data members and only allowing access to them via accessor methods that can prevent invalid values. Since all properties of JavaScript objects are public, how can we have private data members? The answer, of course, is to use closures, much as we did with the counters above. (Crockford is probably the first to document how you can do this, here.)

Here's an example, a Circle class that ensures that its radius is never allowed to become negative:
    function Circle(radius)
    {
        this.setRadius = function(r)
        {
            if (r < 0)
            {
                throw "Radius cannot be less than zero.";
            }
            radius = r;
        };
        this.getRadius = function()
        {
            return radius;
        };

        this.setRadius(radius);
    }
Note that we don't have a "radius" property on this object at all. This code:
    var c;
    c = new Circle(10);
    alert("Radius: " + c.radius); // undefined!!
...shows "Radius: undefined" because the property simply doesn't exist. Try it:



Instead, the setRadius and getRadius methods of the object are closures, both of which have a reference to the context in which they were defined, and they share that context as they were defined in the same scope. So rather than having a "radius" property, they simply use the radius parameter given to the constructor, which they keep a reference to even after the constructor has returned. Consequently, this code works just fine:
    var c;
    c = new Circle(10);
    alert("Radius before update: " + c.getRadius());
    c.setRadius(20);
    alert("Radius after update: " + c.getRadius());
Try it:



(I should mention that there is a downside to this. When you do this, all Circle objects [for example] have their own copies of the setRadius and getRadius functions, rather than sharing copies on an underlying Circle.prototype object. That's a whole different topic, though.)
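To make both the protection and the downside concrete, here's a small sketch you can run outside a browser. It uses a condensed copy of the Circle constructor, with one assumption of mine: it throws an Error object rather than a string. The variable names a, b, and rejected are just for illustration.

```javascript
// Condensed copy of the Circle constructor from this example
// (throws an Error object here instead of a string).
function Circle(radius)
{
    this.setRadius = function(r)
    {
        if (r < 0)
        {
            throw new Error("Radius cannot be less than zero.");
        }
        radius = r;
    };
    this.getRadius = function()
    {
        return radius;
    };

    this.setRadius(radius);
}

var a = new Circle(10);
var b = new Circle(10);
var rejected = false;

try
{
    a.setRadius(-5);
}
catch (e)
{
    rejected = true; // The setter refused the invalid value
}

console.log(rejected);                    // true
console.log(a.getRadius());               // 10 (unchanged)
console.log(a.getRadius === b.getRadius); // false (each instance has its own copy)
```

The last line is the downside mentioned above: the two circles don't share their accessor functions.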

#5: Callbacks

Closures get used as callbacks a lot. Of course, event handlers are callbacks and so we've already talked about this, but it's useful to remember non-event-handler callbacks.

One very common example is container objects that offer a method (usually called "each" or "forEach") that will call a callback once for each contained object. Prototype provides this (Enumerable.each) on arrays and several other objects; Dojo has something similar (dojo.forEach); JavaScript 1.6 defines this for arrays (Array.prototype.forEach). So if we have an array (say) of Person objects, we can act on each element like this (using Prototype syntax):
    personArray.each(
        function(person)
        {
            // Do something with the 'person' object
        }
    );
Suppose we wanted to build a sublist of only the people who are 65 and over:
    var sixtyFivePlus;
    sixtyFivePlus = [];
    personArray.each(
        function(person)
        {
            if (person.age >= 65)
            {
                sixtyFivePlus.push(person);
            }
        }
    );
That uses a closure to add each 65-and-over person to the sixtyFivePlus array.
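For a version that runs without Prototype, here's a sketch using the built-in array forEach method instead of each. The sample people are invented purely for illustration.

```javascript
// Made-up sample data; any objects with an 'age' property would do
var personArray = [
    { name: 'Ann', age: 70 },
    { name: 'Bob', age: 42 },
    { name: 'Cy',  age: 65 }
];

var sixtyFivePlus;
sixtyFivePlus = [];

// The callback is a closure: it can reach 'sixtyFivePlus'
// because that variable is in scope where the callback is defined.
personArray.forEach(
    function(person)
    {
        if (person.age >= 65)
        {
            sixtyFivePlus.push(person);
        }
    }
);

console.log(sixtyFivePlus.length);  // 2
console.log(sixtyFivePlus[0].name); // Ann
console.log(sixtyFivePlus[1].name); // Cy
```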

In practice, closures are used like this in modern JavaScript all over the place.

#6: The Inadvertent Closure (an anti-pattern)

In Example #1, I said that a very common use of closures was to bind a function and some data to an event handler. Unfortunately, while it may literally be true that that's a very common use, I suspect many times it's not the intent of the programmer creating the closure. I suspect much of the time, the programmer just wanted to bind the function, not any data, to the event handler, but didn't realize they were creating a closure and binding data as well.

Unless a page author is using DOM Level 0 event handling (e.g., onclick attributes in the HTML markup), usually there's a setup function called on page load that hooks up event handlers to the buttons and such using some analog of the Event.observe function I've been using in this post. Sometimes those setup functions are dealing with large data sets, perhaps a big JSON or XML document retrieved via an XMLHttpRequest query, referencing that data with local variables. Then the event handlers are defined within the setup function, which means they're closures, which means...right, they keep a reference to the setup function's context, and so the large data set is retained in memory.

Now, if that's on purpose -- the page is referring back to it periodically, etc. -- that's fine. But if the data set was just to be used for setup, it's a waste of memory. I call this the "Inadvertent Closure" anti-pattern.

Here's an example:
    function mySetup()
    {
        var setupData;
        var n;

        // Set 'setupData' to some large amount of data used only for setup;
        // the below is just a contrived array.
        setupData = [];
        for (n = 0; n < 1000; ++n)
        {
            setupData[n] = "Item " + n;
        }

        // (Presumably use the setup data for something here)

        // Hook up some event handlers
        Event.observe(
            'someButton',
            'click',
            function()
            {
                // Do something interesting
            }
        );
        Event.observe(
            'someOtherButton',
            'click',
            function()
            {
                // Do something else interesting
            }
        );
        // (Etc.)
    }
Here, a bunch of setup data is allocated and referenced via the setupData array, and then we hook up a couple of event handlers that don't need a reference to the setupData array.

Well, they may not need it, but they have it, which means that all of that data is sitting around eating up memory unnecessarily.

The good news is that it's easy to fix. There are at least two ways to fix it; the way I don't like, and the way I like. ;-)

The way I don't like: Just "null out" the big data set when you're done with it; in our example, we'd just add this line to the end of the mySetup function:
    setupData = undefined;
Personally, though, I prefer modularity: Break out the bit that needs to deal with the big data set into its own function, and the setup of the event handlers in their own function:
    function myBetterSetup()
    {
        doTheBigDataThing();
        setupEventHandlers();
    }
    function doTheBigDataThing()
    {
        var setupData;
        var n;

        // Set 'setupData' to some large amount of data used only for setup;
        // the below is just a contrived array.
        setupData = [];
        for (n = 0; n < 1000; ++n)
        {
            setupData[n] = "Item " + n;
        }

        // (Presumably use the setup data for something here)
    }
    function setupEventHandlers()
    {
        Event.observe(
            'someButton',
            'click',
            function()
            {
                // Do something interesting
            }
        );
        Event.observe(
            'someOtherButton',
            'click',
            function()
            {
                // Do something else interesting
            }
        );
        // (Etc.)
    }
This gives us a clear separation of what we're doing, and as a nice side-effect, prevents the closures from keeping our data set around unnecessarily.

At this point, a couple of you might be thinking "OMG! But that means any time I'm dealing with a big temporary data set, I need to be sure no closures are defined within that context! What if I want to act on the data set with an iteration function like in Example #5?!" Don't worry. Remember that the context is kept around because it has a reference from the closure; when the closure is released and cleaned up by the garbage collector, the context is also released and can be cleaned up by the GC. This only matters if you're keeping an enduring reference to the closure, as in the case of event handlers.
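As a rough sketch of that harmless case (summarizeSetupData is a hypothetical name, and the built-in forEach stands in for Prototype's each): the callback below is a closure over the function's context, but nothing holds on to it once the iteration finishes, so nothing keeps the data set alive after the function returns.

```javascript
function summarizeSetupData()
{
    var setupData;
    var n;
    var total;

    // The same contrived data set as in the example above
    setupData = [];
    for (n = 0; n < 1000; ++n)
    {
        setupData[n] = "Item " + n;
    }

    // This callback is a closure, but it is only referenced for the
    // duration of the forEach call; no event handler or other
    // long-lived object keeps it afterward.
    total = 0;
    setupData.forEach(
        function(item)
        {
            total += item.length;
        }
    );

    // Only the small summary value leaves the function
    return total;
}

console.log(summarizeSetupData()); // 7890
```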

#7 and on: Your Examples!

I've tried to give some overview of closures based on examples of usage, and I hope it's been useful, but I've really only scratched the surface here; what are your examples? Post away!

4 comments:

Petr 'PePa' Pavel said...

Mhmmm, very interesting again. I did want to learn AJAX one time or another, but I didn't know I would do it as a side product of reading your blog :-)

kangax said...

I have never seen "setupData = undefined;" as a way to break closure references. I've seen "delete something" or "something = null". Is "something = undefined" equivalent to "delete something"?

T.J. Crowder said...

@kangax - "setupData = undefined" (or "setupData = null") doesn't break the closure, it just releases the bunch of data that setupData references. So the closure still has access to the setupData variable, it's just that that variable doesn't refer to much of anything. (We can't delete setupData because it's not a property, it's a declared variable.) When clearing JavaScript variables, I prefer setting them to undefined (their initial state) instead of null, but null would also work for our purposes here.

This is part of why I prefer to use modularity rather than this "null out" approach. :)

Manoj said...

Very good article.