Sunday, 19 October 2008

Unobtrusiveness

Another link-post today (although again a link to a snippet I've written, so not completely OT) since most of my writing time is going to the Unofficial Prototype & script.aculo.us wiki at the moment: How To - Using Unobtrusive JavaScript

Thursday, 16 October 2008

Minimizing Download Times

Hello all,

That's right, first post since...wow, since April. And it's not even a post, it's sort of a link-post.

I've been doing some work helping build the Prototype user community (moderating the user discussion group, creating an unofficial wiki, that kind of thing) and as part of that I've been doing little mini-articles, much like the ones I expected to do here.

So if you're writing web applications or web pages and you're interested in minimizing the download times for your scripts, check out this article I posted over there: Tip - Minimizing Download Times

Tuesday, 1 April 2008

You must remember 'this'

Of all the tech blogs in all the sites in all the worldwide web, you walk into mine...

If you hang out in JavaScript-oriented newsgroups like these for any length of time, you will eventually see some variation of this question:

Hey, why doesn't this work?
function MyWidget(name)
{
    this.name = name;
    this.element = null;
}
MyWidget.prototype.showName = function()
{
    alert('The name is ' + this.name);
}
MyWidget.prototype.hookElement = function(element)
{
    this.element = element;
    Event.observe(this.element, 'click', this.showName);
}
function test()
{
    var widget;

    widget = new MyWidget('Test Name');
    widget.hookElement(document.getElementById('testDiv'));
}
"testDiv" is a div in the document, and I know that I'm not calling the test() function before the DOM is loaded, so why is it when I click the div I get the message "The name is undefined"?!
(As always, I'm using some convenience syntax in the above for hooking up the event handler.)

The OP (original poster) might even follow on with:
I even tried changing the observe() line to this:
Event.observe(this.element, 'click', function() {
    this.showName();
});
because I heard somewhere that you have to do that, but that's even worse, it causes an error saying this.showName() isn't a function?!
The issue here is that the OP hasn't quite grokked "this" and its special role in the JavaScript world.

I talked a bit about 'this' over here, but I wanted to do a post focussing on the specific pitfall the OP above, like so many of us, has fallen into (forgetting 'this') and how you deal with it.

Let's look at what's wrong with this line first:
Event.observe(this.element, 'click', this.showName); // Wrong
JavaScript doesn't have methods (see link above), and so this.showName just returns a function reference with absolutely no connection to the instance the OP wanted to bind to the element. It's just a function. (Used properly, this is a powerful feature, but in this situation it's causing the OP some trouble.) Recall that showName is defined like this:
MyWidget.prototype.showName = function()
{
    alert('The name is ' + this.name);
}
Within the code, 'this' is determined not by where the event handler is set up, but by how the function gets called. Most likely, 'this' will be a reference to the element that was clicked ('testDiv'), because modern browsers use the element related to the event as 'this' within event handlers. Consequently, this.name is undefined unless the element in question happens to have a name attribute.

So to get the intended effect, you have to wrap the call to this.showName() so that 'this' is the MyWidget instance when the code gets executed -- you must remember 'this'. Which is probably what the OP heard about when he tried this:
Event.observe(this.element, 'click', function() {
    this.showName();
}); // Still wrong
This is getting closer, and in fact it would work if we were using a variable to reference the widget rather than 'this', but because it's 'this', we actually still have exactly the same problem we had before: When the event handler gets called, 'this' is the element, not the widget, and so there's no showName() function to call.
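
Both failures are easy to reproduce without a browser. The sketch below is only an illustration: observe() and fire() are hypothetical stand-ins for Event.observe and the browser's event dispatch, fakeDiv is a plain object standing in for the element, and describe() returns its text rather than alerting so the result can be inspected.

```javascript
// Hypothetical stand-in for Event.observe: remembers the handler so it can
// be fired later the way a browser would fire it.
function observe(element, eventName, handler) {
    element.handlers = element.handlers || {};
    element.handlers[eventName] = handler;
}

// Hypothetical dispatch: calls the handler with 'this' set to the element.
function fire(element, eventName) {
    return element.handlers[eventName].call(element);
}

function MyWidget(name) {
    this.name = name;
}
MyWidget.prototype.describe = function() {
    return 'The name is ' + this.name;
};

var widget = new MyWidget('Test Name');
var fakeDiv = {};   // plain object standing in for 'testDiv'

// First attempt: hand over the bare function reference.
observe(fakeDiv, 'click', widget.describe);
var firstResult = fire(fakeDiv, 'click');

// Second attempt: wrap the call, but still rely on 'this' in the wrapper.
var secondError = null;
observe(fakeDiv, 'click', function() {
    return this.describe();
});
try {
    fire(fakeDiv, 'click');
} catch (e) {
    secondError = e;
}
```

In the first attempt 'this' is fakeDiv, which has no name property, so the text ends in "undefined". In the second, fakeDiv has no describe property either, so the call inside the wrapper throws a TypeError.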

So how do we deal with this? Well, here's one approach I've seen to rewriting the hookElement function:
MyWidget.prototype.hookElement = function(element)
{
    var self;

    this.element = element;
    self = this;
    Event.observe(this.element, 'click', function() {
        self.showName();
    });
}
This works because we're no longer using 'this' within the event handler, we're using 'self' (the event handler has access to 'self' because it's a closure; more here). So that solution works. I can't say I like it, though. It just feels...hacky, I guess. But still, it works, and although it looks a bit funny the first time, if you're familiar with the idiom you read right past it thereafter. You just need to be sure the closure isn't unnecessarily preserving some other big amount of data from elsewhere in the function.

Personally, though, I prefer using a reusable "binding" function. Many JavaScript toolkits have these (such as Prototype's bind() and bindAsEventListener()), but it's not complicated:
function bind(f, obj)
{
    return function() {
        return f.apply(obj, arguments);
    };
}
This is a function factory: It creates functions that, when called, will call the given function with the given object set as 'this' (using JavaScript's convenient apply() function; insert your own "The fundamental things apply" joke here). Now we can rewrite the OP's hookElement function like so (the only change from the original is the bind() call passed to Event.observe):
MyWidget.prototype.hookElement = function(element)
{
    this.element = element;
    Event.observe(this.element, 'click',
        bind(this.showName, this)
    );
}
You might be wondering why we have to specify 'this' twice. Remember that this.showName just returns a function reference, with nothing about the instance (we could replace this.showName in the above with MyWidget.prototype.showName if we liked). If we want bind() to know what instance we want to bind the function to, we have to specify it -- the this at the end.

And that's it! Now the event handler works as the OP expected it to.
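
Here is a small sketch of bind() at work outside a browser. The describe() function and the fakeDiv object are made up for the illustration (describe returns its text instead of alerting). The last line uses Function.prototype.bind, the built-in equivalent that ES5 added after this post was written, so it assumes an ES5-or-later engine.

```javascript
// Same shape as the bind() factory above.
function bind(f, obj) {
    return function() {
        return f.apply(obj, arguments);
    };
}

function MyWidget(name) {
    this.name = name;
}
MyWidget.prototype.describe = function(punctuation) {
    return 'The name is ' + this.name + punctuation;
};

var widget = new MyWidget('Test Name');
var fakeDiv = { name: 'not the widget' };   // hypothetical stand-in for an element

var handler = bind(widget.describe, widget);

// Even when the caller supplies a different 'this', the bound object is the
// one the function sees, and the arguments are passed straight through.
var viaBind = handler.call(fakeDiv, '!');

// The built-in version (ES5 and later) behaves the same way here.
var viaNative = widget.describe.bind(widget).call(fakeDiv, '!');
```

Both calls see the widget as 'this', not fakeDiv, which is exactly the property an event handler needs.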

Saturday, 29 March 2008

What's in a name?

Micro-post today, folks:

You see a fair bit of confusion from newbies in mailing lists around the "name" attribute vs. the "id" attribute. (Edit 21 March 2010: Confusion not helped by a bug in Internet Explorer; more here.) For instance, I recently saw a "how do I do this" -style post asking how to deal with a form, with the sample being:

<form name='form1'>
<label><input type='checkbox' id='cb1'> Checkbox 1</label>
</form>
The poster wanted to know how to retrieve the form and submit it (along with some further information) via Prototype's Ajax.Updater. So he wanted to use the serialize() method Prototype adds to form elements when it extends them (e.g., when you retrieve the element using $()).

Consequently, the use of "id" and "name" attributes in his example was exactly backward: He wanted to be able to retrieve the form using $(), which is a general-purpose routine that retrieves elements by their unique ID, and then have the form fields submitted -- but the form field had no name.

Here's the mantra:
Elements have IDs; form fields have names.
IDs are unique in the document; form field names are not (necessarily).

Two further notes:

1. Form fields can also have IDs if you need to refer to their elements in your code (to enable/disable them, etc.), but in terms of sending in a form, fields have names.

2. You don't have to use an ID to get at the element for a form; you can use a name and then get at the form element via document.YourFormName in your JavaScript code. But nowadays we mostly look elements up by their unique IDs; and in the specific case of the question from the poster above, since he was going to want to use $(), he would want an ID.
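
To make the mantra concrete, here is a rough sketch of what any form serializer has to do. serializeFields() is a hypothetical helper written for this note, not Prototype's serialize(), and it works on plain descriptor objects rather than real elements. It only illustrates the rule that a field without a name contributes nothing to what gets sent.

```javascript
// Hypothetical helper: builds a query string from plain field descriptors,
// skipping any field that has no name.
function serializeFields(fields) {
    var pairs = [];
    var i, field;

    for (i = 0; i < fields.length; i++) {
        field = fields[i];
        if (!field.name) {
            continue;   // no name, nothing to send
        }
        pairs.push(encodeURIComponent(field.name) + '=' +
                   encodeURIComponent(field.value));
    }
    return pairs.join('&');
}

// As in the question: the checkbox has an id but no name.
var asPosted = serializeFields([
    { id: 'cb1', value: 'on' }
]);

// With names on the fields (an id as well is fine).
var corrected = serializeFields([
    { id: 'cb1', name: 'cb1', value: 'on' },
    { name: 'comment', value: 'hello world' }
]);
```

The first call yields an empty string, the second a query string with both fields in it.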

Monday, 24 March 2008

Mythical methods

Note: As of ES2015, JavaScript arguably got methods (and that is the term the spec uses) via a new notation on object initializers and via the class syntax. This article predates ES2015 and refers to non-method properties on prototypes, created with the function keyword (the new method syntax doesn't use the function keyword).
We frequently talk about JavaScript objects having methods. This is just a convenient myth. JavaScript has functions, but it doesn't have methods. It doesn't need them. Its functions, combined with some syntactic sugar, are more than up to the job.

What is a "method"? I'd have to say that the definition given on Wikipedia is pretty good. As of this writing, it says a method is
...a subroutine that is exclusively associated either with a class...or with an object...
Yeah, JavaScript doesn't have any of those. Granted it seems to have them. For example:
function Guess(killer, location, weapon)
{
    this.killer = killer;
    this.location = location;
    this.weapon = weapon;
}
Guess.prototype.accuse = function()
{
    alert('It was ' + this.killer +
          ' in the ' + this.location +
          ' with the ' + this.weapon +
          '!');
};

function testGuess()
{
    var mustardStudyLeadPipe;
    var plumHallRope;

    mustardStudyLeadPipe = new Guess(
        'Colonel Mustard',
        'study',
        'lead pipe');
    mustardStudyLeadPipe.accuse();

    plumHallRope = new Guess(
        'Professor Plum',
        'hall',
        'rope');
    plumHallRope.accuse();
}
Running testGuess does indeed show us
"It was Colonel Mustard in the study with the lead pipe!"
and then
"It was Professor Plum in the hall with the rope!"
so that looks an awful lot like accuse is a method of Guess objects.

Except it isn't. Let's add a bit to our testGuess function (the new bits are the accuse variable and the two lines at the end):
function testGuess()
{
    var mustardStudyLeadPipe;
    var plumHallRope;
    var accuse;

    mustardStudyLeadPipe = new Guess(
        'Colonel Mustard',
        'study',
        'lead pipe');
    mustardStudyLeadPipe.accuse();

    plumHallRope = new Guess(
        'Professor Plum',
        'hall',
        'rope');
    plumHallRope.accuse();

    accuse = mustardStudyLeadPipe.accuse;
    accuse();
}
What does the final call to accuse show? Does it accuse Colonel Mustard? No, all we've done is get a reference to the accuse function into our local variable, there's nothing there (or in the function definition) that refers to the mustardStudyLeadPipe instance or indeed anything related to Guess. No, what this shows will depend on the HTML document in which it's found, but it'll be something along these lines:
"It was undefined in the http://blog.niftysnippets.org with the undefined!"
Why? Because we didn't do anything to define "this" within the function. (I'll come back to details on why we'd get a seemingly-odd alert that includes a URL in a bit.)

There are three main things about JavaScript that make it seem to have methods (leaving aside prototypes and "classes" for the moment): The "this" keyword, the fact that object properties can refer to functions (since they are, after all, just objects), and the fact that when you call a function using an expression that gets the function reference from an object property (e.g., object.functionName() or object['functionName']()), the object is set automatically as "this" within the function call.

Let's look at each of those.

The "this" keyword: This keyword looks familiar if you're coming to JavaScript from a background in C++, Java, C#, and the like, where "this" within a method is guaranteed to refer to an instance of the class defining the method (or a subclass). And in JavaScript, "this" does refer to an object instance, but that's where the similarity ends. Other than the name being the same and it referencing an object, it bears no relation to the "this" keyword in class-based languages like C++, Java, or C#. In fact, in many ways, "this" is actually just a function argument that is supplied in a non-obvious (but convenient!) way.

Functions as properties: Let's look again at one line from the code above:
mustardStudyLeadPipe.accuse();
If we were talking about that line of code, we'd probably say "...the accuse method of the mustardStudyLeadPipe object...", which is a useful and convenient way to put it. A more painstakingly-geeky-accurate way of putting it, though, would be "...the function referenced by the accuse property of the mustardStudyLeadPipe object..." Not that anyone's going to say that, but that's really what's going on. Objects don't have methods, they have properties; it's just that a property can refer to a function, since functions are objects like everything else.

Functions called via property references get "this" set for them: This is the part that really makes it seem like JavaScript has methods: The "this" reference gets set automagically to the object instance if you call a function via a property reference. Let's look at that call again:
mustardStudyLeadPipe.accuse();
This does three completely distinct things: Firstly, it identifies a function to call by getting the function reference from an object property; secondly, it says to call the function and return its result rather than just get a reference to it (the parentheses do that); and thirdly, it says that within that call, use the object in question as "this". These completely distinct things are combined in that notation for our convenience. Getting the function reference from an object property doesn't link it in any way to the object the property came from (as our accuse test above confirmed), it just gets a reference to the function; the JavaScript engine treats calls of functions just retrieved from object properties as special and sets up "this" accordingly, but it has nothing to do with the function being called.

Let's prove that another way:
function testGuess2()
{
    var plumHallRope;
    var fakeGuess;

    plumHallRope = new Guess(
        'Professor Plum',
        'hall',
        'rope');
    plumHallRope.accuse();

    fakeGuess = {};
    fakeGuess.location = 'library';
    fakeGuess.killer = 'Mrs. Peacock';
    fakeGuess.weapon = 'revolver';
    fakeGuess.demo = plumHallRope.accuse;
    fakeGuess.demo();
}
This accuses Professor Plum as before, and then:
"It was Mrs. Peacock in the library with the revolver!"
The exact same function produces the alerts in both cases, it's purely the way it was called that defined "this". The fakeGuess object isn't even really a Guess -- "fakeGuess instanceof Guess" returns false -- but it has all of the properties the function expects (killer, location, weapon), and so it works just fine (this is something we'll come back to in a later post). Essentially, "this" is just a function argument that's passed into the function as an implicit feature of calling a function via a property reference.

We can also do it explicitly. JavaScript gives us the call and apply methods on function instances, with which we can say explicitly what we want "this" to be. (They do the same thing, they just provide different ways to specify the function's arguments.) If you say myfunction.call(myobject), you're saying "call myfunction and use myobject as 'this'", which is what the property-retrieval stuff does implicitly for you. In fact, this:
plumHallRope.accuse();
equates to
plumHallRope.accuse.call(plumHallRope);
Now all three of the parts we identified above are shown distinctly: Getting the function reference from the property (plumHallRope.accuse) is distinct from the fact we're calling it (call) is distinct from what "this" should be ((plumHallRope)). The fact that these are distinct can be made clearer:
var f;
f = plumHallRope.accuse;
f.call(plumHallRope);
Alternately, here's a more dramatic example:
function testGuess3()
{
    var plumHallRope;
    var fakeGuess;

    plumHallRope = new Guess(
        'Professor Plum',
        'hall',
        'rope');
    plumHallRope.accuse();

    fakeGuess = {};
    fakeGuess.location = 'library';
    fakeGuess.killer = 'Mrs. Peacock';
    fakeGuess.weapon = 'revolver';

    plumHallRope.accuse.call(fakeGuess);
}
Not to bang on about it, but the call at the end accuses Mrs. Peacock, not Professor Plum. The plumHallRope instance was only used to get the function reference, it wasn't used within the function call. In our original testGuess function, we could even do this if we wanted to:
mustardStudyLeadPipe.accuse.call(plumHallRope);
Which accuses Professor Plum. But it's more convenient, isn't it, to say plumHallRope.accuse?
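
Since call and apply differ only in how the arguments are supplied, a quick side-by-side may help. This is just a sketch: accuseWith is a hypothetical variant of accuse that takes two arguments and returns the text instead of alerting it.

```javascript
function Guess(killer, location, weapon) {
    this.killer = killer;
    this.location = location;
    this.weapon = weapon;
}

// Hypothetical variant of accuse: takes arguments, returns the text.
Guess.prototype.accuseWith = function(prefix, suffix) {
    return prefix + this.killer +
           ' in the ' + this.location +
           ' with the ' + this.weapon + suffix;
};

var plumHallRope = new Guess('Professor Plum', 'hall', 'rope');
var fakeGuess = {
    killer: 'Mrs. Peacock',
    location: 'library',
    weapon: 'revolver'
};

// call: 'this' first, then the arguments one by one.
var viaCall = plumHallRope.accuseWith.call(fakeGuess, 'It was ', '!');

// apply: 'this' first, then the arguments as a single array.
var viaApply = plumHallRope.accuseWith.apply(fakeGuess, ['It was ', '!']);
```

Both produce the same text, and both accuse Mrs. Peacock, because fakeGuess is what was supplied as 'this'.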

So if "this" is really just a sort of obscure function argument, why have it? Why not just write everything as global functions and pass in the object to act on as the first argument? We could do that (and in fact I did it for years, as a C programmer), but it's more cumbersome. You have to know what functions are meant to be used with what kinds of objects, there's all sorts of stuff littering up the global namespace, etc., etc. By passing around object references, which have properties on them (usually inherited from their prototype; again the topic of an upcoming post) that reference functions intended for use with them, it's just so much more convenient.

Okay, but what was that about that URL showing up earlier when we called accuse? You'll remember earlier when we ran this code:
    accuse = mustardStudyLeadPipe.accuse;
    accuse();
we got the alert with the URL in it. So why did that happen, and where did the URL come from? It was because we didn't give an object to use for "this", and so the function call got the default, which is the JavaScript global object. In browser implementations, the global object is the window object -- so we were showing the values of window.killer, window.location, and window.weapon. In a typical situation, window.killer and window.weapon will be undefined, and of course window.location is the URL of the document in the window.
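
The default can be checked outside a browser as well. This sketch makes two assumptions that go beyond the post: it uses globalThis, a later standard name for the global object, and it includes a function in ES5 strict mode (which arrived after this was written), where a plain call gets undefined as "this" instead of the global object.

```javascript
// Functions built with the Function constructor are non-strict unless their
// own body says otherwise, so a plain call gets the global object as 'this'
// no matter how the surrounding file is loaded.
var looseThis = new Function('return this;');

function strictThis() {
    'use strict';
    return this;
}

var looseResult = looseThis();     // called with no object at all
var strictResult = strictThis();   // same kind of call, strict function
```

Here looseResult is the global object (window, in a browser) and strictResult is undefined.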

I chose that example intentionally, because it's a pitfall people fall into a lot: Losing "this". Usually, people lose "this" in the context of an event handler -- e.g., they have an instance (plumHallRope, perhaps), and want to hook up a "method" on it (say, accuse) to an event (a click handler for a button, maybe?), and so understandably they do something like this (I'm again using convenience syntax, as I describe here):
Event.observe('theButton', 'click', plumHallRope.accuse); // WRONG
The problem being, that just references the function, nothing about its context. Its context will be determined by how it's called. Earlier when we called accuse directly, because we didn't do that via property retrieval or call or apply, "this" was the global object; event handlers like this one will usually get "this" set to the element (although not if you're using old DOM0 -- onclick attribute -- events or IE's attachEvent function). To maintain its context, you have to do something like this:
Event.observe('theButton', 'click', function() { plumHallRope.accuse(); });
The function that gets called by the event handler then turns around and calls the accuse function such that plumHallRope is "this", so you've preserved "this" using a wrapper (which is also a closure; details). (Note that thanks to a bug in Internet Explorer, that may well cause a memory leak for your IE users. Prototype's Event.observe() does some things to minimize the issue; other frameworks will have other helpers along those lines.)

In Conclusion:

I've seen people deride JavaScript for having "fake" methods, but I think that's missing the point. It doesn't have fake methods, it just doesn't have methods at all. What it does have is very powerful, flexible functions and some convenient syntactic sugar that lets us express the 90% case -- calling a function related to an object instance passing in that object instance as an argument -- in a compact and expressive way, but without limiting our use of the functions involved. By not limiting our use, we can use just about anything as a mixin, we can use duck typing, easily create wrapper objects to enforce or verify contracts, etc.

Like so many things about the language, it's confusing until you grok just how simple it is, and then you start appreciating that it's simple but powerful.
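
As a small illustration of the mixin and duck-typing point, here is a sketch. Describable, Widget, Person, and mixin() are all names invented for the example. The shared function only cares that "this" has a name property, not what kind of object it is.

```javascript
// Hypothetical mixin: a plain object holding a function that relies only on
// 'this' having a 'name' property.
var Describable = {
    describe: function() {
        return 'I am ' + this.name;
    }
};

function Widget(name) {
    this.name = name;
}
function Person(name) {
    this.name = name;
}

// Copies the source's own properties (function references included).
function mixin(target, source) {
    var key;
    for (key in source) {
        if (Object.prototype.hasOwnProperty.call(source, key)) {
            target[key] = source[key];
        }
    }
    return target;
}

mixin(Widget.prototype, Describable);
mixin(Person.prototype, Describable);

var widgetText = new Widget('a widget').describe();
var personText = new Person('Sam').describe();
var sameFunction = Widget.prototype.describe === Person.prototype.describe;
```

One function object serves both kinds of instance, because nothing ties it to either of them.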

Convenience Syntax in Examples

This is a "meta" post, i.e., it's about the blog itself.

In many of my posts, I give code examples to illustrate a point. To avoid cluttering the examples up with browser-specific stuff, I'm using a couple of calls from the Prototype library.

Actually, right now, just the one: Event.observe(). This function hooks up an event handler in a way that is consistent across browsers, even Internet Explorer, which is good because IE doesn't support the standard addEventListener() function, and its own attachEvent() function behaves slightly differently (within the handler, "this" is the global object, like in old DOM0 handlers; whereas with addEventListener it's the element you're hooking the event to). Prototype's Event.observe() ensures that "this" is always the element, even on IE, and also does some housekeeping to try to mitigate an IE bug where it tends to leak memory if you use closures as event handlers (which is incredibly common).
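
For a rough idea of what such a helper has to do, here is a minimal sketch. It is not Prototype's implementation: observe() here is a hypothetical function that only normalizes "this", and it leaves out the memory-leak housekeeping and any way to unhook the handler again.

```javascript
// Hypothetical helper, not Prototype's Event.observe.
function observe(element, eventName, handler) {
    if (element.addEventListener) {
        // Standard path: the handler is called with 'this' = element.
        element.addEventListener(eventName, handler, false);
    } else if (element.attachEvent) {
        // Old IE path: wrap the handler so 'this' is the element here too.
        element.attachEvent('on' + eventName, function(event) {
            return handler.call(element, event);
        });
    }
}
```

The wrapper on the attachEvent branch is itself a closure over element and handler, which is the kind of closure-as-event-handler that the IE leak concerns, so a real library has more work to do than this.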

If I use more Prototype stuff, I'll list it here as well, but this is only about trying to keep the examples simple and clear, not about using Prototype or any other specific library...

Tuesday, 18 March 2008

The Horror of Implicit Globals

This code should cause an error, right?

function doSomething()
{
    var x;

    x = 10;
    y = 20;

    alert('x = ' + x);
    alert('y = ' + y);
}
Well, you'd think so, wouldn't you? (There's no declaration for y.) It doesn't, though. It also probably doesn't do what the author intended. Welcome to The Horror of Implicit Globals. The good news is that there's something we can do about it.

Update 2010/03/31: Please see the update at the end of the article, there's good news on this front.
So what did the code really do? Well, in a simple test, it would seem to do what the author intended, in that it shows two alerts:
  • x = 10
  • y = 20
But x and y are completely different; x is a local variable within the function, whereas y is created as a "global variable". Said "global variable" is not declared anywhere at all, it exists purely because it was assigned to by this function. Functions can create global variables willy-nilly with no advance declaration or indeed, with no declaration at all.

This is clearly a Bad Thing, so why would the language designers let it happen? Well, in some sense I think it's a side-effect. To see why we have the possibility of implicit globals, we have to delve into objects and properties.

If you've done any JavaScript stuff with objects, you know that you can assign a property to an object simply by, well, assigning it:
myobject.myprop = 5;
This sets the property myprop on the object myobject, creating a new property if the property didn't exist before. This is simple, it's convenient, and it's part of the power of JavaScript that objects can just acquire properties without any preamble.

But what does this have to do with implicit global variables? Well, you see, JavaScript doesn't have global variables. No, seriously. It has a global object (just the one), and that object has properties. Those properties are what we think of as global (or page-scope) variables. You might think that this is a distinction without a difference, but if it were, the code shown at the beginning of this post would fail because y isn't declared. It doesn't fail, because by assigning y a value, we created a property on the global object.

"Now hang on," you're saying, "there's nothing in that code referencing some kind of 'global object'." Ah, but there is. And that brings us to the scope chain.

I've mentioned the scope chain before. The scope chain is how JavaScript resolves unqualified references: In any given execution context (e.g., global code or function), there is a chain of objects that the JavaScript engine looks to when resolving an unqualified reference: It first checks to see if the top object has a property with the given name, and uses that property if it does; if not, it checks the next object in the chain, etc. The global object is always the last object in the chain. If the engine gets all the way down to the global object and we're assigning a value to the unqualified reference, the reference is assigned as a property of the global object. And since properties don't have to be declared in advance, voilà, implicit globals.
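
A tiny sketch of that lookup order, using made-up names (label, outer, inner, noShadow). Each unqualified reference is resolved by starting with the innermost scope and working outward.

```javascript
var label = 'outermost';

function outer() {
    var label = 'outer';

    function inner() {
        var own = 'inner';
        // 'own' is found in inner's own scope; 'label' is found one step
        // further along the chain, in outer's scope, so the lookup stops
        // there and never reaches the outermost 'label'.
        return own + ' sees ' + label;
    }
    return inner();
}

function noShadow() {
    // Nothing local is called 'label', so the lookup continues outward to
    // the enclosing scope.
    return 'noShadow sees ' + label;
}

var fromInner = outer();
var fromNoShadow = noShadow();
```

The first call reports "inner sees outer" and the second "noShadow sees outermost".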

We can see this in practice. The JavaScript specification says simply that there is a global object and that it's always the root of the scope chain, but it doesn't say what it is -- which makes sense as JavaScript is a general-purpose language. In browser-based implementations, though, the global object has a name: window. Technically, the symbol window in browser-based JavaScript is a property of the global object that refers to the object itself. I mention this not only because it's useful to know, but because it lets us prove to ourselves that "global variables" are really properties of the global object. Let's add a line to our code from above:
function doSomething()
{
    var x;

    x = 10;
    y = 20;

    alert('x = ' + x);
    alert('y = ' + y);
    alert('window.y = ' + window.y);
}
This starts out by showing the same two alerts we had before:
  • x = 10
  • y = 20
...and then it shows a third:
  • window.y = 20
...because y and window.y both refer to the same property of the same object, the global object.
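
The same thing can be seen outside a browser. This sketch assumes an engine that provides globalThis (a later standard name for the global object), and it builds the function with the Function constructor purely so that the body runs as non-strict code however the surrounding file is loaded. The name implicitY is made up for the example.

```javascript
// The body declares x but assigns to implicitY without declaring it.
var doSomething = new Function(
    'var x = 10;' +
    'implicitY = 20;' +
    'return x + implicitY;'
);

var sum = doSomething();

// The undeclared name is now a property of the global object...
var onGlobal = globalThis.implicitY;

// ...whereas the declared local never left the function.
var xOnGlobal = 'x' in globalThis;
```

After the call, globalThis.implicitY is 20, while x is nowhere to be found on the global object.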

All of this holds true even if we do declare y at global scope:
var y;
function doSomething()
{
    var x;

    x = 10;
    y = 20;

    alert('x = ' + x);
    alert('y = ' + y);
    alert('window.y = ' + window.y);
}
This shows the same three alerts shown earlier. (There is a very slight technical difference between declared and undeclared globals that relates to the delete statement, but the distinction isn't important here, it's just a flag on the property.)

Now, some readers will be thinking "Cool! This means I don't have to have 'var' statements for my globals! I can save a bit of space on the downloaded script files!" Well, true, you could. It's an awfully bad idea, though. Not only would it be a bit unfriendly to anyone else trying to read your script (I suppose you could address that somewhat with comments that get stripped out before download), but it would also mean you couldn't use any lint tools, and you really, really want to. Because the ramification of all of this is that a simple typo, perhaps:
thigny = 10;
instead of
thingy = 10;
can create a new global variable in your page, introducing a bug that's awfully hard to find. Fortunately, though, people have created various tools to do lint checking on JavaScript, tools that can find that error for you before you even get to the testing stage, much less production.

And so there we are, the horror of implicit globals -- and the relief of knowing that we have a defense against them!

Update 2010/03/31: Great news for anyone who's been hit by the Horror of Implicit Globals. The new ECMAScript 5th edition specification is out, and it introduces "strict mode." One of the (several) things strict mode does is prevent the very thing this article warns about: Implicit globals. If we were using strict mode, the assignment to y in the function at the beginning of this page would throw a ReferenceError when it ran, rather than quietly creating a very subtle bug. Result!
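
A minimal sketch of that protection, assuming an ES5-or-later engine. The names doSomethingStrict and undeclaredStrictY are made up for the example, and globalThis is a later standard name for the global object.

```javascript
function doSomethingStrict() {
    'use strict';
    var x;

    x = 10;
    undeclaredStrictY = 20;   // never declared anywhere

    return x;
}

var caught = null;
try {
    doSomethingStrict();
} catch (e) {
    caught = e;
}

// Nothing was created on the global object.
var leaked = 'undeclaredStrictY' in globalThis;
```

The call throws a ReferenceError when it reaches the assignment, and no global property is left behind.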