Tuesday 18 March 2008

The Horror of Implicit Globals

This code should cause an error, right?

function doSomething()
{
var x;

x = 10;
y = 20;

alert('x = ' + x);
alert('y = ' + y);
}
Well, you'd think so, wouldn't you? (There's no declaration for y.) It doesn't, though. It also probably doesn't do what the author intended. Welcome to The Horror of Implicit Globals. The good news is that there's something we can do about it.

Update 2010/03/31: Please see the update at the end of the article, there's good news on this front.
So what did the code really do? Well, in a simple test, it would seem to do what the author intended, in that it shows two alerts:
  • x = 10
  • y = 20
But x and y are completely different; x is a local variable within the function, whereas y is created as a "global variable". Said "global variable" is not declared anywhere at all, it exists purely because it was assigned to by this function. Functions can create global variables willy-nilly with no advance declaration or indeed, with no declaration at all.

This is clearly a Bad Thing, so why would the language designers let it happen? Well, in some sense I think it's a side-effect. To see why we have the possibility of implicit globals, we have to delve into objects and properties.

If you've done any JavaScript stuff with objects, you know that you can assign a property to an object simply by, well, assigning it:
myobject.myprop = 5;
This sets the property myprop on the object myobject, creating a new property if the property didn't exist before. This is simple, it's convenient, and it's part of the power of JavaScript that objects can just acquire properties without any preamble.

But what does this have to do with implicit global variables? Well, you see, JavaScript doesn't have global variables. No, seriously. It has a global object (just the one), and that object has properties. Those properties are what we think of as global (or page-scope) variables. You might think that this is a distinction without a difference, but if it were, the code shown at the beginning of this post would fail because y isn't declared. It doesn't fail, because by assigning y a value, we created a property on the global object.

"Now hang on," you're saying, "there's nothing in that code referencing some kind of 'global object'." Ah, but there is. And that brings us to the scope chain.

I've mentioned the scope chain before. The scope chain is how JavaScript resolves unqualified references: In any given execution context (e.g., global code or function), there is a chain of objects that the JavaScript engine looks to when resolving an unqualified reference: It first checks to see if the top object has a property with the given name, and uses that property if it does; if not, it checks the next object in the chain, etc. The global object is always the last object in the chain. If the engine gets all the way down to the global object and we're assigning a value to the unqualified reference, the reference is assigned as a property of the global object. And since properties don't have to be declared in advance, voilá, implicit globals.

We can see this in practice. The JavaScript specification says simply that there is a global object and that it's always the root of the scope chain, but it doesn't say what it is -- which makes sense as JavaScript is a general-purpose language. In browser-based implementations, though, the global object has a name: window. Technically, the symbol window in browser-based JavaScript is a property of the global object that refers to the object itself. I mention this not only because it's useful to know, but because it lets us prove to ourselves that "global variables" are really properties of the global object. Let's add a line to our code from above:
function doSomething()
{
var x;

x = 10;
y = 20;

alert('x = ' + x);
alert('y = ' + y);
alert('window.y = ' + window.y);
}
This starts out by showing the same two alerts we had before:
  • x = 10
  • y = 20
...and then it shows a third:
  • window.y = 20
...because y and window.y both refer to the same property of the same object, the global object.

All of this holds true even if we do declare y at global scope:
var y;
function doSomething()
{
var x;

x = 10;
y = 20;

alert('x = ' + x);
alert('y = ' + y);
alert('window.y = ' + window.y);
}
This shows the same three alerts shown earlier. (There is a very slight technical difference between declared and undeclared globals that relates to the delete statement, but the distinction isn't important here, it's just a flag on the property.)

Now, some readers will be thinking "Cool! This means I don't have to have 'var' statements for my globals! I can save a bit of space on the downloaded script files!" Well, true, you could. It's an awfully bad idea, though. Not only would it be a bit unfriendly to anyone else trying to read your script (I suppose you could address that somewhat with comments that get stripped out before download), but it would also mean you couldn't use any lint tools, and you really, really want to. Because the ramification of all of this is that a simple typo, perhaps:
thigny = 10;
instead of
thingy = 10;
can create a new global variable in your page, introducing a bug that's awfully hard to find. Fortunately, though, people have created various tools to do lint checking on JavaScript, tools that can find that error for you before you even get to the testing stage, much less production.

And so there we are, the horror of implicit globals -- and the relief of knowing that we have a defense against them!

Update 2010/03/31: Great news for anyone who's been hit by the Horror of Implicit Globals. The new ECMAScript 5th edition specification is out, and it introduces "strict mode." One of the (several) things strict mode does is prevent the very thing this article warns about: Implicit globals. If we were using strict mode, the function at the beginning of this page would have a syntax error rather than a very subtle bug. Result!

3 comments:

T.J. Crowder said...

In the above, I talked a lot about the fact that assigning a value to an undeclared, unqualified identifer creates a property on the global object. E.g.:

x = 5;

...creates window.x with the value 5, assuming x isn't declared as a local variable somewhere. It's interesting to note, though, that if you try to retrieve the value of an undeclared, unqualified identifier, it is not treated as a property of the global object unless the property already exists. So if x isn't declared anywhere and there isn't (yet) a window.x property, this line of code results in an error:

var y = x; // Error!

...whereas a qualified reference behaves differently:

var y = window.x; // 'y' gets the value 'undefined'

The relevant sections of the ECMA spec are 8.7 ("The Reference
Type") and 10.1.4 ("Scope Chain and Identifier Resolution").

-- T.J.

Unknown said...

I've a mayor problem with globals in javascript. See following example:


var object1; var data;
object1 = new Obj();

function Obj() { this.id=0; this.name='';}

Obj.prototype.setName = function(name) { this.name = name;}

function init() {
object1.setName("Chris");
alert (object1.name); // alerts 'Chris'

$.post('my_json_list.php', function(data) {
data = JSON.parse(data);
object1.setName(data.name);
alert (object1.name); // alerts 'John'
});
// After completing the data transfer alert the value of oject
alert ('After data transfer thevalue is: '+ object1.name); // alerts 'Chris' (!)x

}

PHP file is:

T.J. Crowder said...

@Ben: The problem there doesn't have anything to do with globals, it has to do with ajax calls being (by default) asynchronous. See this post for details. (In the code in that post, I've modified your code slightly so it's not using globals unnecessarily, but again, that's not related to the problem.) Enjoy!