Tuesday 13 October 2009

A better way to hide .htaccess, WEB-INF

If you use Apache as your web server, you may want to have files or directories that Apache pretends are not there. For me, this is because I like to have Apache proxy a servlet container backend, but I'm too lazy to separate out the files, so I just point Apache and the servlet container at the same directory and tell Apache to pass the relevant requests (JSPs, servlets, etc.) on to the servlet container. The only problem is that the shared directory will contain things like the WEB-INF directory and its subdirectories, which I really don't want Apache to serve up to the public! If you use .htaccess files, you're in the same situation.
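
To make that setup concrete, here is a rough sketch of what it might look like with mod_proxy. Everything specific in it is an assumption for illustration: the /srv/webapp path, the localhost:8080 backend address, and the /servlet/ prefix are hypothetical, and a connector such as mod_jk could do the same job.

```apache
# Hypothetical layout: Apache and the servlet container share one directory.
# Assumes mod_proxy and mod_proxy_http are loaded.
DocumentRoot "/srv/webapp"

# Hand JSP requests to the servlet container (backend address is assumed).
ProxyPassMatch "^/(.*\.jsp)$" "http://localhost:8080/$1"

# Hand everything under an assumed /servlet/ prefix to the backend as well.
ProxyPass "/servlet/" "http://localhost:8080/servlet/"
```

Anything not matched by those rules is served by Apache straight from the shared directory, which is exactly why WEB-INF ends up exposed.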

The usual answer is to simply deny access to the file or directory in question, as in this default rule from the Apache2 config:

<Files ~ "^\.ht">
Order allow,deny
Deny from all
</Files>

That sends a 403 (Forbidden) reply when someone tries to access .htaccess. The same approach works for WEB-INF using the DirectoryMatch directive:

<DirectoryMatch "(^|/)WEB-INF($|/)">
Deny from all
</DirectoryMatch>
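
Order and Deny are the Apache 2.2 access-control directives. On Apache 2.4 and later, the same two rules would normally be written with Require instead. A minimal sketch, assuming a 2.4 server with mod_authz_core loaded:

```apache
# Apache 2.4+ equivalents of the two deny rules above (assumes mod_authz_core).
<Files ~ "^\.ht">
    Require all denied
</Files>
<DirectoryMatch "(^|/)WEB-INF($|/)">
    Require all denied
</DirectoryMatch>
```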

And that's great, but I'm a bit paranoid -- why does the outside world need to know that the thing is there at all? I'd much rather it sent a 404. And huzzah, mod_rewrite to the rescue!

<DirectoryMatch "(^|/)WEB-INF($|/)">
RewriteEngine on
RewriteRule .* - [L,R=404]
</DirectoryMatch>

Since the RewriteRule is already inside a DirectoryMatch that only matches what we want, its own regex can just match everything, and the "-" substitution leaves the URL untouched. The R flag is the magic: it normally forces a redirect, and the R=xyz form sets the status code. When that code is outside the 3xx range, as with 404 here, Apache sends no redirect at all; it drops the substitution, stops rewriting as if L had been given (so the L flag is harmless but redundant), and answers with that status. This does the right thing even if you have a custom error document (and you have custom error documents, right?).

Voilà! As far as the outside world is concerned, there just isn't a WEB-INF directory there at all.
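
The same trick can be pointed at .htaccess and friends. Rewrite rules are officially unsupported inside <Files> sections, so this sketch assumes the rule goes in the server or virtual host config instead, where the pattern is matched against the URL path:

```apache
# Sketch: answer 404 for any URL whose path has a segment starting with ".ht"
# (.htaccess, .htpasswd, ...). Assumed to sit at server or virtual-host level,
# outside any <Directory> block, with mod_rewrite loaded.
RewriteEngine on
RewriteRule "(^|/)\.ht" - [R=404]
```

Rules in the main server config are not inherited by virtual hosts by default, so with virtual hosts the rule would need to live inside each one (or be pulled in with RewriteOptions Inherit).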

If you like, you can have a general rule that works whether mod_rewrite is loaded or not:

<DirectoryMatch "(^|/)WEB-INF($|/)">
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteRule .* - [L,R=404]
</IfModule>
<IfModule !mod_rewrite.c>
Deny from all
</IfModule>
</DirectoryMatch>

This will do a 404 if mod_rewrite is loaded, but fall back to a 403 if not.

Final note: If you're letting Apache generate directory listings for you (by not including an index file), the above won't hide the WEB-INF directory (or whatever you're hiding) in those listings. It's easy enough to fix, though: inside the relevant Directory section for the directory being listed, use IndexIgnore to tell the mod_autoindex module to leave it out:

IndexIgnore WEB-INF

In my case, since I want WEB-INF to be hidden everywhere, I can just include it inside my main Directory directive. I never let Apache generate listings for me anyway, but it's nice to have a backstop if I blow away my index file.
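
A minimal sketch of what that might look like, assuming a hypothetical document root of /srv/webapp:

```apache
# Hypothetical main Directory section; the path is an assumption.
<Directory "/srv/webapp">
    # Leave WEB-INF out of any listing mod_autoindex generates under this tree.
    IndexIgnore WEB-INF
</Directory>
```

Because a Directory section also applies to everything beneath it, the one line covers listings for subdirectories too.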