Why HTTP_HOST is evil

When browsing [Stack Overflow][so] I often notice users [asking questions][so-q] that somehow involve `HTTP_HOST`. I usually hint at its vulnerable nature, then fail to point to an article explaining why. So I decided to take matters into my own hands.

[so]: http://stackoverflow.com/
[so-q]: http://stackoverflow.com/questions/4652464/how-to-chain-on-mod-rewrite

The origin

In PHP, our protagonist is accessible via:

$http_host = $_SERVER['HTTP_HOST'];

The value in the `$_SERVER` superglobal is taken from the HTTP request’s `Host:` header. HTTP/1.0 clients are not required to send this header, but what browser still uses that outdated protocol (unless you tell it to)?
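As a rough illustration of where the key name comes from: request headers reach a PHP script under a CGI-style name, with an `HTTP_` prefix, upper-case letters and hyphens turned into underscores. The helper name `server_key_for_header()` is made up for this sketch, and a few headers (such as `Content-Type`) are exposed under different keys.

```php
<?php
// Hypothetical helper: maps a request header name to its usual $_SERVER key.
function server_key_for_header(string $headerName): string
{
    // CGI convention: prefix with HTTP_, upper-case, hyphens become underscores.
    return 'HTTP_' . strtoupper(str_replace('-', '_', $headerName));
}

echo server_key_for_header('Host'), "\n";            // HTTP_HOST
echo server_key_for_header('Accept-Language'), "\n"; // HTTP_ACCEPT_LANGUAGE
```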

For an Apache server responsible for multiple sites, the information in the `Host:` header is crucial to determine which virtual host to route the request to. After all, the client only connects to an IP address and multiple domain names can resolve to this address.
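The routing can be pictured with a small model. This is a simplified sketch in PHP, not Apache's actual code; the function `select_vhost()` and the document roots are invented for illustration. It mirrors the documented behaviour that the first listed virtual host serves requests whose `Host:` value matches no configured name.

```php
<?php
// Simplified model of name-based virtual host selection (not Apache's real code).
function select_vhost(array $vhosts, string $hostHeader): string
{
    foreach ($vhosts as $serverName => $docRoot) {
        // Host names are compared case-insensitively.
        if (strcasecmp($serverName, $hostHeader) === 0) {
            return $docRoot;
        }
    }
    // No name matched: the first listed virtual host acts as the default.
    return reset($vhosts);
}

// Hypothetical configuration, in the order the virtual hosts are listed.
$vhosts = [
    'perfect-co.de' => '/var/www/perfect',
    'example.org'   => '/var/www/example',
];

echo select_vhost($vhosts, 'example.org'), "\n";  // /var/www/example
echo select_vhost($vhosts, 'no-such-host'), "\n"; // /var/www/perfect
```

Note that the second call still gets an answer: an unknown host name does not cause the request to be rejected.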

The assumption

Since the Apache server does all the work of finding the correct virtual host to serve our request and passes `HTTP_HOST` along to the script being executed, many assume that `HTTP_HOST` now contains the correct domain name to which the client has connected.

This assumption is wrong under certain circumstances: if no virtual host matches, Apache falls back to a default one, yet still passes the client-supplied header through untouched.
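One way to avoid leaning on that assumption, sketched here for a site that knows its own host names up front, is to compare the header against a fixed list and fall back to a known value. The constant `KNOWN_HOSTS` and the function `trusted_host()` are hypothetical names, and this is only one possible approach.

```php
<?php
// Hypothetical configuration: the host names this site is meant to answer to.
const KNOWN_HOSTS = ['perfect-co.de', 'www.perfect-co.de'];

function trusted_host(array $server, array $knownHosts): string
{
    // A Host header may also carry a port ("example.org:8080"); not handled here.
    $host = strtolower($server['HTTP_HOST'] ?? '');

    // Strict comparison against the list; anything else yields the first known host.
    return in_array($host, $knownHosts, true) ? $host : $knownHosts[0];
}

echo trusted_host(['HTTP_HOST' => 'www.perfect-co.de'], KNOWN_HOSTS), "\n"; // www.perfect-co.de
echo trusted_host(['HTTP_HOST' => 'evil.example'], KNOWN_HOSTS), "\n";      // perfect-co.de
```

In a real script the first argument would be `$_SERVER`; passing it in explicitly just keeps the sketch easy to try out.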

The common use

Suppose you have a template-engine-driven website. You’re aware that hard-coding URLs into your templates is a Bad Thing™, so you set up globally available template variables. In this example, our template engine is Smarty:

The globally included setup script:

$smarty = new Smarty();
/* later on.... */
$smarty->assign(array(
    'title' => 'My Homepage',
    'page_base' => $_SERVER['HTTP_HOST'],
));

Later on, in a template:


A non-malicious HTTP request would have generated two or more links to URLs on the current domain.
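The effect is easiest to see outside a template engine, so here is a rough plain-PHP stand-in for such a template. The function `render_link()` and the `/about` path are made up for illustration; the point is only that the assigned value ends up verbatim inside an `href` attribute.

```php
<?php
// Hypothetical stand-in for a template that builds a link from 'page_base'.
function render_link(string $pageBase, string $path, string $label): string
{
    // The value is inserted as-is, with no escaping or validation.
    return '<a href="http://' . $pageBase . $path . '">' . $label . '</a>';
}

echo render_link('perfect-co.de', '/about', 'About'), "\n";
// <a href="http://perfect-co.de/about">About</a>
```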

What usually happens:

The following HTTP request could have been sent by a Firefox browser:

GET / HTTP/1.1
Host: perfect-co.de
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; de) ...
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,de;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive

What could have happened:

Someone could have come along, opened a telnet session to your domain’s IP address and sent the following manually:

GET / HTTP/1.1
Host: "><iframe src="about:blank" onload="alert('XSS')"

That’s all. Your template’s result should now look like:
“`