When browsing [Stack Overflow][so] I often notice users [asking questions][so-q] that somehow involve the use of `HTTP_HOST`. I nonchalantly hint at its vulnerable nature, but then fail to point to an article explaining why. Which is why I decided to take matters into my own hands.
[so]: http://stackoverflow.com/
[so-q]: http://stackoverflow.com/questions/4652464/how-to-chain-on-mod-rewrite
## The origin
In PHP, our protagonist is accessible via:
```
$http_host = $_SERVER['HTTP_HOST'];
```
The value in the `$_SERVER` superglobal is taken from the HTTP request's `Host:` header. Now, this header is mandatory in HTTP/1.1 and wasn't part of HTTP/1.0, but what browser still speaks that outdated protocol anyway (unless you tell it to)?
For an Apache server responsible for multiple sites, the information in the `Host:` header is crucial to determine which virtual host to route the request to. After all, the client only connects to an IP address and multiple domain names can resolve to this address.
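As a rough sketch of such a setup (Apache 2.2 syntax; the second domain and the paths are made up for illustration), name-based virtual hosting looks something like this:

```
NameVirtualHost *:80

# The first vhost doubles as the default: requests whose Host: header
# matches none of the configured names end up here.
<VirtualHost *:80>
    ServerName perfect-co.de
    DocumentRoot /var/www/perfect-co.de
</VirtualHost>

<VirtualHost *:80>
    ServerName example.org
    DocumentRoot /var/www/example.org
</VirtualHost>
```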
## The assumption
Since Apache does all the work of finding the correct virtual host to serve our request and passes `HTTP_HOST` along to the executed script, many assume that `HTTP_HOST` now contains the correct domain name the client connected to.
Depending on the circumstances, that assumption is plain wrong.
## The common use
We assume you have a template-engine-driven website, you're aware that hard-coding URLs into your templates is a Bad Thing™, and you have set up globally available template variables. In this example, our template engine is Smarty.
The globally included setup script:
```
$smarty = new Smarty();

/* later on.... */

$smarty->assign(array(
    'title'     => 'My Homepage',
    'page_base' => $_SERVER['HTTP_HOST'],
));
```
Later on, in a template:
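The exact markup doesn't matter; assume a minimal template that builds absolute links from `{$page_base}` (the `/` and `/contact` paths are made up for this example):

```
<html>
  <head><title>{$title}</title></head>
  <body>
    <a href="http://{$page_base}/">Home</a>
    <a href="http://{$page_base}/contact">Contact</a>
  </body>
</html>
```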
A non-malicious HTTP request would have generated two or more links to URLs on the current domain.
## What usually happens
The following HTTP request could have been sent by a Firefox browser:
```
GET / HTTP/1.1
Host: perfect-co.de
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; de) ...
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,de;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
```
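With that `Host: perfect-co.de` header, the template sketched above renders just as intended, roughly:

```
<a href="http://perfect-co.de/">Home</a>
<a href="http://perfect-co.de/contact">Contact</a>
```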
## What could have happened
Someone could have come along, opened a telnet connection to your domain's IP address, and sent the following manually:
```
GET / HTTP/1.1
Host: "><iframe src="about:blank" onload="alert('XSS')"
```
That’s all. Your template’s result should now look like:
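Assuming the illustrative template from above (real markup will differ, of course):

```
<a href="http://"><iframe src="about:blank" onload="alert('XSS')"/">Home</a>
<a href="http://"><iframe src="about:blank" onload="alert('XSS')"/contact">Contact</a>
```

The injected `Host:` value closes the `href` attribute and the `<a>` tag, and an attacker-controlled `<iframe>` with an `onload` handler lands straight in your page.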
Tada! XSS galore. This is also easily achievable through a Firefox plugin called [TamperData][tamper-data].
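You don't strictly need telnet or a plugin either; any HTTP client that lets you override headers will do. A quick sketch with curl (the IP address is a placeholder for your server's):

```
curl -H "Host: \"><iframe src=\"about:blank\" onload=\"alert('XSS')\"" http://203.0.113.10/
```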
## The prevention
The above example actually only works against the default virtual host, as that is the one all requests with a non-matching `Host:` header get routed to.
Apache does its CGI scripts the favor of providing `SERVER_NAME` as well. Given a correct setup, it will actually contain the virtual host's domain name, as configured. No buts. The correct setup includes this tiny little directive:
```
UseCanonicalName On
```
Without this, `$_SERVER['SERVER_NAME']` would at least have contained an escaped variant of our injection. With the directive in place, Apache forces the content of `$_SERVER['SERVER_NAME']` to be the domain name as configured for the targeted virtual host.
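Applied to the Smarty setup from earlier, the fix then boils down to reading `SERVER_NAME` instead of `HTTP_HOST` (a sketch; it assumes `UseCanonicalName On` and a correctly set `ServerName` for the virtual host):

```
$smarty->assign(array(
    'title'     => 'My Homepage',
    // SERVER_NAME reflects the configured ServerName,
    // not whatever the client put into the Host: header.
    'page_base' => $_SERVER['SERVER_NAME'],
));
```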
[tamper-data]: https://addons.mozilla.org/en-us/firefox/addon/tamper-data/