When browsing Stack Overflow I often notice users asking questions that somehow involve the use of HTTP_HOST. I nonchalantly hint at its vulnerable nature but fail to point to an article explaining why. Which is why I decided to take matters into my own hands.
The origin
In PHP, our protagonist is accessible via:
$http_host = $_SERVER['HTTP_HOST'];
The value in the $_SERVER superglobal is taken from the HTTP request’s Host: header. This header is only mandatory as of HTTP/1.1, so an HTTP/1.0 client may omit it, but what browser uses that outdated protocol anymore (unless you tell it to)?
For an Apache server responsible for multiple sites, the information in the Host: header is crucial for determining which virtual host to route the request to. After all, the client only connects to an IP address, and multiple domain names can resolve to that address.
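To make the routing step concrete, here is a minimal sketch of name-based virtual host selection. It is not Apache's actual code: the function name select_vhost, the $vhosts map and the document roots are made up for illustration, and the sketch assumes PHP 7.3 or later (for array_key_first) and a non-empty host list whose first entry acts as the default.

```php
<?php
// Hypothetical sketch of name-based virtual host selection (not Apache's code).
// $vhosts maps a host name to a document root; the first entry is the default.
function select_vhost(array $vhosts, ?string $hostHeader): string
{
    // Assumption: $vhosts is non-empty, so there is always a default.
    $default = array_key_first($vhosts);
    if ($hostHeader === null) {
        // No Host: header at all (e.g. an HTTP/1.0 client).
        return $default;
    }
    // Drop an optional ":port" suffix and compare case-insensitively.
    $name = strtolower(preg_replace('/:\d+$/', '', $hostHeader));
    // Anything that matches no configured name falls through to the default.
    return array_key_exists($name, $vhosts) ? $name : $default;
}

// Made-up configuration for illustration only.
$vhosts = [
    'perfect-co.de' => '/var/www/perfect-co.de', // listed first, so the default
    'example.org'   => '/var/www/example.org',
];
```

Note that the sketch only picks a virtual host. It never rewrites the header, so whatever the client sent still reaches the script as HTTP_HOST.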
The assumption
Since the Apache server does all the work of finding the correct virtual host to serve our request and passes HTTP_HOST along to the script to be executed, many assume that HTTP_HOST now contains the correct domain name to which the client has connected.
This assumption is wrong under certain circumstances.
The common use
Let's assume you have a template-engine-driven website, you’re aware that hard-coding URLs into your templates is a Bad Thing™, and you have set up globally available template variables. In this example, our template engine is Smarty:
The globally included setup script:
$smarty = new Smarty();
/* later on.... */
$smarty->assign(array(
'title' => 'My Homepage',
'page_base' => $_SERVER['HTTP_HOST'],
));
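Later on, in a template:
<div class="links">
<a href="http://{$page_base}">Home</a>
<a href="http://{$page_base}/about">About</a>
<!-- and more links -->
</div>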
A non-malicious HTTP request would have generated two or more links to URLs on the current domain.
What usually happens:
The following HTTP request could have been sent by a Firefox browser:
GET / HTTP/1.1
Host: perfect-co.de
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; de) ...
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,de;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
What could have happened:
Someone could have come along, opened a telnet session to your domain’s IP address and sent the following manually:
GET / HTTP/1.1
Host: "><iframe src="about:blank" onload="alert('XSS')"
That’s all. Your template’s result should now look like:
<div class="links">
<a href="http://"><iframe src="about:blank" onload="alert('XSS')"">Home</a>
<a href="http://"><iframe src="about:blank" onload="alert('XSS')"/about">About</a>
<!-- and more links -->
</div>
Tada! XSS galore. This is also easily achievable through a Firefox plugin called TamperData.
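To see why the markup breaks, it helps to build the link by hand. The sketch below is an illustration, not part of the original setup: it concatenates the malicious Host value from the request above into an anchor tag, once raw (as the template effectively does) and once passed through htmlspecialchars. The variable names are made up.

```php
<?php
// The malicious Host: header value from the example request above.
$host = '"><iframe src="about:blank" onload="alert(\'XSS\')"';

// Raw concatenation, which is effectively what the vulnerable template does.
$unsafe = '<a href="http://' . $host . '/about">About</a>';

// The same link with the value escaped for an HTML attribute context.
$safe = '<a href="http://'
      . htmlspecialchars($host, ENT_QUOTES, 'UTF-8')
      . '/about">About</a>';

echo $unsafe, "\n"; // contains a live <iframe ...> element
echo $safe, "\n";   // the quotes and angle brackets arrive as entities
```

Escaping defuses the injected markup, but the link still points wherever the client says it should, so it is no substitute for a trustworthy host name.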
The prevention
The above example actually only works for the default virtual host, as it’s the one to which all non-matching requests are routed.
Apache does its CGI scripts the favor of providing SERVER_NAME as well. Given a correct setup, it’ll actually contain the virtual host’s domain name, as configured. No buts. The correct setup includes this tiny little directive:
UseCanonicalName On
Without this, $_SERVER['SERVER_NAME'] would at least have contained an escaped variant of our injection. Using the above directive to configure Apache, we force the content of $_SERVER['SERVER_NAME'] to actually be the targeted domain name.
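On the PHP side, the change could look roughly like the sketch below. The helper name page_base is hypothetical, and the whole thing assumes Apache really runs with UseCanonicalName On. The extra escaping is merely a safety net in case that assumption does not hold.

```php
<?php
// Hypothetical helper: derive the template's base host from SERVER_NAME
// instead of the client-supplied HTTP_HOST.
function page_base(array $server): string
{
    // Assumption: "UseCanonicalName On" is set, so SERVER_NAME is the
    // configured ServerName and not something derived from the request.
    $name = $server['SERVER_NAME'] ?? '';

    // Defensive escaping in case the server is not configured as assumed.
    return htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
}

// Possible use in the globally included setup script from above:
// $smarty->assign('page_base', page_base($_SERVER));
```

Passing the array in rather than reading $_SERVER inside the function keeps the helper easy to try out with made-up values.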