User Privacy and the Phishing Filter


When we shipped the Microsoft Phishing Filter in Internet Explorer 7 Beta 1, many readers on the blog asked: if the Phishing Filter is checking suspicious URLs against a web service, how would Microsoft protect user privacy?

We know that for customers to benefit from the work we put into the Phishing Filter, they have to trust us enough to use it. As you’ve been hearing for years, Microsoft now engineers our products to be more secure by default. In the same way, we engineered the Phishing Filter to protect user privacy. Most importantly, when the Phishing Filter checks if a site is a phishing site, the URL it sends to the web service cannot be used to personally identify you. That was just one of the ways that we engineered the Phishing Filter to protect user privacy.

To prove that the Phishing Filter protects privacy, we asked Jefferson Wells, a well-known technology audit firm, to take a look at our design. We gave them in-depth access to the technology and to the engineering team. After they studied the technology and interviewed the engineering team, they agreed that the claims we made about protecting your privacy are true and accurate.

You can read the results of the Jefferson Wells Audit yourself to learn more.

We want you to understand that this is a long-term commitment to protect your privacy. To prove our ongoing commitment, we’re going to repeat this audit periodically so that even if the service changes in some way, you’ll still have proof that the web service protects your privacy.

Thanks,
Rob Franco

Comments (53)

  1. game kid says:

    No, we asked why Microsoft and not an independent group are seeing the websites we visit.

    There’s a difference.

  2. game kid says:

    (…actually, that too.)

  3. rdmiller says:

    independent group ??? who might they be ?

    virgins ??

    police officers ??

    NSA officials ??

    Who do you trust anyway ??

  4. Maurits says:

    > The Phishing Filter client does not transmit any personally identifiable information without explicit user consent.

    I love the "… without explicit user consent."  How "Animal Farm." :)

    > URL information transmitted for rating by the Phishing Filter client cannot be traced back to the user’s personal information.

    How so?  If you get a request to https://home.example.isp/maryjo/control-panel/, you can reasonably infer the user’s first name is Mary Jo.  You can even reasonably guess that her email address is maryjo@example.isp.

    > HTTP and HTTPS URLs transmitted for rating by the Phishing Filter client are limited to the domain and path only. All other information in the URL is stripped.

    The connecting IP of the client is also known, as is the time of the request… and probably the version of the client.  This may be enough to tie requests for a particular user together, depending on the popularity of the service.

    > The Phishing Filter client only transmits URLs in the following scenarios:

    > a. When the user wants to manually provide feedback on a URL.

    Fine

    > b. When the URL is not found in the Phishing Filter local data files.

    > c. When the Phishing Filter client heuristics determine a site as suspicious

    This is horribly confusing.  If a URL is in the data files as "known good", but it looks suspicious, is a URL transmitted?

    Conversely, if a URL is not in the data files at all, but it looks OK, is a URL transmitted?

    > d. Transmission of any and all URL information by the Phishing Filter client is over SSL on the Internet.

    That’s certainly a good thing.
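
The audit claim quoted in comment 4, that transmitted URLs are "limited to the domain and path only", can be pictured with a short sketch. This is an illustration only, not the Phishing Filter's actual code: the function name `strip_for_rating` is hypothetical, and treating the port and any user:password prefix as "other information" to be dropped is an assumption.

```python
from urllib.parse import urlsplit


def strip_for_rating(url: str) -> str:
    """Reduce an HTTP(S) URL to scheme, host, and path only (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError("only HTTP and HTTPS URLs are handled in this sketch")
    # .hostname drops any user:password prefix and port, and lowercases the host.
    host = parts.hostname or ""
    # The query string and fragment are simply not copied over.
    return f"{parts.scheme}://{host}{parts.path}"


if __name__ == "__main__":
    print(strip_for_rating(
        "https://home.example.isp/maryjo/control-panel/?session=abc123#top"
    ))
```

Note that the path survives the stripping, so the example URL from the comment still carries `maryjo`, which is the concern Maurits raises.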

  5. Molly C says:

    <quote>If you get a request to https://home.example.isp/maryjo/control-panel/, you can reasonably infer the user’s first name is Mary Jo.  You can even reasonably guess that her email address is maryjo@example.isp.</quote>

    Maurits, I just clicked on your example link  https://home.example.isp/maryjo/control-panel/, and guess what, I’m not Mary Jo!  So no, clicking on a link that happens to have someone’s "name" in it, does not identify the clicker. 😉

  6. paulfp says:

    I’m interested in how IE7 determines that a site is "suspicious"…

    In both Beta2 previews, several of my pages were being flagged as being suspicious, which worried me slightly – too many false positives, especially against your own work, is never a good thing.

    However, in the "final" Beta2, my pages aren’t flagged – was there a change made, and if so what?

  7. TestMan says:

    To show that you just want to kill phishing and not get stats from the users, simply publish a file of known phishing URL patterns (regexp).

    IE will simply get this file and use it to validate the URL. This way the URLs the user is surfing will not be advertised to a third party and nobody can claim that the "new IE7" is a big-brother wannabe 😉

    PS: Anytime for fixing the HTTP Accept header in IE7 to make it more in sync with the real support of IE ?

  8. rdmiller says:

    When would IE download this file?  Each time you opened IE?  The first time you opened IE each day?  Each time you booted your computer?

    There are problems with each of these solutions, mostly having to do with the user experience.

  9. dalmuti509 says:

    OK… little issue with the web service.  What you have done (and don’t say that it won’t, because it happens several times a year to Windows Update) is create a single location where hackers can DoS attack the entire internet.  If the web service is inaccessible then the browser will take the full 30 seconds (or whatever timeout you have set) to wait for the web service, then mark every page as a phishing site. Try removing your system from accessing the internet and hit a local site on your intranet; it does exactly that.

    If you don’t make it so that it downloads a "blacklist" file that IE can then use to check sites against, then the service is nice, but will become worthless quickly.  Virus checking software does this for a reason, why not IE?

  10. Chris Hubick says:

    I would also prefer to see a simple file with the URLs updated monthly via critical updates through Windows/Microsoft update, similar to how the MS anti-spyware removal tool releases a monthly update.

  11. codemastr says:

    rdmiller:

    Seems like antivirus manufacturers have managed to cope with that "problem."

    Also, in response to that audit, it’s incredibly misleading. It uses the MS definition of private info. To me, having my IP address and the websites I browse is too personal. It’s like you define it in such a way to make sure your definition holds true. You can trace my browsing back to my IP. You can trace my IP to me. Though my personal info may not be directly transmitted, you can certainly get to it.

  12. Xepol says:

    Any time you retain the same IP for a long period of time, any and every packet you transmit over the internet can become personally identifiable to you.

    This problem is not limited to those who have static IP addresses; even those on high-speed dynamic IP connections can often hold on to the same IP address for months at a time (more than 16 months was my record, and on some ISPs that shall remain Shawnonymous, those with dynamic IPs can reasonably expect to hold their IPs longer than the static IP customers who are mystified to see their IPs changing at least once a year).  Any time you hold an IP for months, there is ample opportunity to tie that IP back to accounts and activities through browser advertisements, logs, etc. (it can be surprising to see how often you can search for a name and get an IP or vice versa thanks to forum software, poorly secured servers, etc.).

    So is your privacy truly protected?  Not as effectively as you would like.  True, you could claim that I am paranoid, and that MS will stand by its word forever, but in a world where you can track a color laser printed page back to the exact printer that printed it and know the time and date, can you afford to not be paranoid yourself?

    Sure, MS might not know the exact page you visited, and they might not know who you are, but if they ever changed their minds (or Homeland Security/the White House got nosy and litigious) someone probably could.

    But that’s OK, it’s still probably more secure than your front door.

  13. Antonio Marques says:

    It seems from the content of the posts here that most people did not even bother to read the audit before posting!

    You are worried because of Microsoft’s definition of "Personally Identifiable Information"? Well, this seems to be a good definition: "[it] means any information that identifies or can be used to identify, contact, or locate the person to whom such information pertains, or from which identification or contact information of an individual person can be derived. Some examples of PII include first and last name, address, and e-mail address."

  14. Paul says:

    I love how people get worried about internet surfing privacy, but then probably throw receipts and banking statements straight in the bin without shredding them.

  15. Fduch says:

    IE7 Has interesting feature to protect us.

    It disables running some scripts from my computer, but allows them to run from internet. Does it mean that IE trusts me less than that website?

  16. Tranzistor says:

    To Chris Hubick:

    Phishing data monthly update is a really bad idea, those sites usually don’t last for more than a week.

  17. ajo says:

    I personally think this is a great feature.

    But if you don’t like this feature, don’t switch it on.

    I don’t understand why some of you are complaining about privacy issues since all your data is stored anyway. Think of ISPs, Google, the Government. So if you want to keep your privacy private, go and live in some cave in Afghanistan.

  18. ieblog says:

    Dalmuti509: you’re making some huge assumptions about how our technology works (like assuming that the check of our phishing database is made in serial with a user’s request to browse). We have designed a system that we feel is both scalable to broad Internet use and fast enough to deal with phishing sites that come and go in a matter of hours.

    -Christopher [MSFT]

  19. Alistair Higson says:

    Isn’t it annoying when you receive a text email and a link is broken into 2 lines? IE will only take the first line and you end up copying, pasting, alt-tabbing etc. to piece together the link manually. It’s fiddly and not user friendly.

    The behaviour should be to accept multiline clipboard items, removing the line break and line end spaces. G**gle maps does this on their website: try pasting into the search box, it takes multi-line entries.

    I can’t think of a reason why it shouldn’t work like this.
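
The paste behaviour suggested in comment 19 could look something like the sketch below. It is only an illustration of the idea, not anything IE does: `join_wrapped_url` is a made-up name, it trims spaces at both ends of each line (the comment only asks for line-end spaces), and it assumes every line of the clipboard text belongs to the same link.

```python
def join_wrapped_url(clipboard_text: str) -> str:
    """Join a link that an email client wrapped across several lines.

    Hypothetical helper: drops line breaks and the whitespace around them,
    assuming the whole clipboard content is one URL.
    """
    return "".join(line.strip() for line in clipboard_text.splitlines())


if __name__ == "__main__":
    wrapped = "http://example.com/a/very/long/ \n path?x=1\n"
    print(join_wrapped_url(wrapped))
```

A real implementation would also have to decide what to do when the user pasted more than a link, such as a whole page of text.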

  21. EricLaw [MSFT] says:

    @Fduch: "IE7 Has interesting feature to protect us. It disables running some scripts from my computer, but allows them to run from internet. Does it mean that IE trusts me less than that website?"

    No.  This behavior was first introduced in IE6 on XPSP2, and it was surprising to me too when I first heard about it.  The answer is actually that this is an important security restriction used to prevent data theft.  Fundamentally, browsers are based on a "domain security" model, where content in one domain can only read content from the same domain.  The problem is that as soon as you open a file in the "Local computer", every other file in the local computer is fair game, since it’s in the same "domain".

    If we didn’t restrict the Local Machine Zone as we’ve done, HTML running on your local computer could open any file, steal its contents, and upload it to any server on the internet.  So a bad guy could really easily say, here, download this .ZIP file, open it, and view the web page inside.  Users, assuming HTML is safe, would do so, and bang, their data just got stolen.

    You can read more about the "Mark of the Web" and how to work with script on your local computer in the documentation for IE on XPSP2.

    @Alistair: This is a good feature suggestion, and something that we should consider for a future release.  There are some corner cases where it gets a bit complicated (e.g. what if I only wanted the first line and not the second?  What if I accidentally selected an entire page of text and pasted?) but it’s still a good suggestion.
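
The "domain security" model described in this comment can be sketched as a comparison of scheme, host, and port. This is a deliberately naive illustration, not IE's zone logic: the helper names `origin` and `same_origin` are hypothetical, and real browsers apply extra rules to `file:` URLs. Under the naive rule every local file has the same (empty) host, which shows why local content needs additional restrictions.

```python
from urllib.parse import urlsplit

# Default ports, so that http://example.com and http://example.com:80 compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str):
    """Return a (scheme, host, port) tuple for a URL (hypothetical helper)."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(a: str, b: str) -> bool:
    """Naive check: two URLs may read each other only if their origins match."""
    return origin(a) == origin(b)


if __name__ == "__main__":
    # Two different web sites: not the same origin.
    print(same_origin("http://example.com/", "http://evil.example/"))
    # Two local files: the naive rule treats them as the same "domain".
    print(same_origin("file:///C:/Downloads/page.html",
                      "file:///C:/Users/me/taxes.txt"))
```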

  22. Fiery Kitsune says:

    Is there a harmless test-link that Microsoft uses to test the filter?

  23. TJ says:

    "Is there a harmless test-link that Microsoft uses to test the filter?"

    Yes, https://www.woodgrovebank.com

  24. Fiery Kitsune says:

    "Yes, https://www.woodgrovebank.com"

    (LOL, it’s a "Bank of Redmond"…)

    Glad to know the filter actually works on my end.

  25. I’d really like to see notification of the page’s secure/insecure status on the tab. The address bar just drops out of my conscious attention area after a while, even when it gets highlighted.

    Is there any chance we could at least have the ability to TURN OFF the disappearing menu bar? I like to see the File/Edit/etc. stuff where I can click on it. As it is, I think "I need to edit that page" and I mouse up there to find… nothing. Then I have a little mental WTF moment before I remember I have to press Alt before I can click anything, and then my brain has to shift gears from mouse to keyboard, and then I have conflicting impulses on whether to press Alt-F or just press Alt and click File.

    This breaks the cardinal rule of UI design: "Thou shalt not make thy user feel stupid." Could we do something about it?

  26. Pretend I am Microsoft here.

    1. I provide a service which handles personal data and I want people to trust this service.

    2. But I don’t want to disclose to people the internals of this service.

    3. How can I give proof to people that they can trust my service ?

    4. I am going to ask another company, which I will pay substantially for this, to check the internals of the system but they cannot disclose how the service works internally. They can just come up with a "yes/no" affirmation.

    5. How on earth is anyone supposed to feel more safe about it ?

    Add to this the fact that most people have an issue with putting "Microsoft" and "trust" in the same sentence and you might realise that you’re on the wrong track.

    Conclusion: this is, once again, a situation that requires you to be as open as possible.

    1. Give me a copy of the source code

    2. Make sure you are legally bound to run the same version as what you disclose

    3. I might trust you then

    Until then stop making promises you cannot hold.

  27. 1. Let’s all give away our software for free.

    2. Let’s give away all the source code too.

    3. Let’s give away all rights

    4. Instead, let’s license all the rights to someone in charge of the GPL

    5. Now nobody makes any money.

    6. Nobody can pay anyone.

    7. People starve, innovation stops, the cows come home, etc.

  28. Wraith, you’re missing the point.

    This is not an open source vs proprietary issue.

    It is a "how can I be trusted" issue.

    No one can trust a system as long as you keep parts of it hidden. It’s just logic.

    I will never entirely trust software for which I don’t know anything about the code.

    Now this doesn’t matter with most software. But when it’s a system dealing with security and privacy issues, it does.

  29. Ghost says:

    I’ll trust a company which can be sued a lot more than a bunch of open source developers who just check code in and aren’t responsible next year for what is in it.

  30. You guys aren’t really open-minded are you ?

    Stop thinking about crusades, that’s not the point here.

    I don’t care and I would, just like you Ghost, actually prefer if it was MS developing this.

    I just want to be able to check what is handling my personal data. What’s wrong with that.

    Currently what the IE team is proposing is to blindly trust them (audits have never been a proof of anything).

    All I’m asking is to be able to trust them.

  31. This is currently what is being displayed at the top of the Jefferson Wells page Rob linked to (http://www.jeffersonwells.com/client_audit_reports/main.htm):

    <%@ Control Language="c%23" AutoEventWireup="false" Codebehind="../Controls/header.ascx.cs" Inherits="JeffersonWells.Controls.header" TargetSchema="http://schemas.microsoft.com/intellisense/ie5"%>

    So I’m being asked to trust a web security and privacy audit done by people who can’t set up a basic website and webserver correctly.

    Correct me if I’m wrong but it’s asking a lot ….

  32. @Cyril

    You have a point there on the website. That’s fairly lame.

    Please excuse my sarcasm on your post. I figured you were one of those MS slamming flamers. However, there really is no way to trust them the way you want to. For one thing, trust is inherently secretive: it only applies to things you don’t really know. We can trust Firefox because we can completely read its source.

    Personally, I don’t care if MS gets my personal data or not. If someone were to googl.. I mean, er, Live Search ‘daquell’, they’d find a fair piece of info on me. Couple that with a few IPs, and they might could even locate me. Nothing’s really private anymore anyway. A smart surfer will rarely need the phishing filter as well. And I think I’ll stop rambling…..

  33. Keeping privacy is a future proof measure. It also ensures a certain degree of freedom for you, me and our kids.

    If you don’t care about it for you. Please care about it for others, as our data is apparently bound to co-exist in some file, somewhere.

    I can probably be searched on a bit too and it doesn’t worry me that much either.

    When someone starts seeing/storing everything I read or contribute to on a daily basis then I get concerned.

    Potentially putting all sides of my life in the hands of whatever future dictator is going to rule the country and take over companies data is not a joyful thought.

    When you say "nothing is really private anymore anyway" you should consider that pretty much everything is private by default. It is your responsibility if you’re giving away personal information.

    Now if someone comes between you and what you are doing to spy on everything, you ought to feel a little bit concerned.

    Because your info is gonna be stored with mine, even if it doesn’t bother you, it is both our responsibility to check on the people who are storing this data.

    By asking clearance from people who are asking us to trust them with our personal info I am only being responsible.

    And that’s why people making "longterm commitment to protect my privacy" without fully disclosing what they are doing with this data piss me off. This is not a commitment. It’s a (bad) attempt at fooling us.

  34. And to illustrate what I was just saying: http://www.usatoday.com/news/washington/2006-05-10-nsa_x.htm

    There’s a good chance they are already doing the same with the web. No need to encourage them by providing MS with a ready-made database.

    So can anyone confirm that this filter will be optional ?

  35. Any technician with access to your ISP routers could get much much more information about you and sites you visit than the phishing filter, including information about other services like files sharing, instant messaging and emails (even encrypted, they can still see how much data is being exchanged and which server you’re connecting to).

    What to say about people with access to your company or ISP proxy server ?!

    Even if you use an anonymizer proxy, you’re hiding your IP from web sites you visit, but now that one company providing that service could be gathering all the information you wanted to hide from every other people.

    Not trusting the IE7 phishing filter has an easy fix, just turn it off!

    I’m sure people that are that concerned about their privacy can also manually identify phishing sites. The phishing filter will only benefit people who could get phished.

    Now, if you don’t want your personal IP to be identifiable by your ISP or other people, it’s a whole other problem, and I’m afraid the only solution for now is to turn off your network connection…

  36. Mike says:

    I hate Microsoft with a passion, and the anti-pissing is sure as bad as it sounds. But be sure to understand that Firefox 2.0 coming this year will have a anti-pissing feature, provided by…Google. Yes, those guys.

    So what do you prefer, electric chair or guillotine?

  37. Molly C says:

    @EricLaw:

    Thanks for the explanation regarding the local scripts limitation introduced in IE6XPSP2.

    @Cyril Doussin:

    I’ll take the word of an auditing company over having the source.  I don’t have time to pore through someone’s source code nor the ability to bless it as "good".  And I don’t trust a bunch of anonymous OSS devs (most of whom hate MSFT, so can hardly be considered "objective" in any case) to be able to do so reliably either.

  38. @Philippe: I know this, thank you. It is nonetheless not a reason to go and give away information to anyone for little reason. Thanks for specifying that it will be possible to turn it off. The only thing that worries me is that it will apparently be an opt-out measure. Not very privacy friendly imho.

    @Molly: if MS ever released the source for something like this, you can be sure that more than a bunch of OSS devs would have a good look at it.

    It seems the word "open" is banned around here. You have people offering to trust them on how they manage your data and as soon as someone proposes for anyone to review on an equal basis the means by which this data is being manipulated and stored, you raise flamewars re open source, gpl etc.

    I know governments probably already know more about me than myself. It doesn’t mean I should behave like a sheep.

    I won’t be using Firefox’s filter if it gets one and I won’t be using the MS filter either.

    I just thought I’d leave comments on this to let the IE team know how some people can get pissed off when they make promises that they could only fulfill by radically changing their politics. Filtering, as a public service, should rely on an open process and an open technology, or at the very least be an opt-in feature with detailed information. Otherwise it should just not exist.

  39. eM says:

    "Filtering, as a public service, should rely on an open process and an open technology, or at the very least be an opt-in feature with detailed information. Otherwise it should just not exist."

    If you don’t like it, don’t use it. Better yet, hack up your own solution.

  40. One of Internet Explorer 7’s greatest security features is the Microsoft Phishing Filter, which checks shady URLs against a web service, and in turn determines if the user should avoid the requested site. Although one would think that no harm

  41. UnexpectedBill says:

    I have a question about the Phishing Filter.

    I don’t think too many people would disagree with me when I say that the WWW (or rather, the entire card of services that the Internet has to offer) is in a constant state of flux.

    I am curious to know what will or might happen to the phishing filter over time when IE7 is no longer current news or supported software.

  42. @Cyril:

    For what it’s worth, I agree with you. There is simply no benefit to keeping the source code for this feature secret. In fact, there is IMO almost certainly a massive benefit in opening this feature to public scrutiny.

    But that is not our decision to make. Every company on this planet has the right to say "we own this, and we get to decide who sees it, and we decided *you* don’t get to see it". They do not need to have a reason. They do not need to explain themselves. It is their choice, not ours.

    And if you have a problem with this, then your platform isn’t really about being free and open at all… it’s just "meet the new boss, same as the old boss". Which is where I really start to have a massive concern about the culture and society around open source software, because most of its members *do* have a problem with this.

  43. Peter says:

    @Cyril:

    > "if MS ever released the source for something like this, you can be sure that more than a bunch of OSS devs would have a good look at it."

    But you can be sure that a group of Microsoft haters will "find" some problem (whether it exists or not), make a big fuss over it, and suddenly we have a contrived controversy with Microsoft haters and objective analysts debating the issue against each other.  The mere existence of the controversy causes nobody to use the filter (or even pressures Microsoft into removing it altogether), even if there is no real problem with it, in which case the Microsoft haters obtain their objective.

  44. Maurits says:

    > Even if you use an anonymizer proxy, you’re hiding your IP from web sites you visit, but now that one company providing that service could be gathering all the information you wanted to hide from every other people.

    In theory you could use two anonymizers from different companies in a series.  Then one company knows where you are coming from but not where you’re going; and the other company knows where you’re going but not where you’re coming from.

    For email and IM it’s a lot easier… all you need is end-to-end encryption.

  45. paulfp says:

    I don’t really care if MS know personal stuff about me, cos if they do then they must have BILLIONS of columns in that database….. i’ll just be a tiny speck on the map so it’s pretty unlikely that a little army of Mr Gates’ zombies will come all the way to Liverpool to knock on my door and say "naughty naughty, you went on a suspicious web site last week…."

  46. sIKE says:

    Unless the way the HTTP protocol works and how it sits on top of TCP/IP changes, anyone using it will not have privacy regardless of what they’re doing. Whether you are going and getting phishing data, pron, or Martha Stewart, you are not anonymous. Just like on your phone, anything that requires a connection is de facto point to point and therefore traceable. I have been living with this fact since 1994 and still have no issues……

    sIKE

  47. RuleZ023 says:

    I just don’t have anything to say. Not that it matters. Eh. I’ve just been staying at home doing nothing, but I don’t care. That’s how it is.

  48. The Internet Explorer Team Blog includes a post covering this very subject. Due to many people asking…

  49. As we’ve worked on the new Phishing Filter in IE7, we knew the key measure would be how effective it