Changes to IDN in IE7 to now allow mixing of scripts


Domain names are no longer limited to ASCII, and as the web grows, more and more of them contain characters from other character sets. Such names are called Internationalized Domain Names (IDN); for example, http://ايكيا.com is an Arabic domain for IKEA. IE7 added support for IDN in Beta 2. We listened to your feedback during Beta 2, and we are changing our IDN rules to accommodate the way customers want to use international characters on the web.

Preventing IDN spoofing by default in IE7 Beta 2

In IE7 Beta 2, if a user navigates to an IDN URL that contains scripts not covered by the user’s configured Accept Languages, IE7 converts the URL to Punycode and displays that form in the address bar. IE7 also shows an information bar saying that the website address contains characters that cannot be displayed with the current language settings.

Letters or symbols that cannot be displayed with the current language settings

This design makes IE7 secure by default against URL spoofing attacks that rely on non-ASCII characters. To see a URL in Unicode, the user must add a language that uses that script to the browser’s Accept Languages.
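As a rough illustration of the Punycode conversion described above, here is a minimal Python sketch. It uses Python's built-in "idna" codec, which follows the IDNA 2003 rules; it is not IE7's code, and the function names are made up for this example.

```python
def to_punycode_host(host: str) -> str:
    """Return the ASCII (Punycode) form of a host name, label by label."""
    # The built-in "idna" codec converts each non-ASCII label to "xn--...".
    return host.encode("idna").decode("ascii")


def to_unicode_host(host: str) -> str:
    """Return the Unicode form of an ASCII (Punycode) host name."""
    return host.encode("ascii").decode("idna")


if __name__ == "__main__":
    # The Arabic IKEA domain from the article; the ".com" label is already
    # ASCII, so only the first label changes.
    print(to_punycode_host("ايكيا.com"))
    print(to_unicode_host("xn--bcher-kva.com"))
```

The ASCII form is what goes over the wire either way; the only question the browser has to answer is which of the two forms to show in the address bar.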

As discussed previously, another IDN restriction in IE7 Beta 2 was that it did not allow mixing scripts within a label in a URL (a label is a segment of a domain name, delimited by dots; www.microsoft.com contains three labels: “www”, “microsoft” and “com”). Also, within a label IE did not allow mixing non-ASCII scripts with ASCII. This step was taken mainly to protect users against homograph spoofing attacks. Consider a user who commonly browses sites with Cyrillic URLs. If the user gets a phishing email with a link to www.paypal.com in which one of the ‘a’s is ASCII and the other is Cyrillic, the user might believe they are visiting the real PayPal site, which uses only ASCII characters in its domain name. To protect against this spoof, IE7 detects the mixed characters and shows the URL in Punycode rather than misleading the user.

IDN - displaying URL in punycode
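The Beta 2 rule (no mixing of scripts within a label) can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the Python standard library has no Unicode script property, so the sketch takes the first word of each character's Unicode name as a stand-in for its script. The function names are hypothetical and this is not how IE7 itself is implemented.

```python
import unicodedata


def scripts_in_label(label: str) -> set:
    """Return a rough set of scripts used in one domain label."""
    scripts = set()
    for ch in label:
        if ch.isascii():
            scripts.add("ASCII")
        else:
            # ASSUMPTION: the first word of the Unicode character name
            # ("CYRILLIC", "HANGUL", ...) approximates the script.
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def mixes_scripts(label: str) -> bool:
    """True if the label uses more than one script (the Beta 2 trigger)."""
    return len(scripts_in_label(label)) > 1


if __name__ == "__main__":
    genuine = "paypal"
    spoof = "p\u0430ypal"  # second character is CYRILLIC SMALL LETTER A
    print(genuine == spoof)        # the strings differ although they look alike
    print(mixes_scripts(genuine))  # single script
    print(mixes_scripts(spoof))    # ASCII mixed with Cyrillic
```

A label flagged this way would be shown in its Punycode form, which makes the difference from the all-ASCII name visible.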

Improving user experience for some mixed script scenarios for IE7

We heard your feedback about how restrictive the feature was by not allowing mixing of ASCII characters with other scripts. For instance, in some locales it is common to have business names that mix ASCII and characters from local languages.

We looked for a way to allow mixed characters in a label without introducing the risk of a spoof. The IE team worked with experts from the Windows Globalization team to investigate which scripts can be mixed safely with ASCII characters.

In the Release Candidate build (post-Beta 3), IE will permit mixing ASCII with certain scripts and will display the URL in Unicode. However, IE still will not allow mixing of the allowed scripts (listed below) within a label if they belong to different languages, even if the user has added the languages containing those scripts to their Accept Languages.

Consider the following example, where a URL label contains Hang and ASCII (the website for LG Korea):

IDN - URL containing both Hang (Hangul) and ASCII

IE will now display this URL in Unicode for a user who has added Korean language support, since the non-ASCII script belongs to the Korean language set and is now on the allowed list of scripts. However, IE will show the raw Punycode encoding for a user who has not added Korean language support.

Here is a list of scripts that IE will permit to mix with ASCII:

  • Arab (Arabic),
  • Bali (Balinese),
  • Beng (Bengali),
  • Bugi (Buginese),
  • Deva (Devanagari),
  • Ethi (Ethiopic),
  • Gujr (Gujarati),
  • Guru (Gurmukhi),
  • Hang (Hangul),
  • Hani (Han),
  • Hebr (Hebrew),
  • Hira (Hiragana),
  • Kana (Katakana),
  • Khmr (Khmer),
  • Knda (Kannada),
  • Laoo (Lao),
  • Mlym (Malayalam),
  • Mong (Mongolian),
  • Mymr (Myanmar),
  • Orya (Oriya),
  • Sinh (Sinhala),
  • Syrc (Syriac),
  • Taml (Tamil),
  • Telu (Telugu),
  • Thaa (Thaana),
  • Thai (Thai),
  • Tibt (Tibetan)
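
The rules described in this post can be put together as a small Python sketch. Everything beyond the allowed-script list is an assumption made for illustration: the language-to-script table, the crude script detection based on Unicode character names, the function names, and the sample label "lg전자" are all hypothetical and do not reflect IE's actual data or code.

```python
import unicodedata

# Scripts listed in this post as allowed to mix with ASCII.
MIXABLE_WITH_ASCII = {
    "Arab", "Bali", "Beng", "Bugi", "Deva", "Ethi", "Gujr", "Guru", "Hang",
    "Hani", "Hebr", "Hira", "Kana", "Khmr", "Knda", "Laoo", "Mlym", "Mong",
    "Mymr", "Orya", "Sinh", "Syrc", "Taml", "Telu", "Thaa", "Thai", "Tibt",
}

# ASSUMPTION: maps the first word of a Unicode character name to a script
# code; it covers only the scripts used in the examples below.
NAME_PREFIX_TO_SCRIPT = {
    "HANGUL": "Hang", "CJK": "Hani", "HIRAGANA": "Hira", "KATAKANA": "Kana",
    "ARABIC": "Arab", "HEBREW": "Hebr", "CYRILLIC": "Cyrl",
}

# ASSUMPTION: illustrative table of which scripts each language uses.
LANGUAGE_SCRIPTS = {
    "ko": {"Hang", "Hani"},
    "ja": {"Hani", "Hira", "Kana"},
    "ru": {"Cyrl"},
    "ar": {"Arab"},
    "he": {"Hebr"},
}


def label_scripts(label):
    """Return (contains_ascii, set of non-ASCII script codes) for a label."""
    has_ascii = False
    scripts = set()
    for ch in label:
        if ch.isascii():
            has_ascii = True
        else:
            prefix = unicodedata.name(ch, "UNKNOWN").split()[0]
            scripts.add(NAME_PREFIX_TO_SCRIPT.get(prefix, prefix))
    return has_ascii, scripts


def shows_as_unicode(label, accept_languages):
    """True if the label would be shown in Unicode, False for Punycode."""
    has_ascii, scripts = label_scripts(label)
    if not scripts:
        return True  # pure ASCII label, nothing to encode
    # Mixing with ASCII is only permitted for scripts on the allowed list.
    if has_ascii and not scripts <= MIXABLE_WITH_ASCII:
        return False
    # All non-ASCII scripts must belong to one language the user has added.
    return any(scripts <= LANGUAGE_SCRIPTS.get(lang, set())
               for lang in accept_languages)


if __name__ == "__main__":
    print(shows_as_unicode("lg전자", ["ko"]))        # Hangul + ASCII, Korean added
    print(shows_as_unicode("lg전자", ["en"]))        # Korean not added
    print(shows_as_unicode("p\u0430ypal", ["ru"]))   # Cyrillic mixed with ASCII
```

Note that in this sketch a pure Cyrillic label is still shown in Unicode for a user who has added Russian; only the mix with ASCII is rejected, because Cyrillic is not on the list above.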

In summary, you told us how you planned to use the feature and we listened. We’re very excited that we were able to make this change to allow richer domain names for international sites!

Thanks,
Tariq Sharif
Program Manager

Comments (55)

  1. Anonymous says:

    Good to see. now, can we please start posting images that don’t look awful like this! They don’t need to be uber high res, but at least appear decently! (change your settings for compression, the pixelation in these images is terrible!)

  2. Anonymous says:

    haha

    http://www.איקאה.com

    is for IKEA in Hebrew

    cool 😛

  3. Anonymous says:

    Thank you for making the changes. However, a distinct discrimination remains against non-Latin speakers who routinely mix in Latin script. Cyrillic is not included on the list, obviously due to spoofing concerns. However, if you truly listened to your end users, you would know that even in Cyrillic, mixed-script words like DVD and CD are the norm. Otherwise, thank you for finally opening up to the world and letting users use domains in their native languages. Let a million flowers bloom.

  4. Claw says:

    Why is Chinese not on this list?

  5. Anonymous says:

    @Sokol: We "truly" listened to our end-users, who ranked non-spoofability as the #1 concern for IDN. IDN will never take off if it’s viewed as a security hole.  

    @Claw: I’m not sure I understand your question.  Han is a Chinese script.

  6. Anonymous says:

    Pleased to see IE allows mixed script IDNs for Japanese (Hani/Hira/kana) and ASCII, since it’s very common to see such names, such as company names and product names, in real life.

  7. njg says:

    Since I installed IE 7 Beta 3, everytime I access my internal sharepoint site, I now get prompted for user authentication and this happens each time I try to do anything on the site so you end up entering the same authentication info again and again. I am the administrator of this sharepoint site and even if I click remember me, it still keeps prompting me for authentication even though I am on an internal network. This wasn’t happening before IE 7.

  8. Xepol says:

    I realize that this is off topic, but has anyone else using the google toolbar suddenly found that their toolbar has been updated without their permission?

    I hopped on my machine today and suddenly found a totally changed toolbar with google this, gmail that, etc etc, and I can’t remember EVER being asked if I wanted my toolbar updated. I can’t even find a setting in the toolbar that would prevent this invasion of privacy from happening again.

    Frankly, I’m a little confused and bewildered why a company that says its motto is to do no evil then basically loads new software on my machine, heading down that slippery slope towards spyware, all without even saying anything till after it is done.

    Frankly, I’m so mad, I’m seriously considering uninstalling the toolbar forever.

    Anyways, anyone else seeing this?

  9. Anonymous says:

    1-Is it only the domain name at this point? How about the URL and file names?

    2-I don’t see Farsi in the language list, is it coming anytime soon?

  10. Hoopskier says:

    While you’re on the "we listened" theme…

    Any chance you’ll listen to us about the naming of "IE7+" on Vista?

    Blog readers, please vote on this bug to remove the + from IE7+:

    https://connect.microsoft.com/IE/feedback/ViewFeedback.aspx?FeedbackID=168635

  11. Claw says:

    @ErikLaw: I see "Hani" on the list, but I don’t think I have ever heard it called that.  Is it a typo?  It’s normally called Hanzi in Chinese.

  12. Anonymous says:

    @Claw: I’m not an expert on global scripts, but I assume that Han is a Latinization (see e.g. http://www.circleid.com/posts/jet_guidelines_for_internationalized_domain_names/)

    @njg: There are 2 likely problems here.  1> Are the sites in the Trusted Zone? If so, you need to enable the "Automatic logon" action in Tools|Internet Options|Security|Trusted.  If not, ensure that "Include all local (intranet sites)" is checked in Tools | Internet Options | Security | Sites.

    @dotone: The IDN standard only applies to domain names.  For the rest of the URL, IE by default uses UTF-8 for the path, and either UTF-8 or codepaged text for the query string.

    The language list above is the list of scripts allowed to mix with Latin.  Not all scripts need to mix with Latin.

  13. Anonymous says:

    good!!!

    i couldn’t wait for this time coming

  14. Anonymous says:

    Windows Vista’s IE7+ has cipher strength = 256-bit. Do you know if Windows XP’s IE7 cipher strength is 256-bit ?

  15. Anonymous says:

    I think this is an Internet Explorer 7 bug but I didn’t know where to send a bug report…

    http://code-news.blogspot.com/2006/08/ie-70-encoding-bug.html I don’t think it’s just for IE 7.0

  16. Anonymous says:

    –As for Chinese, you should know that domains like 北京CBD (Beijing CBD), the official name of the Beijing Central Business District, should be supported, no ifs, ands, or buts.

    –"We "truly" listened to our end-users, who ranked non-spoofability as the #1 concern for IDN". — In other words, you listened to the users who don’t need IDNs in the first place, like those who only speak English, want to continue using ascii-only URLs, and can’t be bothered by the rest of the world, and are annoyed some clever phishing types have inserted weird characters to spoof. That’s not the IDN user base! The IDN user base is people who routinely use languages that are not ascii based. Why should they suffer? If I have Russian and English set as acceptable languages, I should be able to view mixed script domains.

  17. Anonymous says:

    Damit what is up with the image software you guys use, you jpg compress it and then recompress as png making us download artifacted images at the price of losless.

  18. Anonymous says:

    I should not address my question over here. But I can’t post a new one since I am a guest user, and don’t come here usually. But I have some problems by IE, I think it maybe because I am using IE7 beta3 version. Here is my problems:

    1. I always get some pages where the status bar in the lower-left corner says: "done but with error!" and the page doesn’t show correctly. Once I hit that, I can’t simply get rid of it by refreshing or opening the page in a new tab. The only solution is to close IE and open a new one, but that may not help every time either. I need to try several times to get the page to show correctly.

    2. Sometimes I have a problem with my internet connection and IE only shows me an error information page; that is fine.

    After a while, when the connection is back to normal, I can go to other websites, but I can’t go to the page where I just got the error information by simply using the refresh button. I mean I have to open the same page in a new tab or window.

    3. The last one is the one really bothered me recently. That’s the reason I come here. But I am not sure it is IE7 beta3’s problem or the web site’s problem.

    I usually go to a Chinese web site

    http://blog.sina.com.cn

    In the middle of the page, you may see many small pictures. They are links to someone’s blog with many pictures. But my problem is, whenever I try to look at these blogs, the pictures do not show up.

    Can someone with IE7 Beta3 try one of that?

    my computer IBM T43 xp sp2 + IE7 beta3

  19. Anonymous says:

    @Vlad: The URL you’ve provided leads to a 404 page.  Can you please fix the URL?

    @Sokol: Microsoft is committed to helping protect all users worldwide from phishing attacks, not just those who speak English.  To reiterate: If someone registers an IDN with a disallowed mix of scripts, everyone can still navigate to that URI, it is simply displayed in a Punycode form which cannot be used for spoofing attacks.

    @Eric Liu: I’m able to see the images on the blog.sina.com.cn page and the pages that it links to.  Can you please provide the exact URL of a page which isn’t showing pictures?

    Thanks, everyone!

    @AJenbo: Yup, we’ve posted a bunch of bad images lately.  We’ll try to do better in the future, but as you can imagine, we’re busy working hard on getting the product out.

    If you’d like to see only the designer-blessed pretty images, visit http://www.microsoft.com/windows/ie/default.mspx.

  20. Rosyna says:

    Yay! Now MS has taken the same approach as Safari.

    People using IE7 can finally visit http://sailor月.com/japan/Japan.html without seeing an error message with a solution that doesn’t work.

  21. Anonymous says:

    Ericlaw, thanks for your response. Usually, I have problems with any blog  with pictures from this web site, for example, http://blog.sina.com.cn/u/44491d9d010004w3  

    The pictures are small red crosses for me now. When I right-click to see the properties of the picture, I see http://album.sina.com.cn/pic/44491d9d020007vs for the first one. I find this a little strange, because the pictures on other websites end with *.jpg or something. Why not on this website? So I suppose this website is using some special technology to deal with pictures, and this technology happens to have some problem with IE7 or my Windows XP SP2.

    Another issue is my internet connection. Maybe the poor DSL causes this problem. But I am fine with other websites with many pictures.

    Also I did a test like this:

    Again for the first picture in this link, I can’t see it in the current page even I refresh it or right click the red cross and select "show picture".

    Then I copied the link of the pic, opened a new tab in this IE window, and pasted the link into the address bar, but "enter" or "refresh" did not help. I could not see the picture.

    Then I did another test. I keep this IE window opened, but open a new IE window and copy the link to the new windows, then the picture shows up.

    Ok, I back to the first window, which I couldn’t open the picture just before, and use right click and select "show picture", the picture show up.

    I know this is really mess. But I am really want figure out what is the problem.

  22. Anonymous says:

    @EricLaw [MSFT] can you please tell me which link leads to a 404 page?. All the links work fine for me.

  23. Anonymous says:

    Sorry about the question… other persons seem to have problems accessing the page too. I don’t know why. Try this: http://code-news.blogspot.com – It’s the post about IE bug 🙂

  24. Anonymous says:

    @Vlad: The site (http://thor.info.uaic.ro/~dlucanu/pa/id_tema_2005-2006.htm) renders perfectly for me in my RC1 build of IE7.  That being said, the page is written incorrectly.  The server returns the header:

    Content-Type: text/html; charset=UTF-8

    But the body contains the directive:

    <meta http-equiv="Content-Type" content="text/html; charset=windows-1252">

    These can’t both be right.

  25. Anonymous says:

    IE7 is a great improvement.

    I would like to see an indication on the tab that tells me if I have viewed it or not.

    Being able to ‘pin’ tabs so they load the next time IE is opened would also be great.

    Both of these features are in the iRider browser, which was my default until IE7 came along.

  26. Anonymous says:

    Any news on when Release Candidate 1 will be… released?

    Also, how come this blog isn’t done like the Live blogs, where you can log in under your Live account? I like that.

  27. Anonymous says:

    gooooooood very goooooood

  28. Anonymous says:

    This was more important than an integrated download manager?

    I renew my request for an integrated download manager.  And one like Opera’s, not Firefox’s (Firefox’s sucks; I’ve seen cases where the server stops responding for a certain length of time (but doesn’t necessarily terminate the connection), causing FF to think that the download is complete, and you can’t resume the download in that case; you have to start over from the beginning.  With Opera, the download process stops, but Opera tags it with "Error", which means that you can Resume the download where you left off, which is what an IE download manager should allow as well.)

  29. Anonymous says:

    @Brutus: Yes, in surveys of what IE users were looking for, IDN ranked well above a download manager (remember, there are hundreds of millions of non-US users of IE).  

    There are a wide variety of download managers available for Internet Explorer. My current favorite is (the somewhat unfortunately named) LeechGet, available from http://www.leechget.net/en/.  It offers a ton of features and is free for personal use.

    Of course, we’re still looking at building in a download manager for a future release of IE.

  30. Anonymous says:

    Thanks for finally delivering this IDN support.  Many people fail to realize that this changes the entire internet for the majority of users around the world.  Throw off the shackles of ASCII!!

    Given the importance of this feature, I am rather surprised that in your PR about IE7 you rarely mention IDN support.  For most of us who are in a position to judge it is game-changing stuff.  Enhanced anti-phishing is great but this goes way beyond.

  31. Anonymous says:

    I’ve noticed that in IE 7, if I change the CSS attribute "display" to "none", the div content keeps on playing.

    For example, if I have a video gallery where you can hide and show content by clicking on the various icons, and I play one of the movies and then click on a different icon, that movie will keep on playing. I won’t be able to see it, but the sound will keep on playing.

    This was not the case in IE 6 (or any of the other browsers, for that matter). I hope it will be fixed.

  32. Anonymous says:

    Can you please allow IE to use mixed script IDNs for Japanese (Hani/Hira/kana) and ASCII, since it’s very common to see names such as company names and product names, in real life.

  33. Anonymous says:

    The web may be international and Internet Explorer obviously needs to support all the nations and users, but I am not international nor is my life or business or interests.  You will never see a legitimate US business targeting US customers using these characters in the URLs.  Give me an option to disable resolving and accessing any such URL.  While you’re at it, enable me to disable accessing URLs by specifying wildcards such as *.cn  *.someurl.net.  I’m talking about a hosts level option… although letting me specify security zones this way would also be a plus.

  34. Anonymous says:

    People must not be having that much trouble developing cross browser websites if they can troll blogs with comments such as:

    "Damit what is up with the image software you guys use, you jpg compress it and then recompress as png making us download artifacted images at the price of losless."

    Who cares what the quality or format is, they are clearly visible and are only there to explain something. This is not a high quality glossy book with high resolution pictures. It is a blog.

  35. Rosyna says:

    I know this isn’t quite the place to say this, but I think IE’s phishing filter is totally borked.

    http://www.imagetrash.net/image_45485.jpg

  36. Anonymous says:

    Not sure, if Turkish will be included in this, such IDN names also allowed here while Turkish alphabet mainly latin, windows-1254, iso-8859-9 domain names can be Turkish specific characters, where other users just see as different type such as "ğ, ş, ı". I’m not sure if this is natively installed in IE7.

  37. Anonymous says:

    @Rosyna

    I visited this link http://www.unsanity.org/archives/rant/caring_for_developers.php with IE7 beta3 and I didn’t get a warning by phishing filter.

    Look here: http://imageshack.ath.cx/images/nophish.PNG

  38. Anonymous says:

    @Neal: By default, if you haven’t configured any languages other than English, it won’t be possible to spoof you using IDN.

    Checking Tools | Internet Options | Advanced | "Always show encoded addresses" will remove the possibility of you being spoofed by an International domain name, even if you change your language settings.

  39. Anonymous says:

    Hi!

    can i make a request? why dont you allow that the default landing page for domains typed without extension on IE’7 in japanese is .jp instead of .com?

    First. For japanese content, there are -obviously but not for you until now- more japanese on .jp domains on use than .com.

    I mean as i am not confident on foreign languages and it is a .jp the extension that has the most japanese content, why to force me to go to an extension (.com) that is most likely not to have japanese content? On .com i will be landing on english, spanish, greek, etc. websites and i dont see the point, i speak and my IE’7 is in japanese. So save me some time and make .jp default landing extension for the people that have IE’7 on japanese.

    I know there are more .com registered than .jp, but it doesnt mean there are japanese content on .com than .jp , right? After all, for someone in Japan, with IE’7 in japanese it will make sense that this person’s preference are websites on japanese, dont you think? And as i said .jp is like 99% chances to have japanese content, while a .com is only less than 1% of .com are in japanese.

    Second. Regarding japanese idn and .com

    Most .com in japanese IDN are already taken by cybersquatters, as .com is cheaper to register and it is not controlled by the Japan Registry. Verisign doenst combat much cybersquatting. At least JPRS combat cybersquatting  on .jp , so you dont see almost never pages full of ads on .jp,  while on .com millions of .com doesnt have content at all. A .jp has content and on japanese, thats what i meant.

    Anyway, please let me know your feedback, because i really find annoying and a little .com biased that if you dont type the extension (.jp), you have to visit a .com that 99% of them are not in japanese, while 99% of .jp has japanese content.

    I dont know in China and other countries. but in Japan, for japanese content to visit a .jp means more pages in japanese than to visit a .com So please for IE’7 on japanese, please landing page on .jp, not non-sense .com as less than 1% have japanese content? Onegai shimasu!!!!

    Naoshi O.

  40. Anonymous says:

    I’m not sure if this is in the right place, but I would like to see IE7 allow you to drag tabs in one window of IE to another window of IE.  If there is a better place to talk about this, put me in the right direction please.

  41. Anonymous says:

    Naoshi Ogata:

    What feature are you talking about?  If you type in a URL, you go to the site.  If you don’t type in the full domain, you go to search (and you get to configure what search engine is used).  If you want, you can change what happens when you hit CTRL+Shift+Enter using Tools>InternetOptions>Languages.

  42. Anonymous says:

    In addition to the more prominent work we’ve done to enable international scenarios (like adding support

  43. Anonymous says:

    This blog post frames our approach in IE8 for delivering trustworthy browsing. The topic is complicated