Net Addresses to Make Use of Non-Latin Scripts

The nonprofit body that oversees Internet addresses approved their use in a decision that could alter the look and character of the Web.

Comments: 46

  1. Addresses using non-Latin scripts, formally known as internationalized domain names or IDNs, have been around since about 1999. For example, ???.gr was registered in 2005. What's new is that top-level domains like .com or .gr may soon also use non-English characters. The article is rather unclear about that.

  2. Do many users actually type URLs anymore? I suspect most use a search engine. And HTTP servers can already serve up pages in any language. So this shouldn't affect users all that much.

    If I understand correctly, this means a national domain can now be represented by both the existing Latin (e.g. .ru) and the national script (whatever the Cyrillic is for .ru), and these would be two distinct domains.

    I muse over how the difference might play out over time and am reminded of a visit to a Chinatown restaurant with a friend who was fluent in Chinese. As we were entering, I noticed a large sign handwritten in Chinese characters in bold magic marker. When I asked my friend to translate, he smiled and said "Tipping is not required". :-)

  3. Doesn't the domain name come BEFORE the dot?

  4. It is about time! They should have opened up internet domains to foreign languages years ago.

  5. Why we continue to give this organization the type of power (real or paper tiger) it has over domains decades after the widespread proliferation of a GUI internet is beyond me. In any other commercial sector they would be viewed as a monopoly with practices that echo organized crime syndicates.

  6. more global users=more global abusers

  7. "The Internet will become more multi-lingual than before." Just what we don't want! The best that can happen is to end up with a single language in the world. Obviously, this turn of events is a backward step.

  8. The article should have been more clear as to the potential danger - the spoofing of legitimate addresses using look-alike characters from these other scripts.

  9. On the surface, allowing URLs to access web sites using non-Latin characters sounds great. One can register a domain name in one's language and no one has to learn another alphabet. Well, this change not only affects web sites, but affects the registration of domain names, and the central domain name services (DNS). Not only that, these domain names can be used in e-mail addresses; and that is the problem.

    And to think fighting SPAM was bad with Latin-character domain names; imagine how much identity theft, sex-related sales, Trojan Horses, viruses, cyberattacks, etc. are now going to get through. Utilities which filter SPAM, or anything else for that matter, will have to be revamped to keep stopping the 90% of e-mail that is considered SPAM, along with the delivery of cyberattacks. And SPAM filtering is only one part; there are also sites and local server configurations that blacklist known virus, Trojan Horse and SPAM sites. Again, these also use Latin characters for domain names.

    While internationalization is good for the user, those making this change did not consider the cyberattack, SPAM and system administration/engineering issues that will impact many networks and servers. Not to mention the costs of effecting proactive changes to prevent attacks.

  10. Now, more than ever, the U.S. needs to get serious about teaching two or three languages in addition to English to all school children. The rest of the planet has a distinct advantage in trade and other important activities in life knowing English and often two to three other languages.

    Languages all U.S. school children should be proficient in: English, Spanish, Chinese, and a fourth one they pick which they should be good with or already know. Expanding language requirements for U.S. students could revolutionize U.S. trade worldwide and increase understanding of how the rest of the world thinks and lives.

  11. We all live on what seems to be a smaller and smaller planet. Humanity needs to learn to live as one big tribe. Allowing the "others" access to the community of the internet may very well help us understand that humans have much more in common than they have differences. Perhaps learning to cooperate and live together will not be such a bad thing.

  12. N° 7: "'The Internet will become more multi-lingual than before.' Just what we don't want! The best that can happen is to end up with a single language in the world."

    If that were to happen, the single language would probably be Chinese. Are you okay with that, Mr. Hosman?

  13. I'm sure al Qaeda is psyched. Might make it easier to conceal their chat rooms from nosy Americans, no?

  14. I thought the whole idea was to make the rest of world learn American English. It's bad enough already that American made web browsers support foreign alphabets. But the implications of this latest change are even more ominous: The day may well be approaching when Americans might actually have to learn something substantive about foreign cultures, especially those with funny alphabets. I don't see why we just can't keep printing dollars and giving them to the Chinese to manufacture just about everything we need. We like to consume; they like to save. That's all I need to know. Why ask questions? Be happy.

  15. For us English speakers, the current system allows only: A-Z (& a-z), 0-9 & "-".

    We need more Punctuation! Spaces (or the "_" workaround), ampersands (&) are two characters that I would find useful. This point doesn't seem to be addressed.

    We still can't write (for instance): Tiffany & Co.
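The character rule comment 15 describes can be sketched as a small check. This is only an illustration: the function name `is_ldh_label` is hypothetical, and the extra constraints (no leading or trailing hyphen, at most 63 characters per label) come from the traditional hostname rules rather than from the comment itself.

```python
import re

# Letters, digits and hyphens only, 1 to 63 characters,
# and no hyphen at the start or end of the label.
LDH_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")

def is_ldh_label(label: str) -> bool:
    """Return True if `label` is a plain letter-digit-hyphen label."""
    return LDH_LABEL.fullmatch(label) is not None

print(is_ldh_label("tiffany"))      # True
print(is_ldh_label("tiffany-co"))   # True
print(is_ldh_label("tiffany&co"))   # False: ampersand is not allowed
print(is_ldh_label("tiffany co"))   # False: space is not allowed
```

Under this check, "Tiffany & Co." fails on both the spaces and the ampersand, which is the limitation the comment points at.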

  16. #12: I'm sorry Randonneur, but Chinese will never become the world's dominant language. Why? Ease of learning/use. English is very easy to speak badly. Spoken English with broken grammar/intonation/tense is still pretty easy for the listener to understand. It's easy to master 26 letters. Chinese (I assume you mean Mandarin) is not even spoken by all 1.3 billion people in China. In many provinces, traveling 20km from one village to another finds one in a place where he cannot be understood. Rote memorization of thousands of complex characters (the ONLY way to learn them by the way) is not conducive to quick development. Years of education time are spent educating Chinese youth solely on how to write their own language (English letters can be learned in days/weeks). The main point is English is easy, useful and extremely popular. Just ask my students. (I have been living in China for the last 6 years btw)

  17. U2NCNET wrote:

    "Why we continue to give this organization"

    They are an INTERNATIONAL organization and they have a mandate to accommodate everyone on the entire planet.

    "In any other commercial sector they would be viewed as a monopoly"

    You totally miss the entire point. There would be chaos if there were more than one organization handing out TLDs.

    And to several other respondents:

    If you think that it's possible to sort out spam or porn or whatever by looking at the domain name, BOY are you naive.

  18. Seems like the first step toward an internet Tower of Babel to me. UNLESS, real-time Babelfish-style translation gets built into web browsers. God forbid we Americans actually have to learn another language!

  19. At no.13 (xSamplex),
    I have to agree with you on that one. It's one of the first things that came to mind when I read this article

  20. Not to nitpick, but the web does not have a four-decade history. Tim Berners-Lee created the first web server and browser in 1990, and the WorldWideWeb as a public service went live in 1991. The intertubes, starting from the original four-node Arpanet, date back 40 years.

  21. Number 12: re forcing everyone to learn Chinese.

    There's a reason that English is universal: it's easy to learn with no stupid genders to make things unnecessarily complicated (thanks German and French! etc.), and no unwieldy alphabets of thousands of characters (hello Chinese!). Just be grateful that we share our beautiful language with you.

  22. #12

    The internet is a single language: 1's and 0's. The rest of it is window dressing.

  23. Porn is porn in any language and any URL, and it's still the ONLY profitable business on the net... When will someone design a keyboard that can be used in any language? Now that would be a welcome improvement to accessing cyberspace.

  24. Clearly the Tower of Internet was growing too tall and threatening.

  25. High Tech Tower of Babble

  26. "Of the 1.6 billion Internet users worldwide, more than half use languages that have scripts that are not based on the Latin alphabet."

    What??? I thought everybody in the world spoke English, wrote/read Latin alphabet, ate American junk foods, paid in US dollar, wore an American fashion style, watched American Idols, etc.?

    You mean we're not Number 1 anymore? Shocking!

  27. Time to rush out and get those highly valuable domain names in other languages so you can make a quick buck! Seriously though it is a good thing except for those in the West. There will be issues but this will make it more fair and accessible for non-Latin language users. Time to brush up on my Chinese!

  28. what does this mean? that every small business currently already paying for their domain names now will have to pay endless registry fees all over? what is icann good for?!

  29. All script should be in Latin. I am a big time liberal but still I get annoyed at multiculturalism. (I myself am a multicultural person) I am not dismissive of other cultures but the idea of unified communication is too tempting to dismiss. The world would eventually be a WAY better place if everyone had a standard means of communication.

  30. "What's new is that top-level domains like .com or .gr may soon also use non-English characters. The article is rather unclear about that."

    Okay, well then, I'm not sure how something like this actually has some of these grand cultural and linguistic implications some of the other readers speculate it will have.

    Sounds like the latest web-gimmick of the month will be to add a few non-latin characters to your domain name to attract random traffic taken in by the novelty of it.

  31. First off, for all the Tower of Babel arguments regarding language, that is irrelevant. The entire discussion has to do with URLs - the web addresses - not webpage content. Webpages can be written in any language whatsoever provided there is a character set available on computers for expressing that language. In other words, webpages can easily be written in English, French, Greek, Russian, Farsi, Arabic, Urdu, Mandarin, Japanese, whatever... even Klingon! This ability will not change due to admitting non-Latin characters in the URL.

    What will be inconvenient for people accustomed to a particular character set is typing in a URL directly that contains non-Latin characters. Often, specifying a direct URL is easier than looking one up via a search engine. Of course, once a webpage is found, it can be bookmarked by a browser.

    What is interesting is some of the minor potential problems that may arise. For instance:
    1. How will such webpages be displayed - or even accessed - if a computer doesn't have the language set installed? This will mean that, going forward, every computer should have every standardized character set for full functionality.
    2. Often in computer applications, there is a "delimiter" character. For instance, often you will see in URLs "%22" which usually represents a quotation mark. What concerns me is that a delimiter in one language set might actually be a useful character in another. How will a browser handle that? I suspect that certain additional characters will have to become standardized delimiters like certain ones are now.
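The "%22" mentioned in point 2 is an instance of percent-encoding, and a short sketch with Python's standard `urllib.parse` shows how it behaves. The Russian word used here is only an illustrative input, not something taken from the article; the sketch assumes the common convention of encoding non-ASCII text as UTF-8 bytes before escaping each byte.

```python
from urllib.parse import quote, unquote

# "%22" decodes to a double quotation mark, and vice versa.
print(unquote("%22"))   # "
print(quote('"'))       # %22

# Non-Latin text is escaped byte by byte (UTF-8 by default),
# so the escape syntax itself stays within plain ASCII.
encoded = quote("правда")
print(encoded)
print(unquote(encoded) == "правда")   # True
```

Because every escaped byte is written with "%" plus two hexadecimal digits, the escape character does not collide with letters from other scripts.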

    Overall, it's an interesting frontier.... It'll certainly create a lot of jobs down the road to handle the transition.

  32. Not all languages read from left to right--Arabic, for example, doesn't. How will this be accounted for in web addresses?

  33. "13. xSamplex Boston, MA
    October 30th, 2009
    10:03 am
    I'm sure al Qaeda is psyched. Might make it easier to conceal their chat rooms from nosy Americans, no?"

    No.

  34. The responses here are characteristic - xenophobes on the one side, and those with a more realistic view on the other side. If we think we will keep the rest of the world from using their own scripts, we don't understand that our alphabet is actually a minority on the world stage. Do you think we might block the use of others' native scripts, for access to a vital resource like the Internet? Societies like China have for a number of years now _already_ been serving fully Chinese domain names, now with over 300 million users. So now have a number of other countries and script groups.

    Clearly internationalized domain names (IDNs) have been possible. Actually for over ten years now. What this article does not tell is how ICANN has failed most completely to respond to those needs. Over ten years, for goodness sake. Of course other societies took matters into their own hands.

    That is the more interesting story. And the article has buried in it what is the crucial part of today's announcement: Actually, only country code (cc) top level domains (TLDs) will begin to trickle out of ICANN even now. Unfortunately, the article leaves the ICANN spin on the story - 'now we are international.' Instead of getting the key facts up front - 'we still cannot enable the international .com, .org, etc., which have been the heart of the Roman-character Internet.'

    The real action, not in ccTLDs, but in these generic (g) TLDs, remains delayed. Having myself just got off the plane from Seoul and this ICANN meeting, it is clear that IDN gTLDs will be for some significant time still off in the future. While other societies have already moved forward.

    As to the desirability of different languages and scripts, besides English and the Roman alphabet: These differences are where the world starts! From eons ago. Those other societies - a _much_ greater portion of the world, than Roman character-based users - will arrange access to the Internet, a facility fundamental to their moving ahead. The challenge of dealing in different languages is what we have always faced. That does not change. When other script groups are fully accessible, including to us, perhaps we will be encouraged to deal with this age-old challenge more effectively.

    But the interesting story lies beyond ICANN and with the arrangements that other script groups have already made. As they say, stay tuned for developments.

    David Allen

  35. I find it hard to believe that the people in any of these countries mentioned who have the money for a computer and Internet access also have a debilitating problem with the Latin alphabet, which I also agree ought to be good enough for the Internet if it's good enough for the UN. I read one other language besides English, and often need information from sites that only use that language. I don't need the language in the URL to find them, especially since most of these sites are simply rendering names in their own language with Latin letters.

    I find it very easy to believe that some of the enthusiasm here is fueled by nationalist agendas, and easier to believe that this is yet another example of the start now, plan later ethos that is prevalent in some of the countries mentioned; given the amount of cyber crime already flourishing currently in Russia and East Asia, I find the vagueness about what the security risks are and what exactly is being done to counteract them plainly frightening. I understand Westerners tend to turn their brains off as soon as someone hits the identity politics button, but a statement like 'we're up to the challenge' ought to be setting off alarms.

    Finally, I'm sure there will be software developed that automatically shuts out anything routed from an address in something other than Latin letters. Sign me up [it won't hinder my ability to access content in the other language I use].

  36. the problem is when i get a link to go to chase.com, that i don't go to chàse.com or cháse.com, which would be spoofed phishing sites for my personal info

    (i don't know if the new york times comment section supports unicode, or if your browser does. but i typed chase.com where there were accent marks over the a character. you may see gibberish instead)

    the point is, every website i go to from now on, i need to study the url with a magnifying glass to make sure i am getting the actual site i wanted. not even as a security precaution, but just as to avoid phony sites that might be spoofing a real one for all sorts of purposes, not all of them nefarious, but all of them certainly annoying. a with accent mark may be easy to see, but there are some subtle unicode characters that are so completely like the lowercase "L" or upper case "I" or upper and lowercase "O", etc., and each different font might render the different characters in so many subtle variations, that it's almost impossible anymore to guarantee that the link you followed actually went to the site you think it did

    so we have to type addresses by hand to make sure they are genuine from now on? it's not cultural imperialism to support only 30 or so characters for website addresses. think of it as a universal routing system, that is purposefully limited, simply for the sake of security and peace of mind

    characters for website addresses should remain small in number
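The "magnifying glass" check in comment 36 can be roughed out in code rather than done by eye. This is a minimal sketch, not a real anti-phishing tool: the helper name `non_ascii_chars` is hypothetical, and it only lists characters outside ASCII together with their Unicode names.

```python
import unicodedata

def non_ascii_chars(hostname: str) -> list:
    """List (character, Unicode name) pairs for every non-ASCII character."""
    return [
        (ch, unicodedata.name(ch, "UNKNOWN"))
        for ch in hostname
        if ord(ch) > 127
    ]

print(non_ascii_chars("chase.com"))        # []
print(non_ascii_chars("ch\u00e0se.com"))   # the a with a grave accent is reported
```

A plain-ASCII address yields an empty list, while the accented variant is flagged with the name of the offending character.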

  37. Wow, I'm counting the new spam already!!

    Let's face it, this is just another step in the direction of Chinese world hegemony -- brought to you by the US government and Wall Street!

  38. What's the big deal? The Internet is already in over one hundred languages, including dead ones. This only affects whether people have to transliterate their native language into Latin letters. I seriously doubt that it will be anything other than a small convenience (not having to switch keyboard settings before typing in a URL if you type in Russian) for most people. So instead of www.pravda.ru, we have ???.??????.??. Woohoo!

  39. Quick, what's the difference between www.citibank.com and www.ⅽitibank.com?

    In the first one the 'c' is the letter 'c' (1100011 in binary), the other is the lowercase roman numeral for 100 (10000101111101). There are similar pairs of characters for every letter of the alphabet; some, like 'A', have dozens.
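The two binary values quoted in comment 39 can be checked directly. The sketch below assumes the look-alike is U+217D, since that is the code point whose binary form matches the second value; it also shows that Unicode compatibility normalization (NFKC) folds it back to an ordinary "c".

```python
import unicodedata

latin_c = "c"
lookalike = "\u217d"   # assumed look-alike character

print(format(ord(latin_c), "b"))     # 1100011
print(format(ord(lookalike), "b"))   # 10000101111101
print(unicodedata.name(lookalike))   # SMALL ROMAN NUMERAL ONE HUNDRED

# Compatibility normalization maps the numeral to the plain letter.
print(unicodedata.normalize("NFKC", lookalike) == latin_c)   # True
```

Normalization handles this particular pair, but it is not a general defense: look-alikes from other alphabets, such as Cyrillic letters, are distinct letters and are not folded to Latin ones.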

    Remember the old days when you could check whether or not a website was genuine by looking at the URL and differentiating between www.citibank.com and www.citi-bank.com.ru? Yeah, those days are over. The only way to know whether a web address is genuine now is to type it out longhand yourself, which I think we can agree is a serious step backward, not forward, in internet technology.

    Previously there was an organization that performed a service of translating non-ascii domain names to ascii ones (e.g. www.p?ypal.org will render as www.xn--pypal-4ve.org) that would show the ascii version, preventing spoofing attacks like the new standard will encourage.
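The ASCII form quoted in comment 39 is Punycode with the "xn--" prefix, and Python's built-in codecs can reproduce it. The sketch assumes the disguised letter is the Cyrillic "а" (U+0430), which is the character that encodes to the quoted string; note that Python's `idna` codec implements the older IDNA 2003 rules.

```python
# Latin "p", Cyrillic "а" (U+0430), then Latin "ypal".
label = "p\u0430ypal"
host = "www." + label + ".org"

# Raw Punycode of the single label.
print(label.encode("punycode"))   # b'pypal-4ve'

# Full hostname in its ASCII-compatible form.
ascii_host = host.encode("idna")
print(ascii_host)                 # b'www.xn--pypal-4ve.org'

# Decoding gives back the original Unicode hostname.
print(ascii_host.decode("idna") == host)   # True
```

Showing the "xn--" form instead of the Unicode form is what makes the disguised name visibly different from the genuine one.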

    By the way, did everyone know that ICANN funds itself through the sale and registration of new domain names, and gets the most money from creating new top-level domains like .com and .jp? It's true; they have a very substantial profit motive in creating a system that would require existing businesses to register new domains, and countries to register new TLDs, especially if Citibank (for example) now has to register all 1000 or so possible spoofs of its domain name.

    Follow the money

  40. "so we have to type addresses by hand to make sure they are genuine from now on?"

    For heaven's sake. You already are supposed to do that, because you could easily miss a 1 for an l (that's right... they are both in the standard English a1phabet) anyway.

    This is a 1ot of hoo-ha over nothing. It's not the first time foreign languages have got to the Internet! There are already whole families of chat-rooms in Arabic, enormous message boards in mixes of Russian and Ukrainian, hundreds-of-thousands-of-user sites in Urdu, Hindi, and Mandarin. It's only the URL. I am flabbergasted by the amount of bigotry here. It's like time-traveling to 1950!

  41. Urg... People don't realize that more internet engineers, in order to stay competitive in the market, will now have to learn Chinese characters, Arabic letters, Hindi script, etc. It's already hard when companies can hire H-1B workers or offshore work, so please can someone pass me the foreign language book now? And I changed my mind: no kid of mine is going to take a European language; they are going right into Arabic or Chinese.

  42. I think the unintended consequence of this new change will be to close off vast areas of the internet to those who don't have the ability to type non-Latin characters. It is not a bias in favor of English that kept addresses the way they are now - just in favor of simplicity. Look at it from the programmer's point of view - do you think now there will be versions of C++, Pascal, AppleScript, JavaScript et al. that will be deployed in other languages and alphabets as well?

    I predict LOTS of unintended negative consequences.

  43. This is a clever way to address the fact that the internet is running out of sensible combinations of Latin characters to use as domain names. But then what happens when you run out of sensible international characters? Let's not ignore this problem too.
    Also, all keyboards are in Latin characters anyway, so typing in foreign characters still requires a knowledge of the Latin alphabet, so this helps no one who doesn't know the Latin alphabet (not that the Latin alphabet is very difficult to learn).
    As for someone who suggested we allow periods in web domains, that's a bad idea. The period is a special character used to separate different parts of a web address.
    This is not really a revolution in the internet. It's just small progress.

  44. Quote from the Article & my comments, below:

    "Some security experts have warned that allowing internationalized domain names in languages like Arabic, Russian and Chinese could make it more difficult to fight cyberattacks, including malicious redirects and hacking. But Icann said it was ready for the challenge."

    “I do not believe that there would be any appreciable difference,” Mr. Beckstrom (of ICANN), said... YEAH, RIGHT!

    My Comments on the above:

    1) This poses a new and dangerous URL spoofing & security threat.
    Scripting threats, Trojans, you name it...

    2) Prepare for an onslaught of SPAM emails, very hard to stop
    when a diversity of languages & characters are used in the URL.

    3) ICANN, the promoter of this colossal mistake,
    makes money by the number of new Domain names it registers/sells.
    So, THEY are the main beneficiaries of this sham....

    I wish some US authority/agency would stop this,
    and close down irresponsible ICANN forever!

  45. Personally, I agree with Paul Hosman's viewpoint - we need one world language!

    #10 Mike said "Now, more than ever, the U.S. needs to get serious about teaching two or three languages in addition to English to all school children. The rest of the planet has a distinct advantage in trade and other important activities in life knowing English and often two to three other languages..."

    Which reminds me of a joke ;)

    A Swiss guy visiting America pulls up to a rural bus stop where two locals are waiting.

    "Entschuldigung, koennen Sie Deutsch sprechen?" he asks.

    The two just stare at him.

    "Excusez-moi, parlez vous Francais?" he tries.

    The two continue to stare.

    "Parlare Italiano?"

    No response.

    "Hablan ustedes Espanol?"

    Still nothing.

    The Swiss guy drives off, extremely disgusted. The first guy turns to the second and says, "Y'know, maybe we should learn a foreign language."

    "Why?" says the other. "That guy knew four languages and it didn't do him any good."

  46. I think the entire internet should be converted to Esperanto.