Online gift ordering + enthusiastic kids at the keyboard + Unicode, wait… Unicode?

I was completing an online gift order for my young nephew's birthday, and I was in the middle of typing Happy birthday into the gift card message when an enthusiastic child reached for the keyboard and held down the "a" key as I typed the final "a" in "birthday".

I wanted to capture the spontaneous enthusiasm in the gift tag, but I had no idea what font or format rectangle was going to be used, so I couldn't be sure where to put hyphens so that they will ensure line breaks at visually-pleasing locations. And if I didn't insert hyphens at all, then the line would just run off the end of the gift tag and end up truncated.

Unicode to the rescue!

First, I fired up charmap and went to character U+00AD SOFT HYPHEN. I double-clicked the character in the grid, thereby copying it invisibly to the Characters to copy box. I then clicked the Copy button to copy the invisible soft hyphen to the clipboard. Then I switched back to my Web browser and pasted the soft hyphen into the long string of a's every six or so characters, to provide a hyphenation point.


Happy birthd­aaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaa­y!
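
The same insertion can be scripted instead of pasted by hand. What follows is a minimal Python sketch, not the method used in the post: the helper name `add_soft_hyphens` and the run length of 60 are made up for illustration, and the interval of six mirrors the "every six or so characters" described above.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN


def add_soft_hyphens(word: str, every: int = 6) -> str:
    """Insert a soft hyphen between consecutive chunks of `every` characters."""
    if every < 1:
        raise ValueError("every must be at least 1")
    chunks = [word[i:i + every] for i in range(0, len(word), every)]
    return SOFT_HYPHEN.join(chunks)


# Hypothetical run of a's; the length is an arbitrary choice for the example.
run_of_a = "a" * 60
message = "Happy birthd" + SOFT_HYPHEN + add_soft_hyphens(run_of_a) + SOFT_HYPHEN + "y!"

if __name__ == "__main__":
    # repr() makes the otherwise invisible character show up as \xad.
    print(repr(message))
```

Whether the break points are honored still depends on whatever renders the text at the other end.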

When the gift reached its destination, my brother said, "Nice job on the hyphens. How did you know where to put them?"

I then let him in on the secret. And now I'm sharing it with you.

Anybody know whether Amazon supports the creative use of Unicode to create elaborate smiley faces?

Comments (35)
  1. Vilx- says:

    Risky! I'd have been afraid of getting a card with the text: Happy birthdaaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaay!

  2. Vilx- says:

    Aaaand there goes your blog software. Note to others: don't include HTML entities in your text – they will be treated as such! Let's see if my second attempt goes better:

    Happy birthdaaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaa­aaaaaay!

  3. steven says:

    That's pretty funny and a cool use of Unicode, though it is fortunate that it worked as intended and that you didn't get the 0xC2 and 0xAD UTF-8 bytes inserted as printable characters from some random 8-bit codepage every 6 characters. Looks like Amazon created their online forms properly :)

    What I find truly impressive is that your brother recognised the hyphenation as being problematic in this case, unless he's also highly tech-savvy.
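
To see what that failure mode looks like, here is a small Python sketch that encodes U+00AD as UTF-8 and then decodes the two bytes with the wrong 8-bit code page. Latin-1 and code page 437 are picked only as examples of "some random 8-bit codepage"; nothing here reflects how any particular site processes its forms.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN

# In UTF-8 the soft hyphen is the two-byte sequence 0xC2 0xAD.
utf8_bytes = SOFT_HYPHEN.encode("utf-8")

# Decoding those bytes with a single-byte code page yields two characters.
garbled_latin1 = utf8_bytes.decode("latin-1")  # 'Â' followed by a soft hyphen
garbled_cp437 = utf8_bytes.decode("cp437")     # a box-drawing '┬' followed by '¡'

# Decoding with the matching encoding recovers the original character.
round_trip = utf8_bytes.decode("utf-8")

if __name__ == "__main__":
    print(utf8_bytes.hex())
    print(repr(garbled_latin1), repr(garbled_cp437), repr(round_trip))
```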

  4. Vilx- says:

    Yup, that's what I wanted to say in the first place. Moral: more often than not, people still get their encodings wrong. Be careful when playing with them.

  5. Dan Bugglin says:

    As the others touched on (unintentionally and intentionally, respectively), you can never predict how text you type will be mangled, especially if you dare use characters not on the keyboard. The worst offenders, of course, will blow up on uncommon punctuation (failure to escape ampersands when displaying in HTML, failure to use prepared SQL statements or any sort of escaping for user input in SQL, etc.).
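
Both precautions mentioned here can be illustrated with Python's standard library. This is only a sketch under assumed names: the `gift_messages` table and the sample message are invented for the example, and SQLite stands in for whatever database a real site would use.

```python
import html
import sqlite3

# Sample user input containing an ampersand, an entity-like sequence,
# an angle bracket, and an apostrophe.
message = "Tom & Jerry &shy; <3 O'Brien"

# Escape for HTML display so that "&shy;" is shown literally instead of
# being interpreted as an entity.
safe_html = html.escape(message)

# Store the text through a parameterized query; the driver handles the
# apostrophe, so no manual SQL escaping is needed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gift_messages (body TEXT)")
conn.execute("INSERT INTO gift_messages (body) VALUES (?)", (message,))
stored = conn.execute("SELECT body FROM gift_messages").fetchone()[0]
conn.close()

if __name__ == "__main__":
    print(safe_html)
    print(stored)
```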

  6. Stuart Langridge says:

    Nice trick, Raymond. I knew about the "shy" entity in HTML to be a soft hyphen, but it never occurred to me that there's actually a real Unicode character for it. I learn a new thing; thanks!

  7. Cam says:

    I don't understand why no-one has complained about Microsoft in these comments yet. :)

  8. MikeCaron says:

    This is brilliant!

    Other uses of this idea might be to "uʍop ǝpısdn ʇxǝʇ ǝʇıɹʍ", or s̷̤͈͕ͧ͛̾̀̕ủ̝̝̖̻͈ͫ͛ͧ̒̓͌͝m̙͎̲̼͍̺̬̮̿ͨ̀̏̓ͫm̨̢̗̮͉͓̞͐ͯͥ̃̍̑ͩo̷̩͓̼̾ͤͬn̸̢̘͙̗̙̭̭͓͇̅͆ͮ ̴̵̫̠͖͇̩̳̊̒̔ͯ̚͜ḑ͍̼͓̠͇̆͊ͨe̶̛̫ͧͨͪ̌ͧ̐ͪ̃m̨̓͊͘͏͕̪̬͕o̧̟̜̝̺̹͇̞ͥ̔ͬ̿̃̃͘ͅñ̵̡͉̖͎̼̱͕̍̄ͫ̔ș̰̻͊ͯ͐̈̒ͭͮ͝ or something

  9. Les says:

    Is it sad I had to resize my browser window before I saw what the big deal was?

  10. NB says:

    Thanks for the resize tip! Very cool. Did not know they were still active :)

  11. Erno says:

    The worrying bit about this is that the people who built these sites probably don't even know that their sites support this. And better yet: I keep imagining some office worker copy-pasting the data from an Excel sheet into an email to be sent to the printing company, where yet another office worker starts to copy-paste the message, until somewhere some printer has to deal with what is left of the text.

  12. Willie says:

    I am impressed.

  13. max says:

    That is nice, and kudos to Amazon. A word of warning though — Amazon is cool, but not flawless — the gift message I typed in Russian was swallowed whole, and nothing came out. What's even more shameful is the fact that it was "print your giftcard at home" type of deal, so they just gave me the PDF. Without the text.

  14. alegr1 says:


    Most likely that was because the PDF came with an embedded font for that, and the font might be missing cyrillics.

  15. Technogeek says:

    I've done similar tricks with Unicode. Similar to MAZZTer's post above, I wound up entering U+2019 a few times to get around an unescaped SQL issue on an internal website.

  16. creaothceann says:

    Wouldn't the optimal solution be to insert a soft hyphen after every "normal" character?

  17. Nathan says:

    The real wtf is why the keyboard was accessible to enthusiastic children in the first place.

  18. cheong00 says:

    When I was working on an international card printing website, I generated a preview as the card was made and used a higher-resolution image of that for printing. If any character could not be shown, you would not see it on the card.

    I guess this allowed us to evade possible complaints about non-printable words on the cards. :P

  19. cheong00 says:

    @Erno: Yup, the card printing website that I made unintentionally supports this too, and I guess any ordering service that uses GDI+-based DrawString() or another text drawing routine that properly supports Unicode will support it "unintentionally" too.

  20. SomeGuyOnTheInternet says:

    Smart quotes are an abomination. Good luck debugging why the user can't find Mr O’Brien when they search for O'Brien.

  21. SomeGuyOnTheInternet says:

    Or why search on "Smith-Jones" doesn't find "Smith–Jones".

  22. Evan says:


    Yes, god forbid we write our software to behave like what is good for us humans (counting " and “ and ” as the same when, e.g., searching). It's way better to change our own behaviors.

    (Though in your particular example, it matters much less.)
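
One way to get the forgiving behavior described in this exchange is to fold typographic variants to plain ASCII before comparing. The sketch below is an illustration only: the `search_key` and `matches` helpers and the particular set of folded characters are assumptions, not a description of any real search engine.

```python
# Map curly quotes and the en dash to ASCII equivalents, and drop soft
# hyphens so that they never interfere with matching.
FOLD = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2013": "-",   # en dash
    "\u00ad": None,  # soft hyphen
})


def search_key(text: str) -> str:
    """Return a folded, case-insensitive form of `text` for comparisons."""
    return text.translate(FOLD).casefold()


def matches(query: str, candidate: str) -> bool:
    """Report whether the folded query occurs in the folded candidate."""
    return search_key(query) in search_key(candidate)


if __name__ == "__main__":
    print(matches("O'Brien", "Mr O\u2019Brien"))
    print(matches("Smith-Jones", "Smith\u2013Jones"))
```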

  23. Evan says:

    Also, how'd we get onto smart quotes anyway? Sorry for getting off-topic; I seem to do that a lot.

    Anyway, personally, I'd be curious to see what they did with RTL markers and combining marks and other fancy stuff like that.

  24. Tanveer Badar says:

    You may want to add a bit about resizing the window. It is not very obvious at first.

  25. Smitty says:

    I would never have assumed that Amazon, or any other website, would honor anything other than ASCII; besides, ASCII already defines a hyphen and a carriage return character, so why didn't you try that first rather than assume Unicode support for a 'soft hyphen', whatever that is?

    [Um, why should Amazon, an international company, make it impossible for people in Québec or Germany or Japan to write a message in their native language? -Raymond]
  26. Neil says:

    Extraordinarily this actually works best with the actual site at 800×600 as then the first break happens after "birthd" and the "a"s break into five equal strings, while on wider screens or with feed readers you get a ragged right margin.

  27. Tim says:

    @Smitty: Observe how the greeting card text in the article adjusts while you change the width of your browser window. This will help you understand what a soft hyphen is and why it is fundamentally different from your suggested solution.

  28. Jon says:


    It is right in the post: "I had no idea what font or format rectangle was going to be used, so I couldn't be sure where to put hyphens so that they will ensure line breaks at visually-pleasing locations"

  29. Moose says:

    That's neat and all, but did this actually happen? With your brother *really* saying "Nice job on the hyphens". Is your family that geeky??

  30. voo says:

    @Dave Well, that comes from living in the US, I assume. It's a bit better here in Europe because we obviously have lots of languages where you won't get far without some special characters (and while you can solve that with 8-bit character sets, it gets problematic if you have German and French users).

    I would think it should be even better in Asia, but in my experience they really do like Shift-JIS there despite all the problems (i.e., I would think Unicode would be most advantageous there). Oh well.

    @Smitty Sorry to say, but I hope you never work at any international company. Thanks to that attitude I still can't write the ß in my street name on lots of forms on the web. It's the 21st century; it's really time to start using the almost two-decades-old technology and not the four-decades-old one.

  31. dave says:

    re: web sites honoring anything other than ascii

    It's one of my pet peeves that, 21 years after the publication of the Unicode standard, people still consider characters larger than 7(*) bits to be somehow exotic.

    Especially when we are, implicitly because we're reading a Microsoft blog, all using an operating system that is fundamentally Unicode.

    (*)Yes, 7, not 8. If it's larger than 0x7f it's not ASCII. There may have been a proposal for an 8-bit form, but it was never really adopted.

  32. lefty says:

    I just think you're lucky the message didn't end up being:

    "Happy birthd­aaaaaa­aaaaaaPC LOAD LETTER"

  33. Anonymous says:

    Depending on your operating system's level of Unicode support, you may be able to use Emoji:…/Emoji

  34. Depends on how the available message size is calculated (the whole glyphs vs. code points vs. bytes can of worms.)  This affects more than just soft hyphens; it will also affect combining characters and (depending on the encoding used) surrogate pairs and other single-code-point multi-CHAR sequences.

  35. > ragged right margin

    Fixable by putting a soft hyphen at *every* breakable location instead of just some of them.

    Hap*py birth*d*a*a*a*…*a*a*a*a*y!

    [At the cost of reducing the available message size by a factor of 2. -Raymond]
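
The size trade-off raised in the last two comments can be measured directly. The following Python sketch is hedged accordingly: the `sizes` helper is hypothetical, it places a soft hyphen between every pair of characters (a simplification of "every breakable location"), and which of the three counts a given site enforces is unknown.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN


def sizes(text: str) -> dict:
    """Measure `text` in three units a length limit might be counted in."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_units": len(text.encode("utf-16-le")) // 2,
    }


plain = "Happy birthday!"
# Simplification: a soft hyphen between every pair of characters.
dense = SOFT_HYPHEN.join(plain)

# A character outside the Basic Multilingual Plane is one code point
# but a surrogate pair (two code units) in UTF-16.
emoji = "\U0001F600"

if __name__ == "__main__":
    for label, text in (("plain", plain), ("dense", dense), ("emoji", emoji)):
        print(label, sizes(text))
```

Counted in code points, the densely hyphenated message is roughly twice as long as the plain one; counted in UTF-8 bytes it is closer to three times as long, since each soft hyphen takes two bytes.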
