Encodings in Strings are Evil Things (Part 5)

    In our last episode, we briefly discussed possible behaviors for encoding_cast and looked at how the STL’s basic_string class is structured: it has several core functions, each overloaded many times for various types of input.  We also noted that we could avoid many of the implementation headaches that…

Encodings in Strings are Evil Things (Part 4)

   In our last episode, we established that we wouldn’t be able to make a true std::string replacement and still handle variable-width encodings.  So, we started with the beginning lines of an rmstring class.  However, this doesn’t mean we are going to dispense with std::string entirely!  But first, a quick answer about my choice of names…

Encodings in Strings are Evil Things (Part 3)

   (Before I start: I’ve gotten a few suggestions about readability, since my two entries thus far have been quite long.  So, entries will now contain a summary at the end with major facts/conclusions, and I’ll go back and add them for the first two posts.  I’ll also try to pace my paragraphs more regularly.  Thanks…

Encodings in Strings are Evil Things (Part 2)

   At the end of the last post, we reduced the abstract concept of “string” down to an “ordered sequence of Unicode code points.”  (We did so by choosing to actively ignore glyph information, but we’ll be coming back to it later.)  Unicode code points are simply numbers; of course, numbers have to be reduced to…

Encodings In Strings Are Evil Things (Part 1)

    What is a string?  About six months ago at the Game Developers Conference in San Jose, I sat in on a talk about performance tuning in Xbox games.  The presenter had a slide that read:  “Programmers love strings.  Love hurts.”  This was shown while he described a game which was using a string…
