Q&A on our TR1 implementation

Hello.  My name is Stephan and I’m a developer on the Visual C++ libraries team.  As the Visual Studio 2008 Feature Pack Beta (available for download here with documentation available here) contains an implementation of TR1, I thought I’d answer some common questions about this technology.


Q: What version of Visual C++ does the Feature Pack work against?


A: The Feature Pack is a patch for the RTM version of Visual C++ 2008 (also known as VC9). The patch can’t be applied to:


    * VC9 Express.

    * Pre-RTM versions of VC9 (e.g. VC9 Beta 2).

    * Older versions of VC (e.g. VC8).


Q: Can I just drop new headers into VC\include instead of applying the patch?


A: No. VC9 TR1 consists of new headers (e.g. <regex>), modifications to existing headers (e.g. <memory>), and separately compiled components added to the CRT (e.g. msvcp90.dll). Because it is mostly but not completely header-only, you must apply the patch and distribute an updated CRT with your application. You can think of VC9 TR1 as “Service Pack 0”.


Q: If I use TR1, will I gain a dependency on MFC? Or, if I use the MFC updates, will I gain a dependency on TR1?


A: No. The TR1 and MFC updates don’t interact. They’re just being distributed together for simplicity.


Q: How complete is your TR1 implementation?


A: Our implementation contains everything in TR1 except sections 5.2 and 8 (http://open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1836.pdf ). That is, we are shipping the Boost-derived components and the unordered containers.
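
To give a feel for the two families of components mentioned above, here is a minimal sketch that combines a Boost-derived component (regex) with an unordered container. It is written against the standardized C++11 descendants of these components (namespace std, built with a C++11 or later compiler) so that it compiles on current toolchains; with the Feature Pack the same names would be expected in std::tr1. The function name count_distinct_words is hypothetical.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <unordered_set>

// Hypothetical helper: counts the distinct alphabetic words in a string.
// regex finds the words; unordered_set removes the duplicates.
std::size_t count_distinct_words(const std::string& text) {
    static const std::regex word("[A-Za-z]+");
    std::unordered_set<std::string> seen;
    for (std::sregex_iterator it(text.begin(), text.end(), word), end;
         it != end; ++it) {
        seen.insert(it->str());  // duplicates are ignored by the set
    }
    return seen.size();
}
```

Calling count_distinct_words("the cat saw the other cat") finds six words, four of which are distinct.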


Q: Because TR1 modifies existing headers, can it negatively affect me when I’m not using it?


A: It shouldn’t (if it does, that’s a bug, and we really want to hear about it). You shouldn’t see any broken behavior at runtime, nor any new compiler warnings or errors (at least under /W4, the highest level that we aim to keep the VC Libraries clean under). Of course, TR1 can slow down your build slightly (more code means more compilation time), and if you are very close to the /Zm limit for PCHs, the additional code can put you over the limit. You can define _HAS_TR1 to 0 project-wide in order to get rid of the new code.
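
As a rough sketch of the opt-out described above: the answer only promises that a project-wide definition works, which in practice would usually mean a compiler option such as /D_HAS_TR1=0 in the project settings. The per-file form below is an assumption for illustration; it would have to appear before the first standard header in every translation unit, and mixing translation units with different settings is best avoided. The function tr1_code_requested is hypothetical and only reports which setting is in effect.

```cpp
// Assumption: this definition precedes every standard header in the file.
// A project-wide compiler define is the form the text above recommends.
#define _HAS_TR1 0

#include <memory>

// Hypothetical helper: reports whether the TR1 additions were requested.
bool tr1_code_requested() {
#if _HAS_TR1
    return true;
#else
    return false;
#endif
}
```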


Q: Does this patch affect the compiler or IDE?


A: No.


Q: Did you license Dinkumware’s TR1 implementation?


A: Yes (see http://dinkumware.com/tr1.aspx ), just as we licensed their Standard Library implementation. So, you can expect the usual high level of quality.


Q: So, what has MS been doing?


A: Roughly speaking, additional testing and “MS-specific” stuff.


1. We’ve integrated TR1 into VC9, so TR1 lives right next to the STL. (This involved unglamorous work with our build system, to get the TR1 separately compiled components picked up into msvcp90.dll and friends, and with our setup technologies, to get the new headers and sources picked up into the Visual Studio installer.) As a result, users don’t have to modify their include paths, or distribute new DLLs with their applications – they just need to apply the patch and update their CRT.


2. We’ve made TR1 play nice with /clr and /clr:pure (which are outside the domain of Standard C++, but which we must support, of course). At first, these switches caused all sorts of compiler errors and warnings. For example, even something as simple as calling a varargs function internally triggered a “native code generation” warning. These errors and warnings took a long time to iron out.


3. We’re ensuring that TR1 compiles warning-free at /W4, in all supported scenarios. This includes switches like /Za, /Gz, and the like.


4. We’re ensuring that TR1 is /analyze-clean.


As usual, we’ve preferred real fixes over workarounds, and workarounds over disabling warnings (and when we do disable warnings, we do so only inside the headers, so user code is not affected).


5. We’re identifying bugs in TR1 and working with Dinkumware to fix them. Dinkumware’s code was very solid to begin with – but as ever, more eyes find more bugs. I’ve even found a couple of bugs in the TR1 spec itself (see http://open-std.org/JTC1/sc22/WG21/docs/lwg-active.html#726 and Issue 727 below).


6. We’re striving for performance parity with Boost (which serves as a convenient reference; we could compare against GCC’s TR1 implementation, but then we’d have to deal with the difference in compilers). In some areas, we won’t get there for VC9 TR1 (hopefully, we will for VC10), but we’ve already made good progress. Thanks to MS’s performance testing (which Rob Huyett has been in charge of), we identified a performance problem in regex matching, which Dinkumware has sped up by 4-5x. (Note that this fix didn’t make it into the beta, which is roughly 18x slower than Boost at matching; current builds are roughly 3.8x slower.) And we’ve achieved performance parity for function (again, this fix didn’t make it into the beta).


(“Okay,” you say, “but will regex::optimize make it faster?” Unfortunately, no. The NFA => DFA transformation suggested by regex::optimize will not be implemented in VC9 TR1, but we will consider it for VC10. In my one cursory test, regex::optimize did nothing with Boost 1.34.1.)
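
For readers who have not met the flag, the sketch below shows how regex::optimize is requested. The flag is only a hint: an implementation may ignore it, and it must never change what matches. The code uses the C++11 names in namespace std (an assumption made so it builds on current compilers; with the Feature Pack the names would be expected in std::tr1), and the pattern and function name are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical check: the optimize hint may change speed, never results.
bool optimize_flag_preserves_results(const std::string& text) {
    const std::string pattern = "[a-z]+@[a-z]+\\.com";
    // Default construction uses the ECMAScript grammar.
    const std::regex plain(pattern);
    // Same grammar, plus a request to favor matching speed over
    // construction speed.
    const std::regex tuned(pattern,
                           std::regex_constants::ECMAScript |
                           std::regex_constants::optimize);
    return std::regex_search(text, plain) == std::regex_search(text, tuned);
}
```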


7. We’re identifying select C++0x features to backport into TR1 – for example, allocator support for shared_ptr and function. While not in TR1, this is important to many customers (including our own compiler). This just got checked in, and isn’t in the beta.
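
To illustrate what allocator support for shared_ptr means, here is a minimal sketch using the constructor that takes a pointer, a deleter, and an allocator, as it was later standardized in C++11. Whether the backported VC9 interface has exactly this shape is an assumption. CountingAllocator, AllocStats, and control_block_traffic are hypothetical names; the allocator simply counts how often shared_ptr asks it for memory for its internal bookkeeping.

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical bookkeeping for the sketch.
struct AllocStats {
    int allocations = 0;
    int deallocations = 0;
};

// Hypothetical minimal allocator that records every request it serves.
template <class T>
struct CountingAllocator {
    typedef T value_type;
    AllocStats* stats;

    explicit CountingAllocator(AllocStats* s) : stats(s) {}
    template <class U>
    CountingAllocator(const CountingAllocator<U>& other) : stats(other.stats) {}

    T* allocate(std::size_t n) {
        ++stats->allocations;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) {
        ++stats->deallocations;
        ::operator delete(p);
    }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>& a, const CountingAllocator<U>& b) {
    return a.stats == b.stats;
}
template <class T, class U>
bool operator!=(const CountingAllocator<T>& a, const CountingAllocator<U>& b) {
    return !(a == b);
}

// Creates and destroys one shared_ptr whose bookkeeping memory comes from
// the counting allocator, then reports what the allocator saw.
AllocStats control_block_traffic() {
    AllocStats stats;
    {
        std::shared_ptr<int> p(new int(42),
                               std::default_delete<int>(),
                               CountingAllocator<int>(&stats));
    }  // p is destroyed here, so its bookkeeping memory is returned
    return stats;
}
```

The int itself still comes from plain new; only the reference-count bookkeeping is routed through the supplied allocator.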


8. We’re implementing IDE debugger visualizers for TR1 types. Like the STL (more so, in some cases), the representations of TR1 types are complicated, so visualizers really help with debugging. I’ve written visualizers for almost every TR1 type (I am secretly proud of how shared_ptr’s visualizer switches between “1 strong ref” and “2 strong refs”). Note that the beta doesn’t include any TR1 visualizers.


9. We’ve worked with Dinkumware to fix a small number of bugs present in VC8 SP1 and VC9 RTM (“because we were in the neighborhood”). One which was actually related to TR1 was that stdext::hash_set/etc. had an O(N), throwing, iterator-invalidating swap() (discovered because unordered_set/etc. shares much of its implementation). This has been fixed to be O(1), nofail, non-iterator-invalidating.
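
The three properties of the fixed swap() can be observed from user code. The sketch below uses std::unordered_set from C++11 rather than stdext::hash_set (an assumption made so it builds on current compilers; the text above says the two share much of their implementation). It checks that an iterator obtained before the swap still refers to the same element afterwards, now as a member of the other container. The function name is hypothetical.

```cpp
#include <unordered_set>

// Hypothetical check: swap() exchanges internals without touching elements.
bool swap_keeps_iterators_valid() {
    std::unordered_set<int> a;
    a.insert(1);
    a.insert(2);
    a.insert(3);
    std::unordered_set<int> b;
    b.insert(10);

    std::unordered_set<int>::const_iterator it = a.find(2);
    const int* address_before = &*it;

    a.swap(b);  // no element is copied or moved in memory

    // The iterator survived the swap and now belongs to b.
    return *it == 2 &&
           &*it == address_before &&
           b.find(2) == it &&
           a.size() == 1 &&
           b.size() == 3;
}
```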


10. Because TR1 lives alongside the STL, we’ve made them talk to each other in order to improve performance. For example, STL containers of TR1 types (e.g. vector<shared_ptr<T> >, vector<unordered_set<T> >) will avoid copying their elements, just as STL containers of STL containers in VC8 and VC9 avoid copying their elements.


This is a little-known feature of the VC8 STL; it’s there in the source for everyone to see, except that almost no one reads the Standard Library implementation (nor should they have a reason to).  Basically, this is a library implementation of C++0x “move semantics”, although it’s naturally much more limited than language support will be.  In VC8, template magic is used to annotate containers (vector, deque, list, etc.) as having O(1) swaps, so containers-of-containers will swap them instead of making new copies and destroying the originals.  (For builtin types, swapping would be less efficient.)
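
The idea can be sketched in a few lines. This is not the VC8 source, only an illustration of the technique under stated assumptions: Tracked is a hypothetical element type with an O(1) swap and no move constructor (as in C++03), and the two functions mimic what a growing container could do with its elements. The code assumes a C++11 or later compiler, where constructing a vector with a size default-constructs the elements rather than copying a prototype.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type: expensive to copy, cheap to swap.
struct Tracked {
    static int copies;  // how many element copies have happened
    std::vector<int> payload;

    Tracked() {}
    Tracked(const Tracked& other) : payload(other.payload) { ++copies; }
    Tracked& operator=(const Tracked& other) {
        payload = other.payload;
        ++copies;
        return *this;
    }
    void swap(Tracked& other) { payload.swap(other.payload); }  // O(1)
};
int Tracked::copies = 0;

// Naive relocation: copy each element into the new buffer.
int copies_when_relocating_by_copy(std::vector<Tracked>& buffer) {
    Tracked::copies = 0;
    std::vector<Tracked> new_buffer;
    new_buffer.reserve(buffer.size());
    for (std::size_t i = 0; i < buffer.size(); ++i) {
        new_buffer.push_back(buffer[i]);  // one copy per element
    }
    buffer.swap(new_buffer);
    return Tracked::copies;
}

// Swap-based relocation: default-construct empty slots, then swap the
// old elements into them. No element is ever copied.
int copies_when_relocating_by_swap(std::vector<Tracked>& buffer) {
    Tracked::copies = 0;
    std::vector<Tracked> new_buffer(buffer.size());
    for (std::size_t i = 0; i < buffer.size(); ++i) {
        new_buffer[i].swap(buffer[i]);
    }
    buffer.swap(new_buffer);
    return Tracked::copies;
}
```

For a buffer of three elements, the first function performs three copies and the second performs none, while both leave the contents intact.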


We’ve simply extended this machinery to the new types in TR1. Everything in TR1 with custom swap() implementations will benefit: shared_ptr/weak_ptr (avoiding reference count twiddling), function (avoiding dynamic memory allocation/deallocation), regex (avoiding copying entire finite state machines), match_results (avoiding copying vectors), unordered_set/etc. (avoiding copying their guts), and array (which doesn’t have an O(1) swap(), but does call swap_ranges() – so arrays of things with O(1) swaps will benefit).
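
The remark about array can be checked directly. In the sketch below (using std::array from C++11, an assumption made so it builds on current compilers), swapping two arrays of vectors exchanges the vectors' heap buffers element by element instead of copying their contents, which is visible because the buffer addresses simply change owners. The function name is hypothetical.

```cpp
#include <array>
#include <vector>

// Hypothetical check: array::swap swaps element-wise, so elements with
// O(1) swaps (here, vectors) trade buffers instead of copying them.
bool array_swap_exchanges_buffers() {
    std::array<std::vector<int>, 2> a = {{ std::vector<int>(1000, 1),
                                          std::vector<int>(1000, 2) }};
    std::array<std::vector<int>, 2> b = {{ std::vector<int>(5, 3),
                                          std::vector<int>(5, 4) }};

    const int* a0_buffer = a[0].data();
    const int* b1_buffer = b[1].data();

    a.swap(b);

    // Each buffer now lives in the other array, at the same address.
    return b[0].data() == a0_buffer &&
           a[1].data() == b1_buffer &&
           b[0].size() == 1000 &&
           a[1][0] == 4;
}
```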


That is to say: if a vector<shared_ptr<T> > undergoes reallocation, the reference counts won’t be incremented and decremented. That’s really neat, if you ask me.
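
The difference between copying and swapping a shared_ptr shows up in its use count. The sketch below uses std::shared_ptr from C++11 (with the Feature Pack the type would be expected in std::tr1) and hypothetical names; it does not inspect a vector reallocation itself, only the two primitive operations a reallocation could be built from.

```cpp
#include <memory>

// Hypothetical record of the observed reference counts.
struct UseCounts {
    long after_copy;
    long after_swap;
};

UseCounts observe_use_counts() {
    UseCounts result;
    std::shared_ptr<int> original = std::make_shared<int>(42);
    {
        // Copying creates a second owner: the count goes up here and
        // comes back down when the copy is destroyed.
        std::shared_ptr<int> duplicate = original;
        result.after_copy = original.use_count();
    }
    // Swapping into an empty shared_ptr transfers ownership; the count
    // is never touched.
    std::shared_ptr<int> relocated;
    relocated.swap(original);
    result.after_swap = relocated.use_count();
    return result;
}
```

The copy reports two owners while it is alive; the swap reports one throughout.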


If you have any questions about or issues with the Feature Pack Beta, let us know!




Stephan T. Lavavej, Visual C++ Libraries Developer