Why is there a BSTR cache anyway?


The SysAllocString function uses a cache of BSTRs, which can mess up your performance tracing. There is a switch for disabling it for debugging purposes, but why does the cache exist at all?

The BSTR cache is a historical artifact. When BSTRs were originally introduced, performance tracing showed that a significant chunk of time was spent merely allocating and freeing memory in the heap manager. Using a cache reduced the heap allocation overhead significantly.

In the intervening years, um, decades, the performance of the heap manager has improved to the point where the cache isn't necessary any more. But the SysAllocString people can't get rid of the BSTR cache because so many applications unwittingly rely on it.

The BSTR cache is now a compatibility constraint.
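
For the curious, here is a minimal sketch of flipping that switch for a profiling run. The usual knob is the OANOCACHE environment variable, set to a nonzero value before the process starts; the SetOaNoCache export from oleaut32.dll, which a commenter mentions below, does the same thing per-process if you call it before the first BSTR is allocated. The dynamic lookup below is an illustration under that assumption, not a guarantee that the export is present by that name on every version.

    // Sketch: turn off the BSTR cache for this process so heap profiling
    // sees the real SysAllocString/SysFreeString traffic. Call this before
    // the first BSTR is allocated. Assumes oleaut32.dll exports
    // SetOaNoCache by name; if the lookup fails, set OANOCACHE=1 in the
    // environment before launching the process instead.
    #include <windows.h>

    void DisableBstrCacheForProfiling()
    {
        typedef void (WINAPI *SetOaNoCacheProc)(void);

        HMODULE oleaut32 = LoadLibraryW(L"oleaut32.dll");
        if (!oleaut32) return;

        SetOaNoCacheProc pSetOaNoCache = reinterpret_cast<SetOaNoCacheProc>(
            GetProcAddress(oleaut32, "SetOaNoCache"));
        if (pSetOaNoCache) {
            pSetOaNoCache(); // subsequent BSTRs go straight to the heap
        }
        // Leave oleaut32 loaded; it stays in the process for life anyway.
    }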

Comments (25)
  1. skSdnW says:

    Why did MS not put this inside a #ifndef _WIN64? What about when you ported to ARM? ARM64 is probably your last chance for it to happen at compile time.

    [That would have made it harder to port code that unwittingly relied on the BSTR cache. Also, it would have been a difference between 32-bit and 64-bit Windows that would be really hard to debug if you happened to stumble across it. Imagine getting a bug that reproduces only on the 64-bit ARM version of the program. Too bad your VMs are all x86 and x64. Gratuitous differences between 32-bit and 64-bit behavior result in things like Pinball being dropped from the product. -Raymond]
  2. grommit says:

    Time for SysAllocStringEx

  3. DavidE says:

    That kind of thing has been around a long time. When I worked at MIPS (1985), one of the compiler folks had rewritten malloc/realloc for better performance, but it turned out that it broke a couple of fundamental Unix utilities (diff, and one other I forget). I don't remember the details, but it seems like the programs relied on realloc always returning the same or a higher-value address, and code relied on that for performance. I know - bad coding in the first place, but we couldn't break things like that if we expected people to buy our systems.

    There was a similar weirdness with nroff/troff that was due to them relying on adjacent global variable declarations to be adjacent in memory. This code was written before structs existed. MIPS had a concept of small and large data segments, and the variables in question used a mixture of sizes. We ended up having to add a compiler directive to force everything into the large segment.

  4. Anon says:

    So many problems in computing are only "hard" because we refuse to abandon compatibility when it is necessary.

  5. Ian says:

    You can disable it per-process via SetOaNoCache() -- we asked for this because the cache tends to grow into a collection of the largest BSTRs your application has encountered over time -- not good for a server application.

  6. Joshua says:

    - can't get rid of the BSTR because

    + can't get rid of the BSTR cache because

  7. skSdnW says:

    @Raymond: Application Verifier could catch this?

    [Cool, Application Verifier found a bug in a program you bought in 2001 from a company that no longer exists. Now what? -Raymond]
  8. Joshua says:

    [Cool, Application Verifier found a bug in a program you bought in 2001 from a company that no longer exists. Now what? -Raymond]

    If it's got a latent heap corruption bug and no fix forthcoming from the vendor, it ought to be on the short list to replace. It's one thing at the application level, where a bwcompat heap and the already-proposed appshim would fix it, but at the plugin level it's just too dangerous to leave around.

    [I think you'll find that in practice, the list to replace is long, not short. (Also, it's expensive to replace something. In addition to procuring the replacement, you have to integrate it into your workflow and retrain all your employees to use the new system. And how do you know the new system doesn't have the same bug?) -Raymond]
  9. Azarien says:

    @skSdnW: I'm pretty sure some new, yet uninvented, architecture is going to emerge sooner or later. Never assume that ARM64 will be the last.

  10. skSdnW says:

    @Raymond: Only MS and a couple of ported open source apps could be affected on ARM. For x86/AMD64 the train has left the station but you could maybe remove/reduce the caching if the app has a Supported OS GUID in the manifest?

    @Azarien: Even today on ARM Win32 is basically dead and MS wants to force you over to WinRT.

  11. 12BitSlab says:

    @ skSdnW

    WinRT, in its current implementation, is merely a layer over the top of Win32.  Even MSFT lacks the resources to convert WinRT to a full OS right now.  They have bigger fish to fry.  Win32 is going to be here for quite a while.

  12. Gabe says:

    skSdnW: You can't remove caching for apps that declare it because they may load plug-ins that don't support it. You may not directly support plug-ins, but standard file dialogs load them.

    Imagine if your app crashed as soon as somebody tried to use the "Save As" dialog. You might want to tell your user that they should just stop using their broken shell extension, but your user will probably just use a competitor's app that doesn't crash. Or more likely your user isn't going to help you debug the problem to determine the root cause; they will just think your app is broken and uninstall it without telling you.

  13. skSdnW says:

    @Gabe: The GUID in the manifest already changes API behavior. If we take "Save As" as an example, msdn.microsoft.com/.../hh848036(v=vs.85).aspx says IPersistFile::Save will fail the call if the path is relative and the app has a Win8 or later GUID. It is not unusual for shell extensions to deal with IPersistFile (IShellLink etc).

  14. IdahoJacket says:

    How could a program rely on the cache behavior?  I could see a performance impact, but not a correctness issue.  It's probably just a lack of imagination on my part, though.

    [SysFreeString(bstr); bstr[0] = L'x'; /* inadvertently relies on the cache */ -Raymond]
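
    A slightly fuller sketch of the failure mode Raymond describes (hypothetical code, just for illustration): with the cache, SysFreeString parks the block rather than returning it to the heap, so the stray write lands in memory that is still committed and nothing visibly breaks; with the cache disabled or removed, the same write scribbles on freed heap memory.

        // Hypothetical illustration: this "works" only because SysFreeString
        // puts the block in the cache instead of returning it to the heap.
        #include <windows.h>
        #include <oleauto.h>

        void InadvertentCacheDependency()
        {
            BSTR s = SysAllocString(L"example");
            SysFreeString(s);   // with the cache: block parked, still committed
            s[0] = L'E';        // bug: write after free; the cache usually hides
                                // it, but without the cache this is heap corruption
        }
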
  15. mh says:

    Is this one of those cases where the cache could be 99% removed and replaced with a "stub" version that behaves as though a cache were present but never actually uses it; e.g: a hypothetical "isBSTRCached" returns false, and adding a BSTR to the cache is just a nop?  Or are there rogue programs that actually go partying on the internals of the cache?

  16. Joshua says:

    [And how do you know the new system doesn't have the same bug?)]

    At least that part's obvious: Application Verifier doesn't find it when doing the same thing. And since my other constraints limited the problem to a library component or a shell plugin, and I believe application vendors should be able to pre-apply shims from the list, it reduces to just the shell-plugin case. I hate buggy shell plugins; they're just too dangerous for everybody. Let's just say that when moving GetOpenFileName and GetSaveFileName out-of-process starts to sound like a good idea, you've got a problem. It's a whole lot easier to run into than calling SetOaNoCache() at program startup anyway. All you have to do is link with /LARGEADDRESSAWARE. Ah gee, that 64-bit compiled plugin somehow managed to *still* not be LARGEADDRESSAWARE (I've seen it).

    [Corporations with this problem tend to have very few choices available. (These are niche products, not mainstream packages. And very often, it's a custom program written just for that company; there is no alternative at all.) It is likely that every possible replacement also has problems. Now what? -Raymond]
  17. Mark says:

    mh: there may well be, but the more obvious problem is use-after-free, because cached entries don't move in memory.

    [Yup, I sort of mentioned that in the linked article ("heap corruption bugs"). -Raymond]
  18. Antonio 'Grijan' says:

    I wish Windows Explorer provided a shell extension manager the way web browsers do with plugins and extensions, or at least warned you when somebody tried to install one. I'm careful, but I've been bitten many times. The last time, a few weeks ago, a shell extension from a security suite that came with a fingerprint reader started crashing Explorer overnight, every time it tried to display the context menu for any link. Six months after being installed. Go figure who is to blame when you haven't installed anything recently: it took me a while of fiddling with RegEdit to find it. Definitely out of reach for the average user, and even for many brick-and-mortar computer shop technicians. And yes, there are utilities that can help you. But if you don't know they exist (like the average user), they are of no use.

  19. Ben Voigt says:

    "Cool, Application Verifier found a bug in a program you bought in 2001 from a company that no longer exists. Now what?"

    Well, at least I'd know that the applications I'm buying in 2015 didn't have this (class of) bug, thereby preventing problems in the future.

    The fact that information from AppVerifier would be moot (non-actionable) in a significant fraction of cases is no excuse for not performing the test at all.

    Also, a system where Open/Save dialogs didn't load plugins would not be terribly problematic.  Such a system would surely have a button (with keyboard shortcut) for "Open Explorer Window here", where the plugins would be loaded, which would handle pretty much all cases where plugins are needed.  And the improved performance of Open/Save dialogs in the 95% use cases should balance the additional effort needed to load the full interface the 5% of the time when you intend to interact with a plugin.  That additional effort being pretty low, since it's not spawning a new process, just requesting the already-running shell to create a new window.  In fact, since plugins have already run their initialization code in the shell process, the overall performance might be better even when the Explorer window does get opened... and then plugin authors wouldn't need to mess with long-running helper services to cache metadata to reduce per-process initialization time (thinking of TSVNcache.exe, but there are others).

  20. Nico says:

    [Cool, Application Verifier found a bug in a program you bought in 2001 from a company that no longer exists. Now what? -Raymond]

    Sometimes it's true that you're just stuck and can't do anything.  However, the further away from 2001 you get, the more likely it is that you can or must do something anyway.  It's also the case that ignorance leads to stagnancy: "We still use Foo Deluxe 2001 because it has no bugs and is compatible with Windows X".

    In some ways compatibility constraints are just a spiral into mediocrity.

    [If compatibility were abandoned, then the solution that most companies choose would probably be "Don't upgrade Windows." In a sense, the behavior was preserved in 64-bit Windows in order to preserve source code compatibility. "You want to port your app to 64-bit Windows? Great. Just fix these pointer truncation problems, and you're all set!" -Raymond]
  21. Anonymous Coward says:

    There are two options (for Microsoft, not for Raymond)... either the behaviour is removed at some point, or it isn't.

    If it isn't, then it will carry on to the very last version of Windows.

    If it is, then the x64 transition would have been a good time to do it. Not a perfect time, but when every application has to be thoroughly tested and/or code-reviewed anyway (to catch things like converting pointers to DWORDs), a good time. There's no mixed-bitness DLL loading, so you would know there are no "program turns off BSTR caching; DLL relies on it" problems, as "program turns off BSTR caching" implies a 64-bit program and "DLL relies on it" implies a 32-bit DLL. If someone had a legacy program from 2001, then good for them, it will keep working because it's 32-bit.

    Not that Raymond has anything to do with that decision.

    @Nico: when it's 2050, then people will be using programs written in 2035 that rely on BSTR caching (if Windows is still around).

  22. Gabe says:

    Anonymous Coward: The harder you make it to port apps to 64-bit, the fewer people who will do it.

    "Why don't you have a 64-bit version of your app? It's just a simple recompile!"

    "We tried that, but we couldn't figure out why the 64-bit version randomly crashed, so we abandoned it."

    I have a client who started upgrading to Windows XP in 2007. They just finished their upgrade to Windows 7 sometime last year. They are in the middle of moving off IE8 right now. Now imagine what the upgrade schedule for this organization would look like if MS didn't care so much about backwards compatibility!

  23. 640k says:

    @Nico: Haven't you learned anything from reading this blog? Windows is optimized for mediocrity, optimized for old buggy programs from last millennium. When the platform team is confronted with a bug, they *always* choose backwards compatibility, never to ease development of new applications.

    This is known as technical debt, and it makes the resource requirements for all new development escalate exponentially in the long term. In the short term it often turns a profit, but this has been going on for ages, at least since Win32s.

    The win32 api has had many chances to "upgrade" to clean 64-bit, but this path is never chosen, and it will never be chosen. If you want a clean 64-bit api you have to look elsewhere.

  24. Anon says:

    @Gabe

    When we finally start fining companies dozens of billions, or even trillions, of dollars for knowing refusal to maintain security to industry standards, they'll actually start upgrading -- and fixing security flaws.

    [And then people will complain that the government is interfering too much. My personal physician's office still runs on Windows XP. If you fine him billions of dollars, he'll go bankrupt. (Actually, more likely is that his malpractice insurance premiums will skyrocket and he will have to stop practicing medicine.) -Raymond]
  25. Anon says:

    @Raymond

    $50/ea to replace the machines in that office with refurbished Pentium-D based machines running Win7.

    $100/ea to replace the machines with Core2-based ones.

    But, of course, it costs $0 to put your patient data at risk.

    Granted, the machines are already infected. In general, if your identity is ever stolen, the most likely culprit is your CPA (Many of them are still running Win9x) or doctor's office. (Not minor data breaches, mind, but full-scale ID theft.)

    [Large clinics tend to set things up so that every computer is identical. It's hard enough supporting a network of 1000 identical computers; now imagine if every single one of those machines is running different hardware. (Especially since the specialized software may be supported only on certain types of hardware.) At least they shut off Internet access to all the computers, and no USB sticks are allowed. -Raymond]

Comments are closed.
