Still more misinformation about virtual memory and paging files


The wired network in my building’s being unusually flakey, so I’m posting this from my laptop; sorry for the brevity.

Slashdot had a front page story today about an article by Adrian Wong posted in his Rojak Pot: “Virtual Memory Optimization Guide”.

I’ve not finished reading it (the site’s heavily slashdotted), but his first paragraph got me worried:

Back in the ‘good old days’ of command prompts and 1.2MB floppy disks, programs needed very little RAM to run because the main (and almost universal) operating system was Microsoft DOS and its memory footprint was small. That was truly fortunate because RAM at that time was horrendously expensive. Although it may seem ludicrous, 4MB of RAM was considered then to be an incredible amount of memory.

4MB of RAM?  Back in the “good old days” of 1.2MB floppy disks (those were the 5 1/4″ floppy drives in the PC/AT) the most RAM that could be addressed by a DOS based computer was 1M.  If you got to run Xenix-286, you got a whopping 16M of physical address space.

I was fuming by the time I’d gotten to the first paragraph of the first section:

Whenever the operating system has enough memory, it doesn’t usually use virtual memory. But if it runs out of memory, the operating system will page out the least recently used data in the memory to the swapfile in the hard disk. This frees up some memory for your applications. The operating system will continuously do this as more and more data is loaded into the RAM.

This is SO wrong on so many levels.  It might have been true for an old (OS8ish) Mac, but it’s not been true for any version of Windows since Windows 95.  And even for Windows 1.0, the memory manager didn’t operate in that manner: it was a memory manager, and it was always enabled and actively swapping data in and out of memory, but it didn’t use virtual memory hardware (since there wasn’t any hardware memory management available to Windows 1.0).

It REALLY disturbs me when articles like this get distributed, because it shows that the author fundimentally doesn’t understand what he’s writing about (sort-of like what happens when I write about open source 🙂 – at least nobody’s ever quoted me as an authority on that particular subject).

Edit: I’m finally at home, and I’ve had a chance to read the full article.  I’ve not changed my overall opinion of it: as a primer on memory management, it’s utterly pathetic (and dangerously incorrect).  Having said that, the recommendations for improving the performance of your paging file are roughly the same as the ones I’d have come up with if I were writing the article.  Most importantly, he explains the difference between having a paging file on a partition and on a separate drive, and he adds some important information on P-ATA and RAID drive performance characteristics that I wouldn’t have included if I were writing the article.  So if you can make it past the first 10 or so pages, the article’s not that bad.

 

Comments (26)

  1. Anonymous says:

    Never believe anything you read on /. — it’s a shadow of its former self; mostly populated by a bunch of trolls and ne’er do wells, the editors are so far up themselves they’re in danger of becoming Klein bottles, and it’s nothing more than a mouthpiece for press releases these days.

  2. Anonymous says:

    Even Linux has the idea of a working set, although the Linux algorithm (http://www.linux-tutorial.info/modules.php?name=Tutorial&pageid=311) appears to be based on the number of free pages in the system as a whole, unlike NT which IIRC treats each individual working set (each process plus the system working set) separately.

    My source as always is "Windows Internals, 4th Edition" by Mark Russinovich and David Solomon.

    It’s worth being clear that the system working set (‘Memory: Cache Bytes’ performance counter or ‘System Cache’ in Task Manager) does not just cover the file system cache, but all kernel-mode pageable code and data, including paged pool. Some memory is double-counted in ‘Available’ and in ‘System Cache’ in Task Manager – I’ve actually seen the sum of the two exceed the actual amount of memory available on my 1GB system at work.

  3. Anonymous says:

    Yes; why don’t we bitch about his guide… But wait, where is *your* optimization guide?

    PS – You need to write to him and inform him that a course in MS Dos 4.0 is a prerequisite for publishing articles on the internet.

    (I’m sorry if this comes off a little trollish but so do your comments on his article)

  4. Anonymous says:

    Larry,

    As your comments above seem reasonable and you seem to know a thing or two perhaps you can help out. I have been looking for a way to make WinXP limit the size of memory it will use for Disk Caching functions. Win9x was configurable through the system.ini with:

    [vcache]

    MinFileCache=4096

    MaxFileCache=32768

    This was a wonderful adjustment, given that Windows will try to use all "free" memory that is available for disk cache, when anything over what is now a small amount of memory gives very negligible gains, if any.

    It’s a pretty sorry statement when you have three programs running and one of them is writing a lot of data to disk (Photoshop), and windows decides to swap your browser or mail program out to disk because 700 MB just isn’t enough disk cache.

    Any ideas? Or contacts that could solve this problem?

  5. Anonymous says:

    Shouldn’t one always read EVERYTHING that the writer has to say before giving out comments?

    Plus, at least he tried to write the article. Not accurate? So what, help the author correct it to be so and help save the world from further condemnation with wrong information!

    I just keep seeing bashing saying I’m right and you’re wrong but no further actions to help right the wrong…. is that smart? I think not.

  6. Anonymous says:

    If it’s any consolation, the article was written in 1999. And throughout the article claims are made but not backed up with empirical evidence. We can see from pretty pictures how a pagefile or swap file or whatever it is you wish to call it can be moved from across the disk to some edge, and we can see from a graph that the outer edges of a disk will run faster, but from those two graphics the author infers that moving the swap file gives a 16+ percent improvement. Sure, there’s some theory on seeking and reading added, but there are also latent assumptions about the workload strewn in there.

    The biggest problem with the article is that it needs a big "In theory," prepended to every statement.

  7. Anonymous says:

    Actually the date on the post is December 2004, not 1999 – that’s only 4 months old.

    And I tried to read the article. But 5 minutes/page load (mostly due to a dead lan connection that was massively impeding my ability to get real work done) kinda took the wind out of it for me.

    I gave up after page 4.

    Manip, my comments were 100% trollish. And I HAVE written about memory management before:

    http://blogs.msdn.com/larryosterman/archive/2004/03/18/92010.aspx

    And the correction article I wrote to call out the screw ups in the first one:

    http://blogs.msdn.com/larryosterman/archive/2004/05/05/126532.aspx

    I’m QUITE sensitive to this particular issue, having had my hand slapped quite hard on it (the dev manager in charge of memory management, in an email to Brian Valentine, used my article as an example of the unbelievably bad information about memory management that’s being put out on the web).

    So when I see popular forums promoting this kind of drivel, I get annoyed.

    If y’all would like, I’d be more than happy to do a sentence-by-sentence set of corrections, but it’d just come off as being even more trollish – the article’s got that many errors.

    Having now finally finished the article (from home, where the net works :)), once he gets off his discussion of how virtual memory works and starts discussing how to relocate a paging file and where to put it, his article isn’t that bad. Essentially once he gets beyond his discussions about permanent and semi-permanent paging files, the final parts of his discussion (where he discusses the various issues related to spindles etc.) don’t appear to be that incorrect (although I’m not sure about the RAID stuff).

    Alan, check the two articles I linked above – the reality is that Windows will use your paging file as much as you’re asking it to use, and no more. Windows, in general, won’t ever page out an application unless some other application is using more memory than is physically available on the machine – the only time pages will be discarded from the standby list is if someone else (photoshop, in your case) needs them.

    And Windows tries REALLY hard to ensure that it doesn’t throw pages out that it doesn’t have to.

  8. Anonymous says:

    Memory management is a difficult subject.

    A certain large company keeps its details secret.

    As Raymond demonstrated in his "how much memory can your app access" series (a per-process view of memory), most comments were about people’s machines.

    How about you go get the "dev manager in charge of memory management" to answer common technical questions.

    In the absence of "official" data all I can say to "how do I disable paging to make my paid for app run faster" is "the processor is designed to page, the chipset is designed to page, Windows is designed to suit the hardware, so is also designed to page". This is not really an answer so I normally avoid debates.

    Theory is supposed to be a guiding thought for action. So how does one apply an unknown implementation to guide one’s actions in common user scenarios?

  9. Anonymous says:

    Larry, I haven’t read the linked article but I don’t understand your criticism regarding the paragraphs you quoted.

    As for the 4 MB in a DOS computer thing, are you unaware of an old technology called DOS extenders? EMS/XMS drivers certainly allowed DOS to access 15 MB, if in a rather inconvenient way (page swapping). I know I had and used more than 1 MB RAM in my 286/386 DOS systems… hard disk caches were a popular candidate for XMS storage.

    As for the second paragraph, it sounds certainly "good enough" as a high-level description on what virtual memory does. Or did the author claim to deliver a detailed description of the NT kernel? Or what exactly is your criticism?

  10. Anonymous says:

    I’m having problems with paging right now. The latest WinFX CTP shipped as a 460mb MSI file. It takes an incredible amount of time to install, during which the PC thrashes like crazy.

    I have 512mb of RAM, with normally 300 or so free. When installing the CTP, first explorer, then two msiexec processes in turn allocated about 400mb each to process the install script. It took well over an hour just to extract the MSI, which only contained a 455mb ISO image, a readme and an EULA. My peak commit rose to 824mb.

    I think I would far rather have been able to download the ISO image separately and saved all the hassle.

    I don’t think the Windows installer system was designed to support such large files. The wrong tool for the job.

  11. Anonymous says:

    I think what the author said about the OS gradually paging apps out to disk as data is loaded is correct, but not for the stated reason.

    Try typing this in an NT command prompt with a dir that has a lot of big files:

    for /r %x in (*) do @copy "%x" nul

    CMD itself uses a negligible amount of memory but you’ll find that most of your other apps are paged out by the time it finishes. My understanding is that the NT memory manager treats the cache manager like any other process, so when it sees the cache manager memory-mapping a lot of data, it gradually enlarges the cache’s working set, paging other apps’ memory out in the process. As Alan noted above, this happens in Windows 95/98 as well, even though the VCACHE architecture is different.

    In other words, I don’t think your assertion that Windows only throws out pages when programs need them is quite correct — when the disk cache is involved, Windows will also page when it thinks the app *might* need more memory for caching. This behavior was very useful in the days of Windows 95 when memory was very tight and paging memory to increase the disk cache from 1MB to 4MB helped considerably, but nowadays doing it to enlarge the cache from 500MB to 800MB isn’t as helpful.

  12. Anonymous says:

    Edward:

    What OS (and service pack) were you using? Running anything else at the same time or was this solely because of the install? How ’bout your hardware: recent-ish CPU and enough defragged room on your HDD that those couldn’t have been bottlenecks? And this hour was install time alone, not including download time?

    I can’t publicly promise you many answers about installation performance, but I might be able to get the installer folks to look at it. Or the team that shipped the CTP. An hour to install their package seems way too long.

    Setup lag (especially w/ large files) makes me nervous because my team’s code is in their critical path. We (cough) map a view of an entire .exe or .dll to hash it. I doubt that’s a bottleneck that would make your installation take so long, but it could very well explain why your peak memory usage was so high.

    That was an implementation detail (which may change in future releases) but it ought to be so incredibly obvious to anyone who runs WinVerifyTrust calls under a debugger or even verifies signatures on big PEs that I’m assuming I don’t need to talk to lawyers, my bosses, or HR tomorrow morning. 🙂
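    Roughly, the map-the-whole-file-and-hash approach looks something like the sketch below. This is purely illustrative (the byte-sum stands in for a real hash, and it’s not the actual verification code), but it shows why peak memory usage ends up tracking the size of the file being verified:

    #include <windows.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        if (argc < 2) { printf("usage: %s <file>\n", argv[0]); return 1; }

        HANDLE hFile = CreateFileA(argv[1], GENERIC_READ, FILE_SHARE_READ, NULL,
                                   OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
        if (hFile == INVALID_HANDLE_VALUE) return 1;

        LARGE_INTEGER size = {0};
        GetFileSizeEx(hFile, &size);

        /* One mapping, one view, covering the entire file. */
        HANDLE hMap = CreateFileMappingA(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
        const unsigned char *p = hMap ?
            (const unsigned char *)MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0) : NULL;
        if (!p) return 1;

        /* Touching every byte faults in every page of the file, so the
           working set (and standby-list churn) grows with the file size. */
        unsigned int sum = 0;
        for (LONGLONG i = 0; i < size.QuadPart; i++)
            sum += p[i];

        printf("checksum: %08x\n", sum);
        UnmapViewOfFile(p);
        CloseHandle(hMap);
        CloseHandle(hFile);
        return 0;
    }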

  13. Anonymous says:

    Phaeron,

    In that case, some program needed to use the memory. Now it only needed to use it to copy a file, but it DID need to use the memory.

    Having said that, there IS an issue that I purposely ignored. If you have a boatload of low frequency threads running (like a bunch of tray icons that wake up to check the internet to see if a new version of software’s available), then the system needs to page those apps in.

    And that may, in turn, cause the pages that hold your application to be paged out. Even though you, the user, weren’t using those pages, someone wanted to use them.

    This particular issue’s been one of great concern to the memory management people in Windows – they know that this leads to highly negative user experiences and they’re working really hard to fix it.

  14. Anonymous says:

    Sorry for the length of this post. One thing led to another and…

    Drew wrote:

    <i>We (cough) map a view of an entire .exe or .dll to hash it. I doubt that’s a bottleneck that would make your installation take so long,</i>

    I don’t. If the MSI file is in this case considered to be an exe and you map it, obviously the memory manager will happily swap new pages into your process working set when you touch them sequentially. Since the previous page(s) in this file are likely the most recently touched in the system, this will basically uselessly grow, and grow, and grow the physical memory used – even if you in reality need nothing more than a sliding window of, let’s say, 64KB – and wreak havoc on any system that doesn’t have free RAM at least as large as the mapped file (and then some).

    The way I think it’s best solved is by just not mapping such insanely large files, but instead using CreateFile non-cached. Unfortunately one can’t say "not really cached, but still do a little read-ahead (by setting the sequential flag)", which is why it has to be non-cached.

    To optimize, one could allocate two buffers (say 1MB each) and switch between the two when using asynch I/O – fill one buffer while performing calculations on the other.
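    A sketch of what that could look like, using FILE_FLAG_NO_BUFFERING so the data never lands in the system cache and overlapped reads so the next chunk is fetched while the previous one is processed (buffer size, error handling and the process() placeholder are illustrative only):

    #include <windows.h>

    #define CHUNK (1024 * 1024)   /* 1MB per buffer; a multiple of the sector size */

    /* Placeholder for whatever is done with each chunk (hashing, comparing, ...). */
    static void process(const BYTE *data, DWORD len)
    {
        (void)data; (void)len;
    }

    BOOL ReadLargeFileNonCached(const wchar_t *path)
    {
        /* Non-cached + overlapped: the file never pollutes the system cache. */
        HANDLE h = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                               OPEN_EXISTING,
                               FILE_FLAG_NO_BUFFERING | FILE_FLAG_OVERLAPPED, NULL);
        if (h == INVALID_HANDLE_VALUE) return FALSE;

        /* VirtualAlloc returns page-aligned memory, which satisfies the
           sector-alignment requirement of FILE_FLAG_NO_BUFFERING. */
        BYTE *buf[2];
        buf[0] = (BYTE *)VirtualAlloc(NULL, CHUNK, MEM_COMMIT, PAGE_READWRITE);
        buf[1] = (BYTE *)VirtualAlloc(NULL, CHUNK, MEM_COMMIT, PAGE_READWRITE);

        OVERLAPPED ov = {0};
        ov.hEvent = CreateEventW(NULL, TRUE, FALSE, NULL);

        ULONGLONG offset = 0;
        int cur = 0;
        BOOL ok = TRUE, more = TRUE;

        /* Prime the pipeline with the first read. */
        if (!ReadFile(h, buf[cur], CHUNK, NULL, &ov) &&
            GetLastError() != ERROR_IO_PENDING) {
            if (GetLastError() != ERROR_HANDLE_EOF) ok = FALSE;
            more = FALSE;
        }

        while (more) {
            DWORD got = 0;
            if (!GetOverlappedResult(h, &ov, &got, TRUE)) {
                if (GetLastError() != ERROR_HANDLE_EOF) ok = FALSE;
                break;                      /* end of file, or a real error */
            }
            offset += got;
            int done = cur;
            cur ^= 1;

            more = (got == CHUNK);          /* a short read means we hit EOF */
            if (more) {
                /* Kick off the next read into the other buffer ... */
                ov.Offset = (DWORD)offset;
                ov.OffsetHigh = (DWORD)(offset >> 32);
                if (!ReadFile(h, buf[cur], CHUNK, NULL, &ov) &&
                    GetLastError() != ERROR_IO_PENDING) {
                    if (GetLastError() != ERROR_HANDLE_EOF) ok = FALSE;
                    more = FALSE;
                }
            }
            /* ... while this buffer is being processed. */
            process(buf[done], got);
        }

        CloseHandle(ov.hEvent);
        VirtualFree(buf[0], 0, MEM_RELEASE);
        VirtualFree(buf[1], 0, MEM_RELEASE);
        CloseHandle(h);
        return ok;
    }

    Note that with FILE_FLAG_NO_BUFFERING the transfer sizes and file offsets have to stay multiples of the sector size; the 1MB chunk and the page-aligned buffers take care of that here.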

    It will make the code more complex – once, and once only. The end-user benefits should stand out like the sun burns in your eyes after a long night of hacking (which, coincidentally, is way more time than is required to implement this). 🙂

    One could probably even do this by using two mapped sections, 64KB each, and alternate between them when trying to access the data by catching the memory exceptions and remap the other buffer and… The sky^H^H^HWin32 API is the limit. 🙂

    Furthermore, since an installation program knows (or at least should know) it will only read the source files once, and then they are likely not to be used again in foreseeable time, it also knows those files should be opened non-cached.

    Having been hurt by NT5.0 fc.exe while comparing large files, I realized Microsoft hadn’t at all considered the issue of mapping (the whole of) files way larger than available (or even free) RAM. At least I hope it’s an oversight; it would be hilarious (as in "how the heck could you miss THAT?" with an evil grin) if it wasn’t hurting so much (in lost time).

    Larry, I don’t agree that copying files to nul needs to cache the files (thereby messing up the whole memory behaviour on the system for a while). Why can’t it open them non-cached? I don’t see any read-ahead issue here, since the process hasn’t got anything to do but wait for I/O anyway. OK, a user might use this behaviour to explicitly put files in the cache. If that’s a concern, could perhaps a new flag ("/nocache"?) be added to e.g. fc.exe, xcopy.exe and cmd.exe’s copy command, just to name a few?

    I might also add that I think for copy operations, the application (or library) itself is often in a way better position to judge how caching should be done. Consider copying files (especially large files) between two volumes, residing on the same disk vs. on different disks.

    Copying files around on the same physical disk could greatly benefit from the application/library doing the copying doing its own buffering (f.ex. CopyFile(), xcopy.exe, cmd.exe’s own copy, …), filling up a buffer a bit larger than 64 or 256KB before writing that buffer to another place on the same disk. This I know, since I have written copy functions several orders of magnitude (!) faster than what many, incl. MS itself, use in many places (installers are historically horribly bad in this area – installers originally intended for packages of "reasonable" sizes still treat files several gigabytes in size the same as a 2KB file).
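    For illustration only, a minimal version of such a copy loop (the 8MB buffer size and the flags are placeholders; a real implementation would tune the size and handle errors more carefully):

    #include <windows.h>

    #define COPY_BUF (8 * 1024 * 1024)   /* one big chunk instead of many 64KB seeks */

    BOOL BufferedCopy(const wchar_t *src, const wchar_t *dst)
    {
        HANDLE in = CreateFileW(src, GENERIC_READ, FILE_SHARE_READ, NULL,
                                OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
        if (in == INVALID_HANDLE_VALUE) return FALSE;

        HANDLE out = CreateFileW(dst, GENERIC_WRITE, 0, NULL,
                                 CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        if (out == INVALID_HANDLE_VALUE) { CloseHandle(in); return FALSE; }

        BYTE *buf = (BYTE *)VirtualAlloc(NULL, COPY_BUF, MEM_COMMIT, PAGE_READWRITE);
        BOOL ok = (buf != NULL);

        /* Fill a large buffer from the source, then write it all at once, so a
           same-disk copy spends its time streaming rather than seeking. */
        while (ok) {
            DWORD got = 0, put = 0;
            if (!ReadFile(in, buf, COPY_BUF, &got, NULL)) { ok = FALSE; break; }
            if (got == 0) break;                      /* end of file */
            if (!WriteFile(out, buf, got, &put, NULL) || put != got) ok = FALSE;
        }

        if (buf) VirtualFree(buf, 0, MEM_RELEASE);
        CloseHandle(out);
        CloseHandle(in);
        return ok;
    }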

    When copying files between two physical devices, buffering becomes less of an issue and streaming is what counts more. I believe there are even MSDN samples of how to do this. :->

    Either way, bogging down the cache (and therefore the whole system) when copying, or even just reading files so large they can’t ever fit the memory (or even the cache) of a system, especially for sequentially read/write operations like when copying a file or calculating a checksum, is just dumb. Turn off caching for those file operations.

    When copying in general (a number of) files only a fraction of the size of cache-granularity (assuming the still hardcoded 256KB virtual address space granularity I read about somewhere), especially if done by an installer that should know the source files will not be used again, I think it should do its own caching of them.

    (while on the subject, I’d like a new FSCTL for NTFS – to tell it to use more memory when relocating a file during defrag operations – when 100MB are free, I see little reason not to use at least a 10MB buffer, instead of the current seek-time exercise for HDs).

  15. Anonymous says:

    Larry: there is a typo in your text: "fundimentally"

    Mike Dimmick: working set is not "an idea", it is a precisely defined replacement algorithm developed by Denning (http://cne.gmu.edu/pjd/PUBS/bvm.pdf). According to publicly available information, the Windows kernel uses a modified version of the working set algorithm.

    Linux, on the other hand, uses MACH/BSD active/inactive queues and physical scanning as a basis for its per-zone page replacement, which is _quite_ different from working set.

    2.2 Linux kernel did use virtual-based scanning that is somewhat closer to working set.

  16. Anonymous says:

    I understand that there are many misleading articles on the Internet but I can’t understand why you did not bother to read the entire article before spewing vitriol… Just food for thought.

    Also, I don’t know much about computers but I used a 286 machine in the "good old days" and what do you know, it had 2MB of memory. So, I think you would be wrong in saying that old DOS PCs can only address 1MB of memory.

    I think instead of being such a troll (sorry, had to be honest), write to him and help him correct his mistakes, instead of sitting here on your "throne" and spewing vitriol.

  17. Anonymous says:

    Mike: Yeah, we realize that it’s a problem. Unfortunately when the devs working on IE3-or-so wrote the code they didn’t have large files in mind. And until recently this code "just worked" or at least nobody complained, so the team was focused on solving other problems.

    Based on testing with huge PEs, I still doubt that signature verification is the root cause behind an install taking an hour.

  18. Anonymous says:

    Q1

    As a lively couple, even with a plenty of RAM, windows can’t live without a pagefile. But as in a normal lively couple, windows pushes the pagefile far far away and divided it for many fragments. I looked at many pc, and even at 200 GB hd with only ~5% occupied, the pagefile is located near the partition end while all other files at the beginning, that forces the magnetic heads to toss through all hd decreasing system performance. Such situation is with both winxp’s and win2003.

    Programs such as “pagedfrg” defragment pagefile to one fragment, but also place it near the partition end. I believe that it’s better not to use “pagedfrg” as one remote fragment would be worst than pagefile fragments located close to system files, or better distributed between all files to have access to closest pagefile fragment.

    I can manage to place pagefile at the partition beginning between system files and all others, but in a multi stage boring way described in the manual for easy Linux and windows co-existing.

    http://www.knoppix.net/forum/viewtopic.php?t=17024

    I posted a similar question before, but have not got clear explanation. May be somebody knows why windows fragment pagefile and place it far far far away from system files??? Is there any normal way to place pagefile close to system files??

    I’ll appreciate any information

    Best, Alex

    ==================================

    Q2

    > Can this file, if something is wrong with it slow a system down?? I don’t get a message saying it’s corrupted or anything.

    > BUT it kind of lags, when loading programs most of the time.

    > There are no viruses, or spyware on this system.

    > Its a P4 2.4 533 mhz mobo/CPU, and 1GB of ram.

    > So memory, or speed definitely isn’t an issue.

    > Anyone think of something I can do to fix this prob??

    > I could disable virtual ram, do u think this would fix it?

    > Any info appreciated

    > Thanx

    ==================================

    Q3

    "pj" <pj@discussions.microsoft.com> wrote in message

    news:2747A75C-8EC8-4AA9-97FB-979787724A34@microsoft.com…

    > I occaisionally get a message that windows in increasing my virtual memory. I have 512 of ram, and typically have one program application and a handfull of web pages open. Is there anything I can do to optimize my system performance.

    If you get another stick of RAM your PC will probably not have to use Swap Files (virtual memory) at all and your PC will run faster. Alternatively monitor your Swap File size (not Swap File usage) and set minimum Swap File size to the maximum that the Swap File size grows to plus 100 Mb while leaving Maximum Swap File size at No Maximum. That way you will reduce fragmentation of Swap Files while still allowing Windows to increase them if needed. If Windows then needs to increase Swap File size then increase the minimum. Getting extra RAM is easier and more efficient though.

    Rob

    =========================================

    Q4

    > 1. Can I move win386.swp from the C drive to another hard drive?

    > 2. Why do I need it anyhow? I have 512 Mb of RAM and my graphics

    > programs do not use it.

    > 3. If I simply delete it, is there any way to get it back if I get

    > the unpleasant surprise that some program is calling for it?

    =====================================

    These are 4 questions from the last 24 hours of the newsgroups. I only have a few message bodies downloaded so there are more (I searched on pagefile.sys).

  19. Anonymous says:

    Larry,

    Phaeron pretty simply summarized the issue. Sure the disk cache *thinks* it needs pages, and to an extent it does, but being a cache it has no idea if the pages will serve any purpose in the future. The point being that it is doing this at the expense of other pages that belong to applications. I would agree with you that the OS really doesn’t have any more reason to believe those pages would ever be needed again either. So it is a difficult problem.

    My point is that it’s fairly well proven that cache efficiency gains drop off very rapidly and the amount of gain you get in performance after a certain size is very minuscule. To that extent I would certainly like to limit my disk cache size to something reasonable, to stop the memory manager from wastefully throwing away pages that *I know* will be used again for pages that *I know* won’t be. I realize it doesn’t know this.

    In general I think the behaviour will benefit just about everyone; if Microsoft doesn’t believe so they can leave the default however they want to. I just want a way to fix it so I don’t have to suffer along with everyone else.

    Mike,

    The point isn’t to fix the copy command; that was just an example that demonstrates this problem in a simple fashion. It will happen when any number of applications are running. For instance when I run Photoshop, it will page in some huge image (that was paged out to accommodate a disk cache expansion) so that it can write it out to disk, all the while the memory manager grows the disk cache to accommodate it while throwing those pages (or worse, pages that were just loaded) away. Then when I go back to editing the image again it has to reload it. The net effect is that you need over 2x the memory of the actual in-memory image. Quite wasteful, especially so when you don’t quite have that much memory.

  20. Anonymous says:

    I did a search on the writer. He doesn’t sound as stupid as you made him out to be. He even has a book on the BIOS on Amazon.com. Looks interesting. I may just buy it and see how good it is. Are you sure you are not being too harsh? You do sound like you have quite an ego. Can you prove he’s wrong and you are right? Just wondering.

  21. Anonymous says:

    One of the problems with disk caching is that modern disks throw a monkey wrench into the problem: most now include a very large amount of cache on their own side, so OS-side disk cache is less useful than before, now that writing over an IDE/SCSI cable is just a (slow) direct memory copy. This makes things much faster and less fragmented, without needing nearly as much intervention from the OS. Maybe new versions of Windows should query the drive and step back if it has enough of its own caching.

  22. Anonymous says:

    Jacques,

    You’re right, I do have an ego. And I’ve been raked over the coals for making the same kind of lame-brained mistakes as he did in the article.

    First off: "Whenever the operating system has enough memory, it doesn’t usually use virtual memory". This is an UTTERLY stupid comment. Every 32bit operating system today ALWAYS uses virtual memory. Otherwise you’d not be able to fully address the 32bit address space (ok, 31.5 bit) in your process – virtual memory is what makes that work.

    Second, "back in the days of 1.2M floppy drives". By the time that Windows 3.0 had come out (which was the first version of Windows that could take advantage of more than 640K of RAM), the standard for floppy drives was the 1.44M 3 1/2" floppy, NOT the 5 1/4" 1.2M floppy. MS-DOS applications couldn’t address the additional memory (that’s why memory expanders were so popular – they allowed apps to access more than 640K of RAM).

    Those are just two of the comments that set me over the edge. His distinction between a swap file and a paging file might have had relevance for Win16 (where data segments were swapped to disk) but has no relevance in modern 32bit operating systems (including Linux). And even Windows 3.0 had a demand paging system for VxDs, I believe.

    His discussion about operating systems like Win9x and NT allocating significant amounts of RAM to the disk cache is naive at best – in fact, the file cache on NT (at least; I’m not sure about Win9x) uses memory management under the covers (that’s how the disk cache automatically gives back pages when the system’s under memory pressure – since read-only pages in the cache are just on the standby list, the memory manager doesn’t have to do anything special to get those pages back from the cache manager; it just discards the pages from the standby list and repurposes them).

    I can go on if you’d like. As I said – once you get past the discussions about how memory management works, the article’s ok. But until that point, it’s garbage.

  23. Anonymous says:

    First of all, thanks Larry for patience with this, and to all others for a good discussion.

    Question; this file caching thing. Is there merit to it? I gotta admit, when doing large copies, eventually switching apps does seem to be affected, beyond load reasons. Perhaps the file caching is causing swap-outs of code that is not really too inactive? Or am I talking out of my ass.

  24. Anonymous says:

    Once again a small comment fleshed out into… this. 🙂

    James Risto wrote:

    this file caching thing. Is there merit to it?

    I don’t know what I considered more funny – calling it "this thing" or even questioning if there’s merit to it. 🙂

    But seriously; yes, caching can have a dramatic (positive) effect on performance. The cache is, when a cache-hit occurs, usually several orders of magnitude faster (think 10^3-10^6 times faster) than having to read data from the disk (or redirector, or whatever). Besides performance it also has other positive effects, such as reducing the mechanical work disks have to do, reducing overall network usage (if you’re using e.g. an SMB environment) and so on. This makes disks last longer, less energy (uselessly) be used, and other users on your LAN segment suffer less. 🙂

    But this all assumes cache-hits. It unfortunately takes just one single (bad) application opening and accessing a single (large) file in cached mode to completely destroy these positive effects for all other applications, and indeed the whole system itself.

    Things I’d like to see researched, with results presented – or had the option presented itself (*), research myself – would be:

    – Limit cached data of (sequentially) accessed "large" files to, say, only the last two 64 or 256KB blocks (or more, depending on how much RAM is currently free – but still marking these pages as "very old"). This would catch "bad" installers (such as the stuff Drew mentioned ;-> ).

    – Allow for a "sliding window", incl. read-ahead for "large" files, perhaps especially mapped files accessed sequentially (is currently the read-ahead thread of the cache manager even involved in memory-mapped access of files?).

    – Prefer discarding recently read data (in the cache) for a sequentially accessed file, over discarding any other data.

    – Prefer discarding R/O data segments over code, or the other way around.

    – Prefer keeping directory entries.

    – Use fast (!) compression for data scheduled to be put in the pagefile, over actually writing it to disk. I read some research done using Linux re. this, and IIRC they found data to be swapped out was often the data needed again very soon, while the data just read (that caused the swap out) was more unlikely to be needed again. I immediately considered LZO for its fast compression.

    Especially the compression idea has been haunting me for over half a decade now, since I know from experience it can help _very_ much once you start to run out of RAM "just a little bit".

    (*) Had the NT kernel been open source (I mention this only for the technical/research/engineering aspects – I don’t have a political motive, even if it provides a "real" argument for a comment to jasonmatusow) it would have been possible to research, and share the experiences of, such changes between sites and usage scenarios way more diverse than Microsoft likely can consider.

    Even if such research and tweaking would likely initially only benefit the ones doing it, and the ones with similar scenarios, it could have displayed useful patterns – not to mention testing tools – that Microsoft and all its customers could benefit from (I can easily imagine an AppPatch flag, perhaps most suited for installers, saying "don’t cache large files from this app, even if it’s so dumb to request cached access" 🙂 ).

    Anyway, as such a scenario (open sourced NT kernel) seems unlikely, perhaps the ideas could serve as a seed for possible areas to look at in the cache manager for the people that do have access to it?

    Enough off-topic from me. Cheers.

  25. Anonymous says:

    Drew: "I can’t publicly promise you many answers about installation performance, but I might be able to get the installer folks to look at it. Or the team that shipped the CTP. An hour to install their package seems way too long."

    I’ve found in the past that some MSI packages install much quicker than others — it seems that if there’s a large file in there disk access goes crazy, whereas an install comprised of a bunch of small files is relatively swift. My guess has always been that in unpacking a large file a lot more memory needs to be allocated in one go, so it’s more likely to start hammering the swap file; if the swap file, the MSI, and the place you’re installing things to are all on the same physical disk, chaos ensues…
