On a server, paging = death

Chris Brumme’s latest treatise contained the sentence “Servers must not page”. That’s because on a server, paging = death.

I had occasion to meet somebody from another division who told me this little story: They had a server that went into thrashing death every 10 hours, like clockwork, and had to be rebooted. To mask the problem, the server was converted to a cluster, so what really happened was that the machines in the cluster took turns being rebooted. The clients never noticed anything, but the server administrators were really frustrated. (“Hey Clancy, looks like number 2 needs to be rebooted. She’s sucking mud.”) [Link repaired, 8am.]

The reason for the server’s death? Paging.

There was a four-bytes-per-request memory leak in one of the programs running on the server. Eventually, all the leakage filled available RAM and the server was forced to page. Paging means slower response, but of course the requests for service kept coming in at the normal rate. So the longer you take to turn a request around, the more requests pile up, and then it takes even longer to turn around the new requests, so even more pile up, and so on. The problem snowballed until the machine just plain keeled over.
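
To see how quickly this snowballs, here is a rough back-of-the-envelope simulation. It is only a sketch: the arrival rate, the 5 ms healthy service time, and the roughly 30 ms paging penalty are numbers assumed for illustration (the real server's figures aren't known), and the function name simulate_backlog is hypothetical.

```python
def simulate_backlog(arrival_rate, base_service_ms, paged_service_ms,
                     paging_starts_at_s, duration_s):
    """Return the number of queued requests at the end of each simulated second.

    All parameters are illustrative assumptions, not measurements.
    """
    backlog = 0.0
    history = []
    for second in range(duration_s):
        # Once the leak has filled RAM, every request pays the paging penalty.
        paging = second >= paging_starts_at_s
        service_ms = paged_service_ms if paging else base_service_ms
        capacity = 1000.0 / service_ms  # requests completed per second
        # Requests keep arriving at the same rate no matter how slow we are.
        backlog = max(0.0, backlog + arrival_rate - capacity)
        history.append(backlog)
    return history


# Assumed numbers: 100 requests/s arriving, 5 ms per request when healthy,
# 35 ms per request once each one takes a hard page fault (5 ms + ~30 ms).
history = simulate_backlog(arrival_rate=100, base_service_ms=5,
                           paged_service_ms=35, paging_starts_at_s=10,
                           duration_s=20)
for second, queued in enumerate(history):
    wait_s = queued * (35 if second >= 10 else 5) / 1000.0
    print(f"t={second:2d}s  queued={queued:7.1f}  new request waits ~{wait_s:5.1f}s")
```

With these assumed numbers the server has capacity to spare before paging starts, so the queue stays empty. Afterwards it can finish fewer than 30 requests per second against 100 arriving, and the backlog grows every second with no chance of catching up.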

After much searching, the leak was identified and plugged. Now the servers chug along without a hitch.

(And since the reason for the cluster was to cover for the constant crashes, I suspect they reduced the size of the cluster and saved a lot of money.)

Comments (40)
  1. Matt says:


    Way too many customers use cluster server as a "band aid" to avoid fixing the real problem. I saw this time and time and time again when I was working in cluster server support.

    Folks, cluster server is not a substitute for good code!

  2. Serge Wautier says:

    I don’t understand why paging is responsible for death in this case. It’s the opposite: Paging allowed the server to keep running a little longer.

    The memory leak is the one and only thing responsible for death here IMO.

    And why did it die after exactly 10 hours? Was there a constant rate of requests?

  3. Frans Bouma says:

    Exactly, Serge :)

    Windows pages all the time; some OSes actually swap out every memory page first before using it. Windows doesn’t page a lot, but it does page, no matter how much RAM you have. You can see this when you have your pagefile on another hard disk (solely for the pagefile). It’s accessed sometimes, especially during boot. (And not just to identify the disk ;))

  4. Raymond Chen says:

    Without paging, the server was able to keep up with the rate of incoming requests. A hard page fault costs a few dozen milliseconds, which is an eternity to a computer. Since the incoming rate for this particular server was relatively constant, the machine just started falling behind and never caught up.

    Yes you could say that the memory leak was the cause of the problem, but it manifested itself through paging. If there was some other bug that caused every request to page (e.g. a data structure with a bad access pattern), then the result would be the same: paging => death.

  5. Paul Hill says:

    But of course, if the requests were _not_ constant, and the overcommitment wasn’t a memory leak but something a bit more temporary (like a whacking great temp table or something), then paging would be, well, not a good thing, but some way short of death.

    Paging in this case _prolonged_ the server’s life by a couplea hours. With memory leaks, you’re knackered anyway.

    One more reason to go .NET, I think :)

  6. Centaur says:

    Seems you’ve got a loose " in the link. Here’s a link to the Jargon File quoted at WordSpy:


  7. Raymond Chen says:

    True – if the request rate ever slowed down enough that the paging could catch up, then paging would have saved the day.

  8. The underlying problem is that if you code in languages where leakage (and buffer overflows) are inherent unless coded out, you are going to end up with code that leaks and somewhere has a buffer overflow. Leakage may manifest itself as paging, and hence a low peak load, or it may manifest itself in some other manner. The problem with subtle leakage is that it (a) is a dog to find, (b) takes lots of time and realistic load tests, and (c) is something that you cannot usually fix just before a ship deadline. So instead you ship a leaky server and try to fix it while configuring the cluster to reboot overnight, hopefully with enough of an interval between system reboots that if server 1 doesn’t come back up, you can cancel the restart of server 2.

    Code in Java, C# or even Perl and you have to try really hard to leak (it is not the default outcome), so it tends to be rarer. The gulf between ‘working’ and ‘deployable’ is narrower and everyone is happy.

    Incidentally, it is easy to turn paging off on a server. Just turn the swapfile off. Then you will see a more entertaining failure mode when the time comes.

  9. Frans: Are you sure that it’s swapping and not just clearing unused sections of the page file in advance so that future page requests can be fulfilled faster?

  10. Mat Hall says:

    What puzzles me is why Windows insists on having a huge pagefile, even when there’s plenty of free RAM. (For example, right now I have 238MB free physical RAM, and a PF usage of 236MB. Is there some highly technical reason that XP doesn’t keep at least some of that 236MB in RAM? I could understand if it always left a small amount of physical RAM to avoid possibly choking when it tried to page something in, but I very rarely have anything less than 200MB free…)

  11. Raymond Chen says:

    Windows 95 worked hard to keep your pagefile as small as possible, but that also meant that if there was a surge in demand, it was slow to react.

    My guess is that Windows NT tries to keep a backing page ready ahead of time so it can pre-clean pages. Then when there’s a surge in demand, it doesn’t have to write the pages out; they’re already clean.

    But I’m just guessing.

  12. Jordan Russell says:

    I really wish MS would take out — or at least offer some way of globally disabling — the "feature" that swaps a process’s entire working set to disk when its top-level window is minimized (see http://support.microsoft.com/default.aspx?scid=kb;en-us;293215). It causes needless disk thrashing whenever a memory-intensive app is minimized and subsequently restored. This is easily the #1 performance bottleneck I encounter on a day-to-day basis.

  13. Jakub Wójciak says:

    One of the most irritating things in Windows NT memory management is high priority of the disk buffers.

    Downloading a big (larger than the size of RAM) file from IIS FTP server or over NetBios causes the server system to page process and dll memory to disk to allow bigger disk cache. Performance drops dramatically – simply restoring minimized windows takes quite a few seconds of intensive paging.

    There should be an option to either limit disk cache size, or make it less prioritized than process working sets.

  14. Carmen says:

    Mat: Where are you getting your paging file usage numbers from? Not all tools are equal when telling you the state of memory in Windows. Perfmon is generally the best place to go for accurate information.

    Jordan: Don’t confuse working set trimming with paging. They’re not the same thing.

    If you want to know more about how the Windows memory manager works, make sure to spend a lot of time reading and understanding the chapter discussing that topic in "Inside Windows 2000".

    Some of the information has changed between Windows 2000 and XP/2003, but the big picture is still the same, with only some of the details being changed.

  15. keithmo [exmsft] says:

    Here’s one (of many, I assume) story about the phrase "sucking mud".

    Many years ago, Tandy Corporation (parent of Radio Shack) produced several "high end" (at the time) business computers. One of these was the Model 16: imagine a giant TRS-80 Model I with its own desk, integrated monitor, and 8" floppy drive. The system had two processors: a Z-80 and a Motorola 68000. The Z-80 was used for legacy Model 2 mode. More interestingly, the Z-80 was also used as the I/O processor for the 68000 when running Tandy’s custom port of Xenix (Microsoft’s Unix-like OS from the mid-1980s).

    A program called Z80CTL was run on the Z-80. It received I/O requests from the 68000, performed its hardware magic, then notified the 68000 when the request was complete. Pretty standard stuff.

    When an I/O request was made to the Z-80, Z80CTL would squirrel away a copy of the stack pointer somewhere. When the I/O request completed, Z80CTL would compare the current stack pointer with the saved value. If they didn’t match, well, that was Bad. The system would come to a grinding halt with the following message displayed:

    Halt: Shut her down, Scotty, she’s sucking mud!

    I don’t remember the *exact* wording, but this is close enough. At least the screen didn’t turn blue when this happened…

    I now return you to the 21st century, where you’ll hopefully find something more relevant.

  16. Raymond Chen says:

    Jakub: Ah how operating system design fashions change. I remember when a unified caching architecture (where all memory was treated equally) was the hot thing and any OS that didn’t have a unified cache was declared woefully inadequate and not worthy of consideration.

  17. Jordan Russell says:

    > Jordan: Don’t confuse working set trimming with paging.

    Right, but as pointed out in the article, the working set trimming results in paging.

    The effect is very obvious if you minimize a memory hog like VMware — immediate disk thrashing ensues. To avoid the thrashing, I’ve gotten into the habit of minimizing windows using Windows+M instead of clicking minimize buttons. It really sucks.

  18. Jordan Russell says:

    Jakub: You can disable that behavior. Google for LargeSystemCache.

  19. Jakub Wójciak says:

    Jordan: Setting LargeSystemCache doesn’t solve the problem. By adjusting LargeSystemCache and Size registry keys you define the size and priority of the disk cache.

    Those two keys determine the possibility of paging / trimming process working set. The situation looks like this:

    1. You minimize all your apps (memory intensive apps, such as Visual Studio, MSDN, etc.), lock your workstation and go get yourself a coffee.

    2. Applications were minimized = their working sets have been trimmed.

    3. Someone downloads a big file from you (possibly a beta CD image of the application you are working on).

    4. The disk cache grows, eating all your memory and causing the system to page out your apps.

    5. You come back, unlock the workstation and wait and wait and wait – and it pages and pages and pages…

  20. Raymond Chen says:

    Well you weren’t using those programs any more, so they end up least-recently-used and are therefore prime candidates for discard.

    Consider the opposite argument: Why is my computer so slow at file transfer? Sure I have a bunch of programs running, but I minimized them all and haven’t touched them in hours. Clearly I am not using them. Yet the computer refuses to use that memory for more important things like speeding up that file transfer I’m doing from the office down the hall.

    This is one of those cases where no matter who wins, the other guy claims to have been cheated.

  21. Jordan Russell says:

    I don’t really agree with the notion that minimizing means "this application isn’t being used anymore". If I’m not using an application, I’ll *close* it, not minimize it. Often times I’ll minimize applications that are being actively used just to temporarily free up some screen real estate.

  22. Jakub Wójciak says:

    Raymond, you’re wrong on this one. The current behaviour of the memory manager always favours disk cache and file-server-like machines.

    I can see two solutions:

    1. The system administrator has the ability to choose between workstation-like behaviour or file-server-like behaviour of memory manager.

    2. The file-serving (both NetBios and FTP) services have an option like ‘don’t trash disk cache’, which implies opening files with FILE_FLAG_NO_BUFFERING.

    The usage scenario I have previously mentioned is particularly irritating if you have 512MB of RAM and the apps and the system take about 300MB, so you have basically 200MB of free physical RAM. However, downloading a file about 600MB in size uses up _all_ the memory in the system, causing much paging. Couldn’t the memory manager take only those free 200MB?

    At least give me an option to set the upper limit for disk cache size…

  23. Raymond Chen says:

    Jakub: You’ll have to take this up with the memory manager folks then. I’m just guessing.

  24. Catatonic says:

    Memory leaks do happen in C#, and they can be real head scratchers especially if you have C++ in your blood. But that’s a topic for another blog.

  25. Jordan Russell says:

    > The current behaviour of the memory manager always favours disk cache and file-server-like machines.

    …*if* you set LargeSystemCache to a non-zero value. (See http://support.microsoft.com/default.aspx?scid=kb;en-us;102985)

    It sounds to me like the root of your troubles is the "trim working set on minimize" feature of USER. Provided you have LargeSystemCache set to zero and you don’t minimize your apps, the process working sets shouldn’t get trimmed in favor of cache. At least that’s how things work on my Win2k system…

  26. Karan Mehra says:

    > The current behaviour of the memory manager always favours disk cache and file-server-like machines.

    Until recently, the memory manager left it up to the lazy writer to flush pages dirtied by the cache manager. However, the lazy writer keeps only 2 writes outstanding.

    So if a large file came in through a very fast connection, its contents would very soon fill up all available memory and Mm would trim more and more working sets to satisfy new requests. Only when there is no more memory available would Mm start flushing the pages that it had excluded earlier.

    I believe this has been fixed in Windows Server 2003. Mm no longer gives those pages any reprieve. This is more or less equivalent to having set FILE_FLAG_NO_BUFFERING (on large files).

  27. Jorge Coelho says:

    Jordan, working set trimming when minimizing the top-level window is actually a kind of ‘lifesaver’ for Visual Basic applications, which have basically no control over memory allocation/de-allocation.

    One of the problems I had was the apparent memory footprint of some of my VB applications. I mean, I destroyed every single object, made sure every form was properly unloaded and terminated and still the memory footprint kept growing and growing. It was just ridiculous!

    Looking up the problem in Dejanews only revealed more cases of programmers frustrated with this… seems like VB caches unused objects and code in memory in case they will be needed again – to the casual user this will make the application look like it’s nothing more than a bloated memory pig.

    That’s when I ran into the working set trimming solution. I now call SetProcessWorkingSetSize(GetCurrentProcess(), -1, -1) – the equivalent of minimizing the application’s top-level window – at specific times to force the OS to release all that unused memory. Works like a charm.

  28. Pavel Lebedinsky says:

    > I now call SetProcessWorkingSetSize(GetCurrentProcess(), -1, -1) […] at specific times to force the OS to release all that unused memory. Works like a charm.

    Except that it doesn’t actually release any memory, and increases the probability of being paged out to disk, as others explained.

    Sure, your app now looks better in Task Manager, but if you were to actually measure how it affects performance you’d probably find that there is either no difference, or a noticeable perf hit, depending on the situation.

  29. Matthew Lock says:

    FYI nice advert for the Tandy Model 16: http://m.m.nu/nostalgi/c-today_may83_p65.jpg

  30. Jorge Coelho says:

    > Except that it doesn’t actually release any memory, and increases the probability of being paged out to disk, as others explained.

    Sorry, for all practical purposes, it does release memory. Even if the unused bits were just paged out to disk, the fact is that they are no longer using valuable RAM space (and since they are unused bits anyway, there is no performance hit associated with it).

    Look at it this way:

    Imagine your program is using 5,000 KB of RAM. You display a rarely used form with lots of controls, etc… memory usage jumps to 7,000 KB. You unload the form and destroy all references to it. You would expect memory to drop back to the initial value… except it doesn’t. It only goes back down to 6,800 KB or something.

    Now you display another form and the same thing happens. So your memory usage keeps climbing and climbing even though you are destroying every object after use. Soon your application will look like a memory hog even though it isn’t.

    Enter SetProcessWorkingSetSize. Your working set size is reduced to bare bones when you know there won’t be a performance hit by doing this. Everything that should have been discarded a long time ago (and wasn’t because VB or the OS overrode your decision to remove unused objects from memory) is now paged out or removed or whatever happens to it when you trim the working set.

    > Sure, you app now looks better in Task Manager, but if you were to actually measure how it affects performance you’d probably find that there is either no difference, or a noticeable perf hit, depending on situation.

    Say what you will, try explaining to Joe User that what he is seeing with his own eyes in task manager isn’t actually true and that VB will release the memory back to the OS when needed… Do you really expect him to believe you?

    Besides, I’ve tested it here lots of times (and remember I’m the one who decides *when* to call SetProcessWorkingSetSize, so I hand pick the occasions). If there is a performance hit, it is entirely negligible.

  31. Jordan Russell says:

    Jorge Coelho wrote:

    > Sorry, for all practical purposes, it does release memory. Even if the unused bits were just paged out to disk, the fact is that they are no longer using valuable RAM space

    Valuable RAM space isn’t wasted because the OS will trim your process’s working set *automatically* when the system runs low on memory. By forcing your working set to be trimmed, you’re basically asking for a performance hit in all cases, as opposed to a performance hit only in low-memory situations.

  32. Jorge Coelho says:

    Jordan, I’m not saying that the OS will not trim the working set when memory is low. I’m just saying: try telling Joe User that is what is going to happen and see if he believes you – all he cares about is what he sees in Task Manager.

    With the type of applications I make we are talking about thousands of Joe and Jane Users, not technological geeks who would understand how memory management works.

  33. Raymond Chen says:

    This is a general UI problem with Task Manager: If you show the total-geek info, people will misinterpret it since you need to be a total-geek to understand it properly. What’s the solution? Short of removing the info from Task Manager altogether.

  34. Jordan Russell says:

    Renaming the "Mem Usage" column to "Working Set Size" might be a start. More cryptic yes, but less prone to misinterpretation.

  35. Marco Russo says:

    Guys, I know *a lot* of IT professionals who really don’t understand that the really important column to show in Task Manager is not Mem Usage but VM Size.

    Only VM Size tells you:

    - if a program has a memory leak

    - if a program has required much more memory than available physical RAM

    Raymond, I know that Windows is used mostly by non-geek people, but the Task Manager naming convention (and the default columns used in the process tab) are some of the worst decisions you could have made, considering the consequences in wrong analysis and actions taken by people who read this information, even by IT pros.

  36. Lazar says:

    I’ve been playing with this ‘feature’ of Win2K/XP for a while now (after losing the ability to rearrange the taskbar with buttonboogie I had to find something to tweak!)

    I run XP/2K on boxes with at least 512M RAM and have been trying to force the apps to reside completely in resident memory but even with the swap set to 0% on all disks it looks like the ‘VM Size’ column always contains a value. I see many skillful replies here so I ask you folks; is there a way to get the apps completely resident in memory without switching to linux->VMware?

    One thing I have done is set DisablePagingExecutive in HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management, and it helps a little but does not provide a fix.

