A question about preventing the system from going to the idle state turns out to be misguided


A customer asked how they could have their program prevent the system from going to the idle state. Specifically, when the system goes idle, the application gets into a weird state where it starts leaking memory like crazy. The program normally uses around 100MB of memory, but when the system goes idle, something funky happens and the program's memory usage shoots up to 4GB. To avoid this problem, they want to prevent the system from entering the idle state.

Now, if your application is a special-purpose program running on a dedicated computer, then blocking the entry into the idle state might be acceptable. After all, the user bought the computer specifically to run your program and nothing else. But the description of the program provided by the customer did not suggest that this was the case. It was just some program being developed for a general audience.
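
For that dedicated-machine case only, here is a minimal sketch of what such a request might look like. It assumes the Win32 function `SetThreadExecutionState` (the function a commenter names below) called through Python's `ctypes`; the helper name `set_system_required` is hypothetical. As another commenter points out, this affects the system's idle timers and does not control every other thing that "idle" can mean.

```python
import ctypes
import sys

# Flag values as defined in the Windows SDK headers.
ES_CONTINUOUS = 0x80000000       # keep the setting in effect until changed
ES_SYSTEM_REQUIRED = 0x00000001  # the system is in use; do not idle to sleep


def set_system_required(enabled):
    """Hypothetical helper: ask Windows to keep the system awake, or undo it.

    Returns True if the call reported success, and False on failure or
    when not running on Windows.
    """
    if sys.platform != "win32":
        return False  # the API exists only on Windows
    kernel32 = ctypes.windll.kernel32
    kernel32.SetThreadExecutionState.argtypes = [ctypes.c_uint32]
    kernel32.SetThreadExecutionState.restype = ctypes.c_uint32
    flags = ES_CONTINUOUS | (ES_SYSTEM_REQUIRED if enabled else 0)
    # The function returns the previous execution state, or 0 on failure.
    return kernel32.SetThreadExecutionState(flags) != 0


if __name__ == "__main__":
    print("request accepted:", set_system_required(True))
    # ... the dedicated workload would run here ...
    set_system_required(False)  # restore normal idle behavior
```

Note that the sketch undoes the request when the work is finished, so the machine can go idle again once the program no longer needs it.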

Interfering with the functioning of the entire system to hide a bug in your application is a horrible thing to do. It means that when your program is running, idle-time tasks never run, the computer never enters a low-power state, laptop batteries drain ten times faster than normal, and you basically ruin the entire computer.

What you should do is debug your program and fix the memory leak.

This is like saying, "We manufacture car stereo systems, and we found that when the car is coasting, the power from the alternator is not sufficient to drive the speakers. We would like to prevent the car from coasting."

Comments (28)
  1. Count Zero says:

    I understand that this post is not about the cause of the problem, but it really left me curious about it. May I kindly ask you to share some info on that?

  2. 12BitSlab says:

    I probably shouldn't ask this, but are there developers who are truly that stupid?

  3. David Crowell says:

    12BitSlab,

    Yes.  Have you poked around on Stack Overflow and seen some of the horrendous questions?  Yes, some are students, but many are not.

  4. Kemp says:

    Be glad they didn't write a Linux app and then try to port it to Windows. "Hmmm… it doesn't run in Windows as-is. We'd better package a Linux distro with our app so it can wipe the user's drive and set up a suitable environment to run in."

  5. Sockatume says:

    I wonder if people have been conditioned to answer old-shoe-or-glass-bottle type questions because they're used to them from their undergraduate training. After a few hundred ridiculous hypothetical scenarios meant to drill into you the basic premises of your field (test your general problem-solving ability), a question like "which is the better form of duck, a potato or this bandsaw" starts to lose its shock value.

  6. ZLB says:

    Preventing idle in this scenario is clearly silly!

    I thought everyone knew that you should use EmptyWorkingSet() to fix memory leaks!!!!

  7. smf says:

    Sometimes it's easier and less painful to gnaw your own leg off than to explain to management that you need time to debug the application to find why it's leaking memory when going idle, rather than finding out how to sweep it under the carpet by stopping it going idle.

    There are very "clever" people in management, they got there by sweeping things under the carpet long enough to get promoted after all.

    It may also be caused by a third-party component whose vendor says it's definitely something you're doing wrong, which your management believes because they trust the third-party company more than they trust you. Mainly because you didn't pay for them to go on an all-expenses-paid holiday (sorry, sales conference) to Bermuda, but the third-party company did.

  8. j b says:

    Kemp,

    Actually I would prefer that those who insist on The Linux Look run a TRUE Linux! You wouldn't believe how many days I've spent on ever new variations of Cygwin and Cygwin-style applications that not only introduce Linux-style semantics on their own constructs, but enforce them on Windows-defined constructs – like demanding of users that the casing of NTFS file names match that in the directory information. Or the worst of them all – environment symbols: Cygwin rewrites SOME of the Windows-defined symbols to all uppercase, and then enforces case sensitivity, both for those it rewrote in all UC and those it left untouched.

    When people need my help to clear out such problems, _and they are the same people that keep bitching about Windows users refusing to learn new habits, that's why Linux never succeeded on the desktop_, that is when I have to mobilize all the self control that I have to keep myself from spitting in their face… (OK, I do have sufficient self control for that purpose, but I admit that the urge is present.)

  9. j b says:

    I keep thinking of one of the cases Raymond describes in his book… This web server that just HAD to be available 24/7. But it had a memory leak causing it to crash with an "out of memory" error every now and then. The only way to free memory was a reboot, interrupting the service, which was not tolerable. So, Raymond tells in his book, they set up this load balancer, (temporarily) replacing the server with a cluster of two machines. When one of them was getting close to memory exhaustion, the load balancer was configured to route all requests to the other machine, while the first one was being rebooted. Next time it was the other machine running out of memory, and the first one had to take the full load during the reboot. This kept the service available without interruption while the system was being debugged. The memory leak was found, and the second machine and the load balancer could be removed.

    I don't have the book available right now, so I may remember some details wrong, but the main idea was that this "cure" kept the service up for the time being.

    It could very well be that the customer in THIS case was in a similar situation, most definitely wanting to debug the application, and seeking a way to keep the service available while doing that.

    [That would fall into the aforementioned "special-purpose program running on a dedicated computer" category, which in this case it did not appear to be. -Raymond]

  10. sh says:

    I'd bet it went like this:

    Dev: It leaks when the OS goes Idle…but I haven't figured out why yet.

    Manager: Well then, why not just prevent it from going idle?

    Dev: Well, that would be a bad idea because –

    Manager: -Will it fix the leak?

    Dev: Well, yeah but –

    Manager: Do it.

  11. alegr1 says:

    I suspect the application could not handle being detached from the console when a screen-saver was running.

  12. Dan Bugglin says:

    @sh That's why you say "No, it will only hide the leak, and until I know why it's happening and I fix it, I can't guarantee that it won't happen via other causes."

  13. Dan Bugglin says:

    And to be clear that's with a "Now if that's acceptable to you, we can go down that road, but this is my recommendation."

  14. Karellen says:

    Wow. An actual example of an "Old Shoe or Glass Bottle" question.[0]

    Aside from "You need to stop building things for money until you understand the basics of construction.", I think that there are two other legitimate possible answers here.

    The first is the classic, telling them to put their computer back in its box, take it to wherever they bought it from, and ask the vendor to take it back on the grounds that they are clearly too gorram stupid to even be allowed to own a computer.

    The other is to give them exactly what they want. The sooner *their* customers realise how bad the software is, and to stay away from it and the vendor at all costs, the safer those customers will be. This will greatly reduce the time it takes for their customers to figure that out, saving millions of lives^Whours of frustration in the long run.

    [0] weblogs.asp.net/…/408925

  15. Kevin says:

    @Anon: At this point, it's easier to just downvote stupid StackO questions (which I suspect is exactly what Atwood wants us to do).

  16. Ican Justify Anything says:

    Probably they were trying to locate the source of the leak.  

    A: "Hey, the leak goes away when I touch keys".

    B: "What does touching the keys do?"

    A: "They generate window messages for the keyboard."

    B: "So, the leak is connected to window messages for the keyboard."

    A: "Touching the keys also prevents entering the idle state."

    B: "How do we determine if the leak is connected to window messages, or the idle state?"

    A: "If we prevent entering the idle state programmatically, without touching the keys…"

    B: "… yes, if the leak goes away, then the leak is not connected to window messages for the keyboard."

    "We manufacture car stereo systems, and we found that when the car is coasting, there is not enough power to drive the speakers.  Either the alternator is not strong enough when coasting, or there is a short-circuit on the accelerator pedal when pressure is removed from it."

    "How can we determine which situation we are in?"

    "If we can remove pressure on the accelerator pedal – but prevent coasting, so alternator voltage drop does not occur – then, if the speakers still don't work right, it must be a short-circuit on the accelerator pedal."

  17. cheong00 says:

    I've seen "Old Shoe or Glass Bottle" type questions so many times that now I just ignore them and pretend they don't exist.

  18. Anon says:

    @Sockatume

    People are conditioned into not responding properly to those questions because when you're blunt about the situation, you typically get banned/ostracised from communities or reprimanded at your job (at best!).

    Just look at the new StackO commenting policy, which seeks to ban even the *perception* of being unkind, no matter how idiotic the question (or petitioner) may be.

  19. On the Flipside says:

    @Anon and Sockatume

    The *real* problem comes when the questioner interprets a polite "How'd you get here?" or other such attempt to draw out more about the background of the situation as you being mean/unhelpful.  I have run into this on IRC before, and it's quite a recipe for frustration.

    It gets worse from there, though, because even a kindly "Please use prepared statements instead, here's how you'd rewrite this to use prepared statements" in response to the classic "My SQL doesn't work quite right, plz help" when they're concatenating stuff together to make their SQL statements gets treated as not being kind by this sort of brain-damaged question-asker.

  20. @Anon: I think the problem with people bluntly replying is they often don't explain why it's a stupid idea.  Replies that just say "Why on earth would you want to do that, you f**king idiot?" aren't that helpful.

    Personally I favour an answer that explains why it's a bad idea but also explains how to do it, if they really want to.  It's quite possible they have a sensible reason for doing something seemingly stupid; they just haven't explained it very well.

  21. Sockatume says:

    You're looking at my comment inside-out, everyone:

    "Why is it that people don't respond with terse explanations of what's wrong with the puzzle?"

    Don't care.

    "Regardless of the above, why is it that people do take the time to come up with a solution to a problem that's patently absurd?"

    This is what puzzles me. And I think it's because we're used to absurd hypothetical questions from exams and problem sets.

  22. j b says:

    @Chris Crowther,

    You are probably right in most cases. Sometimes I am the "misguided customer": When the SW developer screams "You can't do that!", I ask back: "Why not? From a user point of view, that is a perfectly natural thing to do. Look at this scenario: …". In at least two cases, I have managed to turn the SW developer around: "Well, maybe that is something we should look into….", and in future releases, my "crazy" way of using the software has been accepted as perfectly normal and fully supported.

    There are lots of software dogmas that we programmers live by, without knowing very well why we stick to them – and sometimes yell "You can't do that!" for no very good reason. (We may have arguments, but closer analysis may prove them to be far, far weaker than we first thought.) The "haven't explained it very well" really goes both ways: We SW guys are often poor at explaining the underlying structure and "philosophy" of the SW, but users are often poor at explaining their real needs so that we can provide the SW they need. Sometimes, "You can't do that!" is technically justified, but still the user is justified, too. It is only that the needs haven't come through.

  23. Anon says:

    @Kevin

    There are several provisions which tell you not to downvote questions just because they're stupid if they break no other rules.

  24. Gabe says:

    It's a good thing the programmer asked MS so they could be told that they need to fix their bug. If they had simply searched for something like "windows prevent idle", they would likely have found a link to SetThreadExecutionState in the first result.

    As it turns out, preventing Windows from going idle is actually a very common task. It is required for CD burners, media players, presentation programs, and more.

  25. Karellen says:

    @Gabe – Aargh! My first thought was that "that's not what the idle state is"… but it turns out I was totally wrong.

    I thought from Raymond's post that the customer's problem happened whenever all the CPUs were idle, i.e. basically whenever there were no CPU- or IO-bound tasks running. (To prevent this you basically need to start a thread which runs an un-optimisable-out busy loop forever, which is a real Old Shoe or Glass Bottle problem.) So I didn't actually read the linked article. (I should know better by now.)

    Your comment made me actually open the article to grab a quote proving you wrong, only to prove that I was. Thanks.

  26. Ben Voigt says:

    @Gabe, you're confusing "becoming idle" with "effects of staying idle until the configured sleep timer elapses"

    SetThreadExecutionState won't prevent all the other effects of becoming idle such as: MFC OnIdle running (or equivalent in other frameworks), idle priority threads running, dynamic CPU clocking, etc.  (And according to the docs, whether tasks scheduled to run when idle are inhibited depends on exactly which flags were passed to SetThreadExecutionState)

  27. smf says:

    @The MAZZTer

    "And to be clear that's with a 'Now if that's acceptable to you, we can go down that road, but this is my recommendation.'"

    Of course it's acceptable to the manager, he can tell his manager that he was responsible for getting the leak fixed.

    When the problem comes back he can tell his manager that the developers didn't really fix the problem and he is now considering giving you a written warning.

    Once you realise you are working with someone like this, your only option is to leave or hope he gets promoted away from you. Anything else you do will just backfire. They hold all the cards, you don't have any.

  28. Gabe says:

    Ben Voigt: The MFC OnIdle function runs whenever the window's message queue is empty. Idle priority threads run whenever CPU is less than 100%. Dynamic CPU frequency changes happen constantly. Those items are all so common that there's no way they could be asking about how to prevent them.

    The only reasonable thing they could be asking is how to prevent the system from starting idle tasks (which presumably interfere with the app and cause it to consume memory).
