Scritch, scratch

Steve Ball posted an article about some “glitching” issues in Vista. I can’t resist adding my two cents.

For me, Vista definitely glitches a LOT more than previous versions of Windows. As a fairly experienced developer, I think I understand the reasons pretty well, so I can explain it away. But as a user, when my laptop audio is glitchy, I want to find the developer responsible and (censored for mild descriptions of hypothetical violence).

I’ve read a lot of comments raising various theories, some that call into question the sanity of the Windows developers. I can’t say I blame them. There is definitely some room for improvement in the way things work. However, in the interest of fairness and progress, the attention should be focused where it will do the most good. That means we shouldn’t simply blame the Windows developers unless there is really something they can do about the problem. And that means that before we start placing blame, we probably ought to figure out where the problem really lies.

The first complaints always mention something about the “lame” and “brain-dead” Windows NT scheduler. However, I’m pretty convinced that this is not the problem. In fact, my audio sounds BETTER when my system is under load (more on this later). I agree with the statement that audio glitches under CPU load are usually the fault of the OS and the scheduler, but I’ve seen very little correlation between CPU load and audio glitching. I haven’t seen any evidence that the CPU scheduler is at fault.

Hard disk load is occasionally an issue, but that is generally fairly obvious and easy to fix at the application level. The application simply didn’t buffer enough sound samples and ran out of music to play while waiting for the hard drive to load the next bit of music. Either tweak the buffering algorithm of the application or get a faster hard drive (or network).
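
As a rough illustration of the buffering point, here is a minimal sketch in C. The function names and the numbers in the usage note are made up for illustration; they are not taken from any real player. It converts a buffer size into milliseconds of playback and checks whether a disk stall would outlast it.

```c
#include <assert.h>
#include <stdbool.h>

/* Milliseconds of playback held in a buffer of `frames` sample frames. */
double buffered_ms(unsigned frames, unsigned sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    return 1000.0 * (double)frames / (double)sample_rate_hz;
}

/* True if a disk stall of `stall_ms` outlasts the audio already buffered,
   which means the application runs out of samples and the user hears a gap. */
bool stall_causes_underrun(unsigned frames, unsigned sample_rate_hz,
                           double stall_ms)
{
    return stall_ms > buffered_ms(frames, sample_rate_hz);
}
```

With 4410 frames buffered at 44.1 kHz (100 ms of audio), a hypothetical 250 ms disk stall causes an underrun; with a full second buffered, the same stall goes unnoticed.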

Memory can also be an issue. Some buffer or code needed to play the music might be paged out because you’re running low on free memory, and it didn’t get paged in quickly enough once the application tried to access it. This could possibly be blamed on the OS if the OS is too aggressive in trimming the working set. If an app pre-loads 60 seconds of audio, that means it won’t touch the last page of the buffer for 60 seconds, which might be long enough for the OS to page out the buffer. Here, you would probably get better results by buffering only 5 seconds’ worth of music. In any case, I haven’t had any significant trouble with this on Vista (except once in a while when I let Firefox run too long, it eats up 1.5 GB of RAM, and my system goes into memory panic mode).

Drivers are a much more significant issue. Traditional OSes (XP, Vista, Linux) can’t really schedule a driver’s activity. Once the driver starts doing something it thinks is important, it can only be interrupted by a higher-priority driver. On a single-CPU system, if a driver takes over, no new audio can be buffered until the driver returns. Many drivers written for XP or earlier systems (where the audio buffer was somewhat more forgiving; more on this later) cause trouble by doing too much processing at once. In testing (under XP), the driver’s latency wasn’t a problem, but on Vista, the latency requirements are much less lenient. I’ve seen significant Vista audio glitch issues go away after upgrading from an XP-era driver to a newly released Vista-compliant driver.

Even if the driver only does 1 millisecond of work at a time (or whatever the Vista latency recommendations are), if it has to do this 1000 times a second, it will still use up all available CPU time. Drivers have priority over all applications, so on a single-CPU system, this leaves no CPU for the audio application and mixer. On a multi-CPU system, this can still be a problem if the driver holds certain locks that are needed for audio processing. This is why Vista throttles network activity when the audio channel is open — network packet bursts can easily use enough CPU to cause audio glitches. Probably a good idea overall, though it seems that the throttling algorithm is a bit too aggressive and has some room for improvement.
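
The arithmetic behind that claim can be sketched in a few lines of C. The function name is hypothetical, and the model is deliberately simplistic: it assumes a single CPU and that every call takes the same amount of time.

```c
#include <assert.h>

/* Fraction of one CPU (0.0 to 1.0) consumed by a driver that runs
   `calls_per_second` times and holds the CPU for `work_ms` milliseconds
   on each call. The result is capped at 1.0 because a driver cannot use
   more than the whole CPU. */
double driver_cpu_fraction(double work_ms, unsigned calls_per_second)
{
    assert(work_ms >= 0.0);
    double fraction = work_ms * (double)calls_per_second / 1000.0;
    return fraction > 1.0 ? 1.0 : fraction;
}
```

A driver doing 1 ms of work 1000 times a second yields 1.0, meaning nothing is left for the audio application and mixer. At 250 calls a second the same driver uses a quarter of the CPU.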

Another issue is power management. This turned out to be the major problem on my laptop. My laptop’s motherboard (CPU and chipset) goes into sleep mode whenever it detects that it is “idle”. That’s actually a pretty good thing because it means I can get 2 or 3 hours of use out of the battery instead of 20 minutes. This happens hundreds of times per second — it sleeps for 2 milliseconds, wakes up to handle a keystroke, sleeps for another 2 milliseconds, wakes up to handle the calculations for an animation, sleeps a bit more, wakes to fill an audio buffer, etc. But it sometimes doesn’t wake up quickly enough to buffer the next bit of audio. If it ever takes longer than 9 ms to wake up, there will be a glitch. This was a real problem when my laptop was new. Recent drivers have improved this a lot, but there’s still a bit of scritch-scratch during some games or media.

As an experiment, I wrote a very simple application to prevent the motherboard from going to sleep. It starts a low-priority thread that does nothing but spin in a busy-wait loop. A Sleep(1) loop didn’t help because it gave the motherboard a chance to go to sleep. While the busy wait makes my laptop get very hot, it also completely stops the glitching.

#include <windows.h>
#include <stdio.h>

// This probably doesn't really do anything. At such a low priority, the
// process usually terminates before the thread exits. But this is an easy
// way to avoid certain compiler warnings. Without it, some compilers warn
// that the "return 0" below is unreachable. If I remove the "return 0",
// other compilers warn that I don't return a value from DoNothingQuickly.
volatile BOOL g_stopNow = FALSE;

DWORD WINAPI DoNothingQuickly(LPVOID /* unused */)
{
      while (!g_stopNow)
      {
            // Sleep(1) didn't fix the glitching. Sleep(0) just spends all
            // the time context switching, which is probably as bad or
            // worse than a busy wait in terms of impact on the rest of
            // the system. So I'll just do a busy wait.
      }

      return 0; // Usually never reached.
}

int main()
{
      int returnCode;
      DWORD dwThreadId;
      HANDLE hThread;

      hThread = CreateThread(
            NULL,             // default security attributes
            0,                // default stack size
            DoNothingQuickly, // thread entry point
            NULL,             // no argument for the thread
            0,                // start running immediately
            &dwThreadId);     // receives the thread ID

      if (hThread != NULL)
      {
            // We want to keep one CPU wide-awake and leave any other
            // CPU(s) idle.
            SetThreadAffinityMask(hThread, 1);

            // We don't want to get in the way of any useful work.
            SetThreadPriority(hThread, THREAD_BASE_PRIORITY_IDLE);

            printf("Caffeine: Now running. Press <Enter> to quit...");
            getchar();

            g_stopNow = TRUE; // Probably useless, but might as well...
            returnCode = 0;
      }
      else
      {
            // Capture the error before printf has a chance to change it.
            returnCode = (int)GetLastError();
            printf("Caffeine: Unable to create thread. Exiting.\n");
      }

      return returnCode;
}

Drivers seem to be a big part of the problem here — they either spend too much time working, or they take too long to wake up after going into a sleep state. Hopefully this means that audio problems will go away as the drivers improve. Computer retailers like Dell and HP will probably ensure that their new hardware meets the Vista latency requirements before putting it on the market. Unfortunately, owners of older hardware might be out of luck.

Hindsight is 20/20. I’ve seen how these kinds of issues come up, and I’ve been involved in some mistakes myself, so I don’t want to sound like I’m smarter than anybody on the Windows audio team. However, there is certainly some room for improvement in the way this issue has played out. While the changes in the audio stack are technically admirable and the problems can generally be blamed on drivers, that’s little comfort to those enduring static on their speakers. Things work in XP and don’t work in Vista. That sounds like a regression, not an improvement. It’s getting better, but there really shouldn’t have been a problem in the first place.

What’s wrong? Well, Vista aimed for a technically superior audio experience. Latency has been significantly reduced in Vista — when you fire your machine gun in your favorite game, you’ll hear the sound effect a bit more quickly. For gamers and audio professionals, this reduction in latency can make a big difference. For people trying to listen to their MP3s, this probably doesn’t matter much. The downside to reducing latency is that it reduces the margin for error. Vista cannot tolerate any delays longer than 5 or 10 milliseconds without glitching, while XP could usually tolerate a much longer delay with no problem. Assuming all of the drivers do their part, modern hardware actually has no problem meeting the deadlines. But if anything goes wrong on Vista, you can hear it.
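
To make the "margin for error" idea concrete, here is a small sketch in C that counts how many stalls in a list would be audible for a given tolerance. The function name, the sample stall times, and the 100 ms figure standing in for XP are assumptions for illustration only; the 10 ms figure comes from the paragraph above.

```c
#include <assert.h>
#include <stddef.h>

/* Count how many observed stalls (driver delays, slow wake-ups, and so on)
   are longer than the audio stack can absorb. Each one is an audible glitch. */
size_t count_glitches(const double *stall_ms, size_t n, double tolerance_ms)
{
    assert(stall_ms != NULL || n == 0);
    size_t glitches = 0;
    for (size_t i = 0; i < n; ++i) {
        if (stall_ms[i] > tolerance_ms)
            ++glitches;
    }
    return glitches;
}
```

Given stalls of 2, 8, 12, and 30 ms, a 10 ms tolerance produces two glitches, while a hypothetical 100 ms tolerance produces none. The machine and its drivers behave identically in both cases; only the margin changed.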

What was the mistake? The core of the issue is that a change was made that was detrimental to some customers. In the long term, the change is probably a step (or two) in the right direction, but for many people, the change causes trouble and offers no immediate benefit. Obviously the severity of the problem was underestimated (contrary to popular belief, Windows developers do care about their customers and would never have done this had they known the outcome). I’ve never been a big fan of taking away without giving something in return.

What would have prevented the problem? I don’t know how hard this would have been to implement, but I would love to have some kind of adjustment knob to control the amount of latency I want on my system. That would have sidestepped the whole issue by allowing the customer to pick their priorities, i.e. lower latency on my desktop where there aren’t power issues and where I want my games to sound great, higher latency on my laptop where I want additional power savings, the drivers aren’t as good (and are also optimized for power savings), and where I’m never using the audio stack for anything other than music or videos anyway. This would also have been a great mitigation for driver issues during the transition period.
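
Such a knob might look something like the following C sketch. Everything here is hypothetical: the profile names, the 10 ms and 100 ms targets, and the function itself are invented to show the idea and do not correspond to any real Windows setting or API.

```c
#include <assert.h>

/* Hypothetical user-facing choices for the latency knob. */
typedef enum {
    PROFILE_LOW_LATENCY,      /* desktop: games should sound responsive */
    PROFILE_GLITCH_RESISTANT  /* laptop: music and video, power savings */
} latency_profile;

/* Translate the chosen profile into an audio buffer size in sample frames.
   A bigger buffer means more delay but more tolerance for slow drivers. */
unsigned buffer_frames_for(latency_profile profile, unsigned sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    unsigned target_ms = (profile == PROFILE_LOW_LATENCY) ? 10u : 100u;
    return (unsigned)((unsigned long long)sample_rate_hz * target_ms / 1000u);
}
```

At 48 kHz the low-latency profile asks for 480 frames and the glitch-resistant profile for 4800, so the same driver hiccup that ruins playback under one setting is absorbed under the other.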

This is probably a good lesson for developers in general: be sure to consider the transition period between the old system and the new system when designing the new system.

In the meantime, I’ve finally gotten my audio problems worked out. After the latest motherboard chipset upgrade, I no longer have to run caffeine.exe, and I only run into minor static when running a few specific programs. Hopefully things are improving for everybody else, too.

Comments (7)

  1. Luciano says:

    Are you kidding? A multicore CPU CAN’T do what a 486 was doing flawlessly?

  2. Doug E. Cook says:

    Like I said before, this has nothing to do with processor speed, cores, or scheduling. It has to do with motherboard chipsets, drivers, and latency. The 486 didn’t have the aggressive power saving modes that modern CPUs have.

    If you have a system with a good chipset and good drivers, you’ll be fine with any modern CPU (100 MHz or higher). If you have a driver that takes too long to do its thing, it doesn’t matter how fast your CPU is — the driver still takes too long.

  3. Wesley says:

    I am actually writing about your comment on Ben’s site re serial ports. I am not sure what you need to do with your ports and maybe you already understand the following, however VBScript can be used to configure your ports on the fly. While the following sample is written with VS in mind, Ben’s blog shows how you can set up a VPC to respond to VBScript. I use the script to assign a port and cpu resources to the current window. The source for the following is from his book.

    Option Explicit

    CONST vmSerialPort_HostPort = 0
    CONST vmSerialPort_TextFile = 1
    CONST vmSerialPort_NamedPipe = 2
    CONST vmSerialPort_Null = 3

    dim vs, vm, serialPort, aSerialPort, savedParameter, ParameterFile, objFSO, objTextStream

    ' Check that the script is running at the command line.
    If UCase(Right(Wscript.FullName, 11)) = "WSCRIPT.EXE" Then
       Wscript.Echo "This script must be run under CScript."
       Wscript.Quit
    End If

    On Error Resume Next

    ' Attempt to connect to Virtual Server
    Set vs = CreateObject("VirtualServer.Application")
    If Err.number <> 0 Then
       Wscript.Echo "Unable to connect to Virtual Server."
       Wscript.Quit
    End if

    On Error Goto 0

    ' Read data from previously saved parameter file

    ' Open requested parameter file
    Set objFSO = CreateObject("scripting.filesystemobject")
    Set objTextStream = objFSO.OpenTextFile(ParameterFile, 1, True)
    savedParameter = objTextStream.ReadLine
    objTextStream.Close

    set vm = vs.FindVirtualMachine("Cadd03")

    ' Set CPU resource allocations
    vm.Accountant.SetSchedulingParameters 0, 100, 75

    ' Configure the first serial port to connect to a null port, 1st'x to initialize tablet
    set aSerialPort = vm.serialPorts.item(1)
    aSerialPort.configure vmSerialPort_Null, "Com1", false
    aSerialPort.configure vmSerialPort_Null, "Com1", false

    set vm = vs.FindVirtualMachine("Cadd04")

    ' Set CPU resource allocations
    vm.Accountant.SetSchedulingParameters 0, 100, 75

    ' Configure the first serial port to connect to a null port
    set aSerialPort = vm.serialPorts.item(1)
    aSerialPort.configure vmSerialPort_Null, "Com1", false
    aSerialPort.configure vmSerialPort_Null, "Com1", false

    set vm = vs.FindVirtualMachine("Cadd05")

    ' Set CPU resource allocations
    vm.Accountant.SetSchedulingParameters 0, 100, 75

    ' Configure the first serial port to connect to a null port
    set aSerialPort = vm.serialPorts.item(1)
    aSerialPort.configure vmSerialPort_Null, "Com1", false
    aSerialPort.configure vmSerialPort_Null, "Com1", false

    ' TURN ON PORT in Parameter data file
    set vm = vs.FindVirtualMachine(savedParameter)

    ' Set CPU resource allocations
    vm.Accountant.SetSchedulingParameters 0, 100, 1000

    ' Configure the first serial port to connect to a physical serial port
    set aSerialPort = vm.serialPorts.item(1)
    aSerialPort.configure vmSerialPort_HostPort, "Com1", true

  4. Doug E. Cook says:

    Wesley –

    Thanks. That’s cool. Unfortunately, it doesn’t help much. I need to be able to have programs on my host machine talk over the serial port to programs on my guest machines. I can do this once by connecting a null modem cable from COM1 to COM2, assigning one of the Virtual PC COM ports to physical COM1, and attaching the program on my host machine to COM2. But now I’m out of physical COM ports, so this doesn’t work when I need to use two COM ports at once or when I need to talk to multiple guest machines. The programs I need to use don’t work with named pipes.

    One program I’ve been looking at is an open source driver called com0com. That might work.

  5. Peter Kirn says:

    I have to disagree here. A choice between lower latency and "glitch-free" performance is an artificial one. You ought to be able to have both. In fact, there are plenty of scenarios (ASIO drivers, for one, and some of the other examples people give here) where you got lower latency but more reliable performance ** on XP **.

    So don’t blame lower latency for Vista’s glitch problems. I think the issue is, we’re not really seeing the improvements promised by other enhancements like the scheduling stuff, and we’re having to fight a bunch of unreliable, buggy drivers. We really need a commitment from Microsoft to get better audio performance.

    I think in the end, you’ll wind up with something that makes everyone happier. There are going to be more and more media-rich applications that want BOTH low-latency performance AND reliable, glitch-free playback, from communications to audio and music to richer sound in games.

    Anyway, fwiw, I’m very pleased with my performance on Vista, provided I use Winamp, not WMP, for playback, and thanks to better drivers than I had a few months ago (particularly NVIDIA video drivers).

  6. Doug E. Cook says:

    Again, just to clarify –

    The audio stack on Vista is designed to have lower latency than the default audio stack of XP. As a consequence, drivers that lock up the system are more likely to cause audible glitches than they were on XP.

    If everything is working as it should be, you can get low latency on either XP or Vista. The ASIO drivers are a good example of this. As long as your other drivers are cooperating, you can get glitch-free audio with low latency on any modern OS.

    On the other hand, if your driver goes out to lunch for longer than your audio stack can tolerate, it doesn’t matter what the OS or the scheduler does. Your audio is going to glitch. The lower the latency, the less time the driver has to keep the system locked before it becomes a problem.

    This isn’t a problem with Vista’s scheduler or lack of improvements there. IRQ handlers simply aren’t scheduled. They trump everything except other IRQ handlers. When the IRQ comes in, the handler immediately runs until it returns or is interrupted by a higher priority IRQ. If it takes a long time to return, you’ve got a problem.

    An XP machine with poorly written drivers will not be able to play low-latency audio glitch-free, regardless of the CPU speed. Same with Vista. Same with Linux. There’s nothing the OS can do about it.

    An XP machine with good drivers will be able to play low-latency audio glitch-free (as long as the CPU is up to the task of decoding the audio). Same with Vista. Same with Linux.

    The main difference with Vista is that the latency is lower by default, so the "poorly-written drivers" become more obvious. Driver behavior that wasn’t a problem with XP’s default audio stack is now a serious problem with Vista’s default audio stack. However, the behavior that causes a problem with Vista’s audio stack would also cause a problem with a low-latency XP audio stack.

    There are some things that Microsoft could do. Microsoft could provide a way to increase the latency of their audio stack to reduce the impact of poorly-written drivers. Microsoft could provide their own implementation of the poorly written drivers. Microsoft could pressure the vendor to improve their poorly written drivers. But I’m not sure how it would help to have a commitment from Microsoft to get better audio performance. The Microsoft part of the equation really isn’t the problem.

  7. Roger Barrett says:

    It is, as dcrook says, largely a matter of process priorities, and all modern OS implementations have to make choices about how they schedule tasks (which includes IRQ handlers). The real problem that OS implementations tend to have is that they are "one size fits all" solutions trying to do their best for everyone, and often not succeeding.

    Windows, in particular, has almost no ability to be tailored to what its purchaser wants. For certain machines dedicated to particular tasks, maybe 80-90% of the Windows code base is just a wasteful resource hog, but you simply don’t have the control to get rid of it. With a lot of effort, you can probably find out how to disable a lot of pointless processes, and maybe also disable hardware/drivers that cause interference, but it is certainly not made easy to have Windows behave the way you might want it to.

    A better solution for low latency than Vista is the one adopted by the Linux kernel when compiled with real-time support. Even with bad drivers, you can modify the priority of the different IRQs and processes/applications to whatever you want. This means you can, for example, set the priority of the IRQ for audio hardware, plus the priority of other audio processes and applications, above everything else in the system to get sub-millisecond latency. Of course, you still need to ensure that any process with high priority is coded in a real-time aware manner (doesn’t hog the processor), but at least the options to configure the OS are there; Windows developers could learn from that.