Windows Vista Sound causes Network Throughput slowdowns.


AKA: How I spent last week :).

On Tuesday Morning last week, I got an email from “reader@slashdot.org”:

You’ve probably already seen this article, but just in case I’d love to hear your response.

http://it.slashdot.org/article.pl?sid=07/08/21/1441240

Playing Music Slows Vista Network Performance?

In fact, I’d not seen this until it was pointed out to me.  It seemed surprising, so I went to talk to our perf people, and I ran some experiments on my own.

They didn’t know what was up, and I was unable to reproduce the failure on any of my systems, so I figured it was a false alarm (we get them regularly).  It turns out that at the same time, the networking team had heard about the same problem, and they WERE able to reproduce it.  I kept on digging, and by lunchtime I’d also generated a clean reproduction of the problem in my office.

At the same time, Adrian Kingsley-Hughes over at ZDNet Blogs picked up the issue and started writing about it.

By Friday, we’d pretty much figured out what was going on and why different groups were seeing different results. It turns out that the issue is highly dependent on your network topology and the amount of data you’re pumping through your network adapter. The reason I hadn’t been able to reproduce it is that I only have a 100mbit Ethernet adapter in my office. You can get the problem to reproduce on 100mbit networks, but you’ve really got to work at it to make it visible.  Some of the people working on the problem sent a private email to Adrian Kingsley-Hughes on Friday evening reporting the results of our investigation, and Mark Russinovich (a Technical Fellow, and all-around insanely smart guy) wrote up a post explaining what’s going on in insane detail, which he posted this morning.

Essentially, the root of the problem is that in Vista, when you’re playing multimedia content, the system throttles incoming network packets to prevent them from overwhelming the multimedia rendering path: the system will only process 10,000 network frames per second (this is a hideously simplistic explanation; see Mark’s post for the details).

For 100mbit networks, this isn’t a problem – it’s pretty hard to get a 100mbit network to generate 10,000 frames in a second (you need to have a hefty CPU and send LOTS of tiny packets), but on a gigabit network, it’s really easy to hit the limit.
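To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes the only limit is raw link bandwidth, ignores Ethernet preamble and inter-frame gaps, and uses function names made up for this illustration. The 10,000 frames per second figure is the throttle described above; the frame sizes are just examples.

```python
THROTTLE_FPS = 10_000  # frames per second, the limit described in the post


def max_frames_per_second(link_mbps, frame_bytes):
    """Upper bound on frames per second a link can carry at a given frame size."""
    return link_mbps * 1_000_000 / (frame_bytes * 8)


def throughput_ceiling_mbps(frames_per_second, frame_bytes):
    """Throughput you get if you are held to a fixed number of frames per second."""
    return frames_per_second * frame_bytes * 8 / 1_000_000


for link_mbps in (100, 1000):
    for frame_bytes in (600, 1500):
        fps = max_frames_per_second(link_mbps, frame_bytes)
        verdict = "can exceed" if fps > THROTTLE_FPS else "stays under"
        print(f"{link_mbps} Mbit, {frame_bytes}-byte frames: "
              f"{fps:,.0f} frames/s ({verdict} the throttle)")

print(f"Ceiling at 1500-byte frames: "
      f"{throughput_ceiling_mbps(THROTTLE_FPS, 1500):.0f} Mbit/s")
```

Under these assumptions, a 100mbit link sending full-size 1500-byte frames tops out below 10,000 frames per second, so the throttle never bites unless the frames are smaller. A gigabit link can exceed the limit at either size, and a cap of 10,000 full-size frames per second works out to roughly 120 Mbit/s.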

One of the comments that came up on Adrian’s blog was a comment from George Ou (another zdnet blogger):

“‘The connection between media playback and networking is not immediately obvious. But as you know, the drivers involved in both activities run at extremely high priority. As a result, the network driver can cause media playback to degrade.’


I can’t believe we have to put up with this in the era of dual core and quad core computers. Slap the network driver on one CPU core and put the audio playback on another core and problem solved. But even single core CPUs are so fast that this shouldn’t ever be a problem even if audio playback gets priority over network-related CPU usage. It’s not like network-related CPU consumption uses more than 50% CPU on a modern dual-core processor even when throughput hits 500 mbps. There’s just no excuse for this.”

At some level, George is right – machines these days are really fast and they can do a lot.  But George is missing one of the critical differences between multimedia processing and other processing.

Multimedia playback is fundamentally different from most of the day-to-day operations that occur on your computer. The core of the problem is that multimedia playback is inherently isochronous. For instance, in Vista, the audio engine runs with a periodicity of 10 milliseconds. That means that every 10 milliseconds, it MUST wake up and process the next set of audio samples, or the user will hear a “pop” or “stutter” in their audio playback. It doesn’t matter how fast your processor is, or how many CPU cores it has, the engine MUST wake up every 10 milliseconds, or you get a “glitch”.

For almost everything else in the system, if the system locked up for even as long as 50 milliseconds, you’d never notice it. But for multimedia content (especially for audio content), you absolutely will notice the problem. The underlying reason has to do with the physics of sound: whenever there’s a discontinuity in the audio stream, a high frequency transient is generated. The human ear is quite sensitive to these high frequency transients (they sound like “clicks” or “pops”).
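A minimal sketch of why a dropped buffer is so audible. It assumes a 48 kHz sample rate and a 440 Hz test tone, both picked for illustration rather than taken from the Vista audio engine, and it uses the largest sample-to-sample jump as a crude stand-in for high-frequency energy. All names here are made up for the example.

```python
import math

SAMPLE_RATE = 48_000                  # assumed sample rate (illustrative)
TONE_HZ = 440                         # assumed test tone (illustrative)
PERIOD_SAMPLES = SAMPLE_RATE // 100   # samples in one 10 ms period


def sine(n_samples):
    """A continuous sine tone, one float per sample."""
    return [math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE)
            for n in range(n_samples)]


def drop_period(samples, period_index):
    """Return a copy with one 10 ms period replaced by silence (a missed wakeup)."""
    out = list(samples)
    start = period_index * PERIOD_SAMPLES
    for i in range(start, min(start + PERIOD_SAMPLES, len(out))):
        out[i] = 0.0
    return out


def largest_jump(samples):
    """Largest difference between neighbouring samples."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))


clean = sine(3 * PERIOD_SAMPLES)
glitched = drop_period(clean, 1)

print(f"clean tone, largest jump:    {largest_jump(clean):.3f}")
print(f"one dropped period, largest: {largest_jump(glitched):.3f}")
```

The clean tone only ever moves a few percent of full scale between samples. The edges of the dropped period produce jumps more than ten times larger, and it is that kind of step that comes out of the speaker as a click.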

Anything that stops the audio engine from getting to run every 10 milliseconds (like a flurry of high priority network interrupts) will be clearly perceptible. So it doesn’t matter how much horsepower your machine has, it’s about how many interrupts have to be processed.
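Here is a toy model of that deadline, as a hedged sketch rather than a description of how the audio engine or the scheduler actually works. It assumes the engine wakes every 10 milliseconds, needs a fixed amount of work per period (`work_ms`, a made-up parameter), and cannot start while a "stall" (a stretch of higher priority interrupt processing) is in progress. Stalls that begin in the middle of the work are ignored to keep it short.

```python
PERIOD_MS = 10.0  # audio engine period, from the post


def count_glitches(stalls, total_ms, work_ms=1.0, period_ms=PERIOD_MS):
    """Count periods whose audio work finishes after that period's deadline.

    stalls is a list of (start_ms, duration_ms) intervals during which the
    audio thread cannot run. This is a simplified model, not real scheduling.
    """
    glitches = 0
    for k in range(int(total_ms // period_ms)):
        wake = k * period_ms
        deadline = wake + period_ms
        start = wake
        # If a stall is in progress at wakeup, the work starts when it ends.
        for stall_start, stall_len in sorted(stalls):
            if stall_start <= start < stall_start + stall_len:
                start = stall_start + stall_len
        if start + work_ms > deadline:
            glitches += 1
    return glitches


print(count_glitches([(100.0, 5.0)], 1000.0))                # short stall
print(count_glitches([(100.0, 50.0)], 1000.0))               # long stall
print(count_glitches([(100.0, 50.0)], 1000.0, work_ms=0.1))  # "faster CPU"
```

In this model the 5 ms stall is absorbed with no glitch, while the 50 ms stall costs five periods whether the per-period work takes 1 ms or 0.1 ms. A faster processor shrinks the work, but it does nothing for the time spent waiting to be allowed to run.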

We had a meeting the other day with the networking people where we demonstrated the magnitude of the problem – it was pretty dramatic, even on a top-of-the-line laptop.  On a lower-end machine it’s even more dramatic.  On some machines, heavy networking can turn video rendering into a slideshow.

Any car buffs will immediately want to shoot me for this analogy, because I’m sure it’s highly inaccurate (I am NOT a car person), but I think it works: You could almost think of this as an engine with a slip in the timing belt – you’re fine when you’re running the engine at low revs, because the slip doesn’t affect things enough to notice. But when you run the engine at high RPM, the slip becomes catastrophic – the engine requires that the timing be totally accurate, but because it isn’t, valves don’t open when they have to and the engine melts down.

Anyway, that’s a long-winded discussion.  The good news is that the right people are actively working to ensure that a fix is made available for the problem.


Comments (62)

  1. Nik says:

    I for one don’t think this is a huge issue.

    Any server would not have any multimedia playing.

    And anybody playing multimedia will not need massive networking; everybody has iPods and even the crippled network should be enough to watch streaming media.

    I am wondering if these realtime enhancements would apply to other realtime threads or just the audio; if it’s all of them, then we could use Windows for more "real-time" tasks!

  2. Leo Davidson says:

    Thanks for posting this information. I’m sure a tested fix will take some time but it’s good to just have an explanation so that people can’t spread ridiculous, baseless FUD about how the problem is caused by DRM and so on.

    (I’m no fan of DRM but I dislike the wrong thing being blamed for a problem.)

  3. Michael says:

    But on a dual core system, why can’t the interrupts be set to run on separate CPUs (audio on one, network on the other)? In our voice-call-routing systems, that’s what we went out of our way to do (make sure our T1 voice lines had their interrupts serviced separately).

  4. Nik: It shows up while copying files around, and on a gig network it matters.

    Michael: Because that’s not the way the hardware works.  On multicore machines, the hardware interrupts are only serviced on processor 0.  Even on true MP machines, many of them only service interrupts on processor 0.  There’s nothing an operating system can do to fix it.

    However there ARE some things that you can do to mitigate the issue, and the right people are working on creating the fix.

  5. Joku says:

    There are some things that haven’t been explained: why doesn’t the issue repro on XP? According to reports, you can get high speeds without glitches in audio there. I haven’t verified this yet though, as I’m lacking another gigabit-capable computer right now.

    There are also several reports that using the power configuration that changes CPU speed/multiplier based on load can cause glitches, and I can verify this one.

    What’s really interesting is that the glitches *only* affect the DirectX audio on Vista. If I use ASIO4ALL(.com), which bypasses the Vista audio stack, the glitches stop even during these power transitions.

  6. Joku: It doesn’t reproduce on XP because MMCSS doesn’t exist on XP.  And those reports are totally wrong.  You can’t get the kind of throughput we’re talking about here on XP without turning multimedia playback to a slideshow (especially if you enable ipsec).  Our perf team has the demos to prove it.  To blow away multimedia playback on XP, all you need to do is to have an app that loads the CPU running at priority 15.

    And ASIO shouldn’t matter – ASIO apps run at the same priority as everything else, and network interrupts will preempt them either way.

    Btw, you should use a tool like TTCP to generate network traffic.  Otherwise there are other processes that get in the way and change the results (like file I/O time, etc).  TTCP has the advantage of being a raw networking I/O test.

  7. Steve says:

    @ Nik:

    Although one may think this to be true, in the broadcasting industry ‘Audio servers’ keep an online cache of around 100GB of 24 bit WAV audio that normally resides in a tape or MAID Archive. I won’t do the maths but a typical broadcast audio server has 16 outputs, each under separate control for a different TV channel. Voice overs, audio description services, multiple languages for the same program, etc., etc. Getting 16 channels of all this content in 24bit WAV files from an archive is pretty intensive, and although we still use WinServer2003, this particular issue could have big consequences for us if we moved to Vista, or even WinServer 2008!

    Anyway, why doesn’t this hard coded value ramp up with CPU speed? Could this value be an output of the Windows Experience Index?

    Steve

  8. Steve, it’s hard coded because someone screwed up.  The spec said that it was supposed to be controlled by a registry key (and disabled under certain circumstances).

    Unfortunately that didn’t happen, someone screwed up.

  9. WhatAboutHD says:

    "And anybody playing multimedia will not need massive networking…"

    I have to wonder if that is true.

    What if one has a Vista MCE computer with HDTV tuners, able to watch and record HDTV, and then what if there are other computers on a gigabit network simultaneously watching (aka streaming) recorded HDTV off that same Vista computer?

    Seems there would be a very common need to support both a heavy network requirement and multimedia playback.

  10. WhatAboutHD: By our measurements, you can run at least 2 HD 1080P video streams over the network without encountering this issue.

    This really is limited to file copies or other hideously network intensive operations.

  11. August says:

    Larry: I got the impression from Russinovich’s blog post that the problem tends to be much worse in a computer with multiple network adapters (ie 10.000 packets/sec becomes 6.000 packets/sec with three adapters).

    I may be an extreme example but I’ve got 7 adapters on my machine: 2 VPN, a 1394, a LAN, a WLAN and two VMWare. Does this affect the speeds, or do the adapters need to be in use? I guess that 2 (or three if "1394" counts) is standard nowadays?

    I’m running Windows XP so I can unfortunately not try it out.

  12. August, it is.  The VPN and 1394 don’t count, but the vmware ones do.

    That’s a part of the things we need to fix.

  13. Adrian says:

    From Mark’s post, there are two issues.  One is prioritizing the CPU usage of the multimedia threads, and the other is capping of the number of network packets received per second.

    Will the solution offer the user a way to say "I don’t mind the occasional glitch, please prioritize network performance over the playback of some lame podcast I’m barely paying attention to?"  It seems Vista makes the assumption that users will always prefer glitch-free playback over everything else.

    Even intense network traffic is bursty.  Isn’t it possible to use more audio buffers to stay a little farther ahead of the playback, which could absorb some missed interrupts and only glitch if heavy network traffic is sustained?

    The CD player in my car glitches about once per hour.  I hardly notice it anymore.  It’s still better than listening to lossy MP3s.

  14. Prioritizing the multimedia threads over the rest of the OS isn’t an issue actually.  The MMCSS service prevents any MMCSS managed thread from consuming more than 80% of the CPU (it’s actually way more complicated than that, but you can use that as a rule-of-thumb).

    The core issue is that on gigabit networks, even though the network stack queues the incoming packets, if the rate of incoming packets gets high enough, then the network stack will hold off the multimedia stack for tens of milliseconds at a time.  If the multimedia stack had ever had a chance to run, things would be seamless, but…

    The audio stack had to make a trade-off between low latency (smaller buffers) and fewer glitches (bigger buffers).  We’re already getting flak because the latency of the audio engine in shared mode is greater than the latency of the XP audio stack, so increasing the buffer size is not an option.

    The relevant teams have all the data that they need to come up with a good solution for the problem and they’re actively working on it.

    Please Note – this comment was edited to reflect the actual consumption allowed by MMCSS

  15. Adrian says:

    Thanks for the response.

    I guess I misunderstood.  I thought Mark’s post was saying MMCSS threads may use up to 80%, leaving 20% for everyone else.

    I still hope the solution allows the user to control the tradeoff.

  16. Mike Dimmick says:

    What I wondered was why the network packets were being handled in DPCs (and therefore the scheduler doesn’t get a look-in) rather than being moved off to a worker thread or to a thread in the application which created the socket (and yes, I’m aware that the Windows file sharing and HTTP in Windows Server 2003 are implemented as drivers). Obviously there are parts of the TCP/IP protocol suite that don’t end up in applications – ICMP Echo processing, for example – which would have to be handled by a shared pool of worker threads.

    I suppose that in the case where TCP is receiving a large response from a server and has nothing to send in the other direction, it has to generate ACK packets on a timely basis to keep the window filled.

    Is the latency for getting the processing onto a worker thread just too high for this to work?

  17. Mike, the answer’s tied up in the dark ages of NT’s history, but essentially the issue is that the network stack passes an indication at DPC time to the higher level components (TCP, UDP, then to RDR and SRV or WINSOCK).  Those components then get the opportunity to decrypt/decode/interpret the data in the indication data before they post a receive to retrieve the data.  TCP also generates acks at that time.

    I don’t know anything about the decisions that the networking team made, so I don’t know about the worker thread thing.

  18. OSGuy says:

    Larry – You said interrupts are serviced by CPU0 on both multi core and true MP machines – Really? I am fairly certain the OS can balance the interrupts across CPUs or can choose to pin them down to one CPU. So if Windows is designed to not balance IRQs across CPUs, it is a Windows design limitation and not a hardware limitation. Or am I missing something here?

    Linux, for example, uses the irqbalance daemon for balancing interrupts – http://www.irqbalance.org/documentation.php specifically mentions –

    "Intel chipsets (and similar chipsets from other vendors) use something akin to a table (it’s programmed into a component called IO-APIC) for this, and this "table" maps specific interrupts to specific cores or sets of cores. The standard table in our hardware effectively maps all interrupts to core 1 of the first socket. While this works, it also means that under high utilization (for example, on a really busy network) this core gets to spend a disproportional amount of work on processing the interrupts.

    It is the task of the interrupt balancing software to distribute this workload more evenly across the cores: to determine which interrupts should go to which core, and then fill this table for the chipset to use."

  19. George Ou says:

    Here’s a better explanation and how you can work around the issue with jumbo frames.

    http://blogs.zdnet.com/Ou/?p=711

    The point is that the 10K packets per second rate limit is hard-coded for the worst case scenario.  It does not account for faster multi-core CPUs.  My Core 2 Duo E6400 could have easily been set to 50K packets per second and the same assurances to multimedia would have been guaranteed.

    The other problem is that MMCSS doesn’t distinguish between idle, simple music playback, DVD, or HD Video playback since those clearly have different processor requirements.

    This is a poor design and Microsoft will need to fix this and the simplest and most reasonable way to address this is with a content and CPU aware dynamic rate limit for networking performance.  Making the rate limit per LAN interface and not amongst all the interfaces is probably a good idea too and that’s an obvious bug that needs to be fixed.

  20. OSGuy: I don’t know how Linux handles it, I just know that the guys who know this kind of stuff over here tell me that multicore machines handle interrupts on CPU0.  Some MP designs handle interrupts on separate processors, but the majority of the inexpensive (ie non server ones) just interrupt CPU0 – it’s cheaper :).

    I can’t speak to how Linux handles this, I’m not a kernel guy.

    George: I don’t think that anyone is defending the decision to go with 10K packets/second.  Certainly during the internal discussion of the problem (and I was on all of the emails) keeping the limit was never one of the suggestions tendered.  Btw, I do like your jumbo frames idea, it’s a good one.

    As Mark (and I) have said, we’re going to address this issue.

  21. OSGuy says:

    Larry – CPU0 is the default for handling interrupts, that’s how the table is programmed initially – most modern OSes dynamically reprogram the APIC to distribute the interrupts if CPU0 is overburdened with handling interrupts and other CPUs are relatively idle.

    Anyway I assumed you were interested in knowing this, so apologies if that wasn’t the case.

  22. George: Audio uses the "Audio" category (or the "Pro Audio" category); video playback uses the "Playback" category.

    There are many possible solutions to the problem; some of them have been mentioned on this thread, some of them haven’t.  The relevant teams have committed to providing a solution to the problem.

  23. George Ou says:

    There is a large variation on CPU requirements on HD video playback depending on the graphics card you use.  If you use a top of the line $500 ATI 2900, it ironically requires more than 60% of a dual-core processor to play back 1080p VC-1 video.  If you use a cheap $50 ATI 2400 with full VC-1 bitstream decoding and offloading, you can expect to see 7% CPU utilization on a dual-core processor.

    I don’t have a problem with the throttling mechanism and I think it’s needed.  It’s just that it needs to be more intelligent and account for the varying types of media playback and the capability of the CPU and GPU.  Now clearly, there’s no reason why MMCSS should engage the packet rate limiter (no matter how much) if Windows Media Player 11 is sitting idle or paused.

  24. T. Ferguson says:

    Larry, it’s pretty sad that an OS 6 years in development, Microsoft’s crown jewels, is bested by XP.  Really really sad and pathetic, in fact.

  25. Phaeron says:

    It seems to me that the main culprit here is that the multimedia stack in Windows Vista is always geared towards low latency, whereas most of the time this isn’t needed. For regular multimedia playback, I would actually prefer that the audio stack _not_ try to maintain a 10ms mixing interval, because that burns more CPU time in context switches and disturbs video display timing (which can be critical in windowed display). For regular non-interactive playback, all you need is matched latencies between the audio and video streams. You can mix every 100ms and still have perfectly synced, glitch-free audio.

    A 10ms mixing rate also seems like a bad idea for regular desktop usage on laptops, where you want to lower tick rates so the CPU can sleep a bit. Any thought to making this adaptive or tunable in a future version?

  26. Tanveer Badar says:

    Can you believe that Mark’s blog entry’s comments section is full of suggestions to kill MMCSS?

    I wrote a comment there and I would like to suggest it to readers here too; have a look at this page:

    http://msdn2.microsoft.com/en-us/library/ms684247.aspx

    Try tweaking the parameters mentioned there and share your results with others.

  27. Tom M says:

    10000 packets really isn’t all that much if your packets are just 64 bytes. That’s just 5 mbps. Well below some broadband speeds.

  28. George: This isn’t a CPU utilization problem.  If the multimedia processes were allowed to run at all, they’d be able to do their work.  But in this scenario, without the throttling, under certain network workloads, the multimedia processes don’t get scheduled for tens and hundreds of milliseconds.  Which causes massive glitching on even non HD content.

    The types of playback don’t actually matter.

  29. T: Actually Vista was 2.5 years in development – there was another year of prototyping called Longhorn :).

    Phaeron: Actually we’re getting hammered because we’re currently too high latency for certain very common workloads – like voice communications.

    Tom M: Yes, but how many real-world workloads use 64 byte packets at 10,000/second?  That’s how I was able to reproduce the problem on my 100mbit network: I used 600 byte packets.

  30. Tim Smith says:

    A book on PBX systems (phone systems) once explained it like this.

    When dealing with data traffic, you need 100% accuracy but can tolerate some latency.

    When dealing with voice (i.e. sound), you can tolerate little or no latency but can degrade quality (more lossy compression or lower sampling bit sizes).

    The two are diametrically opposed.  The Vista issue is almost the exact same scenario playing out on a computer.

    (I just noticed Larry even mentioned voice communications)

  31. Chris Benard says:

    There is a workaround for this problem already.  My previous roommate and friend found out you can work around it by removing a false service dependency:

    http://digg.com/microsoft/How_To_Fix_the_Vista_Network_Speed_Issue_While_Playing_Sound

  32. DaddyMac says:

    The obvious solution would be to give the ability to the running app to specify if it needs low latency or glitch-free audio.

  33. Chris: Sure, you can hack the system and remove the MMCSS dependency.  And your audio and video will glitch like crazy when you do just about anything with your machine.

    If you don’t care about multimedia performance, that may be an acceptable solution, but one of the important goals for Vista was that the system provide a dramatically better multimedia experience than XP did (it’s pretty much trivial to get XP to glitch – just running a CPU intensive process will do it).

  34. DaddyMac: We do.  You can opt into using MMCSS.

    Of course the Vista multimedia playback infrastructure opts in because we figure that the user wants a good experience.

    If you want a crappy multimedia experience, you can set MMCSS’s "SystemResponsiveness" parameter to 100 – that’ll turn off most of the CPU boost.

    Unfortunately because someone screwed up badly, it doesn’t turn off the network throttling.  Needless to say, some fairly senior people were a bit peeved when they learned this.

    That’s a small part of the fix for this problem.

  35. George Ou says:

    Still, I have zero problems playing audio or video while pushing or pulling 60 MB/sec or 480 mbps.  It’s obvious the throttling simply needs to be dynamic and take in to account how fast the user’s CPU is.

  36. Sébastien Mouren says:

    Doesn’t this issue call for hardware audio/video implementation with local buffer/memory, with better DMA and driver models, instead of the recent trend of relying on host-based audio/video processing?

  37. George, it’s not "obvious" at all.  There are fixes that don’t require the kind of dynamic throttling you’re describing.

    I hate having to say this, but…. Trust us.  The people working on this have literally decades of experience in designing extremely high performance systems (there are two distinguished engineers and a technical fellow involved in these discussions – you don’t get any more senior developers at Microsoft than that).  The teams working on the solution fully understand the problem and they believe they’ve got a solution that will address the issues that have been reported.

    We screwed up in Vista and implemented a throttling system that introduced a serious performance issue for certain classes of hardware.  We’ve acknowledged that and the teams involved are working hard at coming up with a resolution for this issue.

  38. Sebastien: Hardware acceleration of audio wouldn’t help this situation at all.  

  39. Mike says:

    I tested this in our office: gigabit connection between client and server with two gigabit switches in between.

    Downloading a large file from the server without WMP playing: about 40% usage of the gigabit NIC.

    Downloading a large file from the server with WMP playing: about 12% usage of the gigabit NIC.

    Obviously it’s quite annoying when you have to move large files around often. However, this rarely happens while WMP is playing (it’s an office, remember 😉 ).

  40. George Ou says:

    No it isn’t obvious and I haven’t seen the "slide show" effect even when I’m pulling in 400 mbps of data even when I’m playing DVDs.  I don’t even see a glitch when I’m playing back 10 videos at the same time.

    I don’t doubt your Sr. Engineers, but maybe we’re not communicating clearly here.  I just don’t see the problems on my hardware that you’re describing.

  41. Last week there was a small storm on the internet when it was discovered that playing music on Windows

  42. George Ou says:

    "George, it’s not "obvious" at all.  There are fixes that don’t require the kind of dynamic throttling you’re describing."

    I’m sure there is and I wouldn’t dare suggest I know more about this issue than Microsoft’s engineers.  But here is why I’m having a hard time with the explanation if you’ll look at the following screenshot.

    http://blogs.zdnet.com/Ou/images/rec-4k-with-dvd.png

    I’m receiving data at around 300 mbps and I’m playing back a high-quality high-bitrate DVD.  I saw zero glitches in the video and hear zero glitches in the audio.  There were some intermittent glitches in the DVD when certain processes I have yet to identify in my computer kicked in but they were not when I was receiving data over the network.

    There seems to be some glitch in the system and I don’t know if that’s third party software messing up or some glitches in the OS, but playing back DVDs and pulling in more than 300 mbps of data at the same time didn’t seem to be a problem at all.  Note that I am using jumbo frames to increase my throughput because of the 10K throttling.

  43. Mark says:

    Interesting to read all the comments, can I just say a big thanks to Larry for actually being here and answering questions? Most employees, Microsoft or not, would probably be hiding in a hole of silence and denial by now, and that’s even on official communication channels, let alone answering comments about closely guarded implementation details on their blog. Whether Microsoft stuffed up or not on gigabit ethernet transfers while playing audio is fairly immaterial to the vast majority of users out there who got Vista on their WalMart PC, it’s reassuring that Larry is willing to engage in technical discussions about it.

    Well done! 🙂

  44. Tim Smith says:

    There is one thing I find funny about this whole "We’ve already got a fix."

    How many people internal to MS have been using Vista?

    How many people beta tested it?

    How many people have been using it since release?

    Now it is reported that there is a network slowdown.  How many people does this issue really affect?

    But someone posts a "fix" and 20-30 people say it worked for them but a few said it made the issue worse.  But as far as the people it worked for, they consider the issue fixed.

    If you don’t get what I am saying, here it is.

    With such a poor install and test group for the "bug fix", chances are that all they are doing is causing one issue to go away while creating a totally different issue for other people.  

    With something as complicated as audio and networking, I’ll trust Microsoft over some dude and his roommate.

  45. George: 3rd-party software should be incapable of introducing glitches in Vista as long as MMCSS is running.  With MMCSS present, there are basically only two things that can generate glitches.  The first is long DPCs (which is the problem with the network); the second is if the disk is so utterly hammered that the I/Os to retrieve the multimedia data can’t be read from the disk.

    I suspect that the glitches you saw may very well be a result of the same networking issue (but of course I don’t know for sure since I’ve not collected perf traces on your machine).

    With the 10K throttling you shouldn’t see any glitches.  Jumbo frames won’t make a difference.

  46. Tim: Welcome to the world of Microsoft :).

    Both Raymond Chen and Ed Bott periodically go into a rage about people ponying up advice on how to speed up Windows by tweaking registry keys that don’t even exist.

    It’s fun :).

  47. Igor says:

    Let me say kudos for admitting the mistake, but…

    Larry said : "I hate having to say this, but…. Trust us."

    No can do Larry, not anymore. People trusted Microsoft too often in the past and each time they got screwed. Time for trust is over, people want proof.

    Larry said : "The people working on this have literally decades of experience in designing extremely high performance systems"

    I doubt it. If they have so much experience then they would:

    1. Be able to make 2ms audio latency possible

    2. Not make this stupid mistake

    or at least:

    3. Catch this problem in alpha version of Vista

    This is a shame, it is a clear sign of poor design, and it undermines customer trust even further.

    About CPUs and interrupts, other posters are right and I can confirm it (and you can read it in proper documentation) — any CPU in multi-core system can service interrupts. You just need to reconfigure an I/O APIC before booting application CPUs. If you still have any doubts, get any live Linux distro on a bootable CD and try it out.

    I believe that the cap on the network bandwidth (at least in the current form) is not necessary at all.

  48. Igor, I’m sorry you feel that way.  It’s possible that in the future, Mark will write a follow-on article which will express all the myriad of trade-offs involved in the discussions (which ranged from the capabilities of various network cards, processor design, scheduler design, test matrixes, and a boatload of other factors).  But I doubt it.

    I mentioned the root cause of this issue above: The networking people weren’t testing multimedia and the multimedia people didn’t have the hardware necessary to test the network.  Stuff happens, we realized the issue and we’ll learn from it.  One of the outcomes of the discussions about this problem is a better understanding of the issues that BOTH teams face to help ensure that mistakes like this don’t happen again.

    And that is the last I will say on this subject.  You can talk amongst yourselves if you like, but I’m done with this particular thread.

    Sorry about that.

  49. Mark says:

    I’m a car guy and a computer guy and you are correct about your analogy; it isn’t analogous… if a timing belt were to begin slipping, the engine, at the least, would quit running and would require physical attention before it would even run at idle speed again. At the worst, mechanical damage would occur, again requiring physical attention before running again.

    A better analogy, although still not perfect, is to say that as the engine has to run faster the ignition can’t keep up with the higher speed requirements and occasionally it misfires, causing a noticeable "stutter". The faster the engine runs, the worse the misfires. Reduce the engine speed and the misfires go away.

  50. I just ran a test using XP SP2 to see how this all turns out on that system:

    Box 1: Xeon, 3 GHz with Gigabit Ethernet, single chip 2 hyperthreaded processors

    Box 2: Core Duo, 2 GHz with Gigabit Ethernet, single chip 2 cores

    I used some custom software to blast packets between the computers on a TCP connection. This software only sends/receives the packets without looking at the data in them. Windows firewall was off. On box 2 I played a low resolution video (320 x 240, Audio: Windows Media Audio 9.1 32 kbps, 22 kHz, stereo (A/V) 1-pass CBR, Video: Windows Media Video 9) in Windows Media Player 9.

    Results:

    In all cases the video played and there were no sound clicks or pauses. In some cases, the video had small pauses or jerks. The Ethernet speed did not decrease in any test but continued at full speed.

    Tests:

    Sending from Box 1 to Box 2: got to 50% on Gigabit with about 50% CPU on box 1 and 45% on box 2. Mark’s Process Explorer showed about 20,000 context switches per second on interrupts and DPCs.

    Sending from Box 2 to Box 1: got to 30% on Gigabit with about 25% CPU on box 1 and 60% on box 2.

    Also sent a 1 GB folder of image files back and forth between the boxes. Network speed never got above about 15% in these tests.

    Interesting Note: The newer dual core machine could not send packets as fast as the older Xeon.

    Conclusions: CPU usage is high when sending network packets at high speed on XP. The media player has no problems with playing sound clearly with a high level of interrupts and DPCs.
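
The "custom software to blast packets" used in the test above is not shown. A minimal sketch of that kind of tool, assuming a plain TCP stream whose receiver counts bytes without looking at them, might look like the following. The names `run_sink` and `blast` are hypothetical, and the sketch runs over loopback only, so it exercises none of the NIC, interrupt, or DPC behavior discussed in this thread.

```python
import socket
import threading
import time


def run_sink(server_sock, result):
    """Accept one connection and discard everything it sends."""
    conn, _ = server_sock.accept()
    received = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:  # sender closed the connection
                break
            received += len(chunk)  # count bytes, never inspect them
    result["received"] = received


def blast(total_bytes=8 * 1024 * 1024, chunk_size=64 * 1024, host="127.0.0.1"):
    """Send total_bytes over TCP to a local sink; return (bytes, seconds, Mbit/s)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    result = {}
    sink = threading.Thread(target=run_sink, args=(server, result))
    sink.start()

    payload = b"\x00" * chunk_size
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as client:
        while sent < total_bytes:
            n = min(chunk_size, total_bytes - sent)
            client.sendall(payload[:n])
            sent += n
    sink.join()  # wait until the receiver has drained the stream
    elapsed = time.perf_counter() - start
    server.close()

    mbps = result["received"] * 8 / (elapsed * 1e6)
    return result["received"], elapsed, mbps


if __name__ == "__main__":
    received, elapsed, mbps = blast()
    print(f"received {received} bytes in {elapsed:.3f} s ({mbps:.1f} Mbit/s)")
```

To reproduce a two-machine test like the one described, the sink and the sender would have to run on separate boxes, with the host address changed accordingly.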

  51. George Ou says:

    "George: 3rd party should be incapable of introducing glitches in Vista as long as MMCSS is running.  With MMCSS present, there are basically only two things that can generate glitches.  The first is long DPCs (which is the problem with the network), the second is if the disk is so utterly hammered that the I/Os to retrieve the multimedia data can’t be read from the disk.

    I suspect that the glitches you saw may very well be a result of the same networking issue (but of course I don’t know for sure since I’ve not collected perf traces on your machine)."

    Oh, we may have a miscommunication here.  I was getting glitches in DVD playback while the network test was HALTED and nothing else was going on.  I was not having glitches while the 300 mbps test was in progress.

  52. yksoft1 says:

    On my Vista Business I ran "net start" and did not see the MMCSS service.

    And mmcss.dll is not present in %windir%\system32.

    It seems that one day I removed it, as I thought it was a tough trojan horse which changes the dependency of the Windows Audio service to prevent removal.

  53. Wilhelm Svenselius says:

    Igor: For someone who obviously doesn’t hold Microsoft in very high regard, you sure have no problem with holding them to a standard of absolute perfection.

    Bugs happen. If you are a programmer, you should know this. Operating systems are very complex and Windows is no exception. In terms of bugs, other OSes are no better, really, when compared to XP SP2 or Vista.

  54. Lucio Maciel says:

    Wilhelm Svenselius: Bugs happen, sure, but network throttling is not a BUG, it’s a design decision. It was designed to do that; it’s not there because of a software bug, it was an intentionally introduced "feature".

    Then I ask, what kind of software designer comes up with such a poor solution to the "Audio versus Network" problem?

  55. Bikedude says:

    Igor and OSguy, I am a bit surprised by your comments, because the motherboard I’m using (Tyan K8WE) will physically disable certain devices (like the second NIC) if the second CPU socket isn’t populated. Programming the APIC won’t help, because physically the connection is gone. (OTOH, in my config, some IRQs will then be serviced by the second CPU, but NIC1 and my soundcard are still both on CPU0)

    It seems to me that it would be up to the CPU manufacturer to allow the second core to service interrupts, but it is none too obvious that this is the case, and it does not solve all cases (e.g. the typical multi-socket single-core configs).

    I’d love to see Larry address this in a blog posting, but the "Linux does!" type of argumentation strikes me as a bit pointless. (Linux can do many strange and not so wonderful things as well)

  56. Igor says:

    Larry said: "And that is the last I will say on this subject."

    OK, but you still haven’t explained why we can’t have 2ms audio latency when those people are such experts.

    Wilhelm said: "For someone who obviously doesn’t hold Microsoft in very high regard…"

    Let me get this straight, I am _not_ a Microsoft hater (or /. or Linux troll). True, I am a developer but that doesn’t have anything to do with the subject.

    Let me remind you: the subject is a bad design decision to throttle down network traffic in case it _might_ interfere with audio. You can replace "audio" in the previous sentence with any other task and it would still be the wrong design decision.

    Bikedude said: "I am a bit surprised by your comments…"

    Why are you discussing something you don’t understand?

    I suggest you start reading "Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 3A – System Programming Guide" (document order #253668) chapter 8 titled "Advanced Programmable Interrupt Controller (APIC)" where it clearly says:

    "In multiple processor (MP) systems, it sends and receives interprocessor interrupt (IPI) messages to and from other logical processors on the system bus. IPI messages can be used to distribute interrupts among the processors in the system or to execute system wide functions (such as, booting up processors or distributing work among a group of processors)."

    While you are at it, take a good look at figures 8-2 and 8-3. Any CPU can handle any IRQ, and it is not some Linux voodoo; it is clearly documented.

    Why are we discussing this anyway?

    Amiga 500 which was released in 1987 was capable of playing four channel audio, playing animation and copying floppies at the same time and all that on a 7.09 MHz CPU.

    Compare that with a dual-core Windows PC running at 3,000 MHz today which still blocks while reading a floppy or a CD, and whose audio stutters when you connect to the internet using an internal soft-modem. To me it is blindingly obvious where the problem is coming from.

  57. OSGuy says:

    Bikedude – we are talking about a problem (sound causing network throughput slowdown) which is reproducible on dual core machines where two processor cores are present and interrupts can be routed to any one of them or both.

    Igor nailed it: throttling the network on the assumption that audio will skip is a really bad hack to cover up the problems in the networking stack, which can’t consume packets without taking over the full CPU.

    The reason Linux was highlighted was not fanboyism but to illustrate the fact that it is possible to do Gigabit speed Ethernet traffic and audio on modern CPUs without having to throttle one or the other.

    A nice scheduler ought to handle this situation just fine.

  58. Igor says:

    OSGuy said : "A nice scheduler ought to handle this situation just fine."

    Unfortunately I don’t think it would be that easy.

    In my Gigabit NIC (Intel Pro/1000 PL) driver properties there is an entry named "Interrupt Moderation Rate" which can be set to one of:

    Adaptive

    Extreme

    High

    Medium

    Low

    Minimal

    Off

    I presume that things would get even worse if you set this to Off and thus disable IRQ moderation completely, leading to an enormous number of IRQs when the NIC is heavily loaded.

    My point is that as long as an OS cannot handle such a high IRQ rate without dropping packets the issue at hand won’t be truly fixed, it will just be worked around until we all get 10Gbps adapters and then it will surface again.

    What is interesting is that the following Microsoft document claims that Vista can distribute IRQs to different cores and that it supports Message Signaled Interrupts:

    http://www.microsoft.com/whdc/system/bus/PCI/MSI.mspx

    Someone should either check his facts or urge for the document to be updated if what it says is incorrect. It is dated 2004, perhaps that was scrapped?
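
To put rough numbers on the interrupt-rate concern above, here is a back-of-the-envelope sketch. It assumes full-size 1500-byte Ethernet payloads and standard per-frame overhead on the wire, and it treats "one interrupt per frame" as the worst case you would approach with interrupt moderation set to Off. The 10,000 frames per second figure is the cap described in the post; the function names are made up for illustration.

```python
# Per-frame overhead on the wire, in bytes:
# 14 (Ethernet header) + 4 (FCS) + 8 (preamble) + 12 (inter-frame gap)
ETHERNET_OVERHEAD_BYTES = 14 + 4 + 8 + 12


def max_frames_per_second(link_bps, payload_bytes=1500):
    """Upper bound on frames per second a link can carry at line rate."""
    wire_bits = (payload_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_bps / wire_bits


def capped_throughput_mbps(frames_per_second, payload_bytes=1500):
    """Payload throughput in Mbit/s when only this many frames/s are processed."""
    return frames_per_second * payload_bytes * 8 / 1e6


if __name__ == "__main__":
    for name, bps in [("100 Mbit", 100e6), ("1 Gbit", 1e9), ("10 Gbit", 10e9)]:
        rate = max_frames_per_second(bps)
        print(f"{name}: about {rate:,.0f} full-size frames/s at line rate")
    cap = capped_throughput_mbps(10_000)
    print(f"10,000 frames/s cap: about {cap:.0f} Mbit/s of payload")
```

Under these assumptions a 100 Mbit link tops out below 10,000 full-size frames per second, which fits Larry's remark that the problem is hard to reproduce on 100 Mbit networks, while a gigabit link can deliver roughly eight times the cap.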

  59. Dennis says:

    Igor is right, Vista distributes interrupts to both cores!

  60. OSGuy says:

    Igor – The document you referred to talks about MSI and Interrupt Prioritization, both of which are different than dynamically routing interrupts to multiple CPUs based on load.

    The closest thing in the document which relates to interrupt routing is Interrupt Affinity. It clearly states that it is up to the driver to ask for such affinity. If the driver does not ask specifically to route interrupts to, say, all CPUs in the system, the default machine policy takes over (and I suspect that is to route to only BP). ALSO of interest is that the documentation says it is a feature which may be useful for NUMA systems.

    Further it says –

    "It is important to realize that establishing an affinity policy represents a request by the driver and its device, not an absolute constraint."

    A decent scheduler does matter here. Check out the Linux realtime preemption project, where the scheduler offers to fix all latency sources that generate latencies higher than ~1 msec. That is useful for this type of audio requirement. Google for Linux RT and Jackd.

  61. Shelby Cain says:

    What’s even worse is that this inherently flawed design impacts applications like Steam (a multiplayer gaming service that one normally leaves minimized in the system tray).  Simply having Steam loaded into memory causes Vista to throttle the network connection.