I just love this question. It has such a simple answer... Right? (Uh, yeah. Sure.) You can start with this post from the product group (http://msdn.microsoft.com/en-us/library/ff937706.aspx), which is as good a starting place as any, but as a teammate just said, "That's sort of a 'Your Mileage May Vary' answer. Right?" Yep, it is. Here is the reply I sent to the team. Of course, I was immediately challenged to turn it into a blog post. (Challenge accepted.)
Unfortunately, there isn’t anything much more precise available. Think of the Visual Studio Test Agent as an operating system instead of an application (I use the same analogy with IIS): it “hosts” your application inside it. If your test code is lean and mean, AND the app you are hitting is also lean and mean, AND you do not have anything fancy in the way of rules and plugins, AND… then you can host a lot of VUs (virtual users).
If, on the other hand, your tests have a lot of rules, or you have to sift through large payloads to extract values, or you wrote sloppy code in your plugins, etc., then you will host far fewer VUs on the same machine.
I did a gig a couple of years back where we were able to host approximately 21,000 VUs on a single machine. We had written extremely lean tests, turned off every feature we could find, and even ripped out a couple of features of VS 2010 to make it work, but we had to, since we were driving 500,000 concurrent VUs (I was testing for the launch of Halo: Reach). I was at a customer a little while ago, and we were able to sustain only about 250 concurrent VUs per machine there. They had coded web tests with nesting and all kinds of rules and injected logic, etc. Those machines were almost as powerful as the ones for the Halo testing.
There are counters built into VS to track the % time spent in rules and plugins for the sake of tuning the test harness, and you should always tune it. Remember that a test harness is “a server-side application,” and we should be as careful and thorough writing, testing, and using it as we expect our customers to be with the apps we’re driving load against.
One thing I forgot to add to the email to my teammates, and am adding here now, is the consideration of the physical size of the machines. I had an engagement where I was asked to simulate 10,000 instances of SQL Server using Sync Services to talk to 4 main SQL Servers. To do this, we estimated that we would need 108 test agent machines with a grand total of 1,040 cores (yep, over 1,000 physical processing cores in the rig). We were driving unit tests the customer had created through a single controller, and we were unable to get the system stable enough to drive load. After a few days of debugging and troubleshooting, we found a couple of issues in their code and cleaned them up. The one that made the biggest difference in performance was getting rid of a call to GC.Collect(). Once that call was gone, the harness ran fast enough that we dropped from needing 108 servers to needing only 20.
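The customer's harness was .NET, but the forced-collection anti-pattern reproduces in any garbage-collected runtime. Here's a minimal Java sketch (my illustration, not the customer's code) that runs the same allocation-heavy work twice, once forcing a collection every iteration the way their GC.Collect() call did, and once letting the runtime decide. The `doWork` method and iteration counts are made up for the demo; the point is that both paths compute the same result, but the forced-GC path stalls the whole process on every pass.

```java
// Illustrative micro-benchmark: the cost of forcing a collection per iteration.
public class ForcedGcDemo {

    // Simulates one "virtual user" iteration: build and discard some garbage,
    // returning a checksum so the work can't be optimized away.
    static long doWork(int iterations, boolean forceGc) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            StringBuilder sb = new StringBuilder();
            for (int j = 0; j < 100; j++) {
                sb.append(i).append(':').append(j).append(';');
            }
            checksum += sb.length();
            if (forceGc) {
                // The anti-pattern: a full, blocking collection on every pass,
                // stalling every thread in the process (like GC.Collect() in .NET).
                System.gc();
            }
        }
        return checksum;
    }

    public static void main(String[] args) {
        long t1 = System.nanoTime();
        long withGc = doWork(200, true);
        long t2 = System.nanoTime();
        long withoutGc = doWork(200, false);
        long t3 = System.nanoTime();

        System.out.println("forced GC each pass: " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("runtime-managed GC:  " + (t3 - t2) / 1_000_000 + " ms");
        System.out.println("checksums equal:     " + (withGc == withoutGc));
    }
}
```

Run it and compare the two timings on your own hardware; the gap varies by JVM and heap settings, but removing the forced collection never changes the result, only the time spent getting it.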
Now, what I didn't tell you is that the 108 machines were not all the same size. The 20 machines I dropped down to were all Dell R910s with 32 cores and 128 GB of RAM each, so I was still using 640 cores.