Refactoring and performance

I'm currently reading Test Driven Development in Microsoft .NET. There's a lot in the book I like, but one thing rubbed me the wrong way. In the "Refactoring by example" chapter, they take a prime number algorithm and refactor it. At one step, a loop that runs up to sqrt(n) gets merged with one that runs up to n. They do some hand-waving that this might not be as efficient and that you should check to make sure. That seems to go against the philosophy of refactoring they describe up to that point in the book: refactoring should improve the internal structure of the code without making externally visible changes. And the test-driven philosophy is that you write tests for all externally visible features. So shouldn't there be tests that define the performance requirements, and shouldn't the verification that performance is still acceptable after a refactoring be done by checking that those tests still pass?
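
To make the issue concrete, here's a rough sketch of the kind of change in question. The book's code is in C#; this is Python, and the names and shapes here are mine, not theirs. Both functions return the same answers for every n, so behavior-only tests keep passing after the merge, even though one does O(sqrt(n)) work per call and the other does O(n).

```python
import math

def is_prime_sqrt(n):
    """Trial division that stops at sqrt(n)."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

def is_prime_merged(n):
    """The same check after folding the sqrt-bounded loop into one that runs to n."""
    if n < 2:
        return False
    for d in range(2, n):
        if n % d == 0:
            return False
    return True
```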

Of course, this is easier said than done. Automated performance testing is hard, especially on arbitrary machines. In the MacBU they have a great performance testing infrastructure that tests the performance of every build. It tells you whether any change hurt performance, and how much any performance improvement actually helped. The tests always run on a specific machine on an isolated network to keep results reproducible. Doing this on every developer's machine is a bit more complex. How do you know how long a test should take? What do you compare it to? Even defining performance requirements is difficult. What CPU speed? How much memory? How many other applications running? And so on.
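
For what it's worth, one way to dodge the "what machine?" question is to compare the code against itself at two input sizes rather than against a wall-clock budget, so the assertion is about how cost grows instead of how long anything takes. This is only a sketch of that idea, not how the MacBU system works, and the primes and threshold are arbitrary choices of mine:

```python
import math
import timeit

def is_prime(n):
    """Trial division up to sqrt(n); the behavior this test is meant to protect."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

def best_time(n, number=200):
    """Best-of-five timing of is_prime(n), to damp scheduling noise."""
    return min(timeit.repeat(lambda: is_prime(n), number=number, repeat=5))

def test_cost_scales_like_sqrt():
    small, large = 1_000_003, 100_000_007  # both prime; large is ~100x small
    ratio = best_time(large) / best_time(small)
    # A sqrt-bounded loop should be roughly 10x slower on the larger prime,
    # while a loop that runs all the way to n would be roughly 100x slower.
    # The cutoff of 30 is deliberately generous slack between those two.
    assert ratio < 30
```

It's still fuzzy, since the threshold is a judgment call and a loaded machine adds noise, but at least that refactoring step would trip a failing test instead of a hand wave.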

I wish the book addressed this issue. I don't know if any other books in this genre do...