Scale and performance lessons learned from a live deployment

My name is Dan Blood, and I am a senior tester in the product group responsible for Search within MOSS and MSS.  One of my responsibilities is maintaining SearchBeta, a mirror of the enterprise search solution that all of Microsoft uses.  This mirror indexes approximately 28 million documents on the corporate intranet and serves all of the queries echoed from the main enterprise search system, so it has virtually the same query load and data as Microsoft's enterprise portal.  The product group routinely uses this mirror to validate QFEs and potential fixes, and to observe how the system behaves at scale.  The mirror has been up and running for several months now, and I, along with the team, have learned a lot from the experience.

My intent for this series of blog postings is to provide details on the lessons that we have learned.  For example:

  • What have I done to optimize the hardware?
  • What can be done with the crawling system to ensure the 28+ million documents are "freshly" indexed?
  • How has the SQL machine been configured for optimal use?
  • How do I monitor the system to make sure it is healthy?

Over the next few weeks I'll be updating the blog with the above information.  If you have questions about maintaining a large-scale enterprise search deployment, feel free to post them as comments on this post, and I'll try to answer them in follow-up postings.

Thank you for your interest and comments.  Check back in the next few days for the first post in this series.

Dan Blood
Senior Tester
Microsoft Corp