Hi DQS users,
I want to draw your attention to the Data Quality Services Performance Best Practices Guide that describes how to get the best performance out of DQS, especially when working with large data volumes.
As a knowledge-driven product, DQS needs relevant knowledge to do a good job of improving the quality of your data, and a good knowledge base is a key factor in achieving accurate results. Fortunately, the same holds for performance: a good knowledge base also helps DQS run efficiently. The converse is also true: poor knowledge will not produce reasonable results and will degrade DQS performance. The Performance Best Practices Guide explains why this is so, offers valuable insight into the inner workings of the DQS implementation, and gives recommendations on how to plan and use DQS for best performance.
The document is intended for database administrators and DQS users who plan and implement DQS projects. It starts with a short section giving some high-level numbers on the performance to expect on the recommended hardware; the rest, which makes up most of the document, is about planning successful and efficient DQS projects. If you work with large amounts of data, I recommend reading the entire document to avoid the common pitfalls.
The document covers the General Availability release of SQL Server 2012. We expect significant improvements down the road, hopefully within the next several months, and we will post updates as they become available.
The DQS Team