Advanced Performance Tuning – 4 :: Designing for better performance contd…
Few months back, I was working with one of my customers, who was engaged by his client to design a Survey Module. Nothing very fancy about the module, but the kind of reports that were to be delivered made it really tricky.
Let me explain the requirements first: When customers click on a Survey Link, they are redirected to a web page. This page has 20 questions; for each question, customers are supposed to provide a rating between 1 and 5, both inclusive. Of course, customers can skip any question they do not want to rate; and finally, there will be a text box where they can enter their comments. Click on Save, and the data gets saved into a database. Pretty straight-forward, isn’t it?
My customer designed the database schema such that the main tables that had the Survey Results was designed as below:
CREATE TABLE Surveys (
SurveyID INT IDENTITY(1,1) NOT NULL,
RefNo CHAR(10) NOT NULL,
SurveyDate DATETIME NOT NULL,
...
CONSTRAINT [PK_Surveys] PRIMARY KEY CLUSTERED
( [SurveyID] ASC ),
CONSTRAINT [CK_RefNo] UNIQUE NONCLUSTERED
( [RefNo] ASC )
)
GO
CREATE TABLE SurveyResults (
SurveyID INT NOT NULL,
Answer1 SMALLINT NULL,
Answer2 INT NULL,
Answer3 INT NULL,
Answer4 INT NULL,
Answer5 INT NULL,
Answer6 INT NULL,
Answer7 INT NULL,
Answer8 INT NULL,
Answer9 INT NULL,
Answer10 INT NULL,
Answer11 INT NULL,
Answer12 INT NULL,
Answer13 INT NULL,
Answer14 INT NULL,
Answer15 INT NULL,
Answer16 INT NULL,
Answer17 INT NULL,
Answer18 INT NULL,
Answer19 INT NULL,
Answer20 INT NULL )
GO
CREATE TABLE SurveyComments (
SurveyID INT NOT NULL,
Comments VARCHAR(500) NULL )
GO
I must say, the design was very impressive, and, added to that, appropriate indexes made the Reporting data retrieval very fast.
And then, one fine day, their client asked them to design a set of new Reports; they coded the stored procedures and queries, but unfortunately, they were unable to get these queries perform as expected. Some of the requirements that were shared with me were:
-
List of Surveys where customers had rated 4 or 5 in at least 10 questions.
-
List of Surveys where customers had rated 2 or less in at most 5 questions.
-
List of Surveys where customers had rated 5 or 1 in all questions (possibly junk surveys).
-
List of Surveys where the average rating was 3 or more.
With this design, the developer had to push the data into a Temp Table and then apply all sorts of logic to the data in the Temp Table to get the desired output. This not only took a lot of time, but also increased I/O and CPU utilization. At this point, they came to me for help.
After carefully studying and analysing the situation, I came to the conclusion that with the existing design and code, further optimization was nearly impossible. We could have worked on the code, but I was not sure if we could achieve the benchmarks that were already set. I suggested a change in the schema of the SurveyResults table. My suggestion of the schema was as follows:
CREATE TABLE SurveyResults (
SurveyID INT NOT NULL,
QuestionID SMALLINT NOT NULL,
Answer SMALLINT NULL )
GO
With this design, each Survey would be saved as 20 rows, per question. Of course, this design change warranted change in all the reporting queries; however, given the complexity of Reporting requirements, this was probably the most appropriate change. The developers made all the necessary code changes, data was transformed, and we tested, tested and tested all over again; and with all necessary Indexes in place, we were able to provide the expected performance.
This exercise, clearly demonstrated how, a schema design can affect or improve query performance.
Disclaimer: All information provided here is my personal opinion and is neither verified nor approved by Microsoft before it is published. All information, and code samples, if any, is provided "AS IS" with no warranties and confers no rights.