2. Application Patterns and database performance: Academically correct object modeling.

 

Another pattern that can cause too many round trips to the database is when each object is treated as a black box and is responsible for retrieving and saving its own data. This by itself is not the problem; it’s when the object hierarchy reaches three or more levels. The natural programming pattern becomes this: for each row in the parent object, retrieve all the related child rows in a separate query. Two levels in the hierarchy is not too bad unless the final record set is huge. But carrying this pattern to the third and fourth level, you can imagine that the number of round trips to the database can geometrically increase.

 

            Consider the following object hierarchy: Project – Task – Sub Task – task history.

This is a hierarchy for building a house. The house is modeled in the project table. The tasks for each vendor are stored in the tasks table. Each task has several steps, and each step can have multiple history records. The first round trip to the database was done by the application to retrieve the list of houses. The user would pick a house to work on. Then the application will retrieve all tasks for the project. This is a normal pattern so far and we have not generated any bad performance. But then the code gets interesting because for each task there is another trip to the database to retrieve its sub task. Then for each subtask there is a separate round trip to the data base to get its history. This is done because the objects are a black box and have no knowledge of other objects even if they are related. The really bad aspect of this approach is that all these round trips were done with one user action!

 

Solution A:   Create a view in the database that joins all the tables and returns one data set. Use that to populate your objects. I’ve used OUTER JOINS in the sample below so that all projects get returned, even if they don’t have any tasks, sub-tasks or history.

Create View vProject AS SELECT * FROM Project P OUTER JOIN Task T on P.Proj_ID = T.Proj_ID OUTER

JOIN SubTask ST

on T.Task_ID = ST.Task_ID OUTER JOIN TaskHistory TH on ST.SubTask_ID = TH.SubTask_ID

 

Solution B:   For those of you who don’t like returning everything in one record set, another good solution is to have a business object that can encompass all the related entities and populate each individual object as needed, the code will execute a stored procedure that well return multiple result sets. For Example:

Execute dbo.pr_GetTasksForProject @parm1=100

Which does:

            SELECT * FROM Project WHERE Proj_ID = @parm1

 

SELECT T.* FROM Project P OUTER JOIN Task T on P.Proj_ID = T.Proj_ID AND P.Proj_ID = @parm1

 

SELECT ST.* FROM Project P OUTER JOIN Task T on P.Proj_ID = T.Proj_ID OUTER

JOIN SubTask ST

on T.Task_ID = ST.Task_ID OUTER JOIN TaskHistory TH on ST.SubTask_ID = TH.SubTask_ID AND P.Proj_ID = @parm1

 

SELECT TH.* FROM Project P OUTER JOIN Task T on P.Proj_ID = T.Proj_ID OUTER

JOIN SubTask ST

on T.Task_ID = ST.Task_ID OUTER JOIN TaskHistory TH on ST.SubTask_ID = TH.SubTask_ID AND P.Proj_ID = @parm1

 

This will return multiple record sets So you well have to remember to create the proper code to loop through all the ADO return sets (using IDataSet.NextResult()) as was shown in an earlier blog. You will notice that the last select statement is almost the same select as the view in solution A except that in this case it is only returning data from one table. One customer preferred this solution because it was easier to plug into their current object model – they simply loaded each class object from each result set.

 

Solution C:   This project example is a hierarchy and may be perfect for XML. You could return XML from the relational tables by using the FOR XML clause in the select statement. But is this a perfect case for using the XML data type, or should you map your XML input to relational tables? Read these articles before you decide.

https://msdn.microsoft.com/xml/default.aspx?pull=/library/en-us/dnsql90/html/sql25xmlbp.asp

https://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/sqloptxml.asp

 

Discovery:   What is an easy way to discover whether you have these patterns in your code? The first way is to ask your users which screens and actions are the slowest. They will be able to direct you to the slowest portions of the application. A more scientific way is to use SQL Profiler in a carefully controlled test environment. Here is a step-by-step method for discovering the number of round trips for each user action in your application.

  1. Start SQL Profiler. Start a new trace using the default events and columns. Add any filters you need to ensure that you are only getting the commands that are coming from your session.
  2. Start your application. Look at the SQL Profiler screen and count the number of commands that occurred during the start up.
  3. Simulate normal work by your user base, stopping after each action to record the number of commands that appear in SQL Profiler. Note that a screen load is a valid user action.

The point of this last section is that collecting this metric is a task that you should add to your test routines. And beware of using simple test data; i.e. an invoice with one detail line item. Create at least one customer with a large number of instances per object, especially where you have a multi-level object hierarchy.