Combining relational and NoSQL concepts in SQL Server

Intro

In this article we will see how you can combine NoSQL and relational concepts in SQL Server database.

We will see how you can transform tables to JSON objects and vice versa using FOR JSON and OPENJSON functions.

Relational vs. NoSQL approach

Traditional relational schema design follow strict normalization approach – every logical entity is stored as a separate table and there are foreign key relationships between related entities. Imagine products that may have many reviews – you should create one table for products, another for reviews, propagate foreign key from product into reviews table, and add a foreign key relationship between them, as shown in the following figure:

 

Although this is a proper database design, if you have many related entities you might end-up with a lot of tables and foreign key relationships, you will need to join tables every time you need to take product information, maintain indexes or foreign key columns to speed-up joins, etc. Example of query that reads products and related reviews is shown in following example:

 SELECT Production.Product.ProductID AS ID, Production.Product.Name, 
 Color, ListPrice, ReviewerName, ReviewDate, Rating FROM Production.Product JOIN Production.ProductReview ON Production.ProductReview.ProductID = Production.Product.ProductID

Every time you need to retrieve related reviews you need to join tables and you would probably need to create additional index on foreign key column in Review table. Results are shown in the following table:

ID Name Color ListPrice ReviewerName ReviewDate Rating
709 Mountain Bike Socks, M White 9.50 John Smith 2007-10-20 00:00:00.000 5
937 HL Mountain Pedal Silver/Black 80.99 David 2007-12-15 00:00:00.000 4
937 HL Mountain Pedal Silver/Black 80.99 Jill 2007-12-17 00:00:00.000 2
798 Road-550-W Yellow, 40 Yellow 1120.49 Laura Norman 2007-12-17 00:00:00.000 5

On the client-side you probably need to process results and merge two 937 product rows as a single object with two related reviews items (e.g. as an array with two elements). Instead of single query that joins two tables you can use two separate queries. First query would read product information ad second query would read related product reviews. Some ORM tools use this approach when they need to read related informaiton. This is not so bad unless if you have a lot of related entities (e.g. product images, product attachments, product categories, etc.) and you need to run several independent queries for a single product.

One of the reasons why NoSQL systems become popular is the fact that you can use composite objects where you can store attributes of primary entity (product in our example) with related records (product reviews) within the primary entity as an array or collection of sub-objects. As an example, in MongoDb or DocumentDb you will create one JSON document for Product and add related reviews as an array of JSON objects like in the following example:

Products collection
{"ProductID":709,"Name":"Mountain Bike Socks, M","Color":"White","Reviews":[{"Reviewer":{"Name":"John Smith","Email":"john@fourthcoffee.com"},"ReviewDate":"2007-10-20T00:00:00","Rating":5,"ModifiedDate":"2007-10-20T00:00:00"}]}
 {"ProductID":798,"Name":"Road-550-W Yellow, 40","Color":"Yellow","Reviews":[{"Reviewer":{"Name":"Laura Norman","Email":"laura@treyresearch.net"},"ReviewDate":"2007-12-17T00:00:00","Rating":5,"ModifiedDate":"2007-12-17T00:00:00"}]}
 {"ProductID":937,"Name":"HL Mountain Pedal","Color":"Silver\/Black","Reviews":[{"Reviewer":{"Name":"David","Email":"david@graphicdesigninstitute.com"},"ReviewDate":"2007-12-15T00:00:00","Rating":4,"ModifiedDate":"2007-12-15T00:00:00"},{"Reviewer":{"Name":"Jill","Email":"jill@margiestravel.com"},"ReviewDate":"2007-12-17T00:00:00","Rating":2,"ModifiedDate":"2007-12-17T00:00:00"}]}

 

You have simpler data model, no JOINs, no additional requests/queries/indexes, all data available in the same record. However, this model is also far from perfect. Although it is a good choice for smaller data models, in more complex models you might end-up with heavy objects, or you would need to organize objects in separate collections. In some cases you might need to join objects stored in different collections, and you would find that this is not possible (e.g. in DocumentDb) or you need to write complex map/reduce jobs for a simple join (e.g. in MongoDb).

How to transform relational schema into NoSQL in SQL server?

SQL Server 2016/Azure SQL Database introduce hybrid approach where you can choose between relational and NoSQL concepts. As an example, if you have products and their reviews you don’t need to create separate tables if you don’t want them. You can create additional columns in the primary table that will contain collection of related entities formatted as JSON arrays:

 ALTER TABLE Production.Product
 ADD Reviews NVARCHAR(MAX) 
 CONSTRAINT [Reviews are formatted as JSON] CHECK(ISJSON(Reviews)>0)
 

In this example, we are adding a simple text column with a constraint that specifies that reviews should be formatted as JSON (similar to NoSQL databases). There is no new syntax for JSON constraint - you can use standard check constraint with function ISJSON that checks is Reviews text formatted as a JSON object.

If we want to move related product reviews from a separate table into this column we can use a simple UPDATE query:

 UPDATE Production.Product
 SET Reviews = (
 SELECT ReviewerName AS [Reviewer.Name],
 EmailAddress AS [Reviewer.Email],
 ReviewDate, Rating, ModifiedDate
 FROM Production.ProductReview
 WHERE Production.ProductReview.ProductID = Production.Product.ProductID
 FOR JSON PATH)

Inner query fetches all related reviews, formats them as JSON documents using FOR JSON clause and stores them as JSON text in Reviews column. We can format properties in JSON document using dot syntax (e.g. Product.Name will be created as a Name property in Product object).

If you don't have existing related table, you can insert new values as array of JSON objects, e.g.:

 INSERT INTO Production.Products(Name, Reviews)
 VALUES('HL Mountain Pedal', 
 '[
 {"Reviewer":{"Name":"John Smith","E-mail":"john@fourthcoffee.com"},"Rating":5},
 {"Reviewer":{"Name":"Mike"},"Rating":4}
 ]')

Values can be inserted as a plain.

Now, we can read products and related reviews with a single query:

 SELECT ProductID, Name, Color, ListPrice, Reviews
 FROM Production.Product
 WHERE Reviews IS NOT NULL

Results are shown in the following table:

ProductID Name Color ListPrice Reviews
709 Mountain Bike Socks, M White 9.5 [{"Reviewer":{"Name":"John Smith","E-mail":"john@fourthcoffee.com"},"ReviewDate":"2007-10-20T00:00:00","Rating":5,"ModifiedDate":"2007-10-20T00:00:00"}]
798 Road-550-W Yellow, 40 Yellow 1120.49 [{"Reviewer":{"Name":"Laura Norman","E-mail":"laura@treyresearch.net"},"ReviewDate":"2007-12-17T00:00:00","Rating":5,"ModifiedDate":"2007-12-17T00:00:00"}]
937 HL Mountain Pedal Silver/Black 80.99 [{"Reviewer":{"Name":"David","E-mail":"david@graphicdesigninstitute.com"},"ReviewDate":"2007-12-15T00:00:00","Rating":4,"ModifiedDate":"2007-12-15T00:00:00"},{"Reviewer":{"Name":"Jill","E-mail":"jill@margiestravel.com"},"ReviewDate":"2007-12-17T00:00:00","Rating":2,"ModifiedDate":"2007-12-17T00:00:00"}]

 

As you can see, reviews are returned as a collection of JSON objects. There is exactly one row per each product so you do not need any transformation on the client side. This is a perfect choice if your client already expects JSON format for related records (e.g. if you are using some Angular, Knockout or other template engines that inject JSON model into HTML view).

As an alternative if you want to “join” products with related reviews as in the query above, you can use following query:

 SELECT ProductID, Name, Color, ListPrice, ReviewerName, ReviewDate, Rating
 FROM Production.Product
 CROSS APPLY OPENJSON(Reviews)
 WITH ( ReviewerName nvarchar(30) '$.Reviewer.Name', ReviewDate datetime2, Rating int)
WHERE Reviews IS NOT NULL

OPENJSON table value function takes related reviews formatted as JSON and returns them as a table. In the WITH part you can specify schema of returned table. Column names match the names of properties in JSON object, and if you have nested property (e.g. Reviewer.Name) you can specify JSON path where this value can be found. CROSS APPLY joins parent row with table returned by OPENJSON functions. Results are shown in the following table:

ID Name Color ListPrice ReviewerName ReviewDate Rating
709 Mountain Bike Socks, M White 9.50 John Smith 2007-10-20 00:00:00.000 5
937 HL Mountain Pedal Silver/Black 80.99 David 2007-12-15 00:00:00.000 4
937 HL Mountain Pedal Silver/Black 80.99 Jill 2007-12-17 00:00:00.000 2
798 Road-550-W Yellow, 40 Yellow 1120.49 Laura Norman 2007-12-17 00:00:00.000 5

In both cases you don’t need to scan two tables – all necessary results are taken with a single query.

OK, but how to update reviews?

This is important thing you would need to consider. Updating JSON collection of objects is not so efficient as updating collection of rows in a separate table. In NoSQL models, updating fields imply updating one big blob of data where you

SQL Server provides JSON_MODIFY function that enables you to update a value on some path, delete or add new node, or append object at the end of the list of nodes.

In the following procedure is shown how can you append new review into the collection of product reviews:

 CREATE PROCEDURE
 AddProductReview(@productid int, @reviewerName nvarchar(30), @reviewerEmail nvarchar(30), @reviewDate datetime2, @rating int)
 AS
 BEGIN

 UPDATE Production.Product
 SET Reviews = JSON_MODIFY(Reviews, 'append $',
((SELECT @reviewerName AS [Reviewer.Name], @reviewerEmail AS [Reviewer.Email],
         @reviewDate AS ReviewDate, @rating AS Rating, getdate() AS ModifiedDate
  FOR JSON PATH, WITHOUT_ARRAY_WRAPPER))
 )
 WHERE ProductID = @productid;

END

this procedure accepts product id, and information about new review as parameters. You would need to transform review data to JSON using FOR JSON clause, and then use JSON_MODIFY to append the new review to the list of the existing reviews. If you provide review as JSON text to the procedure, you could simply append the input JSON object to the existing array.

Summary

Although SQL Server is relational database you don't need to use strict relational theory in your database design. With new JSON support that is coming in SQL Server 2016 and Azure SQL Database you can choose when to follow strict concepts of relational schema design, and when to format objects as in the NoSQL systems. SQL Server gives you flexibility to choose the best format for your applications and enables you to easily transform table data into JSON documents and JSON documents to table structure.