Product catalog is a common pattern in e-commerce. The fact is that products might have various properties. Some products might have weight, some color, taste, price, etc. If you want to create generic table that should store different products, this table could contain few columns that are common to all products and a lot of custom columns that are applicable only to some specific types of properties. In this case you would either put a lot of columns in product table and make them sparse, create some kind of table inheritance, or use EAV pattern.
I need to model product catalog where applicable properties depends on type of product. I want to avoid huge tables with tens or hundreds of sparse columns, complex table inheritance hierarchy, or separate table with key:value pairs that should be always joined with Product table.
If properties of some entities might vary, we can combine relational columns with JSON key:values. We can divide product properties using the following rules:
- Properties that are common for all products will be stored as columns (e.g. title, price, foreign keys)
- Properties that are not common can be stored in single JSON column as key:value pairs.
This way we have one simple table that does not require additional JOIN operators, and still enable us to use properties in JSON as any other column.
Case study – Adventure Works
If you look at Production.Product table in AdventureWorks2016 database you will notice that only few columns such as Name, ProductModelID, ModifiedDate, and are generic. Other columns are specific for products in AW domain and if we would like to store new types of product such as self phones, cars, etc., we would need to introduce new columns while most of the existing columns will not be used because they might not be applicable.
Therefore, I will create more generic and simple table ProductCatalog that will contain JSON key value pairs for most of the columns in the Product table that are specific to bicycles:
DROP TABLE IF EXISTS [dbo].[ProductCatalog]
CREATE TABLE [dbo].[ProductCatalog](
[ProductID] [int] IDENTITY(1,1) NOT NULL CONSTRAINT [PK_ProductCatalog_ID] PRIMARY KEY ,
[Name] nvarchar(50) NOT NULL,
[ProductSubcategoryID] [int] NULL,
[ProductModelID] [int] NULL,
[rowguid] [uniqueidentifier] NOT NULL,
[ModifiedDate] [datetime] NOT NULL,
[Data] [nvarchar](max) NULL
Now I can move products into this table:
SET IDENTITY_INSERT ProductCatalog ON
INSERT INTO ProductCatalog(ProductID, Name, ProductSubcategoryID, ProductModelID, rowguid, ModifiedDate,Data)
SELECT ProductID, Name, ProductSubcategoryID, ProductModelID, rowguid, ModifiedDate,
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER) AS Data
SET IDENTITY_INSERT ProductCatalog OFF
Only columns that are common across all products are retained as table columns. All other columns are formatted as JSON key value pairs and stored in Data column.
Now we can return all additional properties as a whole JSON object or slice results using some properties:
SELECT *, JSON_VALUE(Data, '$.Color') AS Color
WHERE JSON_VALUE(Data, '$.ProductNumber') = 'DT-2377'
With JSON in SQL Server 2016 we can easily create flexible key:value structures that could be contain any set of properties. This is common NoSQL pattern that could be easily applied in SQL server.
In the second part of this case study http://blogs.msdn.com/b/sqlserverstorageengine/archive/2015/12/21/indexing-data-in-json-product-catalogs.aspx you might see how to add indexes in this product catalog.