Handling inheritance with JSON

JSON support in SQL Server 2016 and Azure SQL Database enables you to handle custom fields and inheritance. As an example, imagine a People/Employees/Salespeople structure where an Employee is a kind of Person, and a Salesperson is a kind of Employee. This is a standard inheritance structure of entities. In earlier versions of SQL Server, you had several options for designing tables for this inheritance structure:

  1. Single table inheritance, where you put all fields from all subclasses in one wide table (e.g. People)
  2. Multiple table inheritance, where you create a separate table for every entity
  3. Entity-attribute-value pattern, where you keep common fields in one table (e.g. People) and store all custom fields in a separate (PersonId, FieldName, FieldValue) table

The new SQL Server 2016 WideWorldImporters sample database uses a different approach for handling inheritance: a JSON column holding key-value pairs. The Application.People table stores all kinds of people and has only the columns that are common to every type of person. Two flags, IsEmployee and IsSalesperson, represent the type of person, and one JSON column (CustomFields) contains the custom fields specific to each kind of person.

If a person is an employee (the IsEmployee column is equal to 1), the row has some additional fields (such as OtherLanguages, Title, and HireDate). Since these fields are specific to the employee type, they are stored in the JSON column as key-value pairs. We store them as JSON key-value pairs because we want to avoid additional sparse columns or a separate table for custom fields.

If a person is a salesperson (the IsSalesperson column is equal to 1), the row has the additional fields PrimarySalesTerritory and CommissionRate.

This resembles the EAV model, except that the key-value pairs are stored in a JSON column of the same table rather than in a separate table.
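To make the key-value idea concrete, here is a minimal sketch of what a CustomFields value for a salesperson might look like and how individual keys can be read from it. The JSON content and the variable name @CustomFields are made up for illustration; only the key names come from the text above.

```sql
-- Hypothetical CustomFields content for one salesperson (illustrative values only).
DECLARE @CustomFields nvarchar(max) = N'{
    "OtherLanguages": ["Polish", "Chinese"],
    "HireDate": "2008-04-19T00:00:00",
    "Title": "Team Member",
    "PrimarySalesTerritory": "Plains",
    "CommissionRate": "0.98"
}';

-- ISJSON returns 1 when the text is well-formed JSON.
-- JSON_VALUE reads a scalar value; JSON_QUERY reads an object or array.
SELECT ISJSON(@CustomFields)                                AS IsValidJson,
       JSON_VALUE(@CustomFields, '$.Title')                 AS Title,
       JSON_VALUE(@CustomFields, '$.PrimarySalesTerritory') AS PrimarySalesTerritory,
       JSON_QUERY(@CustomFields, '$.OtherLanguages')        AS OtherLanguages;
```

A person who is only an employee would simply lack the PrimarySalesTerritory and CommissionRate keys, and JSON_VALUE would return NULL for them in the default lax mode.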

Returning data as JSON

This model enables you to access all data with a single table read, e.g.:

 select PersonID, FullName, PhoneNumber, FaxNumber, EmailAddress, CustomFields
from Application.People
where PersonID = 17

This query returns the common database columns and all custom fields as one text field. If your client understands JSON (e.g. an AngularJS single-page app), the raw content of the CustomFields column can be displayed directly.

If you have a JavaScript client that understands JSON, you can return the entire result as JSON using the FOR JSON clause:

 select PersonID, FullName, PhoneNumber, FaxNumber, EmailAddress, JSON_QUERY(CustomFields) CustomFields
from Application.People
where PersonID = 17
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER

Note one important thing here: we need to wrap CustomFields in a JSON_QUERY call. If you use the CustomFields column directly in FOR JSON, its content is surrounded with double quotes and escaped, because the column is treated as plain text. If you wrap it with JSON_QUERY, FOR JSON knows that the value is valid JSON and does not escape it.
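The difference can be seen in a small standalone sketch that does not need the sample database. The variable @CustomFields and its content are hypothetical stand-ins for the column.

```sql
-- Hypothetical JSON text standing in for the CustomFields column.
DECLARE @CustomFields nvarchar(max) = N'{"Title":"Team Member"}';

-- Without JSON_QUERY: the value is treated as plain text, so it is quoted and escaped.
SELECT 17 AS PersonID, @CustomFields AS CustomFields
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER;
-- {"PersonID":17,"CustomFields":"{\"Title\":\"Team Member\"}"}

-- With JSON_QUERY: the value is embedded as a nested JSON object.
SELECT 17 AS PersonID, JSON_QUERY(@CustomFields) AS CustomFields
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER;
-- {"PersonID":17,"CustomFields":{"Title":"Team Member"}}
```

A client that receives the first form has to parse CustomFields a second time, while the second form can be consumed as a single JSON document.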

Returning data to clients that don't understand JSON

In some cases you might want a relational view over the JSON values.

Imagine that you have Power BI or SSRS reports that connect to the People table as a data source. They don't understand JSON and they need standard columns. Reports usually rely on predefined columns, so you cannot hand them raw JSON text.

If you can write the T-SQL query that defines the data source, you can use JSON_VALUE or OPENJSON in that query to read values from JSON, e.g.:

 select PersonID, FullName, PhoneNumber, FaxNumber, EmailAddress, Title, HireDate
from Application.People
 cross apply OPENJSON(CustomFields)
             WITH(Title nvarchar(50), HireDate datetime2)

When a reporting tool executes this query, it sees Title and HireDate as standard columns. In this case it might be better to put the query in a stored procedure that the reporting tool calls.
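The same result can be sketched with JSON_VALUE instead of OPENJSON, the other function mentioned above. This is only one possible formulation, and the explicit date conversion is an assumption about how HireDate is stored (as an ISO-style date string).

```sql
-- Same columns as the OPENJSON query, but each value is read with JSON_VALUE.
-- JSON_VALUE returns nvarchar(4000), so HireDate is converted explicitly;
-- TRY_CONVERT yields NULL instead of an error if a value is not a valid date.
SELECT PersonID, FullName, PhoneNumber, FaxNumber, EmailAddress,
       JSON_VALUE(CustomFields, '$.Title') AS Title,
       TRY_CONVERT(datetime2, JSON_VALUE(CustomFields, '$.HireDate')) AS HireDate
FROM Application.People;
```

The two formulations may differ for rows where CustomFields is NULL: JSON_VALUE returns NULL values for such rows, whereas CROSS APPLY OPENJSON is expected to leave them out of the result.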

An alternative is to create views that encapsulate the JSON values and use these views as the data source.

You can create an Employees view that keeps only people with the IsEmployee flag set and adds the custom fields specific to employees:

 drop view if exists Application.Employees
go
create view Application.Employees as
select PersonID, FullName, PhoneNumber, FaxNumber, EmailAddress, Title, HireDate
from Application.People
 cross apply OPENJSON(CustomFields)
             WITH(Title nvarchar(50), HireDate datetime2)
WHERE IsEmployee = 1

Or we can have a view called Salespeople that keeps only people with the IsSalesperson flag set and adds the custom fields specific to salespeople:

 drop view if exists Application.Salespeople
go
create view Application.Salespeople as
select PersonID, FullName, PhoneNumber, FaxNumber, EmailAddress, Title, HireDate, PrimarySalesTerritory, CommissionRate
from Application.People
 cross apply OPENJSON(CustomFields)
             WITH(Title nvarchar(50), HireDate datetime2, PrimarySalesTerritory nvarchar(50), CommissionRate float)
WHERE IsSalesperson = 1

External applications can read data from these views without being aware that the custom fields come from a JSON column. The following queries return columns that look like any standard table columns:

 select * from Application.Employees
select * from Application.Salespeople

The only constraint is that fields coming from JSON cannot be updated directly through the view, e.g. with UPDATE Application.Salespeople SET CommissionRate = 0.15. You need to use the JSON_MODIFY function to update values in the JSON column.
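A minimal sketch of such an update against the base table is shown below. The @PersonID variable and its value are hypothetical, and the statement assumes that CommissionRate should be stored as a JSON number.

```sql
-- Hypothetical target row; replace with the PersonID of a real salesperson.
DECLARE @PersonID int = 17;

-- JSON_MODIFY returns the updated JSON text, which is written back to the column.
-- If the existing data stores the rate as a string, pass N'0.15' instead of 0.15.
UPDATE Application.People
SET CustomFields = JSON_MODIFY(CustomFields, '$.CommissionRate', 0.15)
WHERE PersonID = @PersonID
  AND IsSalesperson = 1;
```

After the update, the Application.Salespeople view returns the new CommissionRate value for that person, because the view reads the JSON column on every query.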

Conclusion

There is no perfect structure for mapping OO inheritance to a relational database, but this is one approach. Compared to the other design approaches, it has some pros and cons:

  1. Compared to single table inheritance, you avoid table schema explosion, where you need to add a new column for each property of every subclass. The downside is that accessing JSON values is slower than a direct column reference.
  2. Multiple table inheritance enables you to organize your properties into separate tables that can be small and manageable. However, returning information about entities requires a lot of joins.
  3. An entity-attribute-value table is similar to JSON, but with JSON you have only one table access instead of reading from two tables.