What's new for In-Memory OLTP in SQL Server 2016 since CTP3

Article
03/25/2016

SQL Server 2016 is making a lot of enhancements to In-Memory OLTP to make it easier to use and perform even better. In a previous post I listed all the new features that had been included in SQL Server 2016 up to and including CTP3. But we have added a lot of new features since then, including NULLable index key columns, LOB types and auto-update of statistics. Below are all the new features for In-Memory OLTP that we added between CTP3 and RC0. Underneath the list of new features you will find more detailed descriptions of LOBs and other off-row columns, ALTER TABLE improvements, and statistics improvements.

In-Memory OLTP features added between CTP3 and RC0

Transact-SQL Improvements:

Query Surface Area in Native Modules:
- LOB types [varchar(max), nvarchar(max), and varbinary(max)] for parameters and variables.
- OUTPUT clause: In a natively compiled stored procedure, INSERT and UPDATE and DELETE statements can now include the OUTPUT clause.
- @@SPID: this built-in function is now supported with natively compiled T-SQL modules, as well as constraints in memory-optimized table.
Support with memory-optimized tables for:
- NULLable index key columns. It is now allowed to include NULLable columns in the keys of indexes on memory-optimized tables.
- Large rows: LOB types [varchar(max), nvarchar(max), and varbinary(max)] can be used with columns in memory-optimized tables. In addition, you can have a memory-optimized table with row size > 8060 bytes, even when no column in the table is a LOB type. Detailed considerations are below.
- UNIQUE indexes in memory-optimized tables. Indexes can now be specified as UNIQUE.
Heap scan: the query processor can now scan the rows in a table heap data structure in memory directly. When a full table scan is needed, this is more efficient than a full index scan.
Parallel scan: all index types, as well as the underlying table heap, now support parallel scan. This increases the performance of analytical queries that scan large sets of data.
Reduced downtime during upgrade: Upgrade from an earlier build of SQL Server 2016 to the latest build of SQL Server no longer runs database recovery. Therefore, data size no longer affects the duration of upgrade. For upgrade and attach/restore from SQL Server 2014, the database is restarted once, therefore the downtime experienced during upgrade of a database from SQL2014 is in the order of [time required for database recovery].
Log-optimized and parallel ALTER TABLE: Most ALTER TABLE scenarios now run in parallel and optimize writes to the transaction log. The optimization is that only the metadata changes are written to the transaction log. For a detailed discussion of exceptions, see below.
Statistics improvements:
- Automatic update of statistics is now supported. It is no longer required to manually update statistics.
- Sampling of statistics is now supported. This improves the performance of statistics collection.
- Note that automatic recompilation of native modules is not supported. They need to be recompiled manually using sp_recompile.
- More details about statistics below.

LOBs and other off-row columns

Large object (LOB) types varchar(max), nvarchar(max) and varbinary(max) are now supported with memory-optimized tables and table types, as well as natively compiled T-SQL modules, and the size limitations mirror that of disk-based tables (2GB limit on LOB values). In addition, you can have a memory-optimized table with a row size > 8060 bytes, even when no column in the table uses a LOB type. There is no run-time limitation on the size of rows or the data in individual columns; this is part of the table definition. Of course, all data does need to fit in memory.

Even though large columns are now supported, it is still recommended to have most columns fit within 8060 bytes for performance reasons. Details below.

The following T-SQL script illustrates a table with multiple large non-LOB columns and a single LOB column:

 CREATE TABLE dbo.LargeTableSample
(
      Id   int IDENTITY PRIMARY KEY NONCLUSTERED,
      C1   nvarchar(4000),
      C2   nvarchar(4000),
      C3   nvarchar(4000),
      C4   nvarchar(4000),
      Misc nvarchar(max)
) WITH (MEMORY_OPTIMIZED = ON);
GO

LOB columns and other columns that do not fit in the 8060 byte in-row storage are stored off-row, while the in-row storage has an 8-byte reference to the off-row data. There is an internal table for each individual off-row column.

The logic that decides whether a given column lives on-row or off-row is as follows, and every ALTER TABLE operation ensures that these rules are followed.

If the columns do not fit in the 8060-byte row limit, the biggest columns are stored off-row. For example, adding a varbinary(2000) column to a table with a varbinary(8000) column that is in-row will cause the varbinary(8000) column to be moved off-row.
All index key columns must be stored in-row; if the index key columns in a table do not all fit in row, adding the index will fail. Consider the same table as in the previous example. If an index is created on the varbinary(8000) column, that column will be moved back in-row, and the varchar(2000) column will be moved off-row, as index key columns must live in-row.

The following query shows all columns that are stored off-row, along with their sizes and memory utilization. A size of -1 indicates a LOB column. All LOB columns are stored off-row.

 SELECT object_name(moa.object_id) AS 'table', c.name AS 'column', c.max_length
FROM sys.memory_optimized_tables_internal_attributes moa
     JOIN sys.columns c ON moa.object_id = c.object_id AND moa.minor_id=c.column_id
WHERE moa.type=5

To get more details about the memory consumption of off-row column you can use the following query, which shows the memory consumption of all internal tables and their indexes that are used to store the off-row columns:

 SELECT
  OBJECT_NAME(moa.object_id) AS 'table',
  c.name AS 'column',
  c.max_length,
  mc.memory_consumer_desc,
  mc.index_id,
  mc.allocated_bytes,
  mc.used_bytes
FROM sys.memory_optimized_tables_internal_attributes moa
   JOIN sys.columns c ON moa.object_id = c.object_id AND moa.minor_id=c.column_id
   JOIN sys.dm_db_xtp_memory_consumers mc ON moa.xtp_object_id=mc.xtp_object_id
WHERE moa.type=5

ALTER TABLE Optimizations

ALTER TABLE is used to make schema changes and tune indexes. For details about syntax and examples see the documentation about Altering Memory-Optimizes Tables.

In SQL Server 2016 ALTER TABLE operations on memory-optimized tables are offline, meaning that the table is not available for queries while the operation is ongoing. All operations that make changes to the in-memory data structures, including column and index changes, result in a new copy of the table being created under the hood. An ALTER operation on a 10GB table takes roughly 1 minute when running in parallel on a server with 24 logical processors, and the time taken scales with the size of the table. Good news is that it is possible to combine multiple ADD, DROP, or ALTER operations in a single ALTER TABLE statement. For example, you could add a column, add an index, and add a constraint, all in one ALTER TABLE statement.

Most ALTER TABLE scenarios run in parallel and the operation is log-optimized, meaning that only the metadata changes are written to the transaction log. However, there are some ALTER TABLE operations that run single-threaded and are not log-optimized, meaning that a complete copy of the table is written to the transaction log as part of the ALTER TABLE transaction.

The following ALTER operations run single-threaded and are not log-optimized:

ADD/ALTER a column to use a large object (LOB) type: nvarchar(max), varchar(max), or varbinary(max).
ADD/DROP a COLUMNSTORE index.
ADD/ALTER an off-row column and ADD/ALTER/DROP operations that cause an in-row column to be moved off-row, or an off-row column to be moved in-row.
- note: ALTER operations that increase the length of an off-row column are log-optimized
- Refer to the description in the previous section to determine whether a given column is stored off-row.

Statistics Improvements

Statistics for memory-optimized tables are now updated automatically, and sampling of statistics is supported. With these changes, statistics management for memory-optimized tables works essentially in the same way as disk-based tables, and comes with the same tradeoffs.

The logic to decide whether stats should be updated mirrors that of disk-based table, with one exception: disk-based tables have a mod-counter at the column-level, while memory-optimized tables have a mod-counter at the row level. These mod-counters are used to track how many changes have been made, and if a certain threshold is reached, auto-update of statistics will kick in. TF2453 and OPTION (RECOMPILE) with table variables are supported.
AUTO_UPDATE_STATISTICS_ASYNC is supported.
The sampling rate for statistics mirrors that of disk-based tables, and parallel sampling is supported.
For most statistics improvements to kick in, ensure your database is set to compatibility level = 130.
To enable pre-existing statistics to be updated automatically, a one-time manual update operation is required (see sample script below).
The recompilation of natively compiled modules is still manual. Use sp_recompile to recompile natively compiled modules.

One-time script for statistics: For memory-optimized tables that were created before SQL Server 2016 CTP3.3, you can run the following Transact-SQL script one time to update the statistics of all memory-optimized tables, and enable automatic update of statistics from then onward (assuming AUTO_UPDATE_STATISTICS is enabled for the database).

 -- Assuming AUTO_UPDATE_STATISTICS is already ON for your database:
-- ALTER DATABASE CURRENT SET AUTO_UPDATE_STATISTICS ON;
GO
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 130;
GO
DECLARE @sql NVARCHAR(MAX) = N'';
SELECT
      @sql += N'UPDATE STATISTICS '
         + quotename(schema_name(t.schema_id))
         + N'.'
         + quotename(t.name)
         + ';' + CHAR(13) + CHAR(10)
   FROM sys.tables AS t
   WHERE t.is_memory_optimized = 1
;
EXECUTE sp_executesql @sql;
GO
-- Each row appended to @sql looks roughly like:
-- UPDATE STATISTICS [dbo].[MyMemoryOptimizedTable];

Next steps

For all the features added in SQL Server 2016 up to and including CTP3 see What's new for In-Memory OLTP in SQL Server 2016 CTP3

For general information about In-Memory OLTP, see In-Memory OLTP (In-Memory Optimization).

To get started today, download the latest build of SQL Server 2016 or spin up an Azure VM with SQL Server 2016 pre-installed.