Slow performance and out-of-memory issues caused by a large batch

Recently, I worked with a customer who reported out-of-memory errors from his SQL Server 2008 R2 instance with 32 GB of RAM.  One of the errors was error 701 (Error: 701, Severity: 17, State: 123. There is insufficient system memory in resource pool 'default' to run this query.).

After taking a memory dump, we discovered that the query consuming the large amount of memory looked like the one below. I rewrote it so that it doesn't reference the customer's table names.

declare @tempIDs TABLE (id int primary key);
insert into @tempIDs values (0);
insert into @tempIDs values (1);
insert into @tempIDs values (2);
insert into @tempIDs values (3);
insert into @tempIDs values (4);
insert into @tempIDs values (5);
insert into @tempIDs values (6);
-- ... about 1 million inserts like this ...
select * from sys.objects where object_id in (select id from @tempIDs)


On the application side, the following C# code generates the query above:

static void InsertMethod()
{
    StringBuilder builder = new StringBuilder();
    builder.Append("declare @tempIDs TABLE (id int primary key);");

    // Concatenate 1 million individual insert statements into one batch.
    for (int i = 0; i < 1000000; i++)
    {
        builder.Append(string.Format("insert into @tempIDs values ({0}); ", i));
    }

    builder.Append("select * from sys.objects where object_id in (select id from @tempIDs)");

    using (SqlConnection conn = new SqlConnection(ConnectionString))
    {
        conn.Open();
        SqlCommand cmd = conn.CreateCommand();
        cmd.CommandText = builder.ToString();
        cmd.ExecuteReader();
    }
}

This query does 1 million inserts into a table variable and then joins with other tables.  As you can imagine, the query batch is quite large.  A batch like this causes two issues: the performance will be slow, and it will consume a large amount of memory.  This is because SQL Server has to parse each statement, which not only takes time but also consumes memory to store the internal structures.  In addition, using a table variable this way with a large number of rows is inappropriate.  See this post for details.
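To get a sense of just how large the batch really is, here is a quick back-of-the-envelope sketch of my own (following the same skeleton as the snippets above, with ConnectionString and the usual usings assumed). It builds the same text as InsertMethod and prints its size; since .NET strings are UTF-16 and the batch is sent to SQL Server as Unicode, each character costs two bytes on the wire.

static void MeasureBatchSize()
{
    // Build the same batch text as InsertMethod above.
    StringBuilder builder = new StringBuilder();
    builder.Append("declare @tempIDs TABLE (id int primary key);");
    for (int i = 0; i < 1000000; i++)
    {
        builder.Append(string.Format("insert into @tempIDs values ({0}); ", i));
    }
    builder.Append("select * from sys.objects where object_id in (select id from @tempIDs)");

    // .NET strings are UTF-16 (2 bytes per character), so double the
    // character count to estimate the bytes sent to the server.
    Console.WriteLine("{0:N0} characters, roughly {1:N0} MB of SQL text",
        builder.Length, (long)builder.Length * 2 / (1024 * 1024));
}

At roughly 35 characters per insert statement, 1 million of them works out to tens of megabytes of SQL text that the server has to receive and parse before it can run a single statement.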

So what’s the solution?

There are various solutions depending on your situation.  For example, you can use SSIS or BULK INSERT to get the data into a permanent table and then join with the other tables.  But do NOT use an IN or OR clause for this: you would end up with 1 million values in the IN or OR list, and with that many values such an approach causes slow performance and memory errors as well.
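If you do go the BULK INSERT route, a minimal sketch looks like the following. The file name and staging table are my own placeholders; note that BULK INSERT reads the file on the SQL Server machine, so the path must be visible to the server and the login needs bulk-load permission.

static void BulkInsertFromFile()
{
    // Hypothetical file and staging table names, for illustration only.
    File.WriteAllLines(@"C:\temp\ids.txt",
        Enumerable.Range(0, 1000000).Select(i => i.ToString()).ToArray());

    using (SqlConnection conn = new SqlConnection(ConnectionString))
    {
        conn.Open();
        SqlCommand cmd = conn.CreateCommand();
        // The default row terminator (\r\n) matches what WriteAllLines produces.
        cmd.CommandText = "create table dbo.StagedIDs (id int primary key); " +
                          "bulk insert dbo.StagedIDs from 'C:\\temp\\ids.txt';";
        cmd.ExecuteNonQuery();
        // ...then join dbo.StagedIDs with the other tables as usual.
    }
}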

If you have to generate the values dynamically and join them with other tables, use the SqlBulkCopy class to pump the data into a temp table and then do the joins.  The following C# code example achieves that.  It will be fast, and the memory requirement will be much smaller.

static void BulkCopy()
{
    using (SqlConnection conn = new SqlConnection(ConnectionString))
    {
        conn.Open();
        SqlCommand cmd = conn.CreateCommand();
        cmd.CommandText = "create table #tempIDs (id int primary key);";
        cmd.ExecuteNonQuery();

        // Use the empty temp table to give the DataTable the right schema.
        SqlDataAdapter schemaAdapter = new SqlDataAdapter("select id from #tempIDs", conn);
        DataTable dt = new DataTable();
        schemaAdapter.Fill(dt);

        for (int i = 0; i < 1000000; i++)
        {
            DataRow row = dt.NewRow();
            row["id"] = i;
            dt.Rows.Add(row);
        }

        // Stream the rows to the server via the bulk-load path instead of
        // having it parse 1 million individual insert statements.
        SqlBulkCopy bulkcopy = new SqlBulkCopy(conn);
        bulkcopy.DestinationTableName = "#tempIDs";
        bulkcopy.WriteToServer(dt);

        cmd.CommandText = "select * from sys.objects where object_id in (select id from #tempIDs)";
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                Console.WriteLine(reader[0]);
            }
        }
    }
}
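If you are on SQL Server 2008 or later, a table-valued parameter is another way to ship the values as data instead of as statements to parse.  Here is a minimal sketch; the type name dbo.IdList is my own placeholder and has to be created on the server first (create type dbo.IdList as table (id int primary key);).

static void TvpQuery()
{
    DataTable dt = new DataTable();
    dt.Columns.Add("id", typeof(int));
    for (int i = 0; i < 1000000; i++)
    {
        dt.Rows.Add(i);
    }

    using (SqlConnection conn = new SqlConnection(ConnectionString))
    {
        conn.Open();
        SqlCommand cmd = conn.CreateCommand();
        cmd.CommandText = "select o.* from sys.objects o join @ids t on o.object_id = t.id";

        // Bind the DataTable as a table-valued parameter.
        SqlParameter p = cmd.Parameters.Add("@ids", SqlDbType.Structured);
        p.TypeName = "dbo.IdList";   // hypothetical table type, created beforehand
        p.Value = dt;

        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                Console.WriteLine(reader[0]);
            }
        }
    }
}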


Jack Li  | Senior Escalation Engineer | Microsoft SQL Server Support