Be careful with using RegEx on x64 bit boxes

I have been working on a case where a customer reported that, once he plugged in his custom assembly which implemented a set of SqlFunctions and he invoked those functions from T-SQL, SQL Server started eagerly eating all memory and performance dropped significantly.

We collected perfmon memory counters and the output from DBCC MEMORYSTATUS which showed these values for the SQLCLR clerk.

MEMORYCLERK_SQLCLR (Total)
VM Reserved = 12534912 KB
VM Committed = 6272956 KB
AWE Allocated = 0 KB
SM Reserved = 0 KB
SM Committed = 0 KB
SinglePage Allocator = 23392 KB
MultiPage Allocator = 15458112 KB

 

Meaning that his clerk, had accounted around 15 GB of memory allocated via the multipage allocator alone, which is a bit too much. :-)

 

What was all that memory being used for? We looked into the managed heaps and noticed they weren't using that much memory.

 

0:000> !eeheap -gc
Number of GC Heaps: 4
------------------------------
Heap 0 (0000000080a64d30)
generation 0 starts at 0x000000048103f018
generation 1 starts at 0x0000000481018f30
generation 2 starts at 0x0000000480ff0068
ephemeral segment allocation context: none
segment begin allocated size
0000000080880af0 0000064274e17588 0000064274e4d910 0x0000000000036388(222088)
00000000807f6220 00000642787c3188 0000064278804c38 0x0000000000041ab0(268976)
0000000480ff0000 0000000480ff0068 00000004812b9340 0x00000000002c92d8(2921176)
Large object heap starts at 0x0000000580ff0068
segment begin allocated size
0000000580ff0000 0000000580ff0068 0000000581147128 0x00000000001570c0(1405120)
Heap Size 0x4981d0(4817360)
------------------------------
Heap 1 (000000008078eea0)
generation 0 starts at 0x00000004c1058f60
generation 1 starts at 0x00000004c1014c80
generation 2 starts at 0x00000004c0ff0068
ephemeral segment allocation context: none
segment begin allocated size
00000004c0ff0000 00000004c0ff0068 00000004c12ca088 0x00000000002da020(2990112)
Large object heap starts at 0x00000005a0ff0068
segment begin allocated size
00000005a0ff0000 00000005a0ff0068 00000005a10900c8 0x00000000000a0060(655456)
Heap Size 0x37a080(3645568)
------------------------------
Heap 2 (00000000807b51d0)
generation 0 starts at 0x0000000501088298
generation 1 starts at 0x0000000501054a80
generation 2 starts at 0x0000000500ff0068
ephemeral segment allocation context: none
segment begin allocated size
0000000500ff0000 0000000500ff0068 000000050134baf0 0x000000000035ba88(3521160)
Large object heap starts at 0x00000005c0ff0068
segment begin allocated size
00000005c0ff0000 00000005c0ff0068 00000005c0ff0080 0x0000000000000018(24)
Heap Size 0x35baa0(3521184)
------------------------------
Heap 3 (00000000804d0080)
generation 0 starts at 0x000000054104d1f8
generation 1 starts at 0x0000000541024368
generation 2 starts at 0x0000000540ff0068
ephemeral segment allocation context: none
segment begin allocated size
0000000540ff0000 0000000540ff0068 00000005412d14e8 0x00000000002e1480(3019904)
Large object heap starts at 0x00000005e0ff0068
segment begin allocated size
00000005e0ff0000 00000005e0ff0068 00000005e1185ba8 0x0000000000195b40(1661760)
Heap Size 0x476fc0(4681664)
------------------------------
GC Heap Size 0xfe4cb0(16665776)

 

So, what was all that memory being used for then?

Well, it happens that our customer was using several instances of RegEx with huge pattern strings, plus setting the Compiled option. And... there's a bug in the current implementation of 64 bit jitter which under those circumstances, the jitter eats too much memory used to compile the regular expressions. That issue will be addressed in a future release of the CLR (scheduled to be part of one of the post Orcas builds). Until then, your workaround consists of not setting RegexOptions.Compiled on your regular expressions.

 

For more information about the mentioned bug, visit http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=96339

It's also worth reminding that not all assemblies which are provided by the .Net framework have been thoroughly tested, and therefore there is a policy which explains the scope of our support regarding this.

Thanks!