Web Application Memory leakage caused by XML operations – GetElementsByTagName()



 


Symptom


=============


In an ASP.NET web application, if you call GetElementsByTagName() many times on an XML document that is stored in ASP.NET Application state, CLR memory usage increases continuously and eventually leads to an OOM (Out Of Memory) condition.


 


Root Cause


=============


This problem occurs because the GetElementsByTagName method returns an XmlNodeList collection that registers listeners (instances of XmlNodeChangedEventHandler) on the document's NodeInserted and NodeRemoved events. For example, after you call GetElementsByTagName ten times, each of the two events has ten listeners. So when you call the method many times, many XmlNodeChangedEventHandler objects are created, and they are released only when the XmlDocument itself is released.
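The call pattern described above can be sketched as follows. This is only an illustration, not code from the affected application: the class name SharedCatalog, the sample XML, and the static field (standing in for a document kept in Application state) are all assumptions.

```csharp
using System;
using System.Xml;

public static class SharedCatalog
{
    // Hypothetical stand-in for a document kept in ASP.NET Application state:
    // it stays reachable for the lifetime of the process.
    public static readonly XmlDocument Document = Load();

    private static XmlDocument Load()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<catalog><item>a</item><item>b</item></catalog>");
        return doc;
    }

    // Simulates many requests that each query the shared document.
    // Every iteration asks the document for a new XmlNodeList.
    public static int CountItemsRepeatedly(int requests)
    {
        int total = 0;
        for (int i = 0; i < requests; i++)
        {
            XmlNodeList items = Document.GetElementsByTagName("item");
            total += items.Count;
        }
        return total;
    }
}
```

The results are correct on every call; the cost is hidden in what each returned list attaches to the long-lived document.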


 


Analysis


=============


In a user-mode memory dump of the process, we can see that most of the memory is consumed by XmlNodeChangedEventHandler and XmlElementList objects. The XmlElementList objects can be set aside for now, because they are created together with the XmlNodeChangedEventHandler objects. The number of XmlNodeChangedEventHandler objects is almost exactly twice the number of XmlElementList objects, which means two listeners (one on NodeInserted and one on NodeRemoved) serve each XmlElementList.


 


0:000> !DumpHeap -stat


Using our cache to search the heap.


   Address         MT     Size  Gen


0x79bff564          1            12 System.Runtime.Remoting.Activation.ActivationListener


……


……


0x16b111c4     92,970     1,859,400 System.Xml.XmlText


0x0221236c        767     2,005,896 System.Char[]


0x0221209c     56,987     3,424,876 System.Object[]


0x79b94690    163,304    15,341,816 System.String


0x17c4f0c4  4,159,551   183,020,244 System.Xml.XmlElementList


0x16adcc14  8,319,114   232,935,192 System.Xml.XmlNodeChangedEventHandler


Total 13,363,835 objects, Total size: 456,945,968


 


If your code never adds listeners to the XmlDocument object itself, these handlers most likely come from GetElementsByTagName() calls. Comparing dumps taken at different times also shows that the memory keeps increasing as time goes on.


 


However, we cannot say this is a bug in GetElementsByTagName(). The Microsoft implementation of this function conforms to the W3C DOM Level 1 specification: NodeLists and NamedNodeMaps in the DOM are “live”, that is, changes to the underlying document structure are reflected in all relevant NodeLists and NamedNodeMaps. In other words, according to the spec, GetElementsByTagName is supposed to return a ‘live list’ in which changes to the underlying DOM are reflected in the returned NodeList. The event listeners are how the returned list learns about those changes.
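A small sketch of what “live” means in practice, assuming a throwaway document; the class name LiveListDemo and the sample XML are made up for this illustration.

```csharp
using System;
using System.Xml;

public static class LiveListDemo
{
    // Returns the Count of the same XmlNodeList before and after
    // the document is modified.
    public static int[] Run()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root><item/></root>");

        // The list is obtained once and never queried again from the document.
        XmlNodeList live = doc.GetElementsByTagName("item");
        int before = live.Count;

        // Change the underlying DOM after the list was returned.
        doc.DocumentElement.AppendChild(doc.CreateElement("item"));
        int after = live.Count;

        return new int[] { before, after };
    }
}
```

The list is expected to report one element before the insert and two after it, even though it was obtained only once. This is the behavior the spec asks for, and it is what the NodeInserted and NodeRemoved listeners pay for.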


 


For details please refer to http://msdn.microsoft.com/en-us/library/system.xml.xmlelement(VS.80).aspx


 


Solution


=============


To avoid this problem, replace GetElementsByTagName with SelectNodes or SelectSingleNode. Alternatively, do not keep the XmlDocument in memory for a long time.
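A minimal sketch of the replacement, under some assumptions: the helper class ItemQueries is hypothetical, the element name is a simple name without a namespace prefix, and the XPath "//name" is used as the counterpart of a document-wide GetElementsByTagName(name). Documents that use namespaces would need an XmlNamespaceManager and a different expression.

```csharp
using System;
using System.Xml;

public static class ItemQueries
{
    // Original pattern: returns a live list tied to the document.
    public static int CountWithGetElementsByTagName(XmlDocument doc, string name)
    {
        return doc.GetElementsByTagName(name).Count;
    }

    // Suggested replacement: XPath query for all descendant elements
    // with the given (un-prefixed, assumed safe) name.
    public static int CountWithSelectNodes(XmlDocument doc, string name)
    {
        XmlNodeList nodes = doc.SelectNodes("//" + name);
        return nodes.Count;
    }

    // When only the first match is needed, SelectSingleNode avoids
    // building a list at all. Returns null if nothing matches.
    public static string FirstText(XmlDocument doc, string name)
    {
        XmlNode node = doc.SelectSingleNode("//" + name);
        return node == null ? null : node.InnerText;
    }
}
```

Note that the list returned by SelectNodes should not be treated as live: if the document changes, run the query again instead of reusing an old result.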


 


Regards,


 


ZhiXing Lv

Comments (3)

  1. SharePoint Uploading Files to SharePoint Server 2007 from ASP.NET Web Applications by Using the HTTP

  2. Matt Hagopian says:

    Thank you!  We ran into this recently after updating to .NET 4 and had been tracking down the side-effects of this as opposed to the core issue.  This post was extremely helpful in resolving our issue.  Much appreciated!

    Kind regards,

    _Matthew

  3. Gil Matalon says:

    Thanks a lot, this was a very helpful post.