In an ASP.NET web application, if you call GetElementsByTagName() many times on an XML document that is stored in ASP.NET Application state, CLR memory usage increases continuously and eventually leads to OOM (Out Of Memory).
This problem occurs because the GetElementsByTagName method returns an XmlNodeList collection that registers listeners (instances of XmlNodeChangedEventHandler) on the NodeInserted and NodeRemoved events of the document. For example, after you call the GetElementsByTagName method ten times, the NodeInserted and NodeRemoved events each have ten listeners. Therefore, when you call the GetElementsByTagName method many times, many XmlNodeChangedEventHandler objects are created, and they are released only when the XmlDocument itself is released. A document held in Application state lives as long as the application does, so these objects keep accumulating.
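The pattern described above can be sketched as follows. This is a minimal illustration, not code from the affected application: the SharedXml class, its static Doc field (standing in for a document kept in Application state), and the HandleRequest method are hypothetical names, and the XML content is made up.

```csharp
using System;
using System.Xml;

// Hypothetical stand-in for an XmlDocument kept in ASP.NET Application state:
// a static field lives as long as the application does.
public static class SharedXml
{
    public static readonly XmlDocument Doc = Load();

    private static XmlDocument Load()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<root><item/><item/></root>");
        return doc;
    }

    // Simulates the work done on each request. Every call creates a new
    // XmlNodeList object tied to the long-lived document.
    public static XmlNodeList HandleRequest()
    {
        return Doc.GetElementsByTagName("item");
    }
}
```

Each call to HandleRequest returns a different list object. On the runtime version discussed here, every such list also attaches its handlers to the shared document, so the lists stay reachable for as long as the document does, even after the request that created them has finished.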
In a user-mode memory dump of the process, we can see that most of the memory is consumed by XmlNodeChangedEventHandler and XmlElementList objects. The XmlElementList objects need no separate explanation, because they are created together with the XmlNodeChangedEventHandler objects. The number of XmlNodeChangedEventHandler objects (8,319,114) is almost exactly twice the number of XmlElementList objects (4,159,551), which means two listeners (one on NodeInserted and one on NodeRemoved) serve each XmlElementList.
0:000> !DumpHeap -stat
Using our cache to search the heap.
   Address         MT     Size  Gen
Statistics:
        MT      Count     TotalSize Class Name
0x79bff564          1            12 System.Runtime.Remoting.Activation.ActivationListener
0x16b111c4     92,970     1,859,400 System.Xml.XmlText
0x0221236c        767     2,005,896 System.Char
0x0221209c     56,987     3,424,876 System.Object
0x79b94690    163,304    15,341,816 System.String
0x17c4f0c4  4,159,551   183,020,244 System.Xml.XmlElementList
0x16adcc14  8,319,114   232,935,192 System.Xml.XmlNodeChangedEventHandler
Total 13,363,835 objects, Total size: 456,945,968
If your own code never attaches listeners to the XmlDocument object, these handlers most likely come from GetElementsByTagName() calls. We can also see that memory usage keeps increasing as time goes on.
However, we cannot say this is a bug in GetElementsByTagName(). The Microsoft implementation of this function conforms to the W3C DOM Level 1 specification. NodeLists and NamedNodeMaps in the DOM are "live", that is, changes to the underlying document structure are reflected in all relevant NodeLists and NamedNodeMaps. In other words, according to the spec, GetElementsByTagName is supposed to return a ‘live list’: the returned NodeList reflects later changes to the underlying DOM, and the listeners are how the list learns about those changes.
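The "live" behavior can be observed with a small sketch like the one below. The LiveListDemo class and its method name are hypothetical and exist only for illustration, and the XML content is made up.

```csharp
using System;
using System.Xml;

// Hypothetical demo class showing that the list returned by
// GetElementsByTagName reflects later changes to the document.
public static class LiveListDemo
{
    // Returns the item count seen through the same list object
    // before and after a new <item> element is appended.
    public static int[] CountBeforeAndAfterInsert()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<root><item/><item/></root>");

        // Obtain the list once, while the document has two <item> elements.
        XmlNodeList items = doc.GetElementsByTagName("item");
        int before = items.Count;

        // Change the document after the list was obtained.
        doc.DocumentElement.AppendChild(doc.CreateElement("item"));

        // The same list object is queried again, without a new lookup.
        int after = items.Count;

        return new[] { before, after };
    }
}
```

The second count includes the element that was appended after the list was obtained, which is the behavior the DOM specification asks for.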
For details, please refer to http://msdn.microsoft.com/en-us/library/system.xml.xmlelement(VS.80).aspx
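The documentation for GetElementsByTagName recommends using the XmlNode.SelectNodes or XmlNode.SelectSingleNode method instead. The following is a minimal sketch of that substitution, not a verified fix for the scenario above. It assumes that the list returned by SelectNodes does not subscribe to the document's change events, and the SelectNodesSketch class and CountItems method are hypothetical names.

```csharp
using System;
using System.Xml;

// Hypothetical helper that queries a shared document with XPath
// instead of GetElementsByTagName.
public static class SelectNodesSketch
{
    public static int CountItems(XmlDocument sharedDoc)
    {
        // "//item" selects every element named "item" that is in no
        // namespace. GetElementsByTagName("item") matches on the qualified
        // name instead, so the two are not equivalent for documents that
        // use namespaces.
        XmlNodeList items = sharedDoc.SelectNodes("//item");

        // Defensive check: treat a missing result as an empty one.
        return items == null ? 0 : items.Count;
    }
}
```

A list obtained this way should be treated as the result of a single query: read it right away and run the query again after the document changes, rather than keeping the list around.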