DOM Traversal


The latest Platform Preview Build includes two interoperable features for working with the DOM: DOM Traversal and Element Traversal. They give web developers simple, flexible, and fast ways of traversing a document. Flat enumeration reduces the DOM tree to an iterative list, and filtering lets you tailor the set of nodes you traverse. These features work with the same markup across browsers, so you can try out any of the code here in the IE9 platform preview and in other browsers.

Without these features, finding an element of interest on a page requires you to do one or more depth-first traversals of the document using firstChild and nextSibling. This is usually accomplished with complex code that runs slowly. With the DOM and Element Traversal features, there are new and better ways of solving the problem. This blog post is a primer and provides a few best practices to get you on your way.
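
For comparison, here is a minimal sketch of the older approach described above. The function name forEachElement is made up for this illustration. It walks firstChild and nextSibling recursively and has to check nodeType itself to skip text and comment nodes.

```javascript
// Hypothetical helper: depth-first walk using only firstChild/nextSibling.
// Calls visit(node) for every element (nodeType 1) at or below `node`.
function forEachElement(node, visit)
{
    if (node.nodeType === 1)
    {
        visit(node);
    }

    // Every child has to be inspected, including text and comment nodes.
    for (var child = node.firstChild; child; child = child.nextSibling)
    {
        forEachElement(child, visit);
    }
}
```

In a page this might be called as forEachElement(document.body, function (el) { /* ... */ }).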

I’ll start with Element Traversal, since it’s the simplest of the interfaces and follows familiar patterns for enumerating elements in the DOM. Element Traversal is essentially a version of DOM Core optimized for elements. Instead of calling firstChild and nextSibling, you call firstElementChild and nextElementSibling. For example:

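// Visit each element child of elm; text and comment nodes are skipped for you
var child = elm.firstElementChild;

while (child)
{
    // Do work with child...

    // Advance to the next element, otherwise the loop never ends
    child = child.nextElementSibling;
}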

This is faster and more convenient, saving you the trouble of having to check for text and comment nodes when you’re really only interested in elements.
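
As a rough sketch, the same pattern can be wrapped in a small reusable function. The name elementChildren is an assumption made for this illustration and is not part of the Element Traversal API. It relies only on firstElementChild and nextElementSibling.

```javascript
// Hypothetical helper: returns the element children of `parent` as an array,
// in document order, using only the Element Traversal properties.
function elementChildren(parent)
{
    var result = [];
    var child = parent.firstElementChild;

    while (child)
    {
        result.push(child);
        child = child.nextElementSibling; // null after the last element child
    }

    return result;
}
```

Calling elementChildren(elm) on an element with no element children returns an empty array.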

DOM Traversal is designed for much broader use cases. First, you create a NodeIterator or a TreeWalker. Then you can use one of the iteration methods to traverse the tree:

var iter = document.createNodeIterator(elm, NodeFilter.SHOW_ELEMENT, null, false); // This would work fine with createTreeWalker, as well

// Advance only in the loop condition; an extra nextNode() call beforehand would skip the first node
var node;
while ((node = iter.nextNode()))
{
    node.style.display = "none";
}

The code above iterates through a flat list of the nodes in the tree rooted at elm; because it passes SHOW_ELEMENT, only element nodes are returned. This can be incredibly useful since in many cases you don’t care whether something is a child or sibling of something else, just whether it occurs before or after your current position in the document.
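
To make the flat list concrete, here is a minimal sketch that drains an iterator into an array. The helper name collectNodes is an assumption for illustration; it works with any object that exposes nextNode(), which covers both NodeIterator and TreeWalker.

```javascript
// Hypothetical helper: collects everything an iterator returns, in document order.
function collectNodes(iter)
{
    var nodes = [];
    var node;

    // nextNode() returns null once the traversal is exhausted.
    while ((node = iter.nextNode()))
    {
        nodes.push(node);
    }

    return nodes;
}
```

For example, collectNodes(document.createNodeIterator(elm, NodeFilter.SHOW_ELEMENT, null, false)) would return elm followed by its descendant elements, since a NodeIterator returns its root first when the root passes the filter.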

A big benefit of DOM Traversal is that it introduces the idea of filtering, so that you only traverse the nodes you care about. While NodeIterator only performs flat iterations, TreeWalker has some additional methods, like firstChild(), that let you see as much or as little of the tree structure as you want.
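
As an illustration of the structural methods, the sketch below uses firstChild(), nextSibling(), and parentNode() on a TreeWalker to produce an indented outline of a subtree. The function name outline and its depth and lines parameters are assumptions made for this example.

```javascript
// Hypothetical helper: returns one line per node under walker.currentNode,
// indented by two spaces per level of depth.
function outline(walker, depth, lines)
{
    depth = depth || 0;
    lines = lines || [];

    lines.push(new Array(depth + 1).join("  ") + walker.currentNode.nodeName);

    // firstChild() moves the walker down and returns the child, or null if none.
    if (walker.firstChild())
    {
        do
        {
            outline(walker, depth + 1, lines);
        }
        while (walker.nextSibling()); // null once the last sibling is reached

        // Step back up so the caller finds the walker where it left it.
        walker.parentNode();
    }

    return lines;
}
```

A call such as outline(document.createTreeWalker(elm, NodeFilter.SHOW_ELEMENT, null, false)).join("\n") would give a text outline of the elements under elm. A NodeIterator cannot do this, because it exposes no parent or child relationships.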

The SHOW_* family of constants provides a way to include broad classes of nodes, such as text or elements (like SHOW_ELEMENT in the earlier example). In many cases, this will be enough. But when you need the most precise control, you can write your own filter via the NodeFilter interface. The NodeFilter interface uses a callback function to filter each node, as in the following example:

 

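// SHOW_ELEMENT keeps text and comment nodes, which have no getAttribute, away from the filter
var iter = document.createNodeIterator(elm, NodeFilter.SHOW_ELEMENT, keywordFilter, false);

function keywordFilter(node)
{
    // getAttribute returns null when the element has no alt attribute
    var altStr = (node.getAttribute('alt') || "").toLowerCase();

    if (altStr.indexOf("flight") != -1 || altStr.indexOf("space") != -1)
        return NodeFilter.FILTER_ACCEPT;
    else
        return NodeFilter.FILTER_REJECT;
}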
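
The same idea can be generalized. The sketch below is one possible way to build such callbacks from a list of keywords; the factory name makeAltKeywordFilter is an assumption for illustration. It returns FILTER_SKIP rather than FILTER_REJECT: a NodeIterator treats the two the same way, but a TreeWalker given FILTER_REJECT also skips the rejected node's descendants, which is rarely what a keyword search wants.

```javascript
// Numeric values are the ones defined by the DOM specification; they are used
// as a fallback where the NodeFilter global is not available.
var ACCEPT = (typeof NodeFilter !== "undefined") ? NodeFilter.FILTER_ACCEPT : 1;
var SKIP = (typeof NodeFilter !== "undefined") ? NodeFilter.FILTER_SKIP : 3;

// Hypothetical factory: builds a filter callback that accepts nodes whose
// alt text contains any of the given keywords, ignoring case.
function makeAltKeywordFilter(keywords)
{
    return function (node)
    {
        // Non-element nodes have no getAttribute; a missing alt yields null.
        var alt = node.getAttribute ? node.getAttribute("alt") : null;

        if (!alt)
        {
            return SKIP;
        }

        alt = alt.toLowerCase();

        for (var i = 0; i < keywords.length; i++)
        {
            if (alt.indexOf(keywords[i].toLowerCase()) !== -1)
            {
                return ACCEPT;
            }
        }

        return SKIP;
    };
}
```

It could then be passed as the third argument, for example document.createNodeIterator(elm, NodeFilter.SHOW_ELEMENT, makeAltKeywordFilter(["flight", "space"]), false).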

For a live example, check out my demo for DOM Traversal — I used NodeFilter extensively there. Complex filtering operations on the list of media elements were as simple as using a NodeFilter callback like the one above.

In this post, I showed that you have options in how to traverse a document. Here are suggested best practices for when you should use the various interfaces:

  • If the structure of the document is important – and you’re only interested in elements – consider Element Traversal. It’s fast and won’t leave a big footprint in your code.
  • If you don’t care about document structure, use NodeIterator instead of TreeWalker. That way, it’s obvious in your code that you’re only going to be using a flat list. NodeIterator also tends to be faster, which becomes important when traversing large sets of nodes.
  • If the SHOW_* constants do what you need for filtering, use them. Using constants makes your code simpler and performs slightly better. However, if you need fine-grained filtering, NodeFilter callbacks become indispensable.

I’ve already found these features to be a great help in my own coding, so I’m really excited to see what you do with them. Download the latest Platform Preview, try out the APIs, and let us know what you think.

Thanks!
Jonathan Seitel
Program Manager

Comments (24)

  1. Anonymous says:

    Will IE9 support Array.valueOf()?

  2. Anonymous says:

    Awesome! Can't wait for IE9 to come out.

    Also, why is box-shadow implemented differently from other browsers? It 'fades out' too quickly. And please implement text-shadow too, that would be the only thing stopping me from writing "Fully Supported" in my browser support list for IE9 🙂

    Thanks

  3. Infinte says:

    @jimmie lin: The box-shadow feature is currently incomplete and still under test (it is not even included in the release notes).

  4. Anonymous says:

    @Infinte: the box-shadow feature is actually complete, but its blur radius property is extremely buggy (I submitted a bug on it, still waiting for an answer): if you specify '0' for blur radius, the shadow works properly. If you specify anything other than '0', you get the following cases:

    – blur is non-linear

    – shadow's opacity is set around 0.5 and impossible to override

    – shadow cuts off before the page's bottom

    Back to this post.

    Here again, we see several improper Jscript habits that MUST NOT be followed:

    – do not use loose comparison (!=), use strict ones ( !== ).

    – one VAR statement is allowed by function or program body: don't do 'var i=something;var j=else;' but do 'var i=something,j=else;'. Note that you can declare a variable without setting it, like this: 'var i=something,j=else,x;'

    – on an IF…ELSE statement, individual statements must be put between accolades '{}' otherwise the first Jscript compressor around will screw it up.

    – in the condition set for the IF, individual booleans should be put between parentheses, otherwise you risk a bad evaluation.

    Please IE team, before you put example on your pages, put them for a spin in JSlint.

    Otherwise, good job on finally implementing traversal methods in the DOM. Let's just hope IE 9's DOM construction won't be as wonky as in previous versions, otherwise we'll still need IE-specific code (what is a node and what isn't still differs between IE9pre3 and other browsers).

  5. Anonymous says:

    Please stop with the "same markup" slogan.  JavaScript is not markup and it makes you look utterly clueless when you refer to it as such.

  6. Anonymous says:

    >the box-shadow feature is actually complete

    Based on what do you make this claim? You don't work on IE, do you?

    > – do not use loose comparison (!=), use strict ones ( !== ).

    While generally a best practice, it makes absolutely no functional difference in the example provided.

    >one VAR statement is allowed by function or program body

    If you're going to use tools like JSLint, you should probably understand the difference between Doug Crockford's personal style and what's actually *required* by the language.

    The language has no such requirement, which is good, since following a style like that leads to very confusing code.

    >statements must be put between accolades

    Those are called "curly braces" both by developers and the JavaScript standard. And, if your JavaScript compressor doesn't know how to parse JavaScript, you should probably find a better one.

    >otherwise you risk a bad evaluation

    No. Otherwise you risk misunderstanding your own code, but no JavaScript engine would get it wrong.

  7. Anonymous says:

    NodeFilter is still undefined in IE9PP3. Is that sample a sign that next PP/Beta will allow us to access Node constants ?

  8. Anonymous says:

    Just noticed that code in the demo 🙂

    <<

    if (!NodeFilter)

    {

    var NodeFilter = new Object();

    NodeFilter.SHOW_ELEMENT = 1;

    NodeFilter.FILTER_ACCEPT = 1;

    NodeFilter.FILTER_REJECT = 2;

    NodeFilter.FILTER_SKIP = 3;

    }

    >>

    This is why IE's demos are working while samples hosted elsewhere were failing. You should have said it in the release notes, I think.

  9. Anonymous says:

    The CSS3 property 'text-shadow: 2px 2px 2px 2px;' is still not working in IE9 Platform Preview 3. So please fix this one also for IE9.

  10. Anonymous says:

    I really hope IE9 supports bigger larger buttons.

  11. Anonymous says:

    @No…:

    The box-shadow implementation in IE 9 is feature-complete but the blur radius property is buggy: well, it IS complete (it parses and acts on all parameter changes submitted to it, I tried) and one (significant) bug is on blur radius. So there.

    Using loose comparison: in THAT case, it's not a big risk; it's still a bad habit, and one should use loose comparison only when the process is well understood – the best way to remove bugs is to not allow them in first.

    If declaring variables at the start of a function is so difficult, why do they do it in Delphi and in C (and all its descendants), I wonder. And while ECMAscript indeed allows multiple VAR statements, it's good practice (and a performance saver) to allocate all your variables at the start: the parser and evaluator will not need to increase the "stack's" size (or more probably, have to re-work the index) more than necessary.

    About 'accolades' VS. 'curly braces': well, I'm not English; sometimes I mix up some terms. I'd like to see your French. About their use: here, the reason the parser doesn't go wonky is that it's a 'simple' statement (only one instruction per code block); add one (ONE) line in either of these blocks = break! Bug! Time spent debugging due to the lack of curly braces (or accolades). Not using curly braces is Not A Good Idea ™. If you want a braces-less statement, you could simply use:

    return ((altStr.indexOf("flight") !== -1 || altStr.indexOf("space") !== -1)?NodeFilter.FILTER_ACCEPT:NodeFilter.FILTER_REJECT)

    Which is the syntax intended to replace very simple IF…ELSE statements.

    About parentheses: too many of them won't hurt, while not enough may cause errors in your code to be happily parsed and give unwanted results – which means longer testing, and more difficult bug hunting.

  12. Anonymous says:

    @Mitch74: re "If declaring variables at the start of a function is so difficult, why do they do it in Delphi and in C (and all its descendants), I wonder."

    I cannot understand where you took that notion from. C does not force local variables to be declared in a single place (since the C99 standard), and the same is true for all its well known descendants C++, Objective C, C# and Java. As for Delphi, Oxygene (aka Chrome) also allows local variable declarations within any code block.

    The trend looks pretty clear to me… maybe it wasn't such a hot idea after all.

  13. Anonymous says:

    "Mitch"– It's not your place to say a feature is complete, particularly if you're trying to suggest that it's buggy. If it's buggy (and a bug filed on the issue hasn't been punted), then, by definition, it's not complete.

    Maybe things are different in France?

    There's nothing inherently wrong with declaring your variables at the top– the point is that collapsing them into one statement is both unnecessary and likely to lead to confusion. When you're complaining about code samples, you should demonstrate clearly that you understand the difference between the requirements of the language and what are common coding style preferences.

  14. Anonymous says:

    @mitch74

    > one VAR statement is allowed by function or program body

    I heard that in a webcast that gave a lot of other interesting generalizing advice like

    'dont use bitwise operators because you dont need them'

    and

    'multiplicative integer operations using bitshifting is not faster but probably slower than using multiplicative operators'.

    But i found out that the first is only true if you dont need them, and that the second is wrong.

    As for the variable declarations, i believe there are situations where the code 'is better' when the declarations are not at the beginning of the function or program body, e.g. the initializer expression of a three-expression 'for' loop. Or loop bodies and intermediate results.

    The speaker in that webcast also made some JSON – XML comparisons that revealed a fatal lack of  XML knowledge. He really should have been prepared better, especially for a beginners class webcast.

    I hope you didnt see the same webcast and follow all the speakers opinions without double checking and thinking for yourself.

    g

  15. Anonymous says:

    @Jim

    "Please stop with the "same markup" slogan.  JavaScript is not markup and it makes you look utterly clueless when you refer to it as such."

    YES. Not to mention the fact that you can't possibly engineer a single browser in a way that ensures that developers can write markup that will be rendered the same by the browsers you DON'T control.

    This was obviously the idea of some less-than-technical manager, and I'm sure the hard-working IE9 team is silently bearing the painful stupidity of this slogan. It makes about as much sense as Chewbacca living on Endor.

  16. Anonymous says:

    @unicorn: They don't need to control what the other browsers do– they can simply test them. If there is consensus (spec or no) then they can match their behavior with IE9.

    Your choice of a Star Wars metaphor clearly flags you as NOT the sort of reader they're optimizing for with the "same markup" slogan– they're trying to reach the hundreds of millions of more normal people who understand "same markup" == "good"… good for business, good for users, good for the web.

  17. Anonymous says:

    @NotLivingInMomsBasement

    Ah HA! It was a South Park reference to a Star Wars metaphor, which flags me as EXACTLY one of the hundreds of millions of people they're trying to reach, and your failure to publicly recognize that flags you as somebody who has no right to be flagging me as a… em… I don't know.

    Anyway, yes, they can attempt to adhere to other browsers' implementations IF there is a consensus, but they can do nothing for the cause of "same markup" if Opera, Firefox, and Safari are all handling something in different ways. The best they can manage in such a case is "Slightly Less Different Markup". Don't get me wrong, I applaud the efforts in the direction of coming back from Obscure Microsoft Island and joining the mainland where everybody else is, but it just sounds dumb for a browser team to be boasting that they are making their browser so that your HTML will work the same way in all browsers. It just doesn't make sense. Did I mention Chewbacca?

  18. Anonymous says:

    Kiddo, defending yourself as a "SouthPark-watcher" isn't going to win you a lot of credibility when someone accuses you of living in your parents' basement.

  19. Anonymous says:

    Huh? Oh, was he accusing me of that? I assumed that he was just adamantly declaring that HE did not live in his mother's basement. I completely missed that. Either way, where I live is irrelevant, and my credibility is not the issue. For the sake of argument, let's go ahead and say I live in TEN mother's basements… regardless of this, you either agree that "same markup" is a silly promise that a single browser maker can't deliver on, or you ascribe to some other belief. I, anonymous Internet commenter, and my illustrious reputation, neither add weight to nor subtract weight from my simple argument.

  20. Anonymous says:

    – CSS Transitions

    – CSS 2D Transforms

    – CSS Border Image

    Begging on both knees, IE Team.  Please make it happen!

  21. Anonymous says:

    > I’ve already found these features to be a great help in my own coding …

    How so? Since a web site needs to deal with downlevel browsers, there will need to be a completely different code path for the two cases. That's why we have frameworks like jQuery and Prototype trying to hide those differences. So I can see how frameworks might find these DOM traversal features useful, but if you're using them bareback it seems like you'll end up writing a lot of framework-like wrappers around them.

    Given that querySelectorAll exists, a NodeIterator or TreeWalker don't seem that handy. I wouldn't be surprised to find, for example, that document.querySelectorAll("[alt*=space], [alt*=flight]") is much faster than your example above, since there's no Javascript involved in the node inspection or tree walk.

  22. Anonymous says:

    Why does MSFT always find odd examples to use?

    If you are testing for the alt attribute, then you are looking for images… those images are already available in a DOM-L0 collection;

    document.images

    thus you can save piles on the traversal with:

    for(var i=0,iL=document.images.length;i<iL;i++){

     //do tests…

    }

    or even document.getElementsByTagName('img');

    You are surely not suggesting that developers should ignore the specs and add alt attributes to all kinds of other elements are you?

  23. Anonymous says:

    Writing examples that are understandable is to be commended. And ALT is a supported attribute on OBJECT, AREA, APPLET and INPUT in addition to IMG.

  24. Anonymous says:

    @Mitch 74,

    Your personal programming habits do NOT mean they should be obeyed by everyone.

    "one VAR statement is allowed by function or program body: don't do 'var i=something;var j=else;' but do 'var i=something,j=else;'. Note that you can declare a variable without setting it, like this: 'var i=something,j=else,x;'

    If declaring variables at the start of a function is so difficult, why do they do it in Delphi and in C (and all its descendants), I wonder. And while ECMAscript indeed allows multiple VAR statements, it's good practice (and a performance saver) to allocate all your variables at the start: the parser and evaluator will not need to increase the "stack's" size (or more probably, have to re-work the index) more than necessary."

    Please, we are not in the 1980s anymore. I've been programming since the early 1980s, and yes, in the early days performance and resource saving WERE critical, so people usually recommended those kinds of "tricks". But now we are in the 21st century, and we teach people that program readability > performance and resource saving, unless it's some mission-critical low-level programming for things like SoCs with extremely limited resources. And one variable declaration for a whole program body is a big NO in this case. People much prefer clear, readable code to putting all kinds of important and temporary variables in one line at the start of a program. The performance gain from putting all variable declarations at the start in one line is negligible given today's hardware resources and interpreter/compiler/JIT compiler optimization technologies.

    You can go look at any of those large websites and javascript frameworks out there, like Google's web apps, Apple's sites, Prototype framework, etc. etc. and you'll find them using one VAR for one important variable declaration, for better code readability over negligible performance gains. For these couple decades, stuffs like "var i=something,j=else,x;" is the bad practice. Every program guide tells you to do "for (int i=0;i<something;i++) for (int j=0;j<else;j++) { … }" nowadays, NOT "int a,b,c,i,j,k,…;  … after 100 lines of codes … for (i=0;i<something;i++) for (j=0;j<else;j++) { … }", although the latter theoretically helps performance, it's a big NO-NO in this day and age.

    The same goes for "!=" vs. "!==": it's like saying people should write "if (0==i)" instead of "if (i==0)". These are just recommendations that theoretically improve productivity, but in reality it all comes down to personal preference; the extra time spent consciously typing the extra "=" everywhere when it's not really needed may far surpass the potential, theoretical productivity improvement.