MSDN has had an article entitled Security Considerations: Dynamic HTML for a while. It is a good article, but it simply says what not to do. Every time I run across it I promise myself I am going to write something more useful someday, something that says “Don't do this; do this instead.” Today is that day.
The following is a list of things you should be wary of:
These things are safe to use if and only if you do not allow untrusted input to find its way to them. For example, the following is perfectly safe:
document.writeln('Mysterious hardcoded string #5');
But I doubt that most uses of these functions are to insert static strings. Frequently they are used to do more useful things. Let us assume you want to display a URL as a link on your page, but that URL can change and it is untrusted input. The simple and insecure way to do this is:
document.write('<a href="' + szUrl + '">' + szUrl + '</a>');
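To make the danger concrete, here is a small sketch that builds the same string without writing it to the page. The function name buildLinkMarkup and the hostile value are made up for illustration; any input that closes the attribute and the tag would behave the same way.

```javascript
// Same concatenation as the insecure document.write() call, but returned as a
// string so the resulting markup can be inspected instead of rendered.
function buildLinkMarkup(szUrl) {
  return '<a href="' + szUrl + '">' + szUrl + '</a>';
}

// Hypothetical hostile input: the leading quote and bracket close the href
// attribute and the <a> tag, so whatever follows is parsed as fresh markup.
var hostileUrl = '"><img src=x onerror=alert(1)>';

var markup = buildLinkMarkup(hostileUrl);
console.log(markup);
```

The printed markup contains an img element with an onerror handler that the page author never wrote, and the browser would run it as part of the page.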
Now you have opened yourself up to all the nastiness of script injection. You must, at this point, resist the temptation to write code to 'sanitize' the string before the document.write(). This is very hard to do correctly. You can remove 'invalid' characters and you can subject it to regular expressions and you can do all sorts of clever things. What you must remember is that the attacker can view the source code and spend an unbounded amount of time crafting a URL that will get past your protection. Things like URL escaping and the large variety of valid URL syntax will give you quite a headache. If you must insert untrusted input, the following is the safe way to do it:
var aElement = document.createElement("A");
aElement.innerText = szUrl;
aElement.href = szUrl;
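The three lines above can be wrapped in a reusable helper. What follows is only a sketch: createUntrustedLink and its doc parameter are hypothetical names, and the scheme check is my own assumption rather than part of the original example. I added it because assigning to href avoids HTML parsing but does not by itself stop a javascript: URL from running when the link is clicked.

```javascript
// Hypothetical helper: builds a link from untrusted input using DOM properties
// only. `doc` is any object with createElement(), normally window.document.
function createUntrustedLink(doc, szUrl) {
  var aElement = doc.createElement('A');

  // Assigned as plain text; markup characters in szUrl are never parsed.
  aElement.innerText = szUrl;

  // Assumption, not from the article: only set href for http(s) URLs, using
  // the standard URL parser to read the scheme instead of string filtering.
  var scheme = null;
  try {
    scheme = new URL(szUrl).protocol;
  } catch (e) {
    // Not an absolute URL; leave href unset so the text is shown unlinked.
  }
  if (scheme === 'http:' || scheme === 'https:') {
    aElement.href = szUrl;
  }
  return aElement;
}
```

In a page you would attach the result yourself, for example with document.body.appendChild(createUntrustedLink(document, szUrl)). Passing the document in as a parameter is just a convenience so the helper can be exercised outside a browser.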
All of the bulleted items above can be converted in similar ways. When reviewing your code, ask if you really meant innerText where you used innerHTML. Grep your codebase for “document.write” and justify the existence of each one. Any that you cannot get rid of should be converted along the lines of the example above.
This particular flavor of script injection often manifests itself when web page authors use HTML dialogs. If you see the word dialogArguments anywhere in your code, you need to be extra careful to avoid this kind of script injection. You should grep your code for “dialogArguments” and closely examine your usage of it.
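If plain grep is not handy, the same audit can be roughed out in script. This is a minimal sketch, not a real static analyzer: findRiskyLines and RISKY_PATTERNS are hypothetical names, the pattern list covers only the identifiers mentioned in this post, and a match means "go look at this line", not "this line is vulnerable".

```javascript
// Patterns for the names discussed in this post; extend for your own codebase.
var RISKY_PATTERNS = [
  /document\.write(ln)?\s*\(/,  // document.write( and document.writeln(
  /\.innerHTML\b/,              // property access such as el.innerHTML
  /\bdialogArguments\b/         // values handed to an HTML dialog
];

// Returns one entry per source line that matches any pattern above.
function findRiskyLines(sourceText) {
  var hits = [];
  var lines = sourceText.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    for (var j = 0; j < RISKY_PATTERNS.length; j++) {
      if (RISKY_PATTERNS[j].test(lines[i])) {
        hits.push({ line: i + 1, text: lines[i].trim() });
        break; // report each line once, even if several patterns match
      }
    }
  }
  return hits;
}

// Made-up sample source to show the shape of the result.
var sample = [
  "var el = document.getElementById('msg');",
  "el.innerHTML = window.dialogArguments;",
  "el.innerText = 'static';",
  "document.writeln(userName);"
].join('\n');

console.log(findRiskyLines(sample));
```

Each hit still needs a human to decide whether untrusted input can reach it, which is the justification step described above.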
Be careful about displaying untrusted input. You may be able to do it safely and avoid script injection, but you might still allow clever people to format things such that they can fool casual surfers into doing something bad. I feel bad for not having a concrete example, but the mischievous readers out there will know what I mean. Keep this in mind when designing your web site.