Interesting thing found at OSCON: Taint

I attended a session this morning called "PHP Taint Tool: It Ain't a Parser" by Luke Welling. Luke introduced a tool he's working on at OmniTI that is designed to help sniff out where an application handles untrusted input. From the session description:

... You want to see where untrusted input can propagate taint within the application. In complex logic that might mean chasing many possible execution paths. Using an automatic tool to try to follow these paths without running all possible input variations is called static analysis. ... The Taint tool allows the PHP engine to do as much as possible, then cuts in at the last stage to analyze the compiled opcodes and trace possible flow of execution.

The Taint tool presents opcodes in a readable way, making it clear what lines of source got compiled into specific opcodes. It also performs a static analysis on the code, following the opcodes to attempt to trace all possible code branches and mark lines that tainted data can be passed to.

Essentially, the tool uses parts of the PHP engine to compile PHP code to opcodes, then tracks where data comes from and where it goes, and highlights the code that handles data that *could* be tainted, that is, input from the user by way of POST or GET parameters. This gives a developer a list of lines to review closely to make sure they are not accidentally introducing security holes (like cross-site scripting opportunities).
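To make the idea concrete, here is a toy sketch in Python of this kind of taint tracking. To be clear, this is not Luke's tool and not real PHP opcodes: the `Op` records, the instruction kinds (`assign`, `jmp_if`, `echo`), and the `tainted_lines` function are all made up for illustration. It just walks a tiny instruction list, follows both sides of every branch, and flags each source line that reads data that could have come from `$_GET` or `$_POST`.

```python
from typing import NamedTuple, Optional, Tuple


class Op(NamedTuple):
    """A made-up, simplified instruction (not a real Zend opcode)."""
    line: int                       # source line this instruction came from
    kind: str                       # "assign", "jmp_if", or "echo"
    dest: Optional[str] = None      # variable written by "assign"
    srcs: Tuple[str, ...] = ()      # variables read by this instruction
    target: Optional[int] = None    # instruction index a "jmp_if" may jump to


# Assumption for this sketch: only these two superglobals carry user input.
SOURCES = frozenset({"$_GET", "$_POST"})


def tainted_lines(ops):
    """Return the sorted source lines that may read tainted data on some path."""
    flagged = set()
    seen = set()
    work = [(0, frozenset())]       # (instruction index, tainted variables)
    while work:
        pc, tainted = work.pop()
        if pc >= len(ops) or (pc, tainted) in seen:
            continue
        seen.add((pc, tainted))
        op = ops[pc]
        reads_taint = any(s in SOURCES or s in tainted for s in op.srcs)
        if reads_taint:
            flagged.add(op.line)
        if op.kind == "assign":
            # An assignment taints its destination, or cleans it if every
            # source is clean (for example, a constant).
            tainted = (tainted | {op.dest}) if reads_taint else (tainted - {op.dest})
        if op.kind == "jmp_if":
            # Static analysis: we do not know the condition, so take the jump too.
            work.append((op.target, tainted))
        work.append((pc + 1, tainted))  # fall through to the next instruction
    return sorted(flagged)


# Hypothetical PHP being modeled:
#   1: $name = $_GET['name'];
#   2: if ($admin) {
#   3:     $name = "admin";
#      }
#   4: echo $name;
#   5: echo "bye";
PROGRAM = [
    Op(1, "assign", "$name", ("$_GET",)),
    Op(2, "jmp_if", srcs=("$admin",), target=3),  # may skip line 3
    Op(3, "assign", "$name"),                     # constant, cleans $name
    Op(4, "echo", srcs=("$name",)),
    Op(5, "echo"),
]

if __name__ == "__main__":
    print(tainted_lines(PROGRAM))
```

Line 4 gets flagged even though one path overwrites `$name` with a constant, because on the other path the value from `$_GET` reaches the `echo` untouched. That "could be tainted on some path" result is exactly the kind of line you'd want to go review by hand.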

Now, it's not quite ready for prime time, but it's getting close, and the folks over at OmniTI intend to release it as open source when they are ready. When this gets released, I'll be really excited, as it looks like it could be really good for hunting down security holes.

I also attended a tutorial by Rasmus Lerdorf (the Yahoo PHP guy), "PHP: Architecture, Scalability, and Security", which was really quite good too. He demonstrated a tool they have at Yahoo (the name of which I can't remember now...grrr): he points it at a web page, and it starts throwing a large library of strings at it that may uncover security problems, but it does this from the client side. Unfortunately, he's not releasing it. That's not because he doesn't want to let folks find and fix their bugs, but because the release of such a tool could bring about Internet Armageddon: it would likely find exploitable problems in the vast majority of the Internet.
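I don't know how the Yahoo tool works inside, so take this as a rough illustration of the client-side idea only. The sketch below throws a small list of probe strings at a page and reports which ones come back unescaped. Everything here is hypothetical: `PAYLOADS`, the two stand-in page functions, and `probe`. The "pages" are plain Python functions rather than real HTTP requests, so nothing leaves your machine.

```python
import html

# A tiny, made-up library of probe strings; a real tool would have far more.
PAYLOADS = [
    "<script>alert(1)</script>",
    "\"><img src=x onerror=alert(1)>",
    "'><b>probe</b>",
]


def vulnerable_page(params):
    """Stand-in for a page that echoes user input without escaping it."""
    return "<p>Hello, " + params.get("name", "") + "</p>"


def safe_page(params):
    """Stand-in for a page that escapes user input before echoing it."""
    return "<p>Hello, " + html.escape(params.get("name", "")) + "</p>"


def probe(page, param):
    """Return the payloads that appear verbatim in the page's response."""
    return [p for p in PAYLOADS if p in page({param: p})]


if __name__ == "__main__":
    print("vulnerable:", probe(vulnerable_page, "name"))
    print("safe:", probe(safe_page, "name"))
```

The difference from the Taint approach is that this never looks at the source; it only sees what comes back, so it finds problems that are actually reachable but tells you nothing about which line caused them.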

Both approaches to finding application holes are useful, and it's clear from both talks that this is still a really large problem that developers need to address.

(I've had a problem with spam comments; I'll be addressing that soon, so if you see comments turned off you can drop me an email: garretts ... at ... microsoft ... dot ... com)