Threat Modeling, once again

About 2.5 years ago, I wrote a series of articles about how we threat model at Microsoft, about 18 months ago, I made a couple of updates to it, including a post about why we threat model at Micrososoft, and a review of how the process has changed over the years.

It's threat modeling time again in my group (it seems to happen about once every 18 months or so, as you can see from my post history :)), and as the designated security nutcase in my group, I've been spending a LOT of time thinking about the threat modeling process as we're practicing it nowadays.  It's been interesting looking at my old posts to see how my own opinions on threat modeling have changed, and how Microsoft's processes have changed (we've gotten a LOT better at the process).

One thing that was realized very early on is that our early efforts at threat modeling were quite ad-hoc.  We sat in a room and said "Hmm, what might the bad guys do to attack our product?" It turns out that this isn't actually a BAD way of going about threat modeling, and if that's all you do, you're way better off than you were if you'd done nothing.

Why doesn't it work?  There are a couple of reasons:

  1. It takes a special mindset to think like a bad guy.  Not everyone can switch into that mindset.  For instance, I can't think of the number of times I had to tell developers on my team "It doesn't matter that you've checked the value on the client, you still need to check it on the server because the client that's talking to your server might not be your code.".
  2. Developers tend to think in terms of what a customer needs.  But many times, the things that make things really cool for a customer provide a superhighway for the bad guy to attack your code. 
  3. It's ad-hoc.  Microsoft asks every single developer and program manager to threat model (because they're the ones who know what the code is doing).  Unfortunately that means that they're not experts on threat modeling. Providing structure helps avoid mistakes.

So how do we go about threat modeling?

Well, as the fictional Maria Von Trapp said in her famous introductory lesson to solfege, "Let's start at the very beginning, A very good place to start"...

 

One of the key things we've learned during the process is that having a good diagram is key to a good threat model.  If you don't have a good diagram, you probably don't have a good threat model.

So how do you go about writing a good diagram?

The first step is to draw a whiteboard diagram of the flow of data in your component.  Please note: it's the DATA flow you care about, NOT the code flow.  Your threats come via data, NOT code.  This is the single most common mistake that people make when they start threat modeling (it's not surprising, because as developers, we tend to think about code flow).

When you're drawing the whiteboard diagram, I use the following elements (you can choose different elements, the actual image doesn't matter, what matters is that you define a common set of elements for each type):

Element Image Element Type What the heck is this thing?
image External Interactor An external interactor is an element that is outside your area of control.  It could be a user calling into an API, it could be another component but not one that's being threat modeled.  For example, if you're threat modeling an API, than the application which invoked the API is an external entity.  On the other hand, if you're threat modeling an application that calls into an API, the API is an external entity
image Process A "process" is simply some code.  It does NOT mean that it's a "process" as OS's call processes, instead it's just a collection of code.
image Multiple Process A "multiple process" is used when your threat model is complex enough to require multiple DFDs (this is rarely the case, but does happen).  In that case, the "multiple process" is expanded in the other DFD.  I'm not sure I've ever seen a threat model that used a "multiple process" element - you can usually break out everything that you want to break down, so they're very rarely seen.
image Data Store A datastore is something that holds data.  It could be a file, a registry key, or even a shared memory region.
image Data Flow A dataflow represents the flow of data through the system.  Please note that it does NOT represent the flow of code, but that of data.
image Trust Boundary A trust boundary occurs when one component doesn't trust the component on the other side of the boundary.  There is always a trust boundary between elements running at different privilege levels, but there sometimes are trust boundaries between different components running at the same privilege level.

image

Machine Boundary A machine boundary occurs when data moves from one machine to another.
image Process Boundary A process boundary occurs when data moves from one OS process to another. 

 

You build a data flow diagram by connecting the various elements by data flows, inserting boundaries where it makes sense between the elements.

 

 Now that we have a common language, we can using it to build up a threat model.

Tomorrow: Drawing the DFD.