Optimizing picture password security

We wanted to talk a bit more about the security of picture passwords in a follow-up post based on some of your comments. Jeff Johnson, the Director of Development for the User Experience team, is particularly interested in the math and security of this feature and authored this post on how to optimize the security of the picture password. Since this is a new form of logging on, and given the concerns over security (especially with mobile devices) and over new authentication techniques (the fragility of facial recognition, for example, or the challenges we've seen with biometrics), it is no surprise folks took to thinking about potential pitfalls in the approach. Our goal was to provide a convenient mechanism that was clearly no less secure than text passwords (all that math Jeff provided). Below, Jeff talks about why this is a robust solution in general. Keep in mind in reading this that over the years many "best practices" have been established for typed passwords (policies such as numbers+letters+mixed case, length, inability to recycle passwords, no dictionary words, etc.) as well as important cautions (such as avoiding public internet terminals with potential for overhead cameras or keystroke loggers) -- these types of practices all have analogs in the use of picture password, as you can imagine. Jeff outlines some of these and the logic behind the security of the model. --Steven


A question we’ve been asked several times in one way or another is “I care about keeping my machine secure; what are the best practices for creating the most secure sequence of login gestures?” This leads to an interesting (at least to me, as a math guy) analysis. It involves game theory, but first I’ll distill it down to the following best practices.

  • Pick a photo that has at least 10 points of interest. A point of interest is an area that can serve as a landmark for a gesture – a point that you would touch, places you would connect with a line, an area you would circle.
  • Use a random mixture of gesture types and sequence. While a line is the gesture that has the most permutations, if you always use 3 lines, that actually makes it easier for an attacker, as they can rule out trying sequences with the other gesture types.
  • If you choose to use a tap, a line, and a circle, randomly choose the order of those gestures; this creates 6 times the number of combinations as a predictable order.
  • For circle gestures, randomly choose whether you draw it clockwise or counterclockwise. Also consider making the size of the circle bigger or smaller than the “expected” size.
  • For line gestures, your instinct may be to always draw from left to right, but it is more secure if you randomly choose the direction with which you connect the two points.
  • As with all forms of authentication, when entering your picture password, avoid allowing other people to watch you as you sign in.
  • Keep your computer in a secure location where unauthorized people do not have physical access to it.  As with any password entry, be aware of line of sight and potential recording devices that intrude on your screen.
  • Be aware that smudges on the screen could potentially identify your gestures. Clean your screen thoroughly on a regular basis. Although this increases the risk if you clean, sign in, and then do nothing, the buildup of oils from repeated use is generally easier for an attacker to see (plus, who likes using an oily device?). Note that buildup is more of an issue for entering numeric PINs, when the device is frequently turned on and off and you enter the sequence dozens of times a day (oils can build up in those locations). Periodically look at your screen at an oblique angle while on the picture password login screen and see if there appears to be a pattern pointing to your gesture sequence. If so, either clean your screen or add a handful of additional smudges in the picture password area (which effectively increases the POIs discussed below).

If you follow these tips, you will substantially increase the security of your computer.

As several comments suggested, we also considered shrinking the image and displaying it at random positions and slight rotations on the screen to minimize any risk from smudges. We knew from usability feedback that decreasing the size of the image both made the gestures harder to enter properly and made the login experience feel less immersive; however, if there were a significant improvement to security, we wanted to weigh the costs and benefits. What we discovered was that while shifting the image could reduce the buildup of smudges in specific spots, it left even more prominent “clouds” of taps, lines, and circles that were identical relative to each other. From these, an attacker could easily work out the gestures relative to each other, and it was then a simple exercise to move them around the picture until they appeared to coincide with significant elements of it. There wasn’t a noticeable improvement in security, and we were able to measure significant degradations to the fast and fluid user experience. In reality, exploiting smudges is very difficult. When we took tablets that had been used for a number of days, there were typically too many smudges to even begin to deduce the gesture set. Even when we were given the login sequence and knew what to look for, we had limited success. We included this analysis because we feel it is important, whenever an innovative new technology is introduced, that potential attack vectors are disclosed so the technical community can reach a general consensus on the degree of the threat and its potential mitigations. Of course, we also have confidence that screen technologies will continue to improve and smudges will someday seem quaint.

The analysis

It is also interesting to compute the odds of an attack succeeding in various scenarios. As discussed in the previous blog post, gestures are based on a 100 x 100 grid, giving even the simplest gesture (the tap) a potential of 10,000 values (given proximity matching, this number is effectively reduced to 270). In reality, the number of points of interest (POI) is much lower than that – there are only so many memorable locations in a given photograph.

Although there are other ways to structure an analysis, for the purposes of this discussion we will assume that there are a small number of POIs, and all gestures involve only those points. We assume that taps are directly on a POI, circles only come in two sizes (say, small around the point, and larger around the point) and two directions (clockwise and counterclockwise), and lines always connect two POIs. Because this isn’t strictly true, the number of permutations is actually even greater.
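
The counting rules of this simplified model can be written down directly. The following is only an illustrative sketch of those assumptions in Python; the function name gesture_counts is made up for this example and has nothing to do with how Windows itself stores or matches gestures.

```python
def gesture_counts(pois):
    """Count distinct gestures under the simplified model described above.

    Assumptions (from the model, not from the real matcher):
      - a tap lands exactly on one POI
      - a circle is centered on one POI, with 2 sizes and 2 directions
      - a line connects two different POIs, and direction matters
    """
    taps = pois
    circles = pois * 2 * 2          # 2 sizes x 2 directions per POI
    lines = pois * (pois - 1)       # ordered pairs of distinct POIs
    return taps, circles, lines


taps, circles, lines = gesture_counts(10)
print(taps, circles, lines, "total:", taps + circles + lines)
```

With a single POI the line count is zero, which is why the first scenario below can only use taps and circles.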

Windows provides additional protection for picture passwords (and PINs) by disabling the login mechanism after 5 incorrect tries (you then have to use your conventional password). With this in mind, it is interesting for a given scenario to frame the relative security in two ways.

First, what are the odds that an attacker with full knowledge of your gesture selection methodology would be able to sign in to your machine before the lockout is triggered? We will refer to this as Odds1. If there are x equally likely gesture sequences, then the odds of guessing the right one in five tries before lockout are 5 / x.

The second interesting view is to assume you were given 100 machines, each with a password picked randomly according to the rules of the scenario, and ask: what are the odds that an attacker could log in to at least one of those machines? We will refer to this as Odds100. Since these are independent events, the odds of this are:
  1 - ((x - 5)/x)^100.
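
For readers who want to plug in their own numbers, here is a minimal Python sketch of the two measures. The function names odds1 and odds_n are invented for this illustration, and the value 64 in the example is just a sample count of sequences.

```python
def odds1(x, tries=5):
    """Chance of hitting the right one of x equally likely sequences
    within `tries` guesses (the lockout limit)."""
    return min(tries, x) / x


def odds_n(x, machines=100, tries=5):
    """Chance of getting into at least one of `machines` machines,
    treating each machine as an independent attempt."""
    return 1 - (1 - odds1(x, tries)) ** machines


# Sample value only: 64 equally likely gesture sequences.
print(f"Odds1   = {odds1(64):.2%}")
print(f"Odds100 = {odds_n(64):.2%}")
```

The second function is simply one minus the chance that every single machine survives its five guesses.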

Base scenario

Let’s assume a horribly insecure scenario: Your “picture” is entirely black with a single white dot in the middle of it. Because there is only one POI, only the tap and circle gestures can be used (there is nowhere to connect a line to). Obviously, if I used only the tap gesture, an attacker would have 100% success, as the only valid sequence would be three taps on the white dot. Let’s assume we only use circles and no taps. There are 4 possible circles we can randomly choose for each gesture. This gives us a total of 4^3 = 64 possible gesture sequences. For this scenario, Odds1 is 7.81% and Odds100 is 99.97%. It’s surprising that for a single machine the odds of a successful sign-in with my picture password are less than 8% (my intuition would have guessed a higher number), though you can see it is a virtual certainty that with 100 machines, at least one of them would be compromised. While some users might be comfortable with these odds, most security-conscious folks and IT admins who manage a population of machines would find this unacceptable.

Let’s now augment the scenario by saying we will randomly choose for each gesture whether it is a tap or a circle. It is tempting to say that this doubles the complexity of each gesture, but it does not. There are 4 possible circles and 1 possible tap, so there are 5 unique gestures giving a total of 125 sequences.

Let’s say that we choose to implement our new “random” methodology as follows: flip a coin to determine if it’s a tap or a circle. If it’s a circle, we’ll randomly decide which of the four possibilities it will be. While this seems nice and random, it is actually less secure than just using only circles. This is because half the time we will pick a gesture for which there is only one possibility (the tap). An attacker would focus their attack on gestures that featured two or three taps and achieve higher success. An ideal attack strategy (there are others with identical odds) would be to test for 3 taps, and then test for two taps followed by each of the four circle types for the 5 attempts before lockout. Instead of the apparent Odds1 of 4% (an improvement over the previous 7.81%), an attacker would actually achieve Odds1 of 25%, more than three times worse than just using circles. Statistics can be tricky!
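
The 25% figure can be checked by brute force. The sketch below is an illustration only (the names best_attack_odds, coin_flip, and uniform are made up here): it enumerates every three-gesture sequence for the single-POI picture, weights each by how likely the selection method is to produce it, and lets the attacker spend the five guesses on the most likely sequences.

```python
from fractions import Fraction
from itertools import product

CIRCLES = ["small-cw", "small-ccw", "large-cw", "large-ccw"]


def best_attack_odds(gesture_probs, length=3, tries=5):
    """Success chance of an attacker who knows the selection method and
    guesses the `tries` most probable sequences before lockout."""
    seq_probs = []
    for seq in product(gesture_probs.values(), repeat=length):
        p = Fraction(1)
        for gesture_p in seq:
            p *= gesture_p
        seq_probs.append(p)
    seq_probs.sort(reverse=True)
    return sum(seq_probs[:tries])


# Coin flip first: tap half the time, otherwise one of the 4 circles.
coin_flip = {"tap": Fraction(1, 2), **{c: Fraction(1, 8) for c in CIRCLES}}
# All 5 gestures equally likely.
uniform = {"tap": Fraction(1, 5), **{c: Fraction(1, 5) for c in CIRCLES}}

print("coin flip:", best_attack_odds(coin_flip))
print("uniform:  ", best_attack_odds(uniform))
```

Using exact fractions rather than floats keeps the comparison free of rounding noise.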

Fortunately, there is an easy fix to this scenario. For each gesture, we pick a random number between 1 and 5. If it is a 1, we use a tap. Otherwise we use the value to pick one of the 4 circle possibilities. This does yield an Odds1 of 4% (almost twice as good as the first scenario), but the Odds100 is still an abysmal 98.31%.

A slight improvement

Let’s make just a small improvement to our methodology. This scenario involves a picture with only two POIs (it’s really hard to imagine a real photo this simple, so we can pretend it’s a black canvas with two white dots). This allows us to add the line gesture, but there are only two possibilities for it: drawing from the first dot to the second, or from the second to the first.

Learning from the previous example, we will not randomly pick the gesture type and then the gesture. Instead, we will sum up all possible gestures and then pick a random number that maps with equal probability onto each possible gesture. There are 2 possible taps, 8 possible circles, and 2 possible lines. The total number of gesture sequences is 12^3 = 1,728. This gives us an Odds1 of 0.29% and Odds100 of 25.2%. It is somewhat remarkable that so simple a picture, with only 2 POIs, would have odds this low for a successful attack. Even if you had 100 machines to attempt to break into, you would only succeed in getting into at least one machine about 1 out of 4 tries.
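
One way to sketch "list every gesture, then pick uniformly" in Python is shown below. This is an assumption-laden illustration: POIs are simply numbered 1..n, the tuple layout and the names all_gestures and random_sequence are invented for the example, and the standard secrets module is used for the random choice.

```python
import secrets


def all_gestures(pois):
    """List every gesture in the simplified model for POIs numbered 1..pois."""
    gestures = []
    for p in range(1, pois + 1):
        gestures.append(("tap", p))
        for size in ("small", "large"):
            for direction in ("cw", "ccw"):
                gestures.append(("circle", p, size, direction))
        for q in range(1, pois + 1):
            if q != p:
                gestures.append(("line", p, q))  # drawn from p to q
    return gestures


def random_sequence(pois, length=3):
    """Pick each gesture with equal probability from the full list,
    rather than picking a gesture type first."""
    options = all_gestures(pois)
    return [secrets.choice(options) for _ in range(length)]


print(len(all_gestures(2)), "gestures to choose from")
print(random_sequence(2))
```

Because every entry in the list is equally likely, no gesture type is over-represented the way the tap was in the coin-flip method.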

Ramping it up

Let’s assume there are now 5 POIs in your picture. I can begin to imagine some very simple pictures where this might be the case. We now have 5 possible taps, 20 possible circles, and 20 possible lines. This gives us 45^3 = 91,125 possible sequences. Odds1 is now vanishingly small at 0.0055% and Odds100 is also very low at 0.55%. For many users, these odds are sufficient to protect their data.

To the max

Let’s assume you are very security conscious and choose a picture with 10 POIs. There can be debate as to how many POIs a particular photo contains. However, it doesn’t matter how many POIs are “obvious” as long as you pick 10 points that are identifiable to you to randomly choose gestures with. Actually, if some of the points aren’t obvious (but you can still reliably target them), that is a security plus.

We now have 10 possible taps, 40 possible circles, and 90 possible lines. This is a very robust 140^3 = 2,744,000 sequences. Odds1 is vanishingly small at 0.0002%. In fact, you are more than 50 times more likely to win $10,000 with a $1 ticket in the Washington State Select 4 Lottery than you are to have your machine broken into using a picture with 10 POIs! The Odds100 has dropped to 0.018% and even Odds1000 is only 0.18%.
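
The figures quoted for each scenario can be reproduced side by side with a few lines of Python. This is a rough sketch under the same simplified model; the helper names scenario and at_least_one are made up for the illustration, and the 1-POI row includes the tap (the 125-sequence variant).

```python
def scenario(pois, tries=5):
    """Return (number of 3-gesture sequences, Odds1) for a picture with
    `pois` points of interest, assuming every gesture is equally likely."""
    gestures = pois + 4 * pois + pois * (pois - 1)  # taps + circles + lines
    sequences = gestures ** 3
    return sequences, min(tries, sequences) / sequences


def at_least_one(p, machines):
    """Chance of breaking into at least one of `machines` independent machines."""
    return 1 - (1 - p) ** machines


for pois in (1, 2, 5, 10):
    sequences, p = scenario(pois)
    print(f"{pois:>2} POIs: {sequences:>9,} sequences  "
          f"Odds1={p:.4%}  Odds100={at_least_one(p, 100):.3%}  "
          f"Odds1000={at_least_one(p, 1000):.2%}")
```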

Social engineering

Social engineering is one of the most significant threats to sign-in security of all types, whether password, PIN, or picture password. Using a randomizer to help construct your sign-in sequence is equally useful for each of these methods.

For the technical enthusiast, it is possible to implement the above schemes with a small amount of programming or the use of Excel. However, it would be useful to have a lower tech way of creating a gesture sequence that a larger audience could employ. Of course, we should not be under any illusions that the number of people who seek out these tools and procedures will be any greater than the number who would voluntarily pick strong text passwords if not required by site admins.

Roll of the dice

As a whimsical exercise, I thought it would be fun to come up with an analog way of generating a random gesture sequence. To do this, I chose to employ a six-sided die (D6 for hard core gamers :-)) to generate a 6-POI gesture sequence. In addition to mapping nicely onto the die, a 6 POI picture has the useful property that the number of possible lines (30) exactly equals the number of taps (6) plus circles (24), so it is easy to bifurcate the gesture type as well.

Repeat the following steps for each of the three gestures:

  1. Roll the die.
    The number indicates which of the six POIs to use for the gesture (for a line it will be the starting POI).
  2. Roll the die again.
    • If the die is even, the gesture will be a line
      Roll the die again.
      If the number matches the first roll to pick the initial POI, reroll until you get a different number.
      This number is the second point for the line.
    • If the die is odd, the gesture will be a tap or circle
      Roll the die again.
      Use the roll value list below to determine the gesture.
      1 - The gesture is a tap
      2 - The gesture is a small clockwise circle
      3 - The gesture is a small counterclockwise circle
      4 - The gesture is a larger clockwise circle
      5 - The gesture is a larger counterclockwise circle
      6 - Reroll

As expected, the complexity provided by 6 POIs is between the numbers for 5 POIs and 10 POIs. Odds1 is 0.0023% and Odds100 is 0.23%.
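
For anyone without a die handy, the procedure above can be mimicked in a few lines of Python. This is only a sketch of the dice steps: the names roll_gesture and roll_sequence and the tuple layout are made up for the example, and the die is passed in as a function so that a real die, a script, or a random generator can supply the rolls.

```python
import random

TAP_OR_CIRCLE = {
    1: "tap",
    2: "small clockwise circle",
    3: "small counterclockwise circle",
    4: "larger clockwise circle",
    5: "larger counterclockwise circle",
}


def roll_gesture(roll):
    """Follow the dice steps once. `roll` is any function returning 1..6."""
    poi = roll()                  # step 1: which POI (start POI for a line)
    if roll() % 2 == 0:           # step 2: even means a line
        end = roll()
        while end == poi:         # reroll until the end point differs
            end = roll()
        return ("line", poi, end)
    kind = roll()                 # odd means a tap or circle
    while kind == 6:              # a 6 is a reroll
        kind = roll()
    return (TAP_OR_CIRCLE[kind], poi)


def roll_sequence(roll, length=3):
    """Generate the full sequence of gestures, one dice procedure per gesture."""
    return [roll_gesture(roll) for _ in range(length)]


die = random.SystemRandom()       # randomness supplied by the operating system
print(roll_sequence(lambda: die.randint(1, 6)))
```

Each of the 60 possible gestures comes up with the same probability of 1/6 x 1/2 x 1/5 = 1/60, which is what makes the 60^3 = 216,000 sequences equally likely.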

We hope you enjoy using the new picture password sign-in as much as we have enjoyed creating it!

--Jeff Johnson