Using Face Recognition as Authentication or the CNTK vs Cognitive APIs Discussion

A couple of days ago, in one of those "exciting" conversations that we geeks have by the water cooler, I was pulled into a discussion about using "Face Recognition" in an app (i.e. validating the user in front of the camera) to ensure the right user is using the application.

Everyone was arguing about which approach should be used.

The argument revolved around the following. Would it be better to use the "Cognitive APIs"? Or should the "CNTK (Microsoft Cognitive Toolkit)" approach be used? Cool discussion, right? That's what we developers like to do: discuss cool tech stuff.

However, looking at this hard-fought battle between the paladins of these two factions, I could not help thinking that, in this case, they should rather be discussing the concept itself and tackling the true issue. And for me this was:

"Should we consider these "Face Recognition" technologies accurate enough for authentication?" And in my opinion, we should not.

Of course, I agree that using these new sexy "Face Recognition" technologies as a first step, or as an enhancer of the user experience, is a good thing. But we need to think about the following. Once the user is authenticated with face recognition, what kind of resources can they access? Is face recognition standing in for a domain username and password? Can face recognition unlock a certificate or PIN from the TPM on a laptop? Rightful concerns, right?

So, as a first step, let me start by saying that we shouldn’t use the word “Authentication” to describe what this facial recognition does. This isn’t authentication. Whether it’s built on Cognitive APIs or CNTK, it could serve as a second factor, maybe. Even then, one could argue it’s a weak one. In fact, if I can print your photo from Facebook, walk in front of the camera and get myself authenticated, then we have a problem. This is why Windows Hello and Face ID on iOS don’t rely on RGB input alone.

I know that Windows Hello, for example, is a very good and safer approach, but it is still not 100% reliable. We should use MFA (multi-factor authentication) and not rely solely on facial recognition.
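To make the "face as one factor among several" idea a bit more concrete, here is a minimal sketch of an access policy. Everything in it is hypothetical: the `LoginAttempt` fields, the `FACE_THRESHOLD` value, and the three outcomes are invented for illustration and don't come from any particular product.

```python
from dataclasses import dataclass


@dataclass
class LoginAttempt:
    face_confidence: float    # similarity score from some face matcher, 0.0 to 1.0
    second_factor_ok: bool    # e.g. a PIN, password, or hardware token was verified
    sensitive_resource: bool  # does the request touch something worth protecting?


# Hypothetical threshold, chosen only for illustration.
FACE_THRESHOLD = 0.8


def decide_access(attempt: LoginAttempt) -> str:
    """Return 'allow', 'step_up' (ask for a second factor), or 'deny'."""
    face_ok = attempt.face_confidence >= FACE_THRESHOLD
    if not face_ok:
        # A failed face match is never rescued by the other factor here.
        return "deny"
    if attempt.sensitive_resource and not attempt.second_factor_ok:
        # Face alone is treated as convenience, not as proof of identity.
        return "step_up"
    return "allow"


print(decide_access(LoginAttempt(0.95, False, False)))
print(decide_access(LoginAttempt(0.95, False, True)))
print(decide_access(LoginAttempt(0.95, True, True)))
```

The design choice here is that a face match on its own only opens low-risk resources, and anything sensitive triggers a request for a second factor.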

The same problem applies to speaker recognition (https://azure.microsoft.com/en-us/services/cognitive-services/speaker-recognition), which, despite what that page says, does not really perform authentication.

The metric we should be using is FAR (False Acceptance Rate), and it is very low for Hello: 0.001%. For many apps and customers, that is good enough. Others may want to use a second factor.
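For anyone who hasn't worked with FAR before, here is a small sketch of how it (and its counterpart FRR, the False Rejection Rate) can be computed from matcher scores at a given threshold. The scores and the threshold are made up for illustration; they are not measurements from Hello or any other system.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) at a given threshold.

    FAR: fraction of impostor attempts accepted (score >= threshold).
    FRR: fraction of genuine attempts rejected (score < threshold).
    """
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))


# Made-up scores, for illustration only.
genuine = [0.91, 0.88, 0.97, 0.62, 0.93]
impostor = [0.12, 0.33, 0.81, 0.05, 0.27, 0.40, 0.18, 0.22, 0.09, 0.30]

far, frr = far_frr(genuine, impostor, threshold=0.8)
print(f"FAR={far:.1%} FRR={frr:.1%}")

# A FAR of 0.001% works out to 1 false accept per 100,000 impostor attempts.
attempts_per_false_accept = round(1 / (0.001 / 100))
print(attempts_per_false_accept)
```

Raising the threshold lowers FAR but pushes FRR up, which is why a single number like 0.001% only tells half of the story.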

/en-us/windows-hardware/design/device-experiences/windows-hello-face-authentication

You may also want to look up the keyword “Adversarial Attacks”, a known vulnerability of DNNs (deep neural networks).

https://e4c5blog.wordpress.com/2017/11/16/favercial-recognition-adversarial-attack/
https://arxiv.org/abs/1801.01944

This could be another layer that we need to consider when we use these technologies for authentication. And that's before we even get to how we would tell whether the image coming from the camera is a photo or a real face.

These concerns, while very real, are in my opinion a bit theoretical. In reality, you need to “train” the bad examples before you present them to the network. So if falsified images can be inserted between the camera and the neural network in real time (at a rate of 24-60 frames per second) without being detected, then yes, this becomes real. I know that this feels far-fetched, even if it is conceivable. But someone would have to hack the system quite a bit to do it, and at that point they might as well just penetrate the system without bothering with the facial recognition at all. Also, there are ways to train systems to detect the liveness of an image (given enough training data, as always), or to augment the input with infrared signals (like Windows Hello and Face ID on iPhone).
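To give a feel for what "training" a bad example means, here is a toy sketch of a targeted variant of the fast gradient sign method (FGSM). It is a stand-in I picked for illustration, not the method from the links above, and it attacks a tiny logistic model rather than a real face network. The weights, bias, and input vector are all invented. Note that the attacker needs the model's weights, which fits the point about having to hack the system first.

```python
import math


def predict(weights, bias, x):
    """Logistic 'match' probability for a feature vector x."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def sign(value):
    """Return -1, 0, or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def fgsm_targeted(weights, bias, x, target, epsilon):
    """Shift each feature of x by at most epsilon toward the target label.

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to x is (p - target) * w, so we step against its sign.
    """
    p = predict(weights, bias, x)
    grad = [(p - target) * w for w in weights]
    return [xi - epsilon * sign(g) for xi, g in zip(x, grad)]


# Toy model and input, invented for illustration.
weights = [2.0, -1.5, 0.5, 1.0]
bias = -0.5
impostor = [0.1, 0.6, 0.2, 0.1]

before = predict(weights, bias, impostor)
adversarial = fgsm_targeted(weights, bias, impostor, target=1.0, epsilon=0.3)
after = predict(weights, bias, adversarial)
print(f"match probability before={before:.2f} after={after:.2f}")
```

With these toy numbers a small, bounded change to each feature is enough to push the model from "no match" to "match".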

Now, getting back to the discussion they were having about "CNTK" versus "Cognitive APIs" :-). CNTK is a general-purpose SDK for deep learning. To use it for face recognition, you would have to develop and train your own model with it. The Cognitive Services Face API, on the other hand, is a ready-to-use model that has been trained by Microsoft.

So if you want to get something up and running quickly, you can use the Face API. But you have no control over the model behind it or the data used to train it. If you need that control, have the expertise, and want to develop and tune your own model with your own data, then you should go with CNTK.
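To show what "ready-to-use" looks like in practice, here is a rough sketch that builds (but does not send) a request to the Face API "verify" operation. It assumes the v1.0 REST shape as I remember it: a POST to `/face/v1.0/verify` with two face IDs in the body and the key in an `Ocp-Apim-Subscription-Key` header. The endpoint host, the key, and the face IDs are placeholders, and the helper name `build_verify_request` is my own, so please check the current Face API reference before relying on any of it.

```python
import json
import urllib.request


def build_verify_request(endpoint, subscription_key, face_id1, face_id2):
    """Build, without sending, a Face API 'verify' request (assumed v1.0 shape)."""
    url = endpoint.rstrip("/") + "/face/v1.0/verify"
    body = json.dumps({"faceId1": face_id1, "faceId2": face_id2}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values; the endpoint, key, and face IDs are hypothetical.
request = build_verify_request(
    "https://example-region.api.cognitive.microsoft.com",
    "YOUR_SUBSCRIPTION_KEY",
    "face-id-from-enrolment",
    "face-id-from-camera",
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

There is no model to design, train, or tune here, which is exactly the trade-off: you get a result quickly, but the matching itself is a black box.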

Hope that helps.