Credentials vs. Identity; Authentication vs.... what?

06/11/2007

[EDIT: added some sketches]

In short: I briefly discuss some differences between the password-based authentication model and the token-based one; then I propose that we lack a proper term for describing some of the transactions enabled by CardSpace and the token-based model.

Sometimes we get so used to the metaphors of computer science that they cease to be metaphors. When I use my Windows desktop I certainly don't think of my physical desk (though they are messy in a very similar fashion), nor do I think of real folders when I design the directory structure of a Visual Studio project.

In almost 2 years spent explaining CardSpace to a wide variety of people, I have noticed some consequences of this phenomenon in the identity management space.

The Identity Metasystem offers a very natural way of thinking about identity, one that allows us to leverage the knowledge and skills that serve us well in identity-related transactions in the offline world (the example of the beaten-up driving license shown for buying alcohol comes to mind). CardSpace supports that fully, by supplying a solid and intuitive way of handling tokens and exercising full control over what information is disclosed to whom. However, is that message intuitively compatible with the idea of authentication that the typical web site owner has? In my experience, not always; luckily, bridging the gap is very easy and takes just a few simple considerations.

In basic scenarios, authentication is often viewed as a mechanism for making sure that whoever is knocking at the door of your web site right now is actually the same person who signed up for an account some time ago. One of the minimal cases is the one in which the user provides just a user name and a password, and the web site does not store any further info in the user profile (more about what a profile is later). Sure, you may classify the username and the password as claims; however, given their low descriptive power, this may not be especially useful for explaining what a claim is to somebody who is not used to the idea. Let's instead consider username and password as parts of a device, a mechanism for ensuring that the user is actually one whose credentials are already in our database. OK, holding on to that idea, consider the following statement:

Security tokens offer a way better mechanism for doing the job we just described. In other words, tokens are in general a kind of credential that is far superior to passwords (you may want to disregard username security tokens here and stick with the SAML-like ones).

[Picture: pwdsteal]

Passwords can be guessed, stolen, and have all the nasty defects we have devoted rivers of ink to describing (see above our phishing raccoon as it steals a password and reuses it). If your user signs up for an account by presenting a security token, say one obtained via a personal information card, you can store data about the public key used for verifying the token signature. The assumption is that the corresponding private key can be used only by that user. When the user returns at a later time, he will send a new token at the moment of authenticating. If the new token is signed with the same key used at sign-up time, you will know that the user is the same and the authentication will be successful. If somebody (say a phisher) manages to intercept the token, he won't be able to impersonate the user with it. What identifies the user is the capability of signing with the private key, and there's no way to obtain that capability just by examining a token that has been signed (that's pretty much the meaning of the term asymmetric cryptography :-)). Below you see how using tokens changes the outcome for our phishing raccoon: stealing a token does not empower you to sign a new one!
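To make the sign-up/return flow a bit more concrete, here is a minimal sketch in Python. Everything in it is my own illustration rather than what CardSpace actually does on the wire: the signature is a toy textbook RSA over a hash (fixed primes, no padding, absolutely not secure), the RelyingParty class and its method names are hypothetical, and I am assuming that the RP hands out a fresh challenge for every sign-in so that each token has to be newly signed.

```python
import hashlib
import secrets

# Toy textbook RSA, for illustration only: fixed well-known primes, no padding.
# It is NOT secure and it is not the CardSpace wire format.
E = 65537


def make_keypair(p, q):
    """Return (public_key, private_key) as (n, e) and (n, d)."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # modular inverse (Python 3.8+)
    return (n, E), (n, d)


def _digest(message, n):
    """Hash the message and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(private_key, message):
    n, d = private_key
    return pow(_digest(message, n), d, n)


def verify(public_key, message, signature):
    n, e = public_key
    return pow(signature, e, n) == _digest(message, n)


class RelyingParty:
    """Hypothetical web site that stores only a public key per account."""

    def __init__(self):
        self.public_keys = {}  # account -> public key stored at sign-up
        self.pending = set()   # challenges issued and not yet used

    def sign_up(self, account, public_key):
        self.public_keys[account] = public_key

    def new_challenge(self):
        # Assumption: every sign-in must sign a fresh random value.
        nonce = secrets.token_bytes(16)
        self.pending.add(nonce)
        return nonce

    def authenticate(self, account, nonce, signature):
        if nonce not in self.pending:
            return False  # unknown or already used challenge
        self.pending.discard(nonce)
        key = self.public_keys.get(account)
        return key is not None and verify(key, nonce, signature)


# Sign-up: only the public key reaches the web site.
user_public, user_private = make_keypair(2**61 - 1, 2**89 - 1)
rp = RelyingParty()
rp.sign_up("returning-user", user_public)

# Return visit: the user signs a fresh challenge with the private key.
nonce = rp.new_challenge()
token = sign(user_private, nonce)
legit = rp.authenticate("returning-user", nonce, token)

# The raccoon captured (nonce, token) but has no private key.
replayed = rp.authenticate("returning-user", nonce, token)
stale = rp.authenticate("returning-user", rp.new_challenge(), token)

print(legit, replayed, stale)
```

The legitimate sign-in succeeds, while replaying the captured token fails both against the used challenge and against a new one: without the private key there is no way to produce a signature over the new value. A real deployment would of course use a vetted cryptography library rather than this toy.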

[Picture: token]

Let me summarize: even if you are not thinking in terms of identity and claims yet, using CardSpace for "simple authentication" purposes is a huge advantage (and I didn't even mention all the anti-phishing tricks the identity selector embeds; I'm purely talking about the protocol here).

Moving on from here, there are 2 paths we may follow. We can go the trust/federation way and go beyond the "you're one of the guys I already have in my user store" authentication style; or we can go the "hostage identity" way. We have to go to both places, it's just a matter of choosing the order; and I'm picking the "hostage identity" one. Trust/federation will naturally follow.

When a user signs up for an account at a web site, she will often have to supply more than sheer credentials. First and last name, e-mail and birth date are all common pieces of information that the web site may gather during the registration step. That's what we refer to as a "profile". It does not really matter if the registered information is accurate (I often say that I can register on many sites as Sofia Loren, and the Internet police won't show up to complain): that set of data forms an identity that sleeps at the web site. Every time the corresponding user signs in, that identity comes back to life and its components can be used for the business purposes that called for their disclosure in the first place: the name for greeting the user, the address for shipping an online purchase, and so on. That's what I call a "hostage identity": it exists, but the user can really "wear" it only when she signs in at that web site. Picture follows.

[Picture: HostageId]

Now: CardSpace allows you to represent such an identity in a personal card. You can think of a personal card as a profile that lives on your machine, as opposed to the traditional profile that is kept hostage at some RP website. One of the net results of this new possibility is that you can reuse the same profile across multiple websites, as opposed to reentering the same information over and over again (I've just registered as a speaker for the 5th edition of a conference; every single time I had to write exactly the same information, and that bugs me no end). This is very, very different from the dreaded practice of password reuse across websites: for one, CardSpace does not use shared secrets. Every time you use a personal card with an RP, the resulting token will be signed with a key that has been generated specifically for that website; and the famous PPID will be different for every website as well. That prevents RPs from colluding and comparing their logs to figure out whether you are the same person and to track your movements beyond their own website (there's an interesting debate on the subject: check it out here). Below there's a small graphical interpretation of the new situation.
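The "different identifier for every website" idea can be sketched in a few lines of Python. Note that this is not the actual PPID algorithm used by CardSpace: the pairwise_id function, the per-card secret and the example site URLs are all assumptions of mine, chosen only to show why two sites comparing logs would not find a common identifier.

```python
import hashlib
import hmac
import secrets


def pairwise_id(card_secret: bytes, rp_identifier: str) -> str:
    """Derive a per-site identifier from a secret that stays on the user's machine.

    Illustrative only: a keyed hash over the site identifier, not the real PPID.
    """
    mac = hmac.new(card_secret, rp_identifier.encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()


# Hypothetical secret tied to one personal card; it is never sent to any site.
card_secret = secrets.token_bytes(32)

id_at_shop = pairwise_id(card_secret, "https://shop.example")
id_at_forum = pairwise_id(card_secret, "https://forum.example")

# Same card, same site: the identifier is stable, so the site recognizes you.
print(id_at_shop == pairwise_id(card_secret, "https://shop.example"))
# Same card, different sites: the identifiers cannot be matched against each other.
print(id_at_shop == id_at_forum)
```

Each site sees a stable identifier for the returning user, yet the two values look unrelated to anybody who does not hold the card secret.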

[Picture: personalcard2]

Bringing your own claims with you when you sign in, like the hermit crab carries its shell around, opens up interesting scenarios. For example, a website may decide that it's not worth keeping personally identifiable information about its users: the risk of being hacked, and the consequent liability, may not be justified for certain businesses. In that case, such a website may in theory keep just the bare minimum for recognizing the subject as a returning user (i.e., the unique ID) and rely on the claims presented at the beginning of the session for performing its business function. That's of course a pretty extreme case: apart from the value of knowing the demographics of your users for, no pun intended, profiling purposes... there are still things you may need to do while the user is not logged in. A simple thing like sending a notification email, for example, requires the website to know the email address of the user (and it may be impractical to keep the notification on hold until the next time the user signs in and brings the email address in a claim). The result is that the concept of profile does not disappear with the introduction of tokens; it simply becomes a continuum, in which some info stays in the profile sitting at the website and some travels inside a token. Sometimes it will be super clear whether something should be a claim or part of the profile: if Amazon accepted tokens, for example, it would probably keep the last 10 books you bought in your profile rather than expecting you to bring that info in a claim (it's about their business, rather than your "personality"). Note that if Amazon issued a token as an IP, things would be entirely different (it's all about "ownership").

Sometimes the placement of a piece of data may be less crisp: the email address is a good example of that. The semantics of the sign-in process may be less clear as well: if an RP stored a certain email address for a user, and that user signs in with a token containing a different email address, what should the RP do? Should it update the profile with the new email? Should it use the email in the claim just for the duration of the current session? Should it ignore it altogether, and use the token just for verifying the status of returning user? The classical meaning of authentication would call for the latter, but the other strategies have their merits too. The topic is complex and would actually deserve a book (oh wait a minute... we are writing one :-)). We will return to this idea at the end.
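The three options for the conflicting email address can be written down as an explicit policy. The sketch below is only a way of making the choice visible: the EmailPolicy names, the start_session function and the dictionary-based profile are all hypothetical, and nothing here is prescribed by CardSpace.

```python
from enum import Enum


class EmailPolicy(Enum):
    UPDATE_PROFILE = "update the stored profile with the claim"
    SESSION_ONLY = "use the claim for this session only"
    IGNORE_CLAIM = "ignore the claim and keep the stored value"


def start_session(profile, claims, policy):
    """Return the email to use during this session.

    The stored profile is modified in place only under UPDATE_PROFILE.
    """
    stored = profile.get("email")
    claimed = claims.get("email")
    if claimed is None or claimed == stored or policy is EmailPolicy.IGNORE_CLAIM:
        return stored
    if policy is EmailPolicy.UPDATE_PROFILE:
        profile["email"] = claimed
    return claimed


claims = {"email": "new@example.com"}

for policy in EmailPolicy:
    profile = {"email": "old@example.com"}  # what the RP stored at sign-up
    session_email = start_session(profile, claims, policy)
    print(policy.name, "session:", session_email, "stored:", profile["email"])
```

IGNORE_CLAIM corresponds to the classical meaning of authentication, where the token only proves that the user is a returning one; the other two let the claim carry actual information, either temporarily or for good.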

So far we have discussed scenarios involving personal cards; however, this new hermit crab trick works even better with managed cards. The traditional authentication model concentrates on recognizing returning users: the idea is that during the first registration moment the RP made its maximum effort to understand who you are, and in all subsequent sessions it just wants to enjoy the ROI of that moment and perform the bare minimum for unlocking your "hostage identity" (see above). But let's dream! What if the RP had a way of knowing all the info it needs for doing business with a user, directly from somebody whose word is final in the matter, and in full confidence that the source of the information can be verified? Bingo. Let's say that Pizza&Fichi Software, an imaginary software vendor, enters into an agreement with your employer to offer special prices to you and all your colleagues. When you go to their website to buy something, Pizza&Fichi Software is not really interested in recognizing that the current user is YOU: they are rather interested in knowing whether you are the employee of a partner and whether you are eligible for a discount. Managed cards, or better the tokens obtained from IPs via managed cards, can tell that much (and more) to the RP. That's federation babe :) but it's in the hands of the user, as opposed to traditional scenarios in which the token is assigned to the user behind the scenes by some redirection (take a look at WS-Federation in the passive case).

Well well. The simple example above is already enough to show a subtle shift from the flavor of authentication we defined at the beginning. While with passwords and personal cards the accent was on recognizing a certain user, with managed cards we can do something different: we can supply the details that the RP needs for performing its business function. Are those details still identity? In many cases they are: in the example above, Pizza&Fichi Software verifies your identity as an employee of a certain company. But if you think about it, the token mechanism can be used for transmitting anything: it is enough to have a relationship with an IP, and you can obtain a token which embodies the details of that relationship. An airline may give you a managed card that represents your boarding pass, which you may use for qualifying yourself as a passenger and buying things at the web frontend of the duty free; a credit card company may give you a corresponding managed card; a shop may give you a coupon in the form of a managed card. Sure, you can still talk of your passenger identity, of your identity as a card holder and of your identity as the holder of a coupon that entitles you to a certain discount. I personally think that sometimes it's a bit of a stretch, but I am fine with that. However, in the light of what we have seen above, I am wondering if it's still handy to call the disclosure of such information via tokens "authentication". I mean, it is still compatible with the meaning of the word in itself: after all, the RP checks that the incoming token is "worthy of acceptance or belief"; but when I explain the whole managed cards trick (there's more than the small example above) I don't have the feeling that "authentication" really gives the idea.
That strikes me even harder when I consider that an RP may require those managed tokens even AFTER having authenticated (traditional meaning) the user, that is to say in the context of a traditional session (example: you begin a session at an ecommerce site by presenting a personal card you associated with your account at sign-up time; when you check out a transaction, you are asked to pick a further card, like a coupon managed card).

[Picture: managed]

The picture above shows the case of the managed card/boarding pass. In this case the claims are about my flight, and in fact I drew them in the shape of an airplane (while in the former pictures they were a straw man, more "identity"). The RP does not need to check that you are a returning user: it is happy to know that you have a valid airplane ticket, and that's certified by the fact that the claims are signed by the airline key (triangular top). The RP just needs to know the public key of the airline; all the other information comes via claims.
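The boarding pass case can be sketched as follows, again in Python and again under loud assumptions: the signature is a toy textbook RSA (fixed primes, no padding, not secure), the token is just a JSON payload plus a signature rather than a real SAML token, and the issue_token/accept_token names and the claim values are made up for the example. The point is only that the RP has no user store at all: it holds the airline's public key and nothing else.

```python
import hashlib
import json

# Toy textbook RSA, for illustration only: NOT secure, no padding.
E = 65537


def make_keypair(p, q):
    """Return (public_key, private_key) as (n, e) and (n, d)."""
    n = p * q
    return (n, E), (n, pow(E, -1, (p - 1) * (q - 1)))  # Python 3.8+


def _digest(message, n):
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(private_key, message):
    n, d = private_key
    return pow(_digest(message, n), d, n)


def verify(public_key, message, signature):
    n, e = public_key
    return pow(signature, e, n) == _digest(message, n)


def issue_token(issuer_private, claims):
    """IP side: serialize the claims and sign them."""
    payload = json.dumps(claims, sort_keys=True).encode("utf-8")
    return payload, sign(issuer_private, payload)


def accept_token(trusted_issuer_public, token):
    """RP side: return the claims if the trusted issuer signed them, else None."""
    payload, signature = token
    if not verify(trusted_issuer_public, payload, signature):
        return None
    return json.loads(payload)


# Hypothetical key pairs built from well-known Mersenne primes.
airline_public, airline_private = make_keypair(2**61 - 1, 2**89 - 1)
_, impostor_private = make_keypair(2**107 - 1, 2**127 - 1)

claims = {"flight": "XY123", "passenger": True}
boarding_pass = issue_token(airline_private, claims)
forged_pass = issue_token(impostor_private, claims)
tampered_pass = (boarding_pass[0].replace(b"XY123", b"XY999"), boarding_pass[1])

# The duty free RP knows only the airline's public key.
print(accept_token(airline_public, boarding_pass))
print(accept_token(airline_public, forged_pass))
print(accept_token(airline_public, tampered_pass))
```

Only the token signed by the airline yields claims; the one signed by somebody else and the one whose content was altered after signing are both rejected, without the RP ever asking who the user is.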

So, after a fairly long post, I can get back to my point: I think it would be nice to have a verb for describing the "disclosing" operation described above. I think Mike has a similar idea :-)

[note: it's 12:50 AM and tomorrow I have to wake up pretty early. Hence I am posting this as is, and I will eventually add the links and some sketches to ease the burden of all that text a bit :-)]
