In the last week or so Paul Madsen made at least a couple of posts with strong visual components: one that picked up my old 2005 post on a notation for message crypto, the other on Feynman diagrams. Nice! Paul, when I am in that mood I find it especially pleasant to thumb through Tufte: I highly recommend it. Like Paul, in a former life I dealt with completely different stuff: I spent a few years on computational geometry first, and on scientific visualization later. I am absolutely in love with what I do now (proof?), but I still have some residual forma mentis from those times. There's nothing on TV until Friday (can't wait for the next Battlestar Galactica!), and I am not focused enough to do real work; hence for this post I will indulge my inner geek a bit.
On the topic of notation and diagrams, I often wonder if it would be of value to find an expressive representation of the claim propagation pattern. Would a circuit-like notation work? Or would a network flow work better? The main idea can be simple: all the claims inserted in the circuit must be there for a reason, since at a certain point the policy of an RP requested them; so for every claim produced there must be a piece of biz logic that eventually uses ("consumes") it. Hence IPs are sources and RPs are sinks; an initial coarse simplification may indirectly factor out subjects, by assuming that an RP-IP edge is in the schema only if the subject chose to disclose.
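To make the sources-and-sinks idea concrete, here is a minimal Python sketch. It assumes a toy model that is not part of any real identity stack: an `Edge` is one claim flowing from a source to a sink, and `policy` maps each RP to the claim types it asked for. The names `Edge`, `unrequested_edges`, `PersonalCard`, `PPID` and `Email` are all made up for illustration. The check encodes the rule stated above: every claim in the circuit must be there because some RP policy requested it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One claim flowing from a source (IP) to a sink (RP)."""
    source: str
    sink: str
    claim: str


def unrequested_edges(edges, policy):
    """Return the edges whose claim the sink's policy never asked for.

    `policy` maps each RP name to the set of claim types it requires.
    """
    return [e for e in edges if e.claim not in policy.get(e.sink, set())]


# Hypothetical portal: sign-in via a personal card, age check via IP1.
policy = {"RP": {"PPID", "Birthdate"}}
edges = [
    Edge("PersonalCard", "RP", "PPID"),
    Edge("IP1", "RP", "Birthdate"),
]
print(unrequested_edges(edges, policy))  # prints []
```

An edge carrying a claim nobody requested (say, a hypothetical `Email`) would show up in the returned list, which is exactly the kind of thing a notation should make visible at a glance.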
Let's take the example of one RP that implements a content portal; let's assume that the RP requires a personal card for signing in, and a managed card from IP1 for performing higher-value operations such as showing special content (like movie trailers rated only for people above a certain age). Our little schema would look like this: two sources (the personal card and IP1), each with an edge into a single sink, the RP.
Whoa, rocket science, huh? Not exceptionally useful. The reason is that it is difficult to make an interesting network if you have just sources, sinks and edges: we need nodes, primitives that have both inputs and outputs. What are the nodes of the IdM? Claim transformers, of course! Think of an R-STS, the favorite tool for implementing CTs: it chews the token used for securing the RST and spits out a new token with transformed claims. Let's revisit our little sample by adding a meaningful claim transformer:
Our RP accepts claims of type AllowedRating (from [ G | PG | PG-13 | R | NC-17 ]), and we have a claim transformer (CT1) that is able to transform Birthdate claims into AllowedRating claims. So if I were the subject, the RP would receive a token with 5 instances of AllowedRating, covering all possible values; if the subject were Giacomo, my youngest nephew (I have 12 of them :-)), he'd get just G and PG. This is an easy case; we can certainly imagine scenarios with multiple CTs in cascade.
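The Birthdate-to-AllowedRating transformation can be sketched in a few lines of Python. This is only an illustration of the idea, not how an actual R-STS is written: the age thresholds in `MIN_AGE` are my own assumption (chosen so that a small child gets just G and PG), and the function names are hypothetical.

```python
from datetime import date

# Hypothetical minimum age per rating: an assumption for illustration only.
MIN_AGE = {"G": 0, "PG": 0, "PG-13": 13, "R": 17, "NC-17": 18}


def age_on(birthdate: date, today: date) -> int:
    """Whole years elapsed between birthdate and today."""
    years = today.year - birthdate.year
    if (today.month, today.day) < (birthdate.month, birthdate.day):
        years -= 1  # birthday not yet reached this year
    return years


def transform_birthdate(birthdate: date, today: date) -> list:
    """CT1 sketch: turn one Birthdate claim into AllowedRating claims."""
    age = age_on(birthdate, today)
    return [("AllowedRating", rating)
            for rating, min_age in MIN_AGE.items()
            if age >= min_age]


today = date(2007, 1, 15)
adult = transform_birthdate(date(1975, 6, 1), today)
child = transform_birthdate(date(2001, 3, 15), today)
print(len(adult))                # prints 5
print([r for _, r in child])     # prints ['G', 'PG']
```

Note that the output token contains only AllowedRating claims; the birthdate itself never leaves the transformer.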
Ahh, now we're onto something. The flow/circuit notation may have suggested that the properties of transport networks are preserved here, but we just observed that this is not true. Kirchhoff's junction law does not hold, because the amount of information exiting a node can be smaller than the amount that entered. A birthdate carries much more information than a rating: I can always derive a rating from a birthdate, but I cannot guess the age of somebody just by knowing his movie rating privileges. This is bad news for our little notation experiment, but good news for the subject: user consent and minimal disclosure are respected here, since the subject gave consent for using his birthdate and the RP is using no more than that. Seems a good result: it matches the intuition, but it's always nice to get more reassurance.
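A quick way to see the funnel is to count how many inputs collapse onto each output. The sketch below works on whole ages rather than full birthdates to keep it short, and it reuses hypothetical thresholds of my own choosing (`MIN_AGE`); both are assumptions, not something prescribed by any rating authority or identity product.

```python
# Hypothetical minimum age per rating: an assumption for illustration only.
MIN_AGE = {"G": 0, "PG": 0, "PG-13": 13, "R": 17, "NC-17": 18}


def allowed_ratings(age: int) -> frozenset:
    """The set of ratings a subject of the given age may watch."""
    return frozenset(r for r, min_age in MIN_AGE.items() if age >= min_age)


def group_ages_by_output(max_age: int = 100) -> dict:
    """Map each distinct rating set to the list of ages that produce it."""
    groups = {}
    for age in range(max_age + 1):
        groups.setdefault(allowed_ratings(age), []).append(age)
    return groups


groups = group_ages_by_output()
print(len(groups))                           # prints 4
print(len(groups[frozenset({"G", "PG"})]))   # prints 13
```

With these thresholds, 101 possible ages map onto only 4 distinct outputs, so the transformation cannot be inverted: an RP that sees the full set of five ratings learns only that the subject is at least 18.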
Perhaps representing the CT as a single node is not such a good idea; given what we said above, it is rather an RP-IP "molecule". Let's try that, and let's add a bit more complexity to it:
Everything is as before; we just exploded CT1 into RPCT and IPCT. Ah, and we added a little detail: the claim transformer now adds a new claim to the output token, Loyalty. The claim Loyalty is some measure of how often the subject uses the service: the higher it is, the more bandwidth the RP will grant him. As usual, I can hear the objections rising: "But Vittorio, the subject never consented to disclose Loyalty information! This is in violation of the Laws!". Well, well. Let's think about it for a moment.
What if the CT is actually a resource managed by the same entity that manages the RP? After all, it is very common to have a resource STS that lives in the same domain as the RP: at that point using an R-STS is barely an implementation detail, since the RP could obtain the very same information from a profile store. If the info comes from a profile there's not much that the subject can do; the RP can save whatever it pleases: Amazon remembers the last n books I bought and I can try to exercise my consent until I am blue in the face, but I can't prevent them from using it in the application.
The matter is different if the claim transformer is a distinct entity from the RP. The additional claims that may be injected in the process may not be visible to the subject, since the claim set he explicitly consented to is just the one that starts the chain (IP1 in our case). On the other hand, the fact that CTs are intermediate endpoints rather than the starting point of a chain implies that there is no explicit direct relationship between the subject and the CTs (otherwise the chain would not have progressed beyond the CT in search of a more distant fixed point, IP1 in this case), hence the subject's ability to control information release would be nil anyway. It's like having people gossiping about you: you can't really avoid it, and it can happen even without you starting a conversation.
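The difference between a transformed claim and an injected one can be made explicit by tracking provenance. The following Python sketch assumes a hypothetical bookkeeping structure, `provenance`, that records for each claim reaching the RP which input claims it was derived from; a claim with no inputs comes from the CT's own data. None of these names belong to a real API, and the function only flags claims; it does not decide whether flagging means a violation, which, as argued above, depends on who runs the CT.

```python
def injected_claims(consented: set, provenance: dict) -> set:
    """Return the claims that do not derive solely from consented claims.

    `consented` holds the claim types the subject agreed to release.
    `provenance` maps each claim type delivered to the RP to the set of
    claim types it was computed from (empty set = CT's own data).
    """
    return {
        claim
        for claim, inputs in provenance.items()
        if not inputs or not inputs <= consented
    }


# The subject consented only to the claim set that starts the chain (IP1).
consented = {"Birthdate"}
provenance = {
    "AllowedRating": {"Birthdate"},  # funnel: derived from consented input
    "Loyalty": set(),                # injection: comes from the CT itself
}
print(injected_claims(consented, provenance))  # prints {'Loyalty'}
```

AllowedRating passes the check because it is a pure function of what the subject released; Loyalty is flagged because nothing the subject released could have produced it.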
OK, enough playing around for the evening. I am happy with the Birthdate->AllowedRating information funnel result, but I am still not 100% satisfied with the information injection case: I have the intuition, but I have to work more for a proper formalization. And the notation? Before throwing it away, I'll give it a try with the next big system I design and keep you apprised of the results 🙂