I got an interesting Connect item recently asking for a measurement type. (Yes, we do read these!) Essentially, the request was for a unit-aware value. From the request:
When we need to store measurements types in a table, we are faced with a problem. The data is not only a number, but also a scale (kilo, mil, ton) and a unit (gram, oz etc.). In order to maintain data consistency, we are forced to either hard code the unit and scale in the name of the column, and control that data insert process is not invalidating the data, or to have 2 columns one for the number, and one for the scale/unit, and then we must use functions to be able to work with the data, like sort, join group by etc. After all even though ‘1255 oz’ and ‘35578.6515 g’ have the weight, they will not be aggregated ion a group by statement. 2000 g and 2 t will be sorted wrong, as 2 is less than 2000, but 2000 gram is less than 2 tons.
This sounded like a perfectly reasonable idea, and it’s one I’ve heard before without giving it too much thought. When I did think about it more it seemed problematic. As I said on Connect:
This is an interesting idea, but I’m trying to wrap my head around how me might implement such a thing. I see two general ways this could be implemented:
1) SQL knows about a set of units and allows you to set only those.
2) SQL doesn’t know the units, but lets the user set a scale factor on each value.
I see problems in two areas. First is in managing the units. If we have a list of units, then someone who finds a unit that we don’t support (and we are guaranteed to miss some) will be out of luck. If we leave that management up to the user—let them set a scale on each value—then it’s really no better than letting the user store two columns.
Second is really a typing issue. If we manage the units, then we’d have to fail if we compare 2 feet to 3 seconds, since we cannot determine the relative scale of the units—they don’t have a relative scalse since they’re really not the same type. We also can’t add 2 feet to 3 seconds. (Should we be able to assign a 2 feet to a variable that contains 3 seconds?) We now have the situation where two instances of the same type are incompatible—this feels really weird.
If, on the other hand, we let the user manage the units, then we could do the addition of 2 feet and 2 seconds—we wouldn’t really know about units, just scale factors—but the result would be nonsense. I.e., it just shifts the semantic problem to the user.
It seems to me that at very least one should separate out different units into different classes based on what they’re measuring. E.g., have a separate type for weight, volume, length, time, resistance, temperature, pressure, work, hat sizes, tennis racquet gauges, etc. Obvisouly, some of these are more common than others, but I think to adequately cover this would require quite a few types.
I’m sure you can see where I’m going with this… My feeling is that from the platform perspective, we’re better off letting users solve this themselves, either through the use of multiple columns, or by creating a UDT.
An reasonable request, but one that seems to bring up some interesting mucky issues. (This, of course, is what makes it a fun problem.) Anyone else want to weigh in? Am I missing a third way (or the third rail)?