Computed Properties One Pager


UPDATE: this blog post is out of date. Please review this instead. 


If you’ve been paying attention to what the Entity Framework team has been saying, you will remember hearing us talk about using the Entity Data Model (EDM) as a common way of describing models that can be understood by many different applications.


As of V1, the Entity Framework supports ad-hoc line of business applications. In the future, though, we also want to add extra services, such as reporting. To make this work we need to make a number of key extensions to both the EDM and the Entity Framework itself.


The first of these is Computed Properties.


The Reporting Model requirement is the ability to create reports in terms of properties that are not mapped to fields in the database but are instead computed.


Supporting this idea is then vital if we imagine a world where Reporting Services works over the EDM.


Computed Properties, however, are useful for more than just Reporting scenarios. What follows is the initial one pager (okay, more like 5 pages) on the topic:


Scenario:


We want to support Reporting Services (and other customers too) that need some properties of an entity to be computed in the store.


Examples range from a Person’s Fullname which is simply the Person’s Firstname and Surname concatenated, to more complicated scenarios that use Navigations, and / or completely random eSQL.


Scenario Detailed Walkthrough:


In the EntityType definition a user creates a Computed Property, either with or without the help of tooling:


<EntityType Name="Person">
  <Key>
    <PropertyRef Name="ID" />
  </Key>
  <Property Name="ID" Type="Int32" />
  <Property Name="Firstname" Type="String" MaxLength="50" />
  <Property Name="Surname" Type="String" MaxLength="50" />
  <Property Name="Fullname" Type="String">
    <DefiningExpression>
      it.Firstname + ' ' + it.Surname
    </DefiningExpression>
  </Property>
  <Property Name="CreatedDate" Type="DateTime" ReadOnly="True" />
</EntityType>


In this example the “Fullname” property is computed using the eSQL included in the DefiningExpression under the Property.


Interesting Points:



  • We re-use the <Property> element for computed properties. What distinguishes a computed property is that it has a <DefiningExpression> sub-element.

  • Calculated Properties are always read only.

  • We also want to introduce a ReadOnly attribute to the Property element that can be used to mark non-calculated properties as read only too (See the CreatedDate property above).

  • In the DefiningExpression’s eSQL we add support for a contextual variable, tentatively called “it”, which can be used by programmers to reference properties and navigation properties on the Entity.

  • While any eSQL can be written in the DefiningExpression, the expected shape of the resulting query must be projectable into the Type specified at the Property level. For example, in the above, the Fullname property is of type String, so the DefiningExpression is valid only because its return shape is expected to be a single row with a single string column.

  • Calculated Properties can’t be referenced in the Key of the Entity.

  • Can Calculated Properties reference other Calculated Properties in the same Entity? Ideally yes.

  • Can Calculated Properties reference Calculated Properties in other Entities? Ideally yes.

  • Invalid eSQL or invalid return type of the expression should ideally result in a Parsing Error at runtime.
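
A few of the rules above can be checked mechanically. What follows is only a rough sketch in Python, not anything the Entity Framework tooling does: it assumes a fragment shaped like the Person example (with quoted attribute values), and the function name classify_properties is hypothetical. It treats any Property with a DefiningExpression child as computed, treats computed properties as read-only, and flags computed properties that appear in the Key.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment mirroring the Person example, with quoted attributes.
CSDL = """
<EntityType Name="Person">
  <Key>
    <PropertyRef Name="ID" />
  </Key>
  <Property Name="ID" Type="Int32" />
  <Property Name="Firstname" Type="String" MaxLength="50" />
  <Property Name="Surname" Type="String" MaxLength="50" />
  <Property Name="Fullname" Type="String">
    <DefiningExpression>it.Firstname + ' ' + it.Surname</DefiningExpression>
  </Property>
  <Property Name="CreatedDate" Type="DateTime" ReadOnly="True" />
</EntityType>
"""


def classify_properties(entity_xml):
    """Return (computed, read_only, key_errors) name lists for one EntityType."""
    entity = ET.fromstring(entity_xml.strip())
    key_names = {ref.get("Name") for ref in entity.findall("./Key/PropertyRef")}
    computed, read_only, key_errors = [], [], []
    for prop in entity.findall("./Property"):
        name = prop.get("Name")
        # A property is computed when it carries a DefiningExpression child.
        is_computed = prop.find("DefiningExpression") is not None
        if is_computed:
            computed.append(name)
        # Computed properties are always read-only; others opt in via ReadOnly.
        if is_computed or prop.get("ReadOnly", "False").lower() == "true":
            read_only.append(name)
        # Computed properties must not be referenced in the Key.
        if is_computed and name in key_names:
            key_errors.append(name)
    return computed, read_only, key_errors


computed, read_only, key_errors = classify_properties(CSDL)
print(computed, read_only, key_errors)
```

The sketch does not parse or type-check the eSQL itself, so it says nothing about the return-shape rule.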

Computed Property Return Types:


The example given so far is a special case, in that the eSQL in the DefiningExpression only references properties of the current entity. We can therefore deduce that there will be just one row, and from the shape of the projection we can deduce that there is one string column. Hence we can safely assume this eSQL can be materialized into a single string column or property in the reader or object respectively.


However a DefiningExpression can easily produce multiple rows, either by using a NavigationProperty hanging off the current Entity, or by virtue of being a standard query with an undetermined set of results.


This means we also need to look at handling computed properties that return a Collection.


Additionally the eSQL can return more than one column, so the collection may not necessarily be a Collection of scalars.


Ideally then we should support a collection of ComplexTypes, Entities and perhaps even RowTypes.


Collections of Scalars:


An example would be this:


  <EntityType Name="Person">
    <Key>
      <PropertyRef Name="ID" />
    </Key>
    <Property Name="ID" Type="Int32" />
    <Property Name="Firstname" Type="String" MaxLength="50" />
    <Property Name="Surname" Type="String" MaxLength="50" />
    <Property Name="Fullname" Type="String">
        <DefiningExpression>
        it.Firstname + ' ' + it.Surname
        </DefiningExpression>
    </Property>
    <NavigationProperty Name="Friends" Relationship="FriendShip" FromRole="FriendOf" ToRole="Friend" />
    <Property Name="Acquaintances" Type="Collection(String)">
        <DefiningExpression>
        SELECT VALUE f.Fullname
        FROM it.Friends AS f
        </DefiningExpression>
    </Property>
  </EntityType>


In this example Acquaintances navigates through a navigation property “Friends”, which returns a Set, so the result is a set of the projected Fullnames.
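
To make the semantics of “it” concrete, here is a small in-memory analogy in Python. It is only a sketch: nothing is evaluated in the store, and the Person class and the sample names are made up for illustration. Each property mirrors one DefiningExpression, with self playing the role of “it”.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    """Hypothetical in-memory stand-in for the Person entity."""
    id: int
    firstname: str
    surname: str
    friends: List["Person"] = field(default_factory=list)

    @property
    def fullname(self) -> str:
        # Mirrors: it.Firstname + ' ' + it.Surname
        return self.firstname + " " + self.surname

    @property
    def acquaintances(self) -> List[str]:
        # Mirrors: SELECT VALUE f.Fullname FROM it.Friends AS f
        return [f.fullname for f in self.friends]


# Made-up sample data.
kati = Person(2, "Kati", "Jones")
colin = Person(1, "Colin", "Smith", friends=[kati])
print(colin.fullname, colin.acquaintances)
```

Note that acquaintances builds on fullname, which is the "computed property referencing another computed property" case from the list of interesting points.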


Another way of writing the calculated property that hard-codes the target set in the model would be:


<Property Name="Acquaintances" Type="Collection(String)">
  <DefiningExpression>
  SELECT VALUE f.Fullname
  FROM [Person] AS P, P.Friends AS F
  WHERE P.ID = it.ID
  </DefiningExpression>
</Property>


This approach is not recommended because it mixes the Entity structure with the extents or EntitySets, which of course won’t work well with MEST (multiple entity sets per type).


Incidentally, there is an unrelated proposal for RS to extend eSQL to make it possible to express the above in this simplified syntax:


<Property Name="Acquaintances" Type="Collection(String)">
  <DefiningExpression>
  SELECT VALUE P.Friends..Fullname
  FROM [Person] P
  WHERE P.ID = it.ID
  </DefiningExpression>
</Property>


Note that if the ‘..’ notation is supported in eSQL in the future, then it should be a no-op on our end to support this too:


<Property Name="Acquaintances" Type="Collection(String)">
  <DefiningExpression>
  it.Friends..Fullname
  </DefiningExpression>
</Property>


You can also imagine a query that doesn’t use “it” at all, and is completely hardcoded:


<Property Name="TopGuns" Type="Collection(String)">
  <DefiningExpression>
  SELECT VALUE P.Fullname
  FROM [Person] AS P
  WHERE P.Firstname IN {'Colin', 'Kati'}
  </DefiningExpression>
</Property>


Object Services:


Computed properties that return a single scalar value are treated the same as normal properties. The only difference is that they are read-only, which probably only means that the ObjectStateManager needs to ignore changes to these properties, rather than making them impossible to change altogether by removing the setter.


For Collections of Scalar types the Property should be an IEnumerable<T>, i.e. in the Acquaintances example above there should be an IEnumerable<String> property called Acquaintances on the entity. Again, changes should be ignored.
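
The "ignore changes" behaviour can be sketched as follows. This is an illustration in Python only: ChangeTracker and record_change are hypothetical names standing in for whatever the ObjectStateManager would actually do, and are not part of any Entity Framework API.

```python
class ChangeTracker:
    """Hypothetical stand-in for the ObjectStateManager (not a real EF type)."""

    def __init__(self, computed_names):
        # Names of the properties that carry a DefiningExpression.
        self._computed = set(computed_names)
        # Changes that would be sent to the store on save.
        self.pending = {}

    def record_change(self, name, value):
        """Record a change; return False when it is ignored."""
        # The setter still exists, but changes to computed properties are
        # dropped instead of being tracked or raising an error.
        if name in self._computed:
            return False
        self.pending[name] = value
        return True


tracker = ChangeTracker(["Fullname", "Acquaintances"])
tracker.record_change("Fullname", "Someone Else")  # ignored
tracker.record_change("Surname", "Jones")          # tracked
print(tracker.pending)
```

The trade-off is that an ignored write fails silently, whereas removing the setter would surface the mistake at compile time.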


Entities, ComplexTypes and RowTypes:


The eSQL in the DefiningExpression can easily project multiple columns so we need a way of setting the type of Property accordingly. There are only 3 real options: Entities, ComplexTypes and RowTypes.


In this example we reverse a navigation property as a computed property and return a collection of Person entities, i.e. the people who have the current person as a friend:


    <Property Name="FriendOf" Type="Collection(Person)">
        <DefiningExpression>
        SELECT VALUE p
        FROM [Person] AS p, p.Friends AS f
        WHERE f.ID = it.ID
        </DefiningExpression>
    </Property>


The query itself is unimportant. What is important is that we are returning a set of Entities. Likewise we could do something similar to return a list of ComplexTypes:


    <Property Name="FriendsAddresses" Type="Collection(Address)">
        <DefiningExpression>
        SELECT VALUE f.Address
        FROM it.Friends f
        </DefiningExpression>
    </Property>


If we use a navigation property that returns a single Entity then you could do something like this too: 


    <Property Name="MaternalGrandmother" Type="Person">
        <DefiningExpression>
        it.Mother.Mother
        </DefiningExpression>
    </Property>


Or a single ComplexType: 


    <Property Name="MothersAddress" Type="Address">
        <DefiningExpression>
        it.Mother.Address
        </DefiningExpression>
    </Property>


Ideally we should support both single instances and Collections of both Entities and ComplexTypes.


Random projections (i.e. RowTypes) are much more difficult.


If they can be coerced into a well-known shape by property name matching then it might be possible to project into ComplexTypes. But if not, then there is no O-Space equivalent of a RowType that is appropriate.
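
Coercion by property name matching could look roughly like this. It is a Python sketch under stated assumptions: the Address fields and the coerce_row helper are invented for illustration, a row is modelled as a plain dict of column name to value, and matching is exact and case-sensitive.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Address:
    """Hypothetical ComplexType; the field names are made up."""
    street: str
    city: str


def coerce_row(row: dict, target) -> Optional[object]:
    """Project a row into `target` when the column names match its fields.

    Returns None when the names differ, which is the case where no
    well-known shape exists for the row.
    """
    expected = {f.name for f in fields(target)}
    if set(row) != expected:
        return None
    return target(**row)


matched = coerce_row({"street": "1 Main St", "city": "Seattle"}, Address)
unmatched = coerce_row({"street": "1 Main St", "zip": "98101"}, Address)
print(matched, unmatched)
```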


Random Examples:


Including an FK in the structure of an Entity:


  <EntityType Name="Person">
    <Key>
      <PropertyRef Name="ID" />
    </Key>
    <Property Name="ID" Type="Int32" />
    <Property Name="Firstname" Type="String" MaxLength="50" />
    <Property Name="Surname" Type="String" MaxLength="50" />
    <Property Name="MotherID" Type="Int32">
        <DefiningExpression>
        it.Mother.ID
        </DefiningExpression>
    </Property>
    <NavigationProperty Name="Mother" Relationship="Mother_Child" FromRole="Child" ToRole="Mother" />
  </EntityType>


Note: While this is handy, we should not confuse it with a solution to the requirement for collocation of FK data. As mentioned previously, calculated properties are always read-only, and the collocation problem needs a read-write solution that allows setting the FK in the Entity directly.


Why DefiningExpression?


DefiningExpression is something we will probably use for C-Side Functions (i.e. functions defined in the Entity Data Model in the CSDL). So it makes sense to re-use this term, in order to keep the CSDL vocabulary and mental load smaller.


Also today we already have DefiningQuery in the storage model and QueryView in the mapping, so re-using either of those terms could easily lead to confusion.


Also, and perhaps most interestingly, if we interpret the eSQL as an expression, then:


1 => int


Whereas if we interpret it as a query:


1 => Collection(int)


The former has useful semantics because we can call functions in the eSQL fragment and declare that the property has a scalar return type rather than a Collection(scalar) return type.
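
The difference between the two interpretations can be shown with a toy model in Python. This is purely illustrative: the function names are hypothetical, nothing here parses eSQL, and describe only handles ints and non-empty lists.

```python
def evaluate_as_expression(literal):
    """Expression semantics: the value itself is the result."""
    return literal


def evaluate_as_query(literal):
    """Query semantics: the result is a collection containing the value."""
    return [literal]


def describe(value):
    """Render a rough EDM-style type name (ints and non-empty lists only)."""
    if isinstance(value, list):
        return "Collection(" + describe(value[0]) + ")"
    return "int" if isinstance(value, int) else type(value).__name__


print(describe(evaluate_as_expression(1)))
print(describe(evaluate_as_query(1)))
```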


…eof…


I for one would love to hear your thoughts.


Alex James
Program Manager,
Entity Framework Team


This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

Comments (17)

  1. Although we want all folks from the team to be able to post content as they saw fit on the EF Design

  2. PeterV says:

    Those computed properties could be very handy in certain cases! Like when you are keeping ‘length’, ‘height’ and ‘width’ properties and want to have a property ‘volume’ that is calculated for you.

    One thing I’m thinking about is how exceptions raised while evaluating the expressions would/could be handled.

    What if in the example

    <Property Name="MothersAddress" Type="Address">

           <DefiningExpression>

           it.Mother.Address

           </DefiningExpression>

    </Property>

    Mother is null, will there be an exception thrown? Or what happens when you do some calculations and get a DivideByZeroException? Will it be possible to put a try/catch inside the DefiningExpression tag?

    Or are exceptions always thrown and is the caller responsible for handling these exceptions?

    Kind regards,

    Peter

  3. JorgeLeo says:

    To be honest I don’t get this one.

    If we are moving into POCO… Why would I define computed properties in the scheme map? What I would do is to have read only properties in C# (or whatever other language) with the formula. Done!

    I’ll check for my nulls, I have a much richer environment to express my formulas, and it keeps me in the POCO philosophy.

    The only place where I can see some use of it is when the value of a property must be calculated as an aggregation of some child objects. But there are already so many ways to do it.

    Regards.

    Jorge

  4. @JorgeLeo

    POCO works if you have classes.

    As Alex mentioned, the primary scenario for this is Reporting Services. In reporting services there are no classes, Reporting Services will work against the Entity Client (the EDM aware ADO.NET Provider) and will query using ESQL and get back readers. In this case the computed properties need to be defined in the model they cannot be defined in classes.

  5. JorgeLeo says:

    @timall

    I understand that the primary scenario is for reporting purposes.

    Still I disagree.

    If there is a calculated value that needs to be displayed as part of the form, then it makes sense to have a calculated property. The class should "know" about the results that the class calculates. The knowledge of the formula and its meaning in the business semantics is something that the class should know.

    Since the class produces the result, this result can be persisted with the rest of the properties. In other words, if the storage is a database (but Persistence Ignorance can override this example), another column needs to be added in the database for the read only property, and it contains the result of that calculation for that particular object.

    From a reporting perspective this is much more efficient since the results are already calculated. From a model perspective it reduces the places where the formula is represented to a single place, in the property definition, added bonus: easier maintenance, and as opposed to having the formula replicated in the report too.

    If the formula needs only to be displayed as consequence of a report, still the class (in my very personal opinion) should be the right place to put it. Like in the case of (for the sake of the argument) windows forms, where we use an MVC model, it should be the same case of the reports. Meaning, do not store business knowledge in the report definition, the report should be just a read-only paper (or screen) representation of the data. The fact that it makes sense for a particular field to be totaled (as opposed to average or something else) is part of the business knowledge, not part of the report presentation.

    In other words, I think it is a reasonable expectation that the reporting tool should be able to query from the POCO, not from the storage. Persistence Ignorance also can say that the objects are stored in a place where common reporting tools cannot reach.

    Reporting services should be extended to support Persistence Ignorance.

    My 2 cents…

    Jorge

  6. @Jorge

    I think you are missing the point.

    We are doing work for the RS team, the RS team will not use classes, they will just use EntityDataReader there are no classes in their scenarios.

    The EDM is a data model that is intended to be used across a number of SQL Server services, such as RS, where there is no expectation that CLR classes exist. This scenario is not about POCO or Prescribed Classes, this is about Entity Client and the EDM.

  7. JorgeLeo says:

    @timmall,

    Question… what comes first? RS or EF?

    Looks one of them will do a sacrifice, and based on your response it sounds like it was already decided EF.

    If RS comes first, then the point is null. Duplication of formulas will be a necessary evil.

  8. jrista says:

    @JorgeLeo,

    You should open your mind up a bit and look at the larger picture the EF team is targeting. As Timmal said, this 1-Pager has to do with EDM. EDM is a conceptual design language, and the ultimate goal is to provide a common conceptual definition framework that can service many higher level needs, not JUST the needs of programmers. POCO is a need for programmers, and as you said, computed properties can just be done in the class.

    However, when it comes to a different higher level need, such as reporting, there IS no class. RS consumes entities directly from the conceptual model. Not only that, but the developer of a report does not necessarily have the need, desire, or even right to modify the database to add a computed column or modify a view to provide the required computations at the lowest possible level.

    Having the capability to define computed properties in an expressive manner directly as part of an entity will be a huge, timesaving boon.

  9. jdevonshire says:

    Wouldn’t the value of a computed property be easier to derive if System.ValueType supported an OnChange event?  The event would facilitate rippling and tracking changes across a computed value graph allowing Calculated Properties to reference Calculated Properties in other Entities within the conceptual model.

    Implementing the event in the base Framework shouldn’t be too intrusive to merit consideration.

  10. shawn says:

    I think something like this could be very useful!

  11. Can I use Computed Properties in eSQL? Like:

    "select VALUE  from Person as p where p.Mother.Id=10"

    where Mother is a composed property…

  12. Alex D James says:

    Alexnaldo,

    Hi, Alex James here… long time no talk.

    I will be writing a follow-up post on computed properties and something called Model Defined Functions soon. That will hopefully answer all your questions!

    Cheers

    Alex

  13. A while back I wrote a post that introduced the concept of Computed Properties . Since that time we’ve

  14. Hello everyone! At Microsoft PDC2008, Program Manager Tim Mallalieu presented "Entity Framework Futures"…

  15. Brian P says:

    This is great news.  I agree with JorgeLeo in the sense that it would be nice if the CLR classes were the model and RS could use these classes as their source.  I would rather put these calculated properties in code.  But, you can’t use those properties in your linq queries, so this will solve that right?  So I’m glad it will at least be possible.  Any chance that you guys could work on this scenario?

  16. Tanveer Badar says:

    @ jdevonshire

    Change notification for value types looks like a very desirable feature, until you look at the consequences.

    Value types have no hidden costs when it comes to storage. This is the reason you cannot do things like lock( i ) where i may be an integer.

    Consider how such a change notification would be implemented for a simple case of integer.

    int i = 5; // This may translate to push 5 in x86 terms.

    i = 6; // This may translate to mov eax, 6

    If the runtime was supposed to raise a change notification at the second line, a simple mov won’t do that. You need much more baggage than that. A place to hold the delegate reference, a slot in function table to invoke that delegate, etc. It’s not a trivial thing to do.

  17. JLSPublic says:

    @Tanveer Badar

    What about having the Property "Set" methods on classes automatically raise the change notification? It seems like a straightforward thing for the CLR to do.