The F# 3.0 Freebase Type Provider Sample – Integrating Internet-Scale Data Sources into a Strongly Typed Language


The Visual F# team have a new blog post: The F# 3.0 Freebase Type Provider Sample – Integrating Internet-Scale Data Sources into a Strongly Typed Language



Comments (1)

  1. Nick Evgeniev says:

    If the external data schema changes, how would this be different from dynamic languages (e.g. Python)? It looks like it doesn't eliminate fragility but rather introduces constructs that are complex to implement (type providers). Am I missing the point?

    Don replies: There's a discussion about schema change in the tech report. For the Freebase type provider sample, there are several factors in play:

    • The Commons schema evolves at a surprisingly benign pace
    • You can set a "SnapshotDate" static parameter to get a non-evolving view of the schema and data, eliminating fragility completely (at the cost of getting stale schema + data) 
    • In practice, if the schema evolves, you typically re-open your Freebase data script or data access components and they get re-typechecked automatically (depending on your schema-caching settings). The changes you need to make are highlighted by red squigglies. Very simple.
    • Depending on your goals, you can arrange to have your code/data-scripts regularly rebuilt against the latest schema as part of a continuous integration build process. This will automatically bring breaking schema changes to your attention.
    • Assuming a low rate of schema change, the tooling at design-time gives really huge productivity and discoverability improvements over dynamic languages
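
The continuous-integration point in the list above can be illustrated outside F#. The following is a minimal Python sketch, assuming a schema can be flattened to a mapping from property name to type name. The function `breaking_changes`, the property names, and the sample schemas are all hypothetical and are not part of the Freebase sample or any real API. A build step could fail whenever the returned list is non-empty.

```python
from typing import Dict, List

# Assumption: a schema is modelled as {property name: type name}.
Schema = Dict[str, str]


def breaking_changes(expected: Schema, latest: Schema) -> List[str]:
    """List the ways `latest` breaks code written against `expected`.

    Removed properties and changed types count as breaking;
    properties that were only added are ignored.
    """
    problems: List[str] = []
    for name, type_name in sorted(expected.items()):
        if name not in latest:
            problems.append(f"missing property: {name}")
        elif latest[name] != type_name:
            problems.append(
                f"type changed: {name} ({type_name} -> {latest[name]})"
            )
    return problems


# Hypothetical schemas: the one the code was written against,
# and the one fetched during the latest build.
expected = {"name": "string", "atomic_number": "int"}
latest = {"name": "string", "atomic_number": "float", "symbol": "string"}

if __name__ == "__main__":
    for problem in breaking_changes(expected, latest):
        print(problem)
```

This reports problems only when the check runs. A type provider surfaces the same information in the editor while the code is being written.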

    My experience is that the data-access authoring improvements over dynamic languages are huge. Agreed, those improvements come at the cost of "needing to write a type provider" (for the data source or data access standard like SQL or OData – not everything is for free). In contrast, with a dynamic language you are given no tooling/completion/documentation/checking assistance "in the language" at all, and generally only become aware of schema change through something akin to a high-speed car crash.

    Fundamentally there are many tradeoffs inherent in connected information access. Eliminating fragility in all circumstances is not an option, because these are heterogeneous systems. Giving mechanisms to control it under particular assumptions is more realistic – this is one such mechanism.

    Follow-up may be best on the F# MSDN forum.

    best wishes, don