New Tech Report from Microsoft Research: Strongly-Typed Language Support for Internet-Scale Information Sources

I'm very pleased to announce that Microsoft Research have published a new technical report related to F# 3.0 called Strongly-Typed Language Support for Internet-Scale Information Sources (or go straight to the PDF).

To reference this work, please cite MSR technical report number MSR-TR-2012-101.

Authors: Don Syme, Keith Battocchi, Kenji Takeda, Donna Malayeri, Jomo Fisher, Jack Hu, Tao Liu, Brian McNamara, Daniel Quirk, Matteo Taveggia, Wonseok Chae, Uladzimir Matsveyeu, and Tomas Petricek, 21 September 2012

Abstract: A growing trend in both the theory and practice of programming is the interaction between programming and rich information spaces. From databases to web services to the semantic web to cloud-based data, the need to integrate programming with heterogeneous, connected, richly structured, streaming and evolving information sources is ever-increasing. Most modern applications incorporate one or more external information sources as integral components. Providing strongly typed access to these sources is a key consideration for strongly-typed programming languages, to ensure low impedance mismatch in information access. At this scale, information integration strategies based on library design and code generation are manual, clumsy, and do not handle the internet-scale information sources now encountered in enterprise, web and cloud environments. In this report we describe the design and implementation of the type provider mechanism in F# 3.0 and its applications to typed programming with web ontologies, web-services, systems management information, database mappings, data markets, content management systems, economic data and hosted scripting. Type soundness becomes relative to the soundness of the type providers and the schema change in information sources, but the role of types in information-rich programming tasks is massively expanded, especially through tooling that benefits from rich types in explorative programming.

Introduction.

A key direction for the future evolution of programming is to allow strongly typed programming to “escape the box” of type structures defined in hand-written or tool-generated code, and to systematically bridge the gap between the language and the schematized information found in external information systems. In this report:

  • We describe the design and implementation of a novel type-bridging mechanism, the type provider mechanism in F# 3.0.
  • We describe its applications to strongly typed programming with web ontologies, web-services, database mappings, directory navigation, content management systems, scientific data sets and hosted scripting.
  • We consider the tradeoffs of these mechanisms, including the relative soundness properties of the different systems that may be designed and implemented.
  • We describe how type-bridging not only radically expands the role for names and types, but also challenges existing, comfortable assumptions about what types are, how they are selected and what properties they should have.
  • We illustrate the relative ease-of-use of the type provider mechanism as compared to alternate technologies, in addition to its performance and scaling benefits.

While we have made valuable initial progress for supporting information-rich applications, we believe that this area is an excellent opportunity for future language and tooling research, information-space modeling, schematization techniques, and language usability efforts.

This report is structured as follows. In Section 2, we consider the problem of information-rich programming, especially in the context of strongly-typed languages. Section 3 presents the type provider mechanism and explains its role in addressing information-rich programming problems, and Section 4 looks at specific examples of using the mechanism to integrate “internet-scale” information sources. Section 5 looks at themes that arise when using the type provider mechanism in practice, many of which raise interesting future R&D directions. In Section 6, we briefly describe how information-rich programming can affect our view of the logical characteristics usually associated with programming languages such as type-soundness. In Section 7 we describe other applications we have explored with the type provider mechanism, and in Section 8 we summarize, describe related work and future directions.

Enjoy!

Don Syme, for Keith, Kenji, Donna, Jomo, Jack, Tao, Brian, Danl, Matteo, Wonseok, Uladzimir, and Tomas