CLS compilation of IronPython

One of the most common feature requests for IronPython is support for static compilation. While the feature looks like a no-brainer initially, it has a few wrinkles once you look at the details. The request shows up in several related forms:

  • Can I call IronPython code from C#?
  • Can I statically compile my IronPython source code to a binary?
  • Can I define custom attributes in IronPython?

Terminology

Compilation has several aspects, and different folks care about different ones. To clarify the issues involved, here is the terminology I will use.

  • Compiling to an assembly on disk - This is useful if you do not want to ship your source code. By itself, it says nothing about whether the IL in the assembly is CLS compliant. IronPython 1.X does support this feature, and IronPython 2 will be restoring the feature in Beta 4 (due shortly).
  • Generating CLS compliant IL - The Common Language Specification (CLS) specifies the requirements on IL (and metadata) so that it can be seamlessly consumed from statically compiled languages like C# and VB.Net. IronPython normally generates mangled IL to implement all the dynamic features of Python, which in general cannot be represented with CLS compliant IL. However, users are willing to put up with some restrictions in order to get CLS compliant code (which can be either on disk or in-memory).
  • Calling IronPython code from C# - Many folks assume this requires compiling the IronPython code, but that is not true. See the Alternatives section below.

CLS compilation usage scenarios

This blog is about the CLS compliant compilation aspect. While CLS compliance is a nice goal by itself, it is interesting to think of the different ways that the generated IL could be consumed.

  1. Authoring .NET assemblies - The compiled assembly would be referenced from a statically typed language like C# or VB.Net. It would be consumed using something like "csc /r:PythonWidgetAssembly.dll app.cs". The author of PythonWidgetAssembly.dll would probably want IronPython to support the full range of CLS features, as she wants to use IronPython as a first-class language completely on par with C# and VB.Net.
  2. Reflection on the generated assembly - This is used by tools like NUnit, which look for all types in a test assembly with a given custom attribute. Similarly, a host might look in a plugin assembly for a type implementing some known interface.
  3. Reflection on an object at runtime - Some .NET API might inspect its incoming argument object for a method with a given name, or a property with a custom attribute. LINQ to SQL looks for custom attributes like TableAttribute on the class of the object being used in a LINQ query to determine how to map the LINQ query to a database operation. Normally, the IL generated by IronPython is mangled, so a method of a Python class does not really exist as a method on instances of that class as far as .NET reflection is concerned. Also, Python does not have a concept of custom attributes.
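
To make the third point more concrete, here is a small sketch in plain Python (it uses no IronPython-specific API, and Widget and its members are hypothetical names). It does not show how IronPython emits IL. It only shows the language behavior that, I assume, makes a fixed CLS-style method table hard to produce: methods live in a mutable class dictionary and are looked up on every call.

```python
class Widget(object):
    def area(self):
        return 6


w = Widget()

# The instance itself stores no methods. "area" lives in the class
# dictionary and is found by a lookup each time it is called.
on_instance = "area" in vars(w)
on_class = "area" in vars(Widget)


def bigger_area(self):
    return 60


# Replace a method after the instance was created; w picks it up.
Widget.area = bigger_area
replaced = w.area()

# Add a member that the class body never declared.
Widget.perimeter = lambda self: 10
added = w.perimeter()

# Override the method on this one instance only.
w.area = lambda: 600
overridden = w.area()
other = Widget().area()
```

A reflection-based consumer expects the set of members to be fixed when the type is created, which is exactly what the assignments above break.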

Issues

  • For the first two usage patterns, you need to explicitly compile your Python code to an assembly. This takes away one of the important qualities of Python, which is the ability to just run your source code (as "ipy.exe foo.py") without an extra compile step. In general, you want the code to have the same semantics irrespective of whether or not you precompile it to an assembly on disk.
  • Extensions to Python syntax - Expressing the full range of CLS features would require adding new syntax to IronPython. For example, Python has no syntax for the accessibility of members (public, protected, private, internal, etc.), for method kinds like virtual or abstract, for the types of arguments or the return type, for custom attributes on assemblies, types, methods, and arguments, for specifying events, and so on. See the description of CLS for the complete list. The code could stay valid Python by expressing these features as comments or by interpreting existing Python elements in creative ways. But at some point, it will start becoming P# in appearance (i.e. not quite Python). At that point, would you be better off writing those parts of your application in C# or VB.Net?
  • Extensions to Python semantics - All Python name resolution is dynamic. For example, if you had "System.Console.WriteLine("foo")", then the name "System" is usually resolved at runtime. For the first two usage patterns, IronPython would need to attempt to bind this at compile time, which does bend the rules. In general, IronPython's support of .NET is done in a very Pythonic way. IronPython code that uses .NET could in theory also run under CPython if the .NET libraries were implemented as Python libraries. There is an experimental project called Python-System which attempts to do this. This would be a tall task, but the point is that IronPython has not bent the core semantics of the language.
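
The dynamic name resolution described in the last point can be sketched in plain Python. FakeSystem and FakeConsole are hypothetical stand-ins I made up so that the example runs without .NET; they are not part of IronPython or of the Python-System project.

```python
class FakeConsole(object):
    """Hypothetical stand-in for System.Console; records lines instead of printing."""

    def __init__(self, tag):
        self.tag = tag
        self.lines = []

    def WriteLine(self, text):
        self.lines.append(self.tag + ":" + text)


class FakeSystem(object):
    """Hypothetical stand-in for the System namespace."""

    def __init__(self, tag):
        self.Console = FakeConsole(tag)


def greet():
    # Nothing here is bound when greet() is defined. Each call looks up
    # "System" in the module globals, then "Console", then "WriteLine".
    System.Console.WriteLine("foo")


first = FakeSystem("first")
second = FakeSystem("second")

System = first            # the global is bound only after greet() was defined
greet()

System = second           # rebind it; the same function reaches another object
greet()

second_console = second.Console
second.Console = FakeConsole("swapped")   # attribute lookups are just as late
greet()
swapped = System.Console
```

Each call to greet() ends up writing to a different console, even though the function body never changes. A compiler that bound "System" ahead of time would have to pick one of these targets, which is the rule-bending mentioned above.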

Alternatives

I do want to provide people with some possible alternative solutions that might work for their scenarios in the meantime.

  • Writing part of the application in C# or VB.Net, and then extending it from IronPython - For example, you could define the SQL-related types in C# or VB.Net so that you could set the custom attributes from the System.Data.Linq namespace. (This particular example would not work today, as IronPython does not support authoring LINQ queries yet, but it illustrates the point.)
  • Using the DLR hosting APIs to call into IronPython code from C# or VB.Net
  • In the future, languages could possibly provide better support to call into IronPython as discussed in this blog.

Conclusion

Usage 3 above (reflection on an object at runtime) seems like the main issue to me, since in that scenario the user is not trying to persist an assembly to disk (a step which is not Pythonic).

Also, some features might provide big bang for the buck (like custom attributes on types and members, as required by LINQ to SQL), whereas supporting the full range of CLS features may not be warranted. I would be curious to hear about real-world use cases for "static compilation" to understand what the right design point is.

If you have run into a wall with trying to use IronPython because of one of the issues above, do leave a comment about your exact scenario, which libraries were involved, what your workaround was, etc.