F# for games and machine learning: .NET + performance + scripting

Ralf Herbrich is a co-leader of MSR Cambridge's Applied Games Group, which specializes in using machine learning techniques such as TrueSkill to improve the player experience of Xbox Live and other applications. He's now also an F# user and advocate, having recently used F# successfully to rapidly perform new, experimental analysis of masses of new data. Here's what he had to say in an email to his group last week. Note particularly the combination of scripting, performance and interoperability that made these applications so sweet.

The first application was parsing 110 GB of log data spread over 11,000 text files in over 300 directories and importing it into a SQL database. The whole application is 90 lines long (including comments!) and finished the task of parsing the source files and importing the data in under 18 hours; that works out to a staggering 10,000 log lines processed per second! Note that I have not optimised the code at all but written the application in the most obvious way. I was truly astonished as I had planned at least a week of work for both coding and running the application.

The second application was an analysis of millions of feedback items. We had developed the model equations and I literally just typed them in as an F# program; together with the reading-data-from-SQL-database and writing-results-to-MATLAB-data-file code, the F# source is 100 lines long (including comments). Again, I was astonished by the running time; the whole processing of the millions of data items takes 10 minutes on a standard desktop machine. My C# reference application (from some earlier tasks) is almost 1,000 lines long and is no faster. The whole job, from developing the model equations to having the first real-world data results, took 2 days.

These are my first impressions, but based on them and the things I learned along the way I would like to encourage each of you to give F# a serious try (and if you get stuck, do not hesitate to ask James Margetson; he truly is an F# evangelist and will be glad to help you out). One of the extensions developed by Don and James is the interactive mode of F#; it gives you a MATLAB-like command prompt where you can execute single-line commands right away without any need to compile source or create solutions (it's called F# Interactive, or FSI for short). Combine this with a full integration into Visual Studio 2005 (including Intellisense) and you can get started with F# without reading a 100-page manual! I could go on listing more and more reasons why you should consider F# (e.g., built-in infinite-precision arithmetic, built-in matrix type, accessibility of all .NET libraries, etc.) but if you are interested, just drop by my office and I am happy to give you a demonstration (and hopefully convince you).

The performance of the first application derives largely from that of the .NET database libraries (though any overhead in the scripting language would also hurt), but the performance of the second depends on F# itself.

A happy customer, and happy gamers ;-)