F# for games and machine learning: .NET + performance + scripting


Ralf Herbrich is a co-leader of MSR Cambridge’s Applied Games Group, which specializes in using machine learning techniques such as TrueSkill to improve the player experience of Xbox Live and other applications. He’s now also an F# user and advocate, having recently successfully used F# to rapidly perform new, experimental analysis of masses of new data. Here’s what he had to say in an email to his group last week. Note particularly the combination of scripting, performance and interoperability that made these applications so sweet.


The first application was parsing 110 GB of log data spread over 11,000 text files in over 300 directories and importing it into a SQL database. The whole application is 90 lines long (including comments!) and finished the task of parsing the source files and importing the data in under 18 hours; that works out to a staggering 10,000 log lines processed per second! Note that I have not optimised the code at all but written the application in the most obvious way. I was truly astonished as I had planned at least a week of work for both coding and running the application.


The second application was an analysis of millions of feedback items. We had developed the model equations and I literally just typed them in as an F# program; together with the reading-data-from-SQL-database and writing-results-to-MATLAB-data-file code, the F# source is 100 lines long (including comments). Again, I was astonished by the running time; the whole processing of the millions of data items takes 10 minutes on a standard desktop machine. My C# reference application (from some earlier tasks) is almost 1,000 lines long and is no faster. The whole job, from developing the model equations to having the first real-world data results, took 2 days.


These are my first impressions but based on them and the things I learned along the way I would like to encourage each of you to give F# a serious try (and if you get stuck, do not hesitate to ask James Margetson; he truly is an F# evangelist and will be glad to help you out). One of the extensions developed by Don and James is the interactive mode of F#, called F# Interactive, or FSI for short; it gives you a MATLAB-like command prompt where you can execute single-line commands right away without any need to compile source or create solutions. Combine this with a full integration into Visual Studio 2005 (including IntelliSense) and you can get started with F# without reading a 100-page manual! I could go on listing more and more reasons why you should consider F# (e.g., built-in infinite-precision arithmetic, built-in matrix type, accessibility of all .NET libraries, etc.) but if you are interested, just drop by my office and I am happy to give you a demonstration (and hopefully convince you).


The performance of the first application derives largely from that of the .NET database libraries (though any overhead in the scripting language would also hurt), but the performance of the second is dependent on F#.
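
As a rough sanity check on the figures quoted for the first application, here is the arithmetic worked through. It assumes the run took the full 18 hours and that 1 GB means 10^9 bytes; both are assumptions, since the email gives only an upper bound on the time and rounded sizes. Under those assumptions the quoted rate implies roughly 650 million log lines, averaging about 170 bytes each, read at around 1.7 MB per second:

```latex
\begin{align*}
T &= 18\,\text{h} \times 3600\,\text{s/h} = 64{,}800\,\text{s} \\
N &\approx 10^{4}\,\text{lines/s} \times 64{,}800\,\text{s} = 6.48 \times 10^{8}\,\text{lines} \\
\text{bytes per line} &\approx \frac{110 \times 10^{9}\,\text{B}}{6.48 \times 10^{8}\,\text{lines}} \approx 170\,\text{B} \\
\text{input rate} &\approx \frac{110 \times 10^{9}\,\text{B}}{64{,}800\,\text{s}} \approx 1.7 \times 10^{6}\,\text{B/s}
\end{align*}
```

The implied line length is plausible for text logs, so the quoted numbers hang together.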


A happy customer, and happy gamers 😉


 


 

Comments (12)

  1. If this type of performance is typical, then I just might need to consider it…

    It seems a really…

  2. It is very interesting! This F# thing pulls me toward the world of functional and declarative tongues again; some time ago I worked with OCaml, Haskell and Erlang. I think there are more interesting things to discover here that F# can handle. For example, a process model and distributed computing like what Erlang already has would be very useful (much more than something buzzy like .NET remoting). I will also continue to tag along with the open world, and that in my country. But sometimes, when I am forced onto Microsoft, I would prefer semi-open creatures, and especially the handiest ones like F#! Sometime in the near future I need to bring my F# experience to some projects that are being developed now, and my programmer friends who know F# are also so happy and more productive because of it! I thank you for your efforts again! ( A Happy Good Technology User 😀 )

  3. Just once I’d like to be at a dinner party and when asked what I do for a living, respond "I work for an Applied Games Group".

    Anyone that munges through 100GB or more in data on a regular basis is OK in my book.  But, are you sure that those guys are studying the datasets and not contributing to the datasets with their own "user experiences"?

    (("In other news at MSR-Cambridge, a research-facility-wide effort was engaged to provide more game data to the Applied Games Group . . ."))

    —O

  4. Rob Burke has just written a blog entry on F#:  It was great to have Rob to visit – I first…

  5. … and now I’ve got to worry about F#! F# is a programming language that provides the much sought-after

  6. Some time ago, I speculated that F# should be the lingua-franca of BI. Well, had I done a little research,