On a recently snowboarding holiday I found myself having some decidedly geeky conversations with a good friend of mine, Giles Knap. Giles is currently learning F# and was extolling its virtues while I did my best to defend C#, explaining to him that C# 3.0 included most of the functional language constructs that I was interested in.
At the end of the holiday the two of us came up with an idea. We both wanted to better understand each other’s chosen programming language, and Giles wanted something that would help him learn F#. We were both familiar with a website called ProjectEuler.net which presents a number of mathematical questions and challenges people to solve them – generally be writing a small amount of code. So, we picked 6 questions from the ProjectEuler.net site and each coded up a solution; mine in C# and Giles’ in F#.
What we were looking for was the descriptive power of the languages. The questions on ProjectEuler.net quite often have several solutions, where the most efficient solution is based upon knowing some mathematical shortcut (e.g. calculation of Fibonacci sequence is very much quicker if you happen to know about the golden ratio) but we weren’t particularly interested in writing the very best performing code – rather, we wanted to write code that showed the language off in its best light and described to a reader exactly what was going on. That said, if the code wasn’t able to generate an answer in a few seconds then it wasn’t considered.
So, here are the 6 problems, from ProjectEuler.net, that Giles chose for us:
I’ll cover problems 10 and 12 here, and the others in subsequent posts. The source code for our solutions can be found at http://FSharpCSharp.codeplex.com. If you want to play along then you’ll need to get the September 2008 CTP for F#.
Problem 10: Calculate the sum of all the primes below 2 million.
Full details of problem 10 can be found here.
To solve this problem you first need something that can generate primes. Then, with this, you can start summing them. The Sieve of Eratosthenes is a simple technique for finding all the prime numbers up to a specified integer, so in both the C# and F# solution we use this technique.
To generate primes I created a type that returned an IEnumerable<long>. This method returns an enumerator that will supply prime numbers up to some given maximum value:
The sieve works by creating nonprimes, an array of bool, and marking each position in the array as being a prime or not. All values are assumed to be prime (i.e. nonprime[n] = false) until marked as nonprimes (nonprime[n] = true). Whenever we find a prime then we mark all the multiples of that prime as being non-prime. We start marking multiples of the prime from the square of the prime because we can assume all other numbers below that value will have already been marked.
With this Prime number generator in place, the C# solution to problem 10 looks pretty simple:
By making use of the extension method Sum on IEnumerable<T> I can write a single line of code that creates my sieve, gets an enumeration of primes up to 2 million, and then sums them. This approach of writing sequence generators (methods that return IEnumerable<T>) is something that I’ve used throughout the C# solutions because it results in very simple and easy to understand solutions.
The F# solution to this problem takes the exact same approach:
In my opinion this solution benefits from a number of F# features. Firstly, it’s way more compact. All those opening and closing brackets in the C# solution are missing. Yes, I could have dropped a number of them, but I tend to stick to StyleCop’s rules for C# formatting. Secondly, the way that sequence expressions and sequences in general work in F# make it a very powerful language for solving any problem that deals with a sequence of numbers, such as those generated by the Sieve of Eratosthenes.
So, apart from some stylistic differences there isn’t much difference between the C# and F# solutions to this problem.
Problem 12: What is the value of the first triangle number to have over five hundred divisors?
Full details of problem 12 can be found here.
Problem 12 deal with triangle numbers, and then working out how many divisors each of these numbers has.
The first thing I did was write code that generated triangle numbers. As with Problem 10, I did this by writing code that returned an IEnumerable<T>. Here is my Triangle Number generator:
This simple code will yield each a triangle number when the called “pulls” in the enumerator. Next, I needed something that would tell me each of the divisors that a given integer had. I did this by writing an extension method on the Int32 type that returned an IEnumerable<int> containing each divisor:
With my Triangle Number generator, and my GetDivisors extension method, I was finally able to solve this problem with the following code:
This comprehension syntax query is, in my opinion, pretty simple to read and does a good job of describing the solution to the problem.
Giles’ F# solution also follows the same strategy as mine. Here is the code:
Conclusions (so far)
Maybe it’s because we both chose similar ways of solving these problems, or maybe it’s because these questions don’t lend themselves to showing the language differences, but to me Problems 10 and 12 have equally readable and understandable representations in both C# and F#.
One big difference that we found between the C# and F# solutions was the performance. Using some simple timing code we observed that for Problem 10 the F# solution was approx 5 time slower than the C# solution. In addition the F# solution used a pretty large amount of memory. Why is this?
The F# collection types all support IEnumerable and you can build them with a sequence generating pattern which uses yield, just like C#. However, it turns out that there are some performance overheads with this approach. These sequence iterator blocks are compiled into state machines in C# (Jon Skeet has an article that explains how this happens) but in F# they use lambda functions. Using this approach in a nested loop is therefore very expensive!
The HubFS site proved invaluable in resolving this issue, for the full thread see http://cs.hubfs.net/forums/thread/9149.aspx. A total of 4 different solutions were suggested and tested. We selected the solution that had reasonable performance but still demonstrated an elegant way of using the F# sequence type (thanks to ‘kvb‘ for the suggested code). We did find a solution that gives the same performance as C# but it was rejected as being too ugly! It’s understood that this perf issue with sequence blocks will be addressed in a future release of F#.
In the next post I’ll be looking at 25 and 27. 25 is a particularly interesting Problem in that solving it in C# was tricky and, in fact, required a little help from F#. In the mean time, feel free to play with our code, make it better, show us the error of our ways, or come up with even more elegant solutions that show one or other of the languages in their best light.