Our story begins with this guy, Stuart Bowers, sipping functional programming cool aid out of a kickass mug from Cafe Press.
Stuart is a developer on Amalga and full-time F# developer at Microsoft. The world is at a loss because this man does not blog…
Anyways, he has been reviewing chapters some chapters of my book and we recently had a long conversation about what should be idiomatic F# style which I’d like to share.
Question: What is the ‘correct F# way’ to write the following code:
Create a shared, mutable value and spawn two thread pool threads to sum elements in a list. (Intentionally creating a race condition, which was the point of the code example.)
The Imperative (?) Way
Let’s start with how to do this imperatively.
open System.Threading let sumArray (arr : int) = let total = ref 0 // Add the first half let thread1Finished = ref false ThreadPool.QueueUserWorkItem( fun _ -> for i = 0 to arr.Length / 2 - 1 do total := arr.[i] + !total thread1Finished := true ) |> ignore // Add the second half let thread2Finished = ref false ThreadPool.QueueUserWorkItem( fun _ -> for i = arr.Length / 2 to arr.Length - 1 do total := arr.[i] + !total thread2Finished := true ) |> ignore // Wait while the two threads finish their work while !thread1Finished = false || !thread2Finished = false do Thread.Sleep(0) !total
The imperative code is very straight forward and offers a few benefits:
- Simplicity. Not a whole lot is actually going on, just two calls to QueueUserWorkItem and a couple of boolean flags to keep track of when the threads complete.
However, there are some drawbacks too:
- Redundancy. Both lambdas passed to QueueUserWorkItem are essentially the same, so the code is duplicated. If you want to spice things up, such as adding logging you will either forget to update the code in both locations or simply add it twice.
- Hidden data. The key difference between the two threads is the range of elements in the array that they sum. The first array counts 0 to len / 2 – 1, the second thread to len / 2 to len. However, this crucial fact is buried in the code in two separate places.
The Functional (?) Way
Now let’s examine how to do this in a more functional style.
open System.Threading let sumArray (arr : int) = // Define a location in shared memory for counting let total = ref 0 // Define two flags in shared memory let thread1Finished, thread2Finished = ref false, ref false // Generate a lambda to sum a section of the array. let sumElements (startIndex,endIndex) flag = fun (_:obj) -> for i = startIndex to endIndex do total := arr.[i] + !total flag := true // Divy up the array let firstHalf = 0,(arr.Length / 2 - 1) let secondHalf = (snd firstHalf)+1,(arr.Length - 1) // Generate Delegates to sum portions of the array let thread1Delegate = sumElements firstHalf thread1Finished let thread2Delegate = sumElements secondHalf thread2Finished [ thread1Delegate; thread2Delegate ] // Queue up the delegates |> List.map ThreadPool.QueueUserWorkItem // Queue the use work items |> ignore // Ignore the results // Wait while the two threads finish their work while !thread1Finished = false || !thread2Finished = false do Thread.Sleep(0) !total
Let me start with the drawbacks of this functional style first:
- Too many values. There are seven let bindings in the function – presumably they are all important.
- Concept count. This more functional way of doing thing certainly requires you to understand a bit about F# coding. (Mapping the thread code to a curried QueueUserWorkItem is pretty hot though.)
While arguably obfuscated, the functional style offers some clear benefits:
- Important data is called out. Rather than hard coding the ‘0 to len / 2’ and ‘len / 2 to len’ values, the bounds for which to sum array elements is called out explicitly and on consecutive lines. This makes declaring an off-by-one error much easier to spot and fix.
- Code reuse. Rather than hard coding the two thread pool functions, a sumElements function is introduced which takes explicit bounds to sum.
Personally I’m partial to the simplicity of the first example, but the value in calling out the range values in the second example shouldn’t be discounted. One of the best things I like about F# is that being a multi-paradigm language there is usually more than one approach to solve a problem, meaning you have options.
Thanks goes to Stuart for letting me post his code. I’m curious to hear what you think with regard to the two styles, please leave a comment and let me know what you think.