Idiomatic F# – Functional vs. Imperative

Our story begins with this guy, Stuart Bowers, sipping functional programming cool aid out of a kickass mug from Cafe Press.

IMAGE_154

Stuart is a developer on Amalga and full-time F# developer at Microsoft. The world is at a loss because this man does not blog...

Anyways, he has been reviewing chapters some chapters of my book and we recently had a long conversation about what should be idiomatic F# style which I’d like to share.

The Problem

Question: What is the ‘correct F# way’ to write the following code:

Create a shared, mutable value and spawn two thread pool threads to sum elements in a list. (Intentionally creating a race condition, which was the point of the code example.)

The Imperative (?) Way

Let’s start with how to do this imperatively.

 open System.Threading

let sumArray (arr : int[]) =
    let total = ref 0

    // Add the first half
    let thread1Finished = ref false
    ThreadPool.QueueUserWorkItem(
        fun _ -> for i = 0 to arr.Length / 2 - 1 do
                    total := arr.[i] + !total
                 thread1Finished := true
        ) |> ignore

    // Add the second half
    let thread2Finished = ref false
    ThreadPool.QueueUserWorkItem(
        fun _ -> for i = arr.Length / 2 to arr.Length - 1 do
                    total := arr.[i] + !total
                 thread2Finished := true
        ) |> ignore

    // Wait while the two threads finish their work
    while !thread1Finished = false ||
          !thread2Finished = false do

          Thread.Sleep(0)

    !total

The imperative code is very straight forward and offers a few benefits:

  • Simplicity. Not a whole lot is actually going on, just two calls to QueueUserWorkItem and a couple of boolean flags to keep track of when the threads complete.

However, there are some drawbacks too:

  • Redundancy. Both lambdas passed to QueueUserWorkItem are essentially the same, so the code is duplicated. If you want to spice things up, such as adding logging you will either forget to update the code in both locations or simply add it twice.
  • Hidden data. The key difference between the two threads is the range of elements in the array that they sum. The first array counts 0 to len / 2 – 1, the second thread to len / 2 to len. However, this crucial fact is buried in the code in two separate places.

The Functional (?) Way

Now let’s examine how to do this in a more functional style.

 open System.Threading

let sumArray (arr : int[]) =
    // Define a location in shared memory for counting
    let total = ref 0

    // Define two flags in shared memory
    let thread1Finished, thread2Finished = ref false, ref false
    
    // Generate a lambda to sum a section of the array. 
    let sumElements (startIndex,endIndex) flag = 
          fun (_:obj) -> 
              for i = startIndex to endIndex do
                total := arr.[i] + !total
              flag := true  
    
    // Divy up the array
    let firstHalf  = 0,(arr.Length / 2 - 1)  
    let secondHalf = (snd firstHalf)+1,(arr.Length - 1)
    
    // Generate Delegates to sum portions of the array
    let thread1Delegate = sumElements  firstHalf  thread1Finished
    let thread2Delegate = sumElements  secondHalf thread2Finished
    
    [ thread1Delegate; thread2Delegate ]       // Queue up the delegates     
    |> List.map ThreadPool.QueueUserWorkItem   // Queue the use work items
    |> ignore                                  // Ignore the results 
        
    // Wait while the two threads finish their work
    while !thread1Finished = false ||
          !thread2Finished = false do

          Thread.Sleep(0)

    !total

Let me start with the drawbacks of this functional style first:

  • Too many values. There are seven let bindings in the function – presumably they are all important.
  • Concept count. This more functional way of doing thing certainly requires you to understand a bit about F# coding. (Mapping the thread code to a curried QueueUserWorkItem is pretty hot though.)

While arguably obfuscated, the functional style offers some clear benefits:

  • Important data is called out. Rather than hard coding the ‘0 to len / 2’ and ‘len / 2 to len’ values, the bounds for which to sum array elements is called out explicitly and on consecutive lines. This makes declaring an off-by-one error much easier to spot and fix.
  • Code reuse. Rather than hard coding the two thread pool functions, a sumElements function is introduced which takes explicit bounds to sum.

Personally I’m partial to the simplicity of the first example, but the value in calling out the range values in the second example shouldn’t be discounted. One of the best things I like about F# is that being a multi-paradigm language there is usually more than one approach to solve a problem, meaning you have options.

Thanks goes to Stuart for letting me post his code. I’m curious to hear what you think with regard to the two styles, please leave a comment and let me know what you think.