A Sample of the Memoization Pattern in F#


Pieter Breed has been asking some useful questions on the F# list recently about using the “Map” type. I really appreciate it when people take the time to let us know when data structures are tricky to learn to use in F#.

Looking at Pieter’s blog made me think it might be instructive to show how the memoization pattern looks in F# (or at least one example of the pattern).  I’ve included the code below:

#light

open System
open System.Text
open System.Collections.Generic

// Save computed results by using an internal dictionary.
// Note that memoize is inferred to have type
// memoize: ('a -> 'b) -> ('a -> 'b)
let memoize f =
    let cache = Dictionary<_, _>()
    fun x ->
        if cache.ContainsKey(x) then cache.[x]
        else let res = f x
             cache.[x] <- res
             res

// An example, taken from Pieter's blog
let memoizedAppend =
    memoize (fun input ->
        printfn "Working out the value for '%A'" input
        String.concat ", " [ for i in 0 .. 9 -> sprintf "%d: %s" i input ])

And using this in F# Interactive:

> Console.WriteLine(memoizedAppend("this"))
Working out the value for '"this"'
0: this, 1: this, 2: this, 3: this, 4: this, 5: this, 6: this, 7: this, 8: this, 9: this

> Console.WriteLine(memoizedAppend("this"))
0: this, 1: this, 2: this, 3: this, 4: this, 5: this, 6: this, 7: this, 8: this, 9: this

Some things to note here:

  • The “Working out the value for …” line is only printed once, as we expect.
  • The generic type of memoize is computed automatically. This makes the code short and clear (once you get used to the F# syntax). This is known as automatic generalization, and is a key part of type inference and succinct coding in all typed functional languages. (There is a small sketch of this just after the list.)
  • The example uses a double lookup – this can be changed if necessary, but I thought it instructive to keep the same structure as Pieter’s sample.
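
To see the automatic generalization point in action, the same memoize can be reused at completely different argument and result types, with no annotations on memoize itself. This is just an illustrative sketch – the definitions below are not part of Pieter’s sample:

let memoSquare = memoize (fun (n : int) -> n * n)         // int -> int
let memoLength = memoize (fun (s : string) -> s.Length)   // string -> int

memoSquare 12 |> ignore        // computed on the first call
memoSquare 12 |> ignore        // served from the cache on the second call
memoLength "hello" |> ignore   // a separate cache, at a different type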

There will be a section on memoization and caching in the Expert F# book, in the “Techniques” chapter.

Update: Here’s the single-lookup version of this:

let memoize f =
    let cache = Dictionary<_, _>()
    fun x ->
        let mutable ok = Unchecked.defaultof<_>
        let res = cache.TryGetValue(x,&ok)
        if ok then res
        else let res = f x
             cache.[x] <- res
             res

Update: Here’s a version that uses a single mutable reference cell holding an immutable tree-based Map value.

let memoize f =
    let cache = ref Map.empty
    fun x ->
        match (!cache).TryFind(x) with
        | Some res -> res
        | None ->
            let res = f x
            cache := (!cache).Add(x,res)
            res
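
As an aside, one small perk of holding the cache in a reference cell over an immutable Map is that reading the cell hands back an ordinary immutable Map value, so you can give callers a snapshot of the cache for inspection without worrying about later mutation. A quick sketch (the name memoizeWithSnapshot is purely illustrative):

let memoizeWithSnapshot f =
    let cache = ref Map.empty
    let lookup x =
        match (!cache).TryFind(x) with
        | Some res -> res
        | None ->
            let res = f x
            cache := (!cache).Add(x, res)
            res
    // The second element of the pair returns the Map as it stands right now;
    // snapshots already taken are unaffected by later additions to the cache.
    lookup, (fun () -> !cache)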

 

 

don

 

Comments (10)

  1. I find myself not liking the "pattern" of doing a .ContainsKey then a separate call to access the value.  Much better is this pattern (sorry C# here, don’t know F# yet):

    if (!cache.TryGetValue(x, out res))
    {
      res = f(x);
      cache[x] = res;
    }
    return res;

  2. Bill Odefey says:

    I have a very basic question – if you are giving an example about Map, why didn’t you use type Map from the FSharp namespace instead of using the .Net Dictionary? Does it make a difference?

  3. dsyme says:

    Hi Bill,

    I’ve added the code for a version that uses "Map". Maps are an immutable data structure not usually used for this purpose, since the pattern fundamentally uses hidden mutable state, and a hash table is usually the best implementation for mutable state lookup. Maps are usually used in algorithms that explore different paths under different environments.

    That said, using "Map" has some interesting tradeoffs: maps are in general slightly slower on lookup than hash tables (O(log(n)) v. ~O(1)), though that also depends on the behaviour of the hash/compare operations on keys. However, the behaviour of the two approaches w.r.t. concurrency is interesting. Strictly speaking both require mutual exclusion for the accesses to the mutable store. However, using Maps also gives you another option: you can simply leave the code as it is above and accept a degree of recomputation. There are race conditions, but these probably aren’t important if the function being computed is idempotent:

     – two threads might both end up computing a value for f(x), and only one result will get written to the table. That’s OK since the results should be identical

     – two threads may race to update the table, and one result gets thrown away, but that’s OK, since you will just recompute on the next call.
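
    For what it’s worth, here is a minimal sketch of the mutual-exclusion option, using F#’s built-in lock function around the Dictionary-based cache (the name memoizeLocked is just for illustration, and it assumes System.Collections.Generic is open as at the top of the post):

     let memoizeLocked f =
         let cache = Dictionary<_, _>()
         fun x ->
             // Take the lock for the whole lookup-or-compute step. Note that this
             // serializes callers while f runs; releasing the lock around "f x"
             // would allow more parallelism at the cost of possible recomputation.
             lock cache (fun () ->
                 if cache.ContainsKey(x) then cache.[x]
                 else let res = f x
                      cache.[x] <- res
                      res)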

    Don

  4. Pieter Breed says:

    This is slightly OT, but it is amazing what you learn from watching other people code. It seems like for some problems I see the solution before I make the problem. For others, I make the mistake repeatedly and never realise what’s going on, until I see what somebody else is doing.

    In this case it’s the double-lookup. I just never _realised_ that I do the lookup twice. Thanks Don.

  5. snk_kid says:

    I have to say F# is looking more impressive every day; F# is so much more pleasant with the lightweight syntax.

    I have to say, well done guys for the great progress. Seriously, F# needs to replace C# as the flagship language 😛

  6. Tom Kirby-Green says:

    Again slightly off topic I’m afraid Don, but I was wondering how the Expert F# book was coming along?

    Kind regards – Tom

  7. Nik says:

    The memoize that uses Dictionary is way faster than the version with Map. In my test example, memoize with Map is even slower than no memoization at all :)

  8. This week we have practical examples of Lazy Evaluation with Memoization, an interview with Don Syme

  9. Nicolas R says:

    I think the memoization pattern can be greatly enhanced, from a practical point of view, by using a generic version of it. I will post a version. That is an excellent use case for type classes, actually.

  10. Goswin says:

    The names "res" and "ok" are not in the right place in the first update.

    correct is:

       let memoize f =
           let cache = Collections.Generic.Dictionary<_ , _>()
           fun x ->
               let mutable res = Unchecked.defaultof<_>
               let ok = cache.TryGetValue(x,&res)
               if ok then res
               else let res = f x
                    cache.[x] <- res
                    res