Memoization


Don Syme blogged this quite some years ago but it just came up in a design review on my team this afternoon and it bears repeating.

let memoize f =
    let cache = new Dictionary<_,_>()
    (fun x ->
        match cache.TryGetValue x with
        | true, res -> res
        | _ -> let
res = f x in cache.Add(x, res); res)

This generic function takes a function and returns a new caching version of it. So simple!

You can then take some expensive function such as the standard example Fibonacci:

let rec fib = function
    | 1 | 2 -> 1
    | n -> fib (n 1) + fib (n 2)

Which is quite expensive when written this naïvely because of the exponential number of recursive calls; most with the same arguments.

> fib 42;;
Real: 00:00:01.663, CPU: 00:00:01.669, GC gen0: 0, gen1: 0, gen2: 0
val it : int = 267914296

It takes 1.663 seconds for fib 42!

Okay, let’s make a memoized version:

let rec fastFib = memoize (function
    | 1 | 2 -> 1
    | n -> fastFib (n  1) + fastFib (n  2))

It really can’t be any easier than that! Note that because it’s a recursive function simply saying let fastFib = memoize fib doesn’t work.

> fastFib 42;;
Real: 00:00:00.000, CPU: 00:00:00.001, GC gen0: 0, gen1: 0, gen2: 0
val it : int = 267914296

Viola!

Comments (8)

  1. Andrii says:

    The classical fix to it 🙂

    let memoize func =

               let cache = Dictionary<,>()

               fun i->

                   let (hasValue,value) = cache.TryGetValue(i)

                   if hasValue then

                       value

                   else

                       let value = func i

                       cache.[i] <- value

                       value

    we have only one cache lookup if we already have valie in cache

  2. Alex says:

    So… how does this work? When fib is defined it calls itself recursively, not the memoized wrapper of itself… This would only work the second time you run the function, and only on the same input original parameter. Nothing else is memoized.

  3. Tguy says:

    @alex. You need to review basic programming constructs.

  4. Alex says:

    That's not particularly helpful – the sample actually only works when you have already run it once. You can't memoize a recursive function in this way because it is defined in terms of itself, not the memoized version of itself.

  5. vytah says:

    I've tested it and the first call to fastFib 42 indeed runs the same amount of time as the call to fib 42. The subsequent calls to fastFib 42 are instantaneous. Also, running fastFib 42 in no way makes calculating fastFib 43 faster.

  6. AshleyF says:

    You're right, thanks for spotting that! I updated fastFib above.

  7. AshleyF says:

    @Andrii, good idea, thanks! I went with pattern matching, but certainly cache.TryGetValue x is better than ContainsKey x followed by cache.[x]

  8. Steve says:

    Memoization is terrific but the example above couldn't be put into a real-world product because of the possibly unbounded size of the input set.  Better to put limits on memoization so you don't run out of memory.  See here: http://www.atalasoft.com/…/limit-your-memoization-please.aspx