F# in 20 Minutes – Part II

Article
05/09/2008

Now that we have covered the basics, in minutes 8 - 14 we will cover the foundational concepts and types in F#. By the end of Part II you should be able to read F# code reasonably well.

Immutability

You may have noticed that I’ve been using the term ‘value’ to refer to identifiers rather than ‘variable’. The reason for this is that types in F# are immutable by default, meaning that once they are created they cannot be changed. This may seem like a severe limitation, but immutability actually prevents some classes of bugs. In addition, immutable data is inherently thread safe meaning you don’t need to worry about sync locks in order to make your code parallelizeable. I’ll go into asynchronous programming in Part III.

If you do need to mutate some data however, F# has the mutable keyword. This creates a variable who’s value can be changed by using the left arrow operator (<-).

 > let mutable x = "the original value.";;
val mutable x : string
> printfn "x's value is '%s'" x;;
x's value is 'the original value.'
val it : unit = ()

 > x <- "the new one.";;
val it : unit = ()
> printfn "x's value is now '%s'" x;;
x's value is now 'the new one.'
val it : unit = ()

Reference values (Microsoft.FSharp.Core.Ref<_>)

Reference values are another way of being able to mutate data. But rather than having a mutable variable on the stack, a reference cell is a pointer to variable which you modify on the heap. In F# there are some restrictions on where you can use mutable values. (Such as not being able to use them in inner lambdas.) But ref objects can be safely passed around, since they are immutable record values. (Which have a mutable field which can be modified.)

To use reference cells you use (:=) to assign a new value and (!) to dereference.

 > let refCell = ref 42;;
val refCell : int ref

> refCell := -1;;
val it : unit = ()

 > !refCell;;
val it : int = –1

Modules

In Part I we just declared values and functions arbitrarily. You might have wondered ‘where did they go’, since in C# everything must belong to a class. Though F# can declare the standard .NET classes you are familiar with, it also has the notion of a module; which is a collection of values, functions, and types. (Contrast this with a namespace, which can only contain types.)

This is how we were able to access ‘List.map’. In the F# library (FSharp.Core.dll) there is a module named ‘List’ and on that is a function named ‘map’.

Modules serve as a way of encapsulating code when you are rapid prototyping without needing to spend the time to design a strict object-oriented type hierarchy. To declare your own module, you can use the module keyword. In the example we will associate a mutable variable with the module, which will serve as a global variable.

 module ProgramSettings =
    let version = "1.0.0.0"
    let debugMode = ref false
    
module MyProgram =
    
    do printfn "Version %s" ProgramSettings.version
    
    // You can also open modules like you can a namespace, which
    // brings all their functions and values into scope.
    open ProgramSettings
    debugMode := true

Tuples

A tuple (pronounced 'two-pull') is an ordered collection of values treated like an atomic unit. Traditionally if you wanted to pass around a group of semi-related values you would need to create a struct or class, or perhaps rely on ‘out’ parameters. A tuple allows you to keep things organized by grouping related values together without introducing a new type.

To define a tuple, simply enclose the group of values in parentheses separate the individual components by commas.

 > let tuple = (1, false, "text");;
val tuple : int * bool * string

> let getNumberInfo (x : int) = (x, x.ToString(), x * x);;
val getNumberInfo : int -> int * string * int

> getNumberInfo 42;;
val it : int * string * int = (42, "42", 1764)

Functions can even take tuples as arguments.

 > let printBlogInfo (owner, title, url) = printfn "%s's blog [%s] is online at '%s'" owner title url;; 
val printBlogInfo : string * string * string -> unit

 > let myBlog = ("Chris", "Completely Unique View", "https://blogs.msdn.com/chrsmith");;
val myBlog : string * string * string

> printBlogInfo myBlog;;
Chris's blog [Completely Unique View] is online at 'https://blogs.msdn.com/chrsmith'
val it : unit = ()

Function Currying

A novel feature F# provides is the ability to provide a subset of a function’s parameters, and bind a new value to the partial application. This is known as function currying. For example, consider a function which takes three integers and returns their sum. We can curry that function, by fixing the first parameter to be 10, resulting in a function that only takes two integers and returns their sum plus 10.

 > let addThree x y z = x + y + z;;
val addThree : int -> int -> int -> int

> let addTwo x y = addThree 10 x y;;
val addTwo : int -> int -> int

> addTwo 1 1;;
val it : int = 12

Discriminated Unions

Consider the following enumeration:

 enum CardSuit { Spade = 1, Club = 2, Heart = 3, Diamond = 4};

Logically there can only be one possible suit for a card, but since an enum is just an integer behind the scenes you don’t actually know if the value is valid or not. For example, in C# you can write:

 CardSuit invalid1 = (CardSuit) 9000;
CardSuit invalid2 = CardSuit.Club | CardSuit.Diamond;

This will break any code that expects the enum to only have one of the four card values. Also, Consider the case where you need to extend an enum. (Example borrowed from Jomo’s blog.)

 enum Title { Mr, Mrs }

The Title enum works great, but imagine that later you add a ‘Ms’ value. Now every single switch statement you had is a potential bug. You can try to fix up all your code, but you are bound to miss something.

Enums provide a way to easily express some concepts but don’t provide enough compiler checks to prevent errors. Enter the Discriminated Union. A Discriminated Union in F# is a type which can only have one of a set of values, known as data tags. For example, consider a discriminated union for all Microsoft employees.

 type MicrosoftEmployee =
    | BillGates
    | SteveBalmer
    | Worker of string
    | Lead   of string * MicrosoftEmployee list

If you have an instance of type MicrosoftEmployee, you know that it can only be one of {BillGates, SteveBalmer, Worker, or Lead}. Moreover, if it is a Worker then you know that there is an string associated with that data tag, probably corresponding to a name. We can create discriminated unions easily and use pattern matching - described later - to match their values.

 let myBoss = Lead("Yasir", [Worker("Chris"); Worker("Matteo"); Worker("Santosh")])

let printGreeting (emp : MicrosoftEmployee) =
    match emp with
    | BillGates   -> printfn "Hello, Bill"
    | SteveBalmer -> printfn "Hello, Steve"
    | Worker(name) | Lead(name, _) 
                  -> printfn "Hello, %s" name

Now what if we extend that Discriminated Union…

 type MicrosoftEmployee =
    | BillGates
    | SteveBalmer
    | Worker of string
    | Lead   of string * MicrosoftEmployee list
     | ChrisSmith

… now we get a compiler warning in our match.

The compiler detected that you didn’t match every tag of the discriminated union and issued a warning. Checks like this go a long way towards preventing bugs. To see a few examples of just how powerful discriminated unions are, check out this post.

Pattern Matching

Pattern matching is like a powerful switch statement, allowing you to branch control flow. You can do more than just comparing a value against a constant however, pattern matching allows you to also capture new values. Such as in our previous example, we bound the identifier ‘name’ when matching against discriminated union tags.

 let printGreeting (emp : MicrosoftEmployee) =
    match emp with
    | BillGates   -> printfn "Hello, Bill"
    | SteveBalmer -> printfn "Hello, Steve"
    | Worker(name) | Lead(name, _) 
                  -> printfn "Hello, %s" name

Pattern matching can also match against the ‘structure’ of the data. So we can match against list elements joined together. (Remember that x :: y means that x is an single element of the list, and y is list of items after it. Also ‘[]’ represents the empty list.)

 let listLength alist =
    match alist with
    | [] -> 0
    | a :: [] -> 1
    | a :: b :: [] -> 2
    | a :: b :: c :: [] -> 3
    | _ -> failwith "List is too big!"

At the end of the pattern match we had a wildcard ‘_’, which matches anything. So if variable ‘alist’ had more than three items, the last pattern clause would execute and throw an exception. Pattern matching also allows you to execute an arbitrary expression to determine if the pattern is matched. (If that expression evaluates to false, then the clause is not matched.)

 let isOdd x =
    match x with
    | _ when x % 2 = 0 -> false
    | _ when x % 2 = 1 -> true

You can even pattern match using a dynamic type test:

 let getType (x : obj) =
    match x with
    | :? string    -> "x is a string"
    | :? int       -> "x is an int"
    | :? Exception -> "x is an exception"

Records

Records are a lightweight syntax for declaring a type with several public properties. One advantage of records is that by using the type inference system the compiler will figure out the type of the record by you simply setting its values. So for example, once we define the record type ‘Address’ we don’t need to annotate a type when declare an instance of it. The compiler knows that only one record type exists with fields Name, Address, and Zip and infers that to be the type you meant.

 type Address = { Name : string; Address : string; Zip : int}    
let whiteHouse = {Name = "The White House"; Address = "1600 Pennsylvania Avenue"; Zip = 20500}

Forward Pipe Operator

I love this guy. The Forward pipe operator is simply defined as:

let (|>) x f = f x

And has a type signature:

'a -> ('a -> 'b) -> 'b

Which translates to: given a generic type 'a, and a function which takes an 'a and returns a 'b, then return the application of the function on the input.

Rather than explaining this, let me give you an example of where it can be used:

 // Take a number, square it, then convert it to a string, then reverse that string
let square x         = x * x
let toStr (x : int)  = x.ToString()
let rev   (x : string) = new String(Array.rev (x.ToCharArray()))

// 512 -> 1024 -> "1024" -> "4201"
let result = rev (toStr (square 512))

The code is very straight forward, but notice just how unruly the syntax looks. All we want to do is take the result of one computation and pass that to the next computation. We could rewrite it by introducing a series of new variables:

 let step1 = square 512
let step2 = toStr step1
let step3 = rev step2
let result = step3

But now you need to keep all those temporary variables straight. What the (|>) operator does is take a value, and 'forward it' to a function, essentially allowing you to specify the parameter of a function before the function call. This dramatically simplifies F# code by allowing you to pipe functions together, where the result of one is passed into the next. So to use the same example the code can be written clearly as:

 let result = 512 |> square |> toStr |> rev

Sequences (System.Collections.Generic.IEnumerator<_>)

A Sequence or ‘seq’ in F# is just another name for System.Collections.Generic.IEnumerator, but it has a special place in the F# world. Unlike lists or arrays, Sequences can specify infinite series. Only the current value is stored in memory, and once the sequence computes the next item, the current value is forgotten. For example, the following code snippet generates an sequence of all integers.

 let allIntegers = Seq.init_infinite (fun i -> i)

Collections (Seq, List, Array)

When you need a collection of values in F# you have a least three good options – Arrays, Lists, and Sequences. Each has their own advantages. Built into the F# library are a series of modules designed for each of these collections. You can use Visual Studio’s intellisense to explore the methods and what they do, but I’ll go into the most common ones here.

iter. The ‘iter’ function iterates through each item in the collection. This is identical to a ‘foreach’ loop. The following prints every item in a list:

 List.iter (fun i -> printfn "Has element %d" i) [1 .. 10]

map. Like I described in Part I, map transforms a collection based on a function. The following maps an array of integers to their string representations.

 Array.map (fun (i : int) -> i.ToString()) [| 1 .. 10 |]

fold. Fold is used to take the collection and fold it into a single value, or reduce it. Like Iter and Map it takes a function which will be applied to each element in the collection, but unlike iter and map that function takes an additional ‘accumulator’ parameter. As fold is continually called it builds up this accumulator parameter based on the previous computation. So for example, given a sequence of integers a valid fold operation would be to sum each item in the series. (And reduce the series to a single value.) Our accumulator would be the sum of all the elements we have seen before.

 Seq.fold (fun acc i -> i + acc) 0 { 1 .. 10 }

Only Seq has a fold method, List and Array have a fold and foldBack. The difference is the order in which items are evaluated. fold goes left to right while foldBack goes right to left. (As an exercise to the reader, consider why Seq does not have a foldBack method.)

Option values (Microsoft.FSharp.Core.Option<_>)

With such a demphasis placed on mutability, it is difficult to find nulls in F# code since values are always initialized. However, there are times when a ‘null’ value means more than an uninitialized variable. Sometimes it means the absence of a value. (Option values are similar to nullable types in C#.)

F# has an ‘option type’ to represent two states: ‘Some’ and ‘None’. In the following record type Person, the middle initial field may or may not have a value.

 type Person = { First : string; MI : string option; Last : string }

let billg    = {First = "Bill";  MI = Some("H"); Last = "Gates" }
let chrsmith = {First = "Chris"; MI = None;      Last = "Smith" }

Lazy values (Microsoft.FSharp.Core.Lazy<_>)

Lazy initialization is a term used to describe something which is computed as needed (and not right when it is declared). F# has lazy values, such as the following example where ‘x’ is an integer, but as part of its evaluation prints ‘Computed’ to the screen.

 > let x = lazy (printfn "Computed."; 42);;
val x : Lazy<int>

> let listOfX = [x; x; x];;
val listOfX : Lazy<int> list

> x.Force();;
Computed.
val it : int = 42

You see that the value of x was computed when we called ‘Force’ and the value 42 was returned. You can use lazy initialization to avoid computing values which aren’t needed or if they are computationally expensive. In addition they are useful when constructing recursive values. For example, consider a discriminated union which represents an circular linked list:

 type InfiniteList =
    | ListNode of int * InfiniteList
    
let rec circularList = ListNode(1, circularList)

The value ‘circularList’ has a reference to itself (representing an infinite loop of ListNodes with value 1). Without lazy initialization declaring this type of value would be impossible. But behind the scenes the compiler uses lazy initialization to make this work.

By now you should have enough exposure to F# primitives to understand the growing number of F# blogs out there. Next, in Part III we will go over advanced topics – things which F# can do which other .NET languages can’t. Stay tuned!