F# in 20 Minutes – Part II


Now that we have covered the basics, in minutes 8 – 14 we will cover the foundational concepts and types in F#. By the end of Part II you should be able to read F# code reasonably well.


Immutability


You may have noticed that I’ve been using the term ‘value’ to refer to identifiers rather than ‘variable’. The reason for this is that types in F# are immutable by default, meaning that once they are created they cannot be changed. This may seem like a severe limitation, but immutability actually prevents some classes of bugs. In addition, immutable data is inherently thread safe meaning you don’t need to worry about sync locks in order to make your code parallelizeable. I’ll go into asynchronous programming in Part III.


If you do need to mutate some data however, F# has the mutable keyword. This creates a variable who’s value can be changed by using the left arrow operator (<-).

> let mutable x = “the original value.”;;
val mutable x : string
>
printfn “x’s value is ‘%s'” x;;
x’s value is ‘the original value.’
val it : unit = ()
> x <- “the new one.”;;
val it : unit = ()
>
printfn “x’s value is now ‘%s'” x;;
x’s value is now ‘the new one.’
val it : unit = ()
 

Reference values (Microsoft.FSharp.Core.Ref<_>)


Reference values are another way of being able to mutate data. But rather than having a mutable variable on the stack, a reference cell is a pointer to variable which you modify on the heap. In F# there are some restrictions on where you can use mutable values. (Such as not being able to use them in inner lambdas.) But ref objects can be safely passed around, since they are immutable record values. (Which have a mutable field which can be modified.)


To use reference cells you use (:=) to assign a new value and (!) to dereference.

> let refCell = ref 42;;
val refCell : int ref

> refCell := -1;;
val it : unit = ()

> !refCell;;
val it : int = –1
 

Modules


In Part I we just declared values and functions arbitrarily. You might have wondered ‘where did they go’, since in C# everything must belong to a class. Though F# can declare the standard .NET classes you are familiar with, it also has the notion of a module; which is a collection of values, functions, and types. (Contrast this with a namespace, which can only contain types.)


This is how we were able to access ‘List.map’. In the F# library (FSharp.Core.dll) there is a module named ‘List’ and on that is a function named ‘map’.


Modules serve as a way of encapsulating code when you are rapid prototyping without needing to spend the time to design a strict object-oriented type hierarchy.  To declare your own module, you can use the module keyword. In the example we will associate a mutable variable with the module, which will serve as a global variable.

module ProgramSettings =
let version = “1.0.0.0”
let debugMode = ref false

module MyProgram =

do printfn “Version %s” ProgramSettings.version

// You can also open modules like you can a namespace, which
// brings all their functions and values into scope.
open ProgramSettings
debugMode := true


 


Tuples


A tuple (pronounced ‘two-pull’) is an ordered collection of values treated like an atomic unit. Traditionally if you wanted to pass around a group of semi-related values you would need to create a struct or class, or perhaps rely on ‘out’ parameters. A tuple allows you to keep things organized by grouping related values together without introducing a new type.


To define a tuple, simply enclose the group of values in parentheses separate the individual components by commas.

> let tuple = (1, false, “text”);;
val tuple : int * bool * string

> let getNumberInfo (x : int) = (x, x.ToString(), x * x);;
val getNumberInfo : int -> int * string * int

> getNumberInfo 42;;
val it : int * string * int = (42, “42”, 1764)


 


Functions can even take tuples as arguments.

> let printBlogInfo (owner, title, url) = printfn “%s’s blog [%s] is online at ‘%s'” owner title url;; 
val printBlogInfo : string * string * string -> unit
> let myBlog = (“Chris”, “Completely Unique View”, “http://blogs.msdn.com/chrsmith”);;
val myBlog : string * string * string

>
printBlogInfo myBlog;;
Chris’s blog [Completely Unique View] is online at ‘http://blogs.msdn.com/chrsmith’
val it : unit = ()
 

Function Currying


A novel feature F# provides is the ability to provide a subset of a function’s parameters, and bind a new value to the partial application. This is known as function currying. For example, consider a function which takes three integers and returns their sum. We can curry that function, by fixing the first parameter to be 10, resulting in a function that only takes two integers and returns their sum plus 10.

> let addThree x y z = x + y + z;;
val addThree : int -> int -> int -> int

> let addTwo x y = addThree 10 x y;;
val addTwo : int -> int -> int

> addTwo 1 1;;
val it : int = 12


 


Discriminated Unions


Consider the following enumeration:

enum CardSuit { Spade = 1, Club = 2, Heart = 3, Diamond = 4};

Logically there can only be one possible suit for a card, but since an enum is just an integer behind the scenes you don’t actually know if the value is valid or not. For example, in C# you can write:

CardSuit invalid1 = (CardSuit) 9000;
CardSuit invalid2 = CardSuit.Club | CardSuit.Diamond;

This will break any code that expects the enum to only have one of the four card values. Also, Consider the case where you need to extend an enum. (Example borrowed from Jomo’s blog.)

enum Title { Mr, Mrs }

The Title enum works great, but imagine that later you add a ‘Ms’ value. Now every single switch statement you had is a potential bug. You can try to fix up all your code, but you are bound to miss something.


Enums provide a way to easily express some concepts but don’t provide enough compiler checks to prevent errors. Enter the Discriminated Union. A Discriminated Union in F# is a type which can only have one of a set of values, known as data tags. For example, consider a discriminated union for all Microsoft employees.

type MicrosoftEmployee =
| BillGates
| SteveBalmer
| Worker of string
| Lead of string * MicrosoftEmployee list


If you have an instance of type MicrosoftEmployee, you know that it can only be one of {BillGates, SteveBalmer, Worker, or Lead}. Moreover, if it is a Worker then you know that there is an string associated with that data tag, probably corresponding to a name. We can create discriminated unions easily and use pattern matching – described later – to match their values.

let myBoss = Lead(“Yasir”, [Worker(“Chris”); Worker(“Matteo”); Worker(“Santosh”)])

let printGreeting (emp : MicrosoftEmployee) =
match emp with
| BillGates -> printfn “Hello, Bill”
| SteveBalmer -> printfn “Hello, Steve”
| Worker(name) | Lead(name, _)
-> printfn “Hello, %s” name


Now what if we extend that Discriminated Union…

type MicrosoftEmployee =
| BillGates
| SteveBalmer
| Worker of string
| Lead of string * MicrosoftEmployee list
| ChrisSmith

… now we get a compiler warning in our match.


image


The compiler detected that you didn’t match every tag of the discriminated union and issued a warning.  Checks like this go a long way towards preventing bugs.  To see a few examples of just how powerful discriminated unions are, check out this post.


 


Pattern Matching


Pattern matching is like a powerful switch statement, allowing you to branch control flow. You can do more than just comparing a value against a constant however, pattern matching allows you to also capture new values. Such as in our previous example, we bound the identifier ‘name’ when matching against discriminated union tags.

let printGreeting (emp : MicrosoftEmployee) =
match emp with
| BillGates -> printfn “Hello, Bill”
| SteveBalmer -> printfn “Hello, Steve”
| Worker(name) | Lead(name, _)
-> printfn “Hello, %s” name

Pattern matching can also match against the ‘structure’ of the data. So we can match against list elements joined together. (Remember that x :: y means that x is an single element of the list, and y is list of items after it. Also ‘[]’ represents the empty list.)

let listLength alist =
match alist with
| [] -> 0
| a :: [] -> 1
| a :: b :: [] -> 2
| a :: b :: c :: [] -> 3
| _ -> failwith “List is too big!”

At the end of the pattern match we had a wildcard ‘_’, which matches anything. So if variable ‘alist’ had more than three items, the last pattern clause would execute and throw an exception. Pattern matching also allows you to execute an arbitrary expression to determine if the pattern is matched. (If that expression evaluates to false, then the clause is not matched.)

let isOdd x =
match x with
| _ when x % 2 = 0 -> false
| _ when x % 2 = 1 -> true

You can even pattern match using a dynamic type test:

let getType (x : obj) =
match x with
| 😕 string -> “x is a string”
| 😕 int -> “x is an int”
| 😕 Exception -> “x is an exception”

 


Records


Records are a lightweight syntax for declaring a type with several public properties. One advantage of records is that by using the type inference system the compiler will figure out the type of the record by you simply setting its values. So for example, once we define the record type ‘Address’ we don’t need to annotate a type when declare an instance of it. The compiler knows that only one record type exists with fields Name, Address, and Zip and infers that to be the type you meant.

type Address = { Name : string; Address : string; Zip : int}
let whiteHouse = {Name = “The White House”; Address = “1600 Pennsylvania Avenue”; Zip = 20500}

 


Forward Pipe Operator


I love this guy. The Forward pipe operator is simply defined as:


let (|>) x f = f x

And has a type signature:


‘a -> (‘a -> ‘b) -> ‘b


Which translates to: given a generic type ‘a, and a function which takes an ‘a and returns a ‘b, then return the application of the function on the input.


Rather than explaining this, let me give you an example of where it can be used:

// Take a number, square it, then convert it to a string, then reverse that string
let square x = x * x
let toStr (x : int) = x.ToString()
let rev (x : string) = new String(Array.rev (x.ToCharArray()))

// 512 -> 1024 -> “1024” -> “4201”
let result = rev (toStr (square 512))


The code is very straight forward, but notice just how unruly the syntax looks. All we want to do is take the result of one computation and pass that to the next computation. We could rewrite it by introducing a series of new variables:

let step1 = square 512
let step2 = toStr step1
let step3 = rev step2
let result = step3

But now you need to keep all those temporary variables straight.  What the (|>) operator does is take a value, and ‘forward it’ to a function, essentially allowing you to specify the parameter of a function before the function call. This dramatically simplifies F# code by allowing you to pipe functions together, where the result of one is passed into the next. So to use the same example the code can be written clearly as:

let result = 512 |> square |> toStr |> rev

 


Sequences (System.Collections.Generic.IEnumerator<_>)


A Sequence or ‘seq’ in F# is just another name for System.Collections.Generic.IEnumerator, but it has a special place in the F# world. Unlike lists or arrays, Sequences can specify infinite series. Only the current value is stored in memory, and once the sequence computes the next item, the current value is forgotten. For example, the following code snippet generates an sequence of all integers.

let allIntegers = Seq.init_infinite (fun i -> i)

 


Collections (Seq, List, Array)


When you need a collection of values in F# you have a least three good options – Arrays, Lists, and Sequences. Each has their own advantages. Built into the F# library are a series of modules designed for each of these collections. You can use Visual Studio’s intellisense to explore the methods and what they do, but I’ll go into the most common ones here.


iter. The ‘iter’ function iterates through each item in the collection. This is identical to a ‘foreach’ loop. The following prints every item in a list:

List.iter (fun i -> printfn “Has element %d” i) [1 .. 10]

map. Like I described in Part I, map transforms a collection based on a function. The following maps an array of integers to their string representations.

Array.map (fun (i : int) -> i.ToString()) [| 1 .. 10 |]

fold. Fold is used to take the collection and fold it into a single value, or reduce it. Like Iter and Map it takes a function which will be applied to each element in the collection, but unlike iter and map that function takes an additional ‘accumulator’ parameter. As fold is continually called it builds up this accumulator parameter based on the previous computation. So for example, given a sequence of integers a valid fold operation would be to sum each item in the series. (And reduce the series to a single value.) Our accumulator would be the sum of all the elements we have seen before.

Seq.fold (fun acc i -> i + acc) 0 { 1 .. 10 }

Only Seq has a fold method, List and Array have a fold and foldBack. The difference is the order in which items are evaluated. fold goes left to right while foldBack goes right to left. (As an exercise to the reader, consider why Seq does not have a foldBack method.)


 


Option values (Microsoft.FSharp.Core.Option<_>)


With such a demphasis placed on mutability, it is difficult to find nulls in F# code since values are always initialized. However, there are times when a ‘null’ value means more than an uninitialized variable. Sometimes it means the absence of a value. (Option values are similar to nullable types in C#.)


F# has an ‘option type’ to represent two states: ‘Some’ and ‘None’. In the following record type Person, the middle initial field may or may not have a value.

type Person = { First : string; MI : string option; Last : string }

let billg = {First = “Bill”; MI = Some(“H”); Last = “Gates” }
let chrsmith = {First = “Chris”; MI = None; Last = “Smith” }


 


Lazy values (Microsoft.FSharp.Core.Lazy<_>)


Lazy initialization is a term used to describe something which is computed as needed (and not right when it is declared). F# has lazy values, such as the following example where ‘x’ is an integer, but as part of its evaluation prints ‘Computed’ to the screen.

> let x = lazy (printfn “Computed.”; 42);;
val x : Lazy<int>

> let listOfX = [x; x; x];;
val listOfX : Lazy<int> list

> x.Force();;
Computed.
val it : int = 42



You see that the value of x was computed when we called ‘Force’ and the value 42 was returned. You can use lazy initialization to avoid computing values which aren’t needed or if they are computationally expensive. In addition they are useful when constructing recursive values. For example, consider a discriminated union which represents an circular linked list:

 

type InfiniteList =
| ListNode of int * InfiniteList

let rec circularList = ListNode(1, circularList)


 


The value ‘circularList’ has a reference to itself (representing an infinite loop of ListNodes with value 1). Without lazy initialization declaring this type of value would be impossible. But behind the scenes the compiler uses lazy initialization to make this work.


 


By now you should have enough exposure to F# primitives to understand the growing number of F# blogs out there. Next, in Part III we will go over advanced topics – things which F# can do which other .NET languages can’t. Stay tuned!

Comments (16)

  1. Paczesiowa says:

    let result = 512 |> square |> toStr |> ref

    should be rev

  2. ChrSmith says:

    Doh. Fixed it, thanks.

  3. Keith says:

    512^2 = 262144

    or

    32^2 = 1024

    Great series, btw!

  4. Anders Cui says:

    在上篇文章里,我们写出了F#的第一个程序,本文我们来看一些F#语言的核心部分,包括值的不变性,模块,Tuple,柯里化,Union类型,模式匹配,Record类型,序列和集合等内容,读完此文后,希望能让您对F#有个整体的认识。

  5. Kurt says:

    Thanks for the articles. Is part III available? I can’t seem to find the advanced topics article.

  6. Sam Stokes says:

    Wow, this is a great way to learn Data Structures!  I have ordered the F# for Scienctists, thanks for the review.  I am working to figure out how to demo F# to non-CS faculty for simulations, etc.

  7. Rick Love says:

    I was comparing F# features to C# 3.5, to see how they compare.

    One thing that interested me is the forward pipe operator (which would be nice for C# to implement next version).

    However, it can be simulated with Extension Methods:

    public static class FunctionalPipeExtensions

    {

     public static void Test()

     {

       var doSquare = new Func<int, int>(n => n * n);

       var doReverse = new Func<string, string>(

         s => s.ToCharArray()

           .Pipe(Array.Reverse)

           .Pipe(ca => new String(ca))

       );

       var strRev = (4)

         .Pipe(doSquare)

         .ToString()

         .Pipe(doReverse);

       var numberRev = strRev.Pipe(s => int.Parse(s));

       // numberRev == 61

     }

     public static TResult Pipe<T, TResult>(this T input, Func<T, TResult> doFunc)

     {

       return doFunc(input);

     }

     public static T Pipe<T>(this T input, Action<T> doFunc)

     {

       doFunc(input);

       return input;

     }

    }

    I considered naming this method "_" or "_P_" instead of "Pipe" because it is a strange but useful little guy in C#.

    Especially notice the 2nd Pipe method which allows one to use a void returning method (which changes the internal state of the mutable object), but continue the series of method calls.

    Also, object methods and extension methods fit right into the pipeline.

    Some limitations:

    – Cannot assign lambda expression to an implicitly-typed local variable.

    – The C# compiler will not automatically map a method to an equivalent delegate Func<T,TResult>, etc. The types on the generic method must be given, or a lambda expression must be used to wrap the method call.

  8. Jarlac says:

    let result = 512 |> square |> toStr |> rev

    It looks like RPN (Reverse Polish Notation) to me.

    But from the definition: let (|>) x f = f x ,

    I guess it only takes one parameter right? Is there a way to pass more parameters before the function call??

  9. ChrSmith says:

    Using the pipe forward operator (|>) you can pipe a single value to a function that takes any number of parameters. But from the definition it looks as if ‘f’ only takes one.

    The trick is to use function currying. Let’s say you have a function that takes three parameters:

    let f x y z = …

    If you curry f by providing two of its parameters:

    let g = f x y

    The result is a function that takes a single parameter (what would have been z).

    So to use the pipe forward operator with a function that takes multiple args you end up with something like this:

    z |> f x y

    The only downside is that the pipe forward operator only works for curryable functions. If a function takes a tuple:

    f (x, y, z)

    Then you will not be able to pipe data into it.

  10. Dustin says:

    The only downside is that the pipe forward operator only works for curryable functions. If a function takes a tuple:

    f (x, y, z)

    We can’t?  Perhaps I’m misunderstanding how a function is defined in F# (I’m using the longest of a few ways, I know):

    let sum3 = function

       | (x, y, z) -> x + y + z;;

    let my_tuple = (1, 2, 3);;

    let result = my_tuple |> sum3;;

    printf "%An" result;;

    That prints 6, right?  Perhaps I misinterpreted your statement, Chris?

  11. Ryan says:

    Curryable functions and tuple functions have different signatures. The key is to understand that functions are themselves values. Let’s compare:

    > let sum3tuple (x, y, z) = x + y + z;;

    val sum3tuple : int * int * int -> int

    > let sum3curry x y z = x + y + z;;

    val sum3curry : int -> int -> int -> int

    This shows that sum3tuple accepts a triple of integers and returns an integer. It’s used like this:

    > sum3tuple (1, 2, 3);;

    val it : int = 6

    The tuple is a single value that must be constructed with parentheses because syntactically that is the only way to construct a tuple.

    Whereas sum3curry accepts an integer and returns a function with the signature (int -> int -> int). The returned function is itself a function that accepts an integer and returns a function that accepts and integer and returns an integer. Confused yet?

    Here’s how it works:

    > sum3curry 1 2 3;;

    val it : int = 6

    I can add parenthetical nesting to emphasize how this works:

    > ((sum3curry 1) 2) 3;;

    val it : int = 6

    In this example, (sum3curry 1) returns a function which is then passed the argument 2. This returns another function that is passed the argument 3. Finally, this function returns an integer.

    The |> operator just lets us flip this around backward and put the inputs to the function first:

    > 3 |> (2 |> (1 |> sum3curry));;

    val it : int = 6

    Note that the parenthetical grouping is required this time because the default precedence is left-to-right.

    Incidentally, we can still use |> with the tuple version, we just can’t split up the parameters:

    > (1, 2, 3) |> sum3tuple;;

    val it : int = 6

  12. Steve says:

    And you can move between the curried world and the uncurried world using functions like:

    > let curry2 f x y = f (x, y);;

    val curry2 : (‘a * ‘b -> ‘c) -> ‘a -> ‘b -> ‘c

    > let uncurry2 f (x, y) = f x y;;

    val uncurry2 : (‘a -> ‘b -> ‘c) -> ‘a * ‘b -> ‘c

  13. OMG! I got pulled off my usual job of doing a bunch of things and have been focused on getting students

  14. beginner-1104 says:

    I still don’t get the immutability. In F# Interactive console I typed

    > let x = 3;;

    which evaluated x as

    val it : int = 3

    Then I typed

    > let x = 4;;

    First of all, I expected some kind of error message that x cannot be re-assigned or something like that. But it was accepted OK.

    Then, the evaluation of x was:

    val it : int = 4

    So, where is the immutablity? Do I miss something?

    Many thanks!

  15. beginner-1104 says:

    I found the answer to my above question at http://en.wikibooks.org/wiki/Haskell/Variables_and_functions

Skip to main content