[The first in a series of posts on the evolution of TransForth]

I’ve been enthralled by retro computing recently and have been filling my bookshelf [yes, actual bound paper books] with (nearly) extinct programming language topics: Lisp, Fortran, Smalltalk, APL, Forth, … About this time last year I did a blog series implementing Lisp in F#; a lot of fun and got a lot of hits! (hint: post to reddit!) This time I want to evolve a Forth, also in F# and following the same incremental transformation process – calling it “TransForth”. It should be a great, enlightening journey!

Abstract Up, Compile Down

I find it amazing that from the very beginning of programming history with Lisp and Fortran, we established the classic dichotomy – Fortran thinly abstracting up from the hardware out of practicality and Lisp beautifully compiling down from a mathematical basis. I believe that Forth is unique in the spectrum of languages. It is extremely minimal and yet has an amazing breadth, spanning all the way from direct hardware access dipping below even Fortran to layering abstractions, building up to the level of Lisp – complete with a language-oriented mindset similar to Lisp macros!

Forth, like Lisp, is not really a language as much as it is an idea with many dialects. To make it even more confusing, Forth is also a development environment (editor, compiler, assembler) and, in many cases, even stands alone as an operating system (device drivers, file system, …) and is a software design philosophy.

Small, Simple, Fast!

Hopefully I can convey some of the amazing ideas behind Forth as we go. Implementing Forth is right up there with
SICP for mind opening learning. We’ll start very simple, building an interpreter with abstracted components. This way we can see how the pieces work together. Slowly, we’ll take away the abstractions and move closer to the metal. In the end, we should have a very tiny implementation with most everything implemented in Forth itself. The kernel can then easily be ported to new hardware.

Forth is ultimately about getting intimate with the hardware. I plan to target a virtual machine at first. Eventually, we’ll move to some real machines. I’m thinking of targeting the Arduino (ATmega chips) and/or the 6809E (on which I cut my assembler teeth). If we still have the energy, we may continue exploring cross compilers and other threading methods such as
token threading or (crazy-compact) Huffman threading. As a grand finale we may even implement an SECD machine with garbage collector and all and build a Lisp… in Forth! I’m not kidding. By the way, the reverse has already been done.

Humble Beginnings

In this first cut, we’ll build little more than an RPN calculator. Of course, if this were the sole goal we could do it in much less code (
remember the TinyRPN post?). If you want to brush up on your RPN skilz, try playing with my HP-35 microcode emulator. We’re just laying the foundation here. Don’t worry, we’ll get to the good stuff soon enough.

Programming in Forth is done through a classic REPL. Development is very interactive with code/compile/execute rolled into one. At the prompt you can enter sequences of named operations (called words) and literal values, all separated by spaces. Each word has a side effect of manipulating a stack of values or emitting output to the console. For example:

7 42 SWAP / .

This will push 7 and 42 literals to the stack, then SWAP will reverse them and / will divide them (popping off two numbers and pushing back the result), and finally . (dot is a word) will print the top-most stack value – 6 ok is emitted to the console.

In the next post we’ll extend the system to allow user-defined words, but this form of
: FOO BLAH BLAH BLAH ; is the syntax for defining a new word (FOO) in terms of other words (BLAH). The basic idea is a bottoms-up construction of a program, each new word building on previous definitions and raising the level of abstraction. At any given point, you may think of the currently defined set of words as primitives in the language. Programs can end up reading almost like English. Here is an example taken from
Leo Brodie’s wonderful book, Starting Forth:


This defines a new word, FILL, in terms of words expected to have already been defined in the system (FAUCETS, TILL-FULL, OPEN, CLOSE). These words may really be primitives in the system (nativly talking to device drivers) or are more likely themselves defined in terms of lower level words. Eventually, at the lowest level, some words will talk to the hardware (actually gathering sensor data and triggering faucet actuators in this case). Further definitions can build on FILL:


Finally, a user simply enters WASHER at the interactive prompt and kicks off the process. A program then is the last word defined (MAIN, GO, RUN, or whatever you want to call it) or perhaps a set of words working together constitutes a program with the normal Forth interactive acting as the user interface.

Point-Free Programming

Forth is a stack machine. Words can be thought of as functions from Stack -> Stack. Some words have other side effects, but to a large extent they are functions taking parameters and returning values. Unlike most languages in which parameters are named and passed explicitly, in Forth parameters and return values are threaded through a chain of functions implicitly by way of the stack. This leads to a very succinct programming style in which parameters are not named, but are passed along through a composition of words.

Even before getting into Forth, I have enjoyed transforming programs in F# and Haskell to a so-called “point free” style (some call it “pointless”
J). For example, in the
Babbage Difference Engine post some time back, the original code read:

engine =

    let uncurry f (a, b) = f a b

    let flip f a b = f b a

    let next diffs =


        |> Seq.pairwise

        |> (uncurry (+))

        |> (flip Seq.append) [0]

    Seq.unfold (fun diffs -> Some(Seq.head diffs, next diffs))

But, as a friend of mine (Randy Kern) pointed out, why pass in diffs only to pipe it through a series of functions. Why not just define next as a composition of the functions directly (using >> instead of |>):

engine =

    let uncurry f (a, b) = f a b

    let flip f a b = f b a

    let next =


        >> (uncurry (+))

        >> (flip Seq.append) [0]

    Seq.unfold (fun d -> Some(Seq.head d, next d))

The next function still expects a parameter, but the name (diffs) is no longer mentioned. Stack based concatenative languages apply this tacit programming style exclusively. If you don’t believe its a viable way to program, you should check out Factor.

Often parameters need to be rearranged, duplicated or dropped in order to glue certain words together. For this purpose there are stack manipulation words:

The Stack

Okay finally, here is our first bit of implementation in the TransForth project:


open System.Text

mutable stack = []

let push value = stack <- value :: stack

let underflow () = failwith “Stack empty”

let pop () = match stack with h :: t -> stack <- t; h | _ -> underflow ()

A simple stack (of integers) represented as a cons list with push/pop operations. You’ll notice that we’re using mutable and a side-effecting imperative style. This will make the transition to running on bare hardware easier. Forth is inherently very imperative. I don’t like this style in general (as I’ve gone off about before – Oh, The Humanity!) but we’re warming up to and embracing the “Forth way.” It looks a little ugly in F# but I wouldn’t call it non-idiomatic F# code given that F# is a multi-paradigm language. But in general, what we’re doing in this series is not very pure. I think you’ll find as we move along though, that imperative programming in a structured way has a certain charm of its own.

The Dictionary

The initial set of language primitives are stored in a dictionary (Map), as are user-defined words once we add this feature in the next post. The name is a simple string and can be anything you like (sans whitespace). The code is a simple
unit -> unit function (or Action if you’re used to the .NET types). I said earlier that words were Stack -> Stack functions which is true but, like everything in Forth, is accomplished by side effect.

mutable dict = Map.empty

let define name code = dict <- Map.add name code dict 

let find name = Map.tryFind name dict

Some Primitives

To kick things off let’s add some primitives for doing basic math, manipulating the stack and interacting with the system:

dyadic fn () = pop () |> fn (pop ()) |> push // applied in infix order

dyadic (+) |> define “+”

dyadic (-) |> define “-“

dyadic (*) |> define “*”

dyadic (/) |> define “/”

dyadic (%) |> define “MOD”

To get in the mood of speaking like Yoda, I’m already habitualy writing F# code with lots of pipes and forward composition. The above each pop two arguments from the stack, apply a function and push the result back to the stack. Notice that they apply in infix order. That is, 42 6 / means 42 divided by 6.

“DEPTH” (fun () -> stack.Length |> push)

define “CLEAR” (fun () -> stack <- [])

define “PICK” (fun () -> pop () |> List.nth stack |> push)

define “ROLL” (fun () ->

    let i = ref (pop ())

    match List.partition (fun _ -> i := !i – 1; !i >= 0) stack with

    | (h, v :: t) -> stack <- v :: h @ t | _ -> underflow ())


let stk fn () = stack <- fn stack

define “DROP” (stk (function _ :: t -> t | _ -> underflow ()))

define “DUP” (stk (function x :: t -> x :: x :: t | _ -> underflow ()))

define “SWAP” (stk (function x :: y :: t -> y :: x :: t | _ -> underflow ()))

define “OVER” (stk (function (x :: y :: _ as t) -> y :: t | _ -> underflow ()))

define “ROT” (stk (function x::y::z::t -> z::x::y::t | _ -> underflow ()))


These stack-related primitives allow querying the depth (DEPTH) and clearing (CLEAR) the stack as well as dropping (DROP), duplicating (DUP) and generally permuting the values on the stack (SWAP, OVER, ROT). These all are common words in any Forth system.  A standard means of documenting the stack effects is to make a ( before – after ) diagram:

( a — )
DUP ( a – a a )
SWAP ( a b – b a )
OVER ( a b – a b a )
( a b c – b c a )

PICK and ROLL are more general. They take a number (n) from the stack and pull from or bury to the nth item. We’ll look more at these later.

out = new StringBuilder()

let print v = out.Append(v.ToString()).Append(‘ ‘) |> ignore


define “.” (pop >> print)

define “.S” (fun () -> stack |> List.rev |> List.iter print)

define “WORDS” (fun () -> List.iter (fun w -> print w.Name) dict)


These primitives are for the programmer to inspect the contents of the stack and the dictionary. I think you can figure out what they do. Notice that . is destructive (actually popping the stack) while .S merely prints. If you’re used to RPN calculators you might think of stacks growing from top to bottom but the Forth interactive visualizes them as growing from left to right (hence the List.rev).

Outer Interpreter

On to the so-called outer interpreter – we’ll discover what an “inner interpreter” is once we begin the migration to machine code. First, the tokenizer:

mutable source = Seq.empty


let token () =

    let next = Seq.skipWhile Char.IsWhiteSpace source // trim whitespace

    let word = next |> Seq.takeWhile (not << Char.IsWhiteSpace)

    source <- Seq.skip (Seq.length word) next // leave remaining

    new String(Array.ofSeq word) // as proper string


It can’t get much simpler than that! Tokens are just whitespace separated words. Notice however, that each call to token lops off a single token at a time. If you’re used to a lexer/parser pipeline in which the parser has visibility only at the token-level, then get used to something new! In a future post we will see that in Forth, words themselves may take over parsing and consume characters (not tokens) from the source stream. This makes tokenizing, parsing and compiling a very intertwined process. As just a sneak peek at one example, the dot-quote word (.”) parses off a string terminated by a closing double quote. For example: .” this is free text with spaces, not Forth words” Forth doesn’t need to know up front about such parsing anomalies. The .” word (which is user-defined) itself does.

rep input =

    out.Clear() |> ignore

    source <- input

    while not (Seq.isEmpty source) do

        let word = token ()

        if word.Length > 0 then

            match find word with

            | Some(code) -> code () // execute found definition

            | None -> // literal?

                let number, value = Int32.TryParse word

                if number then push value else word + “?” |> failwith

Given a piece of source (generally a line of input from the console), we walk word-by-word, attempting a dictionary lookup. If found, the word is executed. If it’s not found then we attempt to parse it as a literal number and push that to the stack. If it’s not in the dictionary and it’s not a number then it is an error.

rec repl output =

    printf “%s\n>” output


        Console.ReadLine() |> rep

        out.ToString() |> sprintf “%sok” |> repl

    with ex -> ex.Message |> repl

repl “Welcome to TransForth”

This closes the loop, taking input from the user and running it through the system. Rinse and repeat.


As we transform TransForth we want to be sure not to break things.

case source expected =

    stack <- []

    rep source

    let result = out.ToString()

    if result <> expected then

        printfn “FAILURE: %s (Expected: %s, Actual: %s)” source expected result


case “123 .” “123 “ // literals

case “1 2 3 .S” “1 2 3 “ // stack

case “5 6 + .” “11 “ // addition

case “5 6 7 + + .” “18 “ // addition

case “10 2 – .” “8 “ // subtraction

case “10 2 – 3 – .” “5 “ // subtraction

case “10 2 3 – – .” “11 “ // subtraction

case “2 3 * .” “6 “ // multiplication

case “2 3 4 * * .” “24 “ // multiplication

case “5 2 / .” “2 “ // division

case “5 2 MOD .” “1 “ // modulo

case “1 2 3 DEPTH .” “3 “ // stack depth

case “1 2 3 CLEAR .S” “” // clear stack

case “1 2 3 4 3 PICK .” “1 “ // pick

case “1 2 3 4  3 ROLL .S” “2 3 4 1 “ // roll

case “1 2 3 DROP .S” “1 2 “ // drop

case “1 2 3 DUP .S” “1 2 3 3 “ // duplicate

case “1 2 3 SWAP .S” “1 3 2 “ // swap

case “1 2 3 OVER .S” “1 2 3 2 “ // over

case “1 2 3 ROT .S” “2 3 1 “ // left rotate



Comments (14)

  1. TechNeilogy says:

    Awesome.  I've always loved FORTH, ever since I used it to program an old 6809-based computer I once had.  I've implemented an interpreted FORTH several times in C#.

  2. mattiasw says:

    Back in the early 1980, I worked at a company that created spreadsheets and wordprocessors using forth for Commodore computers (PET, C64, C128).

    The only big issue with Forth was out of stack, remember that the computer booted from Forth (cartridge), no OS protection, so very often the computer crashed and you had to reboot.  Mathematical routines were written in assembler, embedded in the Forth code.

    Ever since I learned ML-programming, I wanted to make a type-checking Forth.

  3. AshleyF says:

    @mattiasw A type-checking Forth certainly would be an interesting thing to try. Could be done I believe. You've probably heard of Joy (pretty cool Forth-like language, but more "pure" with quotations and combinators – Manfred von Thun has done tons of research on it. Very solid basis.) There's a variant of Joy by Christopher Diggins called "Cat" that is essentially Joy with type inference – very cool indeed.

  4. AshleyF says:

    @Neil I like your "If Programming Languages Were TV Shows" post (… "FORTH – The Woodwright’s Shop. Watch a guy make handmade furniture using handmade tools. Be careful of splinters." :-)

  5. John Metcalf | Retro Programming says:

    Your bookshelf sounds like mine!  I'd love to hear about any books you've really enjoyed.  I'm a fan of Threaded Interpretive Languages which implements a Forth-like system for the Z80.

    There are a few Forth interpreters for the Arduino but I've only looked at amforth.  It had a few quirks but they might have been fixed since I last looked.

  6. John Metcalf | Retro Programming says:

    By the way, I think your test for PICK has a typo?

    case "1 2 3 4 4 PICK ." "1 " // pick

    should read:

    case "1 2 3 4 3 PICK ." "1 " // pick

  7. AshleyF says:

    @JohnMetcalf Hey, yeah I have that book too; a great one! Another cool one is Koopman's Stack Machines book. And yeah, I looked at amforth too. It seems that Arduino's memory model (can't randomly write to flash except from boot code and can't have executable code in SRAM) makes a nice interactive Forth very difficult. Perhaps some hybrid with the dictionary split and secondary words only (native disallowed) interactively added to what little SRAM there is. I don't know…

    Thanks for pointing out the PICK confusion. It turns out that ANS Forth and Fig-Forth don't agree with each other! I only now even realized. ANS PICK and ROLL take arguments one less than in Fig-Forth. For example, 1 PICK is the same as OVER in ANS Forth, but equal to DUP in Fig-Forth. Wow… just wow…

  8. AshleyF says:

    @JohnMetcalf Actually, I just changed PICK to match ANS Forth. Thanks much for pointing that out!

  9. Bill says:

    @Ashley WRT The arduino.

    Have look at FlashForth. It solves the handling

    of the Harvard memory architecture in an elegant way.

    @ and ! can operate on all memory types by mapping them to different addresses.

  10. Vasudev Ram says:

    Great post title :)

  11. kib2 says:

    This is one of the best tutorial series I've read so far, thank you so much for the enlightment.

  12. Greg says:

    I'm pretty sure the Forth Interest Group (FIG) used to sell a bumper sticker with the title of this article printed on it.

  13. Mark Flamer says:

    Ashley, I'm also looking into creating a SECD VM in forth. Have you made any progress here?

  14. AshleyF says:

    @Mark, I haven't. Too much distraction. BTW, Peter Henderson's book ("Functional Programming Application and Implementation") is a must if you're going to do this! Very nice top-down (feels inverted) design of an SECD machine and a Lisp for it.