Heart Transplant

[The eighth in a series of posts on the evolution of TransForth]

 

They say that the inner interpreter is the heart of Forth and outer interpreter is the soul. It’s time to give TransForth a heart transplant!

To really understand what we’re doing here, I’d suggest watching my video demonstration below (perhaps the first bit on compiling before reading this post and then the second portion after).
 

Where We Are

Piece by piece, we’re moving closer to the machine. Up to this point, dictionary entries define code as a unit -> unit function: 
 

type WordRecord = {

    Name : string

    Code : unit -> unit

    Def : int // secondary definition address

    Immediate : bool ref } // execute immediately even while compiling 

All of the primitives are defined this way. Secondary words all use the interpret function for their Code which makes use of the sequence of dictionary indexes at Def
 

let interpret w () =

    rpush i

    i <- (index w).Def

    while mem.[i] <> -1 do

        let d = index mem.[i]

        i <- i + 1

        d.Code()

    i <- rpop ()  

This inner interpreter does essentially three things:
 

1) It jumps into new words (pushing i onto the return stack); recursively when d.Code is itself interpret.

2) It walks the sequence of words in a definition (within the while loop).

3) It returns from words; popping i back off the return stack.
 

Where We Are Going

This F#-centric interpret function will go away entirely and be broken into the holy trinity of a Forth inner interpreter: docol, next and dosemi.

In real hardware (or VM) the only primitives are instructions. What we plan to do now is to define a more processor-like VM with op codes for all of our primitives. We will get rid of the Code field and even primitives will be represented as a sequence of ints at the address given by Def. 
 

type WordRecord = { Name : string; Def : int; Immediate : bool ref }  

Secondaries are sequences of dictionary entries while primitives are sequences of op codes. The video is goes a bit further by defining words as sequences of memory addresses rather than as dictionary indexes but everything else is the same (and what’s in the video is where we’re headed in one or two more blog posts).

All of our definitions move to an op code dispatch:
 

letrec execute () =

    let instruction = mem.[p]

    p <- p + 1

    match instruction with

    | -1 -> ()

    | 0 -> next ()

    | 1 -> docol ()

    | 2 -> dosemi ()

    | 3 -> dolit ()

    | 4 -> dyadic (%)

    | 5 -> dyadic (fun a b -> ~~~(a &&& b))

    | 6 -> comp (>)

    | 7 -> comp (=)

    | 8 -> dot ()

    | 9 -> immediate ()

    | 10 -> create ()

    | 11 -> variable ()

    | 12 -> load ()

    | 13 -> store ()

    | 14 -> dyadic (/)

    | 15 -> branch ()

    | 16 -> zerobranch ()

    | 17 -> tick ()

    | 18 -> comma ()

    | 19 -> dointeractive ()

    | 20 -> docompiling ()

| 21 -> doforget ()

    | 22 -> comment '\n'

    | 23 -> comment ')'

    | 24 -> dovar ()

    | 25 -> dyadic (+)

    | 26 -> dyadic (*)

    | 27 -> appenddocolon ()

    | 28 -> appendsemi ()

| 29 -> cin ()

    | _ -> failwith "Unknown instruction"

    if instruction <> -1 then execute ()

 

Now we can point p into memory and execute directly as a normal VM.

We still need primitives to be named and to exist in the dictionary in order to be called by secondaries.

 

let header name = dict <- { Name = name; Def = mem.[h]; Immediate = ref false } :: dict

let primitive name code = header name; append code; append NEXT

primitive "@" LOAD

primitive "+" ADD

primitive "*" MULT

primitive "/" DIV

primitive "MOD" MOD

primitive "IMMEDIATE" IMMEDIATE

primitive "NAND" NAND

primitive ">" GREATER

primitive "=" EQUAL

primitive "." DOT

primitive "VARIABLE" VARIABLE

primitive "!" STORE

primitive "BRANCH" BRANCH

primitive "0BRANCH" ZEROBRANCH

primitive "'" TICK; immediate ()

primitive "," COMMA

primitive "FORGET" FORGET

primitive "\\" COMMENTLINE; immediate ()

primitive "(" COMMENTBLOCK; immediate ()

 

The new system for creating secondary words and variables is:
 

let create () = token () |> header

let DOVAR = 24

let NEXT = 9

let variable () = create (); append DOVAR; append 0; append NEXT

 

Heart Surgery

 

Finally we’ll get to the heart of the system. In fact, this is really the heart of Forth in general. We will replace our inner interpreter with an implementation in terms of primitive instruction op codes. After watching the video above, hopefully the below makes perfect sense!
 

letmutable i = 0

letmutable w = 0

letmutable p = 0

let next () = w <- mem.[i]; i <- i + 1; p <- w

let docol () = rpush i; i <- w + 1; next ()

let dosemi () = i <- rpop (); next ()

 

These are triumvirate of a Forth inner interpreter. Here they’re implemented as single functions bound to single op codes in the VM and with dedicated registers for i, w and p. It’s very convenient to have the liberty of designing a Forth-specific VM. On bare hardware however, these may need to be implemented as several machine code instructions each and mapped to available registers. In that case, it is very important that next be as fast as possible. Often it is an assembly macro folded in everywhere.

The new system for literals and variables is:
 

let dolit () = push mem.[i]; i <- i + 1

let dovar () = push p; p <- p + 1

 

Finally, the outer interpreter needs to change some to match. We will later be moving to a native outer interpreter to which the inner interpreter will return when complete. For now we’re initializing i to the address of a halt instruction so that things halt when the inner interpreter is done:
 

let rep input =

    out.Clear() |> ignore

    source <- input

    while not (Seq.isEmpty source) do

        let word = token ()

        if word.Length > 0 then

            match find word with

            | Some(d) ->

                if interactive || !d.Immediate

                thenp <- d.Def; w <- d.Def; i <- HALT_ADDR; execute ()

                elseappend d.Def

            | None ->// literal?

                let number, value = Int32.TryParse word

                if number then

                    if interactive then push value elseappend LIT_ADDR; append value

                else word + "?" |> failwith
 

Further Reading

Loeliger’s book, Threaded Interpretive Languages: Their Design and Implementation is excellent; a must read for serious Forthers.

Brad Rodriguez’s series, Moving Forth is very nice and starts off covering various threading methods along with very good pros and cons of each.

Next>