Forth: Inside Out

In which we turn the interpreter inside out

My Forth interpreter is working fairly well so far, but I think there is some unnecessary complexity. The interpreter is event driven, which means that everything it does is driven by some event, which is always receipt of some text input. This works OK and I chose the design based on the thought that I might want to incorporate the interpreter into a graphical program and these are traditionally event driven.

Unfortunately, it also makes my main execution loop a bit messy. If the code requires somer input text for any reason, it has to record its current state and drop out of the several layers of loops. It would be so much easier to acquire input in a synchronous way. Furthermore, with the introduction of Swift concurrency, the problem of interfacing to a graphical program will probably go away just with the application of actors and await/async.

As another bonus, of the few core words to be implemented, abort is one. In a normal Forth machine, it can be implemented with a few primitives. In my Forth machine, we’d almost certainly have to write something custom.

The current code is pretty complex:

while inputParser.hasIterators
{
    while let word = inputParser.next()
    {
        currentWord = word

        if shouldExitToParser
        {
            shouldExitToParser = false
            try resume()
        }
        else
        {
            assert(wordNumber == 0, "Interpreting new word with non zero word number")
            let words = try lookup(word: word)
            if words[0].isImmediate || !isCompiling
            {
                if !isCompiling && words[0].isCompileOnly
                {
                    throw Error.interpretingACompileOnlyWord("\(description(of: words[0]))")
                }

                wordLists[0] = WordList(name: "interpeter", words, cellsNeeded: 1)
                wordIndex = 0
                try resume()
            }
            else
            {
                compile(words: words)
            }
        }
    }
    if !shouldExitToParser && !returnStackIsEmpty
    {
        // We should only get here when we've finished evaluating a
        // string. We need to carry on with what we were doing
        // before.
        wordLists[0].removeAll()	// Make sure we don't accidentally execute the evaluated word again
        try doExit()			// Return to where we were in `evaluate`
        try resume()			// Go!
    }
}

The above is the main parse/execution loop. This is quite ugly code because it has to handle a number of possible states. For example, the if statement at the end of the loop is required to handle evaluate. It shouldn’t be necessary. We also have various flags that denote state: shouldExitToParser is needed to flag when a word asks for the next word of input.

There’s a lot of evidence that I had trouble with this code. The assert and the comments show I had trouble reasoning about the code, even when I was writing it.

Refactoring Strategy

Firstly, all the work done is on a separate branch in the git repository. The branch is inside-out for anybody wanting to follow along. I’m writing this blog post as I go along doing the changes, by the way. If there are strange changes of tense, it’s because I haven’t edited it properly at the end.

The overall objective is to simplify the parse loop to the point that it can be implemented in Forth.

The first task was to remove shouldExitFromParser. Any words that use that flag would, henceforth, get the word they need directly from the input stream. Any word? There’s only one: parseWord. Henceforth, parseWord would parse a word directly from input. If there isn’t any input, we need to handle that. For now, I chose just to throw an exception.

The result of removing shouldExitToParser are tagged at blog-1810-1.The code is marginally cleaner and possibly a little easier to understand. However, four of the evaluate core tests were now broken. I decided to ignore this for now, because, I hoped the full transformation will make the bug a lot easier to fix.

The next stage was to remove the double while loop:

while inputParser.hasIterators
{
    while let word = inputParser.next()
    {

inputParser.hasIterators (renamed to hasCharacters) just tells us if the current line buffer is used up, but inputParser.next() does the same thing. This I proved by changing the inside while to an if and everything continued working. Furthermore, these two loops are called from a function that loops through all the line buffers so we had three levels of looping just to read through the input stream.

Some Time Later…

I have refactored the main loop. It now looks something like this:

while let word = inputParser.next()
{
    currentWord = word

    assert(wordNumber == 0, "Interpreting new word with non zero word number")
    let words = try lookup(word: word)
    if words[0].isImmediate || !isCompiling
    {
        if !isCompiling && words[0].isCompileOnly
        {
            throw Error.interpretingACompileOnlyWord("\(description(of: words[0]))")
        }

        wordLists[0] = WordList(name: "interpeter", words, cellsNeeded: 1)
        wordIndex = 0
        try resume()
    }
    else
    {
        compile(words: words)
    }
}
if !returnStackIsEmpty
{
    // We are at the end of the current buffer. if we are
    // evaluating, we need to return to the previous buffer.
    wordLists[0].removeAll()
    try doExit()
    try resume()
}

OK, so it’s not very different, but for me, it seems to be a lot clearer what is actually happening. I’ve eliminated the outer loop with its thoroughly misleadingly named condition based on hasIterators. The boolean shouldExitToParser is now gone because words that need to find a word in the input buffer do it “in line”. The code after refactoring the loop is tagged blog-1810-2.

Having made the above changes, I found another boolean I didn’t need anymore, which was the hasIterators (renamed to hasCharacters) property on InputParser. So that went. I also, removed th C Forth stuff because it was getting in the way. The final cleaners up code is tagged blog-1810-3.

Unfortunately, the changes broke the evaluate core tests. This is because, when we hit the end of an evaluate string and we restore the previous input buffer, we don’t know we haven’t reached the end of it. So, unless the evaluate was the last word in that buffer, we will fail to interpret any of the remaining words. This was clearly the reason why I had two levels of loop in the first place. Even though it was necessary, it was still bad code because the reason for having it was not obvious and caused me to introduce a regression.

I’m not going to fix the regression because I intend to redesign the entire read-execute loop to remove this issue altogether. Because it is now more than a month since I started writing this blog, this will be the subject of my next blog.

One thought on “Forth: Inside Out

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.