Go Forth and Leave

In which I prematurely exit loops

The basic DO loop is done. Now we need to implement LEAVE. This is effectively a “go to the end of the loop”. It seems simple enough but there is a problem in that LEAVE can occur – in fact, almost certainly will occur – inside other control structures, most likely IF. For example:

: GD5 123 SWAP 0 
DO 
    I 
    4 > 
    IF 
        DROP 234 LEAVE 
    THEN 
LOOP ;

The DO puts both a .orig and a .dest on the control stack. The IF puts another .orig on, then the LEAVE also needs to put a .orig on, and the THEN will try to resolve the LEAVE .orig not the IF .orig.

I decided to resolve this by creating a .orig for the live and the tucking it under all the other control flow items since the DO was executed. Here is an example. The below represents the control stack from the above code when LEAVE is called.

control stack: <4> .dest(12) .orig(10) .orig(22) .orig(28)
                    ^         ^         ^         ^
                    |         |         |         +-- .orig for LEAVE
                    |         |         +-- .orig created by IF
                    |         +- .orig created from branch in DO to the end of the loop
                    +-- .dest to loop back to after the index test

We need to manipulate the stack to look like this:

control stack: <4> .orig(28) .dest(12) .orig(10) .orig(22)

At the end of the loop, we’ll just be left with a string of .origs that each represent the forward jump for a LEAVE and we just resolve them all.

This is relatively simple if we maintain a count of how many LEAVE .origs we have and how many items there are on the stack since we entered the DO loop. The LEAVE count is easy: in the DO we put a count on the data stack and increment it each time we encounter LEAVE. The second part is harder because we don’t know what was on the control stack before we encountered the DO.

I’ve chosen a fairly generic means of achieving remembering the state of the stack when we started the DO loop. We use the concept of a stack frame. A stack frame is merely a region of the stack relative to one of the elements on it. In practice that means there is a pointer which references a certain element on the stack and other elements may be referenced by means of an offset to the frame pointer.

Stack top
+-------------------+
| Newer             | 
| Elements          |
|                   |
+-------------------+
| old frame pointer | <-- frame pointer
+-------------------+
| Older elements    |
|                   |

Stack frames may themselves be stacked, for example, in the case of nested DO loops. It is for this reason that I adopted the convention that the element pointed to by the frame pointer contains the previous frame pointer from before the creation of the stack frame. This effectively sets up a linked list of frame pointers within the stack.

To create a new stack frame, you simply push the current frame pointer onto the stack and set the frame pointer to point to this new element.

To remove a frame, you pop everything above the saved frame pointer element and then pop the saved frame pointer element into the frame pointer.

To make this work, you need to be able to convert the frame pointer (which is an Int) to and from a stack element. For this reason, I created a protocol called Frameable which contains functions for converting elements to and from frame pointers. If the Element type of the stack has adopted this protocol, the stack frame functionality is automatically available.

Returning to DO. The DO word creates a stack frame on the control stack using a new primitive .csFrame before it does anything else. It also pushes a zero LEAVE count. At the end of DO we have this:

stack: <1> 0
control stack: <1> .frame(old: -1)

Note that the frame pointer is initialised to -1 which means that it the stack starts with a frame that cannot be popped without a stack underflow.

When we encounter the LEAVE in the above example Forth code, the stacks look like this:

stack: <1> 0
control stack: <4> .frame(old: -1) .dest(12) .orig(10) .orig(22)

There’s a .dest and a .orig for the DO and a .orig for the IF. The LEAVE adds a new .orig and moves everything after the .frame to above it. There’s another new primitive to help us with this: .csFrameSize which tells us how many elements are on the stack above the frame pointer.

Once the LEAVE is compiled, the stacks look like this:

stack: <1> 1
control stack: <5> .frame(old: -1) .orig(28) .dest(12) .orig(10) .orig(22)

In the LOOP at the end after resolving the IF and the DO, we are left with

stack: <1> 1
control stack: <2> .frame(old: -1) .orig(28)

Another new primitive called .csPopFrame restores the saved frame pointer.

The code for the above is tagged blog-1224.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.