In which I prematurely exit loops
The basic DO
loop is done. Now we need to implement LEAVE
. This is effectively a “go to the end of the loop”. It seems simple enough but there is a problem in that LEAVE
can occur – in fact, almost certainly will occur – inside other control structures, most likely IF
. For example:
: GD5 123 SWAP 0
DO
I
4 >
IF
DROP 234 LEAVE
THEN
LOOP ;
The DO
puts both a .orig
and a .dest
on the control stack. The IF
puts another .orig
on, then the LEAVE
also needs to put a .orig
on, and the THEN
will try to resolve the LEAVE
.orig
not the IF
.orig
.
I decided to resolve this by creating a .orig
for the live and the tucking it under all the other control flow items since the DO
was executed. Here is an example. The below represents the control stack from the above code when LEAVE
is called.
control stack: <4> .dest(12) .orig(10) .orig(22) .orig(28)
^ ^ ^ ^
| | | +-- .orig for LEAVE
| | +-- .orig created by IF
| +- .orig created from branch in DO to the end of the loop
+-- .dest to loop back to after the index test
We need to manipulate the stack to look like this:
control stack: <4> .orig(28) .dest(12) .orig(10) .orig(22)
At the end of the loop, we’ll just be left with a string of .orig
s that each represent the forward jump for a LEAVE
and we just resolve them all.
This is relatively simple if we maintain a count of how many LEAVE
.orig
s we have and how many items there are on the stack since we entered the DO
loop. The LEAVE
count is easy: in the DO
we put a count on the data stack and increment it each time we encounter LEAVE
. The second part is harder because we don’t know what was on the control stack before we encountered the DO
.
I’ve chosen a fairly generic means of achieving remembering the state of the stack when we started the DO
loop. We use the concept of a stack frame. A stack frame is merely a region of the stack relative to one of the elements on it. In practice that means there is a pointer which references a certain element on the stack and other elements may be referenced by means of an offset to the frame pointer.
Stack top
+-------------------+
| Newer |
| Elements |
| |
+-------------------+
| old frame pointer | <-- frame pointer
+-------------------+
| Older elements |
| |
Stack frames may themselves be stacked, for example, in the case of nested DO
loops. It is for this reason that I adopted the convention that the element pointed to by the frame pointer contains the previous frame pointer from before the creation of the stack frame. This effectively sets up a linked list of frame pointers within the stack.
To create a new stack frame, you simply push the current frame pointer onto the stack and set the frame pointer to point to this new element.
To remove a frame, you pop everything above the saved frame pointer element and then pop the saved frame pointer element into the frame pointer.
To make this work, you need to be able to convert the frame pointer (which is an Int
) to and from a stack element. For this reason, I created a protocol called Frameable
which contains functions for converting elements to and from frame pointers. If the Element
type of the stack has adopted this protocol, the stack frame functionality is automatically available.
Returning to DO
. The DO
word creates a stack frame on the control stack using a new primitive .csFrame
before it does anything else. It also pushes a zero LEAVE
count. At the end of DO
we have this:
stack: <1> 0
control stack: <1> .frame(old: -1)
Note that the frame pointer is initialised to -1
which means that it the stack starts with a frame that cannot be popped without a stack underflow.
When we encounter the LEAVE
in the above example Forth code, the stacks look like this:
stack: <1> 0
control stack: <4> .frame(old: -1) .dest(12) .orig(10) .orig(22)
There’s a .dest
and a .orig
for the DO
and a .orig
for the IF
. The LEAVE
adds a new .orig
and moves everything after the .frame
to above it. There’s another new primitive to help us with this: .csFrameSize
which tells us how many elements are on the stack above the frame pointer.
Once the LEAVE
is compiled, the stacks look like this:
stack: <1> 1
control stack: <5> .frame(old: -1) .orig(28) .dest(12) .orig(10) .orig(22)
In the LOOP
at the end after resolving the IF
and the DO
, we are left with
stack: <1> 1
control stack: <2> .frame(old: -1) .orig(28)
Another new primitive called .csPopFrame
restores the saved frame pointer.
The code for the above is tagged blog-1224.