Stacking Forth Returns

In which I create a real return stack

So far, the return stack has been a hybrid of the Swift stack for returning from code and a stack implemented within the code for temporary values and loop control. It’s time to implement the whole stack in the code. I’m not sure if this will be beneficial or detrimental to performance. If it has a really bad effect, I may back the change out. For this reason, I am working on a new branch in git.

As preparation, I have tightened up the requirements for the collection passed into the interpret function. It now has to be a random access collection and the index has to be an Int. This is because I need to store the index on the return stack. The code is tagged blog-1229-1.With this change in, the performance is as follows:

Mean time = 2.2451142259999997)
Mean mips = 2.41133254482349
Time standard deviation = 0.05344836229779769
words executed: 5413717

The initial attempt at storing return addresses on the return stack is tagged blog-1229-2. Some points to note about it:

  • The mechanism for returning the loop indexes I and J is broken because the they are defined as ordinary words and when they are called, they put a return address on the return stack above the loop index.
  • There’s a failing unit test in testQuote
  • We have lost some performance. It’s approximately 12% slower.
Mean time = 2.7090813193999996)
Mean mips = 1.998358986580372
Time standard deviation = 0.11162057920377298

The first problem is easily fixed. The definitions for I goes from

: i
	r.loop> 			( -- e i )
	2dup				( e i -- e i e i )
	>r.loop				( e i e i -- e i )
	swap drop			( e i -- i )
	;

to

: i
	postpone r.loop> 	( -- e i )
	postpone 2dup		( e i -- e i e i )
	postpone >r.loop	( e i e i -- e i )
	postpone swap
	postpone drop		( e i -- i )
	; immediate compile-only

which means that, I now compiles the instructions directly into the current compiling word instead of a call to I. A similar change can be made to J to fix that too.

The failing unit test in testQuote is because the EXECUTE primitive always tries to execute the execution token on the stack as if it was a primitive. EXECUTE needs to be restructured to call non primitives rather than execute them. This bug wasn’t picked up because execute takes a CompiledWord as its first argument, not a primitive and just ignores the word if it isn’t a primitive. So, we’ll make it so it has to be a primitive and save ourselves a check in execute.

The code that implements the two fixes is tagged blog-1229-3. Note that, as part of the fix for the quote issue, I have moved wordNumber and index out of the interpret loop to be properties of the Forth machine. This has had no serious effect on performance. On the other hand, I can probably remove the ipAdjustment fix and change index directly for branches etc.

Mean time = 2.6311988872)
Mean mips = 2.0575096114307905
Time standard deviation = 0.02964115736226729

Having made the above changes, stack performance is now critical, so I took the decision to reimplement the Toolbox stack directly in the MyForth module. This opens the code up to more optimisations and the performance effect is dramatic:

Mean time = 1.06199634395)
Mean mips = 5.097679507882454
Time standard deviation = 0.030844099259880086

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.