In which I refactor word lists and primitives
I’ve made a lot of progress with the core test suite, but unfortunately, I have come to the test case for '
and EXECUTE
. These tests assume you can put an execution token on the data stack. An execution token is a number which refers to a particular word which will then cause that word to be executed when the EXECUTE
word is encountered. I need to figure out a way for an ordinary integer to refer to either a word list or a primitive.
I could create a token on the fly to use that references the word list or primitive, but there might be garbage collection problems with that, or an overhead caused by having to manage the collection and reuse existing tokens.
An alternative might be to store the word lists and primitives in an array and use the array index as the execution token. This seems more attractive to me because it will probably make it easier later on to make a more low level machine with threaded code. Also, the index could be the reference and the word list could be converted to a struct
. This might give a small performance improvement due to the elimination of reference counting.
Stage 1 is to implement the array of word lists and measure the performance loss or gain.
Currently, the dictionary looks like the image below. Defined words have a definition including a pointer to a WordList
which is a Swift reference.

The first stage looks like this:

WordList
s will be stored in an array and definitions, instead of containing an associated WordList
will contain an associated index into the array.
Having made the change (tagged blog-1135-1), everything is still working. Unfortunately, I lost a mip of performance. I managed to pull some of it back by converting WordList
to a struct
, so the Fibonacci test case now runs at 1.7 mips. roughly.
Time of run: 2.787402232s, words executed: 4778189, mips: 1.7142086438567508
["121393"]
The next immediate problem is that I need to allocate some numbers to the primitives so we can quote them as well. My current thought is to use negative integers for them, starting from Int.min
. That way, a primitive is like a word list but with the top bit set.
Having implemented execution tokens for wordlists and primitives, there is a problem: As well as '
we need to implement EXECUTE
which will execute the execution token on the top of the stack. This is fine for a word list: we just interpret the word list at its index in the dictionary. For primitives, it is more tricky: there is no easy way to map back from the execution token to the primitive. The solution is to base the execution switch cases on execution tokens rather than the compile word itself. There are four CompiledWords
that can’t be accounted for in this way because they have associated values. These are .definition
, .undefined
, .redefined
and .parsedLiteral
. .definition
is accounted for because its word list index will be used by EXECUTE
. .parsedliteral
is only used to place a literal on the stack and the other two are used only to pass words to be defined. So they’re OK.
The changes required for this have a bad effect on performance, probably mainly due to the switch not being as optimisable as it was.
Time of run: 4.679748481s, words executed: 4778189, mips: 1.0210354294466195
["121393"]
But we now have everything in place for '
and EXECUTE
.
Everything, except one thing: how to execute a token. The obvious way is to factor out the execution switch from the interpreter loop and call it recursively for EXECUTE
. This, together with some minor changes needed to make it work with postfix words (words like :
and '
with a parameter after them in the input stream), actually gave me a slight performance boost.
Time of run: 3.217441131s, words executed: 4778189, mips: 1.4850897982133118
["121393"]
The code is tagged blog-1135-2.