In which I continue my war against associated values
In my previous post, I ended by trying to eliminate the CompiledWord
cases that use associated values. These are:
case parsedLiteral(Int)
case undefined(Word)
case redefined(Word)
case definition(Cell)
I’ve got rid of the last of these, but it was a difficult struggle. I did it by introducing a new word .jsr
similar to .pushNext
. So, .definition(n)
becomes .jsr .parsedLiteral(n)
. Unfortunately, I hit many issues as a result. In particular, it transpired I needed to implement two versions of .jsr
, one for normal words and one for immediate words, otherwise, even immediate words got compiled. I am somewhat suspicious of my solution, as a result. It doesn’t feel particularly clean.
Anyway, the code is tagged blog-1176-1. For those interested, the performance hasn’t suffered much
Time of run: 3.510276212s, words executed: 4778203, mips: 1.3612042789298315
["121393"]
The next step was to remove .undefined
and .redefined
. This turned out to be much simpler than .definition
. They were only there to allow me to feed the raw string into the interpreter function as a compiled word. Once all the processing done in the needsWordName
state had been factored out into a separate function, they could go. Timings were pretty much unchanged:
Time of run: 3.488767197s, words executed: 4778203, mips: 1.3695964018776572
["121393"]
The code is tagged blog 1176-2.
That just left .parsedLiteral
to remove. It was only there to wrap an arbitrary integer as a CompiledWord
which is an enumeration. We could remove it if we used a different kind of type as CompiledWord
. For example, it could be an alias for Int
. Or, it could be encoded as an OptionSet
. This seems more interesting in the short term because it opens the possibility for flag easy flag manipulation, for example, the fact that a word is a primitive is already encoded in the top bit. There might be a performance advantage in encoding the immediate and compile only flags there too and also small constants. My plan was
- Change
CompiledWord
to anOptionSet
type with arawValue
ofInt
type - Use a raw valued enumeration to allocate numbers to the primitives.
.parsedLiteral
is just anyCompiledWord
treated as itsrawValue
.
Having made the changes (and eliminated the token
property in favour of rawValue
), performance was disappointingly slightly down. The loss seemed mostly due to the elimination of token
. I can only guess that token
, being read only, allowed extra optimisations where it was used.
Time of run: 3.649850963s, words executed: 4778203, mips: 1.3091501676201458
["121393"]
The code is tagged blog-1176-3.
Changing CompiledWord
to an OptionSet
opened up a number of possibilities. For example, we can store the compile-only and immediate bits directly in CompiledWord
. We can also eliminate .jsr
and .jsrImmediate
as primitives and use the combination of the primitive bit and the immediate bit to tell us which one to do. The layout of a CompiledWord
at this point looks like:
63 62 61 60 0
+--+--+--+--------------------------------------------+
| | | | Word list index (61 bits) |
+--+--+--+--------------------------------------------+
^ ^ ^
| | +- compile only
| +---- immediate
+------- primitive
Implementing the flags as shown above improves performance considerably.
Time of run: 2.575161858s, words executed: 4778203, mips: 1.8554961837276482
["121393"]
The code is tagged blog-1176-4