ROM Addresses: Getting BASIC to do What You Want
How does BASIC do what you tell it to do? Clues to the language's subservience lie in ROM.
By JAKE COMMANDER
Portable 100, September 1983, pg. 24
How on earth does BASIC know what to do? All those statements, commands, and functions, yet the interpreter untiringly plods through your code, always knowing what's required next. Just how does it do it?
Well, if the answer were simple, everybody would be writing BASIC interpreters and putting Microsoft out of business. But it is possible to follow at least some of the pathways BASIC uses to perform its duties.
Most addresses of the ROM routines which comprise BASIC are held in two tables. These can be unraveled to give a list of routines used to perform various tasks.
One table contains jump addresses for the commands (or verbs, as it were), which will always be the first thing the interpreter picks up from a statement. The whole repertoire of such commands is catered for by the table located at 0262 hex.
BASIC gets the appropriate jump address by using the token number for the command it's about to execute. All tokens are numbers from 128 to 255; therefore subtracting 128 gives numbers from zero to 127. As each jump address in the table is two bytes long, the token (minus 128) is multiplied by two to give an offset into the table. This points straight at the address which is needed. The two-byte address is picked up and jumped to, and we're now executing a BASIC command in pure machine code.
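The lookup just described can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the ROM's own code: the function names and the tiny made-up ROM image are hypothetical, and the low-byte-first order of the two-byte entry is an assumption based on the usual 8085 convention. The sketch also does not check whether a given token is actually a command.

```python
JUMP_TABLE_BASE = 0x0262  # table address given in the article
FIRST_TOKEN = 128         # tokens run from 128 to 255


def jump_entry_location(token: int) -> int:
    """Return the ROM address of the two-byte table entry for a token."""
    if not FIRST_TOKEN <= token <= 255:
        raise ValueError(f"not a token: {token}")
    # Subtract 128, then double, because each entry is two bytes long.
    return JUMP_TABLE_BASE + (token - FIRST_TOKEN) * 2


def jump_address(rom: bytes, token: int) -> int:
    """Read the jump address for a token from a ROM image.

    Assumption: the address is stored low byte first.
    """
    loc = jump_entry_location(token)
    return rom[loc] | (rom[loc + 1] << 8)


# Made-up ROM image: only the entry for token 130 is filled in.
rom = bytearray(0x0400)
loc = jump_entry_location(130)          # 0x0262 + 2 * 2
rom[loc:loc + 2] = bytes([0x34, 0x12])  # low byte, then high byte
print(hex(loc), hex(jump_address(rom, 130)))  # prints 0x266 0x1234
```

With a real ROM dump in place of the made-up image, the same two functions would give the entry point for any command token.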
What happens next depends entirely on the machine code for the command itself. Various syntaxes are allowed for some commands but not for others. For instance, the PRINT command would allow an expression such as TAB(22);1/3, and so would LPRINT. But LET would have none of that: LET X = TAB(22); 1/3 would have you on the carpet in no time.
Also, various combinations of tokens can do different things. The comparison operators, for example, can be used pretty much interchangeably. These operators, >=, <>, =, <, etc., are all OK syntactically. This versatility means a table for such a wide set of possibilities is nigh impossible.
However, there is a second table at location 004E in the ROM. This contains many addresses used in the evaluation of BASIC math functions and expressions. These are extracted and jumped to in a similar fashion to the first table.
Any BASIC word excluded from either of these tables is handled separately by the interpreter according to its particular use. However, out of a possible 128 tokens, these two tables give us a mechanism by which we can follow the machine-code execution of many of them. It is the combination of these routines and the syntax checking required to logically execute them that makes up an interpreter.
The following list has been compiled from the two tables I've described and a disassembly of other parts of the ROM. It shows the entry points for all important BASIC statements and functions. Certain functions can have more than one possible syntactic use and the list does not cover all such uses. (An example is the statement OFF, which can be SOUND OFF or MOTOR OFF, etc.) The list is in four columns. The first is the address in ROM where the BASIC word occurs in the vocabulary table. The second is the word itself. The third is the token assigned to that word when it is encoded by the BASIC interpreter.
The fourth column contains the address the interpreter jumps to in order to execute the token representing the desired statement or function. Once again, some statements can have more than one use, such as LH$=MID$(RH$). In these cases, two addresses are given: one for use on the left-hand side of the equals sign and one for the right-hand side.
Perhaps unsurprisingly, things get a little more complicated with the mathematical functions in BASIC. It's not simply a matter of taking an address for, say, a multiply routine and then jumping to it. The BASIC interpreter has to know the type of the operands it has to work on. For instance, with the addition operator, BASIC has four choices:
- Signed integer
- Single precision
- Double precision
- String
None of the other binary operators allow string manipulation, so they're limited to the numeric variable types only.
The addresses of these binary operators can be confirmed (if you need confirmation) from three short tables in ROM, one each for double precision, single precision, and integer numbers respectively.
The tables contain six addresses apiece for addition, subtraction, multiplication, division, exponentiation, and comparison. Rather than clutter the token-address list, I have given these addresses separately at the end.
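As a rough illustration of how the operand type selects a routine, here is a small Python lookup built from the operator addresses listed at the end of this article. It is a sketch, not the interpreter's mechanism: the names are hypothetical, the integer divide entry is taken to be 0F0D, and the two string addresses are assumed to be concatenation and comparison.

```python
# Column order follows the article: + - * / exponentiation, comparison.
OPERATIONS = ("add", "subtract", "multiply", "divide", "power", "compare")

# Entry points (hex) copied from the operator table at the end of the article.
NUMERIC_ENTRY_POINTS = {
    "double": (0x2B78, 0x2B69, 0x2CFF, 0x2DC7, 0x3D8E, 0x34FA),
    "single": (0x37F4, 0x37FD, 0x3803, 0x380E, 0x3D7F, 0x3498),
    "integer": (0x3704, 0x36F8, 0x3725, 0x0F0D, 0x3DF7, 0x34C2),
}

# Assumption: the two string entries are concatenation (+) and comparison.
STRING_ENTRY_POINTS = {"add": 0x28CC, "compare": 0x270C}


def entry_point(operand_type: str, operation: str) -> int:
    """Return the listed ROM entry point for an operation on a given type."""
    if operand_type == "string":
        if operation not in STRING_ENTRY_POINTS:
            raise ValueError(f"strings do not support {operation!r}")
        return STRING_ENTRY_POINTS[operation]
    row = NUMERIC_ENTRY_POINTS[operand_type]
    return row[OPERATIONS.index(operation)]


print(f"{entry_point('single', 'multiply'):04X}")  # prints 3803
```

Keeping one row per numeric type mirrors the three short ROM tables, so a type is chosen first and the operation then indexes into its row.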
In a following article, I'll be looking at ways to use some of these addresses in your own machine-code programs. For the more adventurous, an experiment will probably prove irresistible. Remember, though, in a RAM-file machine such as the Model 100, a lock-up may cost you all your files. Use caution.
Jake's ROM Addresses For BASIC Keywords
|Vocabulary address||Word||Token||Jump address|
|01D6||+||D0||See table 2|
|01D7||-||D1||See table 2|
|01D8||*||D2||See table 2|
|01D9||/||D3||See table 2|
|01DA||\||D4||See table 2|
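Table 2: Entry points for the binary operators
|Type||+||-||*||/||^||Cmpr|
|D.P.||2B78||2B69||2CFF||2DC7||3D8E||34FA|
|S.P.||37F4||37FD||3803||380E||3D7F||3498|
|INT||3704||36F8||3725||0F0D||3DF7||34C2|
|String||28CC||n/a||n/a||n/a||n/a||270C|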