ROM Addresses: Getting BASIC to do What You Want
How does BASIC do what you tell it to do? Clues to the language's subservience lie in ROM.
By JAKE COMMANDER
Portable 100 September 1983
How on earth does Basic know what to do? All those statements, commands, and functions, yet the interpreter untiringly plods through your code always knowing what's required next. Just how does it do it?
Well, if the answer were simple, everybody would be writing BASIC interpreters and putting Microsoft out of business. But it is possible to follow at least some of the pathways Basic uses to perform its duties.
Most addresses of the ROM routines which comprise BASIC are held in two tables. These can be unraveled to give a list of routines used to perform various tasks.
JUMP ADDRESSES
One table contains jump addresses for the commands (or verbs, as it were), which will always be the first thing the interpreter picks up from a statement. The whole repertoire of such commands is catered for by the table located at 0262 hex.
Basic gets the appropriate jump address by using the token number for the command it's about to execute. All tokens are numbers from 128 to 255; therefore subtracting 128 gives numbers from zero to 127. As each jump address in the table is two bytes long, the token (minus 128) is multiplied by two to give an offset into the table. This points straight at the address which is needed. The two-byte address is picked up and jumped to, and we're now executing a Basic command in pure machine code.
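The lookup just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the ROM image is a made-up byte array rather than a real dump, and the low-byte-first storage order is assumed (it is the usual 8085 convention, but the article does not state it).

```python
TABLE_BASE = 0x0262   # command jump table address given in the article
FIRST_TOKEN = 0x80    # tokens run from 128 (80 hex) to 255 (FF hex)


def table_offset(token: int) -> int:
    """Offset into the jump table: (token - 128) * 2, as two bytes per entry."""
    if not FIRST_TOKEN <= token <= 0xFF:
        raise ValueError(f"not a token: {token:#04x}")
    return (token - FIRST_TOKEN) * 2


def jump_address(rom: bytes, token: int) -> int:
    """Read the two-byte jump address for a token from a ROM image."""
    entry = TABLE_BASE + table_offset(token)
    # Assumption: the address is stored low byte first.
    return rom[entry] | (rom[entry + 1] << 8)


# Hypothetical ROM image: all zeros except one hand-planted table entry.
rom = bytearray(0x8000)
PRINT_TOKEN = 0xA3                        # PRINT's token in the list
entry = TABLE_BASE + table_offset(PRINT_TOKEN)
rom[entry:entry + 2] = bytes([0x56, 0x0B])  # 0B56, PRINT's listed entry point

print(f"table entry at {entry:04X}, jump to {jump_address(rom, PRINT_TOKEN):04X}")
```

With a real ROM dump in place of the zero-filled array, the same two functions would read the actual table, provided the byte-order assumption holds.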
What happens next depends entirely on the machine code for the command itself. Various syntaxes are allowed for some commands but not for others. For instance, the PRINT command would allow an expression such as TAB(22);1/3, and so would an LPRINT. But a LET would have none of that. LET X = TAB(22);1/3 would have you on the carpet in no time.
Also, various combinations of tokens can do different things. The comparison operators, for example, can be combined pretty much interchangeably. Combinations such as >=, <>, =<, etc. are all OK syntactically. This versatility means a table for such a wide set of possibilities is nigh impossible.
SECOND TABLE
However, there is a second table at location 004E in the ROM. This contains many addresses used in the evaluation of Basic math functions and expressions. These are extracted and jumped to in a similar fashion to the first table.
Any BASIC word excluded from either of these tables is handled separately by the interpreter according to its particular use. However, out of a possible 128 tokens, these two tables give us a mechanism by which we can follow the machine-code execution of many of them. It is the combination of these routines and the syntax checking required to logically execute them that makes up an interpreter.
The following list has been compiled from the two tables I've described and a disassembly of other parts of the ROM. It shows the entry points for all important BASIC statements and functions. Certain functions can have more than one possible syntactic use and the list does not cover all such uses. (An example is the statement OFF, which can be SOUND OFF or MOTOR OFF, etc.) The list is in four columns. The first is the address in ROM where the BASIC word occurs in the vocabulary table. The second entry is the word itself. Third is the token assigned to that word when it is encoded by the Basic interpreter.
FOURTH COLUMN
The fourth column contains the address the interpreter jumps to in order to execute the token representing the statement or function desired. Once again, some statements can have more than one use, such as MID$(LH$)=RH$ and LH$=MID$(RH$). In these cases, two addresses are given: one for use on the left-hand side of the equals sign and one for the right-hand side.
Perhaps unsurprisingly, things get a little more complicated with the mathematical functions in BASIC. It's not simply a matter of taking an address for, say, a multiply routine and then jumping to it. The BASIC interpreter has to know the type of the operands it has to work on. For instance, with the addition operator, Basic has four choices: signed integer, single precision, double precision, and string. None of the other binary operators allow string manipulation, so they're limited to the numeric variable types only.
The addresses of these binary operators can be confirmed (if you need confirmation) from three short tables in ROM, one each for double precision, single precision, and integer numbers respectively.
The tables contain six addresses apiece for addition, subtraction, multiplication, division, exponentiation, and comparison. Rather than clutter the token-addresses table, these addresses are contained separately at the end.
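The choice by numeric type can be pictured as a two-step lookup: first pick the table for the operand type, then pick the operator's slot in it. The Python sketch below uses the numeric entry points from the table at the end of this article. The names are hypothetical, the string entries are left out, and "OFOD" in the printed table is read here as 0F0D, which is an assumption.

```python
# Slot order as the article lists it: addition, subtraction, multiplication,
# division, exponentiation, comparison.
OPERATORS = ("+", "-", "*", "/", "^", "cmp")

# Entry points copied from the table at the end of the article.
ENTRY_POINTS = {
    "double":  (0x2B78, 0x2B69, 0x2CFF, 0x2DC7, 0x3D8E, 0x34FA),
    "single":  (0x37F4, 0x37FD, 0x3803, 0x380E, 0x3D7F, 0x3498),
    "integer": (0x3704, 0x36F8, 0x3725, 0x0F0D, 0x3DF7, 0x34C2),
}


def operator_entry(numeric_type: str, op: str) -> int:
    """Return the listed ROM entry point for an operator on a numeric type."""
    return ENTRY_POINTS[numeric_type][OPERATORS.index(op)]


print(f"single-precision multiply: {operator_entry('single', '*'):04X}")
```

This mirrors the structure described above rather than the interpreter's actual code, which has to work out the operand type before it can choose a table.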
In a following article, I'll be looking at ways to use some of these addresses in your own machine-code programs. For the more adventurous, an experiment will probably prove irresistible. Remember, though, in a RAM-file machine such as the Model 100, a lock-up may cost you all your files. Use caution.
JAKE'S ROM ADDRESSES FOR BASIC KEYWORDS
0080 => END = 80 @ 409F
0083 => FOR = 81 @ 0726
0086 => NEXT = 82 @ 4174
008A => DATA = 83 @ 099E
008E => INPUT = 84 @ 0CA3
0093 => DIM = 85 @ 478B
0096 => READ = 86 @ 0CD9
009A => LET = 87 @ 09C3
009D => GOTO = 88 @ 0936
00A1 => RUN = 89 @ 090F
00A4 => IF = 8A @ 0B1A
00A6 => RESTORE = 8B @ 407F
00AD => GOSUB = 8C @ 091E
00B2 => RETURN = 8D @ 0966
00B8 => REM = 8E @ 09A0
00BB => STOP = 8F @ 409A
00BF => WIDTH = 90 @ 1DC3
00C4 => ELSE = 91 @ 09A0
00C8 => LINE = 92 @ 0C45
00CC => EDIT = 93 @ 5E51
00D0 => ERROR = 94 @ 0B0F
00D5 => RESUME = 95 @ 0AB0
00DB => OUT = 96 @ 110C
00DE => ON = 97 @ 0A2F
00E0 => DSKO$ = 98 @ 5071
00E5 => OPEN = 99 @ 4CCB
00E9 => CLOSE = 9A @ 4E20
00EE => LOAD = 9B @ 4D70
00F2 => MERGE = 9C @ 4D71
00F7 => FILES = 9D @ 1F3A
00FC => SAVE = 9E @ 4DCF
0100 => LFILES = 9F @ 506F
0106 => LPRINT = A0 @ 0B4E
010C => DEF = A1 @ 0872
010F => POKE = A2 @ 128B
0113 => PRINT = A3 @ 0B56
0118 => CONT = A4 @ 40DA
011C => LIST = A5 @ 1140
0120 => LLIST = A6 @ 113B
0125 => CLEAR = A7 @ 40F9
012A => CLOAD = A8 @ 2377
012F => CSAVE = A9 @ 2280
0134 => TIME$ = AA @ 19AB 1904
0139 => DATE$ = AB @ 19BD 1924
013E => DAY$ = AC @ 19F1 1955
0142 => COM = AD @ 1A9E
0145 => MDM = AE @ 1A9E
0148 => KEY = AF @ 1BB8
014B => CLS = B0 @ 4231
014E => BEEP = B1 @ 4229
0152 => SOUND = B2 @ 1DC5
0157 => LCOPY = B3 @ 1E5E
015C => PSET = B4 @ 1C57
0160 => PRESET = B5 @ 1C66
0166 => MOTOR = B6 @ 1DEC
016B => MAX = B7 @ 7F0B 19DB
016E => POWER = B8 @ 1419
0173 => CALL = B9 @ 1DFA
0177 => MENU = BA @ 5797
017B => IPL = BB @ 1A78
017E => NAME = BC @ 2037
0182 => KILL = BD @ 1F91
0186 => SCREEN = BE @ 1E22
018C => NEW = BF @ 20FE
018F => TAB( = C0 @ 0C01
0193 => TO = C1 @ 076B
0195 => USING = C2 @ 4991
019A => VARPTR = C3 @ 0F7E
01A0 => ERL = C4 @ 0F56
01A3 => ERR = C5 @ 0F47
01A6 => STRING$ = C6 @ 296D
01AD => INSTR = C7 @ 2A37
01B2 => DSKI$ = C8 @ 5073
01B7 => INKEY$ = C9 @ 4BEA
01BD => CSRLIN = CA @ 1D90
01C3 => OFF = CB @ various
01C6 => HIMEM = CC @ 1DB9
01CB => THEN = CD @ 0B2A
01CF => NOT = CE @ 1054
01D2 => STEP = CF @ 0783
01D6 => + = D0 *
01D7 => - = D1 *
01D8 => * = D2 *
01D9 => / = D3 *
01DA => ^ = D4 *
(* See table at end)
01DB => AND = D5 @ 1097
01DE => OR = D6 @ 108C
01E0 => XOR = D7 @ 10A2
01E3 => EQV = D8 @ 10AD
01E6 => IMP = D9 @ 10B5
01E9 => MOD = DA @ 37DF
01EC => \ = DB @ 377E
01ED => > = DC @ 0E29
01EE => = = DD @ 0E29
01EF => < = DE @ 0E29
01F0 => SGN = DF @ 3407
01F3 => INT = E0 @ 3654
01F6 => ABS = E1 @ 33F2
01F9 => FRE = E2 @ 2B4C
01FC => INP = E3 @ 1100
01FF => LPOS = E4 @ 10C8
0203 => POS = E5 @ 10CE
0206 => SQR = E6 @ 305A
0209 => RND = E7 @ 313E
020C => LOG = E8 @ 2FCF
020F => EXP = E9 @ 30A4
0212 => COS = EA @ 2EEF
0215 => SIN = EB @ 2F09
0218 => TAN = EC @ 2F58
021B => ATN = ED @ 2F71
021E => PEEK = EE @ 1284
0222 => EOF = EF @ 1889
0225 => LOC = F0 @ 506D
0228 => LOF = F1 @ 506B
022B => CINT = F2 @ 3501
022F => CSNG = F3 @ 352A
0233 => CDBL = F4 @ 35BA
0237 => FIX = F5 @ 3645
023A => LEN = F6 @ 2943
023D => STR$ = F7 @ 273A
0241 => VAL = F8 @ 2A07
0244 => ASC = F9 @ 294F
0247 => CHR$ = FA @ 295F
024B => SPACE$ = FB @ 298E
0251 => LEFT$ = FC @ 29AB
0256 => RIGHT$ = FD @ 29DC
025C => MID$ = FE @ 2AC2 29E6
0260 => ' = FF @ 0A90
         +     -     *     /     ^     Cmpr
D.P.     2B78  2B69  2CFF  2DC7  3D8E  34FA
S.P.     37F4  37FD  3803  380E  3D7F  3498
INT      3704  36F8  3725  0F0D  3DF7  34C2
String   28CC                          270C