ROM Addresses: Getting BASIC to do What You Want

[[:Category:Model 100 Classics]]
How does Basic do what you tell it to do? Clues to the language's subservience lie in ROM.
 
== By JAKE COMMANDER<br/>Portable 100 September 1983 ==
 
How on earth does Basic know what to do? All those statements, commands, and functions, yet the interpreter untiringly plods through your code, always knowing what's required next. Just how does it do it?
 
Well, if the answer were simple, everybody would be writing Basic interpreters and putting Microsoft out of business. But it is possible to follow at least some of the pathways Basic uses to perform its duties.
 
Most addresses of the ROM routines which comprise Basic are held in two tables. These can be unravelled to give a list of routines used to perform various tasks.
 
== JUMP ADDRESSES ==
 
One table contains jump addresses for the commands (or verbs, as it were) which will always be the first thing the interpreter picks up from a statement. The whole repertoire of such commands is catered for by the table located at 0262 hex.
 
Basic gets the appropriate jump address by using the token number for the command it's about to execute. All tokens are numbers from 128 to 255; therefore subtracting 128 gives numbers from zero to 127. As each jump address in the table is two bytes long, the token (minus 128) is multiplied by two to give an offset into the table. This points straight at the address which is needed. The two-byte address is picked up and jumped to, and we're now executing a Basic command in pure machine code.
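The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the lookup, not the ROM's own code: the function names are hypothetical, the <code>rom</code> argument stands for a ROM image you would have to supply yourself, and reading the entry low byte first is an assumption based on the usual 8085 byte order.

```python
# Sketch of the table lookup described in the article (not the ROM's actual code).
STATEMENT_TABLE = 0x0262  # table address given in the article
FIRST_TOKEN = 0x80        # tokens run from 128 to 255


def table_entry_address(token: int) -> int:
    """Return the ROM address of the two-byte table entry for a token."""
    if not FIRST_TOKEN <= token <= 0xFF:
        raise ValueError("tokens are numbers from 128 to 255")
    # Subtract 128, then multiply by two because each entry is two bytes long.
    return STATEMENT_TABLE + 2 * (token - FIRST_TOKEN)


def jump_address(rom: bytes, token: int) -> int:
    """Read the two-byte entry, assuming it is stored low byte first."""
    entry = table_entry_address(token)
    return rom[entry] | (rom[entry + 1] << 8)
```

For the token A3 hex, the offset is 2 x 23 hex = 46 hex, so the entry would sit at 02A8 hex. Whether every token has an entry in this one table is not something the sketch checks.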
 
What happens next depends entirely on the machine code for the command itself. Various syntaxes are allowed for some commands but not for others. For instance, the <code>PRINT</code> command would allow an expression such as <code>TAB(22);1/3</code>, and so would <code>LPRINT</code>. But a <code>LET</code> would have none of that: <code>LET X = TAB(22); 1/3</code> would have you on the carpet in no time.
 
Also, various combinations of tokens can do different things. The comparison operators, for example, can be used pretty much interchangeably. These operators (>=, <>, =<, etc.) are all OK syntactically. This versatility means a table for such a wide set of possibilities is nigh impossible.
 
== SECOND TABLE ==
 
However, there is a second table at location 004E in the ROM. This contains many addresses used in the evaluation of Basic math functions and expressions. These are extracted and jumped to in a similar fashion to the first table.
 
Any BASIC word excluded from either of these tables is handled separately by the interpreter according to its particular use. However, out of a possible 128 tokens, these two tables give us a mechanism by which we can follow the machine-code execution of many of them. It is the combination of these routines and the syntax checking required to logically execute them that makes up an interpreter.
 
The following list has been compiled from the two tables I've described and a disassembly of other parts of the ROM. It shows the entry points for all important BASIC statements and functions. Certain functions can have more than one possible syntactic use and the list does not cover all such uses. (An example is the statement <code>OFF</code>, which can be <code>SOUND OFF</code> or <code>MOTOR OFF</code>, etc.) The list is in four columns. The first is the address in ROM where the BASIC word occurs in the vocabulary table. The second entry is the word itself. Third is the token assigned to that word when it is encoded by the Basic interpreter.
 
== FOURTH COLUMN ==
 
The fourth column contains the address the interpreter jumps to in order to execute the token representing the statement or function desired. Once again, some statements can have more than one use, such as <code>MID$(LH$)=RH$</code> and <code>LH$=MID$(RH$)</code>. In these cases, two addresses are given: one for use on the left-hand side of the equals sign and one for the right-hand side.
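If you want to work with the list in a program, each row can be split into its four columns. The following Python sketch assumes rows written as <code>VOCAB => WORD = TOKEN @ ADDRESS</code> with an optional second address; the <code>Keyword</code> record and its field names are hypothetical, and it only handles rows whose addresses are plain hex numbers. Which of two addresses serves the left-hand side is taken here from the order in which the article mentions them, which is an assumption.

```python
from typing import NamedTuple, Optional


class Keyword(NamedTuple):
    vocab_addr: int            # column 1: where the word sits in the vocabulary table
    word: str                  # column 2: the BASIC word itself
    token: int                 # column 3: the token assigned to the word
    entry: int                 # column 4: address the interpreter jumps to
    alt_entry: Optional[int] = None  # second address, when a word has two uses


def parse_row(row: str) -> Keyword:
    """Split one 'VOCAB => WORD = TOKEN @ ADDR [ADDR2]' row into its columns."""
    vocab, rest = row.split("=>", 1)
    word_and_token, addresses = rest.rsplit("@", 1)
    # Split on the last '=' so that the word '=' itself survives intact.
    word, token = word_and_token.rsplit("=", 1)
    entries = [int(a, 16) for a in addresses.split()]
    alt = entries[1] if len(entries) > 1 else None
    return Keyword(int(vocab, 16), word.strip(), int(token, 16), entries[0], alt)
```

For example, <code>parse_row("025C => MID$ = FE @ 2AC2 29E6")</code> yields a record with both addresses filled in, while a single-address row leaves <code>alt_entry</code> empty.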
 
Perhaps unsurprisingly, things get a little more complicated with the mathematical functions in BASIC. It's not simply a matter of taking an address for, say, a multiply routine and then jumping to it. The BASIC interpreter has to know the type of the operands it has to work on. For instance, with the addition operator, Basic has four choices: signed integer, single precision, double precision, and string. None of the other binary operators allow string manipulation, so they're limited to the numeric variable types only.
 
The addresses of these binary operators can be confirmed (if you need confirmation) from three short tables in ROM, one each for double precision, single precision, and integer numbers respectively.
 
The tables contain six addresses apiece for addition, subtraction, multiplication, division, exponentiation, and comparison. Rather than clutter the token-addresses table, these addresses are listed separately at the end.
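The idea of one routine per operator per numeric type can be pictured as a small two-way lookup. The Python sketch below uses the addresses from the table at the end of this article; the type names, the <code>routine_for</code> helper, and the reading of the fifth column as exponentiation are my own assumptions for illustration, and the real interpreter naturally does this in machine code rather than with a dictionary.

```python
# Column order follows the article: add, subtract, multiply, divide,
# exponentiate, compare.
OPERATORS = ("+", "-", "*", "/", "^", "cmp")

# Addresses transcribed from the table at the end of the article.
ROUTINES = {
    "double":  (0x2B78, 0x2B69, 0x2CFF, 0x2DC7, 0x3D8E, 0x34FA),
    "single":  (0x37F4, 0x37FD, 0x3803, 0x380E, 0x3D7F, 0x3498),
    "integer": (0x3704, 0x36F8, 0x3725, 0x0F0D, 0x3DF7, 0x34C2),
}


def routine_for(num_type: str, op: str) -> int:
    """Pick the ROM routine for an operator, given the operand type."""
    return ROUTINES[num_type][OPERATORS.index(op)]
```

So a single-precision multiply would go to <code>routine_for("single", "*")</code>, that is 3803 hex, while the same operator on integers goes to 3725 hex. Strings are left out because only addition and comparison apply to them.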
 
In a following article, I'll be looking at ways to use some of these addresses in your own machine-code programs. For the more adventurous, an experiment will probably prove irresistible. Remember, though, in a RAM-file machine such as the Model 100, a lock-up may cost you all your files. Use caution.
 
JAKE'S ROM ADDRESSES FOR BASIC KEYWORDS

0080 => END = 80 @ 409F

0083 => FOR = 81 @ 0726

0086 => NEXT = 82 @ 4174

008A => DATA = 83 @ 099E

008E => INPUT = 84 @ 0CA3

0093 => DIM = 85 @ 478B

0096 => READ = 86 @ 0CD9

009A => LET = 87 @ 09C3

009D => GOTO = 88 @ 0936

00A1 => RUN = 89 @ 090F

00A4 => IF = 8A @ 0B1A

00A6 => RESTORE = 8B @ 407F

00AD => GOSUB = 8C @ 091E

00B2 => RETURN = 8D @ 0966

00B8 => REM = 8E @ 09A0

00BB => STOP = 8F @ 409A

00BF => WIDTH = 90 @ 1DC3

00C4 => ELSE = 91 @ 09A0

00C8 => LINE = 92 @ 0C45

00CC => EDIT = 93 @ 5E51

00D0 => ERROR = 94 @ 0B0F

00D5 => RESUME = 95 @ 0AB0

00DB => OUT = 96 @ 110C

00DE => ON = 97 @ 0A2F

00E0 => DSKO$ = 98 @ 5071

00E5 => OPEN = 99 @ 4CCB

00E9 => CLOSE = 9A @ 4E20

00EE => LOAD = 9B @ 4D70

00F2 => MERGE = 9C @ 4D71

00F7 => FILES = 9D @ 1F3A

00FC => SAVE = 9E @ 4DCF

0100 => LFILES = 9F @ 506F
 
0106 => LPRINT = A0 @ 0B4E

010C => DEF = A1 @ 0872

010F => POKE = A2 @ 128B

0113 => PRINT = A3 @ 0B56

0118 => CONT = A4 @ 40DA

011C => LIST = A5 @ 1140

0120 => LLIST = A6 @ 113B

0125 => CLEAR = A7 @ 40F9

012A => CLOAD = A8 @ 2377

012F => CSAVE = A9 @ 2280

0134 => TIME$ = AA @ 19AB 1904

0139 => DATE$ = AB @ 19BD 1924

013E => DAY$ = AC @ 19F1 1955

0142 => COM = AD @ 1A9E

0145 => MDM = AE @ 1A9E

0148 => KEY = AF @ 1BB8

014B => CLS = B0 @ 4231

014E => BEEP = B1 @ 4229

0152 => SOUND = B2 @ 1DC5

0157 => LCOPY = B3 @ 1E5E

015C => PSET = B4 @ 1C57

0160 => PRESET = B5 @ 1C66

0166 => MOTOR = B6 @ 1DEC

016B => MAX = B7 @ 7F0B 19DB

016E => POWER = B8 @ 1419

0173 => CALL = B9 @ 1DFA

0177 => MENU = BA @ 5797

017B => IPL = BB @ 1A78

017E => NAME = BC @ 2037

0182 => KILL = BD @ 1F91

0186 => SCREEN = BE @ 1E22

018C => NEW = BF @ 20FE

018F => TAB( = C0 @ 0C01

0193 => TO = C1 @ 076B

0195 => USING = C2 @ 4991

019A => VARPTR = C3 @ 0F7E

01A0 => ERL = C4 @ 0F56

01A3 => ERR = C5 @ 0F47

01A6 => STRING$ = C6 @ 296D

01AD => INSTR = C7 @ 2A37

01B2 => DSKI$ = C8 @ 5073

01B7 => INKEY$ = C9 @ 4BEA

01BD => CSRLIN = CA @ 1D90

01C3 => OFF = CB @ various

01C6 => HIMEM = CC @ 1DB9

01CB => THEN = CD @ 0B2A

01CF => NOT = CE @ 1054

01D2 => STEP = CF @ 0783

01D6 => + = D0 @ see table at end

01D7 => - = D1 @ see table at end

01D8 => * = D2 @ see table at end

01D9 => / = D3 @ see table at end

01DA => ^ = D4 @ see table at end

01DB => AND = D5 @ 1097

01DE => OR = D6 @ 108C

01E0 => XOR = D7 @ 10A2

01E3 => EQV = D8 @ 10AD

01E6 => IMP = D9 @ 10B5

01E9 => MOD = DA @ 37DF

01EC => \ = DB @ 377E

01ED => > = DC @ 0E29

01EE => = = DD @ 0E29

01EF => < = DE @ 0E29

01F0 => SGN = DF @ 3407

01F3 => INT = E0 @ 3654

01F6 => ABS = E1 @ 33F2

01F9 => FRE = E2 @ 2B4C

01FC => INP = E3 @ 1100

01FF => LPOS = E4 @ 10C8

0203 => POS = E5 @ 10CE

0206 => SQR = E6 @ 305A

0209 => RND = E7 @ 313E

020C => LOG = E8 @ 2FCF

020F => EXP = E9 @ 30A4

0212 => COS = EA @ 2EEF
 
0215 => SIN = EB @ 2F09

0218 => TAN = EC @ 2F58

021B => ATN = ED @ 2F71

021E => PEEK = EE @ 1284

0222 => EOF = EF @ 1889

0225 => LOC = F0 @ 506D

0228 => LOF = F1 @ 506B

022B => CINT = F2 @ 3501

022F => CSNG = F3 @ 352A

0233 => CDBL = F4 @ 35BA

0237 => FIX = F5 @ 3645

023A => LEN = F6 @ 2943

023D => STR$ = F7 @ 273A

0241 => VAL = F8 @ 2A07

0244 => ASC = F9 @ 294F

0247 => CHR$ = FA @ 295F

024B => SPACE$ = FB @ 298E

0251 => LEFT$ = FC @ 29AB

0256 => RIGHT$ = FD @ 29DC

025C => MID$ = FE @ 2AC2 29E6

0260 => ' = FF @ 0A90
 
 
        +     -     *     /     ^     Cmpr
 D.P.   2B78  2B69  2CFF  2DC7  3D8E  34FA
 S.P.   37F4  37FD  3803  380E  3D7F  3498
 INT    3704  36F8  3725  0F0D  3DF7  34C2
 String 28CC                          270C
 
[[Category:Model 100 Classics]]
