Introduction
BBC BASIC includes a powerful inline assembler that lets you include Z80 machine code directly in your BBC BASIC programs. '[' enters assembler mode and ']' exits it. You can then call the assembled Z80 code from BASIC with the CALL or USR keywords.
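As a minimal sketch of the whole cycle (reserve memory, assemble, call), the program below assembles a three-instruction routine and runs it with CALL. The names 'code' and 'result' and the value 42 are illustrative assumptions, not part of the assembler.

```bbcbasic
REM Reserve 8 bytes for the code and 1 byte for the result
DIM code 7, result 0
P%=code
[OPT 2
LD A,42
LD (result),A
RET
]
CALL code
PRINT ?result
```

The routine has no forward references, so a single pass with OPT 2 (errors reported, no listing) is enough here.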
Statements
An assembly language statement consists of three elements: an optional label, an instruction and an operand. A comment may follow the operand field. The instruction following a label must be separated from it by at least one space. Similarly, the operand must be separated from the instruction by a space.
Statements are terminated by a colon (:) or end of line.
Labels
Any BBC BASIC (Z80) numeric variable may be used as a label. These (external) labels are defined by an assignment (count=23 for instance). Internal labels are defined by preceding them with a full stop. When the assembler encounters such a label, a BASIC variable is created containing the current value of the Program Counter (P%). (See The Program Counter for more information).
In the example shown later under the heading The Assembly Process, two internal labels are defined and used. Labels follow the same rules as standard BBC BASIC (Z80) variable names: they should start with a letter and must not start with a keyword.
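The difference between the two kinds of label can be sketched as follows. Here 'count' is an external label set by an ordinary assignment, and 'loop' is an internal label created by the assembler. Both names, and the size of the reserved area, are assumptions chosen for illustration.

```bbcbasic
count=23 : REM external label: an ordinary BASIC variable
DIM code 15
FOR opt=0 TO 2 STEP 2
P%=code
[OPT opt
LD B,count
.loop
DJNZ loop
RET
]
NEXT
PRINT ~loop : REM address given to the internal label, in hex
```

After assembly, 'loop' is an ordinary BASIC variable holding the value P% had when the label was met.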
Comments
You can insert comments into assembly language programs by preceding them with a semi-colon (;) or a back-slash (\). In assembly language, a comment ends at the end of the statement. Thus, the following example will work (but it's a bit untidy).
[;start assembly language program
etc
LD A,C ;In-line comment:POP BC ;start add
JR NZ,loop ;Go back if not finished:RET ;Return
etc
;end assembly language program:]
Language Syntax
All standard Zilog mnemonics are accepted, with the following provisos. ADD, ADC and SBC must be followed by A or HL: for example, ADD A,C is accepted but ADD C is not. The brackets around the port number in IN and OUT are optional, so both OUT (5),A and OUT 5,A are accepted. The instruction IN F,(C) is not accepted, but the equivalent code is produced from IN (HL),C.
Constants
You can store constants within your assembly language program using the define byte (DEFB), define word (DEFW) and define string (DEFM) pseudo-operation commands. These will create 1 byte, 2 byte or string items respectively.
Define Byte - DEFB
DEFB can be used to set one byte of memory to a particular value. For example,
.data DEFB 15
DEFB 9
will set two consecutive bytes of memory to 15 and 9 (decimal). The address of the first byte will be stored in the variable 'data'.
Define Word - DEFW
DEFW can be used to set two bytes of memory to a particular value. The first byte is set to the least significant byte of the number and the second to the most significant byte. For example,
.data DEFW &90F
will have the same result as the Define Byte - DEFB example.
Define String - DEFM
DEFM can be used to load a string of ASCII characters into memory. For example,
JP continue; jump round the data
.string DEFM "This is a test message"
DEFB &D
.continue; and continue the process
will load the string 'This is a test message' followed by a carriage-return into memory. The address of the start of the message is loaded into the variable 'string'. This is equivalent to the following program segment:
JP continue; jump round the data
.string; leave assembly and load the string
]
$P%="This is a test message"
REM starting at P%
P%=P%+LEN($P%)+1
REM adjust P% to next free byte
[ OPT opt; reset OPT
.continue; and continue the program
Reserving Memory
The Program Counter
Machine code instructions are assembled as if they were going to be placed in memory at the addresses specified by the program counter, P%. Their actual location in memory may be determined by O% depending on the OPTion specified. You must make sure that P% (or O%) is pointing to a free area of memory before your program begins assembly. In addition, you need to reserve the area of memory that your machine code program will use so that it is not overwritten at run time. You can reserve memory by using a special version of the DIM statement or by changing HIMEM or LOMEM.
Using DIM to Reserve Memory
Using the special version of the DIM statement to reserve an area of memory is the simplest way for short programs which do not have to be located at a particular memory address. (See the keyword DIM for more details.) For example,
DIM code 20: REM Note the absence of brackets
will reserve 21 bytes of memory (byte 0 to byte 20) and load the variable 'code' with the start address of the reserved area. You can then set P% (or O%) to the start of that area. The example below reserves an area of memory 100 bytes long and sets P% to the first byte of the reserved area.
DIM sort% 99
P%=sort%
Moving HIMEM to Reserve Memory
If you are going to use a machine code program in a number of your BBC BASIC (Z80) programs, the simplest way is to assemble it once, save it using *SAVE and load it from each of your programs using *LOAD. For this to work, the machine code program must be loaded at the same address each time. The most convenient way to arrange this is to move HIMEM down by the length of the program and load the machine code program into this protected area. Theoretically, you could raise LOMEM to provide a similar protected area below your BBC BASIC (Z80) program. However, altering LOMEM destroys all your dynamic variables and is riskier.
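A hedged sketch of the loading side is shown below. It assumes the routine is no longer than 256 bytes, that it was previously saved in a file called SORTMC (a hypothetical name), that it was assembled to run at the address HIMEM takes after being lowered, and that its entry point is its first byte. The exact argument format of *LOAD depends on the host system, so treat the OSCLI line as an assumption to check against your own system's documentation.

```bbcbasic
REM Do this at the very start of the program
HIMEM=HIMEM-256 : REM protect 256 bytes at the top of memory
REM Build the *LOAD command with the load address in hex
OSCLI "LOAD SORTMC "+STR$~(HIMEM)
CALL HIMEM : REM assumed entry point: first byte of the routine
```

OSCLI is used rather than a literal *LOAD line because the load address is worked out at run time.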
Length of Reserved Memory
You must reserve an area of memory which is sufficiently large for your machine code program before you assemble it, but you may have no real idea how long the program will be until after it is assembled. How then can you know how much memory to reserve? Unfortunately, the answer is that you can't. However, you can add code to your program to find the length used and then change the amount of memory reserved by the DIM statement to match.
In the example below, a large amount of memory is initially reserved. To begin with, a single pass is made through the assembly code and the length needed for the code is calculated (lines 100 to 120). After a CLEAR, the correct amount of memory is reserved (line 140) and a further two passes of the assembly code are performed as usual. Your program should not, of course, subsequently try to use variables set before the CLEAR statement. If you use a similar structure to the example and place the program lines which initiate the assembly function at the start of your program, you can place your assembly code anywhere you like and still avoid this problem.
100 DIM free -1, code HIMEM-free-1000
110 PROC_ass(0)
120 L%=P%-code
130 CLEAR
140 DIM code L%
150 PROC_ass(0)
160 PROC_ass(2)
- - -
Put the rest of your program here.
- - -
10000 DEF PROC_ass(opt)
10010 P%=code
10020 [OPT opt
- - -
Assembler code program.
- - -
11000 ]
11010 ENDPROC
Initial Setting of the Program Counter
The program counters, P% and O%, are initialised to zero. Using the assembler without first setting P% (and O%) is liable to corrupt BBC BASIC (Z80)'s workspace and possibly result in a crash (see the appendix entitled Format of Program and Variables in Memory).
The Assembly Process
OPT
The only assembly directive is OPT. It controls the way the assembler works, whether a listing is displayed and whether errors are reported. OPT should be followed by a number in the range 0 to 7. The way the assembler functions is controlled by the three bits of this number in the following manner.
Bit 0 - LSB
Bit 0 controls the listing. If it is set, a listing is displayed.
Bit 1
Bit 1 controls the error reporting. If it is set, errors are reported.
Bit 2
Bit 2 controls where the assembled code is placed. If bit 2 is set, code is placed in memory starting at the address specified by O%. However, the program counter (P%) is still used by the assembler for calculating the instruction addresses.
Assembly at a Different Address
In general, machine code will only run properly if it is in memory at the addresses for which it was assembled. Thus, at first glance, the option of assembling it in a different area of memory is of little use. However, using this facility, it is possible to build up a library of machine code utilities for use by a number of programs. The machine code can be assembled for a particular address by one program without any constraints as to its actual location in memory and saved using *SAVE. This code can then be loaded into its working location from a number of different programs using *LOAD.
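The arrangement described above can be sketched as follows. The run-time address &C000, the buffer size and the label names are assumptions chosen for illustration. P% supplies the addresses used inside the instructions, while O% says where the bytes are actually stored.

```bbcbasic
DIM buffer 255 : REM somewhere safe to hold the assembled bytes
target=&C000 : REM hypothetical address the code will run at
FOR opt=4 TO 6 STEP 2
P%=target : REM addresses are calculated from here
O%=buffer : REM bytes are stored from here
[OPT opt
.entry JP next
.next RET
]
NEXT
PRINT "Length: ";O%-buffer
```

After assembly, the bytes starting at 'buffer' are ready to be saved, and the JP instruction holds an address relative to 'target', not to 'buffer'.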
OPT Summary
Code Assembled Starting at P%
The code is assembled using the program counter (P%) to calculate the instruction addresses and the code is also placed in memory at the address specified by the program counter.
- OPT 0 reports no errors and gives no listing.
- OPT 1 reports no errors, but gives a listing.
- OPT 2 reports errors, but gives no listing.
- OPT 3 reports errors and gives a listing.
Code Assembled Starting at O%
The code is assembled using the program counter (P%) to calculate the instruction addresses. However, the assembled code is placed in memory at the address specified by O%.
- OPT 4 reports no errors and gives no listing.
- OPT 5 reports no errors, but gives a listing.
- OPT 6 reports errors, but gives no listing.
- OPT 7 reports errors and gives a listing.
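Each OPT value listed above is simply the sum of the weights of the bits that are set (1 for the listing, 2 for error reporting, 4 for assembly at O%). A minimal sketch, using three hypothetical flag variables, builds the value from its parts:

```bbcbasic
listing%=TRUE : errors%=TRUE : use_o%=FALSE
opt=0
IF listing% opt=opt+1 : REM bit 0: display a listing
IF errors% opt=opt+2 : REM bit 1: report errors
IF use_o% opt=opt+4 : REM bit 2: place code at O%
PRINT opt
```

With the flags as set here, opt is 3, which matches 'OPT 3 reports errors and gives a listing'.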
How the Assembler Works
The assembler works line by line through the machine code. When it finds a label declared it generates a BBC BASIC variable with that name and loads it with the current value of the program counter (P%). This is fine as long as labels are declared before they are used. However, labels are often used for forward jumps, and no variable with that name exists when such a label is first encountered. When this happens, a 'No such variable' error occurs. If error reporting has not been disabled, this error is reported and BBC BASIC (Z80) returns to the direct mode in the normal way. If error reporting has been disabled (OPT 0, 1, 4 or 5), the current value of the program counter is used in place of the address which would have been found in the variable, and assembly continues. By the end of the assembly process the variable will exist (assuming the code is correct), but this is of little use since the assembler cannot 'back track' and correct the errors. However, if a second pass is made through the assembly code, all the labels will exist as variables and errors will not occur.
The example below shows the result of two passes through a (completely futile) demonstration program. Ten bytes of memory are reserved for the program. (If the program was run, it would 'doom-loop' from line 50 to 70 and back again.) The program disables error reporting on the first pass by using OPT 1; the second pass uses OPT 3.
10 DIM code 9
20 FOR opt=1 TO 3 STEP 2
30 P%=code
40 [OPT opt
50 .jim JP fred
60 DEFW &2345
70 .fred JP jim
80 ]
90 NEXT
This is the first pass through the assembly process (note that the 'JP fred' instruction jumps to itself):
C07A                 OPT opt
C07A C3 7A C0 .jim   JP fred
C07D 45 23           DEFW &2345
C07F C3 7A C0 .fred  JP jim
This is the second pass through the assembly process (note that the 'JP fred' instruction now jumps to the correct address):
C07A                 OPT opt
C07A C3 7F C0 .jim   JP fred
C07D 45 23           DEFW &2345
C07F C3 7A C0 .fred  JP jim
Generally, if labels have been used, you must make two passes through the assembly language code to resolve forward references. This can be done using a FOR...NEXT loop. Normally, the first pass should be with OPT 0 (or OPT 4) and the second pass with OPT 2 (OPT 6). If you want a listing, use OPT 3 (OPT 7) for the second pass. During the first pass, a table of variables giving the address of the labels is built. Labels which have not yet been included in the table (forward references) will generate the address of the current op-code. The correct address will be generated during the second pass.
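The recommended arrangement (a silent first pass with OPT 0, then OPT 2 for the second) can be sketched as a reusable skeleton. The label 'skip' and the size of the reserved area are illustrative assumptions.

```bbcbasic
DIM code 31
FOR opt=0 TO 2 STEP 2
P%=code : REM reset the program counter on every pass
[OPT opt
JR skip ; forward reference, resolved on the second pass
DEFB 0
.skip
RET
]
NEXT
```

Changing the loop to FOR opt=0 TO 3 STEP 3 gives a listing on the second pass. Setting P% inside the loop matters: without it, the second pass would carry on from where the first pass finished.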
Conditional Assembly and Macros
Introduction
Most machine code assemblers provide conditional assembly and macro facilities. The assembler does not directly offer these facilities, but it is possible to implement them by using other features of BBC BASIC (Z80).
Conditional Assembly
You may wish to write a program which makes use of special facilities and which will be run on different types of computer. The majority of the assembly code will be the same, but some of it will be different. In the example below, different output routines are assembled depending on the value of 'flag'.
DIM code 200
FOR pass=0 TO 3 STEP 3
P%=code
[OPT pass
.start
- - - - - - code - - - - - - :]
:
IF flag [OPT pass: - code for routine 1 -:]
IF NOT flag [OPT pass: - code for routine 2 - :]
:
[OPT pass
.more_code
- - - - - - code - - - - - -:]
NEXT
Macros
Within any machine code program it is often necessary to repeat a section of code a number of times, and this can become quite tedious. You can avoid this repetition by defining a macro which you use every time you want to include the code. The example below uses a macro to sign-extend the 8-bit value in A to 16 bits. Conditional assembly is used within the macro to select the destination register pair (HL or DE) depending on the value of op_flag.
It is possible to suppress the listing of the code in a macro by forcing bit 0 of OPT to zero for the duration of the macro code. This can most easily be done by ANDing the value passed to OPT with 6. This is illustrated in PROC_extend_hl and PROC_extend_de in the example below.
DIM code 200
op_flag=TRUE
FOR pass=0 TO 3 STEP 3
P%=code
[OPT pass
.start
- - - - - - code - - - - - -
:
OPT FN_select(op_flag); Include code depending on op_flag
:
- - - - - - code - - - - - -:]
NEXT
END
:
:
REM Include code depending on value of op_flag
:
DEF FN_select(op_flag)
IF op_flag PROC_extend_hl ELSE PROC_extend_de
=pass
REM Return original value of OPT. This is a
REM bit artificial, but necessary to insert
REM some BBC BASIC (Z80) code in the assembly code.
:
DEF PROC_extend_hl
[OPT pass AND 6
LD L,A
ADD A,A
SBC A,A
LD H,A:]
ENDPROC
:
DEF PROC_extend_de
[OPT pass AND 6
LD E,A
ADD A,A
SBC A,A
LD D,A:]
ENDPROC
Using a function call provides a neat way of incorporating the macro within the program and allows parameters to be passed to it. The function should return the original value of OPT.