Interfacing Assembly Language Routines with dBASE
by Ralph Davis

Creating Assembler Programs with DEBUG

DEBUG is the assembly language programmer's best friend. It is a powerful tool for exploring the computer's memory, testing assembly language programs, studying program listings, and creating new programs. Additionally, it can be used to rebuild corrupted data files, convert hidden files to accessible files, or simply analyze file structures. Our main interest in DEBUG here is to create assembly language routines for use with dBASE II and dBASE III.

It is tempting to use DEBUG because of its interpreter-like qualities. You can quickly enter code and then see if it works. If it does, you give it a .COM extension and write it to disk. If it doesn't, you trace through the old code, enter new code, and try again. Eventually, through trial and error, you come up with a program that works. However, this can lead to sloppy programming habits and inefficient code, so it is important to bear in mind what you want a particular program to accomplish.

DEBUG has some limitations. Most importantly, it only recognizes absolute addresses. When you write a program for submission to an assembler, you label the instructions and data you will need to refer to, then refer to them by label. You don't need to know the actual addresses. DEBUG, on the other hand, obliges you to look through your program listing and find the addresses whenever you refer to them. For instance, instead of entering JMP EXIT, you must enter JMP 02FC. Instead of CALL HEXPRINT, you use CALL 05AE. Instead of MOV BX, OFFSET DATA, you need MOV BX, 0105.

If your routine is small, this does not present a problem. But as you add features and the routine grows, it becomes a serious impediment. If you add or alter instructions, thereby changing an absolute address, you have to change every reference to that address. And the only way to find the references is to page through the entire program, line by line.
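The cost of absolute addressing described above can be made concrete with a small model. The following Python sketch is not part of DEBUG or of any assembler; `resolve` and the instruction table are hypothetical names used only for illustration. It computes where a label falls when a program starts at 100H, then shows how inserting a one-byte PUSH AX moves EXIT. This is the bookkeeping an assembler does for you and DEBUG leaves to you.

```python
# Toy model of an assembler's label table (illustrative only, not a real tool).
# Each entry is (label or None, instruction text, size in bytes).
# The sizes are the standard 8086 encodings of these particular instructions.

def resolve(program, origin=0x100):
    """Return {label: absolute offset} for a list of (label, text, size)."""
    addresses = {}
    offset = origin
    for label, _text, size in program:
        if label is not None:
            addresses[label] = offset
        offset += size
    return addresses

program = [
    (None,   "MOV CX,[BX]", 2),
    (None,   "MOV AH,1",    2),
    (None,   "INT 10",      2),
    ("EXIT", "INT 20",      2),
]

before = resolve(program)

# Insert one more instruction ahead of EXIT, as you might while editing.
edited = program[:1] + [(None, "PUSH AX", 1)] + program[1:]
after = resolve(edited)

print("EXIT before edit: %04X" % before["EXIT"])
print("EXIT after edit:  %04X" % after["EXIT"])
```

With an assembler, a JMP EXIT elsewhere in the source would follow the label to its new offset. In DEBUG, every JMP that named the old address would have to be found and re-entered by hand.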

For this reason, DEBUG is best for creating short utility programs. Most often, programs created with DEBUG use BIOS or DOS interrupts to manipulate the hardware. Some typical functions that appear in this issue are setting the cursor (see the example on page 4-72C of the Developer's Release Reference Manual and the program listed in this issue), manipulating the shift keys, and swapping printer ports. Programs of this type should not contain any subroutines.

DEBUG has another important limitation: it only understands hexadecimal numbers. There is simply nothing you can do to make it accept decimal numbers. This is not a problem when entering addresses or interrupt numbers, as most assembly language programmers think of these values in hexadecimal anyway. But very few programmers think in hex when doing calculations. DEBUG is therefore not a good tool for number-crunching of even intermediate complexity. Although there are utilities available to assist in this process, such as Sidekick, this is still a major obstacle to doing extensive calculations within DEBUG.

Another problem with DEBUG is that code produced with it can be extremely obscure. Trying to decipher the flow of a program where you have only absolute addresses and hexadecimal numbers to guide you can be very frustrating. In addition, DEBUG does not support comments. So when you read a DEBUG listing, you are, for all intents and purposes, reading "machine English." The machine expresses its own language in cryptic English-like symbols, making a few grudging concessions to your desire to understand it.

All of this reinforces what we suggested earlier: keep DEBUG routines short.

The program from the Developer's Release Reference Manual mentioned above is a good example of a program appropriate for DEBUG. The listing on page 4-72C is as follows:

    _PROG   SEGMENT BYTE
            ASSUME  CS:_PROG
    ;
    CURSOR  PROC    FAR         ; Force a far return.
    ;
            MOV     CX,[BX]     ; Get two HEX digits.
            MOV     AH,1        ; Set cursor type.
            INT     10H         ; Video interrupt.
            RET                 ; Do a far return.
    ;
    CURSOR  ENDP
    ;
    _PROG   ENDS
            END

This is a terse routine that converts the dBASE III cursor to a full-sized box when CHR(18) is passed to it as a parameter. Notice one thing about this code: it has six lines of assembler directives (the first three and the last three), and only four lines of machine instructions. In a short program like this one, there is no advantage to assembling, linking, and converting it using MASM, LINK, and EXE2BIN. DEBUG is faster and easier.

Here is a DEBUG session that enters this program as a .COM file. (The DEBUG commands are explained in Chapter 8 of the PC/MS-DOS manual. Page numbers which follow refer to it.)

    D>debug

First give DEBUG the 'A' (assemble) command (page 8-15) and enter the program.

    -A
    6257:0100 MOV CX,[BX]
    6257:0102 MOV AH,1
    6257:0104 INT 10
    6257:0106 INT 20
    6257:0108

Notice that 'INT 20' is our last instruction, not 'RET' as the manual indicates. We will explain this shortly.

The address following the last instruction is 108. The program begins at 100, so it is eight bytes long. Therefore, enter eight into CX using the 'R' (register) command (page 8-41). This tells DEBUG the number of bytes to write to disk.

    -RCX
    CX 0000
    :8

Name the program CURSOR.COM using the 'N' command (page 8-37), and write it to disk using 'W' (page 8-55).

    -NCURSOR.COM
    -W
    Writing 0008 bytes

This is the basic procedure for creating a .COM file from DEBUG.

CURSOR.COM will yield unpredictable results when executed from PC/MS-DOS, since the registers are not preserved and we have no way of knowing what is being passed in DS:BX. (When we tested it, the cursor simply vanished.) Nor, in its present form, will it work in dBASE III. It needs a couple of changes to make it work there, and the reason deserves some attention. PC/MS-DOS .COM files and dBASE LOAD modules require slightly different specifications.
A .COM file must be ORGed (originated) at address 100H, and it must end with a command like INT 20H (terminate) or INT 27H (terminate and stay resident); a simple RET will not return correctly. dBASE III, on the other hand, requires LOAD modules to be ORGed at address 0 and to return to dBASE III with a far return, RETF. If you load a conventional .COM file, ORGed at 100H and terminated with INT 20H, into dBASE III, and then call it, you will lock the system, even if it works from PC/MS-DOS.

When DEBUG writes a program to disk, it writes a binary file -- that is, a file which contains nothing but the machine instructions you have given it. Therefore, we need not concern ourselves with ORGing programs correctly at this stage. We do have to terminate LOAD modules with RETF, however.

Here is a DEBUG session that enters this program as a .BIN file which will execute from dBASE III.

    D>debug

Type 'A' for assemble. Terminate with a RETF.

    -A
    6346:0100 MOV CX,[BX]
    6346:0102 MOV AH,1
    6346:0104 INT 10
    6346:0106 RETF
    6346:0107

Place the number 7 in the CX register to save 7 bytes to disk.

    -RCX
    CX 0000
    :7

Name the file, and write it.

    -NCURSOR.BIN
    -W
    Writing 0007 bytes

Quit DEBUG.

    -Q

The page of the Developer's Release Manual referred to above gives the following example of how to use Cursor:

    LOAD Cursor
    STORE CHR(18) TO shape
    CALL Cursor WITH shape

The commands to convert the cursor back to its normal format are:

    LOAD Cursor
    STORE CHR(12) + CHR(11) TO shape
    CALL Cursor WITH shape

.COM Files vs. .EXE Files

When creating programs with a full-featured assembler, we have two options: .COM files and .EXE files. Each has advantages and disadvantages.

.COM files are an inheritance from the world of 8-bit CP/M. They are your only option if you have a CP/M machine. .COM files must adhere to a strictly defined structure.

1. They must fit entirely within one segment. All segment registers must point to the same address, and cannot be changed during the execution of the program. This means that all of our main program, subroutines, and data must fit in 64K. A 64K .COM file is a very large program -- each line of code assembles to between 1 and 6 bytes, so a 64K .COM file could have as many as 30,000 lines of source code.

2. They must be ORGed at 100H. When PC/MS-DOS loads a .COM file, it jumps to CS:100H and begins executing.

3. They must return control to their calling routine with either INT 20H or INT 27H, or the equivalent INT 21H function calls, 4CH and 31H.

.COM files load more quickly than .EXE files, since no addresses need to be calculated at load time.

The assembly language programs that dBASE II and dBASE III can execute as subroutines (with the CALL command) are variations of the .COM file. We will discuss the specifics of their formats later.

.EXE files are less limited structurally. The segment registers can be freely manipulated, and each one can point to an entirely different 64K segment. .EXE files can therefore be much larger than .COM files. .EXE files were designed to take better advantage of the actual architecture of 16-bit 8086-based microprocessors. Having data in one segment, code in another, and the stack in a third allows much greater utilization of the memory space available in today's machines. It also gives us the semblance of structured programming in assembly language. The SEGMENT, PROC, ENDS, and ENDP operators give a program listing a much more organized appearance than it has with JMP and DB statements interspersed throughout the code.

.EXE files take longer to load than .COM files, as many of the absolute addresses are not computed until load time. They also take up more disk space than .COM files. However, since they use much more of the 8086 family's capabilities, they can be much more powerful programs. The commercial programs which were handed down from the CP/M world are all .COM files, whereas those which were created since the advent of 16-bit machines are mostly .EXE files.
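As a rough check on the 64K and 30,000-line figures given for .COM files earlier in this section, the arithmetic can be sketched in a few lines of Python. The constant and variable names are hypothetical. The sketch also assumes, beyond what the article states, that the 100H bytes below the ORG address are unavailable to the program and that no space is reserved for the stack, so treat the results as approximate.

```python
# Rough check of the "30,000 lines in a 64K .COM file" estimate.
# Assumption: the first 100H bytes of the segment are not available to the
# program, and no allowance is made for the stack that shares the segment.
SEGMENT_SIZE = 64 * 1024      # one 8086 segment, in bytes
ORIGIN = 0x100                # .COM code starts at offset 100H

usable = SEGMENT_SIZE - ORIGIN            # bytes left for code and data

fewest_lines = usable // 6                # if every line assembled to 6 bytes
most_lines = usable // 1                  # if every line assembled to 1 byte
bytes_per_line_at_30000 = usable / 30000  # average implied by 30,000 lines

print(usable, fewest_lines, most_lines, round(bytes_per_line_at_30000, 2))
```

On these assumptions the 30,000-line figure corresponds to an average of a little over two bytes per source line, which sits comfortably inside the 1-to-6-byte range quoted above.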

Having said this, we will leave .EXE files behind. You cannot LOAD .EXE files from dBASE II or dBASE III. You can execute them with QUIT TO in dBASE II or RUN(!) in dBASE III. If you want to pass parameters to and from .EXE files, you must pass them in text files (the SDF format is recommended).

Adapting Assembly Language Programs to dBASE II or III

As mentioned earlier, the format of a dBASE II or III assembly language subroutine most closely resembles that of a .COM file. Most importantly, it must reside in one segment. Since it is intended as a subroutine, not as a stand-alone program, it will differ somewhat from a standard .COM file.

For one thing, a .COM file must be ORGed at 100H. However, ORGing a dBASE (II or III) subroutine at 100H will cause it to fail. A program intended for use in dBASE II must be ORGed high in the code segment. The exact address depends on the version of dBASE II: the later the version, the higher the address. In version 2.43*, the ORG address should be above 61440 decimal. (See Robert Boies' article on swapping printer ports in the August issue of TechNotes for a good example of a dBASE II assembly language program.) A program intended for dBASE III must be ORGed at 0 (that is, it need not have an ORG statement).

Secondly, .COM files return to their caller with interrupts (usually 20H or 27H), whereas dBASE II and dBASE III routines require RET (return) -- near for dBASE II, far for dBASE III.

The procedure for converting assembly language source code into programs dBASE II or III can execute is as follows:

1. For dBASE II, you must assemble your program with an assembler that produces a file in Intel .HEX format. Intel's assemblers, ASM (for CP/M) and ASM86 (for CP/M-86), create such a file. For PC/MS-DOS, the Seattle Computer Products assembler generates a .HEX file. Refer to their manuals, as their assembly language syntax differs somewhat from Microsoft's and IBM's.

2. For dBASE III, use the IBM or Microsoft Macro-Assembler (MASM.EXE) to produce a .OBJ (object) file. Enter the command as follows:

       MASM ;

   The third parameter will cause MASM to produce a listing file with a .LST extension, which is very useful for debugging.

3. Use the linker utility (LINK.EXE) that comes both with PC/MS-DOS and with the assembler. This will create an .EXE file. The command is:

       LINK

   Press Return three times in response to the prompts.

4. Use EXE2BIN.EXE to convert the program to .COM or .BIN format. If you are creating a .BIN file, you need only enter one parameter in the command line:

       EXE2BIN

   If you are creating a .COM file, you need to specify the full target filename:

       EXE2BIN .COM

Using Conditional Assembler Directives

Because the differences between .COM files and .BIN files are minor, it is possible to generate both using the same source code. The following program skeleton shows how to set this up.

The EQU statements at the top inform the assembler whether we are assembling a program for PC/MS-DOS or dBASE III. In the present example, we have set COM equal to 0 (meaning false) and D3 equal to 1 (non-zero, meaning true). We then use conditional directives to tell the assembler how we want the program created. Conditional directives are statements in your assembly program that direct the assembler to assemble a block of instructions, or skip it, based on the value of a variable. For example, IF COM (if COM is not zero), ORG the program at offset 100H. Then at the end of the program, IF COM, exit with INT 20H; otherwise, exit with a far RET.

            .LFCOND                     ; List false conditionals,
            PAGE    60,132              ; page length 60, line 132.
    COM     EQU     0                   ; Assemble program as .BIN
    D3      EQU     1                   ; file for dBASE III.
    CODESEG SEGMENT BYTE PUBLIC 'CODE'
    ROUTINE PROC    FAR
            ASSUME  CS:CODESEG,DS:CODESEG
            IF      COM
            ORG     100H
            ENDIF
            PUSH    DS                  ; Make sure DS points to
            PUSH    CS                  ; the current
            POP     DS                  ; segment.
            .
            .
            (program goes here)
            .
            .
            POP     DS                  ; Restore caller's DS.
            IF      COM
            INT     20H                 ; INT 20H if .COM file.
            ELSE
            RET                         ; Far return if dBASE III.
            ENDIF
    ROUTINE ENDP
    CODESEG ENDS
            END

It is very important to load the DS register with the segment address contained in CS. PC/MS-DOS does this automatically for a .COM file, but dBASE III does not. Therefore, if your routine needs to access its own data, it will need to set DS correctly.

Sample Program With Conditional Assembly

Here is a program built on the skeletal structure which sets condensed print on an EPSON printer.

    ; Program ...: Printer.ASM
    ; Author ....: Ralph Davis
    ; Date ......: September 1, 1985
            TITLE   PRINTER.ASM -- sets condensed print
            .LFCOND
            PAGE    60,132
    COM     EQU     0
    D3      EQU     1
    CODESEG SEGMENT BYTE PUBLIC 'CODE'
    PRINTER PROC    FAR
            ASSUME  CS:CODESEG,DS:CODESEG
            IF      COM
            ORG     100H
            ENDIF
    START:  JMP     SHORT ENTRY         ; Jump past data.
    CODES   DB      27,64,27,15         ; Printer control codes.
    CODELEN EQU     $-CODES             ; Length of string.
    ENTRY:  PUSH    AX                  ; Save registers.
            PUSH    BX
            PUSH    DS
            PUSH    CS                  ; Set up DS
            POP     DS                  ; with current segment.
            PUSH    CX                  ; Save CX
            PUSH    DX                  ; and DX.
            MOV     BX,OFFSET CODES     ; Point BX to codes.
            MOV     CX,CODELEN          ; Length of string.
                                        ; Controls the loop.
    GET_CODE:
            MOV     DL,BYTE PTR [BX]    ; Get code to send.
            MOV     AH,5H               ; PC/MS-DOS function 5H,
            INT     21H                 ; (send char to printer).
            INC     BX                  ; Point to next code
            LOOP    GET_CODE            ; and print it.
            POP     DX                  ; Restore registers.
            POP     CX
            POP     DS
            POP     BX
            POP     AX
            IF      COM
            INT     20H                 ; INT 20H if .COM file.
            ELSE
            RET                         ; Far return to dBASE III.
            ENDIF
    PRINTER ENDP
    CODESEG ENDS
            END     START               ; End assembly.

Assemble this program according to the instructions given earlier. To run it from dBASE II or dBASE III versions 1.0 and 1.1, assemble it as a .COM file (with COM set to 1), and enter the following commands:

    dBASE II:   QUIT TO 'Printer'
    dBASE III:  RUN Printer

To run it from the Developer's Release of dBASE III, assemble it as a .BIN file, and use these commands:

    LOAD Printer
    CALL Printer
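
The effect of the IF COM ... ELSE ... ENDIF block at the end of a routine can be modeled outside the assembler. The Python sketch below is only an illustration, and `build_cursor` is a hypothetical helper name. It applies the same choice of ending to the short CURSOR routine from the DEBUG sessions earlier in the article, using the standard 8086 encodings of those instructions. The two results have the same lengths DEBUG reported when writing CURSOR.COM and CURSOR.BIN.

```python
# Illustrative model of IF COM / ELSE / ENDIF applied to the CURSOR routine.
# build_cursor is a hypothetical helper, not part of MASM, DEBUG, or dBASE.

BODY = bytes([
    0x8B, 0x0F,   # MOV CX,[BX]   ; get the two cursor bytes
    0xB4, 0x01,   # MOV AH,1      ; set cursor type
    0xCD, 0x10,   # INT 10H       ; video interrupt
])
INT_20H = bytes([0xCD, 0x20])     # terminate: ending for a .COM file
RETF = bytes([0xCB])              # far return: ending for a dBASE III module

def build_cursor(com):
    """Return the routine's bytes; com=True mirrors COM EQU 1."""
    return BODY + (INT_20H if com else RETF)

com_image = build_cursor(True)    # what CURSOR.COM contains
bin_image = build_cursor(False)   # what CURSOR.BIN contains

print(len(com_image), len(bin_image))
```

Because this routine contains no references to its own addresses, the ORG value does not change any of the bytes, and only the final instruction differs between the two versions. A routine that addresses its own data, like PRINTER above, would also differ in the offsets the assembler generates.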