The BASIC/Assembly Language Connection (Personal Computer Age March 1984 by Dan Rollins) With all its variables, arrays, strings, control structure and flexible graphics and I/O commands, BASIC is a very convenient programming language. But BASIC programs tend to execute slowly. Assembly language is very fast, but it is not convenient to use, nor is it easy to write. Fortunately, we don't necessarily need to choose between the two. We can write hybrid programs that are mostly written in BASIC, but which use machine language subroutines to speed up the time-critical parts of an application. The commands PEEK, POKE, BLOAD, CLEAR, CALL, and USR are provided in the BASIC language for just that purpose. There is an entire appendix in the BASIC manual describing how to make the connection, but a few short examples will go a long way toward explaining these concepts. The first step is to decide which part of your program is to be coded in assembly language. Is there any function essential to your program that BASIC doesn't supply? You might consider writing a routine that upshifts (forces into uppercase) all the characters in a string. Or perhaps you need to emulate the Applesoft calls that "clear to end of line" and "clear to end of screen." The ROM BIOS contains more powerful "window" management routines but BASIC has no commands to make use of them. BASIC supplies no way to change the default drive, but that can be accomplished by invoking a DOS service -- in all of 10 bytes of machine code. Thus, you can use machine language to add new functions to BASIC. But the most important use of machine language subroutines is to increase the speed of a program. A single CALL command can sort an array of strings at least 100 times faster than a corresponding BASIC subroutine. Many BASIC programs do a lot or printing on the screen. If you have compared the speed of the PC's PRINT command to that of other personal computers, you know that this is one of the main bottlenecks in many programs. For instance, many applications programs display a "form" for the user to fill in. Normally the form must be constructed with dozens of LOCATE and PRINT commands. This may take so much time that by the end of the day, the operator has lost an accumulated total of 30 minutes just waiting for the computer. This slack time can be minimized with a short assembly language routine which duplicates some functions the PRINT command. QPRINT.ASM, when assembled into QPRINT.COM, is a program that displays characters 10 times as quickly as the BASIC PRINT command. The secret of QPRINT is that it bypasses most of the ROM BIOS overhead by throwing characters directly onto the screen. The PC screen is a "memory-mapped" device. To display a character on the screen, we need only to place the ASCII value of that character somewhere in a particular block of memory. This block of "video memory" begins at one of two different places, depending upon which display adapter is being used. For the Monochrome Adapter, video memory begins at segment B000H. Color Graphics Adapter video memory begins at the B800H segment. To keep this discussion centered around the machine language interface, we won't go into all of the details of multiple pages, graphics modes, etc. For now, just use this rote formula to determine where to store a byte to display a character on the screen: addr = (row-1) * 160 + (column-1) * 2 where ROW and COLUMN are values that are normally used in a LOCATE statement. The following simple BASIC program will get you started on how video memory works. It is actually a BASIC implementation of the QPRINT procedure: 10 VID.SEG=&HB000 'For CGA, use &HB800 20 INPUT "Enter string, row, column :",A$,ROW,CLM 30 ADDR=(ROW-1)*160+(CLM-1)*2 40 DEF SEG=VID.SEG 50 FOR J=1 TO LEN(A$) 60 POKE ADDR,ASC(MID$(A$,J<1)) 70 ADDR=ADDR+2 80 NEXT 90 GOTO 20 This program just takes each character of the string and places it into video memory at the desired location. In line 70, the ADDR variable is incremented twice, skipping the address of the display attribute, a byte which controls the color, blink, bold, and underline attribute of the character. The first part of QPRINT reads the arguments passed from BASIC. After obtaining the screen coordinates, it calculates an address using an assembly language version of the formula presented above. Then it determines the desired video segment by invoking the ROM BIOS EQUIPMENT CHECK service and deciphering the returned code. Finally, the program uses the REP MOVSB command to copy the characters of the string variable directly to the video memory. To make the codes simple, several things have been omitted which you might want to incorporate in an enhanced version. For one thing, QPRINT assumes that the video card is in 80-column text mode. If you have a color card and have executed a SCREEN 1 or SCREEN 2 command, QPRINT will not function as expected. Also, color card users will soon become aware of another omission; as the characters are displayed, the screen will be disturbed by "video snow" -- flecks of white scattered around the screen. This is relatively easy to avoid, but it's a subject for later consideration. You might also want your enhanced version to do something about the display attribute. As written, QPRINT simply ignores the attribute byte -- each character displayed takes on the color, blink, etc. of whatever character was previously at the position. By tacking another parameter to the CALL, you could pass a selected attribute byte to the routine. Add two parameters and include even more code to the QPRINT logic, and you could pass both a foreground and a background color. Note that the program is contained entirely in one segment, CODE_SEG, and the END pseudo-op specifies the start of the code as the starting address. These are requirements of the EXE2BIN utility which is used in one step of converting the code to BLOAD format. The comments at the top of QPRINT.ASM contain a BASIC program which uses QPRINT. Before you enter these into a BASIC program, the listing must be assembled, linked, processed with the EXE2BIN utility, and processed with the BIN2BLD program below. This sound like a lot of steps, but they can all be automated with a batch file: MASM %1; LINK %1; EXE2BIN %1 BASIC BIN2BLD Name the batch file ASM2BLD.BAT and invoke it with: A>ASM2BLD QPRINT The linker will display a "No STACK segment" warning, but this can be ignored for all COM, BIN, and BLOAD files. One of the most confusing parts of Appendix C of the BASIC manual is the explanation of how to use the BLOAD command to load a program from memory. The method involves interactive work with DEBUG, providing plenty of opportunities for error. The process can be simplified. The goal is to create a BLOAD module. Appendix C does this by using the BSAVE command, but there are alternatives. A BLOAD module is a binary image of the code to be executed, just like that produced by the EXE2BIN utility. The only difference is that a 7-byte "header" is tacked onto the front of the file to identify it as a BLOAD module, supply a default load address, and indicate the length of the module. This format is not documented in the BASIC literature, but it's easy to decipher with any file utility. The format is: 1 byte -- FDH - BLOAD file ID 2 bytes -- default load address segment 2 bytes -- default load address offset 2 bytes -- length of file Thus, given a BIN file produced by EXE2BIN, all we need to do is discover the length, write out a header, and then write out the bytes of the file itself. BIN2BLD.BAS is short and simple with no bells or whistles. It automatically sets the default load address to B000:0000. This default is never to be used. It was chosen so that if by accident you forget to specify a load address in the BLOAD command, your mistake will be immediately apparent. There are several alternative methods of placing the BLOAD header at the start of a file. One way is to load it with DEBUG, use the Move command to place it 7 bytes forward in memory, use the E command to create the header, use the R command to change CX to make the file length 7 bytes longer and then use the W command to write the file back out to disk. A trickier method is to use DB and DW commands to place the 7 bytes of the header right in the assembly language listing. You must take care to realize that this will bias all addresses referred to in the listing by 7 bytes. That only matters when you refer to messages and variable storage with the module -- CALLs and JMPs are self-relative, so they are not affected by being mixed around. A machine language routine almost always needs one or more parameters. In the case of QPRINT, two integers and a string variable are required. BASIC never passes a value directly to a machine language subroutine. Instead, it passes a "pointer," or indirect reference, telling the subroutine where to find the value. In other words, it passes the VARPTR of each variable. BASIC passes these VARPTR values by PUSHing them onto the stack before making an intersegment (far) CALL to the machine language code. The best way to access these values is by setting the BP register to point to the top of the stack, and then reading the values therein by using the base-relative stack memory addressing mode -- accessing memory at offsets from BP. Remember that the routine was reached via far CALL, so the parameters will be offset from the top of the stack by at least 4 bytes. For instance, if a routine used only one parameter, an integer variable, the following sequence could be used to obtain the value of that variable: 100 CALL MY_PROC(VAR%) . . mov bp,sp ;point to top of the stack mov bx,[BP+4] ;BX is the address of VAR% mov ax,[BX] ;AX is value VAR% . . ret 2 ;DEBUG A command: use RETF 2 The final instruction is a special form of the RET opcode which Intel created for exactly this application; i.e., clearing the stack of arguments passed to subroutines. The number specified after the RET mnemonic is the number of bytes that must be discarded from the stack. Since each parameter passed by BASIC is exactly 2 bytes long, the RET mnemonic should be followed by the number of arguments times 2. Since QPRINT requires 3 arguments, it uses RET 6 to exit back to BASIC. Be sure that you understand how to access a single integer argument, because the next part gets trickier. First, the Intel standard is set up to accommodate reentrant procedures. That's a procedure which might call itself, or be invoked "simultaneously" in a multi-tasking environment. In that case, each invocation of the procedure must not affect or be affected by any other invocation. So each invocation is associated with a "frame" of data which must be kept separate from all others. In order to keep them separate, the BP register "frame pointer" should be saved when the program begins executing. Thus, the first opcode executed is a PUSH BP and the last opcode before the RET is a POP BP. The PUSH BP adds just one complication -- all the arguments end up 2 bytes lower in the stack. Keeping this in mind, we make a slight revision to the previous example: push bp ;save the frame pointer mov bp,sp ;point to top of the stack mov bx,[BP+6] ;BX is the address of VAR% mov ax,[BX] ;AX is value VAR% . . pop bp ;restore the frame pointer ret 2 ;DEBUG A command: use RETF 2 Notice that the address of VAR% is now found at [BP+6] instead of [BP+4]. The next step is learning to access more than one argument. The best way to think of this is from the viewpoint of BASIC. It interprets each line of statements from left to right. So in the line: 100 CALL QPRINT(A$,ROW%,CLM%) it sees the CALL statement and begins to prepare for the CALL. It ascertains the value of the variable QPRINT which, in conjunction with the currently active DEF SEG, will be used as the address to call. Next, it encounters the variable A$. It looks up the address of A$ and pushes that value onto the stack. Next, it pushes the address of ROW% and then it pushes the address of CLM%. Finally, it makes a far CALL to the QPRINT routine. The significance is this: since the address of A$ is pushed first, it will be farthest from the top of the stack. The address of CLM% will be closest to the top of the stack because it was pushed last. This, after the machine language routine saves BP and copies SP into it, the address of CLM% will be at [BP+6]. It follows that the address of A$ will be even lower in the stack at [BP+10]. Thus, when you write your code, you calculate the offsets from BP by reading the CALL line from right to left, starting with an offset of 6 and adding 2 for each argument. You might find it useful to set up some equates early in the program. For instance, accessing the three variables of QPRINT could be simplified with: clm_addr equ [BP+6] ;rightmost (6 + 0 * 2) row_addr equ [BP+8] ;center (6 + 1 * 2) a$_addr equ [BP+10] ;leftmost (6 + 2 * 2) . . mov bx,clm_addr ;get the address of CLM% mov ax,[bx] ;get the value of CLM% . . Then, if you change the number of positions of the arguments, you need only change the equates at the top of the listing. Experienced programmers may choose to use the STRUC pseudo-op (MASM only) to set up the offsets. This technique is especially valuable for Pascal programmers who need to pass complex data types to the machine language code. Finally, there's one more complication. When your routine processes numeric variables, BASIC passes the address where the value of the variable may be found. But when you process string variables, BASIC adds another layer of indirection, passing the address of a string descriptor block. The string descriptor block contains two items. The first byte is the length of the string. The following two bytes give the address of the first character of the string. So, accessing the character of A$ is a four-step process: mov bx,[bp+10] ;BX has address of descriptor mov cl,[bx] ;CL is the length mov si,[bx+1] ;SI is address of first character mov al,[si] ;AL is the first character Once you have SI pointing to the characters of the string and you have the length of the string in CL, you've got it licked. In QPRINT, those characters are just copied from the BASIC work area into video memory. Another program could compare the characters with those of other strings as part of a sorting process. Or each character could be forced into uppercase just by modifying the byte at [SI] and working through the bytes for the length specified in CL. In a more sophisticated application, the string could be scanned and interpreted as a command line to perform any of a number of special-purpose functions. You could write a version of QPRINT which looks for special sequences which cause it to alter the color of the following bytes, or even the direction of cursor motion. You could write your own version of the DRAW command that works with text-mode screens. QPRINT.ASM: ; QPRINT subroutine CALLed from BASIC. This routine prints a BASIC ; string on the video display. It works for color or monochrome in ; cards in 80-column text mode only. Called from BASIC via: ; ; CALL QPRINT(VAR$,ROW%,CLM%) ; Where: ; CLM% is an integer variable name (value: 1-80) ; ROW% is an integer variable name (value: 1-25) ; VAR$ is a string variable name ; ; VAR$ is displayed beginning at position CLM% of line ROW%. If it's ; too long, it will wrap around to the next line. ; ; Example use from BASIC: ; ; 10 CLEAR,60000!:QPRINT=60000! ' use 3000 for 64K machines ; 20 BLOAD "qprint.bld",QPRINT ' load at clear area in BASIC segment ; 30 FOR J=1 TO 255 ' once for each ASCII character ; 40 CLM%=1:VAR$=STRING$(80,J) ' 80-byte string of that character ; 50 FOR ROW%=1 TO 25 ' for each screen line ; 60 CALL QPRINT(VAR$,ROW%,CLM%) ' display the 80 bytes ; 70 NEXT ' next line (fill screen) ; 80 NEXT ' next character code ; code_seg segment assume CS:code_seg,DS:nothing,ES:nothing qprint proc far push bp ;save the frame pointer mov bp,sp ;point to arguments on stack mov bx,[bp+6] ;get addr of CLM% storage mov di,[bx] ;get the column value mov bx,[bp+8] ;get addr of ROW% storage mov dx,[bx] ;get the screen line value into DL mov bx,[bp+10] ;get ptr to string descriptor mov ch,0 ;string length is 1 byte mov cl,[bx] ;fetch the length mov si,[bx+1] ;point SI to first character of VAR$ cmp cx,0 ;null string? je exit ;if so, do nothing. Else, ; -- calculate the address in video memory from the ROW,CLM arguments ; -- using the formula: addr = (row-1)*160+(clm-1)*2 dec dx ;adjust ROW from LOCATE format mov al,160 mul dl ;AX=(row-1)*160 dec di ;adjust column shl di,1 ;DI=(clm-1)*2 add di,ax ;DI has correct offset into video memory ; -- find segment of the active display card mov bx,0B800H ;assume color/graphics card int 11H ;invoke EQUIPMENT-CHECK service and ax,30H cmp ax,30H ;is it the B/W card? jne card_ok ;no, go mov bx,0B000H ;yes, set for monochrome card_ok: mov es,bx ;point ES to video ; -- DS:SI points to BASIC variables area ; -- ES:DI points to video card memory ; -- CX is the length of the string ; -- Now copy VAR$ to video memory, ignoring the display attribute next: movsb ;DS:[SI] -> ES:[DI] ;SI=SI+1, DI=DI+1 inc di ;get past attribute byte loop next ;do for entire length of VAR$ exit: push ds pop es ;restore BASIC segment regs pop bp ;restore frame pointer ret 6 ;FAR return to BASIC, discarding 3 arguments qprint endp code_seg ends end qprint ;needed for .BIN file conversion BIN2BLD.BAS: 5 'BIN2BLD.BAS: Program converts a BIN-format file into a BLOAD module 10 CLS:PRINT "--- Convert BIN file to BLOAD format ---":PRINT 20 INPUT "Filename: ",F$:IN.FILE$=F$+".BIN" 30 IN.FILE$=F$+".BIN":OUT.FILE$=F$+".BLD" 40 OPEN IN.FILE$ AS #1 LEN=1 50 OPEN OUT.FILE$ AS #2 LEN=1 60 FIELD #1,1 AS IN.B$:FIELD #2,1 AS OUT.B$ 70 SIZE=LOF(1) ' Size of BIN file 80 IF SIZE=0 THEN PRINT "Can't find input file.":CLOSE:GOTO 10 90 'Place the 7-byte header in the output file 100 LSET OUT.B$=CHR$(&HFD):PUT #2 ' BLOAD file ID byte 110 LSET OUT.B$=CHR$(0):PUT #2 ' Segment LSB 120 LSET OUT.B$=CHR$(&HB0):PUT #2 ' Segment MSB 130 LSET OUT.B$=CHR$(0):PUT #2 ' Offset LSB 140 LSET OUT.B$=CHR$(0):PUT #2 ' Offset MSB 150 LSET OUT.B$=CHR$(SIZE AND 255):PUT #2 ' Length LSB 160 LSET OUT.B$=CHR$(SIZE\256):PUT #2 ' Length MSB 190 'Copy the rest of the input file to the output file 200 FOR J=1 TO SIZE 210 GET #1:LSET OUT.B$=IN.B$:PUT #2 220 NEXT 230 CLOSE 240 PRINT:PRINT "File: ";OUT.FILE$;" is ";SIZE;" bytes long." ----------------------------------------------------------------- Passing Filenames to Compiled BASIC (BYTE Magazine November 1986 by Bruce Hubanks) When you start an application program, it is often desireable to be able to specify a filename on the DOS command line. For instance, given a hypothetical data encryption program named Encode, you might like to be able to process a data file named Filex by typing: ENCODE FILEX Many language compilers provide a library function to retrieve secondary filenames and parameters from the command line. Microsoft's BASIC Compiler does not. The assembly language subroutine called GETSPEC remedies this deficiency. To make use of the routine from a BASIC application program, you simply include a couple of lines at the beginning of your program: 10 '"FILENAME.EXT" 20 F$=" " 30 CALL GETSPEC(F$) 40 PRINT "Text remaining on command string is: ";F$ 50 END Then compile the program and link it to GETSPEC. The result will be an executable application. (If you use BASCOM's IO option, you won't even need BASRUN to run the program.) Because GETSPEC obeys the Microsoft conventions for parameter passing, it could also be used to retrieve filenames for other language compilers that obey these conventions. (GETSPEC cannot be used with Microsoft's BASIC Interpreter.) A brief explanation of how DOS handles command-line information will help you understand GETSPEC.ASM. When a program is invoked from the DOS command line, DOS creates a bookkeeping area called the program segment prefix (PSP) at the lowest available memory location. Within this area, the data transfer area (DTA) contains all the characters typed after the program name. On entry to the program, the stack pointer (SP) gives the code offset address of the calling program. This address enables the subroutine to retrieve parameters from the calling program and pass values back to it. GETSPEC starts by copying the all-important SP value into BP. [BP]+4 gives the code segment address. The starting address of the PSP can be derived from this code segment address by subtracting 16. Adding 128 gives the start of the DTA, which contains a length byte followed by the specified number of bytes that were typed before the end of the line. The GETSPEC subroutine uses the stack information to locate the command-string text in the DTA. The program searches for the first nonblank character and then reads the text up to the first blank or end of text, whichever comes first. The nonblank characters are copied into a string that was defined in the main BASIC program. The string storage location is at the address given by [BP]+6. In the BASIC demonstration program above, F$ receives the filename from the subroutine. The demonstration program allocates 12 bytes to F$ -- enough to contain a filename -- but you can allocate up to 255 bytes to F$ if your application program needs to retrieve additional parameters from the DOS command line. To incorporate GETSPEC into an existing program, use GETSPEC.ASM to generate an .OBJ file. At the beginning of your application program, define a string constant to contain 12 blanks (you can use more if they are needed). Use the statement: CALL GETSPEC(F$) to call the subroutine and get the necessary text into F$. Compile your BASIC program to produce an .OBJ file. Finally, link the two .OBJ files into a single .EXE file. Here is a typical command sequence, given two source files named DEMO.BAS (above) and GETSPEC.ASM (below): MASM GETSPEC BASCOM DEMO/O LINK DEMO+GETSPEC The /O parameter tells BASCOM to create a single executable file called DEMO.EXE that contains all needed libraries. Typing DEMO FILENAME.EXT activates the program, which should produce the message "Text remaining on command string is: FILENAME.EXT." ; Routine to get a filename from the DOS command line ; using a call from a compiled BASIC program ; const segment word public 'const' const ends ; data segment word public 'data' data ends ; dgroup group data,const ; code segment byte public 'code' public getspec assume cs:code,ds:dgroup ; getspec proc far ; push bp mov bp,sp push ax push si push di push dx push cx push bx push es push ds ; mov dx,[bp]+4 sub dx,10h mov ds,dx ; ; Set up pointer to dta to get parameters mov si,0080h mov cl,[si] mov ch,0 inc si ; ; Scan past the spaces mov al,' ' getspec1: cmp [si],al jne getspec2 inc si loop getspec1 jmp getspec3 ; ; move the rest into place getspec2: mov bx,[bp]+6 pop ds mov di,[bx]+2 push ds mov ds,dx cld rep movsb clc jmp getspecexit ; getspec3: mov ax,20 stc getspecexit: pop ds pop es pop bx pop cx pop dx pop di pop si pop ax pop bp ret 1*2 ; getspec endp code ends end ----------------------------------------------------------------- ROM Calls from BASIC (BYTE Magazine November 1986 Best of BIX) ROM calls are easily coded in assembler. Other languages have a general purpose call where the INT # is passed as well as the value of certain registers. Following is a first pass at it in BASIC. It does not currently return the new register values after the interrupt nor does it return the FLAG settings (used by some DOS calls). However, it should suffice for many uses. of the 3 "DEF FNxx" statements, the one for FNCC$ is used to assign to a string an assembler routine that is built on the fly using the INT value passed as well as various register values. The other 2 FNxx's merely break up FNCC$ into 2 pieces for easier handling. The program below issues a video INT 16 (&H10) with AH set to 6 for scroll up. When AL=0 (as it is in this example) the entire window is blanked. CX gives the upper left corner as 0,0 and DX gives the lower right corner as 10,40. BH specifies the attribute used to fill in new lines. More experienced programmers can craft a version like: 100 CALL IR(IN,AX,BX,CX,DX,etc....) Such a version could also return the register values after the interrupt as well as the flag settings. 10 DEF FNAA$(DS,BX,CX,DX,BP,SI,DI)=MKI$(&H5655)+MKI$(&H61E)+CHR$(&HBB)+MKI$(BX)+CHR$(&HB9)+MKI$(CX)+CHR$(&HBA)+MKI$(DX)+CHR$(&HBD)+MKI$(BP)+CHR$(&HBE)+MKI$(SI)+CHR$(&HBF)+MKI$(DI)+CHR$(&HBB)+MKI$(DS) 20 DEF FNBB$(AX,ES,IN)=MKI$(&HD88E)+CHR$(&HB8)+MKI$(ES)+MKI$(&HC08E)+CHR$(&HB8)+MKI$(AX)+CHR$(&HCD)+LEFT$(MKI$(IN),1)+MKI$(&H1F07)+MKI$(&H5D5E)+CHR$(&HCB) 30 DEF FNCC$(IN,AX,BX,CX,DX,BP,SI,DI,DS,ES)=FNAA$(DS,BX,CX,DX,BP,SI,DI)+FNBB$(AX,ES,IN) 40 GOTO 60 50 I=VARPTR(S$):J=CVI(CHR$(PEEK(I+1))+CHR$(PEEK(I+2))):CALL J:RETURN REM AH=6,AL=0 to clear all,BH=attr, Dh=line 10, DL=column REM 40:clear window REM Video INT 16 (Hex 10) 60 S$=FNCC$(16,&H600,&H4E00,0,&HA28,0,0,0,0,0):GOSUB 50 70 END To issue Shift-PrtSc from BASIC code: 60 S$=FNCC$(5,0,0,0,0,0,0,0,0,0):GOSUB 50