From: cs94169@assn013.cs.ualberta.ca (David Bond)
Newsgroups: rec.games.programmer
Subject: VESA SVGA - line code and info
Date: 6 Feb 1995 18:39:08 GMT

Hello everyone!

This is a mini-tutorial, with code, on VESA SVGA programming. The code
includes a line procedure based on Bresenham's algorithm. It is not
blazingly fast, but hopefully it'll work on all SVGA cards with VESA
support, and it is pretty compact - no special cases for slopes.

Many people try to begin programming in SVGA modes straight from mode
13h or the Xmode variants. They quickly run into the problem that only
64K of video memory is accessible at a time - a little short of the
300K required for 640x480x8bit! The special address space A000h - AFFFh
must be mapped to different parts of video memory to make use of all of
it. This can be done via VESA functions.

If you don't yet have 'vesasp12.txt' (PCGPE contains this document) then
I suggest you get it from x2ftp.oulu.fi
/pub/msdos/programming/specs/vesasp12. This document details the VESA
BIOS extensions used to get info on video modes, set video modes, pan
across a larger virtual screen, and set the CPU window (A000-AFFF) to
map to different places in video memory.

Even though VESA provides a common interface for SVGA cards, there are
still some specifics that have to be dealt with. The 'granularity' of
the window is the smallest amount by which it can be moved. A 64KB
granularity with 1MB of video memory means the CPU window can be mapped
to one of 16 'chunks' in this memory. A 4KB granularity has more
potential mappings - the window is still 64KB in size, but it can be
positioned on any 4K boundary in video memory. I know granularities of
4K, 16K, 32K, and 64K exist. Some cards are switchable (actually, the
only chipset I'm familiar with that has this option is Cirrus Logic - it
defaults to 4K and can be set to 16K; I think this is necessary for
accessing >1MB).

I see two ways to manage this discrepancy.
Code can assume 64K granularity always, with the 'bank-switching'
routines making sure the window is moved by this amount (4K granularity
would require inc/dec by 16). The other way is to deal with each
granularity differently - this is how the line code provided below
operates.

Finer granularity can speed up rendering. Line drawing will be used to
illustrate. The linear start address is calculated. The low 16 bits of
this address are mappable to the 64K window. The high-order bits can be
used to locate the position of the window. With a 64K granularity, the
high word is our window location, and the low word is the displacement
into the window. With 4K granularity, the low 12 bits are the offset,
and the remaining high bits are the window location. If a line begins
near the end of a 64K-aligned chunk (linear position 123840, say) and
continues down a short distance, it'll cross a 64K boundary. With 64K
granularity, the window will have to be moved. Using a 4K granularity,
the initial offset into a window can be kept below 4096, so lines that
aren't too long (at a 640-byte pitch, roughly 95 scan-lines or fewer)
can always be kept within the starting window.

Another advantage that fine granularity provides is easier alignment
with the edge of the screen. With a horizontal resolution of 640, 32
lines take up 20KB, which is divisible by 4KB. If all windowing is then
limited to be aligned on these 20KB bounds, one will never have to worry
about overflowing past the end of the window while drawing across a
scan-line. The windows are positioned so that the 'bottom' of the window
is on these 20K bounds. Inner rendering loops that move across a
horizontal line don't bother with checking for a 'page-cross'. The outer
loop checks for overflow when it moves down to the next scan-line. A 64K
granularity doesn't align until 512 vertical lines (320KB), which means
the inner loop must check for page-crossings within the scan-line.

Note: one way to create easy alignment with the edge is to change the
length of a scanline to a power of 2 (say 1024).
This wastes video memory, but it can be well worth it. Check
vesasp12.txt for setting this.

The line procedure below does take advantage of positioning the top of
the window as close to the top of the line as it can. Thus window moving
for mid-length lines is reduced on cards that have smaller
granularities. It does not take advantage of alignment with the screen
edge. The code is meant to be fairly 'straight-forward', nothing fancy
is done - it's just simple, flexible, small, and I hope easy to
understand. One easy optimization to add is to check whether the
endpoints lie in different window addresses - if not, a routine without
a page-cross check can be called; otherwise the standard routine is
called.

Careful eyes may notice that lines are always rendered from top to
bottom, but I have a macro to move the CPU window UP! A situation where
this is needed: A window begins 382 pixels across on a scan-line. A line
is started just two pixels into the window (at 383). The endpoint is on
the far left of the screen (0), and 5 pixels down from the start. The
line is going to begin with a string of pixels straight to the left -
passing BACKWARD through the window boundary. This occurrence requires
the 'PgUp' macro. If alignment is done with the screen edge, this isn't
necessary.

This code is provided for learning purposes, and may be used in any
fashion desired - it's free! If the code doesn't work for you, please
let me know. I haven't had the opportunity to test it on other systems.
It didn't get a rigorous test on mine either - paging is untested.

Conversion to other resolutions is pretty simple. The linear address
calculation is all that has to be modified (I think!?) - 'bx' may be too
small at higher resolutions - use 'ebx'.

This can be assembled with:

    tasm /m2 /ml
    tlink /3

Or pieces can be extracted, and interfaced to whatever you wish, however
you wish.
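The window/offset split and the alignment figures described above can be checked with a little arithmetic. What follows is only a sketch in C rather than assembly, so it can be tried on any machine. The helper names (linear_addr, window_pos, window_off, lines_before_cross, align_lines) are made up for illustration and are not part of the line code; a 640-byte scan-line and a 64K window are assumed.

```c
#include <assert.h>

#define PITCH    640UL     /* assumed: bytes per scan-line in mode 101h */
#define WIN_SIZE 65536UL   /* assumed: the CPU window is 64K            */

/* Linear byte address of pixel (x, y). */
static unsigned long linear_addr(unsigned long x, unsigned long y) {
    return y * PITCH + x;
}

/* Window position, in granularity units, that places the window start
   as close below addr as it can go. gran must be a power of two. */
static unsigned long window_pos(unsigned long addr, unsigned long gran) {
    assert(gran != 0 && (gran & (gran - 1)) == 0);
    return addr / gran;
}

/* Offset of addr from the start of that window. */
static unsigned long window_off(unsigned long addr, unsigned long gran) {
    assert(gran != 0 && (gran & (gran - 1)) == 0);
    return addr & (gran - 1);
}

/* How many scan-lines can be stepped straight down from addr before
   the offset runs past the end of the 64K window. */
static unsigned long lines_before_cross(unsigned long addr, unsigned long gran) {
    return (WIN_SIZE - 1 - window_off(addr, gran)) / PITCH;
}

static unsigned long gcd_ul(unsigned long a, unsigned long b) {
    while (b != 0) { unsigned long t = a % b; a = b; b = t; }
    return a;
}

/* Smallest number of scan-lines whose total size is a whole number of
   granules, i.e. how often a window start can line up with the left
   edge of the screen. */
static unsigned long align_lines(unsigned long gran) {
    return gran / gcd_ul(PITCH, gran);
}
```

For the linear position 123840 used in the text (pixel 320,193), window_pos and window_off give window 1, offset 58304 at 64K granularity, and window 30, offset 960 at 4K. lines_before_cross then gives 11 scan-lines of headroom at 64K against 100 at 4K. align_lines(4096) is 32 and align_lines(65536) is 512, matching the 20KB and 320KB figures.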
-Anthony Tavener   'Daoloth of MetaSentience'
-cs94169@cs.ualberta.ca (Temporary - friend's account)

---CODE BEGIN---
        .486
code    segment para public use16
        assume cs:code

PgDown  macro
        push bx
        push dx
        xor bx,bx
        mov dx,cs:winpos
        add dx,cs:disp64k
        mov cs:winpos,dx
        call cs:winfunc
        pop dx
        pop bx
        endm

PgUp    macro
        push bx
        push dx
        xor bx,bx
        mov dx,cs:winpos
        sub dx,1
        mov cs:winpos,dx
        call cs:winfunc
        add di,cs:granmask
        inc di
        pop dx
        pop bx
        endm

        mov ax,seg stk          ;\
        mov ss,ax               ;.set up program stack
        mov sp,200h             ;/

        call GetVESA            ;init variables related to VESA support

        mov ax,4f02h            ;\
        mov bx,0101h            ;.VESA mode 101h (640x480x8bit)
        int 10h                 ;/

        mov ax,0a000h
        mov ds,ax

        mov eax,10h             ;\
        mov ebx,13h
        mov ecx,20bh            ;test Lin procedure
        mov edx,1a1h
        mov ebp,21h
        call Lin                ;/

        mov ax,4c00h
        int 21h

GetVESA proc
;This is just a hack to get the window-function address for a direct call,
;and to initialize variables based upon the window granularity.
        mov ax,4f01h            ;\
        mov cx,0101h
        lea di,buff             ;.use VESA mode info call to..
        push cs                 ;.get card stats for mode 101h
        pop es
        int 10h                 ;/
        add di,4
        mov ax,word ptr es:[di] ;get window granularity (in KB)
        shl ax,0ah
        dec ax
        mov cs:granmask,ax      ; = granularity - 1 (in Bytes)
        not ax
        clc
GVL1:   inc cs:bitshift         ;\
        rcl ax,1                ;.just a way to get vars I need :)
        jc GVL1                 ;/
        add cs:bitshift,0fh
        inc ax
        mov disp64k,ax
        add di,8
        mov eax,dword ptr es:[di] ;get address of window control
        mov cs:winfunc,eax
        ret
buff    label byte
        db 100h dup (?)
        endp

Lin     proc
;Codesegment: Lin
;Inputs: eax: x1, ebx: y1, cx: x2, dx: y2, bp: color
;Destroys: ax, bx, cx, edx, si, edi
;Global: winfunc(dd),winpos(dw),page(dw),granmask(dw),disp64k(dw),bitshift(db)
;Assumes: eax, ebx have clear high words
        cmp dx,bx               ;\
        ja LinS1                ;.sort vertices
        xchg ax,cx
        xchg bx,dx              ;/
LinS1:  sub cx,ax               ;\
        ja LinS2                ;.calculate deltax and
        neg cx                  ;.modify core loop based on sign
        xor cs:xinc1[1],28h     ;/
;The xor patches the ModRM byte at xinc1: C7h xor 28h = EFh, which turns
;'add di,1' (83 C7 01) into 'sub di,1' (83 EF 01) for right-to-left lines.
LinS2:  sub dx,bx               ;deltay
        neg dx
        dec dx
        shl bx,7                ;\
        add ax,bx               ;.calc linear start address
        lea edi,[eax][ebx*4]    ;/
        mov si,dx               ;\
        xor bx,bx
        mov ax,cs:page          ;\
        shl ax,2                ;.pageOffset=page*5*disp64K
        add ax,cs:page
        mul cs:disp64k          ;/
        push cx                 ;.initialize CPU window
        mov cl,cs:bitshift      ;.to top of line
        shld edx,edi,cl
        pop cx
        add dx,ax
        and di,cs:granmask
        mov cs:winpos,dx
        call cs:winfunc
        mov dx,si               ;/
        mov ax,bp
        mov bx,dx
;ax:color, bx:err-accumulator, cx:deltaX, dx:vertical count,
;di:location in CPU window, si:deltaY, bp:color
LinL1:  mov [di],al             ;\
        add bx,cx
        jns LinS3
LinE1:  add di,280h
        jc LinR2                ;.core routine to
        inc dx                  ;.render line
        jnz LinL1
        jmp LinOut
LinL2:  mov [di],al             ;\
xinc1   label byte
LinS3:  add di,1                ;.this deals with
        jc LinR1                ;.horizontal pixel runs
LinE2:  add bx,si
        jns LinL2               ;/
        jmp LinE1               ;/
LinR1:  js LinS7                ;\
        PgDown                  ;.move page down 64k..
        mov ax,bp
        jmp LinE2
LinS7:  PgUp                    ;.or up by 'granularity'
        mov ax,bp
        jmp LinE2               ;/
LinR2:  PgDown                  ;\
        mov ax,bp               ;.move page down 64k
        inc dx
        jnz LinL1               ;/
;Restore the patched byte so the next call starts with 'add di,1' again.
LinOut: mov cs:xinc1[1],0c7h
        ret
        endp

winfunc  dd ?   ;fullpointer to VESA setwindow function
winpos   dw ?   ;temp storage of CPU window position
granmask dw ?   ;masks address within window granularity
disp64k  dw ?   ;number of 'granules' in 64k
page     dw 0   ;video page (0,1,2 for 1MB video)
bitshift db 0   ;used to extract high order address bits..
                ;\ for setting CPU window
        ends

stk     segment para stack use16 'STACK'
        dw 100h dup (?)
        ends
        end
---CODE END---
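For anyone who wants to poke at the paging logic without a VESA card, here is a rough model of it in C. It is a sketch under several assumptions, not a translation of the assembly. Video memory is a plain array. The window is tracked as a position in granules plus an offset, like winpos and di. The stepping is the ordinary integer Bresenham loop, not the error accumulator used in Lin. Every name in it (Card, card_init, page_down, page_up, banked_line) is hypothetical. Coordinates are assumed to be on screen.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { PITCH = 640, HEIGHT = 480 };
#define WIN_SIZE 65536u              /* the CPU window is always 64K */

/* Simulated card; every name in this sketch is hypothetical. */
typedef struct {
    uint8_t  vram[PITCH * HEIGHT];   /* stands in for video memory           */
    uint32_t gran;                   /* granularity in bytes, a power of two */
    uint32_t winpos;                 /* window position in granules          */
    unsigned moves;                  /* window moves after initial placement */
} Card;

static void card_init(Card *c, uint32_t gran) {
    memset(c, 0, sizeof *c);
    c->gran = gran;
}

/* Like PgDown: move the window down by a full 64K. */
static void page_down(Card *c) { c->winpos += WIN_SIZE / c->gran; c->moves++; }

/* Like PgUp: move the window up by a single granule. */
static void page_up(Card *c)   { c->winpos -= 1; c->moves++; }

static void banked_line(Card *c, int x1, int y1, int x2, int y2, uint8_t color) {
    if (y1 > y2) {                   /* always render top to bottom */
        int t = x1; x1 = x2; x2 = t;
        t = y1; y1 = y2; y2 = t;
    }
    int dx = abs(x2 - x1), dy = y2 - y1, sx = (x2 >= x1) ? 1 : -1;
    int err = dx - dy, x = x1, y = y1;
    uint32_t addr = (uint32_t)y1 * PITCH + (uint32_t)x1;
    uint32_t di = addr % c->gran;    /* offset into the window */
    c->winpos = addr / c->gran;      /* window start just above the line */

    for (;;) {
        uint32_t lin = c->winpos * c->gran + di;
        /* The window position plus offset must always name pixel (x, y). */
        assert(di < WIN_SIZE && lin == (uint32_t)y * PITCH + (uint32_t)x);
        c->vram[lin] = color;
        if (x == x2 && y == y2) break;
        int e2 = 2 * err;
        if (e2 >= -dy) {             /* step one pixel horizontally */
            err -= dy; x += sx;
            if (sx > 0) {
                if (++di == WIN_SIZE) { di = 0; page_down(c); }
            } else if (di == 0) {    /* passing BACKWARD through the window start */
                di = c->gran - 1; page_up(c);
            } else {
                di--;
            }
        }
        if (e2 <= dx) {              /* step down one scan-line */
            err += dx; y++;
            di += PITCH;
            if (di >= WIN_SIZE) { di -= WIN_SIZE; page_down(c); }
        }
    }
}
```

Drawing from (320,193) to (330,243) starts at linear position 123840. With card_init(&c, 65536) the model moves the window once on the way down; with card_init(&c, 4096) it never moves it. For the backward case, note that with a 640-byte scan-line a window start always falls on a multiple of 128 pixels, so this model uses a window starting at pixel (384,25) with 4K granularity: a line from (385,25) to (0,30) triggers page_up exactly once.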