Data Compression and Decompression in Tight Places
		    (C) Copyright 1992, Jeffrey R. Fox
		  By Jeffrey R. Fox (Contract Programmer)
			      802 Ithaca Dr.
			     Boulder, CO 80303
			       303-494-0343

	I describe  in	this  report  an implementation   of   a    general
purpose  data compression/decompression algorithm which is simple, achieves
good compression, and uses few memory resources. The primary liability is a
slow  compression  rate,  but  that  is  not  a  significant factor in many
applications, and optimizations which  trade  higher  memory  usage  during
compression  for  higher  compression  rates are referenced.  The   present
FORTH  language  implementation  is  particularly  targeted  at  programing
environments, such as embedded FORTH systems,  where  RAM  memory  is  very
limited.   Possible  applications  in  such  environments  are	loading  of
compressed source and binary overlays into memory,  compression  of  logged
data,  transfer  of compressed data over a bandwidth limited communications
link,  access  of  compressed  data  bases,  etc.  This  implementation  is
configured  to	compress and subsequently decompress DOS files in order  to
simply illustrate the basic functionality.

	Interest in data compression has been  rapidly	increasing  lately,
but  the  subject  is  older  than  the  computer  revolution  itself. Data
compression is achieved by removal  of	the  natural  redundancies  of	the
original  information,	and  that  is  also  a	key  step in the process of
creating realiable cyphers, that is, methods of  writing  secret  messages.
The   method   of   encryptation   used  by  Benedict  Arnold  during	the
revolutionary  war  is	illustative:  individual  words  of  a message were
replaced by codes indicating the location of the word in a particular  book
possessed  by  Arnold and by his British contact. If one considers the book
in question to be a dictionary or glossary, and the codes to be a index  to
the  words  of the dictionary, and if the average length of the index codes
is less than the average length  of  the  replaced  words,  a  very  simple
compression  scheme is apparent. For example, if the message or file has an
average word length of 6 ascii characters and the index  codes	consist  of
two  ascii characters, a dictionary of over 16000 entries can be indexed by
these codes and a 3 to 1 compaction could, in principle, be achieved.

	Dictionary  methods form one of the two main classes of compression
algorithms,  and  the  primary	references  of	interest  in  the   present
discussion are	the  two ground-breaking papers of Ziv and Lempel (1977 and
1978). The other main class  of  algorithms  are statistical in nature, and
based  upon  the  idea that since the individual characters (or words) of a
file or message often occur in vastly different proportions, the codes used
to  represent  the  more  frequently  occuring characters (or words) can be
chosen to be  much  shorter  than  those  representing	the  less  frequent
characters  (or  words), and significant compaction can result. An historic
example is  the  design  of  the  Morse  code;	the  most  frequently  used
characters in written English have the	shortest  codes.  In  English  text
files  in  the form of a sequence of 8-bit (extended ascii) characters, one
may find that as few as 10 or 15 frequent characters  such  a  space,  'e',
't',  's', etc., may account for half of the characters in the file. If the
first bit of a code indicates whether the code represents one of  the  most
frequent  16  characters  or  not,  then  these sixteen characters could be
represented by 5-bit codes, with all other characters represented  by 9-bit
codes. The result is an  average  of  7  bits  per  code  and  an  8  to  7
compaction.  But  the  process	may be continued by dividing the two groups
into subgroups based upon frequency, and using a second bit to	distinguish
among  the  subgroups,	etc.  This  is	a variant of the algorithm commonly
referenced as "Huffman encoding" and the  primary  references  are  Huffman
(1952) and Galleger (1978).

	The  design  of present day compression technology is characterized
by a number of sensitive tradeoffs between the computer resources, such  as
time  and  memory, required by the compression and decompression algorithms
and the degree of compaction achieved. The highest degree of compaction  is
reached  by  approaches  which	combine dictionary and statistical methods,
that is, schemes in which the codes  output  by  a  dictionary	method	are
statistically	processed  by  Huffman	encoding.   The  popular  archiving
programs  PKZIP  and  LHARC use these compound methods, and, interestingly,
Japanese hobbiests have been at the forefront in  the  development  of	the
best implementations.

	The  earlier  archivers,  in  particular  ARC  and PKARC and UNIX's
COMPRESS and TAR were based upon the second of the algorithms  of  Ziv	and
Lempel,  and  in  particular  on the popular variations introduced by Welch
(1984). In this report, we  describe  in  detail  an  implementation  of  a
modification   of   the  first	of  the  proposals  of Ziv and Lempel.	The
implementation is very close to that of Fiala and Greene (1989, method A1).
In  this  algorithm  the  dictionary  to  which  an  index code refers is a
"sliding window" composed of the previous 4096 characters (i. e., bytes) of
the  file  (in	its  uncompressed  form).  Thus  compression is achieved by
replacing the second occurrence of two identical strings of characters by a
reference  to  the  first. The "reference" is an index code consisting of 4
bits of string length information followed by  12  bits  of  positional  or
address  information.  Character  strings  for	which  a previous occurance
cannot be found are simply incorporated literally into the compressed file.
The  compressed  file  is  thus a sequence of literal strings and reference
codes with format:

First (upper) nibble of byte is either:

	0 -- meaning  a  literal  string  follows,  in which case the lower
nibble is the count (minus 1) of the string and the following  count  bytes
are the string itself (1 to 16 characters)

	1-15 -- meaning  the  following  12 bits are an offset address into
the sliding window and this nibble is the length (minus 1) of the string to
be copied from that address (2 to 16 characters). In this  case  the  lower
nibble	is  the  low 4 bits of the offset into the window and the next byte
contains the upper 8 bits.

	Ignoring  for  the moment the problems associated with initializing
buffers and starting the compression algorithm, consider the  central  loop
of  the FORTH word CRUNCH.  While there still remain more characters in the
uncompressed file as yet unprocessed, one  attempts,  via  MATCHSTRING,  to
find  a sequence of characters which  is  identical  to  the  sequence	 of
unprocessed characters from the input file which, for the  purpose  of	the
matching  mechanism,  have  been moved to a short buffer (STRINGBUF). If no
matching string is found, the first character of the unprocessed  input  is
appended (by ADD_TO_LITERAL) to  a  literal string buffer (LITERALBUF), and
if the maximum length for literal strings is thereby reached,  the  literal
string	is  output  by SEND_LITERAL_STRING. If a string match is found, any
pending literal string is first output, and then the code  for	the  string
match  is  output  by SEND_CODE.  In either case, once a character has been
processed, it is inserted into the sliding window  buffer  (SLIDING_WINDOW)
at  offset  position  WINDOW_POINTER by CHR->WINDOW.  The window pointer is
always wrapped at 4096, and thus the window is effectively a "ring buffer."
Finally, REFILL obtains an appropriate number of new input  characters	via
the I/O word GETCHR and appends them to the string in STRINGBUF.

	Initializing the compressor involves  filling  the  sliding  window
with  blanks,  setting	the  window  pointer  to zero, and initializing the
pointers  for  the  literal and I/O buffers.  When REFILL detects an end of
file condition, the central loop in CRUNCH is  exited,	and  all  remaining
unprocessed  characters are sent to the output compressed file as a literal
string.

	Decompression (UNCRUNCH) of a compressed file  involves  sequential
decoding of literal strings and index codes into output characters strings.
The  central loop always starts with  an  input  byte  (via  GETCHR)  known
to contain  the bits which determine the type and length of the next output
string.   If the upper nibble is zero, then the lower nibble determines the
length of a literal string,  which  is	then  retrieved  byte  by  byte  by
GETCHR_&_CHECK.  (If  a  file  error  or  end  of  file  occurs during this
operation, then decompression is aborted; this is a fatal error.)   If	the
upper  nibble  is  non-zero, then it determines the length, and the next 12
bits of input determine the offset of  a  string  to  be  copied  from	the
sliding window to the output  file  via  OUTSTRING.  In  either  case,	the
sliding  window  is  updated  with  each  processed   output  character  by
CHR->WINDOW.

	The  rate  limiting  step  of  the  compressor	is  the brute-force
string	search	performed  by the nonstandard primitive  word STRNDX. Since
this word is already in machine code, there is	little	speed  optimization
one  can  do  without introducing a  more  sophisticated algorithm for this
search. The binary tree algorithm  of  Bell  (1986) is used in the archiver
LHARC.	At  another  time  I  shall  perhaps present the code for a hashing
algorithm. But in any case the memory  requirements  for  compression  will
rise by at least 30K when such algorithms are used.

References:
  J. Ziv and A. Lempel, IEEE Trans., 23, 337-343 (1977).
  J. Ziv and A. Lempel, IEEE Trans., 24, 530-536 (1978).
  D. A. Huffman, Proc IRE, 40, 1098-1101 (1952).
  R. G. Gallager, IEEE Trans., 24, 668-674 (1978).
  T. C. Bell, IEEE Trans., 34, 1176-1182 (1986).
  T. A. Welch, Computer, 17, No. 6, 8-19 (1984).
  E. R. Fiala and D. H. Greene, Comm. of the ACM, 32, No. 4, 490-503 (1989).

P.S.   Don't   use  this  code	without  thorough  testing.   I  accept  no
responsibility for any usage.  An  effective  way  of debugging programs of
this  type  is	to  set  up  both  compressor  and  decompressor to operate
simultaneously, and to feed the output of the compressor directly into	the
decompressor, monitoring all the information streams.

.THEN

DOSINT \ this loads a DOS file interface
DECIMAL

\ flags for conditional compilation
FALSE CONSTANT DEBUGGING  \ true if you want the FCOPY test loaded

\ ************ FILE I/O ROUTINES ("stream i/o") ***********
0 .IF
       These are general purpose routines for the purpose of reading and writing
character streams from/to ms-dos files. There is nothing specific to the
compression/decompression algorithm here.
.THEN

HCB IFILE  ( handle control block for input file )
HCB OFILE  ( handle control block for output file )

\ necessary buffers and tables
4096 CONSTANT BUFSIZE
\ make this much smaller, ie 128, if you want to save memory
CREATE IBUFF  BUFSIZE ALLOT  ( buffer for input file )
CREATE OBUFF  BUFSIZE ALLOT  ( buffer for output file )

: INPUT-FILENAME ( -- ) \ read file name from console
  HERE 30 EXPECT
  HERE SPAN @ STRPCK SWAP NAME>HCB
;

: GET_FILE_NAMES  ( -- ) \ use an <esc> at the OK? prompt to abort
  BEGIN   CR ." Enter source file name: "
	  IFILE INPUT-FILENAME
	  CR ." Enter destination file name: "
	  OFILE INPUT-FILENAME
	  CR ." Source file	 = " IFILE HCB>N .FILE
	  CR ." Destination file = " OFILE HCB>N .FILE
	  CR ." OK? (use esc to abort)" KEY DUP
	  27 ( <esc> ) = IF DROP ." aborting " QUIT THEN
	  DUP EMIT 32 OR ASCII y = ( i.e. Y or y)
  UNTIL   CR ;

: OPEN_FILES
	  IFILE O_RD FOPEN
	  IF   CR ." Source file not found" CR QUIT
	  THEN OFILE 0 FMAKE
	  IF   CR ." Can't make destination file" CR
	       IFILE FCLOSE DROP QUIT
	  THEN ;

: .FILESIZE  ( hcb -- d )
\ print filesize in bytes of open file; moves file pointer!
    0. 2 FSEEK D.
;
: .FILESIZES ( -- )
    IFILE CR ." Source file =	   " .FILESIZE ." bytes"
    OFILE CR ." Destination file = " .FILESIZE ." bytes" ;

VARIABLE CHRS-IN-BUF \ the number of characters read into the input buffer
VARIABLE INPOINTER \ pointer to the next input character to be processed

: GETCHR ( -- c f)  ( get chr from source file,f=0 is EOF)
	CHRS-IN-BUF @ INPOINTER @ 2DUP <=
	   IF 2DROP
	      IFILE IBUFF BUFSIZE FREAD DUP CHRS-IN-BUF ! DUP IF ." I" THEN
	      INPOINTER OFF 0
	   THEN
	IBUFF + C@ SWAP 1 INPOINTER +!
;

VARIABLE OUTPOINTER \ points to next unused position in output buffer

: OBUFF-FLUSH ( -- ) \ flush all remaining chrs in OBUFF to output file
	OUTPOINTER @
	 IF OFILE OBUFF OUTPOINTER @ FWRITE ." O"
	    OUTPOINTER @ <> IF CR ABORT" ERROR: output to file failed" CR THEN
	    OUTPOINTER OFF
	 THEN
;

: PUTCHR ( c -- )  \ output byte to output file buffer, flushing when needed
	OUTPOINTER @ BUFSIZE >=
	   IF OBUFF-FLUSH THEN
	OUTPOINTER @ OBUFF + C! 1 OUTPOINTER +!
;

: CLOSE_FILES
	IFILE FCLOSE DROP
	OBUFF-FLUSH
	OFILE FCLOSE DROP ;

: INIT-BUFFS
	CHRS-IN-BUF OFF
	INPOINTER OFF
	OUTPOINTER OFF
;

DEBUGGING .IF

: FCOPY ( -- )	\ copy one file to another; test file i/o routines
	GET_FILE_NAMES
	OPEN_FILES
	INIT-BUFFS
	BEGIN
		 GETCHR
	WHILE  ( not at end of file )
		 PUTCHR
	REPEAT
	OBUFF-FLUSH
	.FILESIZES
	CLOSE_FILES
;

: PUTCHR ( c -- ) \ debugging replacement
  . 32 EMIT
;

: GETCHR ( -- c f ) \ debugging replacement
  KEY DUP 26 ( ^Z) = NOT
;
.THEN
\ ************ END OF FILE I/O ROUTINES ***********

\ the maximum size of a sliding window indexed by 12 bit codes is 4096
4096 CONSTANT MAXCODE

\ create sliding window
CREATE SLIDING_WINDOW  MAXCODE ALLOT \ sliding window of uncompressed text
  16 ALLOT \ string "spill" area -- makes string searches easier
VARIABLE WINDOW_POINTER \ pointer into the SLIDING_WINDOW ring buffer

CREATE STRINGBUF 16 ALLOT \ where incomming text is stored for matching
\ this buffer is used as a first in, first out (FIFO) structure
VARIABLE LENGTH \ length of string to be matched
VARIABLE PROCESSED \ # of processed chars in STRINGBUF

CREATE LITERALBUF 16  ALLOT \ where literal output string is built
VARIABLE LIT_LENGTH \ length of literal string thus far built

: CHR->WINDOW  ( c -- )
\ move one character to the SLIDING_WINDOW at the WINDOW_POINTER & bump pointer
 WINDOW_POINTER @ DUP 16 <
 IF 2DUP SLIDING_WINDOW + MAXCODE + C! THEN
\ this copies chars (0-15) to the spill area, a convenience for MATCHSTRING
 SLIDING_WINDOW + C! \ store the char in the SLIDING_WINDOW
 WINDOW_POINTER DUP @ 1+
 [ MAXCODE 1- ] LITERAL AND
 SWAP ! \ pointer updated and wrapped @ MAXCODE
;

: SEND_LITERAL_STRING
 LIT_LENGTH @ DUP 1- PUTCHR \ send the length-1
 LITERALBUF + LITERALBUF
 DO I C@ PUTCHR LOOP  \ send the string
 LIT_LENGTH OFF
;

: ADD_TO_LITERAL  ( -- )
\ add the 1st chr in stringbuf to the literal string
 STRINGBUF C@ DUP CHR->WINDOW
 LIT_LENGTH @ TUCK LITERALBUF + C!
 1 LIT_LENGTH +!
 15 =
   IF \ literal string at length limit
     SEND_LITERAL_STRING
   THEN
 STRINGBUF 1+ STRINGBUF 15 CMOVE
 1 PROCESSED +!
;

: MIN_MATCH_LENGTH ( --n ) LIT_LENGTH @ IF 3 ELSE 2 THEN ;

: MATCHSTRING  ( -- offset length )  ( length = 0 if no match)
 MIN_MATCH_LENGTH LENGTH !
 SLIDING_WINDOW MAXCODE
 BEGIN
   LENGTH @ 17 <
   IF
     2DUP STRINGBUF LENGTH @ STRNDX
     DUP -1 <>
   ELSE \ complete string match, look no further
     0 FALSE
   THEN
 WHILE 1 LENGTH +!
   TUCK - >R + R>
 REPEAT
 2DROP SLIDING_WINDOW -
 LENGTH @ 1- DUP
 MIN_MATCH_LENGTH < IF DROP 0 THEN
;

: STRING_SHIFT_LEFT ( n -- )
 STRINGBUF 2DUP + SWAP
 ROT 16 SWAP - CMOVE
;

: SEND_CODE  ( offset length -- )
\ send the matched string to the sliding_window,
\ and output the length and offset packed into 2 bytes
 DUP STRINGBUF + STRINGBUF
   DO I C@ CHR->WINDOW LOOP
 DUP PROCESSED +!
 DUP STRING_SHIFT_LEFT
 1- 4 SHIFT OVER 15 AND OR PUTCHR
 -4 SHIFT PUTCHR
;

: REFILL  ( -- f)
 ( replenish STRINGBUF from input file, FALSE flag means eof read)
 BEGIN
   PROCESSED @ DUP
   IF GETCHR 0=
     IF 2DROP FALSE \ it's the end of the input file
     ELSE TRUE
     THEN
   THEN
 WHILE ( not at end of file nor is STRINGBUF fully replenished)
   [ STRINGBUF 16 + ] LITERAL ROT - C!
   -1 PROCESSED +!
 REPEAT PROCESSED @ 0=
;

: INIT-WINDOW
\ fill the sliding window with blanks to start
 SLIDING_WINDOW [ MAXCODE 15 + ] LITERAL BLANK
 WINDOW_POINTER OFF
;

: SEND_FINAL_STRING
\ the last few input characters are sent as a literal string
 16 PROCESSED @ - ?DUP
 IF  \ any left?
   DUP 1- PUTCHR   \ the length
   0 DO I STRINGBUF + C@ PUTCHR LOOP
 THEN
;

: CRUNCH ( -- )
 INIT-BUFFS
 INIT-WINDOW
 LIT_LENGTH OFF
 16 PROCESSED !  \ force buffer to refill
 BEGIN
   REFILL
 WHILE ( not at end of input file)
   MATCHSTRING ?DUP  \ did we find a match?
     IF LIT_LENGTH @
       IF SEND_LITERAL_STRING THEN
       SEND_CODE
     ELSE DROP ADD_TO_LITERAL
     THEN
 REPEAT
 \ at this point we've already detected the end of
 \ the input file, so we finish up by flushing all
 \ remaining characters out as literal strings
 LIT_LENGTH @ IF SEND_LITERAL_STRING THEN
 SEND_FINAL_STRING
;

: GETCHR_&_CHECK  ( -- c) ( aborts execution if end of file detected)
 GETCHR 0=
 IF CR ." unexpected end of file -- aborting decompression " CR
    QUIT THEN
;

: OUTSTRING  ( address length -- )
\ must be careful not to overwrite what you are reading!
 DUP >R STRINGBUF SWAP CMOVE
 STRINGBUF R> + STRINGBUF 2DUP
 DO I C@ PUTCHR LOOP
 DO I C@ CHR->WINDOW LOOP
;

: UNCRUNCH ( -- )
 INIT-BUFFS
 INIT-WINDOW
 BEGIN
   GETCHR
 WHILE
   DUP 15 >
   IF ( starting a code)  ( -- byte [length-1]*16 )
     DUP -4 SHIFT 1+   \ this is the length
     SWAP 15 AND   \ this is the lower 4 bits of the offset
     GETCHR_&_CHECK 4 SHIFT OR
     SLIDING_WINDOW + ( -- length address )
     SWAP OUTSTRING
   ELSE ( it's a literal string)
     1+  \ the length
     0 DO GETCHR_&_CHECK DUP PUTCHR CHR->WINDOW LOOP
   THEN
 REPEAT DROP
;

: START  ( -- )
	CLS
	GET_FILE_NAMES
	OPEN_FILES
;

: FINISH  ( -- )
	OBUFF-FLUSH
	.FILESIZES
	CLOSE_FILES
;

: COMPRESS  ( -- )
	START CRUNCH FINISH
;

: DECOMPRESS  ( -- )
	START UNCRUNCH FINISH
;

.( LEMPEL-ZIV CODE LOADED)
QUIT
GLOSSARY of nonstandard words:
FREAD ( hcb adr len -- len') read len bytes from file to adr, len' was read
FWRITE ( hcb adr len -- len') write len bytes from adr to file, len' was written
FOPEN	( hcb mode -- status)  open file or device
O_RD   ( -- mode )  read only mode for FOPEN
FCLOSE	( hcb -- status)  close file indicated by hcb
0. 2 FSEEK ( hcb -- d) move file pointer to end of file, report position
-4 SHIFT means 16 /
4 SHIFT means 16 *
.IF  ( f -- ) if true, stop interpretation until .ELSE or .THEN is parsed
.ELSE	      stop or start interpretation
.THEN	      mark end of conditional interpretation/compilation
CLS  clear and home screen
0 CONSTANT TRUE
-1 CONSTANT FALSE
: OFF ( adr -- ) FALSE SWAP ! ;
: ON ( adr -- ) TRUE  SWAP ! ;
HCB <name>  make handle control block associated with <name>
: TUCK	( n1 n2 -- n2 n1 n2 )  SWAP OVER ;
: NIP  ( n1 n2 -- n2 )	SWAP DROP ;
STRNDX ( str_addr str_len patn_addr patn_len --- n ) searches memory starting
       at str_addr for a string matching that at address patn_addr and of length
       patn_len.  The argument returned is -1 if no match or the address of the
       match, an address between str_addr and [str_addr+str_len-patn_len].
 the SLIDING_WINDOW
 WINDOW_POINTER DUP @ 1+
 [ MAXCODE 1- ] LITERAL AND
 SWAP ! \ pointer updated and wrapped @ MAXCODE
;

: SEND_LITERAL_STRING
 LIT_LENGTH @ DUP 1- PUTCHR \ send the length-1
 LITERALBUF + LITERALBUF
 DO I C@ PUTCHR LOOP  \ send the string
 LIT_LENGTH OFF
;

: ADD_TO_LITERAL  ( -- )
\ add the 1st chr in stringbuf to the literal string
 STRINGBUF C@ DUP CHR->WINDOW
 LIT_LENGTH @ TUCK LITERALBUF + C!
 1 LIT_LENGTH +!
 15 =
   IF \ literal string at length limit
     SEND_LITERAL_STRING
   THEN
 STRINGBUF 1+ STRINGBUF 15 CMOVE
 1 PROCESSED +!
;

: MIN_MATCH_LENGTH ( --n ) LIT_LENGTH @ IF 3 ELSE 2 THEN ;

: MATCHSTRING  ( -- offset length )  ( length = 0 if no match)
 MIN_MATCH_LENGTH LENGTH !
 SLIDING_WINDOW MAXCODE
 BEGIN
   LENGTH @ 17 <
   IF
     2DUP STRINGB
