Clipper Support Bulletin #9
Clipper supported file structures

Copyright (c) 1991, 1992 Nantucket Corporation.  All rights reserved.

Version:  Clipper 5.0, version 5.01
Date:     20th February, 1992
Revised:  22nd May, 1992
Status:   Active

================================================================================

This Support Bulletin covers the following topics:

   1. DBFNTX/DBFNDX database (.dbf) file format
      1.1. File description record
      1.2. Field descriptor table
      1.3. Character fields
      1.4. Numeric fields
      1.5. Logical fields
      1.6. Date fields
      1.7. Memo fields
   2. DBFNTX/DBFNDX memo (.dbt) file format
   3. DBFNTX index (.ntx) file format
   4. Memory (.mem) file format

================================================================================
1. DBFNTX/DBFNDX database (.dbf) file format

Database files supported by the Clipper 5.0 DBFNTX and DBFNDX database
drivers are standard (.dbf) files supported by all previous versions of
Clipper as well as dBASE III and dBASE III PLUS.

DBFNTX/DBFNDX database (.dbf) files include a header describing the file
structure and field specifications, data records, and an end of file mark
(1Ah).  The header consists of three sections: a file description record,
one or more field descriptor records, and an end of header mark (0Dh).

Note: All number values in this Support Bulletin are expressed in decimal
unless otherwise noted.

----------------------------------------------------------------------------
1.1. File description record

The first record in a database (.dbf) header is 32 bytes in length and
contains information describing the file as follows:

Table: File description record
------------------------------------------------------------------
Offset   Format       Contents
------------------------------------------------------------------
0        03h or 83h   Signature byte:
                        03h - (.dbf) with no memo (.dbt) file
                        83h - (.dbf) with memo (.dbt) file
1        Year         Last update year without century
2        01 to 12     Month of last update
3        01 to 31     Day of last update
4-7      Long         Number of records
8-9      Word         Location in file where data begins (START)
10-11    Word         Record length (field sizes plus 1)
12-31    N/A          Reserved
------------------------------------------------------------------

Note: When a database file is created, bytes 12 thru 31 are NUL filled.
Once a database (.dbf) file exists, care must be taken NOT to change the
reserved values, since dBASE III PLUS uses these values whereas the
Clipper DBFNTX/DBFNDX database drivers do not.

----------------------------------------------------------------------------
1.2. Field descriptor table

Following the file description record, there is a table of field
descriptor records beginning at byte 32.  Each field descriptor record is
32 bytes in length and defines the attributes of a database field: name,
data type, length, and decimals for numeric fields.
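Before turning to the field descriptor layout, the 32-byte file
description record of section 1.1 can be decoded with a short sketch such
as the one below.  It is illustrative only: the names DbfHeader and
parse_file_description are hypothetical, and little-endian (Intel) byte
order is assumed for the Long and Word values, which this bulletin does
not state explicitly.

```python
import struct
from typing import NamedTuple


class DbfHeader(NamedTuple):
    signature: int      # 0x03 = no memo file, 0x83 = memo (.dbt) file present
    year: int           # last update year without century, as stored
    month: int          # 1 to 12
    day: int            # 1 to 31
    record_count: int   # Long at offsets 4-7
    data_start: int     # Word at offsets 8-9 (START)
    record_length: int  # Word at offsets 10-11 (field sizes plus 1)


def parse_file_description(buf: bytes) -> DbfHeader:
    """Decode the first 32 bytes of a .dbf file.

    Assumption: multi-byte values are little-endian.
    Bytes 12-31 are reserved and are deliberately not interpreted.
    """
    if len(buf) < 32:
        raise ValueError("file description record is 32 bytes long")
    # 4 single bytes, one 4-byte Long, two 2-byte Words = 12 bytes
    return DbfHeader(*struct.unpack_from("<4BLHH", buf, 0))
```

A caller would typically read the first 32 bytes of the file, pass them
to parse_file_description(), and then seek to data_start to reach the
first record.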

Table: Field descriptor definition
------------------------------------------------------------------
Offset   Format          Contents
------------------------------------------------------------------
0-10     Character       Field name (printable string; no spaces;
                         NUL terminated; NUL padded)
11       Character       Field type (C=character, L=logical,
                         M=memo, N=numeric, D=date)
12-15    N/A             Reserved
16       Unsigned byte   Field length, including the decimal point
                         for numerics (referred to in text as
                         LENGTH).  See offset 17 for information
                         on character fields exceeding 256 bytes
17       Unsigned byte   Numeric fields--number of decimal places
                         Character fields--most significant byte
                         (MSB) of LENGTH for fields whose lengths
                         exceed 256 bytes
18-31    N/A             Reserved
------------------------------------------------------------------

The last field structure is followed by a constant 13 (0Dh) and a
constant 0 (00h) indicating the end of the field structures.  This
differs from dBASE III PLUS, which does not write the 0 (00h) byte.

At START (defined above), the record data begins.  Each field is stored
sequentially according to the order in the header.  Before each record is
a deleted flag which is either a space or an asterisk ("*").  If the
deleted flag is an asterisk, the record is assumed to be deleted.  The
record length specified in the header includes the deleted flag.

Below is a brief definition of each field type, and the method of storage
employed.

----------------------------------------------------------------------------
1.3. Character fields

Character fields may contain any ASCII character from 0 to 255, and are
always of a static length (defined by LENGTH in the field structure
definition).  Note that the string is not NUL terminated.  An empty
character field contains all spaces (32, 20h).

----------------------------------------------------------------------------
1.4. Numeric fields

Numerics are stored as character equivalents with the decimal included.
There is no decimal character if the number of decimal places is zero.
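
As an illustration of this storage rule, the sketch below converts
between a number and its field text.  The helper names encode_numeric and
decode_numeric are hypothetical, and the sketch assumes that values are
right-justified in the field with leading spaces.

```python
def encode_numeric(value: float, length: int, decimals: int) -> bytes:
    """Format a number as fixed-width field text.

    Assumption: values are right-justified and padded with leading spaces.
    With decimals == 0 the 'f' format emits no decimal character.
    """
    text = f"{value:{length}.{decimals}f}"
    if len(text) > length:
        raise ValueError("value does not fit in a field of this LENGTH")
    return text.encode("ascii")


def decode_numeric(raw: bytes, decimals: int):
    """Convert field text back to a number; all-blank text yields None."""
    text = raw.decode("ascii").strip()
    if not text:
        return None
    return float(text) if decimals else int(text)
```

For example, encode_numeric(12.5, 9, 2) produces a 9-byte field holding
four spaces followed by "12.50".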

Empty numerics are padded with leading spaces, have a zero before the
decimal point, and zero padding after the decimal point to the end of the
field.  An empty numeric of length 9 with 2 decimals would look like
this: "     0.00".

----------------------------------------------------------------------------
1.5. Logical fields

Logical fields are stored as a single character.  "T" is stored for true.
All other characters are assumed to equate to a false value (though "F"
is most likely to be used).  An empty logical contains an "F" character.

----------------------------------------------------------------------------
1.6. Date fields

Date fields are exactly eight characters in length.  A date field is
stored in the format YYYYMMDD where YYYY = Year with century, MM = Month,
and DD = Day.  10/20/82 would be stored as "19821020".

----------------------------------------------------------------------------
1.7. Memo fields

Memo fields are always ten bytes in length.  The ten bytes hold a pointer
to the first 512-byte block in a (.dbt) file that contains the memo text.
The pointer is in ASCII; all spaces indicates that there is no memo text
for that field.

================================================================================
2. DBFNTX/DBFNDX memo (.dbt) file format

If a database (.dbf) file is defined containing a memo field, it has an
accompanying memo file with the same name and a (.dbt) extension.  The
memo (.dbt) file contains the actual variable-length memo field data,
while the memo field in the database file contains the number of the memo
file record where that field's value begins.

DBFNTX/DBFNDX memo fields can be up to 64K in length and are stored
identically to dBASE III and dBASE III PLUS.  Note that dBASE III and
dBASE III PLUS memo field values can be up to 512K in length.

DBFNTX/DBFNDX memo files consist of a series of 512-byte records: a
header record followed by one or more data records.
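
The link between the ten-byte memo field of section 1.7 and the 512-byte
records described here can be sketched as follows.  The name memo_offset
is hypothetical, and the sketch assumes that the pointer is a decimal
record number in ASCII and that records are counted from zero, with the
header record as record 0.

```python
MEMO_BLOCK_SIZE = 512  # size of every memo file record, per this section


def memo_offset(pointer_field: bytes):
    """Map a 10-byte memo field to a byte offset in the .dbt file.

    Returns None when the field is all spaces (no memo text).
    Assumption: the field holds a decimal record number, counted from
    zero with the header record as record 0.
    """
    text = pointer_field.decode("ascii").strip()
    if not text:
        return None
    return int(text) * MEMO_BLOCK_SIZE
```

Under these assumptions a field containing the number 3 refers to the
memo text that starts 1536 bytes into the memo file.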
The header record has the following format:

Table: Memo header record
------------------------------------------------------------------
Offset   Format   Contents
------------------------------------------------------------------
0-3      Long     Number of 512-byte blocks in the file,
                  including the header (also the next
                  available record)
4-511    Unused   Reserved
------------------------------------------------------------------

Each memo field value is stored as a series of memo file records
terminated with a Ctrl-Z (1Ah) character.  If a memo field does not
contain an even multiple of 512 bytes, the unused remainder of the last
record is padded to 512 bytes with spaces.  Note that the last record in
the memo file is not padded and may be less than 512 bytes in length.

================================================================================
3. DBFNTX index (.ntx) file format

The Clipper DBFNTX database driver uses a modified B+ tree style index
structure.  Each index (.ntx) file consists of pages that are 1024 bytes
long.
The first page is a header with the following structure:

Table: First (.ntx) page
------------------------------------------------------------------
Offset     Format      Contents
------------------------------------------------------------------
0-1        Word        Signature byte: 03 = Index file
2-3        Word        Clipper indexing version number
4-7        Long        Offset in file for first index page
8-11       Long        Offset to an unused key page
12-13      Word        Key size + 8 bytes (distance between
                       successive ITEM entries)
14-15      Word        Key size
16-17      Word        Number of decimals in key (if numeric)
18-19      Word        Maximum entries per page
20-21      Word        Minimum entries per page or half page
                       (the first, or root, page of an index has
                       a minimum of 1 entry regardless of this
                       value)
22-277     256 bytes   Key expression, followed by CHR(0) bytes
278        Byte        1 if index is unique, 0 if not
279-1023   745 bytes   Filler (pads the page to 1024 bytes)
------------------------------------------------------------------

Subsequent index key pages consist of the following structure:

Table: Subsequent index pages
------------------------------------------------------------------
Offset     Format      Contents
------------------------------------------------------------------
0-1        Word        Number of used entries on this page.  This
                       number will be between the minimum and
                       maximum defined in the header, unless it
                       is the root page.
2          Unsigned    An array of unsigned 16-bit words begins
                       here.  The array length is equal to the
                       maximum number of key entries per page
                       + 1.  The elements contain offsets into
                       the page where the key values (ITEMs) are
                       located.
Remainder  ITEM        ITEM entries (described below).
of page
------------------------------------------------------------------

Following the array of offsets into the page are the key value entries.
These so-called ITEM entries describe a key value and its record's
position in the database.  The key value is always stored as a character
string, regardless of its type.
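
The header page layout from the first table above can be read with a
sketch along these lines.  The function name parse_ntx_header and the
dictionary keys are hypothetical, and little-endian byte order is assumed
for Word and Long values.

```python
import struct

NTX_PAGE_SIZE = 1024  # every page, including the header, is 1024 bytes


def parse_ntx_header(page: bytes) -> dict:
    """Decode the first page of an .ntx file (assumes little-endian)."""
    if len(page) < NTX_PAGE_SIZE:
        raise ValueError("header page must be 1024 bytes")
    # Offsets 0-21: Word, Word, Long, Long, then five Words
    (signature, version, first_page, unused_page, item_size,
     key_size, key_decimals, max_entries, min_entries) = struct.unpack_from(
        "<HHLLHHHHH", page, 0)
    # Offsets 22-277: key expression, terminated and padded with CHR(0)
    key_expression = page[22:278].split(b"\0", 1)[0].decode("ascii")
    return {
        "signature": signature,
        "version": version,
        "first_page_offset": first_page,
        "unused_page_offset": unused_page,
        "item_size": item_size,        # key size + 8
        "key_size": key_size,
        "key_decimals": key_decimals,
        "max_entries": max_entries,
        "min_entries": min_entries,
        "key_expression": key_expression,
        "unique": page[278] == 1,      # offset 278: unique flag
    }
```

The first_page_offset value is the file position to seek to when starting
a traversal of the key pages.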
The structure of an ITEM entry is as follows:

Table: ITEM entry
------------------------------------------------------------------
Offset   Format      Contents
------------------------------------------------------------------
0-3      Long        Pointer to a page in the index file,
                     containing keys that are prior to this key
4-7      Long        Record number in controlling database file
8        Character   Key value.  This field begins at offset 8
                     and continues for the length of the key.
                     Numerics are padded with leading zeros
------------------------------------------------------------------

For more information about traversing Clipper (.ntx) files, you may want
to reference the following books:

Spence, Rick.  Clipper Programming Guide, Second Edition.  (Microtrend
Books, Slawson Communications, Inc.; ISBN 0-915391-41-4).

Tenenbaum, A.M. et al.  Data Structures Using C.  (Prentice-Hall).

================================================================================
4. Memory (.mem) file format

The value of each memory variable in a (.mem) file is preceded by a
32-byte identifier that has the following structure:

Table: Memory variable identifier
------------------------------------------------------------------
Bytes    Contents
------------------------------------------------------------------
0-10     A null-terminated string containing the variable name
11       Variable type
12-15    Reserved
16       Numeric--Number of whole digits
         Character--Low-order byte of variable length
17       Numeric--Number of decimal digits
         Character--High-order byte of variable length
18-31    Reserved
------------------------------------------------------------------

Additional information about (.mem) files, including the C source code to
dump a (.mem) file, can be found on page 575 of Rick Spence's book
"Clipper Programming Guide - 2nd Edition."
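
The 32-byte memory variable identifier in the table above can be split
into its parts with a sketch like the following.  The function name
parse_mem_identifier and the dictionary keys are hypothetical.  Because
this bulletin does not list the variable type codes, the type byte is
returned as a raw number and bytes 16 and 17 are exposed both raw and
combined, leaving the interpretation to the caller.

```python
def parse_mem_identifier(ident: bytes) -> dict:
    """Split a 32-byte .mem variable identifier into its fields.

    The type byte (offset 11) is returned uninterpreted because its
    encoding is not described in this bulletin.
    """
    if len(ident) != 32:
        raise ValueError("identifier must be exactly 32 bytes")
    # Bytes 0-10: variable name, null-terminated
    name = ident[0:11].split(b"\0", 1)[0].decode("ascii")
    return {
        "name": name,
        "type_byte": ident[11],
        # Numeric reading: whole digits / decimal digits
        "byte16": ident[16],
        "byte17": ident[17],
        # Character reading: low-order byte plus high-order byte
        "character_length": ident[16] | (ident[17] << 8),
    }
```

For a character variable, character_length gives the length of the value
that follows the identifier; for a numeric variable, byte16 and byte17
are the whole and decimal digit counts.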