Title:     RECIO DESIGN AND DEVELOPMENT NOTES
Copyright: (C) 1994 William Pierpoint
Version:   1.10
Date:      March 28, 1994


1.0 DATA STRUCTURES

1.1 REC structure for each record stream

    * defined in recio.h.

    * one static REC for recin (included in ROPEN_MAX count).

    * allocate dynamic array of RECs dimensioned to ROPEN_MAX-1 in ropen().

    * Each REC has two associated buffers:
        1) record string buffer containing current record;
           allocate when first record read;
           reallocate if record becomes larger.
        2) field string buffer containing current field;
           allocate when first field read;
           reallocate if field becomes larger.

    * deallocate dynamic RECs and associated buffers in rclose() and
      rcloseall() if all record streams closed; deallocate associated
      buffers for recin with an exit function registered with atexit().

1.2 Accessing REC Members and Associated Buffers

    How do I
    * access the name of the record stream?                rnames()
    * access the current context number?                   rcxtno()
    * access the current record number?                    rrecno()
    * access the current field number?                     rfldno()
    * access the current column number?                    rcolno()
    * access the record string buffer?                     rrecs()
    * access the field string buffer?                      rflds()
    * determine if column numbers start at 0 or 1?         rbegcolno()
    * determine if there are more records left?            reof()
    * determine if there is an error on the stream?        rerror()
    * force an error on a record stream?                   rseterr()
    * clear an error on a record stream?                   rclearerr()
    * increase the size of the record string buffer?       rsetrecsiz()
    * increase the size of the field string buffer?        rsetfldsiz()
    * replace the data in the field string buffer?         rsetfldstr()
    * set the field delimiter character?                   rsetfldch()
    * set the text delimiter character?                    rsettxtch()
    * set the context number?                              rsetcxtno()
    * set column numbering to start at 0 or 1?             rsetbegcolno()


2.0 CODE STRUCTURES

2.1 Callback Error Function Skeleton

    if valid record pointer [isvalid(rp)]
        if past end of file [reof(rp)]
        else [error number set]
            switch error number [rerror(rp)]
            case data errors [R_ERANGE || R_EINVDAT || R_EMISDAT]
                switch context number [rcxtno(rp)]
                case RECIN
                    switch field number [rfldno(rp)]
                    case 1
                    ...
                    endcase
                ...
                default [missing or unknown context number]
                endcase
            case out of memory [R_ENOMEM]
            case non-fatal errors [R_ENOREG]
            case fatal errors [R_EINVAL]
            default [possibly set by application with rseterr()]
            endcase
        endif
    else [invalid record pointer]
        switch error number [errno]
        case out of memory [ENOMEM]
        case out of record or file pointers [EMFILE]
        case permission denied [EACCES]
        case fatal errors [EINVAL]
        default [possibly set by application with rseterr()]
        endcase
    endif

2.2 Classes of Field Functions

    There are four classes of functions that return field values:

    rget   - character delimited field, base 10 if numeric field
    rbget  - numeric character delimited field, base 0 & 2-36
    rcget  - column delimited field, base 10 if numeric field
    rcbget - numeric column delimited field, base 0 & 2-36

2.3 How to Define and Declare New Field Functions

    You can define a new function to get data using one of these macros:

    macro:         macro defined in:    define new function in:
    -----------    -----------------    -----------------------
    rget_fn()      _rget.h              rget.c
    rbget_fn()     _rbget.h             rbget.c
    rcget_fn()     _rcget.h             rcget.c
    rcbget_fn()    _rcbget.h            rcbget.c

    macro:         declare new function in recio.h as:
    -----------    ---------------------------------------------------------
    rget_fn()      rget?(REC *rp);
    rbget_fn()     rbget?(REC *rp, int base);
    rcget_fn()     rcget?(REC *rp, size_t begcol, size_t endcol);
    rcbget_fn()    rcbget?(REC *rp, size_t begcol, size_t endcol, int base);

    where ? is one or more new unique letters

    All four macros have the same seven arguments:
    --------------------------------------------------
    fn_type    defined function return type
    fn_name    defined function name
    fn_err     defined function error return value
    cv_type    conversion function return type
    cv_name    conversion function name
    fn_min     inclusive valid minimum value
    fn_max     inclusive valid maximum value

    The commonly used conversion functions are:

    name:      return type:
    -------    -------------
    strtol     long
    strtoul    unsigned long
    strtod     double
    strtoc     character
               (portability note: strtoc violates the ANSI C reserved
               namespace)

    Example: suppose you want to define a function rgetb() that gets a
    boolean value (unsigned char) and generates an ERANGE error if the
    value is not 0 or 1:

        /* definition to add to rget.c */
        rget_fn(unsigned char, rgetb, 0, long, strtol, 0, 1)

        /* declaration to add to recio.h */
        unsigned char rgetb(REC *rp);

    --OR to generate an EINVDAT error if the value is not 0 or 1--

        /* definition to add to rbget.c */
        rbget_fn(unsigned char, rbgetb, 0, long, strtol, 0, 1)

        /* declaration to add to recio.h */
        unsigned char rbgetb(REC *rp, int base);

        /* macro to add to recio.h */
        #define rgetb(rp) (rbgetb((rp), 2))


3.0 DEVELOPMENT NOTES

3.1 fgets (Microsoft C 5.1)

    Previous notes of mine indicate that Microsoft's fgets function does
    not work correctly when it reads a line of text that consists of only
    a newline. However, this can be worked around by first setting the
    string buffer to an empty string. You will need to test this if you
    plan on retaining the newline.

    The fgets function is used twice in the rgetrec function. If porting
    to Microsoft C, you may need to implement this fix:

        *rrecs(rp) = '\0';    /* just prior to the first fgets */
        *str = '\0';          /* just prior to the second fgets */

3.2 fopen (Borland C 3.1)

    fopen() calls __openfp(), which calls open(). Borland's "Library
    Reference" documents error numbers for open(), but not for fopen().
    These error numbers are ENOENT, EMFILE, EACCES, and EINVACC. Because
    ropen() screens the access code, the EINVACC error will not occur
    from the recio library.

3.3 strtol & strtoul (Borland C 3.1)

    These functions stop consuming input once they overflow, setting
    ERANGE. Hence endptr can point into the middle of a sequence of valid
    characters having the expected form as given in ANSI X3.159-1989,
    Sections 4.10.1.5 and 4.10.1.6. IMHO this characteristic is not in
    conformance with the ANSI standard, as endptr should only point to
    the first unrecognized character or to the terminating null
    character. Borland's strtod does not have this problem.