Title:     STANDARD USAGE OF C LANGUAGE RECIO LIBRARY
Copyright: (C) 1994 William Pierpoint
Version:   1.10
Date:      March 28, 1994


1.0 INTRODUCTION

The implementation described by this standard usage is a superset of
the recio specification. Enhancements are noted in the text.

1.1 Mnemonics

The recio functions follow a consistent mnemonic naming convention.
All recio function names are in lower case and start with the letter r.
Function names are analogous to the stdio functions. Mnemonics are as
follows:

    Single letter (field functions)             Multi-letter
    ----------------------------------------    -----------------
    b - base (prefix)                           beg - beginning
    c - column (prefix), character (suffix)     ch  - character
    d - double (suffix)                         col - column
    f - float (suffix)                          cxt - context
    i - integer (suffix)                        eof - end of file
    l - long (suffix)                           err - error
    n - number                                  fld - field buffer
    r - record pointer (first letter)           fn  - function
    s - string pointer (suffix)                 no  - number
    u - unsigned (suffix)                       rec - record buffer
                                                siz - size of buffer
                                                str - string
                                                txt - text

1.2 Order

The order in which the prefix mnemonics appear indicates the order in
which the arguments appear in the function. The suffix mnemonics tell
you what the function returns. For example, rbgetui():

    arguments:  r  - record pointer
                b  - base (radix) of input
    returns:    ui - unsigned integer

Note: c is used in the prefix of a function's name only once even if
there are two column arguments. If the function returns a character,
there is only one column argument; otherwise there are two.


2.0 ERROR CHECKING

The functions declared in the recio header make use of the errno macro
defined in section 4.1.3 of ANSI X3.159-1989. This mechanism was
chosen because (1) the conversion functions (strtod(), strtol(), etc.)
make use of this error reporting mechanism and (2) the recio functions
make use of the conversion functions.

In this implementation, errno can be set to the following macro
constants:

    0      - No error.
    EACCES - permission denied.
    EINVAL - invalid argument (usually null record pointer).
    EMFILE - too many open files.
    ENOENT - no such file or directory.
    ENOMEM - out of memory.

Beginning with version 1.1, recio functions set errno when the record
pointer is invalid and set an internal error number when the record
pointer is valid. The recio error number is accessed through the
rerror function. The rerror function can return the following macro
constants:

    0         - No error.
    R_EINVAL  - invalid argument (not the record pointer).
    R_EINVDAT - invalid data.
    R_EMISDAT - missing data.
    R_ENOMEM  - out of memory.
    R_ENOREG  - unable to register exit function with atexit().
    R_ERANGE  - data out of range.

2.1 Define Callback Error Function

First define a callback error function to be used by the recio
functions. You may give the function any name you wish. In the
sample function below, the name rerrfn is used. The function takes
one argument, a record pointer (REC *). It returns nothing (void).
The function must first check for a valid record pointer using the
risvalid function. Other than that, you can customize it to do
whatever you want.

The recio functions use a callback error function in order to give the
most flexibility in handling errors. This rerrfn function just sends
information to stderr. You may wish to send information to a printer,
a file, a window, or a dialog box. You might even want to give users
the ability to examine errors and enter corrections. If the error is
corrected, you will want to call the rclearerr function before your
callback error function returns.

When your callback error function is invoked, check rerror() or errno
to determine the cause of the error.

Symbolic errno constants:

* EACCES means that you don't have permission to access this file.
  All MSDOS files have read permission.

* EINVAL indicates an invalid argument to a function, usually a NULL
  record pointer. This results from a programming error.

* EMFILE means the program tried to open more files than the maximum
  allotted by ROPEN_MAX or FOPEN_MAX. If your program is interactive,
  the user can close one or more open record streams. Or you might
  decide that ROPEN_MAX or FOPEN_MAX needs to be a larger value.

* ENOENT says that ropen() could not find the requested file to open.
  Perhaps the name of the file was misspelled, or your program looked
  in the wrong directory. If your program was trying to read a
  configuration file, it could use internal default values when the
  configuration file does not exist.

* ENOMEM indicates that the program ran out of heap space. You may be
  able to correct this if you are able to deallocate memory you no
  longer need. For example, you could reduce the size of buffers when
  the size only affects speed. Such buffers need to be flushed first.
  Buffers used by the recio library do not fit this criterion.

Symbolic rerror() constants:

* R_ENOREG means the program was unable to register the internal recio
  exit function with the ANSI atexit() function. The internal recio
  exit function ensures that all open record streams are closed and
  all dynamic memory allocated by the recio library is deallocated.
  This error is not fatal. R_ENOREG only has this meaning within the
  context of the recio library.

* R_EINVDAT says the data is invalid. Invalid data is caused by an
  unrecognized character in the field. For example, rgetui() doesn't
  expect to see a negative sign, so a negative number will be flagged
  as invalid data.

* R_EMISDAT says the data is missing. Missing data means the field is
  empty. If you expect a number, you could substitute either zero or
  some unique number to indicate an empty field.

* R_ENOMEM indicates that the program ran out of heap space. You may
  be able to correct this if you are able to deallocate memory you no
  longer need. For example, you could reduce the size of buffers when
  the size only affects speed. Such buffers need to be flushed first.
  Buffers used by the recio library do not fit this criterion.

* R_ERANGE tells you that the data is outside the range of the
  function. For instance, suppose you used rgeti() to get an integer
  and the data value is 32768. If a 16-bit integer has an upper limit
  of 32767, the value is too large. If the data is wrong, you can
  have the error function correct it. If the data is right, you have
  to correct the program.

The main purpose of this sample callback error function is to show some
of the kinds of things you can do in a callback error function. Note
that when an error occurs, the column number indicator rcolno() has
moved just beyond the error. To make it clearer to the user where the
error occurred, rerrfn() displays rcolno()-1, but not less than the
column number for the first column of the record. For a more detailed
callback error function, see the source code for one of the test
programs.

    /* define callback error function */
    void rerrfn(REC *rp)
    {
        int errnum;     /* error number */

        /* if rp is a valid record pointer */
        if (risvalid(rp)) {

            /* reof flag set */
            if (reof(rp)) {
                fprintf(stderr, "ERROR reading %s: "
                    "tried to read past end of file\n\n", rnames(rp));

            /* rerror flag set */
            } else {

                /* determine cause of error */
                errnum = rerror(rp);
                switch (errnum) {

                /* data errors */
                case R_ERANGE:
                case R_EINVDAT:
                case R_EMISDAT:
                    /* print location of error */
                    fprintf(stderr, "DATA ERROR in FILE %s at LINE %ld,"
                        " FIELD %u, COLUMN %u\n",
                        rnames(rp), rrecno(rp), rfldno(rp),
                        max(rcolno(rp)-1, rbegcolno(rp)));
                    break;

                /* warnings: non-fatal errors */
                case R_ENOREG:
                    fprintf(stderr,
                        "WARNING: could not register exit function\n");
                    rclearerr(rp);
                    break;

                /* fatal errors (R_EINVAL, R_ENOMEM) */
                case R_EINVAL:
                    fprintf(stderr, "FATAL ERROR reading FILE %s: "
                        "invalid argument\n", rnames(rp));
                    abort();
                    break;
                case R_ENOMEM:
                    fprintf(stderr, "FATAL ERROR reading FILE %s: "
                        "out of memory\n", rnames(rp));
                    abort();
                    break;
                default:
                    fprintf(stderr, "FATAL ERROR reading FILE %s: "
                        "unknown error\n", rnames(rp));
                    abort();
                    break;
                }
            }

        /* else invalid record pointer */
        } else {
            switch (errno) {

            /* non-fatal errors */
            case EACCES:
            case EMFILE:
                fprintf(stderr, "WARNING: %s\n", strerror(errno));
                break;

            /* fatal errors (EINVAL, ENOMEM) */
            default:
                fprintf(stderr, "FATAL ERROR: %s\n", strerror(errno));
                abort();
                break;
            }
        }
    }

2.2 Register Callback Error Function

Once you have written your callback error function, you must let the
other recio functions know that it exists. You use the rseterrfn
function to register your callback error function.

    /* register rerrfn() as callback error function for recio */
    rseterrfn(rerrfn);


3.0 OPEN FILE

3.1 Open File and Get Record Pointer

Use the ropen function to open the file you want to read. Store the
record pointer returned by the ropen function. To read from standard
input, do not try to open recin. It is always open, so it does not
need to be opened or closed.

    REC *rp = ropen("FILENAME.DAT", "r");

3.2 Check Record Pointer

Following the ropen function, you need to check whether the file was
opened correctly. If ropen returned a NULL pointer, then the file was
not opened. Errors other than ENOENT are reported to your callback
error function. ENOENT is not reported since you may want to use
default values if the data file is not available.

    /* if ropen() failed */
    if (!rp) {

        /* if it failed because file does not exist */
        if (errno==ENOENT) {
            /* action to take when file does not exist */
            ...
        }

    /* else ropen() succeeded */
    } else {
        /* set up for read (see sections 3.3 and 3.4) */
        ...
        /* read through file (see sections 4 and 5) */
        ...
        /* close file (see section 6) */
        rclose(rp);
    }

3.3 Set Field and Text Delimiters

The space character is the default value for both the field and text
delimiters. If you need to use something else, you need to explicitly
set the values. Application maintenance may be easier if you always
set the values.

    rsetfldch(rp, ',');     /* set field delimiter character */
    rsettxtch(rp, '"');     /* set text delimiter character */

3.4 Set Field and Record Buffer Sizes

Setting the field and record buffer sizes is optional. Buffers will be
automatically reallocated as necessary. However, if you set the field
and record sizes in advance to the maximum value needed, you can reduce
memory fragmentation.

    rsetfldsiz(rp, 41);     /* set size of field buffer */
    rsetrecsiz(rp, 133);    /* set size of record buffer */

3.5 Set Context Number

If your application opens record streams with more than one data
format, you will want to set a context number. You use the context
number so that your callback error function can determine (using the
rcxtno function) which data format it is dealing with. Each context
number must be a positive integer; zero and negative numbers are
reserved.

    #define SOILS_DB     1
    #define BUILDINGS_DB 2

    rsetcxtno(rp, SOILS_DB);    /* set context number */

3.6 Set Beginning Column Number

The first column number in the record buffer defaults to zero. If you
prefer column numbering to start at one, use the rsetbegcolno function.
This is mainly useful with column delimited data. If a number takes
up the first ten columns of the record, the columns are numbered 0 to 9
if the beginning column number is set to 0, or 1 to 10 if it is set
to 1.

    rsetbegcolno(rp, 1);    /* number first column as one */


4.0 READ ALL RECORDS IN FILE

4.1 The rgetrec Function

If all the records in a data file have the same format, you will want
to loop through all the records until the end of file is reached. If
each record has a different format, you must call the rgetrec function
each time you want to get the next record. Calling rgetrec() is
optional for the first record.

    /* loop through all records in file */
    while (rgetrec(rp)) {
        /* Section 5 field functions go here ... */
    }

4.2 The rrecs Macro

To get a pointer to the start of the record buffer, use the rrecs
macro.

    /* echo record contents to stdout */
    printf("%s\n", rrecs(rp));

4.3 The rrecno Macro

To get the record number, use the rrecno macro.

    /* echo record number and record contents to stdout */
    printf("%ld: %s\n", rrecno(rp), rrecs(rp));


5.0 GET FIELD DATA FOR EACH RECORD

The recio functions can handle records with two types of fields:
(1) character delimited and (2) column delimited.

5.1 Character delimited fields

5.1.1 Character fields

5.1.1.1 The rgetc Function

Use the rgetc function to get a field consisting of a single
non-whitespace character. Any whitespace in the field is skipped.

    /* get one non-whitespace character */
    int ch = rgetc(rp);

5.1.2 String fields

String field functions return a pointer to the string buffer. The
string buffer is overwritten each time a new string field is read. To
save the string for later use, copy the string to a character array
with sufficient space to hold the string (including the terminating
null).

5.1.2.1 The rgets Function

Use the rgets function to get a field consisting of a string.

    /* duplicate string in string buffer */
    char *str = strdup(rgets(rp));
    ...
    /* free string memory space when done with string */
    free(str);

5.1.3 Floating point fields

5.1.3.1 The rgetd Function

Use the rgetd function to get a field consisting of a double precision
floating point number.

    /* get a double */
    double result = rgetd(rp);

5.1.3.2 The rgetf Function

Use the rgetf function to get a field consisting of a single precision
floating point number.

    /* get a float */
    float result = rgetf(rp);

5.1.4 Integer fields

5.1.4.1 Base 10 integer fields

5.1.4.1.1 The rgeti Macro

Use the rgeti macro to get a field consisting of a decimal integer.

    /* get a decimal integer */
    int result = rgeti(rp);

5.1.4.1.2 The rgetl Macro

Use the rgetl macro to get a field consisting of a decimal long.

    /* get a decimal long */
    long result = rgetl(rp);

5.1.4.1.3 The rgetui Macro

Use the rgetui macro to get a field consisting of an unsigned decimal
integer.

    /* get an unsigned decimal integer */
    unsigned int result = rgetui(rp);

5.1.4.1.4 The rgetul Macro

Use the rgetul macro to get a field consisting of an unsigned long
decimal integer.

    /* get an unsigned decimal long */
    unsigned long result = rgetul(rp);

5.1.4.2 Explicit base integer fields

5.1.4.2.1 The rbgeti Function

Use the rbgeti function to get a field consisting of an integer in a
specified radix.

    /* get a hexadecimal integer */
    int result = rbgeti(rp, 16);

5.1.4.2.2 The rbgetl Function

Use the rbgetl function to get a field consisting of a long integer in
a specified radix.

    /* get a hexadecimal long integer */
    long result = rbgetl(rp, 16);

5.1.4.2.3 The rbgetui Function

Use the rbgetui function to get a field consisting of an unsigned
integer in a specified radix.

    /* get a hexadecimal unsigned integer */
    unsigned int result = rbgetui(rp, 16);

5.1.4.2.4 The rbgetul Function

Use the rbgetul function to get a field consisting of an unsigned long
integer in a specified radix.

    /* get a hexadecimal unsigned long integer */
    unsigned long result = rbgetul(rp, 16);

5.1.5 Other Functions

5.1.5.1 The rskipfld Macro

If your application does not need the data in a field, you can skip
over the field by using the rskipfld macro.

    /* skip over a field */
    if (rskipfld(rp) != 1) printf("Unable to skip field.\n");

5.1.5.2 The rskipnfld Function

If your application does not need the data in several adjacent fields,
you can skip over the fields by using the rskipnfld function.

    /* skip over three fields */
    if (rskipnfld(rp, 3) != 3) printf("Unable to skip 3 fields.\n");

5.2 Column delimited fields

5.2.1 Character fields

5.2.1.1 The rcgetc Function

Use the rcgetc function to get a character from a specific column.

    /* get character from column number 12 */
    int ch = rcgetc(rp, 12);

5.2.2 String fields

String field functions return a pointer to a static string. This
static string is overwritten each time a new string field is read.
To save the string for later use, copy the string to a character array
with sufficient space to hold the string (including the terminating
null).

5.2.2.1 The rcgets Function

Use the rcgets function to get a string between two column locations.

    /* duplicate string in string buffer */
    char *str = strdup(rcgets(rp, 0, 9));
    ...
    /* free string space when done with string */
    free(str);

5.2.3 Floating point fields

5.2.3.1 The rcgetd Function

Use the rcgetd function to get a double between two column locations.

    /* get a double between columns 0 and 9 */
    double result = rcgetd(rp, 0, 9);

5.2.3.2 The rcgetf Function

Use the rcgetf function to get a float between two column locations.

    /* get a float between columns 0 and 9 */
    float result = rcgetf(rp, 0, 9);

5.2.4 Integer fields

5.2.4.1 Base 10 integer fields

5.2.4.1.1 The rcgeti Macro

Use the rcgeti macro to get a decimal integer between two column
locations.

    /* get a decimal integer between columns 0 and 9 */
    int result = rcgeti(rp, 0, 9);

5.2.4.1.2 The rcgetl Macro

Use the rcgetl macro to get a decimal long integer between two column
locations.

    /* get a decimal long between columns 0 and 9 */
    long result = rcgetl(rp, 0, 9);

5.2.4.1.3 The rcgetui Macro

Use the rcgetui macro to get a decimal unsigned integer between two
column locations.

    /* get a decimal unsigned integer between columns 0 and 9 */
    unsigned int result = rcgetui(rp, 0, 9);

5.2.4.1.4 The rcgetul Macro

Use the rcgetul macro to get a decimal unsigned long between two column
locations.

    /* get a decimal unsigned long between columns 0 and 9 */
    unsigned long result = rcgetul(rp, 0, 9);

5.2.4.2 Explicit base integer fields

5.2.4.2.1 The rcbgeti Function

Use the rcbgeti function to get an integer in a specified radix from
between two column locations.

    /* get a hexadecimal integer between columns 0 and 9 */
    int result = rcbgeti(rp, 0, 9, 16);

5.2.4.2.2 The rcbgetl Function

Use the rcbgetl function to get a long in a specified radix from
between two column locations.

    /* get a hexadecimal long between columns 0 and 9 */
    long result = rcbgetl(rp, 0, 9, 16);

5.2.4.2.3 The rcbgetui Function

Use the rcbgetui function to get an unsigned integer in a specified
radix from between two column locations.

    /* get a hexadecimal unsigned integer between columns 0 and 9 */
    unsigned int result = rcbgetui(rp, 0, 9, 16);

5.2.4.2.4 The rcbgetul Function

Use the rcbgetul function to get an unsigned long in a specified radix
from between two column locations.

    /* get a hexadecimal unsigned long between columns 0 and 9 */
    unsigned long result = rcbgetul(rp, 0, 9, 16);

5.3 Other Functions

5.3.1 The reof Macro

Use the reof macro to determine when the record stream has reached the
end of file.

    /* if error or end of file reached */
    if (!rgetrec(rp)) {

        /* if end of file */
        if (reof(rp)) {
            ...

        /* else error */
        } else {
            ...
        }
    }

5.3.2 The rerror Macro

Use the rerror macro to determine if an error has occurred on a record
stream. The rerror macro returns the error number. It is good
practice to check for any errors just prior to closing a record stream.
If the error indicator is clear, you have additional confidence that
the stream was read correctly.

    if (rerror(rp))
        printf("File %s not read correctly.\n", rnames(rp));
    rclose(rp);

5.3.3 The rseterr Function

If you write wrapper functions or other functions that interact with
recio functions, your code will need to handle errors. You can use the
rseterr function to set the error number and to call the record stream
callback error function.

    /* get integer and validate range */
    int rrgeti(REC *rp, int min, int max)
    {
        int result;

        result = rgeti(rp);
        if (result < min || result > max) {
            rseterr(rp, R_ERANGE);
        }
        return result;
    }


6.0 CLOSE FILE

6.1 Close File

When finished reading a data file, close it. Do not attempt to close
recin as it is always open.

    /* close record file */
    rclose(rp);

6.2 Close All Files

Rather than closing record files one at a time, one can close all open
record files at once using the rcloseall function.

    /* all done */
    rcloseall();


7.0 INDEX

    errno macro ............ 2.0, 2.1, 3.2
    rbegcolno macro ........ 2.1
    rbgeti function ........ 5.1.4.2.1
    rbgetl function ........ 5.1.4.2.2
    rbgetui function ....... 5.1.4.2.3
    rbgetul function ....... 5.1.4.2.4
    rcbgeti function ....... 5.2.4.2.1
    rcbgetl function ....... 5.2.4.2.2
    rcbgetui function ...... 5.2.4.2.3
    rcbgetul function ...... 5.2.4.2.4
    rcgetc function ........ 5.2.1.1
    rcgetd function ........ 5.2.3.1
    rcgetf function ........ 5.2.3.2
    rcgeti macro ........... 5.2.4.1.1
    rcgetl macro ........... 5.2.4.1.2
    rcgets function ........ 5.2.2.1
    rcgetui macro .......... 5.2.4.1.3
    rcgetul macro .......... 5.2.4.1.4
    rclearerr macro ........ 2.1
    rclose function ........ 6.1
    rcloseall function ..... 6.2
    rcolno macro ........... 2.1
    rcxtno macro ........... 3.5
    recin expression ....... 3.1, 6.1
    reof macro ............. 2.1, 5.3.1
    rerror macro ........... 2.1, 5.3.2
    rflds macro ............ 2.1
    rfldno macro ........... 2.1
    rgetc function ......... 5.1.1.1
    rgetd function ......... 5.1.3.1
    rgetf function ......... 5.1.3.2
    rgeti macro ............ 5.1.4.1.1
    rgetl macro ............ 5.1.4.1.2
    rgetrec function ....... 4.1
    rgets function ......... 5.1.2.1
    rgetui macro ........... 5.1.4.1.3
    rgetul macro ........... 5.1.4.1.4
    risvalid function ...... 2.1
    rnames macro ........... 2.1
    ropen function ......... 3.1
    rrecs macro ............ 2.1, 4.2
    rrecno macro ........... 2.1, 4.3
    rsetbegcolno function .. 3.6
    rsetcxtno function ..... 3.5
    rseterr function ....... 5.3.3
    rseterrfn function ..... 2.2
    rsetfldch function ..... 3.3
    rsetfldsiz function .... 3.4
    rsetfldstr function .... 2.1
    rsetrecsiz function .... 3.4
    rsettxtch function ..... 3.3
    rskipfld macro ......... 5.1.5.1
    rskipnfld function ..... 5.1.5.2
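

Supplementary note on sections 2.0 and 2.1: the recio functions are
said to rely on the ANSI conversion functions and on their errno
reporting. The sketch below is not recio source code. It is a minimal
illustration, using only standard C, of how the text of one field
could be sorted into the three data-error categories that section 2.1
describes (missing, invalid, out of range). The names field_status,
FLD_OK, FLD_MISSING, FLD_INVALID, FLD_RANGE, and classify_long_field
are hypothetical and do not appear in the library.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* hypothetical status codes, loosely mirroring
   R_EMISDAT, R_EINVDAT, and R_ERANGE */
enum field_status { FLD_OK, FLD_MISSING, FLD_INVALID, FLD_RANGE };

/* classify the text of one field as a long in the given base;
   on FLD_OK the converted value is stored through out */
enum field_status classify_long_field(const char *fld, int base, long *out)
{
    char *end;
    long value;

    /* skip leading whitespace; an empty field is missing data */
    while (isspace((unsigned char)*fld)) fld++;
    if (*fld == '\0') return FLD_MISSING;

    /* strtol() reports overflow only through errno, so clear it first */
    errno = 0;
    value = strtol(fld, &end, base);

    /* nothing was converted: invalid data */
    if (end == fld) return FLD_INVALID;

    /* skip trailing whitespace; any other leftover character
       is an unrecognized character, hence invalid data */
    while (isspace((unsigned char)*end)) end++;
    if (*end != '\0') return FLD_INVALID;

    /* value does not fit in a long: data out of range */
    if (errno == ERANGE) return FLD_RANGE;

    *out = value;
    return FLD_OK;
}
```

A wrapper in the style of section 5.3.3 could, under these assumptions,
translate FLD_RANGE into a call such as rseterr(rp, R_ERANGE) so that
the registered callback error function is invoked.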