Title:       RECIO TIPS
Copyright:   (C) 1994-1996, William Pierpoint
Version:     2.14
Date:        June 14, 1996


1.0 COMMON PROGRAMMING ERRORS

1.1 Inadvertently typing a stdio function instead of a recio function

Many functions in recio and stdio have similar names and similar uses.
This makes it easier to learn the recio functions, but it also makes it
easier to unintentionally type in a stdio function when you really want
a recio function.

When compiling it is best to have all warnings turned on. A mistyped
function will give a "suspicious pointer conversion" warning; a mistyped
macro will give an "undefined symbol" error.

Exceptions: no error or warning if you use the fcloseall function
instead of the rcloseall function, or strerror instead of rstrerror.

Hint: in most cases you can use the rerrstr function instead of
rstrerror.

1.2 Inadvertently typing the wrong symbolic name for an error constant

Symbolic names for error constants are similar between recio and those
supported by the compiler. You may have typed ENOMEM when you meant
R_ENOMEM. Check to make sure that for valid record pointers, symbolic
error constants start with "R_"; for invalid record pointers, symbolic
error constants start with "E".


2.0 ERROR HANDLING

2.1 Callback Error Function

The first use of any recio function should be rseterrfn() to register
your callback error function for your application.

2.2 Explicit Conditions Not Reported to Callback Error Function

There are two conditions that your code must explicitly handle as these
are not reported as errors to the callback error function:

1) Test ropen() for NULL return and, if true, test errno for ENOENT.
   This indicates that the file could not be opened since it does not
   exist. Any other errors are reported to the callback error function
   (if registered); your code can handle them there.

    REC *rp = ropen("file", "r");
    if (!rp) {
        if (errno == ENOENT) {
            /* file does not exist */
            ...
        }
    }

2) Test the return value from rgetrec().
   If it is a NULL return, then either end-of-file was reached or an
   error occurred. You need to follow up on the NULL return value to
   determine which one happened. You can use either the reof or the
   rerror function. Any errors would have been reported to the callback
   error function; be careful that you don't report the same error
   twice.

    /* loop through all records in file */
    while (rgetrec(rp)) {
        ...
    }
    if (!reof(rp)) {
        /* error occurred before all records read */
        ...
    }

2.3 Make an Error Check Just Prior to the rclose Function

Check for errors just before closing any record stream. This is a good
safety check since (1) you might have forgotten to install your callback
error function or (2) your callback error function failed to catch and
correct all the errors.

    if (rerror(rp)) {
        /* file not completely read in */
        ...
    }
    rclose(rp);

If you use recin in your program, check for errors after your last use
of recin or just before you exit your program.

    if (rerror(recin)) {
        /* error occurred on recin stream */
        ...
        exit(EXIT_FAILURE);
    }
    exit(EXIT_SUCCESS);

2.4 The rsetfldstr Function Clears Error and End-of-File Indicators

The rsetfldstr function has a side effect in that it internally calls
the rclearerr function, which clears the error and end-of-file
indicators. The rationale for this is as follows:

1. The rsetfldstr function is used in the callback error function to
   correct data errors. Even if used elsewhere, its purpose is to
   force-feed a data value to the program.

2. When the callback error function returns, the recio library
   functions will only read the replacement value if the error and
   end-of-file indicators are clear.

2.5 The rsetrecstr Function Clears Error and End-of-File Indicators

The rationale is similar to that for the rsetfldstr function, section
2.4 above.


3.0 FIELDS

3.1 Field and Text Delimiters

Field and text delimiter characters must be ASCII and are used only with
character delimited fields.
If the null character '\0' is used as a delimiter on an output stream,
that delimiter is not written to the stream. This can sometimes be
useful if you just want to write some data to the screen.

If a delimiter is set to the space character, it is taken to mean any
white space: space, tab, etc. See the documentation that comes with your
compiler for the isspace() function.

Delimiters around text are optional; no error is generated if they are
missing. However, text delimiters are needed if the string contains a
field separator character. Also, no harm is done if text delimiters are
put around non-text fields.

3.2 String Fields

Empty strings are legal; no error is generated if there is nothing in a
string field. All other types of fields must have something in them or a
missing data error is generated.

If you do this:

    /* usually bad */
    char *strptr = rgets(rp);

strptr points to the string buffer, which changes every time a new field
is read. Instead, copy the data into your own string. But the method
below could truncate your data if the field buffer has expanded.

    char str[FLDBUFSIZ+1];

    /* could lose data if field buffer has expanded */
    strncpy(str, rgets(rp), FLDBUFSIZ);
    str[FLDBUFSIZ] = '\0';

To avoid truncation, you will need to dynamically allocate memory space
for your strings. The macros scpys and scats allow you to dynamically
copy and concatenate strings. To use the scpys and scats macros, you
will need to (1) set all string pointers to NULL when declaring them,
and (2) free your strings when finished with them.

    char *str=NULL;
    ...
    scpys(str, rgets(rp));
    ...
    free(str);


4.0 FINE TUNING

4.1 Better Use of Heap Space

If you are tight on memory, you can fine tune recio for your application
by doing the following:

1. Set ROPEN_MAX to the minimum number of files you need open
   simultaneously. Note that recin is always open and must be included
   in the count.

2. Use rsetfldsiz() and rsetrecsiz() functions to set the maximum size
   record and field needed for that record stream.
   Use these functions before the first field or record is read from
   the file.

To determine the maximum size record buffer, determine the number of
characters in the longest line of the file to be read, including the
newline.

To determine the maximum size field buffer, determine the number of
characters in the longest field in the record. If the longest field
contains text delimiters, a trailing field delimiter, or white space
between the trailing text delimiter and the trailing field delimiter,
include these as part of the size.


5.0 IDEAS FOR EXPANDED CAPABILITIES

5.1 Additional Types of Input Functions

The macros rget_fn, rcget_fn, etc. are used to define functions that get
numerical input. By developing the appropriate conversion functions, one
could expand recio to get other types of data.

5.2 Wrapper Functions and Macros

If you define wrapper functions or macros that supply a default value
when the record pointer is NULL, then you can combine reading the file
or reading a set of default values with the same section of code.

    REC *rp = ropen("file", "r");
    if (rp || errno == ENOENT) {
        /* read data using wrapper functions with default value */
        ...
        if (rp) rclose(rp);
    }

5.2.1 Default Value

If your application cannot find a data file, you may want to use a set
of built-in default values. This could be a good strategy if your
application uses configuration files.

Note that wrappers will be needed on almost every recio function that
occurs after the ropen function, including rclose, rsetrecsiz,
rsetfldsiz, rsetfldch, and rsettxtch, to prevent reporting a null record
pointer to your callback error function (or you will first have to test
for a null record pointer, such as "if (rp) rclose(rp)").

Example Function

The rdgeti function gets an integer from an opened data file or gets the
default value if the data file has not been opened (NULL pointer). The
parameter is named dflt because "default" is a reserved word in C.

    /* if file open, read value from file; else use default value */
    int rdgeti(REC *rp, int dflt)
    {
        return (rp ? rgeti(rp) : dflt);
    }

Example Macro

You can easily rewrite the rdgeti function as a macro.

    #define rdgeti(rp, dflt) ((rp) ? rgeti(rp) : (dflt))

5.2.2 Validated Range

In order to validate data, you need to make certain that the value read
from the file is within established limits.

You may want to add functions that post the range values to an internal
data clipboard which your callback error function can access. If you are
letting users correct data on the fly, convert the minimum and maximum
values to strings, and post pointers to the strings.

Example Function

The rrgeti function gets an integer and validates that the integer is
within the established range.

    int rrgeti(REC *rp, int min, int max)
    {
        int result;

        result = rgeti(rp);
        if (result < min || result > max) {
            rseterr(rp, R_ERANGE);
        }
        return result;
    }

5.2.3 Default and Range

You may want to combine default value and range validation into one
function.

Example Function

The rdrgeti function gets an integer from an opened data file or gets
the default value if the data file has not been opened. The function
validates that the integer is within the established range.

    int rdrgeti(REC *rp, int dflt, int min, int max)
    {
        int result;

        if (!rp) {
            result = dflt;
        } else {
            result = rgeti(rp);
        }
        if (result < min || result > max) {
            rseterr(rp, R_ERANGE);
        }
        return result;
    }
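
To try out the default-and-range pattern of section 5.2.3 without
linking against recio, the following is a minimal sketch built on
stand-ins. StubRec, stub_geti, stub_seterr, STUB_ERANGE, and
stub_rdrgeti are all hypothetical names invented for this illustration;
they only mimic the roles of REC, rgeti, rseterr, R_ERANGE, and rdrgeti
and are not part of the library. In particular, how the real rseterr
treats a NULL record pointer is not modeled here; the stand-in simply
ignores it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a record stream: holds one integer "field"
   and an error indicator. This is NOT recio's REC structure. */
typedef struct {
    int value;  /* value the next read will return */
    int err;    /* 0 = no error, otherwise an error code */
} StubRec;

enum { STUB_ERANGE = 1 };  /* stand-in for R_ERANGE (value is arbitrary) */

/* Stand-in for rgeti: return the integer held by the stub stream. */
static int stub_geti(StubRec *rp)
{
    return rp->value;
}

/* Stand-in for rseterr: record the error on the stream, if there is one.
   Assumption: a NULL stream is silently ignored in this sketch. */
static void stub_seterr(StubRec *rp, int code)
{
    if (rp) {
        rp->err = code;
    }
}

/* Same logic as rdrgeti in section 5.2.3: read from the stream if it is
   open, otherwise use the default, then validate against [min, max]. */
int stub_rdrgeti(StubRec *rp, int dflt, int min, int max)
{
    int result;

    assert(min <= max);  /* caller must supply a sensible range */

    result = rp ? stub_geti(rp) : dflt;
    if (result < min || result > max) {
        stub_seterr(rp, STUB_ERANGE);
    }
    return result;
}
```

With these stand-ins, stub_rdrgeti(NULL, 5, 0, 100) returns the default
5, while a stream holding 500 returns 500 and leaves STUB_ERANGE in its
err member, so the caller (or, in real recio, the callback error
function) can decide how to correct the value.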