GENEALOGY DATA INDEXER
Version 2.0
(c) 1991 Andrew J. Morris

**********************INTRODUCTION***********************

Genealogy Data Indexer (or GDI for short) is a special indexing program, designed for indexing large collections of genealogical information. It has been designed to allow rapid data entry, in a flexible and functional format. Information may be sorted and output in a variety of useful forms. Despite its power and versatility, GDI is surprisingly simple to use. Experienced computer users will probably have little or no need to consult this manual, but they are urged to read this introduction, and the section on "Suggested Uses" for further information on applications.

GDI may be used to create a simple index to a book or periodical, but it can do far more than that. GDI allows you to include the date and location for an event, besides the usual name and reference citation. It is ideal for indexing original records, such as census, parish registers, land records, etc. It may also be used to index a variety of sources, such as all the local history books for a specific locality.

{NOTE: if you typed MANUAL to read this on your computer, just press the space bar after reading each screen to see the next one, or press the CONTROL key and the letter "c" key at the same time, to end.}

To keep GDI as flexible as possible, so that it will run on the largest variety of machines with as few limitations as possible, GDI uses DOS functions and capabilities for many of its operations. Graphics have been kept simple, so that no special EGA adaptors are required, for example, even though most computers nowadays have EGA adaptors. This program should work fine even on the older IBM PC's and compatibles. No attempt has been made to accommodate the wild variety of printers available. Instead, GDI creates "Print Files" (in ASCII) that you can print with almost any word processor, or the DOS print capabilities.

This manual is divided into six chapters:

   1. Introduction
   2. Entering Information
   3. Editing Information
   4. Sorting Information
   5. Printing Information
   6. Suggested Uses

GDI is a shareware program - it is protected by copyright, but users are authorized to distribute complete, unaltered copies. Users are encouraged to register, not only because it is the ethically correct thing to do, but because a wide variety of benefits accrue to registered users. We anticipate that as we get feedback from users, other programs will be developed to make GDI data files even more useful, and improvements will no doubt be made to the GDI program itself. Registered users will have access to these improvements, and other services outlined in the chapter on "Suggested Uses."

FEES:

   REGISTRATION FEE: $20 (One time, good for life.)
   CURRENT DISK:     $10 (Available to Registered users only.)
   PRINTED MANUAL:   $5  (Identical to MANUAL.TXT file on current disk.)

Please let us know the version # of GDI that you have when you register, so we can let you know about any updates.

****GETTING STARTED****

Copying the Disk and Running the Program:

It is best to make a working disk by copying the original, then place the original in a safe place. If you have printed out these instructions, then only the program called "GDI.EXE" need be copied onto the working disk. If you have a hard disk, we recommend placing GDI.EXE in a sub-directory of its own.

To copy the program from one floppy disk to another, put the original disk in drive A: and a blank, formatted disk in drive B:, then type at the A: prompt:

   A:> COPY GDI.EXE B:

If you have a hard drive, it will probably be drive C:.
You can create a sub-directory called GDI by typing at the C: prompt (make sure you are in the root directory):

   C:> MKDIR GDI

Then enter the newly created sub-directory by typing:

   C:> CD \GDI

Put the original program disk in drive A: and copy to your hard disk by typing:

   C:> COPY A:GDI.EXE

Any of these commands may be modified to match the drives on your computer, by substituting the appropriate drive or path name.

To run the GDI program, make sure you have the program in the same drive or sub-directory you are at, and type:

   GDI

If you are copying the program to pass it on to others, don't forget to copy the other files on the disk: MANUAL.TXT and MANUAL.BAT

*******************ENTERING INFORMATION*******************

When you type GDI, the program begins to run by displaying an introductory screen. After reading that screen, simply press any key to continue. The next screen you see will be the MAIN MENU. This is your basic selection of available options.

Whenever you are presented with a menu in GDI you have two ways of making your selection. One is to type the number corresponding to your choice. The choice you select will be highlighted. Then press the ENTER key to select that choice. The second way to make a selection is to use the up and down pointing arrows (cursor controls). Each time you hit an arrow key the selector highlights the choice indicated. When the correct choice is highlighted, press the ENTER key to select it.

There are six choices available from the main menu:

   1. Enter Information
   2. Edit/Display Information
   3. Sort Information/ Create an Index
   4. Create Print File
   5. Set Function Keys
   6. End Program

Main menu choices #1 (Enter Information) and #5 (Set Function Keys) are explained in this chapter. Choice #2 (Edit/Display Information) will be explained in the chapter EDITING INFORMATION. Choice #3 (Sort Information/ Create an Index) is explained in the chapter SORTING INFORMATION.
Choice #4 (Create Print File) is explained in the chapter PRINTING INFORMATION. Choice #6 (End Program) is self explanatory: you choose this option when you are done using the program, to return to DOS.

***FUNCTION KEYS***

Function keys are the set of 10 keys on your keyboard labeled F1 through F10. In GDI, they are set to help save you typing time. A function key can take the place of any string of characters, up to a length of 15 letters, numbers or spaces. When you are entering information in GDI, the first 6 characters assigned to each function key are displayed along the bottom of your screen.

GDI assigns values to keys F7 through F10 automatically. If you choose option #5 (Set Function Keys) from the Main Menu, you may assign the other 6 function keys values of your choice. Thus if you are entering information on the Messerschmidt family, you might assign the first key the value "Messerschmidt" - then when you are entering information, each time you want to type that name, you only have to hit the function key instead, and it is typed out automatically!

When you press a function key, the characters it types for you are displayed on the screen. If they are exactly as you want them to appear, you may press the ENTER key to continue. However, you may also modify what the function key has typed for you. Thus if you have one key set to "Messerschmidt" but occasionally need to type "Messerschmitt" there is no need to type the whole thing. Press the function key to get "Messerschmidt" then press the backspace key twice and the last two letters "dt" will disappear. Type the letters you want there instead - "tt" - and you have your 13 character name in five keystrokes instead of 13.

Function keys may also be used in combination, so you can type a common first name and the middle name with just two keystrokes if your function keys are set to those names. (You probably would have to add a space between them - so 3 keystrokes - still a major time saver.)
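For programmers: the keystroke arithmetic above can be checked with a short sketch in Python. This is only an illustration, not GDI's own code. The names assign_key, key_label and type_keystrokes are made up for the example, and cutting an over-long value down to 15 characters is an assumption (the manual states only the limit, not what happens beyond it).

```python
MAX_KEY_LENGTH = 15  # longest string a function key may hold, per the manual
LABEL_LENGTH = 6     # characters of each key shown along the bottom of the screen


def assign_key(keys, name, text):
    """Store text under a function key (assumption: extra characters are cut off)."""
    keys[name] = text[:MAX_KEY_LENGTH]


def key_label(keys, name):
    """Return the part of a key's value that fits in the on-screen label."""
    return keys.get(name, "")[:LABEL_LENGTH]


def type_keystrokes(keys, strokes):
    """Replay keystrokes and return the text that ends up in the field.

    Each stroke is a function key name such as "F1", the word "BACKSPACE",
    or a single ordinary character.
    """
    field = ""
    for stroke in strokes:
        if stroke in keys:
            field += keys[stroke]   # the function key types its whole string
        elif stroke == "BACKSPACE":
            field = field[:-1]      # remove the last character typed
        else:
            field += stroke
    return field


keys = {}
assign_key(keys, "F1", "Messerschmidt")
strokes = ["F1", "BACKSPACE", "BACKSPACE", "t", "t"]
result = type_keystrokes(keys, strokes)
print(result, "in", len(strokes), "keystrokes")
```

The five strokes are the function key, two backspaces, and the two letters "tt", matching the count given in the text.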
***ENTERING THE INFORMATION***

When you select choice #1 (Enter Information) from the Main Menu, the program presents you with another menu with only three choices:

   1. ADD to Existing Information File
   2. CREATE a New Information File
   3. Return to Main Menu

Choice #3 (Return to Main Menu) is self explanatory. The first time you enter information you will choose #2 (CREATE a New Information File), but it is unlikely that you will enter all the information into that file in one session - unless you are making a very small index. So most of the time you will be choosing #1 (ADD to Existing Information File) to expand on a file that you began in an earlier session.

Whether you choose #1 or #2 you will need to specify a file name as the next step. GDI data files always end with ".GDI" - so the program can display the existing data files for you, before you type the file name you want to use. If you are creating a new file (choice #2) then you will not want to duplicate the name of an existing file. If you are adding to an existing file (choice #1) then it should appear in the list the program shows you.

***CREATING A NEW FILE***

You can create a file with full genealogical information, or a simple index. No matter what you plan to put into the file, the initial procedure will be the same.

The first time you put information into a file, you will be given the opportunity to describe the source of that information. You may type four lines of up to 80 characters each, to describe the source of your information. You should provide enough detail so that anyone using your index will understand where the original information came from. If you are indexing a book, then a full bibliographic citation is appropriate. If you are indexing an archival source, original records, or any other unique source, be sure to give enough information so that the user of your index can find the actual records you have indexed.
You will be asked to provide a name for the file that will hold the information you enter. The name you provide should have no more than six characters. DOS allows up to eight characters, plus a period and three more characters - the period and last three characters will always be ".GDI" for information files created by this program. You can only use six of the eight available character spaces because GDI adds "-S" to file names to indicate that a file has been sorted.

***SELECTING FIELDS***

You will be asked to indicate which fields you wish to include in your information file. The six fields available are:

   Surname - Firstname(s) - Event - Date - Location - Reference

These are the categories of information a genealogist is likely to want included in an information file, but in fact they are just titles. Except for Date and Reference, all fields are treated just as a string of characters, so you could include any information you want in them. There are no restrictions on what information you type into a field, so you may use fields for information other than what the title suggests.

The Date and Reference fields are treated specially only in certain situations. If the Reference citation is less than 5 characters long, blank spaces will be added to "pad out" the string to 5 characters - see the "SORTING INFORMATION" chapter for the reasoning behind this. The Date field will accept any sort of data, but if you try to sort a file by date, this is where the program looks for date information, so the sort might not work correctly if you have something else in that field. If you want to use the fields for something other than what they were designed for, just try out a small sample to see how it works.

Here is how we envisaged these fields would be used:

Surname - the last name only. Since names are sorted by this field first, names should appear as you want them to be in the index.
Firstname(s) - for the first name or initials, and any middle names or initials, as well as any titles. If the title normally appears before the name (e.g. Dr., Sir, Mrs.) I like to add it in parentheses after the name, so the entry gets sorted by name, not title.

Event - this is usually just a one or two letter code (e.g. b., m., d. for birth, marriage and death).

Date - the format for dates is day-month-year, with day and year in numbers, and month written out, or abbreviated with just the first three letters, such as 4 Jul 1776. If just the year is given, it may include a modifier, such as "pre" (before), "aft" (after) or "ca" (around).

Location - may be as specific as needed, but should be kept short when possible, because long location names can use up your disk space rapidly. For American states, for example, it is best to use the two letter postal code.

Reference - is usually a page number, but may refer to a folio, volume or any other criteria you want. You may use a code if appropriate, such as roman numerals to represent each of several books. See the chapter on "SORTING INFORMATION" to see how your choice of reference citations may affect the sorting sequence, and thereby the appearance of a subsequent print file.

When initializing a new information file, GDI will ask which of these fields you wish to include. Each field will be listed, and you respond by typing the letter "Y" or "N" (capital or small letters - it doesn't matter) for YES, include that field, or NO, do not include it. See the chapter on "SUGGESTED USES" for clues as to why you might want to include more fields than are immediately necessary for an index.

***ADDING TO AN EXISTING FILE***

When you add to an existing file, you need to name which file you are adding to. The program will list all the files available that end with ".GDI" - indicating a GDI information file.
You can type in the name you used originally; you do not need to type the ".GDI" ending, though if you do so it will do no harm. You will not need to specify which fields are included. The program will determine which were used last time and use the same ones again.

***TYPING INFORMATION***

There are two formats for entering information, which will be presented to you in the usual menu format. The first option is recommended for first time users. If you find on using it that this method seems a bit slow, and that you have to wait for the prompts to appear before you can type more information, you may want to try the second method next time. Most modern computers will be faster than your typing, but some of the older PC's and compatibles cannot process information so fast. The second method of data entry allows you to enter information rapidly even on these older machines.

When entering information there are several shortcuts available. The use of function keys to enter information has been described above. Another useful feature of GDI is that it automatically repeats information for a field if you don't type any new information in, but just press the ENTER key. Thus if you are typing the names of a family, you don't have to keep typing the surname over and over. You type it the first time, then just press ENTER for the surname field and type in the new first names. Likewise, if you are indexing a book, you will not have to keep retyping the page numbers; type the number just once, when you enter the first name for that page.

If the program automatically repeats the previous entry when you don't enter anything, then how can you enter a blank, or unknown? Well, GDI doesn't like blank entries, but you can type a dash, which is generally recognized as meaning that particular information is unknown or unavailable.

There are two special codes you will use. Type the "@" symbol to edit a preceding entry. Enter an exclamation mark "!" to end a session.
The "@" symbol will allow you to edit any of the preceding 5 entries - one entry being the contents of all six fields, or as many of those as you have chosen to use. To edit earlier entries, see the chapter on "EDITING INFORMATION." Entering "!" will return you to the main menu.

***************EDITING INFORMATION**********************

When you select the "Edit/Display Information" option from the main menu, the first screen of the Edit section will display a message to this effect:

   "For reasons explained in your manual, there is no easy way to edit just one entry in your information file. You will usually find it easier to edit using EDLIN (see manual). Here you have to go through all entries in the file, and choose which ones you wish to edit. Keep these factors in mind when editing: MAKE SURE YOU HAVE ENOUGH DISK SPACE - YOU NEED AT LEAST AS MUCH FREE SPACE AS THE SIZE OF THE FILE TO BE EDITED."

The information files created by GDI are of a type known as "sequential" files in computer jargon. The main advantage to using this type of file is that there is no wasted disk space: the information takes up just as much room as it needs and no more, and there are no pre-set size limits for each field of information. The disadvantage is that information must be retrieved in the order it was put into the file. If the file is sorted using the "Sort Information/Create an Index" option in the main menu, then the order of each entry is changed, but you still have to read each entry in order as it now appears.

Database programs use files called "random access" files, so that they can get any entry in the file just by specifying its position within the file. For random access files, however, each entry takes up the same amount of space, so every entry in the database takes up as much space as the longest entry. There are tricks to get around these limitations, but they work efficiently only if you know in advance the exact contents of the database.
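For programmers: the trade-off between the two kinds of file can be sketched in a few lines of Python. This is an illustration of the general idea only, not of how GDI or any particular database program stores its data. The sample entries and the function names read_sequential and read_fixed are made up for the example.

```python
# Three sample entries of very different lengths (hypothetical data).
entries = ["Smith, Michael, 13", "Messerschmidt, Anna Maria, 231", "Lee, Jo, 7"]

# Sequential layout: one entry per line, each exactly as long as it needs to be.
sequential = "".join(entry + "\n" for entry in entries).encode("ascii")

# Fixed-width layout: every entry is padded to the length of the longest one.
width = max(len(entry) for entry in entries)
fixed = "".join(entry.ljust(width) for entry in entries).encode("ascii")


def read_sequential(data, index):
    """Reach entry number `index` by reading every earlier line first."""
    for position, line in enumerate(data.decode("ascii").splitlines()):
        if position == index:
            return line
    raise IndexError(index)


def read_fixed(data, index, width):
    """Jump straight to entry number `index` by computing where it starts."""
    start = index * width
    if index < 0 or start >= len(data):
        raise IndexError(index)
    return data[start:start + width].decode("ascii").rstrip()


print("sequential size:", len(sequential), "bytes")
print("fixed-width size:", len(fixed), "bytes")
print(read_sequential(sequential, 2))
print(read_fixed(fixed, 2, width))
```

Both functions return the same entry, but the fixed-width copy is larger because the short entries carry padding, while the sequential copy can only be searched from the top.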
Since GDI is intended for entering and indexing large collections of information, random access files are not the best option, but without them, editing can be time consuming. (We are working on a program that will take a collection of GDI files and create a random access database from them for an efficient and speedy database - registered users will be kept informed of progress.)

So to edit a GDI file using GDI you will have to wait for the program to present each entry in the file, in order, 13 entries at a time. These will appear in a menu format, with the additional options to CONTINUE with the next set of records, or END the edit session. When the entry you want to modify appears in the menu, you simply select that entry. Then the fields are presented to you one at a time, and you can press ENTER to leave them unchanged, or type in a new value. You may delete the entry entirely by entering "$" for the first field. When you are done with that entry, the earlier menu re-appears, showing the entry with any changes, and the usual options to continue on or end. Try editing a short file to see how it works.

This method of editing is not too cumbersome if you have only a few hundred entries in the file. If you have tens of thousands of entries, however, this method is just too slow.

There is another way to edit your information. GDI files may be accessed by certain word processors, including the DOS text editor EDLIN. The instructions that came with your computer should include an explanation of how EDLIN works. There are characters embedded in GDI files by the program that may appear odd when viewed by a word processor or EDLIN, but as long as these characters are not erased, GDI should be able to continue to use the file. Some word processors, however, will add characters to the file, making it unusable by GDI. The results can be unpredictable, so always be sure to back up your file before editing.
GDI saves each entry as a single line of information, with the fields separated by a character that does not appear on your keyboard. (For programmers: ASCII character 15). When you edit a GDI file using EDLIN, this character appears as two characters - an up pointing caret character followed by a capital O (^O). Be careful not to delete these characters when editing.

One word processor that I have seen handle GDI files OK is the widely available shareware program called "PC-WRITE" by Quicksoft. The version I saw was limited to files of 60K however, and your GDI file may get much larger than that. Newer versions of PC-WRITE may handle larger files. When GDI files are edited through PC-WRITE, that program displays the dividing characters as a star shaped symbol (*), and each time it is encountered it reverses the appearance of text on the screen, from white on black to black on white and vice versa.

Another editing option you have is first to create a print file, and do your editing there, using your word processor. Since print files do not have the special characters information files have, and since they will not be further processed by GDI anyhow, it does not matter if your particular word processor adds other control characters. Remember though that these changes were not made in the original GDI file, so if you print it again later, or use the GDI file as part of a database later, the changes will not have been made.

Please let us know which word processors (include version #'s) work, or do not work with GDI files, so we can let other users know.

*********************SORTING INFORMATION************************

When you choose the "Sort Information/ Create an Index" option from the main menu, you are presented with a menu that lets you choose between sorting by name, or by date.
If you sort by name, the information will be sorted in this sequence:

   Surname - Firstname(s) - Date - Location - Reference - Event

Thus if you have more than one listing for the same Surname and Firstname(s), the next field is Date, and the entries will be in order by that field. It will not necessarily be chronological however - here the date is treated as a string of characters, so "2 June" will come before "2 March" since J comes before M. If you enter only the year, then the dates will be in the correct order.

If you sort by date, then GDI breaks down the date entry, and puts the entire file in the correct chronological order. Dates that are entered as multiple years, such as 1654/5 - or years that have modifiers instead of day and month (such as ca, pre, aft) - may not sort into the correct location. Let us know where you have problems and we will do our best to correct that for the next version.

After you select the order for your sort, the program will have you choose the file to sort, just as in earlier sections. Then you will have the opportunity to choose all of the fields in the file, or any part of them. This is done in a fashion similar to that used to select fields when you first created the information file, except only those fields available in the file you selected will be presented. You may choose all available fields by simply pressing ENTER for the first one.

When creating a sorted file, keep in mind the sequence for alphabetizing. Suppose you entered all six fields, for example, in order to create a database. You plan to publish an index with just the names and references from your information file (you will only be printing the fields Surname, Firstname(s) and Reference). If you sort using all six fields, then duplicate names will not be in order by reference number, but by date or other preceding criteria. In this case, you would want to create a sort file using just those three fields you intend to print, so that they will be in the correct order.
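For programmers: the difference between sorting dates as strings of characters and sorting them chronologically can be seen in a short Python sketch. This is not GDI's own routine. The function chronological_key is a made-up illustration of one way to "break down" a date written in the day-month-year format described earlier, and the sample dates are invented.

```python
MONTH_NAMES = ["jan", "feb", "mar", "apr", "may", "jun",
               "jul", "aug", "sep", "oct", "nov", "dec"]
MONTHS = {name: number for number, name in enumerate(MONTH_NAMES, start=1)}


def chronological_key(date_text):
    """Turn a date such as '4 Jul 1776' or '1850' into (year, month, day).

    Assumptions for this sketch: a number of one or two digits is the day,
    a longer number is the year, and a word is a month if its first three
    letters match a month name. Anything missing or unrecognized counts as 0.
    """
    day = month = year = 0
    for part in date_text.split():
        if part.isdigit():
            if len(part) > 2:
                year = int(part)
            else:
                day = int(part)
        elif part[:3].lower() in MONTHS:
            month = MONTHS[part[:3].lower()]
    return (year, month, day)


dates = ["2 March 1850", "2 June 1850", "15 Jan 1851", "1849"]

as_text = sorted(dates)                         # character-by-character order
by_date = sorted(dates, key=chronological_key)  # chronological order

print(as_text)
print(by_date)
```

In the character-by-character result "2 June 1850" lands ahead of "2 March 1850", as described above. A date such as 1654/5 is not recognized by this sketch at all, which gives some idea of why such entries are troublesome to place.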
GDI uses the DOS SORT function to sort your information. This means that all entries are sorted alphabetically, without regard to case. That is to say, a capital letter is treated the same as if it were small. In this computer sorting, numbers come before letters. Here is a typical sorted sequence:

   DeLaguna, Charles
   deLaguna, Mary
   Delaguna, Peter
   Delmont, 123
   Delmont, Anna

Of course one doesn't usually see numbers for the first name, but when names match and the date becomes significant, this numerical priority may explain the results you observe.

The time it takes your computer to sort a file depends on how large the file is, and how fast your computer is. Sorting takes a great deal of disk space, and involves a lot of file manipulation - be sure to make backup copies of your file before sorting. The program does everything necessary automatically, and gives you messages as it goes to indicate that it is working. Allow plenty of time to sort large files.

Another quirk of computer sorting is that spaces come before any character in the sequence. Thus if your reference page numbers were simply left as you enter them, they would sort something like this:

   1
   12
   17
   18
   2
   23
   231
   25

To avoid this problem, spaces are added to the LEFT of any reference you type that is fewer than five characters in length. This will allow you to use numbers up to 99999. If you are adding a code before the page number, such as when indexing a series of books, remember to add the necessary spaces yourself as you enter the information. If you are indexing books A, B and C, and each has fewer than 1000 pages, make sure each reference totals four characters (A 13, A123, B  3, etc.).

***PROGRAMMERS NOTE***

Programmers may be interested in this explanation of how GDI sorts large files - if it doesn't make sense to you, don't worry about it - the computer is taking care of all this:

First, a sort file must be created.
Even if you are sorting all the fields in the file, we must first strip away the lines at the beginning of each file that describe the source of information.

Second, DOS sorts the file we just created for that purpose. Since DOS will only sort about 60K however, large files must be sorted one part at a time.

Next, if there are more sections to be sorted (i.e. it is a large file) then new sort files are created and sorted. These sorted files must then be combined into larger files - this is easier than sorting however, since we know they are each in sorted order, and they only need to be blended together.

Finally, the initial source information must be recombined with the final information in sorted order.

*******************PRINTING INFORMATION********************

When you choose "Create Print File" from the Main Menu, a menu appears with four choices:

   1. Create Regular Print File
   2. Create Index - Print File
   3. Change Print File Default values
   4. Return to Main Menu

Option #4 is self explanatory: it takes you back to the Main Menu. If you choose #1 or #2, you begin by typing in the name of the file to be printed, in a process that should be familiar by now - first the program presents available GDI files to choose from, and you can type in a name (with or without the ".GDI" ending) or just press ENTER without typing anything if you want to return to the Main Menu.

The difference between a "Regular Print File" and an "Index - Print File" is simply that the Index file combines references when all the other details in an entry match. The program assumes that the references are in the correct order, and just adds them to the printed line with spaces and commas between the references. Some examples should clarify this. Suppose you are printing the fields Surname, Firstname(s), Location and Reference for a sorted file.
Here is how a few entries might appear in a "Regular Print File":

   Smith, Michael (NY) 13
   Smith, Michael (NY) 18
   Smith, Nathan (NY) 11
   Smith, Nathan (NYC) 10
   Smith, Nathan (NYC) 14

The same information in an "Index - Print File" will appear:

   Smith, Michael (NY) 13, 18
   Smith, Nathan (NY) 11
   Smith, Nathan (NYC) 10, 14

The print file GDI creates is stored on your disk in ASCII format. Almost any word processor has the ability to use these files, under the name of ASCII or DOS. You can then use the word processor's capabilities to edit, format, or print the file.

If you choose #3 from the print file menu (Change Print File Default values) you will have the opportunity to change any of eight values that control the layout of your print file. These are the default values:

   Total Lines Per Page = 55
      (this is the number of lines available for printed information)
   Blank Header Lines = 3
      (lines left blank at the top of each page)
   Blank Lines Between Text = 0
      (allows you to double or triple space between each line printed)
   Beginning Page # = 1
      (page #'s printed at the bottom of each page, if you set this to 0 the program will not print any page numbers)
   Footer Lines = 4
      (lines left blank at the bottom of the page, except that the page #, when used, will be printed toward the center of these)
   Left Margin Spaces = 5
      (blank spaces on the left of each line)
   Characters per line = 65
      (this is the # of characters that will be printed on one line)
   Space to indent wrapped lines = 5
      (spaces on the left, in addition to the Left Margin Spaces, that are printed only when one entry continues onto a second line)

*******************SUGGESTED USES*************************

Our first suggestion is that you try out the format and content you plan to use for a data file on a small scale first, to make sure the results are as you expected. Sort and print the file to see how it looks before you spend a lot of time entering information.

Second, always, always, always make backup copies of your work.
If you do not know how to copy a file to make a backup, check the information that came with your computer, or ask someone familiar with computers to demonstrate the procedure for you. It is actually quite simple.

***APPLICATIONS***

There are thousands of ways this program can be put to good use. Here are just a few ideas.

Index the local history sources for your county. Find all the local history books at your local library, and create an index to them all. You can assign each book a letter code (or two letters if there are more than 26) that precedes the page number in the reference field.

Index all the references you can find for a particular surname in a particular area. The size of the area can be anything from the whole country to a certain town, depending on how common or rare the surname is.

Index primary records, such as census, church records, land records, etc.

Enter the primary events (b, m & d) for everyone mentioned in the research papers you have accumulated for your family. Many people who collected information for years before they computerized never get around to entering all the details they have into a genealogy computer program. You can at least keep track of it all - just number each page of information you have and enter the data in a GDI file.

When you create an index to one small record, you may want to include information for all the fields in GDI, even though the information may seem repetitive. When you have indexed a group of such records, the information may be combined into a database. For example, suppose you index names in the 1850 census and include the event "l" for living, and the year "1850" in every entry. These fields are not included in your sort or print files since they're so repetitive, but when you (or someone else) index the 1860, 1870 and 1880 censuses, these various files are ready to be combined into a database for the county.
Add the marriage records and/or birth records, and you soon have a very comprehensive database of county information.

***SPECIAL OFFER***

How would you like to have a laser typeset printout of your index or information file? Can you imagine getting a camera-ready master copy of your publication for just the cost of a photocopy? Well that is our special offer to registered users of GDI. If you plan to publish the information from one or more GDI files, we will print for you a professional quality laser-typeset master copy. Your cost is just ten cents per page of print out, plus $2 postage. You do not give up any of your copyrights, and we will not be selling copies of your data without your permission. We do reserve the right to keep one copy of the data, for our own use in ways that do not infringe on your copyright.

***DATABASES***

We expect to have the capacity to produce large databases from GDI files within the year. Actually, we produced an early version for the earlier version of this program (which was not commercially released) and tested it on a relatively small database with 60,000 entries. The program to create databases will not be sold, but the databases are themselves programs that will run on their own. Users will be able to have proprietary databases made, for their own use or sale.

***DATA EXCHANGE***

We also propose to run an exchange to facilitate the distribution of GDI files. These may be in the form of individual files, or groups of files condensed into databases as described above. Anyone who contributes files that are included in larger databases under this exchange system will receive free copies of the complete database. Contributors will also have the opportunity to receive other files in exchange for those they submit. Since information files may be of widely different sizes, exchanges will be based on approximate parity - either in the number of entries, or the total size of files. Details are yet to be worked out, and your input is encouraged.
***FEEDBACK***

Please let us know what you think of GDI. Suggestions for improvements are welcome, along with any other comments, complaints or compliments. We have a few ideas for other programs for genealogists that get away from the fill-in-the-family-group-sheet mold, and your support will ensure further developments get done.

***SUPPORT***

We cannot provide telephone support for this product, but will be happy to answer any letters that are accompanied by a SASE. We suggest that novice users consult someone with a bit of computer experience before writing for help; they can probably show you what you need to know in a matter of minutes. If you do need help, please describe exactly what the problem is - if possible include sample files or print-outs.

   Andrew J. Morris
   P.O. Box 535
   Farmington, Michigan 48332