Database Design and Implementation Basics This document describes why and how to use Alpha Four. It is intended for people who are relatively new to database design. The document will explain uses of Alpha Four, show a step-by-step outline of how to implement the program, and provide a description of the necessary terms and concepts. Alpha Four, The Relational Database for Non-programmers, is a tool used to store, process, and provide information quickly and easily. Data may be imported from other programs, such as spreadsheets, word processors, and other database programs. It can also be typed in with the option of using powerful "field rules" to help speed data entry and prevent mistakes. Once stored, information may easily be searched, sorted, filtered, changed, ordered, and otherwise manipulated. Lastly, data may be provided or "output" as letters, reports, mail labels, and other forms. Before you start to put your data into Alpha Four, a structure (or structures) must first be created called a database. The better you design your databases, the easier it will be to input, process, and output your data. Explanation of Terms and Concepts A database is a structure which holds certain types of information about people or things. An example of a manual database is a checkbook register. Number Date Description T Deposit Withdrawal ______ ________ ___________________ ___ _______ __________ 1001 2/1/93 Country Club 600.00 Apartments 1002 2/7/93 Fry's Food and Drug 52.87 1003 2/9/93 Sonoran Desert Museum X 50.00 001 2/15/93 Pay check deposit 1500.00 (T means tax deductible) In this database, there are four records and six fields. Each row (transaction) is considered a RECORD, and each column --or unit of information-- is considered a FIELD. The fields in this example are: Number, Date, Description, Tax Deductible, Deposit, and Withdrawal. In database terminology, we would say that each field can be classified in two ways. The first classification is by its data type. The second is by whether it is a key field. We'll talk more about key fields later. For now, we will discuss data types A data type refers to the kind of information a field can hold. Alpha Four has five different data types: Field Type...Description Character....Contains up to 254 alphanumeric characters Date.........Contains dates Numeric......Holds numbers Logical......Holds a true or false value Memo.........Holds up to 5,000 characters of text Looking back at our check book database, let's see how each of the fields can be classified. The field Date is a date field; Deposit and Withdrawals are both numeric. Tax deductible is a logical field (It can either be deducted or it can't.). Description is a Character field since it will hold alphanumeric information and will be less than 254 characters long. The transaction number field will also be a character field. You may be wondering why we classified transaction number as a character field. We did this because, even though it holds a number, we will not be performing mathematical operations on that number. Unlike the deposit and withdrawal fields which we may eventually wanted to add and subtract, we will never add or subtract transaction numbers. Technically we could have made Transaction Number a numeric field, but it is good idea to make number fields character fields if you will not be performing math on them. OK, now that we have talked about data types, let's discuss key fields. A key field is the pivotal field. A key is the field or fields which identify the record. It is the field(s) which all of the other fields describe. In our checkbook database, the Transaction Number would be the key field. Every database must have a key, but the key doesn't have to be just one field. Let's say we had a database in which we recorded the daily sales for each of our salespeople. Initials of Date Salesperson Sales _______ ______________ _______ 2/1/93 S.C. 3000.00 2/1/93 N.K. -800.00 2/1/93 R.S. 1678.00 2/2/93 S.C. 258.00 2/2/93 N.K. 7800.00 2/2/93 R.S. 12562.00 2/3/93 S.C. 322.00 2/3/93 N.K. 22000.0 2/3/39 R.S. 1250.00 In this example the key would be the Date field and the Initials field, since you need to know both units of information to find out sales. In the previous example, we just looked at one database, a checkbook register. Most of the time, more than one database will be necessary. Different databases should be used to hold different types of information about people or things. Below are examples of some typical databases: Doctor's Office Retailer School Admin Library ___________________ ____________ ________________ _________ Patients Customers Students Members Prescriptions Inventory Classes Books Patient Accounts Invoices Teachers Donations Note how each database describes a different type of person or thing. Think about what types of information you need for your own use. On a piece of paper, write the heading "Databases," and list the different types of people and things you wish to store. Let's review what we've learned so far. We have defined a database. We know each database is made of multiple records, and that each record contains multiple fields. We also discussed how fields are classified by data types as well as what the different data types are for Alpha Four. Lastly, we said that most uses of Alpha Four require multiple databases, and that each database is a collection of specific kinds of information on people or things. OK, now let's divide the databases we wrote down into individual fields. Example: Databases (retailer) Customers CUSTOMER_#, NAME, ADDRESS, PHONE_# Invoices INVOICE_#, DATE, CUSTOMER_#, SHIPPING ADDRESS, PHONE_#, ITEMS PURCHASED Inventory STOCK_#, DESCRIPTION, COST, RETAIL, QTY_IN_STOCK After completing this exercise, you would have all the databases and fields you would need to get started. However, further planning at this stage can make your database more intelligible, and that can save you a lot of time in the long run. The process of refining a database's design is called "normalization." To help explain how to normalize databases, let's look back to the example of the retailer. Specifically, let's look at the invoices database. This is the structure of the Invoices database before normalization: Field Field Description _____ _________________ 1. INVOICE_# 2. DATE 3. CUSTOMER_# 4. SHIP_ADDR 5. PHONE_# 6. ITEMS_PUR The first step will be to break down the fields into smaller fields, where appropriate. First, let's look at the SHIP_ADDR field. The type of information we would store in this field would be the shipping street address, city, state, and zip code. By breaking down the SHIP_ADDR field into four smaller fields (i.e. STREET, CITY, STATE, ZIP), we will be given more flexibility. One of the advantages of making this separation is that it would be easier to search or order information by city, state, and/or zip code; this is especially useful for tracking shipments or utilizing mailing lists. Next, let's look at the field PHONE_#. We could also break this field down into two smaller fields, area code and seven digit phone number. However, because it is unlikely that we would ever want to search or order our information by area codes or seven digit phone numbers, it is best that we leave it as one field. The most tricky of the Invoice database's fields is the ITEMS_PUR field. With our current structure, this field could hold the stock numbers, descriptions, prices, and quantities for items being sold. To normalize this field, we would first want to break the information down into separate fields. We would end up with: STOCK#'s, DESC's, QTY's, and PRICEs. The problem we have now is that each field could still need to hold information of multiple items, unless we limited each invoice to record the sale of only one kind of item. If we look back the definition of a field (see glossary), we will see that a field is only supposed to hold ONE unit of information. To solve this problem, we have two options. The first is to create multiple fields for the number of items we could conceivably sell on one invoice. For example, we could have STOCK_#1, DESC_1, STOCK_#2, DESC_2, etc. This method is appropriate in some situations, however, it can be limiting for several reasons. First, each invoice can only have as many items as there are fields; second, it will take more storage space in your computer than the alternate method; third, it takes longer to set up and change field rules (discussed later); lastly, it takes more effort to search for data later on. The alternate, and generally preferred method, breaks the ITEMS_PUR field into another separate database, called the Line_Items database. The Line_Items database has the invoice number as the key field and as a linking field. Invoices Database Line_items Database INVOICE_# ---common-field---> INVOICE_# DATE STOCK_# CUSTOMER_# DESC SHIP_ADDR QTY PHONE_# PRICE Each record in the Line_Items database would represent one line of an invoice. This way the number of lines you could put on one invoice would be virtually unlimited, since you can keep entering more records into the Line_Items database. By breaking a field down into an entirely new database, we are creating a SET. Databases are linked together into SETs by common fields through the use of linking indexes. In this case, the common field between the Invoice and Line_Items databases is INVOICE_#. The linking index would contain INVOICE_# as the key field. The creation and use of SETs is an important and powerful tool. For more information, look at the reference manual and tutorial under the heading "Sets." Also, you may wish to request our document called "Invoicing Sets." This document may be requested by phone, fax, mail, or it may be downloaded from our BBS. An Important Note About Using Sets In this example, the link between the Invoice and Line_Items database is a one to many link (1:n). In other words, for each (one) INVOICE_# in the Invoice database, there could be multiple (many) records in the Line_Items database with a matching INVOICE_#. As a general rule, no database should have more than one 1:n link. If you're database design does, we strongly advise changing it. Unless you are very familiar with database theory, this type of situation can produce unpredictable results. Indexes In the above example, we mentioned the term "linking index." Without getting too technical, this is a brief description of what an index is: An indexes is a file which contains the records number and parts of data (called "keys") for each of the records in a database. The more common uses of indexes are ordering data, connecting databases in a set, performing lookups, eliminating duplicate entries during data entry, and finding records. If, for example, you had a database of customers, and you wanted to find the first record for a given customer, you would use an index based on the customer field. See the reference manual and tutorial for more information on indexes. As mentioned above, one time indexes are needed is when performing a lookup. A lookup is a type of field rule. Field rules are guidelines you set up that the data must conform to and/or steps to automate data entry. Field rules only apply when the data is entered or changed; they do not apply to data which have already been entered. A common field rule is "Case Conversion." You can set up case conversion to automatically capitalize, for example, the two letter state abbreviations as they are being entered. One of the most powerful field rules is a lookup. A lookup can check databases for the presence or absence of information, and can fill in other fields based on what it finds. For example, it can be used to fill in the city and state fields when a zip code is entered. It also check to make sure duplicate information is not being entered. Below is more information on field rules. Field Rules If people were perfect, there would probably be no need for field rules. The purpose of field rules is to make data entry faster and more accurate. You don't have to have field on your database, but they can be quite helpful. Below is a list of the most commonly used field rules. Many of these rules are discussed in the tutorial, and all of them are in the reference manual. This list is meant to provide an overview and for quick reference. Field Rule Description Calculated............Similar to a default, except the value can not be changed by the user, and the expression will calculate every time the record is updated, not just the first time it is entered. Case Convert..........Automatically converts words to uppercase, lowercase, or all capital letters. Default...............A value or calculation which fills in the first time a record is entered. The user can go back and change it if necessary. If you have an invoice screen, and you usually sell only to customers in one state, you can make the state field default to that state. Increment.............This sequentially numbers records as they are entered. Lookup................Checks a table or database for the presence (or absence) of information, allows display of this information in a pop-up window and can fill in fields based on this information. This is probably the most powerful of field rules. Mask..................Allows only certain kinds of information to be entered. It can, for instance, make sure only numbers are entered into a phone number field. Required..............Will not allow the user to save the record until this field is filled in. Skip Expression.......Skips over a field during data entry if specified conditions are met. Template..............Sets up fixed, non-editable characters in a field. For example. a social security number field always has a dash (-) after the first three numbers and the second three numbers. Validation Expression.Assures that correct data is entered based on a formula. For example, a date of birth field must have a date before today's date. Once the field rules are in place, you are ready to create your forms, browse tables, reports, letters, and/or mail labels. You can also automate and simplify the use of Alpha Four for yourself or your end users by the use of applications and scripts. Applications and Scripts Scripts and applications are means of automating the use of Alpha Four so that even users who are not familiar with the program can easily use it. Scripts are similar to small programs which can automatically perform operations in Alpha Four, make decisions, and prompt users for input. An application is a menu structure which can be set up to perform Alpha Four operations, play (or use) scripts, and provide enhanced security to your data. If you are designing an application for others to use, chances are you will want to create an application. Consult the reference manual and tutorial for more information. Also, Alpha Software's BBS is a good source for sample scripts and applications written by other Alpha Four users. Glossary 1:1...............A relationship in a SET where one record in one database matches one other record in another database. This is often referred to as a "one to one link." 1:n...............A relationship in a SET where one record in one database matches one OR MORE records in another database. This is often referred to as a "one to many link." Database..........A collection of records, sometimes called a table. Field.............A unit of information contained within a record. Field Data Types..The kind of data which can be stored in a particular field. Alpha Four has five different data types: Numeric, Logical, Memo, Date, and Character. Index.............A file which stores the order of records in the database (based on an index expression). For example, an index may sort records by last name. Key Linking Field.One or more fields that identify the record. A field (unit if information) which is common between two or more other records in the same or a different database. Normalization.....A process of breaking down databases into their simplest form. Record............A collection of fields. There are multiple records in a database. Set...............Two or more databases linked by one or more common fields. A set works much like one large database. Implementation Outline 1. Make a separate database for each set of related attributes, and give each database a primary key. 2. Normalize your databases. a. Divide fields into component parts when necessary. b. Divide repeating groups of fields into separate databases when necessary. (I.e. build Sets) c. Make sure no database has more than one 1:n link. 3. Decide on data types for each field. 4. Enter the databases, linking indexes, and sets into Alpha Four. 5. Decide on and set up field rules (optional). 6. Set up forms, reports, browse tables, letters, mail labels, and/or indexes (optional). 7. Set up scripts and applications to automate and simplify use of the program (optional).