IBM OS/2 Extended Edition Version 1.3 DATABASE MANAGER HIGHLIGHTS AND DIRECTION Philip J. Sullivan Senior Product Planner IBM Personal Systems Programming Center Austin, Texas T/L 793-2794 VM: SULLY @ AUSVM1 Trademarks The following terms are trademarks or registered trademark of the IBM Corporation in the United States and/or other countries: AIX Application System/400 DB2 ETHERAND Operating System/2 OS/2 OS/400 Personal System/2 Presentation Manager PS/2 QMF RISC RISC System/6000 SAA SQL SQL/DS Systems Application Architecture System/370 System/390 Token-Ring The following terms are trademarks of other companies as follows: Intel Intel Corporation Microsoft Microsoft Corporation Windows Microsoft Corporation Disclaimer Much of the information in this paper concerns future products, or future releases of products currently commercially available. The discussion on Windows 3.X is based upon information which the Microsoft Corporation has made publicly available, and is subject to change. The description and discussion of IBM's future products, performance, functions, and availability are based upon IBM's current intent and are subject to change. Special Notices References in this publication to IBM products, programs, or services do not imply that IBM intends to make these available in all countries in which IBM operates. IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not imply giving license to these patents. OVERVIEW IBM OS/2 Extended Edition Version 1.3, available as of December 1990, includes a Systems Application Architecture (SAA) relational database management system, named Database Manager, that supports the SAA Structured Query Language (SQL). Database Manager can operate as a single-user relational database, or as a multiple-user client-server distributed relational database management system attached to a local area network (LAN). Database Manager has benefited from IBM Research-developed database technologies and architectures, including the relational model of data and SQL. These IBM technologies and architectures, coupled with the new technologies and architectures developed expressly for Database Manager, have positioned Database Manager to be an industrial-strength relational database providing high performance, concurrent data access, robust data integrity and protection facilities, and consistency with the family of IBM SAA relational database management systems. ENVIRONMENT Client-Server Computing Paradigm The client-server paradigm is a significantly different computing environment from the traditional computing paradigm that centralizes processing power, on-line storage and print services. In a client-server paradigm information processing power, on-line storage and print services are de-centralized or distributed across the network among client workstations, workstation servers and host servers. Within the computer industry there is a consensus that client-server computing will be the preferred paradigm of the 1990s. A market study conducted by Business Research Group, Market Opportunities in Client/Server Computing, August 1990, identified two factors as prime drivers for the paradigm shift away from a totally centralized computing environment to a client-server computing environment. The two factors are: - the desire for more cost-effective and de-centralized systems, and - the desire to increase user productivity. The Business Research Group study also revealed that client-server has many different interpretations within the market. The client-server definitions fell into four categories; of which, client-server database was defined as: "Database applications that span across multiple computers on a network." To define and clarify the term client-server, Business Research Group has proposed the following: - "Client/Server Technology - the optimization of data storage, retrieval and computation between machines over current and future networks." - "Client/Server Applications - software that has the following characteristics: -- Uses a graphical interface; -- Enhances user effectiveness; and -- Uses client/server technology." Growing Need for More Information and Fast Access to Consistent Information The decade of the 1990s will continue the 1980s trend toward world-wide markets. Business enterprises around the world will increasingly develop, manufacture and distribute products to the global marketplace. Large enterprises, for the most part, are international enterprises with multiple establishments located in various countries. This characteristic is also true, although to a lesser extent, of medium-size enterprises and, in limited instances, to small enterprises, as well. Within these large, medium and small enterprises, individuals, workgroups, departments and establishments will have increasing need for quick access to consistent information. Many factors will contribute to the success, or failure, of business enterprises to compete and win in the global marketplace. One factor that will determine success, or failure, is the ability of business enterprises to rapidly respond to changing market requirements and deliver market-driven quality products. In order to be truly market-driven, enterprises will require an in-depth understanding of their customers' wants, needs and requirements, an in-depth understanding of their customers' satisfaction levels with existing products, and an in-depth understanding of market opportunities. The collection and management of this information, analysis of the information, sharing of the information and, ultimately, the management decision to apply the information toward the delivery of market-driven quality products will be fundamental to success in the marketplace. These needs will increasingly drive the demand for industrial-strength information management systems. Distributed Relational Database Management Systems During the mid 1980s, enterprises had begun using SQL relational databases for new business applications. And, by the late 1980s, SQL and relational database management systems had become preferred technologies for new business applications for all computing environments: personal computer, advanced workstation, mid-range and mainframe systems. During the 1990s, SQL relational database management systems will increasingly be employed to manage the information needs of the enterprise, departments, workgroups and individual persons within the enterprise. Relational database management systems will function as the repositories of vital information ... essential information necessary to fulfill the mission of the enterprise, the department, the workgroup, and/or the individual person. Distributed database management systems predicated on the client-server model will continue the momentum begun in the late 1980s. During the early 1990s distributed databases will emerge as the next significant step in the evolution of database management systems. Distributed relational database management systems potentially offer the following benefits: - Concurrent information sharing, or the capability to make information more readily available to multiple users and applications within an enterprise. - Increased productivity and economies of scale which may be derived through the placement of information in departments and workgoups, where the work occurs, throughout an enterprise on decentralized remote database management systems running on less expensive personal computer and advanced workstation servers. HIGHLIGHTS Relational Model of Data Database Manager is predicated on the relational model of data. The relational model of data was invented by E. F. Codd at the IBM San Jose Research Center in the late 1970s. Today, the relational model has been widely accepted as an industry standard by the marketplace. The relational model has been designed to be easy to understand and use. Information is presented to the user in an easy to recognize and easy to use table format. A table is a logical data structure consisting of rows (records) and columns (fields). The user defines and accesses data in terms of tables and performs operations on these tables. The main advantage of the relational database model is the clear separation between the user perception of data, i.e. tables consisting of columns and rows, and the internal implementation of data, which is hidden from the user. The simple table format, along with SQL and high-level application development tools, means that the application developer and the database administrator do not have to understand complex physical data structures and access methods, which are characteristics of hierarchical file management systems. The benefits are ease of use and productivity. Structured Query Language SQL is the common interface to Database Manager and all IBM SAA relational database management systems. SQL, originally developed at the IBM San Jose Research Center, has also become a standard in the industry. SQL is a high-level data definition and data manipulation language, used for defining, accessing and changing data in tables. SQL is considered simple to learn, yet powerful in expressing sophisticated queries. A single statement in SQL can perform the same function as many lines of application code developed with a conventional programming language, such as C, Cobol, Fortran, or Pascal. All data accesses to Database Manager are performed through SQL statements and the SQL common interface. SQL supports arithmetic operations on retrieved values. The query functions support selective retrieval from single or multiple data tables and dynamic sorting of the set of resulting rows. Built-in functions include summation, grouping, ordering, and basic statistics (for example, calculating an average of the values in a column). SQL statements can be entered interactively (through Query Manager described below), embedded in a precompiled application program written in a high-level programming language, or embedded in an application program written with IBM Procedures Language 2/REXX. Database Manager provides language precompilers to prepare embedded SQL statements for subsequent application program compilation and execution by Database Manager. Precompiler support is provided for: IBM C/2, IBM Cobol/2, IBM Fortran/2, and IBM Pascal/2. Database Manager supports both Static SQL and Dynamic SQL statements embedded in an application program. The difference between Static SQL and Dynamic SQL is the instance when compilation of the statements occurs at the database. For Static SQL statements, the compilation occurs during precompiling or binding of the application to Database Manager. The compilation is done only once, no matter how many times the statements are executed during the course of running the application. Dynamic SQL statements are compiled each time the statements are executed during the course of running the application. Static SQL is a more efficient process and will, in most instances, provide better performance than Dynamic SQL. However, if the application must issue arbitrary SQL queries, Dynamic SQL is required. Client-Server Distributed Database Architecture Database Manager has been architected to be a multiple-user client-server distributed database management system for IBM OS/2 local area network environments. The architecture provides for separation of the application program from the database itself. Database Manager remote-unit-of-work functions provide OS/2 and DOS client applications with the capability to transparently access data (read and update) in a single remote OS/2 database server, per unit-of-work. A client application is also able to serially access multiple databases. Database Manager client-server distributed database architecture provides a cost-effective solution to a wide-range of applications, such as decision-support and personal productivity applications, which require multiple users to have fast, concurrent, and easy access to consistent data. Performance Database Manager has demonstrated excellent performance. In October 1990, the National Software Testing Laboratory (NSTL) published results of benchmark tests of Database Manager. The purpose of the tests was to compare the performance of Database Manager with three competitive offerings, each one being a participant in the OS/2 client-server database marketplace. The test results showed Database Manager to have performance comparable or superior to the competitive offerings in three of four client-server application scenarios. In the fourth scenario, Database Manager, while not faster than the competition, was competitive. IBM database technologies have been applied to enhance performance. For example: query optimization techniques, join algorithms, and indexing and locking techniques. Record Level Locking allows maximum concurrent access to data and is especially important to database transaction processing performance. Granular levels of data isolation are provided to enhance performance. These include: Repeatable Read, Cursor Stability and Uncommitted Read. Database Application Remote Interface: may be used to enhance performance. It allows a developer to write an application program where the application processing can be split between the database client and the database server on a local area network. When the application is run, some of the application processing load can be transferred from the client to the server, resulting in a reduction of data traffic on the LAN and a significant improvement in database application performance. Robust Data Integrity and Protection Database Manager has comprehensive data integrity and protection features to preserve the integrity of data, thereby ensuring users that information being accessed is consistent. Abnormal application termination can result in a loss of data integrity. Through the use of a Transaction-Management scheme, a high level of data integrity and protection is achieved for the following abnormal situations: Description Cause Transaction failure One database transaction fails, but other database transactions survive, leaving the data in an inconsistent state. System failure System fails due to power failure, user-initiated system reset, or software failure. Media failure Data on the disk is destroyed by physical or logical damage. Physical data failure can occur due to a bad sector on the disk. Logical data error can occur when a system failure happens in the middle of a write operation; some sectors are left in an old st ate while some are updated. Logical data errors can also occur when the data is overwritten by another program and becomes unrecognizable to Database Manager. Transaction-Management: is a feature that allows multiple applications to run concurrently against common tables in Database Manager. If an application accesses Database Manager and terminates normally, a COMMIT statement is issued which allows the database updates to become permanent. If an application is interrupted in the midst of a transaction, the system can perform a ROLLBACK on all uncompleted work after an application failure, or on the next system restart after a system failure. These functions help ensure the integrity of the database information. Database Manager provides transaction-management through the use of locks, multiple levels of data isolation and a recovery log. All changes to data tables and indexes have log entries written that provide sufficient information to allow Database Manager to back out of an update before any changes are written to the data. The lock function and the specified level of data isolation are used to prevent another application from updating a data record while a transaction is pending against that record. Referential Integrity: is another important feature used to provide robust data integrity and protection. It ensures the consistency of data values between related columns of different tables. For example, a user may define an EMPLOYEE table that contains employee and department numbers and a DEPARTMENT table that contains department numbers. In addition, the user may want to ensure that for every department number in the EMPLOYEE table there must be an equal and unique department number in the DEPARTMENT table. Such a constraint defined on the EMPLOYEE table is called a referential constraint. The department number in the DEPARTMENT table is called a primary key, and the department number in the EMPLOYEE table is called the foreign key in this constraint. Enforcement of this constraint provides referential integrity. The Database Manager records and enforces this data relationship, and enforcement by application logic is not necessary. In addition to ensuring the consistency of data, Referential Integrity can also improve application development productivity by allowing this function to be moved out of each application program and into the Database Manager. Data Security In any database environment, there is a need to protect information from unauthorized access. Database Manager provides for protection from unauthorized access of data by a granular grant/revoke privileges scheme. SQL Grant/Revoke Security: prevents unauthorized access by coordinating security functions through an OS/2 Extended Edition component called User Profile Management and through SQL Grant/Revoke Authorization statements. User Profile Management establishes access levels used by Database Manager. In order to access and use objects in the Database Manager, the user must be identified to User Profile Management and be validated by a password on the first use of the Database Manager. The user is then associated with a valid USERID. Access to a specific database and the objects within it (for example, tables, views, access plans) is controlled by SQL Grant/Revoke statements. A creator, or other specifically authorized user of a database object (such as a systems administrator or database administrator), may protect the object by only granting access rights to specific users and/or groups. Another user must be specifically authorized to access and update a database object. These rights can also be revoked as required. A creator also has the option to allow public access to all database objects. IBM SAA Relational Database Consistency Database Manager is a member of the family of IBM SAA relational database management systems: IBM Database 2 (DB2) and IBM Structured Query Language/Data System (SQL/DS), which run on IBM System/390 and IBM System/370 mainframe and mid-range systems, and IBM OS/400 database manager, which runs on the IBM Application System/400. Syntax and semantics of the Database Manager SQL interface functions are designed for consistency with the family of IBM SAA relational databases. For example, the consistent handling of host variables, precompile, bind, and query optimization simplifies the application developer's work in writing heterogeneous cross-system applications for an enterprise-wide distributed database environment. Import Data from DB2 and SQL/DS Databases Database Manager provides a facility, named SQLQMF, that allows users to import table data stored in DB2 or SQL/DS into a Database Manager database. The SQLQMF facility takes host data exported by the IBM Query Management Facility (QMF), stored in QMF format, downloads the data, and converts it into a file that can be imported into a Database Manager table. DATABASE MANAGER COMPONENTS Database Manager consists of three major components: Database Services, Remote Data Services, and Query Manager. Database Services Accesses to Database Manager are performed by Database Services through the SQL common programming interface (CPI) and utility application programming interfaces (APIs). Database Services manages the data stored in the database, generates access plans to the data, and includes transaction-management, data integrity and protection, multiple-user concurrency and security functions. Database Services also provides a number of utility functions that assist the user in maintaining and manipulating the contents of a database. These utilities include functions to import and export data from a DOS or OS/2 file, save and restore individual data tables, perform database backup and restore functions, and optimize system performance. Remote Data Services Remote Data Services provides database connection services to allow Database Manager to function as a client/server distributed database management system on a LAN. This capability allows multiple OS/2 and DOS personal computer client applications to transparently access a remote OS/2 distributed database server attached to a LAN, or standalone OS/2 personal computer clients, not attached to a LAN, to access a remote OS/2 distributed database server. Remote Data Services supports the following LAN environments: IBM Token-Ring, IBM PC Network LAN, or ETHERAND and IEEE 802.2. OS/2 clients and OS/2 servers utilizes the APPC communication protocol, provided by Communications Manager, or the SQL LAN-Only Option (SQLLOO) to connect in a LAN environment. APPC is used to connect to a non-LAN environment. DOS clients attached to a LAN utilize the NetBIOS communication protocol to connect to a remote OS/2 distributed database server. Query Manager Query Manager provides database tools and functions that allow users to access data, manipulate data and administer data in Database Manager. Query Manager provides an easy to use prompted user interface which is based on the SAA Common User Access guidelines and takes advantage of the OS/2 Presentation Manager windowing and graphic services. The prompted interface hides SQL from the application user, so that the user does not have to learn SQL syntax. However, once a prompted query has been created, its SQL syntax can be optionally viewed. This allows users to learn SQL syntax so they can enter SQL statements in a non-prompted mode, should they desire to do so. Query and Report Writing Tools provide functions to create a database query (inquiry), initiate the data access request and display the results of the query via the report feature. The user may save the query and report objects and save or print the accessed information. A Business Graphics Interface provides the user with the capability to optionally install and use a third-party business graphics program that has been written to this interface. This permits graphic presentation (pie, line, bar, tower graphs, etc.) of report data that was accessed by the Query and Report features. Custom Application Tools provide functions that allow a user to create custom database applications. Panels may be used to create custom data entry and edit formats. Procedures may be used to create Query Manager commands or procedure language statements to execute previously defined query and form (report) objects, or to combine and automate several panel operations. Menus may be used to create a customized list of application selections. Items listed on the Menu may be selected to initiate and automatically run queries, generate reports, run and call procedures, panels, and other menus. Database Administration Tools provide functions to administer and manage a database system. For example: creating, altering and dropping tables, views, indexes; entering and editing data; defining referential constraints; importing and exporting data; reorganizing tables; and monitoring database activities. One of the Database Manager features that is supported through Query Manager is Operational Status. Operational Status provides a snapshot of information about current database activity. This feature provides information about where the databases are located, alias names, the time and date a database was last backed up, and how many applications are currently connected to a specific database. A Command line is also provided for users to issue Query Manager commands directly. These commands include functions such as print, get, import, and display. These commands may be used to execute Query Manager objects, such as Procedures, directly from the command line. FUTURE VERSIONS AND RELEASES OF DATABASE MANAGER Future versions and releases of Database Manager will focus on providing industrial-strength distributed relational database solutions for standalone LAN environments; and in concert with other IBM SAA relational database products, enterprise-wide distributed relational database solutions for interconnected LANs and host systems. Emphasis will be placed on: data integrity and protection, concurrency, flexibility, connectivity and interoperability, database administration, useability, security, performance and capacity. 1991 DIRECTION Industrial-Strength Enhancement: Forward Recovery Data protection is essential to customers who have mission critical database applications, such as an on-line order entry application. Database Manager industrial-strength capabilities will be enhanced through the addition of Forward Recovery functions. Forward Recovery, a.k.a, "Archival Log", will provide the user with the capability to recover lost on-line data due to media failure, such as a disk crash. During normal on-line database transactions, information necessary to maintain data integrity and consistency is preserved on log journals. Journals are normally kept on separate media from the database. Should media failure occur, a backup off-line copy of the database may be loaded on the system and restored as the on-line database. Forward recovery provides the means to apply log journal information against the restored database. Log journals contain changes made to the on-line database since the last backup (off-line copy) of the on-line database was made. After Forward Recovery has been completed, the state of the on-line database is as it was prior to the media failure and subsequent loss of data. The addition of Forward Recovery is consistent with our direction to continue to focus on providing leadership industrial-strength database functions. Forward Recovery enables Database Manager to be used for a variety of mission critical applications, such as on-line transaction processing (OLTP). Microsoft Windows 3.X Support Database Manager DOS Database Requester functions will be enhanced to provide support for applications developed for the Microsoft Windows 3.X environment. With this capability, Windows 3.X client applications written to the Database Manager SQL CPI will be able to transparently access a remote Database Manager server attached to a LAN. NetBIOS Support for OS/2 Clients NetBIOS support for OS/2 clients will be provided as an optional LAN communication connectivity protocol to an OS/2 database server. NetBIOS is a popular personal computer protocol for connecting personal computers together. NetBIOS will require less random access memory (RAM) than APPC, which will result in a reduction in the amount of memory required for client workstations. This option will also simplify database installation and configuration. Database Tool Enhancements Database Administration Tools will be provided as a separate installable option. The tools will include functions to configure Database Manager and a database, backup and restore a database, perform forward recovery, and configure remote client-server and gateway connections. The Database Administration Tools will have graphical user interfaces that provide an easy to use prompted interface which is based on the SAA Common User Access guidelines and takes advantage of the OS/2 Presentation Manager windowing and graphic services. Database Manager Command Interface will provide the user with the capability to create and run SQL statements, database environment commands and database utilities from the OS/2 command line interface and from OS/2 command files. Included with the Database Manager Command Interface is a facility (Re-Org Check) to inform the database administrator that it is beneficial to reorganize a table in the database, so that performance may be improved. Support for Non-IBM Personal Computer Platforms Database Manager will be enabled to run on selected IBM- compatible personal computer hardware systems. Support for non-IBM systems will extend the versatility of Database Manager and provide customers with additional flexibility in selecting hardware platforms: IBM Personal System/2 solutions, non-IBM hardware solutions, or mixed IBM and non-IBM hardware solutions. SAA Relational Database Compatibility Enhancements SAA enhancements will improve Database Manager compatibility with the IBM family of SAA relational database products. SQL Date and Time Arithmetic and Scalar Functions will provide applications with the capability to add and subtract the following data types: TIME, DATE, TIMESTAMP. An example of the use and value of SQL DATE/TIME/TIMESTAMP is an application designed to track and monitor the elapsed time of a commercial airline flight from the originating city and gate to the destination city and gate. SQL State will provide applications with diagnostic information which is common across the family of IBM SAA relational database management systems. This capability makes possible the development of error handling routines which are consistent with and portable across the SAA database family. User Defined Collating Sequence will allow the database user to specify methods which will be used to sort and compare character data. This option may be used to ensure collating results which correspond to IBM host systems, such as System/370. Translate Translate will allow the user to transform character data in various ways; in particular, it makes possible the conversion of characters to their uppercase or lowercase equivalent. This capability makes possible case insensitive searches. Database Manager "Front-Ends" (Data base Applications and Tools) IBM will continue to provide Database Manager technical support to complementary independent application developers and to business partners. This support will be provided through developer assistance programs, whose purpose is to provide assistance in the design and development of a variety of OS/2 Presentation Manager, Windows 3.X, and DOS database applications and tools, such as general purpose query and report writers and customizing tools, that work with Database Manager. Make Available Database Manager Code to Selected Software Developers IBM's direction is to make components of Database Manager code available to complementary software developers who meet IBM's selection criteria. Selected software developers may integrate, package and market Database Manager code with application programs they develop. General purpose database query and report writing programs, and industry specific custom application programs are examples of the types of programs that may use Database Manager code. IBM Database Manager code can provide the software developer with industrial-strength relational database management services, distributed database application connection services to IBM OS/2 database servers attached to a local area network, and distributed database application connection services to IBM SAA host relational databases. FUTURE DIRECTION OS/2 Database Manager Access to IBM SAA Host Databases As IBM begins the roll-out of heterogeneous SAA distributed relational database functions, Database Manager will initially provide functions to support heterogeneous remote-unit-of-work (RUOW) database applications. Support for RUOW and the IBM Distributed Relational Database Architecture (DRDA) will allow Database Manager client applications (OS/2, Windows 3.X, and DOS) to transparently access (read and update) a single remote IBM SAA host distributed relational database, per unit-of-work. A Database Manager client application will also be able to serially access multiple SAA host relational databases. Database Manager Database Application Gateways, as single-user and multiple-user offerings, will provide the transparent connection between personal computer database clients and an IBM SAA host relational database server, such as IBM DB2, IBM SQL/DS, or IBM OS/400 database manager. Over time Database Manager will be enhanced to provide additional distributed database functions. Support will be provided for increasingly more complex data access, such as distributed-unit-of-work, distributed (SQL) statements, and distributed tables. Compatible 32-BIT OS/2 and AIX Database Managers IBM's direction is to provide compatible 32-bit client-server SAA and AIX Database Managers that fully exploit the 32-bit OS/2 and AIX platforms. The 32-bit OS/2 Database Manager will be provided for Intel-based IBM Personal System/2 computers and IBM-compatible personal computers. The 32-bit AIX Database Manager will be provided for IBM RISC System/6000 platforms. The OS/2 Database Manager and the AIX Database Manager will be optimized for LAN performance and interoperability, conform to industry and de facto standards, and provide enhanced industrial-strength capabilities. Parallel Database Processing IBM's direction is to enhance Database Manager to provide 32-bit parallel database processing. Parallelism will support multiple system processor nodes, such as multiple IBM Personal System/2 computers, or multiple IBM RISC System/6000 computers, that are linked together to present a single-system image, Database Manager parallel functions will provide customers with high throughput rates, fast response times, access to very large (gigabytes) amounts of information, and fault tolerant functions. Database parallelism will be especially beneficial to on-line transaction processing (OLTP) applications.