Reference

How to Use the Documentation

Please read the Release Notes before installing OmniPage. The notes include up-to-date lists of supported scanners, compatible file formats, and any last-minute information concerning the current release of OmniPage.

Use this Reference manual to find specific information about any OmniPage feature. It describes all the commands and settings, how to use the editor, how to improve performance, and how to troubleshoot common problems. This information is also available in OmniPage's online Help system.

Chapter 2 contains a variety of tutorial exercises to help you learn OmniPage and see what it can do to streamline your workload.

OmniPage Professional contains Caere's 24-bit image-editing program, Image Assistant. The Image Assistant Tutorial booklet introduces the program's basic features. Refer to the Image Assistant online Help system for detailed information about Image Assistant's features.

Some features described in this documentation are available only in the OmniPage Professional version of the product. These descriptions are marked "Professional version only."

Assumptions

We assume that you know how to work in the Microsoft Windows environment. If you have questions about how to use dialog boxes, scroll bars, edit boxes, and so on, please refer to the Windows User's Guide.

CAERE CORPORATION
100 Cooper Court
Los Gatos, California 95030

European Offices:
CAERE GmbH
Ismaninger Strasse 17-19
81675 Munich, Germany

OmniPage and OmniPage Professional
Windows Version 5

Copyright © 1994 Caere Corporation. All rights reserved.

CAERE, OmniPage, OmniPage Professional, Image Assistant, AnyPage, AnyFax, 3D OCR, and True Page are trademarks of Caere Corporation. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Such designations appearing in this manual have been printed in initial caps.
Product Serial Number: (from Disk #1 label)

Table of Contents

Chapter 1 Installation
  What's in the Package  1-2
  System Requirements  1-3
  Security Lock (International Versions Only)  1-4
  Saving Previous User Dictionaries Before Installation  1-5
  Installing the Software  1-6
  Setting up a Windows Swap File (Virtual Memory)  1-8
  Starting OmniPage  1-10
  Selecting Your Scanner  1-11
  Conserving Disk Space  1-12

Chapter 2 Tutorials
  Before You Start  2-1
  Tutorial 1 - Basic Text Recognition  2-2
    The OCR Process  2-2
    Automatic OCR with the Default Settings  2-6
    Touring the Toolbar  2-13
    Touring the Settings Panel  2-15
    Using the Process Buttons  2-23
    True Page Recognition (Professional version only)  2-29
    Opening a Graphic in Image Assistant (Professional version only)  2-32
  Tutorial 2 - Document Types and OCR Settings  2-35
    Setting a Zoning Method  2-36
    Complex Layouts  2-38
    Standardized Forms  2-42
    Legal Documents and Spreadsheets  2-49
    Documents with Specialized Characters (Professional version only)  2-50
    Foreign-Language and Multilingual Documents  2-56
  Tutorial 3 - Streamlining the OCR Workflow  2-58
    Saving a Settings File for Specific Documents  2-58
    Scanning Large Jobs  2-60
    Opening Multiple Image Files  2-63
    Exporting Images  2-64
    Deferring Recognition (Professional version only)  2-66

Chapter 3 Commands and Settings
  The Toolbar  3-3
    Shortcut Command Buttons  3-3
    Processing Buttons  3-5
    AUTO Button  3-5
    Image Button  3-7
    Zone Button  3-10
    OCR Button  3-12
  The File Menu  3-14
    Open Document  3-14
    Close Document  3-16
    Mail  3-16
    Save As  3-16
    Export Image  3-19
    Revert to Saved  3-21
    Get Accuracy Info  3-21
    Save Settings  3-24
    Load Settings  3-25
    Save Zone Template (Professional version only)  3-25
    Print  3-27
    Publish to Envoy (Professional version only)  3-27
    Exit  3-28
  The Edit Menu  3-29
    Cut  3-29
    Copy  3-30
    Clear  3-30
    All Zones  3-31
    Select All in Page  3-31
    Check Recognition  3-31
    Delete Recognized Zone  3-35
    Select Recognized Zones  3-35
    Delete Current Page  3-36
    Go to Page  3-36
  The Format Menu  3-37
    Character  3-37
    Paragraph  3-38
  The Process Menu  3-41
    Auto  3-41
    Stop  3-42
    Scan Image  3-42
    Load Image  3-43
    Auto Zones  3-45
    Manual Zones  3-46
    Use Template (Professional version only)  3-50
    Perform OCR  3-51
    Defer OCR (Professional version only)  3-51
    Train OCR (Professional version only)  3-52
    Process Settings  3-55
    Finish Current Document  3-55
    Finish Deferred Documents (Professional version only)  3-57
    Start Image Assistant (Professional version only)  3-59
  The Settings Menu  3-60
    Settings Panel  3-60
    Select Scanner  3-62
    Select Languages  3-62
    Edit Training File (Professional version only)  3-63
    Edit Zone Contents File  3-66
    Edit User Dictionary  3-68
  The Window Menu  3-70
    Tile Horizontal  3-70
    Tile Vertical  3-70
    Cascade  3-70
    Arrange Icons  3-70
    Hide/Show Toolbar  3-71
    Hide/Show Status Bar  3-71
    Hide/Show Ruler  3-71
    Zone Window  3-71
    Text Window  3-71
    Zoom In  3-71
    Zoom Out  3-71
  The Help Menu  3-72
    Contents  3-72
    Procedures  3-72
    Using Help  3-7
    About  3-7

Chapter 4 The Settings Panel
  Settings Panel Overview  4-2
  Selecting Settings Panel Options  4-3
  Scanner Options  4-4
    Page  4-4
    ADF  4-5
    Options  4-6
  Zones Options  4-9
    Columns  4-9
    Single Column or Table  4-10
    None  4-10
  OCR Options  4-11
    Input Options  4-11
    Use Language Analyst  4-12
    Retain Graphics  4-13
    Output Options  4-14
  Fonts Options  4-17
    Retained Font Formats  4-17
    Ignored Font Formats  4-18
  Spelling Options  4-19
    Spell Checking Options  4-20
  Preferences Options  4-21
    Save Page Images in Caere Document  4-21
    Prompt Before Deleting Pages  4-22
    Save Settings on Quit  4-22
    Reject Character  4-22

Chapter 5 Editing Recognized Documents
  Choices Before OCR  5-2
    OCR Output Options  5-2
    Font Options  5-6
    Retaining Graphics  5-7
    Language Analyst  5-8
    Languages and Dictionaries  5-9
  Editing Options After OCR  5-12
    Overview of the Text Window  5-12
    Checking Recognition  5-13
    Verifying the Image  5-14
    Formatting the Page and Editing Text  5-14
    Saving Your Document  5-19

Chapter 6 Improving Performance
  Improving Speed  6-2
    Manual Brightness  6-2
    Language Analyst  6-4
    Manual Zones  6-5
    Set Up a Permanent Windows Swap File  6-5
  Improving Accuracy  6-6
    Document Quality  6-6
    Scanner Options  6-7
    Scanning Angle  6-7
    Scanner Glass Clarity  6-8
    Paper Transparency  6-8

Chapter 7 Troubleshooting
  Before You Begin  7-2
  Installation  7-3
    Installing OmniPage with the Norton Desktop  7-3
    Conflicts with Disk Cache Programs  7-3
    Using EMM386.EXE  7-4
    SETUP repeatedly requests the same disk  7-4
    Testing OmniPage with a Simplified System  7-4
  Scanners  7-5
    The Scan Image commands are grayed out  7-5
    "Can't Open Scanner" message displays  7-5
    Microtek Scanners  7-5
    Testing OmniPage with the Sample Pages  7-5
    Checking the Scanner Driver Name and Version  7-6
    Checking the Scanner Hardware  7-9
    Changing your Scanner Installation  7-9
    Scanning Causes System Crash  7-10
  Memory  7-11
  Operation  7-12
  Error Messages  7-15
  Caere Product Support  7-24
    Dial-up Services  7-24
    Information We Need From You  7-24
    International Support  7-24

Chapter 8 Understanding OCR
  How OCR Works  8-1
  Basic OmniPage OCR Technologies  8-2
    AnyFont  8-3
    Page Analysis  8-3
    Character Experts  8-4
    Self-Learning OCR  8-5
  AnyPage  8-5
    Compound Neural System  8-6
  AnyFax  8-7
  The Language Analyst  8-7
    Trigram Analysis  8-7
  3D OCR  8-8
  True Page  8-8

Glossary

Appendix A How to Use True Page
  When to Use True Page  A-2
  True Page Considerations for Different Documents  A-4
    Business Documents  A-5
    Legal Documents  A-6
    Newspaper and Magazine Articles  A-7
    Tables and Spreadsheets  A-8
  Settings Panel Options  A-9
    OCR Options  A-9
    Zones Options  A-10
    Font Options  A-11
  Target Applications  A-12
    Working in a Target Application  A-12

Appendix B Technical Information
  TWAIN Scanners  B-2
    with the Canon CJ-10 Scanner  B-2
    Use Relisys and Umax Scanners with TWAIN Only  B-3
    TWAIN Scanners Not Listed  B-3
    Using Microtek Scanners with TWAIN  B-3
  Supported Output File Formats  B-4
  Formatting Information  B-6
    One-Page, True Page Output in WordPerfect for Windows 5.2  B-6
    True Page Output to WordPerfect for Windows 6.0 Format  B-6
    Recognizing Wide Text Zones May Cause Incorrect Margins  B-6
    Saving to Spreadsheet Applications  B-7
    Incorrect Font Size Output  B-7
    Retaining Graphics using AUTO and HP AccuPage  B-8
  Other Important Information  B-9
    Required HP Printer Driver  B-9
    Resolution for Orchid Fahrenheit 1280 Video Card  B-9
    Recognizing Legal, Landscape Pages with 3D OCR  B-9
    After Dark Star Trek Edition 1.0 and Image Assistant  B-9
    Calibrating a Printer for Image Assistant  B-9

Index

Chapter 1 Installation

Please read this section carefully! It includes:
- What's in the Package
- System Requirements
- Security Lock (International Versions Only)
- Saving Previous User Dictionaries Before Installation
- Installing the Software
- Setting up a Windows Swap File (Virtual Memory)
- Starting OmniPage
- Selecting Your Scanner
- Conserving Disk Space

What's in the Package

Your OmniPage or OmniPage Professional 5.02 package includes:
- OmniPage Installation disks
- OmniPage Reference manual
- Image Assistant Tutorial booklet
- Color Calibration Chart for Image Assistant (Professional version only)
- Warranty Registration Card

If anything is missing, please contact your Caere dealer.

Please write your warranty registration number (printed on the disk labels) in this manual. The number should be nnnnX-Xnri~rinnnnn, where n is a digit and X is a letter.

System Requirements

To install and run OmniPage, you need the following setup:
- Computer with an 80386 or higher processor.
- Microsoft Windows version 3.1 or higher.
- Windows-compatible mouse.
- Total system memory of at least 8MB RAM. 12MB RAM is recommended for Windows for Workgroups users to optimize speed.
- 4MB or larger permanent Windows swap file.
- OmniPage requires hard disk space with at least 24MB available: 10MB for OmniPage files (international versions require 14MB), 10MB for temporary storage while OmniPage is running, and 4MB for the Windows swap file.
- OmniPage Professional requires hard disk space with at least 27.5MB available: 13.5MB for OmniPage files (international versions require 18.5MB), 10MB for temporary storage while OmniPage is running, and 4MB for the Windows swap file.
- Image Assistant (Professional version only) requires a Super-VGA color monitor with 512K of memory on the adapter card to view 256 colors. To view all 24 bits of color (millions of colors) in 24-bit color images, you need a 24-bit video card.
  Image Assistant uses large amounts of free hard disk space when processing images. The more free disk space you have, the more you can edit and process large image files.
- An OmniPage-compatible scanner. OmniPage supports most Windows-compatible scanners; see TWAIN Scanners on page B-2. Your scanner must be installed and tested according to the manufacturer's instructions.
  OmniPage can open TIFF image files produced by other scanners. See Supported Input File Formats on page 3-8.

Security Lock (International Versions Only)

International versions of OmniPage use a hardware key as a security lock to prevent unauthorized use of Caere software. The security lock, included in the OmniPage package, is a small device that fits between your computer's parallel printer port (LPT) and a parallel printer cable (if used). The security lock must be installed in order for OmniPage to work.

To install the security lock:
1 Plug the security lock into your parallel printer port (LPT).
2 Plug the parallel printer cable (if used) into the security lock.

The security lock will not affect printer use.

[Figure: printer cable (if used), security lock, and parallel port (LPT)]

Security Lock with HP ScanJet Scanners

The combination of the security lock, an HP ScanJet or ScanJet Plus scanner with an HP 88920 scanner interface board, and a computer with an on-board printer port may not work.
The error message "Cannot write to device HP Scan" may appear when you try to select a scanner or scan a page with the security lock installed. If this occurs, you must add another parallel port via an add-in card and attach the security lock there.

Saving Previous User Dictionaries Before Installation

Before you install OmniPage or OmniPage Professional 5.0, save any user dictionary (*.ud) created in a previous version of OmniPage as a text file. Otherwise, the dictionary is overwritten during installation of the later version.

To save a previous user dictionary as a text file:
1 Open your older version of OmniPage.
2 Choose User Dictionary... in the Defaults menu.
3 Select the user dictionary file to save and click OK. The Edit User Dictionary dialog box appears.
4 Click Export...
5 Save the dictionary as a text file in a different directory.
6 Install and open your newer version of OmniPage or OmniPage Professional.
7 Choose Edit User Dictionary... in the Settings menu. The Select File dialog box appears.
8 Click New. The File to Save dialog box appears.
9 Enter a name for your new user dictionary and click OK. The Edit User Dictionary dialog box appears.
10 Click Import...
11 Select the user dictionary you saved as a text file and click OK.

See Edit User Dictionary... on page 3-68 for more information on importing a text file.

Installing the Software

To optimize installation speed, make sure SMARTDrive is loaded before installing OmniPage. Do one of the following to load SMARTDrive:
- Type smartdrv at the MS-DOS prompt before you start Windows.
- Or, add the SMARTDrive command line to your autoexec.bat file.

For more information, see the Optimizing Windows chapter in your Windows User's Guide.

To install OmniPage software:
1 Start Windows and open the Program Manager window.
2 Insert the copy of OmniPage disk #1 in drive A: (or B:) of your computer.
3 Choose Run in the Program Manager File menu.
  The Run dialog box appears.
4 Type A:\SETUP (or B:\SETUP) in the Command Line edit box and click OK.
  A dialog box prompts you to choose where to install OmniPage. OMNIPAGE is the default directory for first-time OmniPage users. OMNIPRO is the default directory for first-time OmniPage Professional users.
  If you are installing an upgrade of OmniPage in the same directory as your current version, existing OmniPage files are automatically deleted.
5 Click Continue to start installation. A progress meter appears.
6 Insert the other installation disks as prompted.
7 Select your geographic location, North America or Not North America, in the dialog box that appears and click OK. Your selection determines the default dictionary.
8 Type your registration information in the dialog boxes that appear. You will be prompted to print out this information.
  Product support is only available to registered users. Please send the printed registration information to Caere in the supplied envelope. Outside the US or Canada, be sure to use the correct envelope. Or, fill out and return the supplied registration card.
9 Click OK in the notification box when OmniPage notifies you that installation is complete.
10 Restart Windows.

Setting up a Windows Swap File (Virtual Memory)

OmniPage performs faster the more available memory you have. 12-16MB RAM is recommended for optimal performance. Set up a permanent Windows swap file with a minimum of 4MB of free, contiguous disk space to further improve disk speed.

A swap file acts as virtual memory. Free disk space set aside as a swap file is used as if it were additional memory. This lets you run more programs than you could with memory alone, but it is slower than using regular memory.

The disk space used for a swap file is separate from the disk space needed for temporary storage while you are working on a file.
Be sure to allocate enough free disk space for both a swap file and temporary storage.

Windows 3.1 automatically creates a swap file at setup. You can change the size of the swap file through the Control Panel. Before setting up or changing a swap file, you may need to optimize (defragment) your disk to maximize the amount of contiguous free disk space. Contiguous means that the free disk space is literally one solid, empty block. Utility programs such as Norton Utilities can defragment a hard disk.

For more information about swap files, see the Optimizing Windows chapter in your Windows User's Guide.

To set up or change a Windows swap file (virtual memory):
1 Start Windows in Enhanced mode by typing win /3
2 Double-click the Control Panel icon in the Main window of the Program Manager.
3 Double-click the 386 Enhanced icon to open the 386 Enhanced dialog box.
4 Click the Virtual Memory button to open the Virtual Memory dialog box.
  This dialog box displays the location, size, and type of swap file. The swap file should be at least 4096KB.
5 Click the Change button to expand the dialog box.
6 Select a new drive in the Drive list if you want to locate the swap file somewhere other than the default drive.
  For example, you can store the swap file on a second hard disk that is faster or larger than the default. If you can't find a drive with at least 4096KB of free space, try deleting some files and optimizing the disk again.
  A swap file must be located on an uncompressed drive. If you use DoubleSpace or another disk compression method, consult its documentation regarding swap files.
7 Select Permanent from the Type list.
8 Type 4096 or greater in the New Size edit box and select Use 32-Bit Disk Access if it is available.
9 Click OK in the Virtual Memory dialog box and click Yes to verify changes to virtual memory.
10 Restart Windows.
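As a rough check on the figures in this chapter, the short Python sketch below tallies the free disk space that System Requirements calls for: program files, temporary storage, and the permanent swap file. The function and table names are illustrative assumptions, not part of OmniPage; only the sizes come from this manual.

```python
# Illustrative only: these names are hypothetical, not part of OmniPage.
# Sizes (in MB) are the ones listed under System Requirements.
PROGRAM_FILES_MB = {
    ("standard", False): 10.0,      # OmniPage
    ("standard", True): 14.0,       # OmniPage, international version
    ("professional", False): 13.5,  # OmniPage Professional
    ("professional", True): 18.5,   # OmniPage Professional, international
}
TEMP_STORAGE_MB = 10.0  # temporary storage while OmniPage is running
MIN_SWAP_KB = 4096      # minimum permanent Windows swap file


def required_disk_mb(edition, international=False, swap_kb=MIN_SWAP_KB):
    """Return the free disk space (MB) needed for files, temp storage, and swap."""
    if swap_kb < MIN_SWAP_KB:
        raise ValueError("the swap file should be at least 4096KB")
    files_mb = PROGRAM_FILES_MB[(edition, international)]
    return files_mb + TEMP_STORAGE_MB + swap_kb / 1024


print(required_disk_mb("standard"))      # 24.0
print(required_disk_mb("professional"))  # 27.5
```

The swap file is counted separately from temporary storage, which mirrors the reminder above to allocate enough free space for both.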
Starting OmniPage

To start OmniPage, do one of the following:
- Double-click the OmniPage or OmniPage Professional icon in the Windows Caere Applications program group box.
- Or, choose Run in the Windows Program Manager File menu. Type in the drive and directory where OmniPage is installed, followed by the OmniPage command. For example: C:\Omnipage\omnipage for OmniPage or C:\omnipro\omnipage for OmniPage Professional.

Selecting Your Scanner

Your scanner must be installed according to the manufacturer's instructions. Make sure that the scanning software supplied by the manufacturer works on your system before you install OmniPage.

The first time you run OmniPage, the Select Scanner dialog box appears for you to select your scanner. Depending on the make of your scanner, a dialog box may appear prompting you for scanner driver parameters such as I/O addresses. Please consult your scanner's documentation for information about port or memory addresses.

The ScanMgr icon appears at the bottom of the Windows desktop after you select a scanner. You can change your scanner selection at any time.

For a list of supported scanners, or to select a TWAIN scanner, please see TWAIN Scanners on page B-2.

To change your scanner selection:
1 Install and test the scanner according to the manufacturer's instructions.
2 Start OmniPage.
3 Choose Select Scanner... in the Settings menu. The Select Scanner dialog box appears.
4 Select the name of your scanner and any required scanner driver parameters.
5 Click OK.

If you experience a system crash when you try to scan, add the following line under [386Enh] in your system.ini file:
  EMMExclude=A000-EFFF
and then restart Windows.

Conserving Disk Space

OmniPage copies all file format conversion filters and ScanMgr files to your hard disk during installation. The size of the average filter file is 26KB. The size of the average ScanMgr file is about 18KB.
To save disk space, you can delete unused file format conversion filters and ScanMgr files from your hard disk. ScanMgr files and conversion filters for supported file formats are listed in Appendix B. To reinstall a full set of conversion filters and ScanMgr files, you must reinstall OmniPage.

Chapter 2 Tutorials

This chapter contains three tutorials, each of which contains a number of exercises. The tutorials take you through basic text scanning and into more advanced concepts such as how to create OCR training files, scan a large stack of documents, and use deferred page recognition to maximize your efficiency.

There are three tutorials:
- Basic Text Recognition
- Document Types and OCR Settings
- Streamlining the OCR Workflow

Before You Start

1 Be sure your scanner is attached, turned on, and working with your system.
2 Make sure you have the following page samples, which you need to work through the tutorials in this chapter:
  - Multiple Column Page Sample
  - Single Column Page Sample
  - True Page Sample (Professional version only)
  These samples were included with your OmniPage package.
3 Save the files as directed during the exercises so you can use them in later exercises.

Tutorial 1 - Basic Text Recognition

OmniPage lets you scan documents and recognize text with the click of a single button in the toolbar. The toolbar also puts the most common OCR options at your fingertips. OmniPage gives you efficient, flexible control over your documents: you can stop, backtrack, and restart at any stage without repeating the whole process.

This chapter takes you through basic scanning and text recognition exercises with OmniPage. After completing the exercises in this chapter, you will know how to:
- Use the Auto feature to recognize text in standard pages.
- Use the toolbar and Settings Panel.
- Use the process buttons to scan, zone, and recognize text.
- Check and change OCR results.
- Work with the True Page option (Professional version only).
- Launch Image Assistant from OmniPage to work with graphics (Professional version only).

The OCR Process

This exercise acquaints you with the OmniPage application window and gives you a brief overview of the OCR process. There are two steps:
1 Open OmniPage.
2 Reset the defaults if necessary.

Open OmniPage

Open OmniPage by double-clicking its icon in the Program Manager. Caere Applications is the default program group.

[Figure: The OmniPage Toolbar. A single click of the AUTO button processes your documents automatically.]

OmniPage's toolbar contains an AUTO button, three large process buttons, and several shortcut command buttons, as pictured above.

The process buttons outline the basic flow of OCR. The Image button determines where the page images come from. The Zone button determines whether recognition zones are automatically or manually set. The OCR button determines how and when OCR is performed.

Basic OCR includes:
- Where the page images come from. OmniPage can scan documents or open image files.
- Whether recognition zones are automatically or manually set. OmniPage can automatically define and order the page areas to be recognized.
- How and when OCR is performed. OmniPage can perform optical character recognition on the page right away (Perform OCR). In the Professional version only, it can also perform OCR at a later time (Defer OCR) and learn special characters or symbols (Train OCR).

The smaller buttons are shortcuts to menu commands such as Copy, Save, Check Recognition..., and so forth.

Reset the Defaults (if necessary)

The default settings are active the first time you open OmniPage. If you have not changed any settings, proceed to the next section, Automatic OCR with the Default Settings.
Otherwise, follow these steps to return to the default settings:
1 Click the drop-down list under each process button and select these options:
  - Scan Image
  - Auto Zones
  - Perform OCR (only the Professional version has a drop-down list)
  Click the arrow to open the options list. Then click the option you want to set.
2 Click the Settings Panel button in the toolbar to open the Settings Panel.
3 Click Use Defaults to return to the default settings.
4 Click Yes in the dialog box that asks if you are sure.
5 Click Close to close the Settings Panel.
  You can leave the Settings Panel open if you have room on your screen. (This is useful if you need to change the settings frequently.)

The next exercise shows you how to use the OmniPage default settings to scan a page and recognize its text.

Automatic OCR with the Default Settings

OCR is easy with OmniPage even when the page itself is complex. Just click the AUTO button and OmniPage goes to work: it determines scan intensity and column structure, then performs OCR.

In this exercise, you will use the Multiple Column Sample practice sheet, scanning with the default settings. (If the defaults have been changed, please reset them as described on the previous page.) There are three steps:
1 Click AUTO to start the process.
2 Check the results in the text window.
3 Save the file.

Click AUTO to Start the Process

1 Place the Multiple Column Sample in your scanner.
  For best text recognition, make sure the page is aligned and oriented correctly with the text facing down. Most scanners have an arrow or graphic that indicates proper page placement.
2 Click the AUTO button.
  The AUTO button changes to a STOP button. OmniPage highlights each process button in turn as its function is in progress. The status bar at the bottom of the window also reports progress.
  - The Image button is outlined in black as OmniPage creates a recognition document from the scanned page.
  - The scanned image opens in the zone window. The Zone button is highlighted. OmniPage determines column flow for the text and divides it into recognition zones, each surrounded by a rectangle. This shows how OmniPage will order the text as it recognizes the image.
  - The OCR button is outlined in black as character recognition takes place. A character window opens with an enlarged view of the text during the OCR process. When you use the default settings, OCR makes three passes over the text: a cyan pass for initial recognition; a blue pass as the recognition text is analyzed and corrected; and a dark blue pass for the final recognition stage.
3 View the recognized text document in the text window.
  If you have OmniPage Professional, the OCR Settings Panel option True Page Retain All Page Formatting is the default, so your text output matches the original document as closely as possible. See True Page Recognition (Professional version only) on page 2-29 for more information about scanning with this feature.
  If you are using the non-professional version of OmniPage, the OCR Settings Panel option Retain Font and Paragraph Formatting is the default. OmniPage matches the font and paragraph format to the original, but the text is displayed in one column in order of recognition.
  If the text is not ordered by either method described above, you may have misaligned the page in your scanner. Realign the page and try scanning again.

Check the Results in the Text Window

After the image is converted to editable text, view the recognized document in the text window and compare it to the scanned image in the zone window, then check OCR results. OmniPage highlights words or characters in the text that were changed by the Language Analyst or identified as questionable or unrecognizable.
(If there are no highlighted words, there were no known errors.)
- Blue: words the Language Analyst changed are highlighted in blue.
- Green: "suspects," words that may not have been recognized correctly, are highlighted in green.
- Red: unrecognizable characters, or "rejects," are marked with a red tilde (~).

1 Choose Tile in the Window menu so that you can compare the text and zone windows if the text window is maximized.
2 Click the maximize arrow at the top right of the window to return the text window to full size for easier editing.
3 Click the Check Recognition button to check for potential OCR errors. (This button is dimmed if the text window is not active.)
  The Check Recognition window opens with the image and the text of the first word that was replaced or questioned during OCR.
4 Correct any errors in the text.
  If the word is misspelled, you can correct the spelling in the Change To edit box and click Change.
  OmniPage may list one or more suggested words in the Change To drop-down list. The first word in the list is the word as OmniPage recognized it. Click on a suggested word and click Change to replace the word in the text. Alternatively, type the proper word in the Change To edit box.
  If the word is correct as presented, you have two choices:
  - Click Add to add the word to your User Dictionary. Words added to the User Dictionary are considered acceptable spellings in future documents.
  - Click Ignore to ignore the currently flagged word. Other instances of the word in the document will still be checked.
  After you click a button, OmniPage automatically moves to the next word it identified as misspelled or questionable. Continue correcting text as necessary.
5 Double-click any word in the text window.
  The Verification window opens to display the corresponding word in the original scanned image. Verify that the recognized word is the same as the word in the original. If it is not, retype the word correctly in the text window. Click anywhere in the text window to close the Verification window.
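The Check Recognition choices above can be pictured as a loop over flagged words, with one decision per word. The Python sketch below is only a conceptual model of that flow, not OmniPage's implementation; the class, function, and action names are hypothetical, and the sample words are made up for illustration.

```python
from dataclasses import dataclass

# Conceptual model only; none of these names come from OmniPage.
HIGHLIGHT = {
    "changed": "blue",          # word changed by the Language Analyst
    "suspect": "green",         # word that may not be recognized correctly
    "reject": "red tilde (~)",  # unrecognizable character
}


@dataclass
class Flagged:
    word: str  # the word as recognized
    kind: str  # a key of HIGHLIGHT


def check_recognition(flagged, decisions, user_dictionary):
    """Apply one (action, replacement) decision to each flagged word, in order."""
    result = []
    for item, (action, replacement) in zip(flagged, decisions):
        if action == "change":    # Change: use the corrected word
            result.append(replacement)
        elif action == "add":     # Add: keep the word and remember it
            user_dictionary.add(item.word)
            result.append(item.word)
        elif action == "ignore":  # Ignore: keep this instance only
            result.append(item.word)
        else:
            raise ValueError(f"unknown action: {action}")
    return result


flagged = [Flagged("rnodern", "suspect"), Flagged("Caere", "suspect"),
           Flagged("OCR", "changed")]
decisions = [("change", "modern"), ("add", None), ("ignore", None)]
dictionary = set()
print(check_recognition(flagged, decisions, dictionary))
# ['modern', 'Caere', 'OCR']
print(dictionary)
# {'Caere'}
```

Only Add changes the user dictionary, which is why an added word is accepted in future documents while an ignored word is not.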
Save the File

Once a document's text is recognized, you can save it either as a file in one of several word-processing formats or as a Caere Document.

Documents saved using a word-processing format, such as WordPerfect, can be opened from within that application in the same way as any other one of its files. This method saves the formatted text and any embedded graphics in the text window, not the original scanned page image from the zone window.

Saving the document as a Caere file (*.met) saves both the recognized text in the text window and the original page image from the zone window. OmniPage can open only Caere Documents.

If a scanned page is going to be used more than once, or saved to several different word-processing file formats, save it first as a Caere Document. This saves you the trouble of rescanning the page.

In this exercise, you will save the file first as a Caere Document and then as a word-processing file.

1 Click in the text window to make it active.
2 Click the Save As... button. The Save As dialog box opens.
3 Select Caere [*.Met] in the Save File as Type edit box and select the Data directory as its location.
4 Type lscan.met in the File Name edit box.
5 Click OK.
6 Choose Close Document in the File menu or use the Ctrl+W keyboard shortcut.
7 Choose Open Document... in the File menu or use the Ctrl+O keyboard shortcut to re-open the file lscan.met. The file opens in the OmniPage window. Note that both the zone and text windows open.
8 Click the Save As... button.
9 Select a word-processing application file type, such as Microsoft Word for Windows, in the Save Files as Type list box and give the file a new name.
10 Click OK.
11 Choose Close Document in the File menu or use the Ctrl+W keyboard shortcut.

Touring the Toolbar

The OmniPage toolbar provides options for fast and efficient document processing. You can set the Image, Zone, and OCR options on the toolbar before clicking the AUTO button so that scanning progresses without interruption.

You can also select the Image, Zone, and OCR processes one at a time. Click the Image button to scan or load images, and click the Zone button later when you are ready to set the zones. Each process button becomes available as soon as the preceding process is finished.

This exercise gives you an overview of the most commonly used options. Note the location of the Settings Panel button in the toolbar. For more information on the Settings Panel, see Touring the Settings Panel on page 2-15.

- The Image button determines where the page images come from.
- The Zone button determines whether recognition zones are automatically or manually set.
- The OCR button determines how and when OCR is performed.
- The Settings Panel button opens the Settings Panel.
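The rule that each process button becomes available only after the preceding process has finished can be pictured as a small state model. The following is a minimal sketch in Python, not OmniPage code: the Document class and the stage names "image", "zone", and "ocr" are hypothetical names chosen for illustration.

```python
# Hypothetical model of the toolbar: the three process stages, in order.
STAGES = ["image", "zone", "ocr"]


class Document:
    """Tracks which stages have finished for one page (illustrative only)."""

    def __init__(self):
        self.done = []  # stages completed so far, in order

    def available(self):
        """Return the stages whose buttons would be enabled right now."""
        # A stage is available once every stage before it has finished.
        return STAGES[: len(self.done) + 1]

    def run(self, stage):
        """Simulate clicking one process button."""
        if stage not in self.available():
            raise RuntimeError(f"{stage} is not available yet")
        if stage not in self.done:
            self.done.append(stage)

    def auto(self):
        """Simulate the AUTO button: run every stage without interruption."""
        for stage in STAGES:
            self.run(stage)


doc = Document()
print(doc.available())  # only the first stage is enabled
doc.run("image")
print(doc.available())  # the second stage is now enabled as well
doc.auto()
print(doc.done)
```

In this sketch, clicking a button out of order raises an error, which stands in for a dimmed button in the toolbar.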
There are three steps in this exercise:

1 Select an Image button option.
2 Select a Zone button option.
3 Locate the OCR button (select options in the Professional version only).

Select an Image Button Option

Getting an image, either by scanning a document or loading an image file, is the first step in the OCR sequence. The option you select in the drop-down list takes place when you click the AUTO button.

1 Click the drop-down list under the Image button and choose Scan Image. Use this setting when scanning pages with your scanner.
2 Click the drop-down list in the Image button and select Load Image. Use this setting when opening image files for recognition.
3 Reset the drop-down list to Scan Image.

Select a Zone Button Option

The Zone button lets you choose whether to have OmniPage draw zones automatically or whether to draw the zones manually. The Zone button is available after a page has been scanned or an image loaded.

1 Click the drop-down list in the Zone button and select Manual Zones. If you select Manual Zones, then click the AUTO button, OmniPage stops after acquiring the image. At this point you can draw the zones manually. See Chapter 2, Document Types and OCR Settings, for more information about when and how to draw your own zones. OmniPage Professional users also can use the Zone button to open zone templates they have created. See Standardized Forms on page 2-42 for more information.
2 Reset the drop-down list to Auto Zones.

Locate the OCR Button

The OCR button is the last button in the process. The OCR button is available after a page has been zoned, whether automatically or manually. OmniPage Professional users can select the Train OCR and Defer OCR options, detailed in later chapters.

Touring the Settings Panel

The Settings Panel lets you customize the OCR process: you can set scanning, zoning, OCR, spell checking, and other parameters. Select these options before scanning.
This exercise provides an overview of the most important Settings Panel options as well as brief explanations of when to choose certain options. For a more detailed explanation of each item in the Settings Panel, please refer to Chapter 4, The Settings Panel.

There are eight steps:

1 Open the Settings Panel
2 View the Panel options
3 Select the Scanner options
4 Select the Zones options
5 Select the OCR options
6 Select the Fonts options
7 Select the Spelling options
8 Select the Preferences options

Open the Settings Panel

There are several ways to open the Settings Panel.

1 Click the Settings Panel button in the OmniPage toolbar. The Settings Panel opens the way you last left it.
2 Click Close to close the Settings Panel.
3 Position the mouse pointer over the Image button in the toolbar and click the right mouse button. The Settings Panel opens to the Scanner options. This method of opening the Settings Panel works with the Scanner, Zone, and OCR settings when you click the corresponding process button. Each task must be available or the button cannot be clicked. The Zones and OCR process buttons are not available until a document has been loaded or scanned.
5 Click Close to close the Settings Panel.
6 Choose Settings Panel... in the Settings menu or use the Ctrl+E keyboard shortcut to re-open the Settings Panel.

View the Panel Options

The Settings Panel is composed of six different sets of options. Each is represented by an icon in a scrollable list in the window.

Use one of the methods described in Step 1 to open the Settings Panel.

Click the Scanner icon in the left of the Settings Panel to view scanning options.

Click each icon in turn to view its options. Use the scroll box to access and select icons below the OCR icon.

Select the Scanner Options

Select the Scanner icon again to view options available when using your scanner.
The most important settings for recognition accuracy are the choices under Options. These determine how a page is scanned and will vary according to the type of document you want to recognize.

OmniPage Professional users can select the 3D OCR option. Combined with AnyPage or HP AccuPage 2, 3D OCR achieves OmniPage's best possible accuracy with difficult document types such as degraded documents or pages with varying print intensity. See Chapter 2, Document Types and OCR Settings, for more information.

Select the Auto Brightness with AnyPage/HP AccuPage 2 option. Use this for pages with crisp text on varicolored backgrounds, such as magazine pages in which some of the text appears on a colored background. Any halftones on a page scanned with this setting will appear as grayscale images. Scanning with this setting is slower than using the manual brightness control, but the OCR results are generally better. This option uses Caere AnyPage or HP AccuPage 2 technology to adjust the brightness levels for each section of the text automatically. If your scanner supports HP AccuPage 2, the option will be named Auto Brightness with HP AccuPage 2.

Select Manual Brightness. Use this for pages with distinct, normal-sized text (8-12 point) printed on white paper, and for all black-and-white scanners. Any graphics on a page scanned with this setting will appear in black and white. This is the fastest option because the page is scanned at a uniform brightness level.

Move the scroll box for brightness control to the left and to the right. The brightness setting changes as the scroll box moves. The number of settings you have available depends upon your scanner model. An HP ScanJet Plus, for example, has 256 brightness settings. The brightness control is only available with the Manual Brightness option. It is dimmed if one of the other two options is selected.

Select the Zones Options

Click the Zones icon to view the Zoning Method options.
These options determine how OmniPage zones the areas for recognition and orders the text.

- Multiple Columns is the default zoning method. It detects column flow in standard and multi-column documents and lets you save any graphic images.
- Single Column or Table is used for spreadsheets and correspondence. Graphics will be discarded.
- None is used in special circumstances. Refer to None on page 2-10 for more information about this setting.

Select the OCR Options

Click the OCR icon to view OCR input and output options.

Select Use Language Analyst to check spelling and perform word and character analysis. Language analysis begins automatically during the recognition process.

Use Output Options to choose how OmniPage will handle output formatting.

- True Page - Retain All Page Formatting is the default for the Professional version only; it is not an option in the regular version of OmniPage. Select this to retain all the original page formatting in a recognized document. Use this to duplicate a document as closely as possible, especially when you will not need to do much editing or reformatting after recognition.
- Retain Font and Paragraph Formatting Only is the default setting in the regular version of OmniPage. Select this to retain font and paragraph formatting in a recognized document.
- Ignore Fonts and All Formatting is an option for those who need only unformatted text from a document.

The choice you make in Output Options will affect which selections are available in the Fonts Options.

Select the Fonts Options

Click the Fonts icon to choose typefaces that will appear in the text window or your word processor if you chose OCR options that retain font formatting. If font formatting is ignored, choose one typeface for all font formats.

Use the Retained Font Formats choices when you have selected the True Page - Retain All Page Formatting or Retain Font and Paragraph Formatting options in the OCR settings.
Use the Ignored Font Formats choices when you have selected Ignore Fonts and All Formatting in the OCR settings.

The font characteristics will not be retained if you disable your Windows TrueType fonts.

Select the Spelling Options

Click the Spelling icon to select dictionary and spell checking options. OmniPage's main dictionary contains over 100,000 terms. (Order dictionaries for additional languages by calling Caere at (800) 535-SCAN.) The User Dictionary is your personal editable dictionary.

Select the Preferences Options

Click the Preferences icon to customize general OmniPage operations.

Using the Process Buttons

Instead of using the AUTO button, you can click each process button in the toolbar individually. Each button becomes available as soon as the preceding button's process is finished.

Once an image is loaded or scanned, you can set Zones or OCR options in the Settings Panel if necessary. Once set, you do not have to rescan the image; just click the appropriate button to have the new settings take effect.

You will use the Single Column Page Sample to practice using each process button individually. After going through all three processes, you will reset the Zoning Method option in the Settings Panel and then click the Zone and OCR buttons to redo the OCR process.

There are six steps:

1 Check the toolbar settings.
2 Click each process button in turn.
3 Change the zoning method in the Settings Panel.
4 Click the Zone button to reset the zones.
5 Click the OCR button to finish the process.
6 Check the text and save the file.

Check the Toolbar Settings

1 Click the drop-down lists under the process buttons and select:
- Scan Image
- Auto Zones
- Perform OCR
2 Click the Settings Panel button in the toolbar to open the Settings Panel.
3 Click Use Defaults to return OmniPage to its default settings.
4 Click the Zones icon in the Settings Panel. Note that the Multiple Columns zoning method is the default.
You are scanning the Single Column Page Sample, which means that the wrong zoning method is selected. Leave this setting as it is for now-- you will change it later.
5 Professional users only--click the OCR icon.
6 Select the Retain Font and Paragraph Formatting Only option.
7 Click Close.

Click Each Process Button in Turn

1 Align the Single Column Page Sample properly in your scanner.
2 Click the Image button. The scanned document appears in the zone window.
3 Click the Zone button. OmniPage zones the document. Note that OmniPage, because of the zoning method set in the Settings Panel, mistakenly zones the numbers on the right of the table as a separate column of text.
4 Click the OCR button. OmniPage makes three passes over the document and displays the recognized text in the text window.
5 Compare the formatting of the document in the text window to that of the image in the zone window.

[Figure: text window showing the table's labels (Price, Sophistication of features, Image handling capabilities, Number of features, Direct application input, Multi-language recognition) followed by the column of numbers.]

This exercise used the Multiple Columns method, which means that OmniPage ordered the zones from left to right according to the columns it detected. OmniPage has separated the table into two distinct columns, placing the numbers column below the text column.

Multiple Columns was the wrong option to choose for this document. You must choose the Single Column or Table zoning method to maintain the table's formatting.

Change the Zoning Method in the Settings Panel

1 Reopen the Settings Panel.
2 Click the Zones icon.
3 Select the Single Column or Table option.
4 Click Close.

Click the Zone Button to Reset the Zones

1 Click the Zone button to reset the zones. A dialog box asks if you want to replace the current zones.
2 Click Yes. OmniPage draws new zones.
3 Verify that the new zones are drawn correctly.
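Why the zoning method matters for this table can be sketched in a few lines of Python. This is an illustration of the reading-order idea only, not OmniPage's zoning algorithm: the cell values and both function names are made up for the example.

```python
# Hypothetical table cells as (text, column, row); the values are made up.
cells = [
    ("Item A", 0, 0), ("10", 1, 0),
    ("Item B", 0, 1), ("20", 1, 1),
    ("Item C", 0, 2), ("30", 1, 2),
]


def multiple_columns_order(cells):
    """Read each detected column top to bottom, left column first."""
    ordered = sorted(cells, key=lambda c: (c[1], c[2]))
    return [text for text, _col, _row in ordered]


def single_column_or_table_order(cells):
    """Read row by row, keeping the cells of one row on the same line."""
    rows = {}
    for text, _col, row in sorted(cells, key=lambda c: (c[2], c[1])):
        rows.setdefault(row, []).append(text)
    return ["\t".join(rows[r]) for r in sorted(rows)]


print(multiple_columns_order(cells))        # all labels first, then all numbers
print(single_column_or_table_order(cells))  # each label stays beside its number
```

Column-first ordering puts every number after the last label, which is the effect seen in the text window above. Row-first ordering keeps each number next to its label.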
The table is now preserved as a unit.

Click the OCR Button to Finish the Process

1 Click the OCR button to finish the process. A dialog box asks if you want to replace the current text.
2 Click Yes. When recognition is complete, the text window opens.
3 Compare the formatting of the document in the text window to that of the image in the zone window.

Notice that the numbers in the table's second column now line up with the corresponding lines of text in the first column. The table's format has been preserved by using the proper zoning method.

Check the Text and Save the File

1 Click the Check Recognition button in the toolbar and make any changes necessary. See Automatic OCR with the Default Settings on page 2-6 for detailed instructions on checking text.
2 Click the Save As... button.
3 The Save As dialog box opens. Select a word-processing application file type, such as Microsoft Word for Windows, in the Save File as Type drop-down list and save the file with a new name.
5 Click OK.
6 Choose Close Document in the File menu or use the Ctrl+W keyboard shortcut.

True Page Recognition (Professional version only)

OmniPage Professional includes True Page recognition so your OCR output can retain the page layout and the images exactly as they were displayed on the page. Skip this exercise if you do not own the Professional version of OmniPage.

In this exercise you will scan the True Page Sample. There are three steps:

1 Select True Page and other settings.
2 Click AUTO.
3 Check the results and save the file.

Select True Page and Other Settings

1 Click the drop-down lists under the process buttons in the toolbar and select:
- Scan Image
- Auto Zones
- Perform OCR
2 Click the Settings Panel button in the toolbar.
3 Click Use Defaults.
4 Click the OCR icon. Make sure the following options are selected:
- True Page - Retain All Page Formatting
- Retain Graphics
5 Click Close.
Click AUTO

1 Align the True Page Sample in your scanner.
2 Click the AUTO button. When the process is finished, the text window should open to show the recognized document in its original format and the graphic in its proper place.

Check the Results and Save the File

1 Click the Check Recognition button in the toolbar to verify the OCR results.
2 Save the file as a Caere Document.
3 Type the name layout.met.
4 Save the file again in a word-processing format. You have a number of different options for saving the file. If your word-processing application supports embedded graphics, you can save the document in that application and the graphics will be displayed.

If you like, repeat Steps 1 and 2 but deselect the Retain Graphics option in the OCR Settings Panel. The text should appear in the same format you see, but will have an empty space where the graphic was originally.

Opening a Graphic in Image Assistant (Professional version only)

You can work with Caere's full 24-bit image editing application, Image Assistant, directly from OmniPage. This exercise uses the saved document from the previous exercise. Skip this exercise if you do not own the Professional version of OmniPage.

There are three steps:

1 Open the Caere Document file called layout.met.
2 Double-click the graphic.
3 Experiment with the Tool Palette.

Open the Caere Document File Named Layout.met

Open the Caere Document file you saved as layout.met. If you need to scan the page, see the instructions in the preceding exercise.

Double-Click the Graphic

1 Click the text window to make it active.
2 Double-click the graphic in the text window. Image Assistant launches and opens a document window containing the graphic you just double-clicked.

Experiment with the Tool Palette

1 Experiment with the tools in Image Assistant's tool palette to see what special image editing effects you can achieve. Refer to the Image Assistant tutorials booklet for a guided tour of its features.
2 Choose Save in the File menu if you want to save changes to the file. The file's default name is OmniPage.tif. It is saved as a TIFF file. You can choose Save As... in the File menu if you want to assign the file a different name and file type.
3 Click OK.
4 Choose Exit in the File menu. Image Assistant closes and you are returned to OmniPage Professional.

Refer to the Image Assistant tutorials booklet included with your OmniPage package for detailed information. You can also use the online Help system available in Image Assistant.

Tutorial 2--Document Types and OCR Settings

People encounter a variety of documents in an average workday: office memos, legal documents, standardized forms, newspaper and magazine pages, foreign-language reports, and so on.

Before you scan any page, you must determine how you want OmniPage to order the page information and in what format you want the pages' recognized text and graphics. This chapter examines some common document types and the OCR concepts associated with each one:

- Setting zoning options: You'll use the zone tools and learn which OCR settings to choose for various types of documents.
- Complex layouts: You'll learn when to use manual zoning and practice recognizing just a portion of a scanned document.
- Standardized forms: You'll specify zone contents, edit a zone contents file, create a zone template, and export a graphic.
- Legal documents and spreadsheets: General tips are listed for each.
- Documents with specialized characters: You'll train OmniPage to recognize specialized characters and edit an OCR Training file.
- Foreign-language and multilingual documents: You'll learn how to use the Language Analyst, how to select dictionaries, and how to select an appropriate character set.

Make sure you have the following page samples you need to work through the tutorials in this chapter:

- Manual Zoning Page Sample
- Standardized Form Sample

Setting a Zoning Method

The zoning method selection in the Settings Panel tells OmniPage how it should evaluate the column structure of text zones. These zones may be drawn either automatically by OmniPage or manually by you.

Select Multiple Columns when recognizing several columns of text on a page or any column with a graphic. OmniPage separates text from graphics and looks for regular vertical separations of text to define columns. This is a good method to use on magazine or newspaper pages.

Select Single Column or Table when recognizing a table, chart, spreadsheet or page-wide text with no graphics (memos and reports, for example).

Select None when you want everything in the zone recognized as text. OmniPage will not discern column layout or distinguish graphics from text. This is the fastest option to use when you recognize manually drawn, text-only zones. It is useful for documents with very small text areas such as those found in pleading pages or telephone books.

With practice, you will learn which options to select for particular documents. The examples in this chapter will illustrate some of those choices. Refer to Touring the Settings Panel on page 2-15 and Chapter 4, The Settings Panel, for more information on the Settings Panel options.

Complex Layouts

After you select options in the Settings Panel, you have a choice between auto and manual zoning.
With complex or unusually formatted documents, manual zoning often returns better results than auto zoning.

In Tutorial 1 you used auto zoning after scanning the page samples. Manual zoning would have achieved virtually the same recognition results, but with more effort on your part. In those exercises, there was no point to manual zoning because the entire document was recognized and no text was reordered.

Use manual zoning in the following circumstances:

- to select just a portion of a page for recognition
- to rearrange text order
- to specify the contents of a particular zone

You can practice drawing your own zones with the Manual Zoning Page Sample. There are six steps:

1 Set Manual Zones and other options.
2 Click AUTO to start the process.
3 Practice using the zone tools.
4 Draw the appropriate zones.
5 Perform OCR.
6 Check the results and save the file.

Set Manual Zones and Other Options

1 Set these options in the toolbar:
- Scan Image
- Manual Zones
- Perform OCR
2 Open the Settings Panel and click Use Defaults to return OmniPage to its default settings.
3 Click the Zones icon.
4 Be sure that Multiple Columns is the selected zoning method.
5 Click the OCR icon.
6 Select Retain Font and Paragraph Formatting. This option preserves the document's fonts and paragraph structure.
7 Click Close.

Click AUTO to Start the Process

1 Align the Manual Zoning Page Sample in your scanner.
2 Click AUTO to start the process. OmniPage scans the page. The zone window opens with the zone tools palette displayed. The process stops so you can draw recognition zones manually.

The zone tools palette provides the following controls:

- Use the arrow buttons to rotate the image.
- Zoom your view of the page in and out.
- Draw zones around the text you want recognized.
- Change the order of the recognition zones.
- Erase a zone.
Practice Using the Zone Tools

When the zone window opens, the Draw Zones tool button with a cross-hair appears. (If you had selected Auto Zones, this button would show a cursor instead of a cross-hair.)

1 Click the Zoom tool. The cursor becomes a magnifying glass.
2 Move the Zoom tool over any part of the image and click the left mouse button to enlarge the image.
3 Move the Zoom tool over the enlarged image and click the right mouse button to reduce the image.
4 Click the Draw Zones tool.
5 Hold down the mouse button and drag the cursor to draw a rectangle around any section of text on the page. Leave white space around the text if possible. OmniPage tags this rectangle with a 1.
6 Draw a second zone anywhere on the page. This rectangle is numbered 2. OmniPage numbers each new zone sequentially.
7 Click the Order Zones tool. The cursor becomes the # symbol and the numbers in the two zones disappear.
8 Click the second zone you drew. Now the zone is labeled 1. This zone will be recognized first and placed at the beginning of the new document in the text window.
9 Click the first zone you drew. It is now labeled 2 and will be recognized second.
10 Click the Erase Zones tool and click each zone. The zones disappear.
11 Click the left arrow button to rotate the image 90 degrees counterclockwise.
12 Click the right arrow button to rotate it back. OmniPage rotates the page automatically when you use the AUTO feature and you have Automatically Correct Page Orientation selected in the Settings Panel OCR options.

Draw the Appropriate Zones

Suppose the only information you need from this page is the text about the international awards OmniPage products have won. Shorten recognition time by drawing zones around just the portions of text you want to use.

1 Draw a zone around the February 1992 award listed in the Product Highlights section of the text.
[Figure: the Manual Zoning Page Sample.]

2 Draw another zone for the August 1992 award.
3 Draw a third zone for the October 1992 award.

Perform OCR

Click the AUTO button or the Perform OCR button to continue the process. The recognized text appears in the text window.

Check the Results and Save the File

Click the Check Recognition button in the toolbar to check your OCR results. You can save the file in the format of your choice, choose Close Document in the File menu, or use the Ctrl+W shortcut.

Standardized Forms

You can speed document processing and improve accuracy by manually zoning a document and telling OmniPage what kind of characters it should recognize in those zones. This is called "specifying zone contents." It is particularly effective with standardized forms and spreadsheets. You can save the zones as a template file and use this template each time you scan the same kind of document.

If you do not specify the zone contents, OmniPage looks for Alphanumeric characters: letters of the alphabet, numbers, and standard punctuation symbols. You can specify a zone as Numeric or Graphic as well.
- The Numeric file is editable. Characters can be added to or deleted from this file.
- Any zone recognized as a graphic can be exported separately as a graphic file.
- You also can create your own zone contents files with the characters you require.

Specifying Zone Contents

In this exercise you will create a new zone contents file and export a graphic. Use the Standardized Form Sample in this exercise.

There are five steps:

1 Set the toolbar options.
2 Scan and zone the image.
3 Create a new zone contents file.
4 Perform OCR.
5 Export the graphic as a TIFF file.

Set the Toolbar Options

1 Set these options in the toolbar:
- Scan Image
- Manual Zones
- Perform OCR
2 Open the Settings Panel and click Use Defaults to return OmniPage to its default settings.

Scan and Zone the Image

1 Click the AUTO or Image button to begin scanning. OmniPage scans the page and stops so you can draw manual zones. The image appears in the zone window.
2 Draw a zone around the logo and the words "Account Analysis" in the top left of the page.
3 Click the Zone Contents drop-down list and select Graphic. This tells OmniPage not to perform OCR on that zone because it contains a picture. For the purposes of this exercise, you are recognizing the entire company logo as a graphic, even though it consists mainly of letters. Note that OmniPage normally would recognize the logo as text, and skip recognizing the icon in the logo entirely.
4 Draw a zone around the text under the logo, from the Account information through the first paragraph under the Financial Information header.
5 Select Alphanumeric in the Zone Contents drop-down list. OmniPage will look for both letters and numbers when it recognizes this portion of the image.
6 Draw a zone around the financial section of the page.
7 Select Numeric in the Zone Contents drop-down list. The only characters in this section are numbers and the letters YTD. If the Alphanumeric option were selected, a 5 could be mistaken for the letter S and a 0 (zero) for the letter O. Selecting the Numeric option reduces these common OCR errors. A numeric zone contents file does not contain any alpha characters, however, so in this case the Numeric designation is not sufficient for optimal recognition. You will create a new zone contents file that includes the characters YTD.

Create a New Zone Contents File

1 Choose the Edit Zone Contents File... command in the Settings menu. The Select File dialog box opens.
2 Click New. The Edit Zone Content File dialog box opens with a string of highlighted characters. The highlighted characters are replaced with the ones you enter.
4 Click Save. The File to Save dialog box opens.
5 Type the file name finance in the File to Save dialog box.
6 Click OK.
7 If the third zone you drew around the financial contents in the zone window is not selected, click in it now to select it.
8 Select Finance in the Zone Contents drop-down list.

Perform OCR

1 Click the OCR button to continue the process. OmniPage recognizes each of the zones according to the zone contents you specified.
2 Click the Check Recognition button to verify the results. You should find no errors in the form.

Export the Graphic as a TIFF File

1 Choose Export Image... in the File menu. The Export Image dialog box opens.
2 Select the Save Each Graphic Zone to a File option in the Image Options section.
3 Type the name graphic.tif in the File Name edit box. The graphic format TIFF is already selected in the Save Files as Type list box.
4 Click OK.

You can save the file in the format of your choice, choose Close Document in the File menu, or use the Ctrl+W shortcut.
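The effect of zone contents on recognition in this exercise can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not OmniPage's recognition engine: the candidate scores, the punctuation included in NUMERIC, and the names pick and read_zone are all hypothetical. Only the idea comes from the exercise above: restricting a zone to an allowed character set resolves look-alike pairs such as 5/S and 0/O, and a custom set adds the letters YTD.

```python
import string

# Assumed character sets; the real zone contents files may differ.
ALPHANUMERIC = set(string.ascii_letters + string.digits + string.punctuation)
NUMERIC = set(string.digits + ".,-+$%()")
FINANCE = NUMERIC | set("YTD")  # custom set: Numeric plus the letters Y, T, D


def pick(candidates, allowed):
    """Return the best-scoring candidate character that the zone allows."""
    for char, _score in sorted(candidates, key=lambda c: c[1], reverse=True):
        if char in allowed:
            return char
    return "?"  # no allowed candidate: flag the character as unrecognized


def read_zone(glyphs, allowed):
    """Recognize a sequence of glyphs under one zone contents setting."""
    return "".join(pick(candidates, allowed) for candidates in glyphs)


# Made-up (character, score) candidates for the printed text "50" and "YTD".
amount = [[("S", 0.55), ("5", 0.45)], [("O", 0.60), ("0", 0.40)]]
label = [[("Y", 0.90), ("V", 0.10)], [("T", 0.80), ("7", 0.20)], [("D", 0.70), ("0", 0.30)]]

print(read_zone(amount, ALPHANUMERIC))  # SO
print(read_zone(amount, NUMERIC))       # 50
print(read_zone(label, NUMERIC))        # ?70
print(read_zone(label, FINANCE))        # YTD
```

In this model the Numeric set fixes the amount but mangles the YTD label, while the custom Finance set reads both correctly.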
For more information on exporting graphics and saving files, see Tutorial 3--Streamlining the OCR Workflow on page 2-58.

Creating a Zone Template (Professional version only)

If you regularly scan a particular type of document, especially standardized forms that require the same manual zoning on each page, create and save a zone template. Instead of redrawing the zones each time you scan that document type, simply open the zone template before scanning.

Each zone template file designates zones exactly as they were drawn along with their zone contents specifications. (Zones options from the Settings Panel are not saved.) You can create up to 250 templates.

In this exercise you will create manual zones on a scanned document, save the zones as a new zone template, and open the template to use on the form again. You can use the Standardized Form Sample you used in the previous exercise, or any page you choose.

1 Set these options in the toolbar:
- Scan Image
- Manual Zones
- Perform OCR
2 Scan a document of your choice. The image appears in the zone window and OCR stops so you can draw manual zones.
3 Draw your manual zones.
4 Specify zone contents as appropriate.
5 Choose Save Zone Template... in the File menu. The Save Zone Template File dialog box opens.
6 Enter a file name in the File Name edit box.
7 Click OK.

Normally you would open a zone template after scanning a document and before it has any zones. For the purpose of this exercise, you will remove the zones already set in the image, then open the zone template.

8 Choose Clear All Zones in the Edit menu, click the Clear All Zones button in the toolbar, or use the Erase Zones tool in the zone window palette to erase zones one by one.
9 Select the file name of your new zone template in the drop-down list under the Zone button.
10 Click the Zone button. OmniPage draws the template zones on the image.
11 Set options as needed in the Settings Panel before you perform OCR.

Legal Documents and Spreadsheets

The exercises in this chapter have used standard 8.5 x 11 inch portrait-oriented pages. Many users, however, need to scan documents of varying sizes, orientation, and complexity. This section lists some general tips to keep in mind when scanning the following commonly used documents.

Legal Documents

Keep these general tips in mind when scanning legal documents:
• Select Legal size in the Scanner Options section of the Settings Panel if the document is printed on legal size (14 inches in length) paper.
• Many legal documents consist of page-wide text. If this is the case, select Single Column or Table as the zoning method in the Settings Panel.

Pleading Papers

Keep these general tips in mind when scanning pleading papers:
• Generally you should select the None zoning method in the Settings Panel.
• If you would like to reproduce page layout without much editing or reformatting, select the True Page Retain All Page Formatting (Professional version only) OCR option in the Settings Panel. Users who have the regular version of OmniPage should select Retain Font and Paragraph Formatting Only.
• You may want to draw manual zones in some circumstances. OmniPage may consider the line numbers on pleading papers to be part of the text body and place them on a text line. Try drawing a zone around just the body text to omit the line and numbers from the recognition process. This is a good option if you are going to add text that will change the line numbers. Use your word-processing application to number each line.
• If you want a carriage return inserted at the end of each line, try saving the scanned document as a standard ASCII file and open it in your word-processing program. You would choose the Text Only method of conversion in some programs.
Consult your word-processing manual for a more detailed description of importing documents not created by that program. You may have to experiment to find the best process for scanning and saving each document.

Spreadsheets

Keep these general tips in mind when scanning spreadsheets (these tips also work for charts, tables, and memos with page-wide text and tabs):
• Select Landscape as the orientation in the Scanner options section of the Settings Panel if the document is presented in landscape view.
• Select Single Column or Table as the zoning method in the Settings Panel to preserve the spreadsheet format. When OmniPage detects five or more spaces, the Single Column or Table option converts the spaces to a tab.
• Draw your own zone around a table of numbers and identify its contents as a Numeric zone to improve recognition results.
• You can create new zone contents files for special characters that your spreadsheet may contain. See Standardized Forms on page 2-42.

Documents with Specialized Characters (Professional version only)

OmniPage automatically recognizes characters commonly found in most documents. Other documents, such as mathematical papers, will contain characters and symbols OmniPage has not yet learned to recognize. You can train OmniPage Professional to recognize these characters.

Creating an OCR Training File

This tutorial shows you how to teach OmniPage Professional to recognize characters not normally found in text by using the Train OCR Sample.

1 Set these options in the toolbar:
• Scan Image
• Auto Zones
• Train OCR
2 Open the Settings Panel and click Use Defaults to return OmniPage to its default settings.
3 Scan a document of your choice that contains symbols or other specialized characters.

After recognition, the Train Characters window opens to display images of recognized characters.
Those that OmniPage had trouble identifying are displayed in the grid boxes at the top of the dialog box. Beneath each image, in smaller type, is OmniPage's attempted identification of that character. A tilde means that OmniPage couldn't identify the character. Characters OmniPage believes it identified correctly in the document are listed alphabetically below the suspect characters.

Check for common errors, such as a zero being recognized as the letter O. Occasionally, you will see common characters, such as c or e. Generally, you will not want to train OmniPage to recognize these letters unless they are in a very specialized font. The Language Analyst corrects common OCR errors more efficiently.

4 Double-click a character, or select it and click Specify. In this example, OmniPage must be taught to recognize the copyright symbol (©).

The Specify Character dialog box opens with a close-up of the symbol as it appeared in the scanned document. The dialog box includes a list box of Extended ANSI characters and a Character edit box.

5 If the symbol you seek appears in the list, click it. It appears in the Character edit box.

If the symbol or character does not appear in the list, you must type it in the Character edit box instead. In another example, the symbol for pi is not in the list, so the user has chosen to type in the numbers 3.14159 to replace the symbol.

6 Click OK. The specified character now appears under the suspect character in the Train Characters dialog box. The symbol has turned gray to indicate that you have specified a character for it.
7 Click the Save button. The Enter save file name dialog box opens.
8 Type a file name in the Filename edit box.
9 Click OK. A dialog box asks if you want to recognize the image with the training file you just created. At this point, you can continue recognition or stop the exercise. The new file becomes the default in the OCR section of the Settings Panel if you click Yes.

Editing an OCR Training File

You can edit a training file as needed when you scan a document with previously unrecognized characters. Any training file can be appended to another training file.

Choose Edit Training File... in the Settings menu. The Select File dialog box opens.
Select a training file. The Train Characters dialog box opens.
Use the buttons to add, delete, or modify character identifications as needed. If you had created another training file previously, you could click the Append button to add these characters to it.
Click Save to save your changes and close the Train Characters dialog box. If you have made no changes, click Cancel to close the dialog box.

Foreign-Language and Multilingual Documents

For optimal recognition of documents in any language, you should select:
• the appropriate language in the Select Language(s) dialog box (choose Select Languages... in the Settings menu). The list includes English, German, French, Italian, Dutch, Spanish, Swedish, Portuguese, Danish, and Norwegian.
• the appropriate main dictionary for the language of the text you are recognizing in the Spelling section of the Settings Panel.
• the Language Analyst in the OCR options of the Settings Panel.

Foreign-Language Documents

When you want to recognize a foreign-language document, double-check that your settings are correct as described above. During recognition of any document, the Language Analyst consults the main dictionary and the user dictionary. This is why it is important that the currently selected dictionary matches the language you are trying to recognize.
Speed recognition by deselecting the option Use Language Analyst in the Spelling section of the Settings Panel if the right dictionary is not available. The Language Analyst will try to match words to the chosen dictionary when selected, then turn itself off anyway if it perceives that dictionary entries are not improving recognition results. (If you recognize many documents in a language other than that of your default main dictionary, you should order a dictionary for that language from your Caere distributor or by calling Caere at 800-535-SCAN.)

Multilingual Documents

You may want to recognize documents written in more than one language. It is important to select both the proper language set and main dictionary. You can zone multilingual documents automatically, but manual zoning may return better results.

Suppose you have a document written largely in French, with a few sections in Portuguese.

If you use auto zoning, select both French and Portuguese in the Select Language(s) dialog box. Select French as the main dictionary selection. The Language Analyst will assist in recognizing the French portions of the document but shut itself off when it finds a text block in Portuguese. Use the Check Recognition feature to correct recognition mistakes. If recognition is poor, try turning off the Language Analyst and recognizing the document again.

If you use manual zoning, the process is more time-consuming but the results will be more accurate. Draw recognition zones around just the French portions of the text. Leave the Language Analyst on. After recognition of the French portions is complete, save the document as a word-processing file. Repeat the process for the Portuguese portion of the text, making the appropriate dictionary and language selections. This replaces the French text recognized first.
(If you don't want to replace the first recognized text, save the document as a Caere Document before recognizing the second language.) When recognition is done, save the second document as a word-processing file. Use your word-processing program to open both documents and cut and paste as needed.

Tutorial 3--Streamlining the OCR Workflow

OmniPage provides a number of time-saving features to help you streamline your OCR workflow. This chapter shows you how to use some of them. After completing the exercises in this chapter, you will know how to:
• Save and reload a settings file.
• Determine the most efficient way to process a large group of documents.
• Open multiple image files.
• Use different options to export pages and graphics as individual image files.
• Use the Defer OCR option (Professional version only).

There are no text samples for the exercises in this chapter.

Saving a Settings File for Specific Documents

OmniPage lets you save Settings Panel selections as a settings file. You can open and use this file when needed for similar document types to save yourself time. Disk space is the only limit to the number of settings files you can create.

Suppose you regularly receive double-sided customer response forms printed in landscape mode with small (8-point Helvetica) type. This type of form requires that you select specific options in the Settings Panel. Rather than set them each time you scan the incoming forms, save the settings as a file and open it as needed.

In this exercise you will set, save, and load settings for particular documents.

1 Open the Settings Panel.
2 Set the following options:
• Scanner: Landscape Orientation, Double-sided Pages, and Auto Brightness with AnyPage.
• Zones: Single Column or Table.
• OCR: Ignore Fonts and all Formatting.
OmniPage will match the original fonts if you use the Retain Font Formats option, but in this case we want the type to appear in a larger size and a different font than the original.
• Fonts: in the Ignored Font Formats group box, select Times New Roman in the drop-down Font list, and type 12 in the Font Size edit box.
3 If necessary, choose Select Languages... in the Settings menu and select a Language for your saved settings file.
4 Choose Save Settings... in the File menu. The Save Settings dialog box opens with the Caere Settings file format as the default.
5 Select a location for the file and type in a file name.
6 Click OK.
7 Open the Settings Panel and click Use Defaults to return OmniPage to its default settings. (In the normal course of your work, you would go on to scan documents with your settings and later change the settings as you worked with other documents.)
8 Choose Load Settings... in the File menu. The Load Settings dialog box opens.
9 Select your file and click OK.
10 Browse through the settings you changed in the Settings Panel to verify that they were restored.

Scanning Large Jobs

If you have an automatic document feeder (ADF), you can use the OmniPage AUTO button to scan a large stack of documents, recognize them as a group, and save the results later as a single file or as several smaller files. For example, you may want to fill your ADF and click the OmniPage AUTO button before you leave the office for the day. OmniPage can scan, zone, and recognize all the documents and have them ready for you to save the next morning.

To automatically scan a batch of documents unattended, you must select Auto Zones or, with the Professional version, either Auto Zones or a zone template. If Manual Zones is selected, OmniPage stops after each page image so you can select zones.

OmniPage Professional users have the option of deferring OCR to a later time.
See Deferring Recognition (Professional version only) on page 2-66.

All OmniPage users have the option of saving scanned files in word-processing or Caere Document file format. You can reopen Caere Document files and make changes after the process is finished. To protect your processing time investment, save the scanned documents in Caere Document file format before you begin checking text recognition or editing in the text window.

Preparing Documents for the ADF

Decide how you will save the scanned documents before you fill the ADF. Suppose you wanted to scan 25 pages. How you plan to save the pages affects how you will group them in the document feeder. You have these options for saving scanned pages in a word-processing format:
• Create one file for all pages: The pages would be saved as one 25-page file.
• Create one file per page: The pages would be saved as 25 one-page files.
• Create new file at each blank page: You would insert blank pages as separators into a stack of one-sided documents. All pages following a blank page would be saved as a separate file with a unique document name.

Automatic file naming is discussed in the section Saving the File(s) on page 2-62.

Viewing Pages in the Zone and Text Windows

View your scanned document or loaded image in the zone or text window. Move through the pages by clicking the arrows in the bottom left of the OmniPage window. Click the right arrow to move to the next higher page and click the left arrow to move to the next lower page. Alternatively, choose Go To Page... in the Edit menu or use the Ctrl+G keyboard shortcut to open the Go To Page dialog box.

Adding, Replacing, and Deleting Pages

Scanned pages or loaded images can be appended to any open Caere Document or to the current loaded or scanned image in the zone window. Pages also can be replaced and deleted.
If a one-page image file or Caere Document is open, any new image loaded or new page scanned becomes page two of that document.

When you click the Image button while viewing any page of a multi-page document except for the last page, the Scan Image dialog box opens. (If the Load Image option is set, the Load Image dialog box opens. It is the same as the Scan Image dialog box.) Choose whether to replace the current page with the new page(s), or whether to insert the new page(s) before the current page or at the end of the document.

If you open a Caere Document while another Caere Document or an image file is open, the currently open document will close first. A dialog box gives you the option of saving the document before it closes, if you have made changes to it.

To delete a page in a currently open document, move to the page you want to delete and choose Delete Current Page in the Edit menu or use the Ctrl+D keyboard shortcut.

Saving the File(s)

When recognition and any text editing you want to do are complete, click the Save As... button in the toolbar. Choose either a word-processing or a Caere Document file format. If you choose a word-processing format, you have three options for saving the scanned pages: as a single file, as one file per page, or as one file for every blank separator that OmniPage locates. See Preparing Documents for the ADF on page 2-61 for more information on these choices.
• Create one file per page or Create new file at each blank page: enter a name with five characters or less into the File Name edit box. OmniPage adds three numbers to each file name to make it unique. For example, if you typed file into the File Name box, the first page is saved as file001, the second page as file002, and so on.
• Create one file for all pages: enter a standard eight-character file name.
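The numbering rule for the one-file-per-page and blank-page options can be illustrated with a short sketch. This is not OmniPage code: the function name numbered_file_names and its error messages are hypothetical, and the sketch only mirrors the rule as stated above (a base name of five characters or less followed by a three-digit number). File extensions are left out because this passage does not describe them.

```python
def numbered_file_names(base: str, count: int) -> list[str]:
    """Build names such as file001, file002, ... (hypothetical helper)."""
    # The manual limits the typed name to five characters so that the
    # three-digit number still fits in an eight-character file name.
    if not 1 <= len(base) <= 5:
        raise ValueError("base name must be one to five characters")
    # A three-digit suffix can only distinguish files 001 through 999.
    if not 1 <= count <= 999:
        raise ValueError("count must be between 1 and 999")
    return [f"{base}{n:03d}" for n in range(1, count + 1)]


print(numbered_file_names("file", 3))  # ['file001', 'file002', 'file003']
```

The five-character limit follows from the arithmetic: five typed characters plus three digits gives the eight characters allowed for a file name.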
If you chose the Multiple Columns and Retain Graphics options in the Settings Panel, the graphics will be saved with the text only if the format in which the file is saved supports embedded graphics. ASCII text, for example, does not support embedded graphics. You can export graphics to a separate file independent of the text as described in Exporting Images on page 2-64.

Opening Multiple Image Files

You can load any number of image files, such as a batch of faxes received on a fax modem, for group recognition. You can load TIFF, PCX, DCX, or BFX images. To do this:

1 Set the OmniPage toolbar appropriately for the images you are recognizing. For example:
• Load Image
• Auto Zones
• Perform OCR
2 Click the AUTO button. The Load Image dialog box opens.
3 Select the type of image you want to recognize and click Add to add it to the Selected Files list.
4 Click OK when you have added all the files you want. Each file is opened and processed in order of appearance in the list.

When you load images with the AUTO button, the image files are added to any currently open document as described in Adding, Replacing, and Deleting Pages on page 2-61.

Choose Open Document... in the File menu to open Caere Document files. When you use the Open Document... command to open a Caere Document image file, OmniPage closes any open image file or Caere Document first. A dialog box gives you the option of saving the document before it closes.

Exporting Images

Any scanned page or pages can be exported as an image file. You can export the image files in TIFF, BMP, or PCX format. OmniPage can export an entire page as one image file, or it can find the individual graphic zones on each page and export them as separate files.

Choose Export Image... in the File menu when you want to export either an image or its graphic zones to a file. The Export Image dialog box opens.
There are two choices under Save Options:
• Save Current Page Only
• Save All Pages

There are two choices under Image Options:
• Save Each Graphic Zone to a File
• Save Entire Page to a File

Choose one option in each section. How you match these two options affects the length of the file name you can assign the image and how OmniPage appends an extension.
• Save Current Page Only and Save Entire Page to a File: the name you choose can have eight characters. This creates one one-page image file.
• Save All Pages and Save Entire Page to a File: the name you choose can have five characters. 00n is appended, where n represents the page number (001, 002, etc.). This creates multiple one-page image files.
• Save Current Page Only and Save Each Graphic Zone to a File: the name you choose can have seven characters. OmniPage appends a letter to indicate the order of the graphic on the page. A is the first graphic, B is second, and so on. This creates one file for each graphic on the current page. Up to 26 files can be created with this method.
• Save All Pages and Save Each Graphic Zone to a File: the name you choose can have four characters. OmniPage appends both a number and a letter. The number (00n) indicates the page number and the letter indicates the order of the graphic on the page. Thus the name of the second graphic on the second page would end in 002B. This creates one file for each graphic on every page.

Deferring Recognition (Professional version only)

The typical OCR flow is to scan, zone, and OCR a page in the stack and then repeat the process with the next page until every page in the stack is done. Compared to the time it takes to scan and zone a page, however, the recognition process can be time-consuming. You might find it more convenient to scan and zone all your pages at once but defer recognition to a later time when it can take place unattended by you.

OmniPage Professional gives you the option of deferring the recognition process.
This means that you can scan and zone a number of documents or open and zone a number of images and put off recognition until later. You can even schedule OCR to commence at a specific time.

This section gives you general guidelines on deferring recognition when scanning a stack of documents or loading a group of image files.

Set the toolbar with the Defer OCR setting.
Set the other toolbar and Settings Panel options according to the requirements of the documents or files you plan to scan.
Load the Automatic Document Feeder (ADF) with the documents to be scanned or set the Image button to Load Image.
Click the AUTO button.

If you are scanning documents, the first page in the stack is scanned and zoned, then the next page, and so on. If you are loading image files, the first image in the list is opened and zoned, then the next image, and so on. If you are using the Auto Zones feature, each page is zoned automatically. If you are using the Manual Zones feature, the AUTO processing stops each time a page is ready for zoning.

Once the page images have been zoned, you have two choices: finish recognizing the current document or save the file and perform recognition later.

Finish Current Document

If you want to finish the current open document:

Choose Finish Current Document... in the Process menu. The Finish Current Document dialog box lets you choose to save the document to a specific file format.
Select Convert Automatically to save the document immediately after recognition. If you do not select Convert Automatically, the file will be saved as a Caere Document. You also have the option of deleting the Caere Document after recognition.
Click Save Output To... to choose a file format and location for the saved file. The Save As dialog box opens. Choose a file type and a destination for your file. Click OK to return to the Finish Current Document dialog box.
Save a Document for Later Recognition

If you want to save the document for later processing:

1 Choose Save in the File menu after the page has been zoned. The Save As dialog box opens.
2 Select the Caere Document file format (.met) as the file type and type a name into the File Name edit box.
3 Click OK.
4 When you want to finish the document, choose Finish Deferred Documents... in the Process menu. The Finish Deferred Documents dialog box offers options for recognizing and saving the deferred files:
• Click Add Files... to add Caere Documents to the list. The Open dialog box opens. Double-click the files you want to open, and click OK to return to the Finish Deferred Documents dialog box.
• Select Convert Automatically to save the document immediately after recognition. If you do not select Convert Automatically, the file will be saved as a Caere Document. You also have the option of deleting the Caere Document after recognition.
• Click Save Output To... to choose a file format and location for the saved file. The Save As dialog box opens. Choose a file type and a destination for your file. Click OK to return to the Finish Deferred Documents dialog box.
• Choose a time to start OCR in the When to Recognize section.
5 Click OK. At the set time, OmniPage opens each document in the order in which it was added to the list. It performs recognition, saves, and closes each document when recognition is complete.

Chapter 3 Commands and Settings

This chapter explains how to use all of OmniPage's commands and settings, which are located within seven menus and a convenient toolbar. The OmniPage menus include the:
• File menu
• Edit menu
• Format menu
• Process menu
• Settings menu
• Window menu
• Help menu

The toolbar provides shortcut command and processing buttons to perform OmniPage operations.
The information in this chapter is organized hierarchically to describe each toolbar button and menu command. For example, the description for Save Options is listed at the end of the following series of descriptions:

File menu description
  Save As... description
    Save Options description

Some features are only available with the OmniPage Professional version. These are noted in this chapter as "Professional version only." For practical ways to use OmniPage with step-by-step instructions, see Chapter 2, Tutorials.

Use the toolbar to access the fundamental steps of the OCR process:

1 Getting the page image that you want to recognize.
2 Choosing what will be recognized in the image by creating zones.
3 Recognizing the image or, if you're an OmniPage Professional user, performing other OCR options before recognition.

You can choose automatic processing so that OmniPage automatically performs all these steps according to the commands that you select. Or, you can work interactively with OmniPage each step of the way. In addition to the OCR processing steps, the toolbar also provides shortcuts for performing other important OmniPage commands.

[Figure: the toolbar, showing the processing buttons and the shortcut command buttons]

See Touring the Settings Panel on page 2-15 for more information.

Shortcut Command Buttons

The toolbar's shortcut command buttons are for your convenience.

Use the Settings Panel button to open the Settings Panel.
Use the Save button to save the current document.
Use the Save As... button to save the current document with a different name or in another file format.
Use the Print button to print recognized text in the current document.
Use the Help button to get help on OmniPage.
Use the Image Assistant button to launch the Image Assistant 24-bit color and image-editing program (Professional version only).
Use the Cut button to cut text in a recognized document.
Use the Copy button to copy text in a recognized document.
Use the Paste button to paste text in a recognized document.
Use the Clear All Zones button to delete the currently drawn zones in the zone window. Only the zone borders are deleted; the image itself remains the same.
Use the Find/Replace button to find and replace words in a recognized document.
Use the Check Recognition button to check for errors in a recognized document.

The shortcut command buttons perform the same functions as the corresponding commands in the File, Edit, Settings, and Help menus. For more information about these commands, see their respective menu entries further in this chapter.

Processing Buttons

The toolbar's processing buttons perform the same operations as the Process Settings commands in the Process menu.

[Figure: the AUTO, Image, Zone, and OCR buttons]

You can use the:
• AUTO button to automatically process your document from start to finish according to the currently selected processing commands.
• Image button to get an image for recognition by scanning a page or loading an existing image.
• Zone button to specify what will be recognized in an image by creating zones manually, automatically, or with a template (Professional version only).
• OCR button to perform OCR, defer OCR (Professional version only), or train OCR (Professional version only).

The status bar at the bottom of the screen reports the currently activated operation and then the operation that you can select next.

AUTO Button

The AUTO button, located on the far left side of the toolbar, performs the same operations as the Auto command in the Process menu.

Click AUTO to start and finish processing each page of a new document automatically or to finish processing the current page of an open document. This process is determined by the commands selected in the Image, Zone, and OCR button drop-down list boxes.
For example, if you want OmniPage to automatically scan and process a multi-page document, you can select Scan Image, Auto Zones, and Perform OCR in the processing button drop-down list boxes. When you click AUTO, the first page in the scanner will be scanned, automatically zoned, and recognized. The same process is automatically repeated for the next page. This continues until all of your pages are processed.

When a document is already open, you can click AUTO to finish processing the current page. The resulting operation depends on the state of the page and the selected Image, Zone, and OCR commands. For example, if your page image already has zones, then OmniPage immediately begins recognition processing according to the selected OCR command.

The AUTO button changes to STOP as automatic processing begins. Click STOP at any time if you want to discontinue processing.

Image Button

Use the Image button to get an image for recognition by scanning a page or loading an existing image. This button performs the same operations as the Scan Image/Load Image Process Settings commands in the Process menu.

Select Scan Image or Load Image from the Image button drop-down list box. Click the Image button to initiate the selected operation. The selected Image command is also used when OmniPage performs automatic processing.

Scan Image

Choose this to scan a page in your scanner. Before scanning, make sure the appropriate Scanner options are selected in the Settings Panel.

You can use your right mouse button to click the Image button and automatically open the Settings Panel to Scanner options.

While a page is being scanned, a progress meter appears and the status bar reports the progress. The page image appears in the zone window when scanning is complete. Click the STOP button in the toolbar to cancel scanning at any time.
Load Image

Choose this to load a previously saved image file as a new document or to add an image file to your open document.

An image file is a "picture" of text and/or graphics that is saved in an image file format such as TIFF or PCX. When you load an image file in OmniPage, it appears in the zone window. See the next section for a list of supported input file formats.

Supported Input File Formats

OmniPage can open files with the following file formats.

Caere Format (*.met)
You can open Caere Document files (*.met) created in the 5.0 or later version of OmniPage.

Image File Formats
• PCX
• TIFF Uncompressed
• TIFF Compressed (Types II, III, IV, and PackBits)
TIFF files must be line art and 200, 300, 400, or 600 dpi; 300 dpi is recommended.

Fax File Formats
OmniPage supports fax files saved in the .PCX format. Many fax boards can receive or convert the .PCX format; please consult your fax documentation for more information.

To load an image file:

In the Load Image dialog box, specify the path and directory where your image files reside.
Select the type of file you wish to load from the List Files of Type drop-down list box. Files of that type in the specified directory appear in the File Name list box.
Click the file you want to load and then click OK. The file opens in the zone window. For a multi-page image file, you must click the Image button to load each consecutive page in the file.
Click Cancel to exit without loading an image file.

You can load multiple image files when you have Load Image selected for automatic processing. For example, you may have a number of TIFF files that you want to process automatically. These files are loaded and processed in the order that they are selected and combined into one working document.

To load one or more image files for automatic processing:

1 Select Load Image and the desired zone and OCR commands and then click the AUTO button to begin processing. The Load Image dialog box appears.
2. Specify the path and directory where your image files reside.
3. Select the type of files you wish to load from the List Files of Type drop-down list box. Files of that type in the specified directory appear in the File Name list box.
4. For each file you want to load, click the file and then click Add. Click Add All to select all the files in the directory. The files appear in the Selected Files list box.
5. To add image files from other directories, repeat steps two through four. You can select up to 255 files. To remove a file from the list box, click it and then click Remove. Click Remove All to remove all files from the list box.
6. When you have selected all the files you want to load, click OK. The images are loaded into the zone window and processed one at a time, in the order in which they were selected.
Click Cancel to exit without loading any files.

Zone Button

Use the Zone button to create zones that determine what will be recognized in the page image. This button performs the same operations as the Auto Zones, Manual Zones, and Use Template Process Settings commands in the Process menu.

Select Auto Zones, Manual Zones, or a zone template file (Professional version only) from the drop-down list box. If you select Auto Zones or a zone template file, click the Zone button to initiate the operation. The selected Zone command is also used when OmniPage performs automatic processing.

Auto Zones

Select this in the drop-down list to have OmniPage automatically draw and order zones in the current page image and determine the appropriate text flow for recognition.

To create Auto Zones, OmniPage uses the selected Zones option in the Settings Panel: Multiple Columns, Single Column or Table, or None. For more information about each of these options, see Zones Options on page 2-9.

Note: You can use your right mouse button to click the Zone button and automatically open the Settings Panel to Zones options.
If a page already has zones, you are prompted to delete the current zones before auto zoning occurs. Click Yes to proceed. The zone window is then updated so that you can review the zones that are drawn. See Tutorial 2 - Document Types and OCR Settings on page 2-35 to learn more about using zones.

Manual Zones

Choose this to draw and order your own zones in the current page image using the tool palette in the zone window.

When you create zones manually, OmniPage uses the selected Zones option in the Settings Panel (Multiple Columns, Single Column or Table, or None) to determine the text flow within each zone that you draw. For more information about each of these options, see Zones Options on page 2-9.

If a page already has zones, you are prompted to delete the current zones; click Yes to proceed. The zone window is then updated so that you can draw your own zones. For more detailed information on creating manual zones, see Manual Zones on page 2-46.

Zone Templates (Professional version only)

Choose a zone template file directly from the drop-down list box to apply zones to the current page image based on that template. This is a quick and efficient means of processing similar documents with the same zoning requirements.

A zone template file consists of various zone attributes such as position, order, and zone contents. If you frequently process documents with the same layout, such as business forms, create and save a zone template and apply it to all such documents.

If a page already has zones, you are prompted to delete the current zones before applying a zone template. Click Yes to proceed. The zone window is updated so that you can review the zones that are drawn.

You can create zones manually and save them as a template using the Save Zone Template... command in the File menu. For more information on creating zones manually, see Manual Zones on page 2-46.

OCR Button

Use the OCR button to perform the selected OCR command on the page image.
This button performs the same operations as the Perform OCR, Defer OCR, and Train OCR Process Settings commands in the Process menu.

Select Perform OCR, Defer OCR (Professional version only), or Train OCR (Professional version only) from the drop-down list box. If you select Perform OCR or Train OCR, click the OCR button to initiate the operation. The selected OCR command is also used when OmniPage performs automatic processing.

Perform OCR

Choose this to recognize text on the current page. Before performing OCR, make sure the appropriate OCR options are selected in the Settings Panel.

Note: You can use your right mouse button to click the OCR button and automatically open the Settings Panel to OCR options.

If there are no zones on the page when you select Perform OCR and click the OCR button, OmniPage automatically creates zones according to the selected Zone command. If Manual Zones is currently selected, OmniPage ignores this and draws zones automatically.

Defer OCR (Professional version only)

Choose this to delay text recognition of one or more pages of your document. For example, you can use the AUTO button to scan pages, create zones, and defer OCR of your document. Then, at your convenience, you can set OmniPage to recognize your entire document by choosing Finish Current Document or Finish Deferred Documents in the Process menu.

You can also recognize individual pages of an unrecognized document. For example, you can open a document to a particular page, choose Perform OCR, and click the OCR button; only that page will be recognized.

To save a document with deferred pages, you must save it in Caere Document format (*.met). For more information, see Deferring Recognition (Professional version only) on page 2-66.

Train OCR (Professional version only)

Choose this to create a character training file (*.trn) that assists OmniPage during text recognition and allows better recognition of special characters.
A character training file is a set of pre-recognized text characters that OmniPage compares with the characters in the page image during recognition. Before recognizing an image, you can create a new training file or choose an existing one in the Settings Panel OCR options.

For more information on creating a training file, see Train OCR (Professional version only) on page 2-52. For step-by-step instructions on training OCR for special characters, see Documents with Specialized Characters (Professional version only) on page 2-50.

The File Menu

The File menu lets you manage OmniPage file operations. File menu commands include: Open Document... (Ctrl+O), Close Document (Ctrl+W), Mail... (MAPI mail systems only), Save (Ctrl+S), Save As..., Export Image..., Revert to Saved, Get Accuracy Info..., Save Settings..., Load Settings..., Save Zone Template..., Print... (Ctrl+P), Publish to Envoy..., and Exit.

Open Document...

Choose Open Document... to open a Caere Document (*.met) or an image file.

Caere Document (*.met)
OmniPage creates a Caere Document the first time you scan or open an image. A Caere Document can have up to 255 pages. Each page can include the original image, zones, and recognized text. You can continue to reopen a Caere Document in OmniPage, make edits, and save it in any other supported file format you wish. Additionally, if a Caere Document is saved with its original page images, you can retain graphic images, verify recognized text against the page image, defer recognition, and rerecognize pages at any time.

Image file
An image file is a "picture" of text and/or graphics that is saved in an image file format such as TIFF or PCX. Image files do not have OCR or zone information. When you open an image file in OmniPage, it appears in the zone window.
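As a rough mental model of the two file kinds described above, the Python sketch below represents a Caere Document as a list of at most 255 pages, each optionally holding an image, zones, and recognized text. The class names and the `can_verify` helper are hypothetical; the actual *.met file layout is not described in this manual.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_PAGES = 255  # page limit stated for a Caere Document


@dataclass
class Page:
    image: Optional[bytes] = None                    # original page image, if saved
    zones: List[str] = field(default_factory=list)   # zone descriptions
    text: Optional[str] = None                       # recognized text, if any


@dataclass
class CaereDocument:
    pages: List[Page] = field(default_factory=list)

    def add_page(self, page: Page) -> None:
        """Append a page, enforcing the 255-page limit."""
        if len(self.pages) >= MAX_PAGES:
            raise ValueError("a Caere Document can have up to 255 pages")
        self.pages.append(page)

    def can_verify(self, index: int) -> bool:
        """Verification needs both recognized text and the saved image."""
        page = self.pages[index]
        return page.image is not None and page.text is not None


def from_image_file(image: bytes) -> CaereDocument:
    """An image file carries no zones or OCR data, only the picture."""
    doc = CaereDocument()
    doc.add_page(Page(image=image))
    return doc
```

In this model, a document built from an image file starts with one page that has an image but no zones and no text, so verification becomes possible only after recognition has filled in the text.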
To open a Caere Document or image file:
1. Locate your Caere Documents or image files in the Open Document dialog box.
2. Select the type of file you wish to open from the List Files of Type drop-down list box. Files of that type appear in the File Name list box.
3. Double-click a file or select it and click OK. An image file opens in the zone window. A Caere Document opens with recognized text in the text window and its original image (if saved) in the zone window. In either case, the first page of your file is displayed.
Click Cancel to exit without opening a file.

You can have only one working document open at a time. If you attempt to open another file, you are prompted to close your current document. You can add page images to your document by using the Load Image or Scan Image command in the Process menu or the Image button drop-down list box.

Close Document

Choose Close Document to stop working on a document but leave OmniPage running. If the current document has not been saved or has changed since the last save, a prompt appears asking if you want to save the document before closing. Click Cancel to go back to the open document.

Mail...

Choose Mail... to access your mail system and send each page of recognized text from your currently open document. This command is only available for MAPI mail systems such as Microsoft Mail.

Save

Choose Save to write the contents of your current working document to disk. This command is also available as a button in the toolbar. The Save As dialog box appears if you are saving the file for the first time. After saving, you can continue working on your document.

Save As...

Choose Save As... to choose a file format and save a document to disk. This command is also available as a button in the toolbar.

Use this command to save Caere Documents and recognized documents to other file formats. To save a recognized document in more than one file format, you can: Save the file as a Caere Document (*.met).
By saving your document as a Caere Document, you can continue to reopen it in OmniPage, make edits, and save it in other supported file formats. A Caere Document can have up to 255 pages. Each page can include the original image, zones, and recognized text.

Save the initially recognized document in each desired format using Save As... while it is open in the text window. Remember, only a Caere Document can be reopened (and resaved in a different format) in OmniPage.

Save Options

When you save your document to a file format other than a Caere Document, you can select one of three Save Options.

Create one file for all pages
Select this to save all the pages in your document as one file. (Blank pages are not saved.) Save the file with a standard file name of eight characters or less.

Create one file per page
Select this to create a separate file for each page in your document and automatically increment file names. (Blank pages are not saved.) The assigned file names consist of up to five characters and appended numbers starting with 001. For example, if you use "form" as a file name, the first file is named form001, the second file form002, and so on. The file extension added depends on your choice of file formats. A Word for Windows file would be named form001.doc.

Create new file at each blank page
Select this to create a new file after each blank page in your document. (Blank pages are not saved.) For example, if you want to scan several batches of pages at once, insert blank pages to separate each batch. OmniPage will save the first batch of pages as a file, detect a blank page, save the next batch of pages as a file, detect a blank page, and so on. The assigned file names consist of up to five characters and appended numbers starting with 001. For example, if you use "form" as a file name, the first file is named form001, the second file form002, and so on.
The file extension added depends on your choice of file formats. A Word for Windows file would be named form001.doc.

To save a file:
1. In the Save As dialog box, select the path and directory in which to save your file. The default directory is called Data; OmniPage creates this during installation.
2. Type a name for your file in the File Name edit box.
3. Select the appropriate file format from the Save Files as Type drop-down list. See Supported Output File Formats on page 2-4 for a list of supported file formats and a description of ASCII and ANSI options.
4. For a recognized document that you are saving in another file format, select the appropriate Save Option as described in the preceding section.
5. Click OK. OmniPage automatically adds the appropriate file extension to the file name and the current working file returns to the screen.
Click Cancel at any time to exit without saving.

Export Image...

Choose Export Image... to save an image to disk in an image file format such as TIFF or PCX.

An image file is a "picture" of text and/or graphics. For example, scanning a page results in an image that you can save in an image file format. Image files do not have OCR or zone information. When you open an image file in OmniPage, it appears in the zone window.

Save Options

You can select one of two Save Options. Select Save Current Page Only if you want OmniPage to save only the current page image as a file. Select Save All Pages if you want OmniPage to create a separate file for each page in your document and automatically increment file names starting with 001.

Image Options

You can select one of two Image Options. Select Save Each Graphic Zone to a File if you want OmniPage to save only the graphics within your page image. You must create zones in the page image and perform OCR before you can choose this option.

Note: Choose the Multiple Columns zoning option in the Settings Panel Zones options to have OmniPage automatically separate graphics from text.
Or, draw manual zones and identify the graphics as graphic zones.

Select Save Entire Page to a File if you want OmniPage to save the entire page image. You do not need to create zones or perform OCR unless you have graphic zones.

Graphic File Name

The combination of Save and Image Options you choose affects the length of the file name you can assign to the image and how OmniPage appends a suffix to it.

Save Current Page Only and Save Entire Page to a File: the name you choose can have eight characters. This creates one one-page image file.

Save All Pages and Save Entire Page to a File: the name you choose can have five characters. 00n is appended, where n represents the page number (001, 002, etc.). This creates multiple one-page image files.

Save Current Page Only and Save Each Graphic Zone to a File: the name you choose can have seven characters. OmniPage appends a letter to indicate the order of the graphic on the page. A is the first graphic, B is the second, and so on. This creates one file for each graphic on the current page. Up to 26 files can be created in one directory with this method.

Save All Pages and Save Each Graphic Zone to a File: the name you choose can have four characters. OmniPage appends both a number and a letter. The number (00n) indicates the page number and the letter indicates the order of the graphic on the page. Thus the second graphic on the second page would be *002B. This creates one file for each graphic on every page.

To save an image file:
1. In the Export Image dialog box, select the path and directory in which to save the file.
2. Type a name for your file in the File Name edit box.
3. Select the appropriate file format from the Save Files as Type drop-down list.
4. Select the appropriate Save and Image Options.
5. Click OK. OmniPage automatically adds the appropriate file extension to the file name and the current working file returns to the screen.
Click Cancel at any time to exit without saving.
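The four naming rules under Graphic File Name can be summarized in a few lines of code. The following Python sketch is an illustration only: the function `export_names` and its parameters are hypothetical, it returns names without the file extension (OmniPage adds that according to the chosen format), and it assumes page numbers are always written with three digits.

```python
from string import ascii_uppercase

# Maximum base-name length for each (all_pages, per_graphic) combination,
# taken from the four cases described under Graphic File Name.
MAX_BASE = {
    (False, False): 8,  # Save Current Page Only + Save Entire Page to a File
    (True, False): 5,   # Save All Pages + Save Entire Page to a File
    (False, True): 7,   # Save Current Page Only + Save Each Graphic Zone to a File
    (True, True): 4,    # Save All Pages + Save Each Graphic Zone to a File
}


def export_names(base, all_pages=False, per_graphic=False, graphics_per_page=(1,)):
    """Return exported file names (no extension) for the chosen options.

    graphics_per_page has one entry per exported page giving the number of
    graphic zones on that page; the counts are ignored when whole pages are
    saved, and only the number of entries matters.
    """
    limit = MAX_BASE[(all_pages, per_graphic)]
    if len(base) > limit:
        raise ValueError(f"name can have at most {limit} characters")
    if not all_pages and len(graphics_per_page) != 1:
        raise ValueError("Save Current Page Only exports exactly one page")

    names = []
    for page_no, graphics in enumerate(graphics_per_page, start=1):
        page_part = f"{page_no:03d}" if all_pages else ""
        if per_graphic:
            if graphics > len(ascii_uppercase):
                raise ValueError("at most 26 graphics per page (A to Z)")
            for letter in ascii_uppercase[:graphics]:
                names.append(base + page_part + letter)
        else:
            names.append(base + page_part)
    return names
```

For example, with the hypothetical base name "logo", Save All Pages, and Save Each Graphic Zone to a File on a two-page document holding one and two graphics, the sketch yields logo001A, logo002A, and logo002B, so every name still fits within eight characters.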
Revert to Saved

Choose Revert to Saved to undo edits made to a file and return to the last-saved version of the file. For example, if you have deleted important information or cut and pasted text into unreadable gibberish, choose Revert to Saved and the file will reappear as it was when you last saved it.

Get Accuracy Info...

Choose Get Accuracy Info... for a statistical report showing how well OmniPage recognized a page.

Accuracy information is valuable for comparing the effect of different settings on recognition accuracy. For example, if you are not sure which Scanner options to choose, you can compare the recognition accuracy percentages of different options.

You can also quickly tell if a poor-quality document is worth scanning. If the recognition accuracy rate is less than 97%, it might be quicker to rescan a better copy of the page or to enter the text manually.

The Get Accuracy Info dialog box provides a statistical report for the most recently recognized page.

[Figure: Accuracy Information dialog box for page 1]

Number of Characters
This is the number of characters and spaces on the page.

Number of Words
This is the number of words on the page.

Number of Rejects
This is the number of unrecognizable characters. This does not count improper substitutions or incorrectly recognized formatting commands. Reject characters appear in red in the recognized document; by default, rejects are represented by the tilde (~) character.

Number of Suspects
This is the number of questionable characters which OmniPage made an attempt to recognize. These words appear in green in the recognized document.

Number of Spelling Replacements
This is the number of words which were automatically corrected by Caere's Language Analyst feature.
These words appear in blue in the recognized document.

Recognition Time
This is the time it took to break the page down into text and graphics and perform recognition. This does not count scanning time, the time it takes to create zones, or the time spent writing data to disk.

Words per Minute
This is the number of words per minute (wpm) that OmniPage recognized. Assuming that the average word is five characters long, the formula is:

Recognition Rate
This rate is expressed in characters per second (cps). The formula is:

Accuracy Rate
This is the recognition accuracy given as a percentage. The formula for Accuracy Rate is:

If the accuracy rate is less than 97%, it might be quicker to rescan a better copy of the page or to enter the text manually.

Save Settings...

Choose Save Settings... to save the currently selected Settings Panel options and language selection(s) to a settings file (*.set) for later use.

Saving settings files is especially useful if you process different types of documents. Since various documents may require different settings, you can save different settings files and then load the appropriate file for a particular document.

To save settings:
1. In the Save Settings dialog box, select the path and directory to which you want the settings file saved. You may want to create a special directory for your settings files.
2. Type a name for your settings file in the File Name edit box.
3. Select Caere Settings (*.set) as the type of file you are saving.
4. Click OK to save the settings file.
Click Cancel to exit without saving.

To load a settings file, use the Load Settings... command in the File menu.

Load Settings...

Choose Load Settings... to load a previously saved settings file (*.set).

A loaded settings file automatically sets Settings Panel options and language selection(s) to preselected values. This is useful for quickly restoring OmniPage to settings required by certain documents.
To load a settings file:
1. In the Load Settings dialog box, select the path and directory where your settings files reside.
2. Select Caere Settings (*.set) as the type of file you are loading.
3. Double-click the settings file you want. Or, select the file and click OK. The settings are loaded immediately.
Click Cancel to exit without loading a settings file.

To save a settings file, use the Save Settings... command in the File menu.

Save Zone Template... (Professional version only)

Choose Save Zone Template... to save the zones that you manually create on a page image as a template.

A zone template file (*.zon) consists of various zone attributes such as position, order, and zone contents. For example, if you frequently process documents with layouts and content that require the same type of zoning, you can create and save a zone template and apply it to all such documents.

To save a zone template:
1. After manually creating the zones you want to save, choose Save Zone Template....
2. In the Save Zone Template File dialog box, select the path and directory where you want the zone template saved. The default directory created during installation is called Data. OmniPage looks for zone template files in this directory.
3. Type a name for your zone template file in the File Name edit box.
4. Select Caere Zone (*.zon) as the type of file you are saving.
5. Click OK to save the zone template file.
Click Cancel to exit without saving the zone template.

To apply a zone template to a page image, choose the Use Template... command in the Process menu or select a template directly from the Zone button drop-down list box.

Print...

Choose Print... to print a recognized document. This command is also available as a button in the toolbar.

The dialog box that appears depends on your printer. A document is printed according to the selected print options such as print range, print quality, and number of copies. Select the desired print options and click OK to start the print job.
Click Cancel to exit without printing or saving the selected print options.

Publish to Envoy... (Professional version only)

Choose Publish to Envoy... to save recognized text and any retained graphics as a WordPerfect Envoy runtime file.

An Envoy file displays information as if it were printed, only it is printed on the screen rather than on paper. Envoy preserves your document's fonts and page layout. Text and graphics will appear exactly as they did in the OmniPage text window.

You cannot change an Envoy file's contents as you would edit a file in a word processor, but you can rearrange, combine, and delete whole pages of the file. You can annotate an Envoy file on screen and print out the entire file on paper. You can also copy selected items from an Envoy file to the Windows Clipboard and paste them into other applications.

Saving Recognized Text as a WordPerfect Envoy Runtime File:
1. Choose Publish to Envoy.... The Print dialog box appears with the Envoy driver automatically selected.
2. Click OK. The Save Envoy Runtime As dialog box appears.
3. Select the path and directory in which to save the file.
4. Enter a name for your file.
5. Click Setup... to change the default print options.
6. Click OK. Your file will automatically be saved as an Envoy Runtime file with an .exe extension.

Opening a WordPerfect Envoy Runtime File

An Envoy runtime file is self-opening: it includes a scaled-down version of the WordPerfect Envoy application. This file can open itself on the same kind of computer it was created on (Macintosh or PC) even without Envoy installed.

To open your Envoy runtime file, double-click its file name (*.exe) in the Windows File Manager. Your file will open in a scaled-down version of the Envoy viewer with the title "Embedded Document."

Note: The file will open in the regular Envoy viewer if you have the Envoy application installed.
Descriptions appear on the Envoy title bar at the top of the screen and on the status bar at the bottom of the screen to explain what a selected button or command does and what actions you can take next as you perform a task.

Note: You cannot import or open any file other than the file that is attached to the Envoy runtime viewer.

Exit

Choose Exit to quit the OmniPage program. If the current working document has changed since the last save, a prompt appears asking if you want to save the document. Click Cancel to go back to the open document.

The Edit Menu

The Edit menu lets you revise text in the text window and work with images in the zone window. Edit menu commands include: Cut (Ctrl+X), Copy (Ctrl+C), Paste (Ctrl+V), Clear (Del), Clear All Zones (for zone window only), Select All in Page, Check Recognition... (Ctrl+K), Verify Image (Ctrl+Y), Find/Replace... (Ctrl+F), Delete Recognized Zone, Select Recognized Zones, Delete Current Page, and Go to Page....

Cut

Choose Cut to temporarily delete selected text from the recognized document. This command is also available as a button in the toolbar.

Cut text is stored on the Windows clipboard and may be pasted anywhere (except into a graphic) in the document. The text remains on the clipboard until new text is cut or copied.

To cut text:
1. Position the text cursor at the start of the text, hold the mouse button down, and drag the cursor across the text to highlight it.
2. Release the mouse button when you have selected the desired area of text.
3. Choose Cut or click the Cut button. The selected text disappears.
4. Place the cursor where you want to place the text and click the mouse button.
5. Choose Paste in the Edit menu or click the Paste button.

Note: The Verify Image feature cannot track text that is cut and pasted from one page to another.

Copy

Choose Copy to duplicate selected text from the recognized document.
This command is also available as a button in the toolbar.

Copied text is stored on the Windows clipboard and may be pasted anywhere (except into a graphic) in the document. The text remains on the clipboard until new text is cut or copied.

To copy text:
1. Position the text cursor at the start of the text, hold the mouse button down, and drag the cursor across the text to highlight it.
2. Release the mouse button when you have selected the desired area of text.
3. Choose Copy or click the Copy button. The selected text remains as is.
4. Place the cursor where you want to place the text and click the mouse button.
5. Choose Paste in the Edit menu or click the Paste button.

Paste

Choose Paste to place cut or copied text in the recognized document. This command is also available as a button in the toolbar.

Pasted text appears at the cursor location. A copy of the pasted text remains on the Windows clipboard until new text is cut or copied.

Note: The Verify Image feature cannot track text that is cut and pasted from one page to another.

Clear

Choose Clear to delete selected text from the recognized document permanently.

To clear text:
1. Position the text cursor at the start of the text, hold the mouse button down, and drag the cursor across the text to highlight it.
2. Release the mouse button when you have selected the desired area of text.
3. Choose Clear.
Cleared text is not stored on the Windows clipboard, so you cannot paste it.

Clear All Zones

Choose Clear All Zones to delete all of the zones in a page image in the zone window. This command is also available as a button in the toolbar. Clear All Zones appears in the menu only when the zone window is active.

When you clear zones, only the zone borders are deleted; the image itself remains the same. After the zones are cleared, you can create new zones manually, automatically, or by using a zone template (Professional version only).
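The Cut, Copy, Paste, and Clear commands described above differ mainly in how they treat the Windows clipboard. The Python sketch below models those rules with a hypothetical `TextBuffer` class; it is a toy illustration of the stated behavior, not OmniPage code, and it represents a selection simply as a start and end offset.

```python
class TextBuffer:
    """Toy text window with a single shared clipboard slot."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.clipboard = ""

    def copy(self, start: int, end: int) -> None:
        # Copy: the selection stays in place and replaces the clipboard contents.
        self.clipboard = self.text[start:end]

    def cut(self, start: int, end: int) -> None:
        # Cut: same as Copy, but the selection is also removed from the text.
        self.copy(start, end)
        self.text = self.text[:start] + self.text[end:]

    def clear(self, start: int, end: int) -> None:
        # Clear: the selection is removed and the clipboard is left untouched,
        # so cleared text cannot be pasted back.
        self.text = self.text[:start] + self.text[end:]

    def paste(self, position: int) -> None:
        # Paste: insert the clipboard contents; they remain available for
        # further pastes until new text is cut or copied.
        self.text = self.text[:position] + self.clipboard + self.text[position:]
```

Because `paste` never empties the clipboard, the same text can be pasted repeatedly, while `clear` is the only operation that removes text without any way to paste it back.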
Select All in Page

Choose Select All in Page to automatically select the entire contents of a recognized page in the text window. This command is available only when the text window is active. To deselect a selected page, click anywhere within it or choose Select All in Page again.

Check Recognition...

Choose Check Recognition... to check for errors in a recognized document. This command is also available as a button in the toolbar.

OmniPage uses the currently selected main and user dictionaries to check recognition. The Check Recognition operation will stop at:
Blue words: these were replaced by the Language Analyst.
Green words: these have questionable characters that OmniPage made an attempt to recognize.
Red characters: unrecognizable characters in a word are replaced with a red reject character (~ is the default).
Words not found in the dictionaries.

When OmniPage finds a possible error, the Check Recognition dialog box shows the original image of the word in the context of the original page.

Note: You can only see character bitmaps if the original page images are saved in the Caere Document.

Choose one of the following options for a word flagged as a possible error. After you choose an option for the word, OmniPage automatically finds the next possible error.

Ignore
Click this to allow a word to remain as is and go on to find the next error.

Add
Click this to add a word to your current User Dictionary and go on to find the next error. Other occurrences of the word that are suspected errors in the current document will still be checked. However, OmniPage will accept future occurrences of the word when you use the same user dictionary for future documents.

Change
Click this to replace a word with the word in the Change To edit box.
To place a word in the Change To edit box, you can either type in a word or select a word from the drop-down list box.

Note: The original text of a word corrected by the Language Analyst appears as the first word in the list box in case you want to change it back.

Done
Click this to exit the Check Recognition operation. Any changes made up to that point will be retained.

Verify Image

Choose Verify Image to view the original image of recognized text in the Verification Window.

The Verification Window is an important feature that you can use while you are viewing and editing recognized text. It shows a clear close-up of the original image and surrounding area of selected text.

In order to verify images, the original page images must be saved in the Caere Document. To save page images, make sure Save Page Images in Caere Document is selected in the Settings Panel Preferences options before scanning or loading an image.

Note: Saving the original page images slightly slows down processing and takes up more disk space.

To verify an image:
1. Place the cursor in the area of recognized text that you want to verify.
2. Choose Verify Image. Or, double-click the mouse button. The Verification Window appears showing the original image of the selected area of text.

Note: You cannot verify the image of text that is cut and pasted from one page to another.

Find/Replace...

Choose Find/Replace... to find a word or set of characters in the recognized document and replace it with another word, if desired. This command is also available as a button in the toolbar.

By default, when you search for a word, all occurrences of letter combinations that match the word are found. For example, if "jelly" is the search word, OmniPage would find the "jelly" in "jellyfish." You can also select other more specific options for finding words, including Match Whole Word Only and Match Case.

Match Whole Word Only
Select this to find only the words that exactly match the length of the search word.
Compound words that contain the search word within them will not be found.

Match Case
Select this to find only the words that exactly match the upper- and lower-case attributes of the search word.

To find a word:
1. Type the word or set of characters for which you are searching in the Find What edit box.
2. Select specific search options, if desired. You can select Match Whole Word Only and/or Match Case.
3. Click Find Next. The first occurrence of the word is highlighted. To continue searching, click Find Next again.
Click Cancel to exit.

To replace a word:
1. Type the word or set of characters that you want to replace in the Find What edit box.
2. Select specific search options, if desired. You can select Match Whole Word Only and/or Match Case.
3. Type the desired replacement word in the Replace With edit box.
4. Click Find Next. The first occurrence of the word is highlighted.
5. Click Replace to insert the replacement word. OmniPage then automatically looks for the next occurrence of the search word. Click Replace All to replace all instances of a word.
Click Cancel to exit.

Delete Recognized Zone

Choose Delete Recognized Zone to delete a selected text or graphic zone in a recognized page. This command is available in the Edit menu when the text window is active.

You can delete a text zone if your cursor is in it. To delete a graphic zone, however, you must choose Select Recognized Zones in the menu and then click in that graphic zone.

Select Recognized Zones

Choose Select Recognized Zones to select all of the text and graphic zones in a recognized page. A check mark appears next to this command when it is selected.

OmniPage produces various text and graphic zones in a recognized page, which you can resize or reposition to change the page layout. When you select zones, handles appear on each zone. Use the handles to resize a zone.
To move a selected text or graphic zone to another area of the recognized page, place the mouse pointer inside the zone, hold the mouse button down, and drag it to the desired location.

To deselect the zones in the recognized document, choose Select Recognized Zones from the Edit menu again. The check mark disappears and the zones are deselected.

Note: You can also select zones individually by placing the mouse pointer inside a zone and doing an Alt-right mouse click.

Delete Current Page
Choose Delete Current Page to delete a page. You may want to delete a page in your document that was poorly scanned or recognized. When you delete a page, everything is discarded including the page image and recognized text.

Go to Page...
Choose Go To Page... to switch to another page in the current document. Both the text window and zone window will change to reflect the selected page. You can also open the Go To Page dialog box by clicking the current page number in the status bar.

In the Go To Page dialog box, you can select First Page, Last Page, or type in a specific number in the Page Number edit box. Click OK to go to the selected page. Click Cancel to exit and return to the current page.

The Format Menu
The Format menu lets you format character and paragraph attributes while you edit a recognized document in the text window. Format menu commands include:
Character...
Paragraph...

Character...
Choose Character... to change the attributes of a selected character or section of text in a recognized document.

Note: Character formatting will not be retained if you save a file in ASCII or ANSI format.

You can select multiple attributes in the Font dialog box including font, font style, size, and effects. The Sample box illustrates the attributes that you select.

Font
Select a font from the Font list box. You can type a letter in the Font edit box to skip to the fonts beginning with that letter.

Font Style
Select Regular to return selected characters to an unformatted state.
Bold, italic, and underlined characteristics disappear.
Select Italic to change selected characters to an italicized format.
Select Bold to change selected characters to a boldfaced format.
Select Bold Italic to change selected characters to a boldfaced and italicized format.

Size
Select from a range of font sizes in the Size list box.

Effects
Select Underline to change selected characters to an underlined format.

To apply character formatting:
1 Position the text cursor at the start of the text, hold the mouse button down, and drag the cursor across the text to highlight it.
2 Release the mouse button when you have selected the desired area of text.
3 Choose Character... in the Format menu.
4 Make the desired formatting selections in the Font dialog box.
5 Click OK to accept the formatting selections; the selected text changes accordingly.
Click Cancel to exit without applying the formatting selections.

You can also use the Bold, Italics, and Underline buttons in the text window for convenient formatting shortcuts.

Paragraph...
Choose Paragraph... to change the attributes of a selected paragraph in a recognized document. Paragraph formatting will not be retained if you save a file in ASCII or ANSI format.

You can select line spacing and alignment attributes in the Paragraph Format dialog box.

Line Spacing
Select Single for single-spaced lines.
Select Double for double-spaced lines.
Select Triple for triple-spaced lines.

Alignment
Select Left for left-aligned text.
Select Center for center-aligned text.
Select Right for right-aligned text.
Select Justify for justified text.

To apply paragraph formatting:
1 Place the cursor somewhere within the paragraph that you want to format.
2 Choose Paragraph... in the Format menu.
3 Make the desired formatting selections in the Paragraph Format dialog box.
4 Click OK to accept the formatting selections. The selected text changes accordingly.
Click Cancel to exit without applying the formatting selections.
You can also use the buttons in the text window for convenient formatting shortcuts.

The Process Menu
The Process menu lets you perform fundamental OmniPage operations, including each step of the OCR process. Process menu commands include:
Auto/Stop
Scan Image/Load Image
Auto Zones/Manual Zones/Use Template...
Perform OCR/Train OCR/Defer OCR
Process Settings
Finish Current Document...
Finish Deferred Documents...
Start Image Assistant (Professional version only)

Some of the Process menu commands are available as buttons in the toolbar. In particular, the Process Settings commands are available in the Image, Zone, and OCR button drop-down list boxes. The Process Settings commands change according to the currently selected button commands and vice versa.

Auto
Choose Auto to automatically start and finish processing each page of a new document or finish processing the current page of an open document. This command performs the same function as the AUTO button in the toolbar.

Automatic processing of a document is determined by the currently selected Image, Zone, and OCR Process Settings commands. For example, if Scan Image, Auto Zones, and Perform OCR are selected as the processing commands, the following process occurs when you choose Auto:
1 The page in your scanner is scanned and the resulting image appears in the zone window.
2 OmniPage automatically creates zones on the page image.
3 OmniPage performs OCR on the page image and the resulting recognized page appears in the text window.
This same process repeats for every page in a multipage document. Scanning, zoning, and OCR operations occur according to the currently selected Settings Panel options.

When a document is already open to an unfinished page image, you can choose Auto to finish processing that page according to the selected processing commands.
For example, if your document is open to an unrecognized page image without zones, you can choose Auto to create zones and recognize the page according to the selected zone and OCR commands.

As automatic processing begins, the Auto command changes to Stop. Choose Stop if you want to discontinue processing.

Stop
Choose Stop if you want to discontinue processing at any time. This command performs the same function as the Stop button in the toolbar. For example, you may want to stop scanning your page if you realize that inappropriate scanning options were selected.

Scan Image
Choose Scan Image to scan a page in your scanner. This command performs the same function as the Image button when Scan Image is selected from the drop-down list box.

Before scanning, make sure the appropriate Scanner options are selected in the Settings Panel.

Note: You can use your right mouse button to click the Image button and automatically open the Settings Panel to Scanner options.

A scanned image becomes your working document if a document is not already open. When a document is already open, scanned images can be added to it. An image is automatically appended to the end of the document if the last page is currently open. If the document is not open to the last page, a dialog box opens. You can replace the current page, insert before the current page, or append to the end of the document.

While scanning an image, a progress meter appears and the status bar displays progress. To cancel scanning at any time, click the STOP button in the toolbar. When scanning is complete, the image appears in the zone window.

You can scan multiple pages when you select Scan Image for automatic processing. For example, you may have a multi-page document that you want to process automatically. After selecting Scan Image and the desired zone and OCR processing commands, you can choose Auto to begin automatic processing.
The pages are scanned and processed in the order that they are placed in the scanner and combined into one working document.

You can change the Scan Image command to Load Image in the Process Settings cascading menu or the Image button drop-down list.

Load Image
Choose Load Image to open a previously saved image file. This command performs the same function as the Image button when Load Image is selected from the drop-down list box.

An image file is a "picture" of text and/or graphics that is saved in an image file format such as TIFF or PCX. When you load an image file in OmniPage, it appears in the zone window. See Supported Input File Formats on page 2-8 for a list of supported image file formats.

A loaded image file becomes your working document if a document is not currently open. When a document is already open, image files can be added to it. An image is automatically appended to the end of the document if the last page is currently open. If the document is not open to the last page, a dialog box opens. You can replace the current page, insert before the current page, or append to the end of the document.

To load an image file:
1 Locate your image files in the Open Document dialog box.
2 Select the type of file you wish to open from the List Files of Type drop-down list. Files of that type appear in the File Name list box.
3 Double-click a file or select it and click OK. The file opens in the zone window. For a multi-page image file, you must click the Image button to load each consecutive page in the file.
Click Cancel to exit without loading an image file.

You can load multiple image files when you have Load Image selected for automatic processing. For example, you may have a number of TIFF files that you want to process automatically. These files are loaded and processed in the order that they are selected and combined into one working document.

To load one or more image files for automatic processing:
1 Select Load Image and the desired zone and OCR commands.
2 Click the AUTO button to begin processing. The Load Image dialog box appears.
3 Locate your image files.
4 Select the type of files you wish to load from the List Files of Type drop-down list. Files of that type appear in the File Name list box.
5 Click the file to load and then click Add. The file appears in the Selected Files list box. Click Add All to select all the files in the directory. To add image files from other directories, repeat steps two through four. You can select up to 255 files. To remove a file from the list box, click it and then click Remove. Click Remove All to remove all files from the list box.
6 Click OK when you have selected all the files to load. The images are loaded into the zone window and processed one at a time, in the order they were selected.
Click Cancel to exit without loading any files.

You can change the Load Image command to Scan Image in the Process Settings cascading menu or Image button drop-down list.

Auto Zones
Choose Auto Zones to have OmniPage automatically draw and order zones that determine what will be recognized in the page image. This command performs the same function as the Zone button when Auto Zones is selected from the drop-down list.

To automatically create zones and determine the text flow for recognition, OmniPage uses the selected Zones option in the Settings Panel: Multiple Columns, Single Column or Table, or None. For more information about each of these options, see Zones Options on page 2-9.

If a page already has zones, you are prompted to delete the current zones before auto zoning occurs; click Yes to proceed. The zone window is then updated so that you can review the zones that are drawn.

The automatically drawn zones appear in black and each zone has a number indicating its recognition order. Using the zone window tools, you can reorder zones for recognition and deselect zones that you do not want to recognize.

To reorder zones for recognition:
1 Click the Order Zones tool.
The numbers in the zones will disappear.
2 Click within the zone you want to recognize first. The number 1 will appear in the zone.
3 Click within the next zone you want recognized. The number 2 will appear in the zone.
4 Continue until all the zones are appropriately ordered.

To deselect zones that you do not want to recognize:
1 Click the Select Zones tool.
2 Click within each zone you want to deselect. A zone changes from black to white when it is deselected. To reselect a zone, click it again with the Select Zones tool.

You can change the Auto Zones command to Manual Zones in the Process Settings cascading menu or the Zone button drop-down list box. If you use the OmniPage Professional version, you can also change this command to Use Template... in the Process Settings cascading menu or select a zone template directly from the Zone button drop-down list box.

Manual Zones
Choose Manual Zones to draw, order, and specify your own zones that determine what will be recognized in the page image.

When you create zones manually, OmniPage uses the selected Zones option in the Settings Panel (Multiple Columns, Single Column or Table, or None) to determine the text flow within each zone that you draw. For more information about each of these options, see the Zones Options entry in Chapter 2, The Settings Panel.

You can draw zones using the tool palette in the zone window.
Use the Zoom tool to zoom in or out. After selecting it, click the left mouse button to zoom in (enlarge the image) and the right mouse button to zoom out (reduce the image).
Use the Draw Zones tool to draw zones around areas of text.
Use the Order Zones tool to number zones in the order you want them recognized.
Use the Erase Zones tool to delete existing zones.
Use the Arrow buttons to rotate the entire image 90 degrees left, 180 degrees, or 90 degrees right.
Use the Zone Contents drop-down list box to assign a zone contents file to a selected zone.
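
The Order Zones behavior described in this chapter (the existing numbers are cleared, then each click assigns the next number) can be pictured with a small model. The Python sketch below is purely illustrative: the `Zone` class and the `order_by_clicks` function are hypothetical names, not part of OmniPage, and ignoring a second click on an already numbered zone is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Zone:
    name: str                      # hypothetical label used only in this sketch
    order: Optional[int] = None    # recognition order; None means unnumbered


def order_by_clicks(zones: List[Zone], clicked: List[str]) -> List[Zone]:
    """Clear all numbers, then number zones in the order they are clicked."""
    for zone in zones:
        zone.order = None          # selecting the tool clears the numbers
    by_name = {zone.name: zone for zone in zones}
    next_number = 1
    for name in clicked:
        zone = by_name[name]
        if zone.order is None:     # assumption: repeat clicks are ignored
            zone.order = next_number
            next_number += 1
    numbered = [z for z in zones if z.order is not None]
    return sorted(numbered, key=lambda z: z.order)


zones = [Zone("headline", 1), Zone("left column", 2), Zone("right column", 3)]
ordered = order_by_clicks(zones, ["left column", "right column", "headline"])
print([(z.name, z.order) for z in ordered])
```

In this model the recognition order is simply the click order, which is why every zone you want recognized has to be clicked once.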
To draw zones:
1 Click the Draw Zones tool.
2 Enclose an area you want as a zone by holding the mouse button down and dragging the mouse. When you have enclosed the desired area, release the mouse button.
3 Continue using the mouse to draw zones in the page image until you have finished. You can draw up to 64 separate zones of which 26 can be graphic zones. Any area of the image that is not part of a zone will not be recognized.
A number appears in each zone indicating the order in which the zone will be recognized. To reorder zones, use the Order Zones tool.

To resize zones:
1 Click the Draw Zones tool.
2 Click a zone to select it. Handles appear on the zone border.
3 Select a handle, hold the mouse button down, and drag the mouse in the direction that you want to enlarge or reduce the zone.

To order zones for recognition:
1 Click the Order Zones tool. The numbers in the zones disappear.
2 Click within the zone you want to recognize first. The number 1 appears in the zone.
3 Click within the next zone you want recognized. The number 2 appears in the zone.
4 Continue until all the zones are appropriately ordered. To reorder zones, click the Order Zones tool again.

To move zones:
1 Click the Draw Zones tool.
2 Place the mouse pointer inside a zone. Your cursor changes to a four-way arrow.
3 Hold down the mouse button and drag the zone wherever you want it. Only the zone borders can be moved; the contents of the page image remain in the same place.

To erase zones:
1 Click the Erase Zones tool.
2 Click within each zone you want to delete. Only the zone borders go away; the contents of the page image remain.

Assigning Zone Contents Files
For better recognition accuracy, you can assign zone contents files to various zones that you draw in a page image. For example, if your image has a paragraph of text followed by a table of numbers, you can draw separate zones around each and assign an alphanumeric zone contents file to the paragraph and a numeric zone contents file to the table.
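
One way to picture why this helps: each zone carries a set of permitted characters, and a reading outside that set is not a valid result for the zone. The Python sketch below is a simplified model under stated assumptions, not OmniPage's recognition engine; the names `ALPHANUMERIC`, `NUMERIC`, and `pick_candidate`, and the exact character sets, are hypothetical.

```python
import string

# Hypothetical character sets standing in for zone contents files.
ALPHANUMERIC = set(string.ascii_letters + string.digits + string.punctuation + " ")
NUMERIC = set(string.digits + ".,-+$%() ")


def pick_candidate(candidates, allowed):
    """Return the first candidate character that the zone permits.

    `candidates` lists possible readings of one glyph, most likely first,
    as an OCR engine might rank them (an assumption made for this sketch).
    """
    for char in candidates:
        if char in allowed:
            return char
    return None  # no reading fits the zone's character set


# An ambiguous glyph that could be the letter "O" or the digit "0".
print(pick_candidate(["O", "0"], ALPHANUMERIC))  # O
print(pick_candidate(["O", "0"], NUMERIC))       # 0
```

In the numeric zone the letter reading is ruled out, so the digit wins even though it was ranked second.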
To assign zone contents files to zones:
1 After drawing zones manually, click within a zone to select it.
2 Select the appropriate zone contents file from the Zone Contents drop-down list box.
3 Repeat steps one and two for any other zones you wish. You can change a zone contents assignment at any time before recognition.

You can change the Manual Zones command to Auto Zones in the Process Settings cascading menu or the Zone button drop-down list box. If you use the OmniPage Professional version, you can also change this command to Use Template... in the Process Settings cascading menu or select a zone template directly from the Zone button drop-down list box.

Use Template... (Professional version only)
Choose Use Template... to create zones that determine what will be recognized in the page image by applying a zone template file (*.zon). This is a quick and efficient means of zoning similar documents.

A zone template file is comprised of various zone attributes such as position, order, and zone contents. If you frequently process documents with layouts and content that require the same type of zoning, you can save a zone template and apply it to all such documents.

Note: You can create zones manually and save them as a template using the Save Zone Template... command in the File menu. For more information on creating zones manually, see Manual Zones on page 2-46.

When you choose Use Template..., a dialog box appears listing all zone template files in the Data directory.

To apply a zone template:
1 Click the zone template (*.zon) to use for the current page image. The selected file is highlighted.
2 Click OK. Zones are drawn on the page image according to the selected zone template.
Click Cancel to exit without applying the zone template.

You can also select a zone template directly from the Zone button drop-down list.

You can change the Use Template... command to Manual Zones or Auto Zones in the Process Settings cascading menu or the Zone button drop-down list.
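
As a rough illustration of what a zone template holds (position, order, and zone contents for each zone), the Python sketch below stores a list of zones as JSON and reads it back. The actual *.zon file format is not described in this manual, so the JSON layout, the `TemplateZone` class, and the function names are all hypothetical. The limits of 64 zones and 26 graphic zones are the ones quoted in the Manual Zones entry.

```python
import json
from dataclasses import dataclass, asdict
from typing import List

MAX_ZONES = 64           # limits quoted in the Manual Zones entry
MAX_GRAPHIC_ZONES = 26


@dataclass
class TemplateZone:
    left: int
    top: int
    right: int
    bottom: int
    order: int           # recognition order
    contents: str        # e.g. "alphanumeric", "numeric", or "graphic"


def validate(zones: List[TemplateZone]) -> None:
    """Raise ValueError if the zone list exceeds the documented limits."""
    graphic = sum(1 for z in zones if z.contents == "graphic")
    if len(zones) > MAX_ZONES:
        raise ValueError(f"too many zones: {len(zones)} > {MAX_ZONES}")
    if graphic > MAX_GRAPHIC_ZONES:
        raise ValueError(f"too many graphic zones: {graphic} > {MAX_GRAPHIC_ZONES}")


def dump_template(zones: List[TemplateZone]) -> str:
    """Serialize zones to JSON text (a stand-in for saving a template)."""
    validate(zones)
    return json.dumps([asdict(z) for z in zones], indent=2)


def load_template(text: str) -> List[TemplateZone]:
    """Rebuild zones from JSON text, sorted by recognition order."""
    zones = [TemplateZone(**item) for item in json.loads(text)]
    validate(zones)
    return sorted(zones, key=lambda z: z.order)


invoice = [
    TemplateZone(50, 40, 550, 120, order=1, contents="alphanumeric"),
    TemplateZone(50, 140, 550, 700, order=2, contents="numeric"),
]
saved = dump_template(invoice)
print(load_template(saved) == invoice)
```

Because the template stores only zone attributes and no page content, the same saved text can be applied to every page that shares the layout.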
Perform OCR
Choose Perform OCR to recognize text on the current page. This command performs the same function as the OCR button when Perform OCR is selected from the drop-down list.

Before performing OCR, make sure the appropriate OCR options are selected in the Settings Panel. Use your right mouse button to click the OCR button and automatically open the Settings Panel to OCR options.

If there are no zones on the page when you select Perform OCR, OmniPage automatically creates zones according to the selected Zone command. If Manual Zones is currently selected, OmniPage ignores this and draws zones automatically.

If you use the Professional version, you can change the Perform OCR command to Defer OCR or Train OCR in the Process Settings cascading menu or OCR button drop-down list.

Defer OCR (Professional version only)
Choose Defer OCR to delay text recognition of one or more pages of the document you are processing. This command performs the same function as the OCR button when Defer OCR is selected from the drop-down list box.

For example, you can select Scan Image, Auto Zones, and Defer OCR as the Process Settings commands and then choose Auto to initiate automatic processing. Each page of your document will be scanned and zoned but recognition will be deferred.

At your convenience, you can choose Finish Current Document... in the Process menu to finish processing your open document. Or, set OmniPage to recognize the deferred document at a specified time by choosing Finish Deferred Documents... in the Process menu.

You can change the Defer OCR command to Perform OCR or Train OCR in the Process Settings cascading menu or OCR button drop-down list box.

Train OCR (Professional version only)
Choose Train OCR to create a character training file (*.trn) that assists OmniPage during text recognition and allows better recognition of special characters. This command performs the same function as the OCR button when Train OCR is selected from the drop-down list.
A character training file is a set of pre-recognized text characters that OmniPage compares with the characters in the page image during recognition. Before recognizing an image, you can create a new training file or choose an existing one in the Settings Panel OCR options. For step-by-step instructions on training OCR, see Documents with Specialized Characters (Professional version only) on page 2-50.

The Train Characters dialog box shows the original image and OmniPage's interpretation of each character in the page image. Click any character to select it for training.

[Figure: Train Characters dialog box, showing the original image of each character and OmniPage's interpretation.]

Specify
Select a character and click Specify (or double-click the character) to open the Specify Character dialog box. This shows the selected character in the context of the original page image.

[Figure: Specify Character dialog box.]

You can associate character(s) for the selected character bitmap. Type the desired character(s) in the Character edit box or select a character in the Extended ANSI list box and click OK.

Delete
To discard a previously specified character, select it and click Delete.

Append
Click Append to add the current set of trained characters to another training file. A dialog box appears displaying a list of existing character training files. Click the file you wish to append and then click OK.

Save
Click Save to save the trained characters to a file; a dialog box appears. Name the file and click OK. If you name an existing file, you will be asked if you want to replace it with the current file. Training files are saved to the Data directory; this is the default directory that OmniPage creates during installation.

To create a character training file:
1 Open an image file or scan an image that includes the characters you want to train.
2 Select the appropriate zones and choose Train OCR in the Process menu or OCR button. The Train Characters dialog box appears.
3 Specify characters in the dialog box and edit them as desired.
4 Click Save to name and save a character training file for the characters you have trained. Click Append to add the trained characters to an existing training file. Click Cancel to exit without saving the training file.
A dialog box gives you the option of recognizing your page image and making this the current training file after saving or appending the file. Click Yes to recognize your page image and apply the training file you just created. Click No if you want to return to the OmniPage screen without recognizing the image.

You can change the Train OCR command to Perform OCR or Defer OCR in the Process Settings cascading menu or OCR button drop-down list box.

Process Settings
Choose Process Settings to access the image, zone, and OCR processing commands. These commands are also accessible in the Image, Zone, and OCR button drop-down list boxes.

[Figure: Process Settings cascading menu, with callouts for the selected image command, the selected zone command, and the selected OCR command.]

In the Process Settings cascading menu, you can select:
Scan Image or Load Image
Auto Zones, Manual Zones, or Use Template... (Professional version only)
Perform OCR, Train OCR (Professional version only), or Defer OCR (Professional version only)

The currently selected Process Settings commands determine what image, zone, and OCR operations can be performed. OmniPage also uses the selected commands for automatic processing. For more information on each Process Settings command, refer to its respective Process menu entry in this chapter.

Finish Current Document...
Choose Finish Current Document... to automatically finish recognition processing of an open document.
For example, you can scan pages and create zones in a multipage document without taking the time to recognize it. Later, at your convenience, you can choose Finish Current Document... to recognize the entire document.

Note: OmniPage uses the currently selected Settings Panel options to finish processing your document.

Save Options
For your convenience, you can select Save Options in the Finish Current Document dialog box. This way, OmniPage automatically finishes your document and then saves it to your specifications.

Select Convert Automatically to save your document in a preselected file format after recognition. Click Save Output to... to open the Save As dialog box and select specific options for saving your document.

Select Delete Caere Document when Finished to automatically discard the Caere Document after recognition. Your document will only be saved in the file format that you select. Remember, only a Caere Document can be reopened (and resaved in a different format) in OmniPage.

Note: If the Caere Document is deleted, you can't reopen or edit your recognized document in OmniPage.

To finish the current document:
1 Open the document you want to finish, if it is not already open in OmniPage.
2 Choose Finish Current Document... in the Process menu.
3 Select the appropriate Save options.
4 Click OK to begin processing immediately. Every page of the document will be processed. If a page does not have zones, OmniPage automatically creates zones using the selected Settings Panel Zones option.
Click Cancel to exit without processing the current document.

Finish Deferred Documents... (Professional version only)
Choose Finish Deferred Documents... to automatically finish recognition processing of up to 255 documents at a specified time.

A document with one or more unrecognized pages can be saved as a Caere Document. You can defer recognition of a page by choosing Defer OCR in the Process menu or the OCR button drop-down list box.
For more information on this feature, see the Defer OCR entry in this chapter.

Note: OmniPage uses the currently selected Settings Panel options to process your documents.

Save Options
For your convenience, you can select Save Options in the Finish Deferred Documents dialog box. This way, OmniPage automatically finishes your deferred documents and then saves them to your specifications.

Select Convert Automatically to save your document in a preselected file format after recognition. Click Save Output to... to open the Save As dialog box and select specific options for saving your document.

Select Delete Caere Document when Finished to automatically discard the Caere Document after recognition. Your document will only be saved in the file format that you select. Remember, only a Caere Document can be reopened (and resaved in a different format) in OmniPage.

Note: If the Caere Document is deleted, you can't reopen or edit your recognized document in OmniPage.

When to Recognize Options
Select Now to process the document(s) as soon as you click OK.
Select Later to process the document(s) at another specified time. Select a time (hour and minute) from the drop-down list box.

To finish deferred documents:
1 In the Finish Deferred Documents dialog box, click Add Files.... The Open dialog box appears.
2 Locate your files. Caere Document files in the specified directory will appear in the list box.
3 Select a file you want to open and click Add, or double-click the file. The file appears in the Selected Files list box.
4 Continue to select the deferred files that you want to finish. You can choose files from various directories. If you change your mind about a file, select it and click Remove.
5 When you have selected the desired file(s), click OK. The Finish Deferred Documents dialog box reappears.
6 Select the appropriate Save Options and When to Recognize Options.
7 Click OK to recognize the deferred documents as specified.
Each document is opened, processed, saved as specified, and then closed. If you do not specify any save options, a document will be saved to its original file name.
Click Cancel to exit without recognizing the deferred documents.

Start Image Assistant (Professional version only)
Choose Start Image Assistant to launch the Image Assistant 24-bit color and image-editing program. This command is also available as a button in the toolbar.

Note: You can also launch Image Assistant by double-clicking a graphic zone in your recognized document. The graphic will appear in a new image window.

With Image Assistant, you can scan and edit color, grayscale, and line-art images. Image Assistant provides a broad range of feature-rich tools and capabilities for sophisticated image control by experienced users. For casual users, the Assist Mode provides the most commonly used features in a simplified format.

For more information about Image Assistant, see the Image Assistant tutorials booklet and the on-line documentation in the Image Assistant Help program.

The Settings Menu
The Settings menu lets you modify and set system-wide settings. Settings menu commands include:
Settings Panel...
Select Scanner...
Select Languages...
Edit Training File...
Edit Zone Contents File...
Edit User Dictionary...

OmniPage retains the most recently selected system settings. For example, if you select Spanish as the language, OmniPage will use the Spanish character set for recognition until you change it.

Settings Panel...
Choose Settings Panel... to open the Settings Panel. This command is also available as a button in the toolbar.

The Settings Panel is the central location for settings OmniPage uses to process your documents. Using the icons in the scroll box on the left side of the Settings Panel, you can access six different sets of options.
Click the Scanner icon to select options that control how your scanner scans a page.
Click the Zones icon to select the zoning option that determines the flow of text during recognition.
Click the OCR icon to select input and output options that assist OmniPage during recognition and determine the format of the recognized document.
Click the Fonts icon to select retained or ignored font format options.
Click the Spelling icon to select dictionaries and spell checking options.
Click the Preferences icon to select options that customize general OmniPage operations.

The Settings Panel changes to reflect the options of the icon that you click. For more information about these options, see Chapter 2, The Settings Panel. For a guided tour of the Settings Panel, see Touring the Settings Panel on page 2-15.

Select Scanner...
Choose Select Scanner... to set the current system scanner.

[Figure: Select Scanner dialog box listing available scanners, including Abaton, Apple, Agfa, Brother, and Canon models.]

To select a scanner:
1 Scroll through the list of available scanners and click the preferred scanner.
2 Click OK to set the chosen scanner.
Click Cancel to exit without setting the scanner.

Certain scanners require additional parameters such as port address and speed. For these particular scanners, you will be prompted for the appropriate information. See your scanner documentation for more information.

Select Languages...
Choose Select Languages... to select one or more language character sets for text recognition.

OmniPage can recognize additional characters (such as circumflexes, umlauts, etc.) unique to a particular language; eleven languages are available. You may select more than one language at a time, but for faster recognition, use only the minimum number of languages that are necessary. Select the language that matches the language of your main dictionary selection. Also, be sure to select only one language if you use the Language Analyst or 3D OCR.
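
The advice to keep the number of languages small can be pictured as a growing set of candidate characters: every selected language adds its own accented characters to what must be considered for each glyph. The Python sketch below only illustrates that idea. The per-language character lists and the `candidate_characters` function are assumptions made for the example and do not reflect OmniPage's actual language tables.

```python
import string

# Unaccented characters assumed to be common to every language in this sketch.
BASE = set(string.ascii_letters + string.digits + string.punctuation)

# Illustrative accented characters only; not OmniPage's actual language tables.
EXTRA = {
    "German": set("äöüÄÖÜß"),
    "French": set("àâçéèêëîïôùûÀÂÇÉÈÊËÎÏÔÙÛ"),
    "Spanish": set("áéíñóúüÁÉÍÑÓÚÜ¿¡"),
}


def candidate_characters(languages):
    """Return the union of the base set and each selected language's extras."""
    chars = set(BASE)
    for language in languages:
        chars |= EXTRA[language]
    return chars


one = candidate_characters(["Spanish"])
three = candidate_characters(["German", "French", "Spanish"])
print(len(one), len(three))
```

Selecting three languages yields a larger candidate set than selecting one, which is the intuition behind choosing only the languages a document actually needs.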
To select one or more languages:
1 Select the preferred language(s) from the list box by clicking once. Selected languages are highlighted. To deselect a language, click it again.
2 Click OK.
Click Cancel to exit without setting the selected language(s).

Edit Training File... (Professional version only)
Choose Edit Training File... to edit an existing character training file.

A character training file (*.trn) is a set of pre-recognized text characters that OmniPage compares with the characters in the page image during recognition. Training files assist OmniPage during text recognition and allow better recognition accuracy of special characters. Before recognizing an image, you can create a new training file or choose an existing one in the Settings Panel OCR options.

When you choose Edit Training File..., a dialog box appears listing all training files in the Data directory. Click the file you want to edit and then click OK. The Train Character dialog box shows the existing characters in the training file, including the original images and the associated characters.

[Figure: Train Character dialog box, with callouts for the original image and the associated characters.]

See Documents with Specialized Characters (Professional version only) on page 2-50 for detailed information on how to create and edit a training file.

Specify
Select a character and click Specify (or double-click the character) to open the Specify Character dialog box.

[Figure: Specify Character dialog box. Callout: "at" is typed in so that OmniPage will ...]

You can change the character(s) associated with the selected image bitmap of the character. Type in the desired character(s) in the Character edit box or select a character in the Extended ANSI list box. Click OK to mark the character to be saved.

Delete
To discard a trained character from the training file, select it and click Delete.

Append
Click Append to add the current set of trained characters to another character training file. A dialog box appears displaying a list of existing character training files.
Select the file you wish to append and click OK. Save Click Save to save edits to the trained character file. To edit a character file: Select the character file you want to edit from the dialog box and click OK, or double-click the file. Edit the characters in the Train Character dialog box as desired. Click Save to save the edited training file and return to the OmniPage screen. Click Append to add the trained characters to another file. Click Cancel to exit without saving the edits to the training file. Edit Zone Contents File... Choose Edit Zone Contents File... to create a new zone contents file or edit an existing file. A zone contents file (*.zcn) lets you identify the specific characters that OmniPage looks for within specified zones during recognition. OmniPage is shipped with numeric, graphic, and alphanumeric zone contents files. All zone contents files appear in the Zone Contents drop-down list in the zone window. When you draw zones manually in a page image, you can improve the quality and accuracy of recognition by identifying each zone's contents. For example, if you have a paragraph of alphanumeric text followed by a numeric table, you can draw separate zones and assign an alphanumeric zone contents file to the paragraph and a numeric zone contents file to the table. When you choose Edit Zone Contents File..., a dialog box appears that lists all zone contents files in the Data directory. You can select an existing file to edit or create a new one. To create or edit a zone contents file: To edit an existing file, click the file name in the File list box and then click OK. To create a new file, click New. A dialog box appears containing a list box of the extended ANSI character set and an edit box containing the characters in the zone contents file. If you selected New, the edit box contains the 94-character (typical keyboard) ASCII character set.
Edit the contents of the edit box by typing in characters you want to add to the file and deleting (with your Backspace or Delete keys) undesired characters. To add a character from the extended ANSI character set, double-click the character in the list box; the ANSI character appears in the edit box. Click Reset to replace the contents of the edit box with the ASCII character set. Click Save to save your changes. A dialog box prompts you to name the file if it is new. Click Cancel to exit without saving any changes. Edit User Dictionary... Choose Edit User Dictionary... to create a new user dictionary (*.ud) or edit an existing one. A dialog box appears listing all user dictionary files in the Data directory. To edit an existing dictionary, select it and click OK. To create a new dictionary, click New; you are prompted to enter a name. Whether you are creating a new dictionary or editing an existing one, the Edit User Dictionary dialog box appears. If you are editing an existing user dictionary, the dialog box lists all the words currently in that dictionary. If you are creating a new dictionary, no words are listed. Use the buttons in the Edit User Dictionary dialog box to create or edit your dictionary. Add Click this to add the word that you type in the User word edit box to your dictionary. The word will appear in the list box. Delete Click this to delete a selected word from the dictionary. Purge Click this to delete all words from the dictionary. Import... Click this to add words from another application to your user dictionary. For example, you may want to add technical terms from a particular file. The Import Text File dialog box prompts you to enter the file name and directory of the file you want to import. An imported text file can be any document or word list in ASCII format. Most word processors can convert a file into ASCII format; see your program's documentation.
OmniPage will go through the selected text file, discard words already in the main or other user dictionaries, and add the remaining words to your current user dictionary. Export... Click this to save your user dictionary as a text file. The Export To Text File dialog box prompts you to enter a file name and destination for your file. Save As... Click this to save an edited dictionary with a new name. User dictionaries are automatically saved with a .ud file extension. Done Click this to save edits to your dictionary and then exit. The Window Menu The Window menu provides options for viewing the OmniPage screen and your document. Window menu commands include: Tile Horizontal, Tile Vertical, Cascade, Arrange Icons, Hide/Show Toolbar, Hide/Show Status Bar, Hide/Show Ruler, Zone Window, Text Window, Zoom In, and Zoom Out. Tile Horizontal Choose Tile Horizontal to resize the open zone and text windows so they fit in the window area horizontally. To switch windows, click the window that you want to activate. Tile Vertical Choose Tile Vertical to resize the open zone and text windows so they fit in the window area vertically. To switch windows, click the window that you want to activate. Cascade Choose Cascade to arrange the open zone and text windows one on top of the other with title bars showing. To switch windows, click the title bar of the window you want to activate. Arrange Icons Choose Arrange Icons to organize minimized window icons at the bottom of the screen. Click the Minimize button in the upper-right corner of the window to iconize the open zone and text windows. Hide/Show Toolbar Choose Hide Toolbar to hide the toolbar. Choose Show Toolbar to view the toolbar again. Hide/Show Status Bar Choose Hide Status Bar to hide the status bar located at the bottom of the window. Choose Show Status Bar to view the status bar again.
Hide/Show Ruler Choose Hide Ruler to hide the text window ruler. Choose Show Ruler to view the ruler again. Zone Window Choose Zone Window to bring the zone window into view. Text Window Choose Text Window to bring the text window into view. Zoom In Choose Zoom In to enlarge an area of an image in the zone window for a close-up view. When an image is opened, it is fit to the zone window. You can zoom in three more levels. Zoom Out Choose Zoom Out to decrease an enlarged view of an image in the zone window. The Help Menu The Help menu provides access to the OmniPage online Help program and information about OmniPage. Help menu commands include: Contents, Procedures, Using Help, and About... Contents Choose Contents for a list of the topics available in the OmniPage Help program. The Help program conforms to the Windows Help standard. Procedures Choose Procedures for a Help listing of procedures for different OmniPage tasks. Using Help Choose Using Help for instructions on using the Help program. About... Choose About... to see information about the OmniPage version you are using, any copyrights in effect, the program's licensee, company name, and serial number. Chapter 4 The Settings Panel This chapter explains how to use the Settings Panel: the central location for the settings OmniPage uses to process your documents. The Settings Panel includes: Scanner options, Zones options, OCR options, Fonts options, Spelling options, and Preferences options. You should make sure that the Settings Panel options are set appropriately for your document before you begin any OmniPage operation. Some options are only available with the OmniPage Professional version; these are marked as "Professional version only." For a guided tour of the Settings Panel, see Touring the Settings Panel on page 2-15. Settings Panel Overview To open the Settings Panel, choose Settings Panel... in the Settings menu or click the Settings Panel button in the toolbar.
Using the icons in the scroll box on the left side of the Settings Panel, you can access six different sets of options. Click the Scanner icon to select options that control how your scanner scans a page and the way an image file is loaded. Click the Zones icon to select the zoning option that determines the flow of text during recognition. Click the OCR icon to select input and output options that assist OmniPage during recognition and determine the format of the recognized document. Click the Fonts icon to select font format options for retaining or ignoring the original font styles. Click the Spelling icon to select dictionaries and spell checking options. Click the Preferences icon to select options that customize general OmniPage operations. The Settings Panel changes to reflect the options of the icon that you click. You can select options and then click Close or leave the Settings Panel open as a floating window. The options selected last are retained until you select new ones. Selecting Settings Panel Options There are three ways to select Settings Panel options: Manual selection Click each Settings Panel icon and select options manually. You can change your selections at any time. Use Defaults button Click the Use Defaults button in the Settings Panel to reset all the Settings Panel options to the default values. Load Settings command Choose Load Settings... in the File menu to select a previously saved settings file (*.set). A loaded settings file automatically sets the Settings Panel options and language selection(s) to preselected values. You can save Settings Panel selections to a settings file by choosing Save Settings... from the File menu. Disk space is the only limit for the number of settings files you can save. Scanner Options Click the Scanner icon in the Settings Panel to select options that control the way your scanner scans a page. Note: Use your right mouse button to click the Image button in the toolbar and automatically open the Settings Panel to Scanner options.
Select Page options to describe page size and orientation. Size The Size drop-down list box lets you select the dimensions of the pages you are scanning. Select Letter for 8.5" by 11" size pages. Select Legal for 8.5" by 14" size pages. Select A4 for 21 cm by 29.7 cm European-size pages. Orientation The Orientation drop-down list box lets you select the orientation of the pages you are scanning or page images you are loading. If you are scanning, be sure to position the pages correctly in the scanner. Select Portrait for a vertically-oriented page. Select Landscape for a horizontally-oriented page. Select Flipped to automatically rotate a portrait page 180 degrees during the scan. Select Flipscape to automatically rotate a landscape page 180 degrees during the scan. Note: Flipped and Flipscape options are useful if you are scanning pages in a book and have trouble positioning the book in the scanner for certain pages. You can select Scan until Empty and Double-sided Pages for automatic processing if you are using a scanner with an automatic document feeder (ADF). Scan until Empty Select this to scan every page in the ADF when OmniPage performs automatic processing. For example, if you put multiple pages in the ADF and click the AUTO button, the first page will be scanned and then processed according to the selected zone and OCR commands. The next page will then be scanned and processed in the same manner. This process will continue until the ADF is empty. If you do not select Scan until Empty, OmniPage will only scan the first page in the ADF and you will need to click the AUTO button again to process the next page. Double-sided Pages Select this when OmniPage performs automatic processing to scan pages that are printed on both sides. OmniPage will process the batch of pages in the ADF and then prompt you to turn the entire batch over to process the reverse sides.
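The page ordering that Double-sided Pages produces can be sketched in a few lines of Python. This is only an illustration, not OmniPage's actual implementation: the function name and the list-based representation of pages are assumptions. The sketch assumes the front sides are scanned in order and that, once the batch is turned over, the reverse sides arrive last page first.

```python
def interleave_double_sided(fronts, backs_as_scanned):
    """Merge two scanning passes into reading order.

    fronts           -- pages from the first pass, in scan order
    backs_as_scanned -- pages from the second pass, in scan order
                        (assumed to be reversed, because the whole
                        batch was turned over in the feeder)
    """
    if len(fronts) != len(backs_as_scanned):
        raise ValueError("both passes must contain the same number of pages")

    # Undo the reversal introduced by turning the batch over.
    backs = list(reversed(backs_as_scanned))

    # Alternate front, back, front, back, ...
    ordered = []
    for front, back in zip(fronts, backs):
        ordered.append(front)
        ordered.append(back)
    return ordered
```

With a first pass of [1, 3, 5] and a second pass of [6, 4, 2], the function returns [1, 2, 3, 4, 5, 6].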
For example, if you have three double-sided pages numbered 1 through 6 (1 is on the front, 2 is on the back, and so on), OmniPage first processes pages 1, 3, and 5 and then prompts you to turn the batch over in the ADF. It then processes pages 6, 4, and 2. The resulting file consists of pages 1, 2, 3, 4, 5, and 6 in the correct order. You can divide a large batch of pages into several sections for processing. For example, if you divide a large batch into two sections, OmniPage processes one side of the first section and then the reverse side. It repeats this procedure with the second section and then appends its pages to those of the first section in the appropriate order. If you want to later save each section as a separate file, insert blank pages as separators. Note: If you use a flatbed scanner without an ADF, do not select Double-sided Pages. Place the pages in the scanner in the order that you want them to be scanned. Select a brightness setting for scanning your page. Use these options to account for variations in paper and print quality in much the same way you would adjust brightness on a copier. Depending on the quality of your page, the option you choose greatly affects recognition accuracy. You can select: 3D OCR with AnyPage/HP AccuPage 2, Auto Brightness with AnyPage/HP AccuPage 2, or Manual Brightness. Note: The option that appears, AnyPage or HP AccuPage 2, depends on your scanner. HP AccuPage 2 is available with HP IIp, IIc, and IIcx scanners. AnyPage is available with all other supported grayscale scanners. 3D OCR with AnyPage/HP AccuPage 2 (Professional version only) Select this to combine 3D OCR and AnyPage/HP AccuPage 2 technologies to get the best scanned image and highest recognition accuracy possible. This option is only available with supported grayscale scanners. AnyPage/HP AccuPage 2 technology automatically adjusts an image to get the optimum brightness level for each area of text and graphics on a page.
3D OCR uses the grayscale information of a page during recognition to view characters clearly and completely. This combination of technologies delivers the greatest accuracy OmniPage can achieve. Use 3D OCR and AnyPage/HP AccuPage 2 for all kinds of pages whenever you want the best possible recognition results. This setting is especially useful when you scan poor quality pages, pages with very small type, or pages with text on colored or shaded backgrounds. 3D OCR and AnyPage/HP AccuPage 2 is slower than other settings. If you scan high-quality documents with crisp text on a white background, select Manual Brightness for the fastest results. 3D OCR adds 150 to 250K per page to the size of the image file. If you are scanning many pages, you may want to use the Auto Brightness with AnyPage/HP AccuPage 2 or Manual Brightness setting to save disk space. Auto Brightness with AnyPage/HP AccuPage 2 Select this to use AnyPage/HP AccuPage 2 technology to get high-quality scanned images and high recognition accuracy. This option is only available with supported grayscale scanners. AnyPage/HP AccuPage 2 technology automatically adjusts an image to get the optimum brightness level for each area of text and graphics on a page. This setting works well for most pages and is especially useful when you scan text on colored or shaded backgrounds. Auto Brightness with AnyPage/HP AccuPage 2 is slower than a manual setting. If you scan high-quality documents with crisp text on a white background, select a Manual Brightness setting for the fastest results. Manual Brightness Select this to manually adjust (lighten or darken) the brightness setting for scanning a page. The setting you choose is applied to the entire page area. Manual Brightness is the fastest setting if you scan high-quality documents with crisp text on a white background. However, recognition accuracy is highest using AnyPage/HP AccuPage 2 and 3D OCR technologies.
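When weighing accuracy against disk space, the 150 to 250K that 3D OCR adds per page can be turned into a quick estimate for a whole job. The sketch below is illustrative Python and not part of OmniPage; the function name is hypothetical, and it assumes the overhead grows linearly with the number of pages.

```python
def estimate_3d_ocr_overhead_kb(pages, low_kb=150, high_kb=250):
    """Return (minimum, maximum) extra image data in K for a scan job.

    The per-page range defaults to the 150 to 250K figure quoted for
    3D OCR; linear scaling with page count is an assumption.
    """
    if pages < 0:
        raise ValueError("pages must not be negative")
    return pages * low_kb, pages * high_kb
```

For a 40-page job this gives between 6,000K and 10,000K of additional image data.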
To adjust brightness, select the square in the slide, hold the mouse button down, and move the square to lighten or darken the setting. Or, click the left or right arrow on the slide. The number of settings available depends on the scanner you use. Use a setting in the middle to scan high-quality documents with crisp text on a white background. Use a darker setting for a page that has thin, broken characters. Use a lighter setting for a page that has thick, run-together characters. The number in the edit box to the right of the slide quantifies the brightness level you select. Use this number as a reference for future documents. To evaluate the effectiveness of the brightness setting, watch the Character Window as OmniPage performs text recognition. Look for clear, legible text samples. Small Text If your scanner supports HP AccuPage 2, you can select the Small Text option for better recognition of text with small point sizes. Select this option to increase recognition accuracy if the text in your page image is between four and seven points. The Small Text option slightly increases processing time. Zones Options Click the Zones icon in the Settings Panel to select the zoning method that determines the flow of text during recognition. Use your right mouse button to click the Zone button in the toolbar and automatically open the Settings Panel to Zones options. Regardless of how zones are created on a page image (manually, automatically, or with a template), OmniPage uses the selected Zones option in the Settings Panel to determine the flow of text within each zone. OmniPage also uses the selected Zones option to draw and order zones on the page image when you choose the Auto Zones command in the drop-down list under the Zone button in the toolbar.
For practical examples of choosing the best zoning method for various types of documents, please see Chapter 2, Tutorials. Multiple Columns Select Multiple Columns if you want OmniPage to discern the column layout, determine the order of text, and distinguish graphics from text. This works well with most types of documents and is especially useful for newspaper articles and magazine pages. Using this method, OmniPage looks for regular vertical separations of text to define columns and then recognizes column-wide text zones. It starts at the top of the first column, moves to the bottom, then continues to the top of the next column, "snaking" throughout the text. Unless you have the True Page feature (Professional version only) selected, the resulting recognized document displays the text in one column from beginning to end with any retained graphics at the bottom. Note: Select Retain Graphics in the Settings Panel OCR options when you select Multiple Columns; otherwise, graphics are discarded. Single Column or Table Select Single Column or Table if you want OmniPage to treat the entire page area as one column. This works well with documents such as spreadsheets, tables, financial forms, and memos. Using this method, OmniPage starts at the top of the page and moves to the bottom, outlining page-wide text zones. If OmniPage detects five or more spaces between columns, it assumes the page is in a spreadsheet format and inserts tabs as delimiters between the columns to preserve the format. You must draw zones manually, identify graphics with the Graphic zone contents file, and select Retain Graphics in the Settings Panel OCR options when you select Single Column or Table; otherwise, graphics are discarded. None Select None if you want OmniPage to recognize the entire page area as a single text zone. Using this method, OmniPage does not discern column layout or distinguish graphics from text. It tries to recognize everything it sees on the page as text elements.
None is the fastest option to use when you recognize manually drawn, text-only zones. It can also be useful for documents with very small text areas such as those found in pleading pages or telephone book pages. OCR Options Click the OCR icon in the Settings Panel to select input and output options that assist OmniPage during recognition and determine the format of the recognized document. Use your right mouse button to click the OCR button in the toolbar and automatically open the Settings Panel to OCR options. Input Options Input options determine the way OmniPage looks at text elements during recognition. You can specify a character type, select a training file (Professional version only), and have the page orientation automatically corrected. Character Type The Character Type drop-down list box lets you identify the printed text characteristics in your document. Select Automatic to have OmniPage automatically distinguish between conventional and dot-matrix printed text characters in the image you are recognizing. Select Normal if the image you are recognizing has conventionally printed text characters. Select Dot Matrix if the image you are recognizing has characters printed in draft mode by a 9-pin dot-matrix printer. Do not select Dot Matrix for pages printed in near-letter-quality mode or printed by a 24-pin dot-matrix printer. Select OCR-A if all the characters in the image you are recognizing are printed in OCR-A font. OCR-A is a special font used for items such as part numbers and utility bills. Note: If your document contains a mixture of OCR-A and a conventional font, select Normal or Automatic for faster recognition. Training File (Professional version only) The Training File drop-down list box lets you select a character training file (*.trn) that assists OmniPage with text recognition of special characters. Any training files that you create appear in this list.
A character training file is a set of pre-recognized text characters that OmniPage compares with the characters in the page image during recognition. Before recognition, you can create a new training file or choose an existing one to assist with OCR. For more information on creating a character training file, please see Train OCR (Professional version only) on page 2-52. Automatically Correct Page Orientation Select this to correct an improperly oriented image by 90, 180, or 270 degrees during text recognition. For example, if you have a portrait page that was accidentally scanned upside-down, OmniPage will try to rotate it 180 degrees during recognition so that it is properly oriented in the text window. Use Language Analyst Select Use Language Analyst so that OmniPage automatically performs word and character analysis during the recognition process to check spelling and replace unknown words with words that are most likely to be correct. The Language Analyst uses information about language context and usage rules to evaluate words, compute likely errors, and determine replacement words. Replacement words appear in blue in the recognized document. Be sure to select the appropriate main and user dictionaries when you use the Language Analyst. You should also make sure that the appropriate language is selected. If any words in your document such as company-specific terms are replaced inappropriately during recognition, you can: Make sure Ignore Acronyms, Ignore Abbreviations, and Ignore Proper Nouns are selected in the Settings Panel Spelling options so that these types of words will not be replaced. Then re-recognize the document. Create your own user dictionary for special terms and select it as the user dictionary in the Settings Panel Spelling options. Then re-recognize the document. Please see Edit User Dictionary... on page 2-68 for more information on user dictionaries. Deselect Use Language Analyst and re-recognize the document.
Use the Check Recognition command in the Edit menu to check for spelling errors and unknown words. Retain Graphics Select Retain Graphics if you want OmniPage to retain original graphics such as photographs or diagrams in the recognized document. To retain graphics, make sure to also: Select Save Page Images in Caere Document in the Settings Panel Preferences options before you scan or load an image. Select Multiple Columns as your Zones option so OmniPage will automatically distinguish graphics from text. Or, if you select Single Column or Table as your Zones option, create zones manually and identify graphics with the Graphic zone contents file. Select the True Page - Retain All Page Formatting OCR output option (Professional version only) to keep graphics in the same position in the recognized page that they were in the original page. If you do not use True Page, retained graphics are placed at the bottom of the recognized page. OmniPage Professional users can edit and save graphics in the Image Assistant 24-bit color and image-editing program. Double-click the graphic in your recognized document to launch Image Assistant; the graphic will appear in a new image window. You can scan and edit color, grayscale, and line-art images with Image Assistant. For more information, see the Image Assistant Tutorial and the online documentation in the Image Assistant Help program. Output Options Output options determine the way text, graphics, and formatting will appear in the recognized document. You can select: True Page - Retain All Page Formatting, Retain Font and Paragraph Formatting Only, or Ignore Fonts and All Formatting. Files saved in ASCII or ANSI format do not retain any formatting other than spaces and carriage returns. True Page - Retain All Page Formatting (Professional version only) Select this to reproduce the original page formatting and layout as closely as possible in the recognized document.
True Page technology retains the original paragraph and font formatting. It also preserves the layout of your original page by creating "frames" around areas of text and graphics. These frames are exported intact when you save your document in an appropriate file format and open it in another application that supports frame-based layouts. Use the True Page setting if you want to duplicate a document, such as a resume, as closely as possible and do not plan to do a lot of editing to it after recognition. If you do plan to modify your recognized document, choose Retain Font and Paragraph Formatting Only or Ignore Fonts and All Formatting. True Page attempts to reproduce the following page formatting attributes: relative text column positioning; relative graphic positioning (you must select Retain Graphics in the Settings Panel OCR options); margins; tabs; line spacing; indentation; justification; blank vertical space; centered lines; font styles (select specific fonts in the Settings Panel Fonts options); font sizes; and character attributes (boldface, italics, underline). Retain Font and Paragraph Formatting Only Select this to retain font and paragraph formatting in the recognized document. This option does not retain the original page layout; it formats recognized text in a single column. If you also selected Retain Graphics, any graphics in your document appear at the bottom of the page. This option attempts to reproduce the following page formatting attributes: margins; tabs; line spacing; indentation; justification; blank vertical space; centered lines; font styles (select specific fonts in the Settings Panel Fonts options); font sizes; and character attributes (boldface, italics, underline). Ignore Fonts and All Formatting Select this to ignore fonts and all formatting in the recognized document and use a universal font and font size instead.
Choose a font and font size for recognized text in the Ignored Font Formats section of the Settings Panel Fonts options. This option does not retain the original page layout; it formats recognized text in a single column. If you also selected Retain Graphics, any graphics in your document appear at the bottom of the page. Fonts Options Click the Fonts icon in the Settings Panel to select font format options for retaining or ignoring the original font styles. Retained Font Formats You can select fonts to map to the various font styles in your document if you choose True Page - Retain All Page Formatting (Professional version only) or Retain Font and Paragraph Formatting Only in the Settings Panel OCR options. OmniPage will detect the font styles of characters during recognition. Characters with a particular font style will be formatted in the recognized document according to the font selected for that style. For example, if you assign Arial to the Serif Proportional font style, characters in Times New Roman (a Serif Proportional style font) would be formatted in Arial in the recognized document. Use the drop-down list boxes to assign fonts for the following font styles: Serif Proportional Character spacing varies depending on each character; short lines finish off the letter strokes. Sans-Serif Proportional Character spacing varies depending on each character; letter strokes do not have finishing lines. Serif and Monospaced Character spacing is the same for each character; short lines finish off the letter strokes. Sans-Serif and Monospaced Character spacing is the same for each character; letter strokes do not have finishing lines. Ignored Font Formats You must select a universal font and font size for recognized text in your document if you choose Ignore Fonts and All Formatting in the Settings Panel OCR options. OmniPage will ignore the font styles of characters during recognition.
Instead, all of the characters will be formatted in the recognized document according to the font and font size you select. Select a font in the Font drop-down list box and type a font size in the Font Size edit box. Spelling Options Click the Spelling icon in the Settings Panel to select dictionaries and spell checking options. Dictionaries You can select one main dictionary and one user (personal) dictionary. OmniPage uses the selected dictionaries for checking recognition and the Language Analyst; be sure to always select the appropriate dictionaries for your document. Main Dictionary Select a main dictionary in the Main Dictionary drop-down list box. Main dictionaries have the file extension .ndx. OmniPage is delivered with the United States English main dictionary, useng.ndx, and the United Kingdom main dictionary, ukeng.ndx. International versions of OmniPage also include dictionaries for other languages. To order dictionaries for additional languages, call your local Caere distributor or call Caere at (800) 535-SCAN. User Dictionary Select a user (personal) dictionary from the User Dictionary drop-down list box. User dictionaries have the file extension .ud. To create a user dictionary or edit an existing user dictionary, choose Edit User Dictionary... from the Settings menu. For more information on creating and editing a user dictionary, please see Edit User Dictionary... on page 2-68. Spell Checking Options You can select the following spell checking options to be used by the Language Analyst and the check recognition process: Ignore Acronyms, Ignore Proper Nouns, and Ignore Abbreviations. Ignore Acronyms OmniPage will ignore entirely capitalized words of four characters or fewer (for example, HUD, USDA). Be sure to deselect Ignore Acronyms if you want the acronyms in your user dictionary to be checked or if you want to add acronyms to your user dictionary.
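The Ignore Acronyms rule can be approximated with a short check. This is a sketch under stated assumptions, not OmniPage's actual test: the function name is hypothetical, and it assumes a word consists of letters only, so strings containing digits or punctuation are never treated as acronyms.

```python
def is_ignored_acronym(word, max_length=4):
    """Return True if the word is entirely capitalized and short enough.

    max_length defaults to the four-character limit described for
    Ignore Acronyms; the letters-only requirement is an assumption.
    """
    return word.isalpha() and word.isupper() and len(word) <= max_length
```

Under this check HUD and USDA would be skipped by the spell checker, while a longer capitalized word such as UNESCO would still be checked.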
Ignore Proper Nouns OmniPage will ignore a word not beginning a sentence that has a capitalized first letter (for example, in He saw Jane throw... OmniPage ignores the name Jane). Be sure to deselect Ignore Proper Nouns if you want the proper nouns in your user dictionary to be checked or if you want to add proper nouns to your user dictionary. Ignore Abbreviations OmniPage will ignore a capitalized letter followed by three or fewer lowercase letters and a period (for example, Mrs., Dr., etc.). Be sure to deselect Ignore Abbreviations if you want the abbreviations in your user dictionary to be checked or if you want to add abbreviations to your user dictionary. Preferences Options Click the Preferences icon in the Settings Panel to customize general OmniPage operations. Save Page Images in Caere Document Select this option to save original page images in Caere Documents. An image is the "picture" of text and/or graphics that appears in the zone window when you scan a page or open a TIFF image file. To save the page image, make sure Save Page Images in Caere Document is selected before you scan or load a page image. You can reopen a Caere Document in OmniPage, make edits to recognized text, and save it in any other supported file format. However, you must save the original page images in a Caere Document in order to: retain graphics; verify recognized text with its original image; re-recognize pages; or defer recognition (Professional version only). Saving a Caere Document without page images allows quicker processing and saves disk space but does not allow any of the above operations. Prompt Before Deleting Pages Select this if you want OmniPage to prompt you before carrying out the Delete Page command. This gives you the option to cancel the operation before deleting a page. Save Settings on Quit Select this if you want to automatically save the current OmniPage settings when you exit the program.
The Settings Panel options, language selection(s), and scanner selection will be retained until you select new settings.

Reject Character

Unrecognizable characters are represented by a red reject character in the recognized document. In the Reject Character edit box, type in any character that you want to be the reject character. The default character is a tilde (~). For example, if OmniPage could not recognize the J in REJECT, and the tilde (~) was the reject character, the string RE~ECT would appear in your recognized document.

Chapter 5 Editing Recognized Documents

The OmniPage editor is designed for quick and efficient editing of any errors in your recognized document. It also has text editing and page formatting capabilities. Additionally, if you are using OmniPage Professional, you can use Image Assistant to edit graphics in your recognized document.

Remember that OmniPage is designed to be used in conjunction with word-processing and desktop publishing applications, not to replace them. Extensive editing of your recognized document is more efficient in the applications designed for that purpose. For example, you can recognize a document, correct any errors, make some text and formatting changes, and then save your document to another application to continue working with it.

This chapter discusses the factors that influence the output of your document, including:

• Choices Before OCR
• Editing Options After OCR
• Saving a Recognized Document

Choices Before OCR

The choices you make before performing OCR on your document have a significant impact on the resulting text format, page format, and accuracy. In particular, the following factors are important:

• OCR Output Options
• Font Options
• Retaining Graphics
• Language Analyst
• Language and Dictionary Selections

Output Options

The OCR output option that you select in the Settings Panel determines the way text and paragraph formatting will appear in your recognized document.
You can select True Page - Retain All Page Formatting (Professional version only), Retain Fonts and Paragraph Formatting, or Ignore Fonts and All Formatting.

Note: Regardless of the OCR output option you select, to retain graphics in your recognized document, you must also select Retain Graphics.

[Figure: selecting an OCR output option in the Settings Panel.]

True Page - Retain All Page Formatting (Professional version only)

Select True Page - Retain All Page Formatting as the OCR output option if you want the recognized document to match the original page layout as closely as possible. Select True Page formatting only if you want to preserve the original page layout. If you plan to do extensive editing, such as adding paragraphs, you should select another OCR output option.

True Page retains font characteristics, paragraph formatting, and the relative positioning of columns to match the original page layout. If you also selected Retain Graphics, graphics will appear in the same position as they were in the original page.

[Figure: sample recognized page.]

Retain Fonts and Paragraph Formatting

Select Retain Fonts and Paragraph Formatting as the OCR output option if you want your recognized document to retain the font characteristics and paragraph formatting of the original image. This option does not retain the original page layout; it formats recognized text in a single column. If you also selected Retain Graphics, any graphics in your document will appear at the bottom of the recognized page.

[Figure: sample recognized page.]

Ignore Fonts and All Formatting

Select Ignore Fonts and All Formatting as the OCR output option if you plan to do a lot of editing or reformatting of the text in your recognized document. OmniPage will remove the font characteristics and paragraph formatting. Recognized text will appear in the font that you select for Ignored Font Formats in the Settings Panel Font options. This option does not retain the original page layout; it formats recognized text in a single column. If you also selected Retain Graphics, any graphics in your document will appear at the bottom of the recognized page.
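
As a reading aid, the descriptions of the three OCR output options above can be restated as a small lookup table. The Python sketch below only summarizes what this chapter says; the names OUTPUT_OPTIONS and graphics_position are hypothetical and are not part of OmniPage.

```python
# Summary of the three OCR output options as described in this chapter.
# Every name here is illustrative; OmniPage exposes no such API.
OUTPUT_OPTIONS = {
    "True Page - Retain All Page Formatting": {
        "professional_only": True,
        "keeps_fonts": True,
        "keeps_paragraph_formatting": True,
        "keeps_page_layout": True,
        "graphics": "original position",
    },
    "Retain Fonts and Paragraph Formatting": {
        "professional_only": False,
        "keeps_fonts": True,
        "keeps_paragraph_formatting": True,
        "keeps_page_layout": False,  # text is output in a single column
        "graphics": "bottom of page",
    },
    "Ignore Fonts and All Formatting": {
        "professional_only": False,
        "keeps_fonts": False,
        "keeps_paragraph_formatting": False,
        "keeps_page_layout": False,  # text is output in a single column
        "graphics": "bottom of page",
    },
}


def graphics_position(option, retain_graphics):
    """Return where graphics appear, or None when Retain Graphics is off."""
    if not retain_graphics:
        return None
    return OUTPUT_OPTIONS[option]["graphics"]


for name in OUTPUT_OPTIONS:
    print(name, "->", graphics_position(name, retain_graphics=True))
```

The table makes one point easy to see: only True Page keeps the page layout, so it is the only option that leaves graphics where they were on the original page.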

[Figure: sample recognized page, "Paper Proliferates Despite Information Age Expectations".]

Font Options

The font choices that you make in the Settings Panel determine the appearance of text in your recognized document. If you select True Page or Retain Fonts and Paragraph Formatting, select fonts to map to the original page's font styles.

Depending on the OCR output option you select in the Settings Panel, you will select either Retained Font Formats or Ignored Font Formats in the Settings Panel Fonts options. The drop-down menus display all of the fonts installed on your system.

[Figure: Settings Panel Fonts options.]

If you select Ignore Fonts and All Formatting, select a universal font and font size for all text.

Retaining Graphics

OmniPage can retain graphics in the original page image, such as photos or diagrams, and display them in your recognized document. To do so, select Retain Graphics in the Settings Panel OCR options before recognition.

OmniPage Professional users can select True Page as the OCR output option so that graphics appear in their original position. Otherwise, graphics appear at the end of the recognized page.
To retain graphics, make sure to also:

• Select Save Page Images in Caere Document in the Settings Panel Preferences options before you scan or load an image.
• Select Multiple Columns as your Zones option so OmniPage can distinguish graphics from text. Or, if you select Single Column or Table, create manual zones around graphics and identify their zone contents with the Graphic zone contents file.

After recognition, OmniPage Professional users can edit retained graphics by launching Image Assistant directly from OmniPage. Simply double-click the graphic in your recognized document and Image Assistant launches with the graphic in a new image window. For more information about Image Assistant, please read the Image Assistant Tutorial booklet and refer to its online Help system.

You can save retained graphics individually or with the entire page image. For more information about saving graphics, see Export Image... on page 3-19.

Language Analyst

The Language Analyst uses information about language context and usage rules to evaluate characters and words during the recognition process. This method returns more accurate results than simply spell-checking after recognition is complete.

Select Use Language Analyst in the Settings Panel OCR options. Make sure to select the appropriate language, main dictionary, and user dictionary for the document you are recognizing. The Language Analyst uses the dictionaries to analyze and correct text during recognition. Words corrected by the Language Analyst appear in blue in your recognized document.

The Language Analyst shuts itself off automatically when it detects that the dictionary information is not improving recognition results. For example, if the main dictionary does not match the primary language of your document, language analysis will terminate.

If your original is very clean with crisp text, you may want to deselect the Language Analyst to increase recognition speed.
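
To make the idea of dictionary-assisted correction concrete, here is a toy Python sketch. It is not the Language Analyst, whose internals this manual does not describe; it only shows how a word list can be used to replace a misrecognized word with its closest dictionary entry. The word list, the function name correct_word, and the 0.8 similarity cutoff are all assumptions chosen for illustration.

```python
import difflib

# Hypothetical stand-in for a main dictionary such as useng.ndx.
DICTIONARY = ["information", "informal", "formation", "computer", "paper"]


def correct_word(word, dictionary=DICTIONARY, cutoff=0.8):
    """Return the closest dictionary word, or the word unchanged.

    This is plain string similarity; it ignores language context and
    usage rules, which the Language Analyst is said to use.
    """
    if word in dictionary:
        return word
    matches = difflib.get_close_matches(word, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


# "informahon" mimics an OCR confusion of "ti" with "h".
print(correct_word("informahon"))
# A word with no close dictionary entry is left alone.
print(correct_word("xyzzy"))
```

The second call hints at why a mismatched dictionary helps so little: when few words find a close entry, lookups cost time without changing the result.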

Languages and Dictionaries

For the best recognition results, be sure to select the appropriate language for your document and select main and user dictionaries specific to that language.

Languages

OmniPage supplies the appropriate characters (such as circumflexes, umlauts, etc.) for recognizing the following languages:

• English
• German
• French
• Italian
• Dutch
• Spanish
• Swedish/Finnish
• Portuguese
• Danish
• Norwegian
• Irish/Gaelic

Select one or more language character sets using the Select Language... command in the Settings menu.

[Figure: Select Language dialog box.]

For fastest recognition, select only the languages that your document requires. You should select only one language if you use the Language Analyst or 3D OCR (Professional version only).

See Foreign-Language and Multilingual Documents on page 2-56 for an explanation of how to recognize foreign-language and multilingual documents.

Dictionaries

Select the appropriate main and user dictionaries for your document in the Settings Panel Spelling options.

The Worldwide English version of OmniPage is delivered with the United States English main dictionary, useng.ndx, and the United Kingdom main dictionary, ukeng.ndx. To order dictionaries for additional languages, call your local Caere distributor or call Caere at (800) 535-SCAN.

You can create your own user dictionaries. To create a new user dictionary, follow these steps:

1 Choose Edit User Dictionary... in the Settings menu. The Select File dialog box opens.
2 Click New. The File to Save dialog box opens.
3 Type in a name of eight characters or fewer for the new dictionary and click OK. For example, if you were creating a French user dictionary, you might type fruser. OmniPage automatically appends a .ud extension. The Edit User Dictionary dialog box for the new dictionary opens.
You can add words to the dictionary directly or import words from a text file:

• Click Add to add the word that you type in the User word edit box to your dictionary. The word will appear in the list box.
• Click Import... to add words from another application to your user dictionary. A dialog box prompts you to enter the file name and directory of the file you want to import; it can be any document or word list in ASCII format. OmniPage will go through the selected text file, discard words already in the main or other user dictionaries, and add the remaining words to the new dictionary.

Note: You can also add words to your user dictionary interactively using the Check Recognition command after recognition.

Editing Options After OCR

After you recognize your document, it appears in the text window. At this point, you can check recognition, verify the image, do some page formatting and text editing, and save the document in the desired file format.

Overview of the Text Window

You can use various editing tools in the text window to edit your recognized document.

[Figure: the text window displaying a recognized document, with callouts for the line-spacing buttons, the alignment buttons, and the character-formatting buttons (bold, italics, underline).]

Checking Recognition

The appearance of the text in your recognized document indicates the overall results of recognition. Look for:

• Blue text: words corrected by the Language Analyst.
• Green text: suspects, or questionable characters, which OmniPage made an attempt to recognize.
• Red text: reject characters (~ is the default) representing unrecognizable characters.

Use the Check Recognition button or the Check Recognition... command in the Edit menu to identify possible OCR errors and misspellings in the recognized document. OmniPage uses the currently selected main and user dictionaries to check recognition.

When OmniPage finds a possible error, the Check Recognition dialog box shows the image bitmap of the error in the context of the original page image.
Note: To see character bitmaps, be sure Save Page Images in Caere Document is selected in the Settings Panel Preferences options before you scan or load an image.

[Figure: Check Recognition dialog box.]

The Change To drop-down list box provides a list of suggested replacements for a word flagged as an error. You can:

• Click Ignore to ignore the word in future instances.
• Click Change to change the word as suggested.
• Type in or select another word and click Change.
• Click Add to add the word to the user dictionary.

After you choose an editing option for a word, OmniPage automatically continues to the next possible error.

Verifying the Image

You can compare text in your recognized document with its original image in the Verification Window.

Note: In order to verify images, be sure Save Page Images in Caere Document is selected in the Settings Panel Preferences options before you scan or load an image.

To see its original image, simply double-click a word in the text window or choose Verify Image in the Edit menu. The Verification Window opens, displaying a clear close-up of the selected word and surrounding area of text.

[Figure: Verification Window.]

Click in the text window again to close the Verification Window.

Note: You cannot verify the image of text that is cut and pasted from one page to another.

Formatting the Page and Editing Text

OmniPage provides several tools to help you edit the text and page format in your recognized document. If you plan to make a lot of changes to the recognized text, however, it is generally more efficient to do so in your word-processing or desktop publishing application.

True Page Formatting (Professional version only)

If you are using the OmniPage Professional version and selected True Page - Retain All Page Formatting before recognition, the font characteristics and page layout of your recognized document should closely match the original image.
With True Page, OmniPage produces various text and graphic zones in the recognized page which you can resize or reposition to change the page layout.

Choose the Select Recognized Zones command in the Edit menu to select all of the text and graphic zones in the page. Handles appear on each zone.

[Figure: recognized True Page document with all zones selected and handles displayed on each zone.]

Note: You can also select zones individually by placing the mouse pointer inside a zone and holding down Alt while clicking the right mouse button.

To resize a selected text or graphic zone, use your mouse to drag a handle in the direction that you want to enlarge or reduce the zone.

To move a selected zone to another area of the recognized page, place the mouse pointer inside the zone, hold the mouse button down, and drag it to the desired location.

Choose Select Recognized Zones in the Edit menu again to deselect the zones in the recognized document.

Click in a selected zone and choose Delete Recognized Zone in the Edit menu to delete that zone.

Paragraph Formatting

You can change the line spacing and alignment attributes of a selected paragraph in your recognized document.
Note: Paragraph formatting will not be retained if you save a file in ASCII or ANSI format.

To apply paragraph formatting, follow these steps:

1 Place the cursor somewhere within the paragraph that you want to format.
2 Choose Paragraph... in the Format menu.
3 Make the desired formatting selections in the Paragraph Format dialog box. You can select Single, Double, or Triple line spacing. For paragraph alignment, you can select Left, Center, Right, or Justify.
4 Click OK to accept the formatting selections; the selected paragraph will change accordingly. Click Cancel to exit without applying the formatting selections.

You can also use the line-spacing and alignment buttons in the text window for formatting shortcuts.

Tab Formatting

Use the Tab-setting buttons in the text window to insert tabs in your recognized document.

[Figure: Tab-setting buttons: left-aligned, center-aligned, right-aligned, and decimal-aligned.]

To apply tab formatting, follow these steps:

1 Select the paragraph in which you want to set tab stops.
2 Click the appropriate Tab-setting button (left, center, right, or decimal).
3 Click the area in the upper half of the ruler where you want to place the tab stop.
4 Repeat steps 1 through 3 to continue setting tabs.

Character Formatting

You can change the attributes of a selected character or section of text in a recognized document.

Note: Character formatting will not be retained if you save a file in ASCII or ANSI format.

To apply character formatting, follow these steps:

1 Position the text cursor at the start of the text, hold the mouse button down, and drag the cursor across the text to highlight it.
2 Release the mouse button when you have selected the desired area of text.
3 Choose Character... in the Format menu.
4 Make the desired formatting selections in the Font dialog box. You can select multiple attributes for text, including font, font style, size, and effects.
The Sample box illustrates the attributes that you select.
5 Click OK to accept the formatting selections. The selected text changes accordingly. Click Cancel to exit without applying the formatting selections.

You can also use the Bold, Italics, and Underline buttons in the text window for convenient formatting shortcuts.

Other Useful Editing Commands

Choose Select All in Page in the Edit menu to select all the text in the text window. This is an easy way to apply formatting changes universally. To deselect a selected page, click anywhere in the text window or choose Select All in Page again.

Use the Cut button or the Cut command in the Edit menu to cut selected text in a recognized document.

Use the Copy button or the Copy command in the Edit menu to copy selected text in a recognized document.

Use the Paste button or the Paste command in the Edit menu to paste cut or copied text in a recognized document.

Use the Find/Replace button or the Find/Replace command in the Edit menu to find and replace words in a recognized document.

For more information about these commands, see their respective menu entries in Chapter 3, Commands and Settings.

Saving Your Document

Use the Save As... button or Save As... command in the File menu to save your recognized document to the desired file format. To save your recognized document in more than one file format, you can:

• Save the file as a Caere Document (*.met). By saving your document as a Caere Document, you can continue to reopen it in OmniPage, make edits, and save it in any other supported file format you wish. A Caere Document can have up to 255 pages; each page can include the original image, zones, and recognized text.
• Save the initially recognized document in each desired format using Save As... while it is open in the text window.

Remember, only a Caere Document can be reopened (and resaved in a different format) in OmniPage.

The way text appears when you open your recognized document in another application depends on the features of the chosen file format and application. For example, if you save a page with text and graphics in ASCII format, only the text will be displayed when you open the file in a new application because ASCII format does not retain graphics. Likewise, graphics are only displayed in applications that support that capability.

Normal differences in typeface sizes between applications can result in differences in the page formatting and display of the text. The settings within the application, such as margins, also affect the page layout.

If you use the True Page option (Professional version only), OmniPage exports text in frames. If your application doesn't accept frames, the text frames are not maintained in their original positions and the text within the frames is displayed in one vertical column.

Chapter 6 Improving Performance

You can make OmniPage run faster and recognize text more accurately by learning how to use a few different settings.

Speed up OmniPage by selecting Manual Brightness, by turning the Language Analyst feature off, and by manually selecting zones for text recognition. Computing power is what affects speed the most. A 486 computer is dramatically faster than a 386. Also, 8MB of system RAM is a minimum; as with most CPU-intensive programs, more memory is better.

With the Professional version, improve text-recognition accuracy by selecting 3D OCR with AnyPage/HP AccuPage 2; with the non-Professional version, select Auto Brightness with AnyPage/HP AccuPage 2. The Language Analyst feature improves accuracy considerably, as does taking into account document quality, scanning angle, and paper transparency.

Improving Speed

OmniPage is designed to run automatically, making text recognition easy and effortless. However, the automatic features can take longer to work.
Using the Manual Brightness setting and turning off the Language Analyst can make OmniPage run faster.

The 3D OCR with AnyPage/HP AccuPage 2 (Professional version only) and Auto Brightness with AnyPage/HP AccuPage 2 features improve accuracy considerably with a variety of documents. However, these automatic features sacrifice speed to provide better accuracy.

The brightness settings are available in the Settings Panel Scanner options. To access this panel, choose Settings Panel in the Settings menu or click the Settings Panel button in the toolbar. Then click the Scanner icon in the left side of the panel.

Manual Brightness

Use the Manual Brightness control in the Settings Panel Scanner options if you're scanning high-quality printed documents with crisp, black text printed on a white background.

If text characters on your document tend to be thick and overlapping, adjust the brightness slide towards Lighten. If characters appear thin and broken, adjust the setting towards Darken. If characters appear at an angle, reposition the document in the scanner and rescan.

The Character Window appears while OmniPage performs text recognition. It shows samples of the scanned image as OmniPage sees them.

The following figure shows how well-formed characters appear in the Character Window. No special brightness adjustment is needed.

The following figure shows how thin, broken characters appear in the Character Window. Try adjusting the brightness control toward Darken and rescan.

The following figure shows how thick, run-together characters appear in the Character Window. Try adjusting the brightness control toward Lighten and rescan.

Language Analyst

The Language Analyst feature uses information about language context and usage rules to evaluate characters, compute likely errors, and determine replacement words. It improves text recognition on difficult documents considerably.
However, as with the auto brightness features, if you scan high-quality documents with crisp, black letters printed on white paper, recognition is faster with the Language Analyst turned off.

The Use Language Analyst setting is available in the Settings Panel OCR options. Choose Settings Panel in the Settings menu or click the Settings Panel button in the toolbar. Then click the OCR icon in the left side of the panel.

Zones

The Manual Zones feature lets you draw selection boxes around just the parts of a page you want recognized. Using this feature, you don't need to wait while OmniPage recognizes unnecessary text. See Complex Layouts on page 2-38 for detailed instructions on how to use the Manual Zones feature.

Set Up a Permanent Windows Swap File

To increase OmniPage's speed, set up a permanent Windows swap file (virtual memory) with at least 4MB of free, contiguous disk space. For more information, please see Setting up a Windows Swap File (Virtual Memory) on page 1-8.

Improving Accuracy

If you scan typeset, high-quality printed pages, you will probably find that OmniPage recognizes text perfectly: the text that appears in your word processor matches the text in the scanned page letter for letter. With lesser-quality pages, text-recognition accuracy will be poorer.

These factors most affect text-recognition accuracy:

• Document Quality
• Scanner Options
• Scanning Angle
• Scanner Glass Clarity
• Paper Transparency

Document Quality

OmniPage recognizes characters in almost any font from 6 to 72 points in size. However, keep the following in mind when using OmniPage:

• The print should be reasonably clean and crisp. Characters must be distinct: separated from each other and not blotched together or overlapping.
• The document should be free of notes, lines, or doodles; anything that is not a printed character will slow OmniPage considerably, and any character distorted by a mark will be unrecognizable.
• The document font should be non-stylized; for example, OmniPage won't recognize the Zapf Chancery font accurately.
• It's hard to recognize underlined text accurately; the underline changes the shape of descenders on the letters q, g, y, p, and j.

Scanner Options

The Scanner options are your most powerful means of improving text-recognition accuracy. They are available in the Settings Panel Scanner options.

The 3D OCR with AnyPage/HP AccuPage 2 feature (Professional version only) recognizes text most accurately on the widest range of documents: faxes, copies of copies, etc. This setting, when used with the Language Analyst, provides the best recognition accuracy possible. This feature is only available with grayscale scanners.

The Auto Brightness with AnyPage/HP AccuPage 2 feature uses Caere AnyPage or HP AccuPage technology to improve accuracy considerably if your page is dirty, if text is printed on a colored background, or if the page has shading from a copy machine. It is slightly less accurate and slightly faster than the 3D OCR with AnyPage feature. This feature is only available with grayscale scanners.

Scanning Angle

Make sure that the document is positioned correctly in your scanner and is not slanted. Even if you put a page in the scanner correctly, it is still possible for the page to be turned slightly so that text will be difficult to recognize. The final document may have missing characters, split lines of text, or several words recognized incorrectly if the page is not scanned correctly.

If you notice that the page is crooked in the Character Window, adjust and rescan it. If you are scanning a multiple-page document and notice poor recognition on certain pages, it may be that those pages were crooked in the scanner. Try scanning them again.

Scanner Glass Clarity

The sheet of glass on the flatbed of the scanner must be clear. If it gets dirty, wipe it gently with a soft, damp, lint-free cloth or tissue.
Be sure it is completely dry before you put pages on it.

Paper Transparency

Some paper is thin enough that the scanner sees text printed on the opposite side of the scanned page. This is often the case with telephone-book pages. To correct this problem, put a black piece of paper behind the page, between the page and the lid of the scanner.

Chapter 7 Troubleshooting

Use this chapter if you have trouble getting the program or your scanner to run properly. If the program runs but there are many errors in your text files, or the program seems too slow, see Chapter 6, Improving Performance.

There are seven sections in this chapter:

• Before You Begin
• Installation
• Scanners
• Memory
• Operation
• Error Messages
• Caere Product Support

As a general rule for any software product, if you're having trouble and you can't figure out what to do, it may help to reboot and restart the program. Reinstalling the software often eliminates inexplicable problems. If possible, be sure to save any open files in OmniPage or other applications before you reboot.

Before You Begin

Whatever problem you are experiencing with OmniPage, first verify that your computer, scanner, and other applications are functioning properly.

• Make sure that your system meets all the requirements for hardware, memory, and software as listed in Chapter 1, Installation.
• Verify that the scanner is plugged in and turned on, and that all cable connections are secure.
• Check to see that the image-scanning software that came with your scanner is installed and working properly.

Resolve any problems that occur with Windows or your image-scanning software before you try using OmniPage again. You should also run virus-checking software regularly to ensure that performance problems are not caused by a virus.

Installation

Installation problems may result from:

• An inadequate or incompatible system configuration. Double-check that your system meets the system requirements listed in Chapter 1, Installation.
• A bad disk or corrupted file.

Installing OmniPage with the Norton Desktop

Some versions of the Norton Desktop are incompatible with the OmniPage installation program. If you run Windows under Norton Desktop and you have difficulty installing OmniPage, follow these procedures:

1 Open the Windows system.ini file in a text editor and change the line Shell=ndw.exe to Shell=progman.exe. This will make the Windows Program Manager appear when you start Windows rather than the Norton Desktop. If you're not sure how to edit the system.ini file, consult your Windows User's Guide.
2 Save the file.
3 Restart Windows.
4 Install OmniPage according to the instructions in Chapter 1, Installation.
5 Re-open the Windows system.ini file and change the line Shell=progman.exe back to Shell=ndw.exe. This will make the Norton Desktop appear when you start Windows.
6 Save the file.
7 Restart Windows.

Conflicts with Disk Cache Programs

Some disk cache programs interfere with memory allocation and will prevent you from installing successfully. If you are using a disk cache program other than Windows smartdrv.sys, temporarily disable it and try to install OmniPage again. Then re-enable the disk cache program and verify that it works. Performance will not be acceptable if you do not use a disk cache program: in most cases, you should use smartdrv.sys.

Using EMM386.EXE

You may not be able to use this memory manager with OmniPage. Try running OmniPage with just the default Windows memory management programs. Consult your memory manager's documentation for instructions on how to de-install emm386.exe from your config.sys file.

SETUP repeatedly requests the same disk.

If the correct disk is in the disk drive, the disk is probably damaged. To check the disk, exit the installation program and Windows. From the DOS prompt, type dir B: (if you are installing from drive A:, type dir A:). If you receive an error message from DOS, the floppy disk is damaged.
If you are able to see the disk directory, try to copy a file from the OmniPage disk to your hard disk. DOS may be unable to copy files from the disk even if it can read the directory. If the disk is damaged, contact Product Support for a replacement. See Caere Product Support on page 7-25.

Testing OmniPage with a Simplified System

If OmniPage won't run correctly, try running it on a simplified setup: a system with no network commands, other drivers, memory managers, etc. This will eliminate the possibility of conflicts with any other devices or drivers.

Use a text editor to comment out, in your autoexec.bat and config.sys files, any memory-resident device drivers and applications not used by Windows, OmniPage, your scanner, your hard drive, or your monitor. Then reboot your system. See your DOS documentation for instructions on how to edit your autoexec.bat and config.sys files.

If OmniPage then runs, you'll know that your problem is a conflict with an item in your system's config.sys or autoexec.bat files. One by one, you can add items back to the config.sys and autoexec.bat files, reboot, and start OmniPage. When OmniPage no longer runs, you'll know that the last item you added is incompatible with OmniPage.

Do not run emm386.exe in your config.sys file.

Scanners

Once a scanner is installed and working with its image scanning software, most users can install and use OmniPage with no other changes to their system. To get up and running as quickly as possible, install your scanner hardware and any software you received with it, including the scanner driver, according to the manufacturer's instructions.

Use the scanning software supplied by the manufacturer to be sure that the scanner is working on your system before scanning with OmniPage. Consult your scanner documentation or the manufacturer's product support if your scanner does not work with the manufacturer-supplied scanning software. Resolve any problems before continuing.
If your scanner operates with the manufacturer's image scanning software but not with OmniPage, use the following topics to pinpoint and correct the problem.

The Scan Image commands are grayed out.

This usually happens when OmniPage defaults to the scanner setting No Scanner. Make sure that your scanner is turned on and choose Select Scanner in the Settings menu. Select the scanner that you are using and click OK.

"Can't Open Scanner" message displays.

Make sure your scanner is turned on. If the scanner was turned off when you started OmniPage, turn it on and choose Select Scanner in the Settings menu. Select the scanner that you are using and click OK. If this does not work, attempt to scan with the software that came with your scanner to see if the problem is with your scanner hardware.

Microtek Scanners

Set the scanner speed to 2 for best accuracy. Consult your Microtek scanner documentation for instructions on how to set scanner speed.

Testing OmniPage with the Sample Pages

If OmniPage successfully recognizes text from an image file, the problem is probably scanner related. If OmniPage is unable to perform recognition at all, review the installation and operation troubleshooting sections and the corresponding error messages in the Error Messages section of this chapter.

Use one of the OmniPage Sample Pages (such as the Multiple Column Page Sample) to verify the functionality, recognition performance, and accuracy of OmniPage. Successful completion of the test procedure does the following:

• Verifies OmniPage's ability to perform text recognition.
• Provides a benchmark for recognition time on your system.
• Verifies recognition accuracy independently of your document or scanner image quality.

If OmniPage seems to run slowly even with all other applications closed, use the test file to find the typical recognition time for a good-quality document and image on your system. The test file will produce a text file with near-100% text recognition accuracy.
If you are unable to achieve a similar level of accuracy with your documents, review Chapter 4, Improving Performance, for possible problem areas and solutions.

To load the TIFF file:

1 Open the Settings panel by choosing Settings Panel... in the Settings menu.
2 Click Use Defaults in the Settings Panel.
3 Click OK to confirm that you want to reset all settings.
4 Select Load Image in the drop-down list under the Image button in the toolbar.
5 Click the AUTO button in the toolbar. The Load Image dialog box will appear.
6 Open the file test.tif in your omnipage/data directory. OmniPage should now determine text zones and perform recognition.

Checking the Scanner Driver Name and Version

At least two files allow OmniPage to communicate with the scanner hardware:

• One or more Device Driver files, supplied by the scanner manufacturer, which tell your computer how to communicate with the scanner.
• A Scmgr file, supplied by Caere, which tells OmniPage how to communicate with your computer and the scanner. The scmgr file is named in the format scmgrxx.exe where x is a number.

Device Drivers

Scanners require that a file called a device driver be installed on your hard disk. This file tells the computer how to communicate with the scanner. The name of the file is referenced in your config.sys or autoexec.bat file. When you turn your system on, DOS reads the config.sys and autoexec.bat files. It sees the reference to the device driver file and then reads that file.

The name of the file referenced in the config.sys or autoexec.bat file must match the name of the file installed on your disk. Also, the version of the file must be compatible with OmniPage. Check the Supported Scanners list in the Release Notes to find the device driver name and version for your scanner. Compare this information with the contents of your config.sys or autoexec.bat file.
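
The name comparison described above can be partly automated. The following Python sketch is illustrative only and is not part of OmniPage. It assumes the contents of config.sys are available as text, it handles only DEVICE= and DEVICEHIGH= lines (drivers started from autoexec.bat are not covered), and the helper names list_device_drivers and driver_is_loaded are hypothetical. It does not check the driver version.

```python
import re

# Matches CONFIG.SYS lines of the form DEVICE=<path> [parameters]
# or DEVICEHIGH=<path> [parameters], in any letter case.
DEVICE_LINE = re.compile(r"^(devicehigh|device)\s*=\s*(\S+)(.*)$", re.IGNORECASE)


def list_device_drivers(config_text):
    """Return (file_name, path, parameters) for each active device line."""
    drivers = []
    for line in config_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        # Lines that begin with REM are commented out and are ignored.
        if stripped.split()[0].lower() == "rem":
            continue
        match = DEVICE_LINE.match(stripped)
        if match:
            path = match.group(2)
            # The driver file name is the last component of the DOS path.
            name = path.split("\\")[-1].lower()
            drivers.append((name, path, match.group(3).strip()))
    return drivers


def driver_is_loaded(config_text, expected_name):
    """True if an active device line references expected_name (any case)."""
    wanted = expected_name.lower()
    return any(name == wanted for name, _, _ in list_device_drivers(config_text))


if __name__ == "__main__":
    # Hypothetical CONFIG.SYS contents, for illustration only.
    sample = "\n".join([
        r"DEVICE=C:\DOS\HIMEM.SYS",
        r"REM DEVICE=C:\OLD\SCAN.SYS",
        r"device=c:\omnipage\mscan.sys",
    ])
    for name, path, params in list_device_drivers(sample):
        print(name, path, params)
    print(driver_is_loaded(sample, "MSCAN.SYS"))
```

The expected driver name would be taken from the Supported Scanners list in the Release Notes. A name match alone does not confirm compatibility, because the version must also agree.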
Look for the driver version in the message given by the driver as it loads (when your system boots) or on the diskette label of your scanning software. Sometimes the scanning software provided by the scanner manufacturer is not the device driver OmniPage needs.

If the device driver name and version in your config.sys or autoexec.bat file does not match the information listed for your scanner, check the Supported Scanners list or the drivers in your OmniPage directory to see if the correct driver is supplied with your disk set. Then modify your config.sys or autoexec.bat to use the Caere-supplied driver.

If the correct device driver is not supplied on your OmniPage disk set (which will happen if Caere does not have a license to distribute the driver), contact the scanner manufacturer and request the driver and version specified in the Supported Scanners list in the Release Notes.

If you use an extended memory manager, avoid loading your scanner driver high in memory. Consult your memory manager's documentation for detailed information about loading drivers high.

For information on how to edit your config.sys or autoexec.bat file, see your DOS Operations Manual. Be sure to reboot your computer when you have finished editing. DOS will load the new device driver when the system reboots.

Record any parameter information for the device driver. For a few scanners, this information will be required when you select your scanner for the first time or if you change your scanner installation.

Here are a few examples of device driver entries in a config.sys or autoexec.bat file. A typical config.sys entry for a Microtek scanner device driver would look like this:

This line in your config.sys file tells DOS to load the device driver named mscan.sys, which is located in the omnipage (or omnipro for Professional users) directory on the C: drive.

Some device drivers require one or more parameters designating port address, interrupt (IRQ), or Direct Memory Access (DMA) channel.
The values for the parameters are determined by the switch settings and addresses used in the scanner hardware installation. See your scanner's documentation for information on these settings.

An entry in your config.sys for the Complete Page Scanner might look like this:

OmniPage Professional users would have a directory named omnipro.

Other device drivers are loaded with a batch file, usually the autoexec.bat file, instead of the config.sys. An entry in your autoexec.bat for the Canon IX-12 scanner could look like this:

\omnipage\ixhnd2/08

This line in your autoexec.bat file tells DOS to load the device driver called IXHND2 and defines the port address and memory address for the device driver to use when communicating with the scanner interface card.

Load only one device driver for your scanner at a time. Multiple device drivers for the same scanner may cause problems between OmniPage and the scanner hardware.

Scmgr

The OmniPage Scmgr file lets OmniPage communicate with your system and your scanner. There is a different Scmgr file for each scanner or scanner family supported by OmniPage. The device driver entry in your config.sys or autoexec.bat must be the device driver the Scmgr file expects or an error will occur.

Choose Select Scanner in the Settings menu if you are unsure which scanner is currently selected. Then check the Supported Scanners list in the Release Notes and make sure the appropriate Scmgr file is located in your OmniPage directory. The scmgr file is named in the format scmgrxx.exe where x is a number. If you can't find the appropriate Scmgr file, reinstall the OmniPage software.

Checking the Scanner Hardware

If you experience a problem between OmniPage and your scanner, make sure the hardware used matches the hardware OmniPage supports. For example, OmniPage supports a Canon IX-12 scanner, but only with the Canon IF-3 interface card, not the older IF-2 or the JLaser interface card for Canon scanners.
Check the Supported Scanners list in the Release Notes. For a few scanners, you will also need to know the scanner interface card switch settings and addresses and enter them in the Select Scanner dialog box. Refer to your scanner's owner's manual for more detailed information.

Changing your Scanner Installation

If you change scanners, you will need to register the new scanner with OmniPage. After you have installed and tested your scanner with the manufacturer's scanning software, start OmniPage and choose Select Scanner in the Settings menu. The Select Scanner dialog box will appear. Your scanner's name should appear in the list box. Highlight the name of your scanner and click OK.

With a few scanners, you may need to provide the I/O (port) address and speed. Refer to your scanner's owner's manual for more detailed information.

Scanning Causes System Crash

If you experience a system crash when you try to scan, add the line EMMExclude=A000-EFFF under [386Enh] in your system.ini file and restart Windows.

Memory

It may be difficult to determine that OmniPage is running poorly due to a memory problem. However, you can optimize your system to reduce the possibility. Here are a few tips:

• Use a known compatible extended memory manager like himem.sys.
• Add the EMMExclude=A000-EFFF line to the [386Enh] section of the system.ini file. The characters after the A are zeroes.
• If using Quarterdeck's QEMM386 as the memory manager, add a NOEMS switch to the qemm386.sys statement in config.sys. You must also remove the stealth feature from this command (ST:x). The command should be as simple as possible: device=\drive\QEMM\QEMM386.SYS RAM NOEMS
• Use at least 8MB of physical RAM.
• Keep the virtual memory swap file greater than 4MB and less than 8MB.
• Keep at least 20MB of free disk space available on the drive where the temp files are stored.
• Use the Windows Smartdrive program for disk caching.
• Verify that there is at least 7MB RAM available in Windows: choose About...
in the Program Manager Help menu and see the Memory entry.

Operation

The following topics cover some commonly seen problems.

OmniPage No Longer Works

If you have used OmniPage without difficulty in the past, you may have altered your system configuration. If you have recently installed a new application, this may be the case. Make sure that your scanner still works with other software. To determine the last time the autoexec.bat and config.sys files were modified, check the date and time displayed by the DOS DIR command. Edit the files if necessary.

OmniPage works but operates slowly and frequently accesses the hard disk drive.

As a Windows 3.1 application, OmniPage is able to take advantage of virtual memory when running low on memory. This may occur with a minimally configured system (only 8MB of RAM) if memory has become fragmented with use or if other applications are running in the background.

When low memory conditions occur, Windows will use disk space to simulate the RAM it does not have available. Disk access time is much longer than RAM access time, so the computer system will run much more slowly when it has to use virtual memory.

A quick fix for memory fragmentation problems is to quit OmniPage and Windows and reboot your system. This will clear any fragmentation of memory that has occurred (until, of course, it happens again).

Try closing any other applications that are running in the background. This will usually free enough memory for OmniPage to operate without using the swap file.

If you regularly work with long, complex documents, adding more RAM to your system is the best solution. For more information on optimizing your system and application performance under Windows, refer to the Windows User's Guide.

OmniPage is too slow.

If recognition accuracy is good and you have enough memory, you may want to run OmniPage on a faster computer to improve its speed.
OCR is a very time- and memory-intensive process, so the processing power of your computer will determine the speed of processes. For example, you will notice a significant speed improvement if you upgrade from a 386SX to a 386 25MHz or from a 386 25MHz to a 486 33MHz.

Faxes are not recognized accurately.

Recognizing faxes accurately can be difficult. A typical fax machine produces documents at 200x100 dpi; a typical scanner scans at 300x300 dpi. Because of the lost resolution, OCR has less information to work with, and the accuracy is not as good as it would be on the printed original. In addition, if the fax is printed on thermal paper, it is more difficult to scan than a document on regular white paper.

There are three things you can do to improve the accuracy of fax recognition:

• If a fax modem is connected to your computer, you can receive a fax as a file without printing it. You can then load the fax file as an image in OmniPage and recognize the text. However, OmniPage must support the file format that your fax card produces. These formats are listed in the Supported File Formats section of the Release Notes. If your fax card's file format is not supported, your fax software may be able to convert the fax to a PCX or Uncompressed TIFF image that OmniPage can open and recognize.
• Have senders select Fine Mode when they send you a fax. This improves scan resolution to 200x200 dpi. It also helps if they choose a sans-serif font that is 11 points or larger.
• Request that senders send faxes to you directly from a fax card in their computer. Transmissions directly from fax cards are much clearer than fax-machine transmissions because they do not involve low-resolution scanning.

The scanner will begin to scan and stop. The entire system locks up and you have to reboot your computer.

You may have an interrupt conflict between your scanner and another device.
If you have a bus mouse and you usually do not use the mouse and scanner at the same time, check the interrupt used by the scanner and mouse for a possible conflict. The interrupt address typically used by some network cards may cause the same problem.

OmniPage hangs the system at the beginning of the recognition process.

Many computer systems provide a feature called shadow RAM to enhance system performance. If OmniPage causes the system to hang, turn off the shadow RAM function of your computer and try again. Refer to your computer's operations manual for information on disabling shadow RAM.

Some computer systems do not allow you to turn shadow RAM off. Incompatibilities with these systems are usually not related to shadow RAM.

System hangs may be related to incompatibilities with memory-resident applications or device drivers. Use a text editor to comment out, in your autoexec.bat and config.sys files, any memory-resident device drivers and applications not used by Windows, OmniPage, your scanner, or your hard drive. Then reboot your system.

Do not remove a device driver unless you are aware of its function and know it may be safely removed. Hard disks often require special device drivers that should not be removed. Video displays that require special device drivers may need to be reconfigured instead of removed. Make a backup boot disk with your current operating system version, autoexec.bat, and config.sys to guard against potential mistakes.

You receive garbage or nothing when you attempt text recognition AND when you select manual zones, you see vertical lines running through the document image or no image at all.

The memory address for your scanner interface card is probably interfering with the memory address for your video display adaptor. Use the instructions in your scanner owner's manual to move the scanner interface card to a different memory address.

Error Messages

A maximum of 250 files can be selected.
You tried to load over the maximum of 250 files.

A maximum of 256 pages can be saved in a Caere Document.
You tried to scan or load pages that would increase the size of the file over the 256-page limit.

Cannot find the requested scanner driver.
The scanner driver file has been deleted or moved from its proper location in the boot drive or in the OmniPage directory. Be sure your scanner works with the scanner manufacturer's scanning software and reinstall OmniPage.

Cannot read file filename.ext.
The SETUP installation program presents this error message when it cannot read a file on the OmniPage disk set. The file is probably corrupted. Contact Product Support for a replacement disk.

Error accessing the user dictionary. Try freeing up hard disk space. Dictionary size is limited to 32K.
You may have moved OmniPage to a different location on your hard disk, or you may have renamed directories in the path where OmniPage is located. Reinstall OmniPage.

Error adding word to the user dictionary.
You are low on free disk space or the user dictionary is full. The user dictionary capacity is 32K or about 5300 words. Free up some disk space or edit the user dictionary to remove unnecessary words.

Error connecting the text editor. The file may be corrupt or moved; try reinstalling OmniPage. If the error persists, please call product support.
An internal program file may have been damaged or is no longer in the OmniPage directory. Please reinstall OmniPage. If this doesn't work, call product support.

Error converting to an image file. If the error persists, please call product support.
An internal program file may not work with a particular file or may be corrupted. If the problem happens consistently with just one file, you may not be able to OCR that file. If the problem happens with different files, please call product support.

Error creating a mail document. Be sure your mail application is working correctly. Try freeing up hard disk space in your TEMP directory.
Test your mail application in another application to be sure it is working correctly. Be sure there is at least 1MB of free disk space in your \temp directory. Your \temp directory is specified in your config.sys file in the line: set temp=c:\name

Error creating the zone window. Try freeing up hard disk space or try closing other applications.
OmniPage requires 8MB of available RAM configured for use with Windows running in Enhanced mode. The 4MB Windows permanent swap file usually provides enough memory to allow OmniPage to run at any time. However, if you have several applications open, you may not have enough memory to run OmniPage. Close one or more open applications to free enough memory to run OmniPage. You can check available memory by choosing About Program Manager in the Program Manager Help menu. You should have at least 8MB available memory. Try deleting unnecessary files from your hard disk to free up space if you do have enough memory. You may have run out of disk space.

Error deleting page.
An internal program file may not work with a particular file or may be corrupted. If the problem happens with different files, please call product support.

Error during conversion of the file (%i). Try freeing up hard disk space or closing other applications.
You may be short of either volatile (RAM) memory or storage space. Try running OmniPage as the only application and delete all unnecessary files from your hard disk to maximize free hard disk space.

Error during OCR. The page may be too complex. Try using Manual Zones to recognize smaller areas of the page.
The page has a very complex layout or has very small text and requires too much memory to recognize. Select Manual Zones in the drop-down list under the Zone process button and try drawing fewer, smaller areas on the page.

Error finding blocks on the page. The page may be too complex.
The page is very complex or has very small text and requires too much memory to recognize.
Select Manual Zones in the drop-down list under the Zone process button and try drawing fewer, smaller areas on the page.

Error finding or reading FFNNLOG.DAT file. Reinstall OmniPage.
This file has been deleted or moved from its proper location in the OmniPage directory. Reinstall OmniPage.

Error finding zones on the page. The page may be too complex. Try using Manual Zones to recognize smaller areas of the page.
Your page has a very complex layout or has very small text and requires too much memory to recognize. Select Manual Zones in the drop-down list under the Zone process button and try drawing fewer, smaller areas on the page.

Error getting image from OCR.
An internal program file may not work with a particular file or may be corrupted. If the problem happens with different files, please call product support.

Error getting the image from the scanner. Please check your scanner settings in the Settings Panel and try again.
You may have selected an inappropriate setting for your page. Open the Settings Panel and make sure that the selected Scanner options, such as page size and orientation, are correct.

Error initializing OCR. Try closing other applications.
OmniPage requires 8MB of available RAM configured for use with Windows running in Enhanced mode. The 4MB Windows permanent swap file usually provides enough memory to allow OmniPage to run at any time. However, if you have several applications open, you may not have enough memory to run OmniPage. Close one or more open applications to free enough memory to run OmniPage. You can check available memory by choosing About Program Manager in the Program Manager Help menu. You should have at least 8MB available memory.

Error launching Notepad. Open a different text editor to read the Release Notes.
You may have removed the Windows text editor program from its usual location in the Windows directory. The release notes file is titled readme.txt and should be located in your OmniPage directory.
You can open the file with any word processing program.

Error loading conversion code. The file may be corrupt or moved - try reinstalling OmniPage. If the error persists, please call product support.
An internal program file may have been damaged or is no longer in the OmniPage directory. Please reinstall OmniPage. If this doesn't work, call product support.

Error loading the training module. The file may be corrupt or moved - try reinstalling OmniPage. If the error persists, please call product support.
An internal program file may have been damaged or is no longer in the OmniPage directory. Either restore the file (rtrain.dll) or reinstall OmniPage. If this doesn't work, call product support.

Error loading the user interface module. The file may be corrupt or moved - try reinstalling OmniPage. If the error persists, please call product support.
An internal program file may have been damaged or is no longer in the OmniPage directory. Please reinstall OmniPage. If this doesn't work, call product support.

Error opening file. Try closing other applications and verify that you are opening a valid file type.
OmniPage requires 8MB of available RAM configured for use with Windows running in Enhanced mode. The 4MB Windows permanent swap file usually provides enough memory to allow OmniPage to run at any time. However, if you have several applications open, you may not have enough memory to run OmniPage. Close one or more open applications to free enough memory to run OmniPage. You can check available memory by choosing About Program Manager in the Program Manager Help menu. You should have at least 8MB available memory. The file must also be in a format that OmniPage recognizes. See Supported Input File Formats on page 1-8.

Error opening the main dictionary. The file may be corrupt or moved - try reinstalling OmniPage. If the error persists, please call product support.
You may have moved OmniPage to a different location on your hard disk, or you may have renamed the directories where OmniPage is located. Reinstall OmniPage.

Error opening the text editor.
You may have moved OmniPage to a different location on your hard disk, or you may have renamed the directories where OmniPage is located. Reinstall OmniPage.

Error opening the user dictionary. The file may be corrupt or moved - try reinstalling OmniPage. If the error persists, please call product support.
You may have moved OmniPage to a different location on your hard disk, or you may have renamed the directories where OmniPage is located. Reinstall OmniPage.

Error preparing file for conversion. Try freeing up disk space. If the error persists, please call product support.
The file you are saving may be corrupted. Try recognizing your document again with the Language Analyst deselected in the Settings Panel OCR options. Do not make any edits before saving.