OS/2 Upload Information Template for ftp-os2.nmsu.edu

Archive Name: UNH204.ZIP
Program Description: a command line utility to strip HTML codes
Operating System Versions: OS/2 2.x and later
Program Source: Don Hawkinson, author
Replaces: UNH202.ZIP UNH175.ZIP UNH150.ZIP
          NOTE: UNHTMLxx.zip is a different utility
Your name: Don Hawkinson
Your email address: dwhawk@southwind.net
Proposed directory for placement: ./os2/textutil

This is an OS/2 command line utility to strip HTML codes from files
saved from WebX or other web browsers.

UNH 2.04 HTML stripper by Don Hawkinson dwhawk@southwind.net
usage: ..\unh file1 file2
   file1 == html file
   file2 == stripped text output file
   file3 == URLs from html source file - optional

UNH does not check for the existence of the output file, and will
overwrite any existing file.

UNH is HPFS aware.

UNH does not attempt to recreate the format of the Web page. UNH does
not attempt to force any format on the output text, nor does it attempt
to remove any existing text format. While the layout of tables and
lists is lost during stripping, data is placed on separate lines for
legibility.

The HTML specification defines Character Entity Sets or tags to
represent particular graphic characters which have special meanings in
places in the markup, or may not be part of the character set available
to the writer. UNH does not attempt to scan for all of the possible
tags, but does try to resolve the most common ones.

This version of UNH supports codepages 437 and 850; if codepage 850 is
in use, the 850 character set is used. The codepages only make a
difference when &xxxx; or &#nnn; tags are present in the file. If the
correct character or an acceptable alternate is not available, a space
will be used. If an unrecognized tag is encountered, it is left in the
output text.

This version should be usable under OS/2 2.1, but it has not been
tested there. The special compression option for OS/2 Warp was not used
when linking the executable.

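
The entity handling described above can be illustrated with a short
Python sketch. This is not UNH's source code. The function name
resolve_entities, the small NAMED table, and the use of Python's cp437
and cp850 codecs to decide whether a character is "available" are all
assumptions made for illustration. The sketch also omits UNH's
"acceptable alternate" substitution and goes straight to a space.

```python
import re

# Hypothetical, deliberately small table of named entities.
# UNH's real table is not documented in this file.
NAMED = {
    "amp": "&", "lt": "<", "gt": ">", "quot": '"',
    "nbsp": " ", "copy": "\u00a9", "eacute": "\u00e9",
}

# Matches &name; and &#nnn; forms.
ENTITY = re.compile(r"&(#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def resolve_entities(text, codepage="cp437"):
    """Replace entities with characters available in the given codepage."""

    def replace(match):
        body = match.group(1)
        if body.startswith("#"):
            code = int(body[1:])
            if code > 0x10FFFF:
                return " "               # not a valid character code
            char = chr(code)
        elif body in NAMED:
            char = NAMED[body]
        else:
            return match.group(0)        # unrecognized: leave it in the text
        try:
            char.encode(codepage)        # is the character in this codepage?
        except UnicodeEncodeError:
            return " "                   # not available: use a space
        return char

    return ENTITY.sub(replace, text)
```

For example, under these assumptions &copy; becomes a space when the
codepage is cp437, which has no copyright sign, and becomes the sign
itself under cp850, while an unknown tag such as &bogus; passes through
unchanged.
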
This program is free, but the author retains all rights. See the file
license.txt for further information.

The command line utility UNH.EXE uses the same logic as the shareware
PMStripper to strip the HTML codes from files. PMStripper is a PM
utility that loads the stripped file into an MLE window to allow simple
editing functions. PMStripper is distributed as PMS_xxx.ZIP with the
version number replacing the xxx. For information on the current
PMStripper version, send email to dwhawk@southwind.net.