Hamilton Laboratories
21 Shadow Oak Drive
Sudbury, MA 01776-3165, U.S.A.
508-440-8307
FAX: 508-440-8308

February 17, 1996

Subject: Using tape drives with TAR

The tar utility included with Hamilton C shell can be used to read and write tar format tapes and diskettes for interchange with UNIX machines or for backups of your system. It can also be used to read and write tar format files on your hard disk.

Most QIC (quarter-inch cartridge), 4mm DAT and 8mm Exabyte drives are supported by Windows NT and will work just fine with tar. But note: tape device support is included only with Windows NT, not Windows 95. If you would like to read/write tar tapes, you must be running Windows NT.

Choosing a suitable tape drive:

If you don't yet own a tape drive and you're trying to choose the one that might be best for your own application, your first consideration should be the types of drives on any machines with which you want to exchange data. If your colleague's machine has a QIC and you buy a DAT, this is not going to work!

If you have a choice, we generally recommend 4mm DAT drives. A single $10 DAT tape cartridge, weighing only 1.4 ounces, can store 2 gigabytes or more -- no other tape format comes close in terms of cost per megabyte or weight per megabyte for media. If you have to store large amounts of data or if you're concerned about shipping costs, DAT drives win hands down.

Also, under release 3.5 and later builds of Windows NT, DAT drives are better-supported than any other type of drive. For example, on a DAT drive, you'll have more flexibility in the choice of a blocksize (which could be important if someone else is sending you tapes and you have little control over how they're written). Also, DAT drives allow rewriting of the last block on the tape; that's useful with tar because it means you can add files onto the end of a tar tape on DAT; with other types of drives, you may have to just rewrite the whole tape to add even one file.

That said, both QIC and Exabyte drives do work quite well. QIC drives using DC-6525, DC-300, DC-600 and similar full-size, quarter-inch, streaming cartridges are based on technology that's been popular on engineering workstations for close to twenty years; they're very reliable. Exabyte 8mm drives are also quite good, but if you have an older Exabyte that just refuses to work, you may need a firmware upgrade from Exabyte.

The only drives we do not recommend are the so-called "floppy tapes" -- those using DC2120 and similar minicartridges. Floppy tapes do not seem to work at all with Windows NT; no one we've talked to has ever been successful. On the other hand, even if they were better supported under NT, you still wouldn't want one. The technology is inherently unreliable, using a timing track that must be preformatted onto the tape to indicate where "sectors" are to be written. Floppy tapes are notorious for not being able to read their timing tracks, meaning any data is lost.

Getting help information from TAR and MT:

These pages will outline just a few of the options that tar and its accompanying mt (magnetic tape) utility provide. If you need additional help from these (or any other) Hamilton utilities, you can always get it using the -h option. Since the help information often runs to several screenfuls, it's generally useful to pipe it to more, e.g., using the mi (more interactive) alias for more:

    tar -h | mi

Rewinding a tape:

Tar normally rewinds the tape before and after reading or writing it. You can suppress that with the -N (no rewind) option. You can also manually rewind the tape using the mt utility:

    mt rewind

Listing the contents of a TAR tape:

To list the contents of a tape using the -L (Long Listing) option so you can see in detail what's there:

    tar -L \\.\tape0

The tape device is \\.\tape0 under NT. If you have more than one tape drive, the others will be called \\.\tape1, \\.\tape2, etc.

When reading a tape, tar will automatically recognize either tar or cpio formats (including both ASCII and binary versions of cpio) and will automatically do whatever "byte-flipping" is required if the tape was written on a machine with a different bytesex. (More about bytesex later.)

Also, tar will do its best to determine what blocksize was used even if you don't tell it. Under Windows NT 3.5, the blocksize does have to be a multiple of 512 bytes to be readable; tar will iterate through all the possibilities until it finds the right one. Under Windows NT 3.51 or later, if your tape drive supports "variable blocksize i/o", tar can directly determine the blocksize just by reading the first record on the tape.

For a little more diagnostic information from tar, you can use the -v (verbose) option instead of -L. You'll get some additional information about what format (tar versus cpio), bytesex and blocksize tar is using and the offsets from the beginning of the archive at which each file in the archive appears.

Extracting the contents of a TAR tape:

If you can list the contents of a tar tape, you can extract it. Do this using the same procedure you used to list the contents of the tape, but adding the -x (Extract) option.

    tar -Lx \\.\tape0

By default, tar will extract everything on the tape into the current directory. If all you want is just a particular file, you can specify it on the command line. Wildcards can also be used. But remember that the C shell normally expands wildcards before it starts up the application you've asked for. To make sure that any wildcards get passed through to tar so it can do the pattern matching, put single or double quotes around each word that contains any wildcards. For example, to extract all the *.c files:

    tar -Lx \\.\tape0 "*.c"

What to do if you can't read a tape:

Most folks will never have any trouble at all reading any tape they're ever given.
But if you do encounter difficulty, it's likely because of one of the following reasons:

1. You don't have the tape device driver installed. You should see a message from tar complaining that it wasn't able to open \\.\tape0. The solution is simple: open up the Windows NT Setup applet in Group Main, pull down Options, select "Add/Remove Tape Devices..." and add the driver you need.

2. The tape itself is just not compatible with your drive. Every tape drive technology is constantly being improved. New generation drives can usually read tapes written on the older drives, but because the newer drives typically use newer, higher-density media and more sophisticated recording formats, that compatibility can be one-way only. Examples are trying to read a DDS-2 DAT tape on a DDS-1 drive or a DC-6525 (525MB) QIC tape on a drive that only accepts DC-600's or an Exabyte 8500 tape on an Exabyte 8200 drive.

In each case, you just won't be able to read anything. The tape will appear to be blank. And because a blank tape is a perfectly legal, albeit empty archive, you won't even see any messages. Just nothing. That's what makes this failure mode frustrating.

Often, it is possible to force the drive to write in an older, lower-density format. With QIC and DAT drives, it's as simple as just being sure to use low-density media. With an Exabyte, the solution is to set a jumper on the drive to configure it to act like the older models.

3. The blocksize that was used is not supported with your drive under the release of Windows NT you're running. For example, to pack the absolute greatest possible amount of data onto a tape, some UNIX machines support writing tapes with incredibly large blocksizes, sometimes 200KB or more. This is often a problem trying to read tapes from SGI workstations. Very few if any Windows NT machines support blocksizes that large.

Another example would be a tape that was written with a blocksize that's not a multiple of 512 bytes.
That's no problem if you're running Windows NT 3.51 or later, but it's not supported by the device drivers that Microsoft shipped with Windows NT 3.5; you'll need to upgrade the operating system to read that tape.

Finally, while the blocksize might indeed be a multiple of 512, it might not be a multiple that's supported by your drive. For example, QIC often will support only a limited set of blocksizes.

If it's a problem with the blocksize, you should generally get a message from tar that makes that clear. The solution is to ask that the tape be rewritten using a blocksize you can read; some good choices might be 10,240 bytes (the POSIX standard), 1024 bytes (which everything can read and write) or the drive's default, which you can learn by typing "mt status". To rewrite the tape on the UNIX machine with a 10,240-byte blocksize, the UNIX tar's -b (blocking factor) option should be used to specify a blocking factor of 20. (On UNIX, you multiply the blocking factor times 512 to get the number of bytes per block.)

4. You need a firmware update for your tape drive. This is a particularly likely source of trouble if you've scavenged your drive from an older system. All tape drives today are microprocessor-controlled and the manufacturers have made a lot of changes to the firmware embedded in these drives over the years. Windows NT depends on that firmware being up-to-date so it can properly handshake with the drive. Updating the firmware is generally quite simple. Usually, you put a special update tape from the manufacturer in the drive and the drive will automatically recognize it and do the update. Contact your drive vendor for more information if you suspect a firmware problem.

5. Your drive has a firmware bug. Some drives may claim to support variable block i/o but not actually implement it properly. If the drive claims to support this mode, tar will use it because it allows tapes to be read quickly and easily even if the blocksize is unknown.
But if the drive has a firmware bug, the tape may look blank in this mode. The workaround is to use tar's -V option to tell tar to ignore the drive's claims of supporting variable block i/o.

If you find that none of these explanations seems to fit the problems you're having, it's time to try writing a scratch tape just to see if your drive can at least read its own tapes.

Writing a new TAR tape:

To create a new tape with one or more files or directories in the archive, use the -c option:

    tar -Lc \\.\tape0 file1 file2 file3 ... filen

You can list as many files or directories on the line as you like. All the usual wildcards can be used and since it's okay to let the C shell do the wildcard expansion, you don't need to put quotes around anything. If one of the items is a directory, the entire contents of that directory will be copied to the tape.

By default, tar will try to use a blocksize of 10,240 bytes, which is generally considered standard on most UNIX machines. (It's actually part of the POSIX standard for tar-format tapes.) If your drive doesn't support that blocksize, tar will choose something that is supported.

Adding more files to an existing TAR tape:

To add one or more files or directories to an existing archive, use the -a option:

    tar -La \\.\tape0 file1 file2 file3 ... filen

Not all drives support this function since it requires that tar be able to read the entire archive, then back up to overwrite just the last record before continuing with the new files. If you get a message from tar telling you it wasn't able to write to the archive when you use -a, that's probably the reason. In that case, you'll have to use the -c option instead and just plan on writing everything you want on the tape in a single operation.

Check that you can read the tape you just wrote!

After writing a tape, do be sure to check that you can read it back. This is just a safety precaution the first time you try using your drive or a new setting for blocksize, etc.

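If you're trying out a new blocksize setting and need to tell a UNIX colleague what blocking factor matches it, it can help to work out the arithmetic first. Here's a minimal Python sketch of the rule given earlier (on UNIX, the blocking factor times 512 gives the bytes per block). The function names are made up for illustration; they are not part of Hamilton tar or of any UNIX tar.

```python
# Sketch only: converts between a UNIX tar blocking factor and a blocksize
# in bytes, using the 512-byte unit described in the text above.
# The names below are hypothetical helpers, not part of any tar utility.

UNIX_BLOCK_UNIT = 512  # bytes per unit of UNIX tar blocking factor


def blocksize_from_factor(factor):
    """Return the blocksize in bytes for a given UNIX blocking factor."""
    if factor < 1:
        raise ValueError("blocking factor must be at least 1")
    return factor * UNIX_BLOCK_UNIT


def factor_from_blocksize(nbytes):
    """Return the blocking factor for a blocksize in bytes, or None if the
    blocksize is not a positive multiple of 512 bytes."""
    if nbytes <= 0 or nbytes % UNIX_BLOCK_UNIT != 0:
        return None
    return nbytes // UNIX_BLOCK_UNIT


if __name__ == "__main__":
    # A few sample sizes: the POSIX default, a small common size, and one
    # that is not a multiple of 512.
    for size in (10240, 1024, 1000):
        print(size, factor_from_blocksize(size))
```

For example, blocksize_from_factor(20) gives the 10,240-byte POSIX blocksize, and factor_from_blocksize(1000) returns None because 1000 is not a multiple of 512.
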
Exchanging tapes with a UNIX system:

If the UNIX machine can read the data from your tape, but it comes out garbled, the problem is probably that you've written the data in the wrong bytesex for that machine. Bytesex refers to the order in which the bytes are laid out on the tape. A little-endian machine (which is what all Windows NT machines are by edict from Microsoft) writes the data out starting with the byte containing the least significant bits (the "little" end). Many UNIX machines are big-endian, meaning they start at the other end.

Reading a tape from any UNIX machine, regardless of bytesex, is no problem since Hamilton tar knows how to detect the bytesex and automatically do any byteswapping that might be required. UNIX tar utilities lack this feature and leave it up to the operator to figure out what's going on and, if necessary, to use a separate "dd" utility to swap bytes.

You can verify that this is the problem if you have a tape written on the UNIX machine. Read it using the -v (verbose) option and tar will tell you what bytesex was used.

This is easy to fix. Use the tar -b (bytesex) option to specify a different ordering when you write the tape next time. Here's an example, writing all the .c and .h files in the current directory to a tape in big-endian format:

    tar -LcbB \\.\tape0 *.[ch]

If the tape is simply unreadable or appears blank on the UNIX machine, chances are they have an older drive that does not support compression. (Since most NT machines are fairly new, the drives installed in most of them use hardware compression to pack more data onto a tape. But this is a recent improvement in tape technology; an older UNIX machine may have been built before hardware compression was available.) To write tapes with hardware compression turned off, use the -Hoff option:

    tar -LcHoff \\.\tape0 *.[ch]

ASCII text versus binary files:

One final consideration is that UNIX and NT differ on their line-end conventions.
UNIX uses a single newline (\n) character to mark the end of a line; NT uses a carriage return-newline (\r\n) combination.

Hamilton tar assumes that because tar is fundamentally a UNIX format, any ASCII files stored in a tar file will probably follow the UNIX convention. Consequently, when extracting an ASCII text file, tar will convert from \n to \r\n; when adding a file to the archive, it will do the reverse. Binary files are not converted.

You can override this default behavior by specifying -r to turn off any conversions. The -R option causes conversions to always be done, even on files that appear to be binary.

You can also use the TARBINARY and TARASCII environment variables to list files that should be considered as being one type versus the other based on the filename, regardless of content. For example, database products often create files with a lot of ASCII data but which really should be considered as binary. PostScript files with encapsulated images are another example. These files should never be translated as if they were ordinary ASCII files. You can indicate that by setting the TARBINARY environment variable. For example, in the System applet in the Control Panel, you might set

    TARBINARY = *.ps

to make it treat all PostScript files as binary data.

Remember: there's a guarantee!

If you follow the suggestions outlined here, chances are very good you'll get your tape drive to work. Certainly, if you have any questions or if you get stuck and need help going through these procedures, give me a call. That's what you've paid for.

Finally, rest assured that the unconditional satisfaction guarantee offered with Hamilton C shell means what it sounds like: if you decide you're not completely satisfied for any reason -- or even for no reason! -- you get your money back. I guarantee it!

Best regards,

Douglas A. Hamilton