CLEANWAD v1.0b

WAD file cleaner and optimizer
by Serge Smirnov (sxs111@po.cwru.edu)

Comments and suggestions welcome. Cleanwad 1.0 is freeware.


Overview
--------

This is my second doom programming effort, and it is intended for a much
wider range of users than my first one, CUSTMWAD (a set of two very
low-level WAD tools). This program owes much of its existence to Olivier
Montanuy, the author of DeuTex and WinTex, who flamed me for CUSTMWAD but
ended up asking me to write a WAD cleaner :-)


Features
--------

Cleanwad simply copies a WAD file to a file with another name, rebuilding
it as it goes. This procedure includes the following:

* sorting directory entries by type;
* eliminating any repeated entries;
* eliminating 'null' entries (i.e., entries with "\0\0\0..." for name);
* truncating WAV resources whose length exceeds their number of samples;
* rebuilding pictures and (optionally) blockmaps;
* lossless picture and blockmap compression/decompression (optional).

What you should end up with is a cleaner, smaller copy of your WAD file,
which is identical in functionality to the original. The amount of space
saved will vary, but may be quite significant. Here are the results of
running cleanwad on some well-known WADs with the following switches:

    cleanwad +rb +pb +rp +pp +is

WAD file               | Original size | New size   | Space Saved | Ratio
                       |               |            |             |
DOOM2.WAD (1.9)        |    14,604,584 | 14,317,392 |     287,192 |  2.0%
DOOM.WAD (Ult. doom    |    12,408,242 | 11,363,608 |   1,066,634 |  7.2%
  1.9)                 |               |            |             |
HERETIC.WAD (1.2)      |    11,095,516 | 10,905,856 |     189,660 |  1.7%
                       |               |            |             |
TRINITY.WAD            |       882,464 |    877,040 |       5,424 |  0.6%
ALIENGFX.WAD (v2.2)    |     3,035,383 |  2,813,260 |     222,123 |  7.3%
ALIENLEV.WAD (v2.2)    |     1,245,357 |  1,228,196 |      17,161 |  1.4%
ALNSND2.WAD (v2.2)     |       738,510 |    678,608 |      59,902 |  8.1%
ALITCSF.WAD (for 1.9)  |     1,168,480 |  1,154,800 |      13,680 |  1.2%
ALITCSND.WAD (for 1.9) |       687,862 |    682,810 |       5,052 |  0.7%
ALITCWAD.WAD (for 1.9) |     3,601,904 |  2,880,188 |     721,716 | 20.0%
OBTIC1.WAD             |     4,204,061 |  4,032,742 |     179,319 |  4.1%
OBTIC2.WAD             |       403,927 |    385,988 |      17,939 |  4.4%
OBTIC3.WAD             |       210,047 |    207,721 |       2,326 |  1.1%

Cleanwad requires about 285K of conventional memory to rebuild DOOM2.WAD
with picture optimizations on.


Usage
-----

The basic command line for cleanwad is very simple:

    cleanwad.exe <input-file> <output-file>

The original file is left intact; the output goes to <output-file>. If
<output-file> already exists, it will be OVERWRITTEN WITHOUT ANY
QUESTIONS, so please be careful. If you use any of the optional command
line switches, make sure that <input-file> and <output-file> precede them.

While cleanwad processes a WAD file, it will occasionally spit out
'progress' messages. These fall into 4 categories:

1) Warnings (something in the input file looked suspicious);
2) Completion of large steps (only a few of these);
3) Successful optimizations (saved space);
4) Detailed progress (way excessive for most people's needs).

You can to some extent control which of these are displayed by setting the
"verbosity level" to a number from 0 to 4. 4 means "all", 0 means
"nothing". 3 (the default) displays everything except category 4. Here is
an example of how this switch can be used:

    cleanwad a.wad b.wad v2

You might want to do this if cleanwad generates more optimization reports
than you want to see while it runs.

You can set certain optimization options by using the following command
line switches.
These switches must be preceded by a plus or a minus, depending on
whether you want to enable or disable them. The following turns the 'rp'
option off, while enabling 'rb' and 'pb' (stay tuned for an explanation of
what these do):

    cleanwad a.wad b.wad -rp +rb +pb

Some switches are on by default. They're the ones suggested by Olivier,
and are thus considered part of the cleanwad design spec. The rest are
what I implemented just for the fun of it. A list of all switches follows.
To see which ones default to ON, run cleanwad with no arguments.

+rr (or -rr) stands for 'remove redundant entries'. Turning this option on
causes cleanwad to check for a previous occurrence before adding an entry
to the output file. If an entry with the same name already exists, it is
replaced with the latest one. This happens as many times as necessary to
ensure that the output file contains no redundant entries. Maps are
handled the same way, i.e., a second occurrence of E1M1 causes the first
one, including all the map data following it, to be removed. Of course, a
second occurrence of THINGS in the same WAD isn't considered redundant.

+al (or -al) stands for 'align'. It causes every single directory entry in
the output file to start at a 4-byte boundary. I've been told that this
has significant effects on the speed of 32-bit applications, but I haven't
noticed any. You can turn the option off to save a little space.

+is (or -is) stands for 'ignore WAD syntax'. Some wads have misplaced
and/or mismatched .._START/.._END entries. Normally this causes cleanwad
to exit with a 'parse error'. If you have one of those wads and want to
use cleanwad on it anyway, you may try this option. Cleanwad will then
assume that any .._END it encounters is appropriate wherever it is. This
works nicely if an SS_START is (incorrectly) paired with an S_END.
However, when you use this option there is a chance that cleanwad will
misinterpret the type of some entry and crash.
The best way to use cleanwad is to have a wad that uses proper
.._START/.._END syntax.

+tw (or -tw) stands for 'truncate WAV resources'. Some editors create WAV
entries that take up more space than is necessary for the number of
samples. Cleanwad can truncate such sounds.

+rp (or -rp) stands for 'rebuild pictures'. The structure of DOOM picture
resources allows empty holes within such resources. Leaving this option on
gets rid of them. You must NOT disable this option if you want to use
either 'pp' or 'up'.

+pp (or -pp) stands for 'pack pictures'. A more radical approach to saving
disk space. Cleanwad can compress picture resources up to 8 times (though
ratios between 1.1 and 1.6 are much more typical), without changing their
functionality in any way. I got the idea from some node builders, which do
blockmap compression. Such compression is completely lossless, and I see
no rational reason why it could slow down the game. If anything, it might
reduce the amount of memory required by the doom refresh daemon during
initialization.

+up (or -up) stands for 'unpack pictures'. If you have compressed the
pictures in your WAD with '+pp', but for some reason want to bring them
back to normal, cleanwad can uncompress them for you. An attempt to
uncompress a picture that was not compressed should do no damage.
Similarly, you can compress a picture more than once -- its length will
simply not change after the first time. Note that some pictures don't
compress at all to begin with.

+rb (or -rb) stands for 'rebuild blockmaps'. Very similar to 'rp';
+pb (or -pb) stands for 'pack blockmaps'. Very similar to 'pp';
+ub (or -ub) stands for 'unpack blockmaps'. Very similar to 'up'
('pb' and 'ub' require 'rb' to be ON to work).

For more information on what the last six switches do, look at the source
code.


Known problems
----- --------

Certain Heretic pictures.
Apparently, they are not in the standard doom picture format, so when
cleanwad tries to process them it generates warnings like

    WARNING: picture CREDIT has an invalid header -- not processed

(here, CREDIT can also be HELP1 or HELP2). These warnings can be safely
ignored with Heretic. The reason I can't get rid of them is that doom.wad
and doom2.wad have similarly named resources that ARE real doom pictures.

Another problem encountered when processing heretic.wad is this:

    WARNING: WAV sound GFRAG has an invalid header -- not processed.

A hex dump of GFRAG shows that it indeed does not resemble a standard WAV
resource; in fact, the first half of it is just one character repeated
over and over. However, GFRAG is in the middle of other sounds, so it may
be a sound. Anyway, the GFRAG warning means nothing.

Another issue that may cause confusion is how cleanwad handles entries
like S_START, FF_END, etc. My assumption is that the '_START's and '_END's
in a WAD must form something similar to a set of matched parentheses. In
other words, if every type of _START/_END is assigned a unique type of
parenthesis, based on the characters that precede the underscore, then the
WAD must look like a proper algebraic expression, with data entries for
variables and operators. While testing cleanwad, I found that this rule is
broken all the time, and since I haven't seen a document describing the
rules for using _STARTs/_ENDs, I'm at a loss as to what's right and what's
wrong. If anyone can enlighten me on this, please do. For now, I am just
providing the '+is' option, so that if my understanding is wrong you can
still use cleanwad.

If you think this section is missing something, please let me know.


Bug reports
--- -------

Cleanwad may crash if the input file contains garbage (i.e., pointers to
places past the end of file, or entries whose names imply a certain type
but contain something else). It will exit to DOS with either a 'file IO
error' or an 'out of memory' message.
I consider such crashes normal behavior if the input file is corrupt. If
cleanwad crashes while processing a file that you believe is a valid WAD
file, send it my way and I'll attempt to figure out what's wrong.

Cleanwad should never write to uninitialized places in memory, no matter
what the input file is. I realize this is a very ambitious claim, but I
have put some effort into making cleanwad bulletproof in this sense, so if
it locks up your computer, let me know and I'll try to fix it.

I have made every effort to ensure that whatever cleanwad produces works
exactly like the original file. I realize that it won't take long until
somebody finds a WAD that would be damaged by cleanwad. If you're one such
lucky person, don't hesitate to email me.

Finally, feel free to report anything you find annoying about using
cleanwad. This includes things like excessive output to screen, confusing
error/warning messages, bad command line design, or anything else that you
think can be improved. If/when I receive a sufficient number of complaints
of adequate seriousness, I will release another version of cleanwad,
provided I have time.


Credits
-------

Id software      -- wrote doom
Matt Fell        -- wrote The Unofficial Doom Specs
Olivier Montanuy -- gave me the idea and suggested some of the design
me               -- spent long hours optimizing blockmap/picture
                    compression %-)
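

The matched-parentheses rule for _START/_END entries described under
'Known problems' can be tried out with a few lines of Python. This is only
a sketch and is not taken from the cleanwad source: the names marker_kind
and check_markers and the ignore_syntax flag are made up for illustration.
The ignore_syntax behaviour (any _END closes the innermost open _START,
whatever its prefix) is a guess at what '+is' might do, and so is the
choice to still reject an _END that has no open _START at all.

```python
def marker_kind(name):
    """Split a marker name such as 'S_START' into ('S', 'START').

    Returns None for ordinary data entries. Hypothetical helper.
    """
    for suffix in ("_START", "_END"):
        if name.endswith(suffix) and len(name) > len(suffix):
            return name[:-len(suffix)], suffix[1:]
    return None


def check_markers(names, ignore_syntax=False):
    """Return True if _START/_END entries nest like matched parentheses.

    names is the list of directory entry names, in file order.
    With ignore_syntax=True (an assumption loosely modelled on '+is'),
    an _END is accepted for the innermost open _START even when the
    prefixes differ, e.g. SS_START closed by S_END.
    """
    stack = []  # prefixes of the _START markers that are still open
    for name in names:
        kind = marker_kind(name)
        if kind is None:
            continue  # data entry: plays the role of a variable
        prefix, which = kind
        if which == "START":
            stack.append(prefix)
        else:
            if not stack:
                return False  # _END with nothing open
            opened = stack.pop()
            if not ignore_syntax and opened != prefix:
                return False  # e.g. SS_START ... S_END
    return not stack  # every _START must have been closed


if __name__ == "__main__":
    good = ["S_START", "TROOA1", "S_END"]
    bad = ["SS_START", "TROOA1", "S_END"]
    print(check_markers(good))
    print(check_markers(bad))
    print(check_markers(bad, ignore_syntax=True))
```

A stack is used because nesting is what the parenthesis analogy implies:
the most recently opened section has to be the first one closed.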