AV Product Test Protocol:
-------------------------

This document specifies the test procedures applied to test the precision and reliability of detection of PC-based boot, file and macro viruses. Where relevant, details concerning updates against the VTC 94-07 test are given.

1) Hardware and System Software used:
-------------------------------------

(Hardware for test 97-02 significantly different from test 94-07.)

The virus database of BOOT, FILE and MACRO viruses was held on a Windows NT 3.51 server. The server is based on a Pentium (100 MHz) with 64 MBytes of RAM. The Pentium is equipped with a 3.5" 1.44 MByte disk drive and two hard disks (1 GByte and 2 GByte, respectively).

Additionally, 3 MS-DOS clients are used for the test. The clients are essentially used to test the AV products with boot viruses. The clients run MS-DOS 6.20; their hard disks are only used for the boot process. The 3 clients have the following hardware:

 - Pentium 100 MHz, 8 MB RAM, 540 MB hard disk
 - Pentium 90 MHz, 8 MB RAM, 540 MB hard disk
 - 486 33 MHz, 8 MB RAM, 240 MB hard disk

The software consists of batch programs and scripts (PERL and AWK). Some UNIX programs such as AWK, GAWK, JOIN etc. have also been applied.

2) The Databases of File/Boot/Macro Viruses:
--------------------------------------------

(File and boot virus databases 97-02 are about doubled against 94-07; the macro virus database is newly established.)

An overview of the entries in the VTC virus databases (status: November 30, 1996) is given in Appendix 3: "A3TSTBED.zip". Following the description of the CARO boot/file as well as CARO macro naming conventions (TESTBED.VTC), the following indexes are contained (in ZIPped form):

   BOOTVIRS.VTC   Index of VTC boot virus database
   FILEVIRS.VTC   Index of VTC file virus database
   MACRVIRS.VTC   Index of VTC macro virus database

These entries (which also indicate the multiplicity of infected objects in the resp. entry) also conform with the related entries in the scanner evaluation protocols.

All file and boot viruses are sorted into their resp. database according to the diagnostic messages of three "standard" scanners (AVP, DSAV, F-Prot). The database of file viruses consists of two parts. If the three scanners identify a virus with the same name, it is stored in the first part of the resp. database. This part is named "CARO", as those standard scanners reflect an agreed CARO name. All viruses for which no such agreement on their name is visible are stored in the second directory, "NYETCARO" (Not-Yet-CARO). If some virus is not detected by any of the three scanners, it is stored in "UNKNOWN". The following file extensions are present in the file virus database: EXE, COM, SYS and BAT.

Contents of the file virus database:
   58,000 files infected with exactly ONE virus,
   10,704 different viruses, belonging to 3,012 different families.

Different from the file virus database, all boot viruses are stored on one drive at root level. The boot viruses are stored as images of boot sectors. The following extensions exist in the boot virus database: boo, img, mbr.

Contents of the boot virus database:
   2,577 images representing exactly ONE virus,
   827 different viruses, belonging to 466 different families.

The macro virus database is organised according to the CARO macro naming convention. For each macro virus, different goat documents were stored to test consistent identification and reliable detection.

Contents of the macro virus database:
   472 infected documents representing exactly ONE virus,
   143 different viruses, belonging to 63 different families.

2A) Additional Macro Malware Database:
--------------------------------------

(This test is new in 97-02.)

Concerning non-viral malware, VTC has collected several trojans, virus generators, droppers, intended and first-generation viruses etc.
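The three-scanner sorting rule from section 2 (CARO / NYETCARO / UNKNOWN) can be sketched as follows. This is a hypothetical Python stand-in for the original batch/AWK tooling; the input format (a mapping from scanner to its reported virus name, or None if the scanner did not detect the sample) is an assumption for illustration:

```python
def classify(names):
    """Pick the database part for one sample, per the section 2 rule.

    `names` maps each of the three standard scanners (AVP, DSAV,
    F-Prot) to the virus name it reports, or None if it reports
    nothing.  This input shape is a hypothetical sketch, not the
    actual log format used in the test.
    """
    reported = [n for n in names.values() if n is not None]
    if not reported:
        return "UNKNOWN"        # no scanner detects the sample at all
    if len(reported) == len(names) and len(set(reported)) == 1:
        return "CARO"           # all three agree on one (CARO) name
    return "NYETCARO"           # detected, but no naming agreement
```

For example, three identical diagnoses yield "CARO", while a sample one scanner misses or names differently lands in "NYETCARO".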
As a by-product of the VTC 97-02 test, the subset of non-viral macro malware was tested because it is well documented (see VTC's "List of Known Macro Malware", which summarizes both viral and non-viral macro malware). The testbed included 15 strains of macro malware:

 - 7 trojan horses and macro virus droppers,
 - 3 file (COM/BAT) viruses dropped from macro viruses,
 - 1 virus generator, and
 - 4 intended macro viruses.

3) Testing scanners on standard database of file infecting viruses:
-------------------------------------------------------------------

(Text essentially same as in Vesselin Bontchev's test 94-07.)

The viruses are stored in a huge subdirectory tree, the hierarchical structure of which reflects the CARO virus naming scheme, with the samples of each virus stored in the leaf directories of the tree. A virus can be (and usually is) represented by more than one replicant, although the different viruses are not represented by one and the same number of replicants. All replicants that contain one and the same virus are stored in one and the same directory. If two files are in two different directories, this means that they contain two different viruses. Each sample was reported by at least two scanners.

All efforts have been made to ensure that the samples used during the test are natural replicants of working viruses - no Germs, Corrupted files, or Intended viruses. Nevertheless, it is possible that we have made some mistakes in this respect. If somebody notices any mistakes of this kind, we shall appreciate being told about them.

Each scanner is run on this directory tree and the resulting report file is preprocessed. The preprocessing is done with a set of batch files, some Unix utilities ported to DOS (sort, join, cut, paste, awk), and a set of awk scripts. The preprocessed report contains four columns: the first contains the directory containing the viruses; the second, the number of scanned files in the directory; the third, the number of detected files; and the fourth, the information whether all files are reliably detected (with the same name). For each scanner, the report and the preprocessed data are stored in a special directory.

Not the whole output of the scanner is contained in the third column, because this output often tends to be too verbose. We have put there only the distilled information that we have judged important for that particular scanner. If we have missed some important information, we shall appreciate being told about it.

Additional remark for test 97-02: with the linear but fast growth of virus numbers, naming became less organised. When this test was prepared, less than 25% of virus names could be regarded as "CARO agreed", especially as members of the CARO naming committee were overloaded in their daily fight against new viruses and in helping victims of viral events. While VTC testers hope that the chaotic situation of virus naming may improve, we have left the 2nd column out of this report.

4) Testing scanners on standard database of boot sector infecting viruses:
--------------------------------------------------------------------------

(Text essentially same as in Vesselin Bontchev's test 94-07.)

The boot sector viruses are kept in a similar subdirectory tree as the files, containing the images of the infected boot sectors. For the purposes of the test, we used a program called SimBoot, developed by Dmitry Gryaznov. This program is still under development and is not available to the general public, but we will make it available to those producers of scanners who have reason to suspect that the program has unfairly interfered with their product and has not allowed it to be tested properly.

The program takes a file, of which the first 512 bytes are supposed to contain the first sector of a boot sector virus. It then emulates a blank, formatted floppy disk in drive A:, the boot sector of which is replaced by the image in the file.
If the file is smaller than 512 bytes, it is padded with zeroes. If the image contains a valid diskette BPB which indicates a particular diskette size, a diskette of that particular size is emulated. If a valid BPB is not found, a 360 KB diskette is emulated.

Currently only the first sector of the boot sector virus is put on the emulated diskette. SimBoot is able to handle complete viruses consisting of several sectors, but this requires that the file image of the virus conforms to a particular format. We did not have the time to prepare all our boot sector viruses in this way, although we are considering doing so in the future.

One major flaw of this approach is that hard disks, and respectively MBRs, are not emulated. The testing of a virus which infects only MBRs (e.g., Tequila) but not boot sectors of floppy disks is still done by putting an image of the infected MBR on the boot sector of the simulated diskette. We understand that this is not quite correct - a scanner may refuse to look for a particular virus on a diskette boot sector if it knows that this particular virus just cannot be there. The author of SimBoot is considering improving it in the future, in order to make it able to simulate hard disks too.

Once SimBoot creates the simulated infected diskette, it runs the scanner to be tested, as specified in the configuration file for this scanner. (The configuration files are available in the archive SCRIPTS.ZIP.) The scanner is supposed to scan the diskette (SimBoot intercepts all INT 13h requests to drive A: and redirects them to access the simulated diskette), report its status in the report file, and prompt the user to insert the next diskette to be scanned. SimBoot intercepts the prompt and simulates user input from the keyboard. Both the prompt and the required user input are specified in the configuration file for each scanner. SimBoot is able to handle scanners that write their prompts directly to the video RAM. It is also able to handle scanners that poll the keyboard directly when waiting for user input instead of using the BIOS. SimBoot is even able to simulate changing the status of the floppy drive from Closed to Open and then again to Closed, in order to handle those scanners which poll the DiskChanged line to figure out when the user has put in a new diskette.

Methodological remark: SimBoot was selected because more "realistic" test methods would be difficult to practice (e.g. testing viruses on diskettes requires either permanent formatting/infection/testing or a sequential test of many diskettes). But as with any simulation method (even one as well done as SimBoot), this method may be unfair to scanners which scan for real floppy characteristics. We have been informed that McAfee's Scan works in such a way; in this case, the real detection rate of such a product can only be assessed using some different test method.

The resulting report of each scanner is further preprocessed with a similar set of batch files and awk scripts as the report of the file virus scanning. (Same changes apply in test 97-02 as part 4.)

5) Testing scanners on ITW databases of boot/file infecting viruses:
--------------------------------------------------------------------

(This part is new in test 97-02.)

Based on VTC's "full" virus databases, 2 different ways for the determination of In-The-Wild viruses are possible:

5.1) A subset database is collected which contains only ITW viruses; tests could then be performed on this database.

5.2) From each scanner log, all related entries are collected into a subset log containing only ITW diagnoses.

Generally, one would assume that both procedures give the same results. To minimize the workload, the second procedure was selected. The resulting reports are processed by suitable awk scripts to yield the related summaries.

6) Testing scanners on standard database of Macro Viruses:
----------------------------------------------------------

(This part is new in test 97-02.)

All AV scanners are tested against three different macro-related databases. The first stores most macro viruses known on Nov. 30, 1996; the second contains all In-The-Wild (ITW) macro viruses; and the third has all other malware known at that date except viruses (trojans, droppers, intendeds etc). All malware included in the databases mentioned above matches the contents of the VTC Macro Virus List, which is published at the end of each month (see ftp.informatik.uni-hamburg.de/pub/macro/macrolst.*).

The malware database contains some file viruses which are created ("dropped") by macro viruses. We decided to test them in the context of the macro malware test because they only appear in the context of macro malware.

The directory structure of the virus database reflects the CARO naming scheme for macro viruses, with all samples of one variant stored in one subdirectory. Starting from the root directory of the database, the first level contains directories describing the host software (Word, Excel, Lotus123, AmiPro). The second level contains subdirectories with the names of the virus families, and the next level hosts subdirectories for all variants of that family, in which the viruses can be found. Optionally (only in the malware database), there is another subdirectory called "FILE" which contains the file viruses mentioned above. The number of samples for each virus varies between one and 78 (for Concept.A), although the average is 2-3 files each.

Our results are split into two sections, "detection of viruses" and "detection of files", where "detection of viruses" has two subsections: "unreliable detection" and "unreliable identification". An index of the malware databases is available in a3tstbed.zip:macrovirs.vtc.
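The host/family/variant layout described for the macro virus database can be illustrated with a small Python sketch that tallies goat documents per variant from database-relative paths. Both the helper and the file names are hypothetical; the original evaluation used batch/AWK tooling:

```python
from collections import Counter
from pathlib import PurePosixPath

def samples_per_variant(paths):
    """Count goat documents per macro-virus variant.

    Paths are assumed relative to the database root and laid out as
    host/family/variant/sample, mirroring the directory scheme
    described above (a sketch, not the actual index format).
    """
    counts = Counter()
    for p in paths:
        host, family, variant = PurePosixPath(p).parts[:3]
        counts[f"{host}/{family}.{variant}"] += 1
    return counts
```

For example, two goats under Word/Concept/A and one under Word/Wazzu/A would be tallied as 2 samples of Word/Concept.A and 1 of Word/Wazzu.A.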
After each scanner is run, all report files are preprocessed by the AWK scripts already mentioned in the description of the file virus test.

8) Creating the final summary of the results:
---------------------------------------------

(Text essentially same as in Vesselin Bontchev's test 94-07; details adapted.)

The final evaluations for all tests are very similar. Only one report of the file and macro virus tests is used to get the total number of files in each directory; for the boot viruses, the configuration file from SimBoot is used. Three new files result from these processes. The new files contain the directory name and the total number of files in this directory. Each preprocessed report is joined with the new file. One AWK script evaluates the result of the joining. The results are as follows:

 - The number of viruses (+malware) detected: it is not necessary that all samples of the virus are detected.
 - The number of viruses with unreliable (=inconsistent) identification: all files of a virus are detected, but at least one sample is identified with a different name.
 - The number of viruses with unreliable detection: here, not all samples of a virus are detected, but at least one.

The files containing the preprocessed information mentioned above are huge, although they are reduced to contain essentially the virus names. For all tested scanners (latest version), they are included in a separate archive (Scan-Res) for anonymous ftp.
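The per-virus tallies produced by the joining step can be sketched in Python as follows. The row format (virus directory, total samples, detected samples, consistent-naming flag) is assumed from the four-column preprocessed report described in section 3; the original used an AWK script:

```python
def summarize(rows):
    """Tally detection results from joined, preprocessed report rows.

    Each row is (virus_dir, total, found, consistent) - an assumed
    stand-in for the joined four-column report, not the actual file
    format.  Returns (detected, unreliably_detected, unreliably_identified).
    """
    detected = unrel_det = unrel_id = 0
    for _virus, total, found, consistent in rows:
        if found == 0:
            continue            # virus missed entirely: counts nowhere
        detected += 1           # at least one sample was detected
        if found < total:
            unrel_det += 1      # some, but not all, samples detected
        elif not consistent:
            unrel_id += 1       # all detected, but names disagree
    return detected, unrel_det, unrel_id
```

A fully detected virus with consistent names thus contributes only to the "detected" count, matching the definitions in the list above.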