This file explains the workaround that Microsoft has implemented in its
compiler and run-time library for Intel Pentium processors that contain a
potential flaw in four instructions.

----------
Compiler

By default, the compiler generates safe code for FDIV and FPREM. Hooks are
also provided for (user-written) replacements for FPTAN and FPATAN.

Uses of the flawed instructions in inline-assembly code are flagged
(warning C4725, -W4) but not corrected by the compiler. Safe runtime
routines for a single, flexible form of the FDIV and FPREM instructions
are provided to aid manual conversion of such code to the safe form
(see the "Runtimes" section below).

The compiler generates the above-mentioned safe (or replaceable) sequences
by default. You can turn off the fix by using the compiler option
"-QIfdiv-".

----------
Runtimes

The following runtime routines are provided. The names given here are
C names; prefix the name with an underscore ( _ ) to get the "true"
assembler/OBJ name.

_adjust_fdiv -- flag that tells whether a 'flawed' Pentium is installed
    Used to 'short-circuit' calls to the safe routines in speed-critical
    sections of code. Example of use:

        ...
        pushfd                  ; save flags (if needed)
        fld     op1             ; load dividend
        fld     op2             ; load divisor
        cmp     __adjust_fdiv,0 ; 0=ok, !0=flawed
        je      ok              ; brif ok
        call    __safe_fdiv     ; safe version
        jmp     done
    ok:
        fdivp   st(1),st(0)     ; hdwr version
    done:                       ; either way, args gone, result on top of NDP
        popfd                   ; restore flags
        ...

    Alternatively, one could just always call the safe version (slower,
    but safe):

        pushfd                  ; save flags (if needed)
        ...
        fld     op1
        fld     op2
        call    xxx_fdiv
        ...
        popfd                   ; restore flags

_safe_fdiv -- safe divide routine
    Interface is the same as for the x87 NDP 'FDIV' instruction
    (aka FDIVP ST(1),ST(0)).
    Takes two arguments on the NDP, pops them, does the divide, and pushes
    the result onto the NDP. The routine does a 'safe' version of the divide.

_safe_fdivr -- safe reverse divide routine
    As for _safe_fdiv, but does the reverse operation.
    Interface is the same as for the x87 NDP 'FDIVR' instruction
    (aka FDIVRP ST(1),ST(0)).

_safe_fprem -- safe remainder routine (x87 compatible)
    As for _safe_fdiv, but does remainder.
    Interface is the same as for the x87 NDP 'FPREM' instruction.

_safe_fprem1 -- safe remainder routine (IEEE conformant)
    As for _safe_fdiv, but does IEEE remainder.
    Interface is the same as for the x87 NDP 'FPREM1' instruction.

_adj_fptan -- unsafe tangent routine (replaceable)
    As for _safe_fdiv, but (n.b.!!!) provides hooks only; does *not* do a
    safe version.
    Interface is the same as for the x87 NDP 'FPTAN' instruction.
    Users who want a safe version must replace this routine with one of
    their own.

_adj_fpatan -- unsafe arctangent routine (replaceable)
    As for _adj_fptan, but does atan.
    Interface is the same as for the x87 NDP 'FPATAN' instruction.
    Note: Does *not* do a safe version.

In summary:

    routine         safe?   replaceable?
    -------         -----   ------------
    _adjust_fdiv    n/a     n/a
    _safe_fdiv      y       y
    _safe_fprem     y       y
    _safe_fprem1    y       y
    _adj_fptan      n       y
    _adj_fpatan     n       y

----------
Performance

Performance of FDIV was measured on the following two interesting cases:

-- Worst Case: an (unrealistic) program that does nothing but FDIVs
-- Realistic Case: FPSpec, a set of FP-intensive programs

The results (run time relative to unsafe code on a good Pentium) are as
follows:

- Worst Case:
                    flawed Pentium    good Pentium
    unsafe code     (error)           1.0
    safe code       2.0               1.1

- Realistic Case:
                    flawed Pentium    good Pentium
    unsafe code     (error)           1.0
    safe code       1.10              1.01

In other words, the (extremely unlikely) worst-case penalty is 10% on a
good Pentium and 2x on a flawed Pentium; the realistic penalty is about 1%
on a good Pentium and 10% on a flawed Pentium.

As always, "Your mileage may vary." That is, you may see no slowdown in a
realistic program, or you may see 2x in a realistic program. If performance
is an issue for you, measure it and see what your actual results are.

----------
CAVEAT:

We have tested this fix extensively. However, as with all software, there
is always a possibility that bugs remain.
We assume that customers will rigorously test their applications to ensure
correctness.

Accuracy of floating-point operations is a complex subject. Even with an
accurate set of 'atomic' operations, such as +,-,*,/, a program can give
unexpected results. The C and C++ standards do not in general guarantee a
specific order of evaluation for expressions, nor do they guarantee that
intermediate results will be forced to a particular precision, so two
programs that are logically equivalent on the surface may yield different
results.

For a more detailed discussion of some of the above, see:

    Visual C++ documentation, "-Op" compiler option
    IEEE Floating-Point Standard
    C Language Standard

The first of these references is probably the most readable overview.

Most people need not worry about this, either because they do not use
floating point at all, or because they do not need an extremely high degree
of accuracy. Those who do need to worry are urged to make sure they
understand the issues rather than blindly assume that the tools will
"just work."
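
As an illustration of the point about logically equivalent programs, the
following C sketch adds the same three values in two different groupings.
It is not part of the Microsoft fix; the function names are hypothetical,
and the volatile temporaries are an assumption made here so that each
intermediate sum is rounded to double before it is used again.

```c
/* Illustration only: hypothetical helper names, not part of the
   Microsoft runtime. Both functions add the same three values; they
   differ only in which pair is added first. */

/* Computes (a + b) + c. The volatile temporary forces the intermediate
   sum to be stored as a double before the second addition. */
double sum_left_first(double a, double b, double c)
{
    volatile double t = a + b;
    return t + c;
}

/* Computes a + (b + c), again rounding the intermediate sum to double. */
double sum_right_first(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}
```

With IEEE double arithmetic, sum_left_first(1e16, -1e16, 1.0) returns 1.0,
while sum_right_first(1e16, -1e16, 1.0) returns 0.0, because -1e16 + 1.0
is not representable as a double and rounds back to -1e16. Neither result
involves the flawed instructions; the difference comes purely from
grouping and rounding.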