Cavendish Software Ltd

Memory Allocation Tracking for C++

NewTrack v2.0

Document Reference: NEWTK200
Last Printed: 16/03/93 17:18
Revision 1.01 by John Spackman, Tuesday 16th March 1993

Copyright (c) 1991-1993 Cavendish Software Ltd. All Rights Reserved.

Table Of Contents

Introduction
How NewTrack can ease development
    Multiple or Invalid Deletions
    Pointer Overruns and Underruns
    Unfreed Memory
    Uninitialised Data
    Using Deleted Memory
    Out of Memory
How to Use NewTrack
    Adding NewTrack to the build
    Starting and Stopping NewTrack
    Temporarily Stopping NewTrack
    Dialogue Box
    Debugging with NewTrack
Pitfalls and Programming Considerations
    Malloc()
    Automatics
Porting to other Environments and Compilers
    Memory Model
    Versions of operator new()
    Getting the caller's address
    Inline assembler
    Shared libraries/DLLs
    User I/O
Comments, and where to find us
Copyright, Use & Copying

Introduction

NewTrack is an extension to C++ programs which validates deletes, discovers memory overruns and underruns, and shows up any unfreed memory allocations.
Written for Borland C++ v3.1 and MS-Windows, NewTrack is a surprisingly small and simple library and DLL combination which can discover many intermittent and unreproducible faults within an application.

Since Windows is a protected mode environment, you might think that illegal memory access, such as that caused by overrunning an allocated block or deleting an invalid pointer, would be trapped with a GP Fault. Not true. Although Windows apps have a theoretical address space of 64Gb in 386 enhanced mode, you can only make 8192 calls to GlobalAlloc(). As a result, the compiler of choice has to allocate much larger "pages" of memory and suballocate them itself when malloc() is called. If you then overrun your allocation, the chances are you write on other parts of your own data, and invalid deletions (ie calls to free()) make the compiler quietly disfigure its own suballocation tables.

By keeping a list of all allocations a program makes via calls to new, NewTrack can determine whether a deletion is valid and tell the user if it is not. By allocating extra memory either side of the block requested by new and filling it with a known value, NewTrack can detect whether the program overran (or underran) the requested block: it compares the additional memory with the known value when the block is deleted.

Another side-effect of the suballocation scheme is that deleted memory remains not only accessible but also holds once-sensible values. Immediately before deletion, NewTrack fills the memory with 0xFF. Similarly, when memory is first allocated its contents are unknown, although often zero. NewTrack fills the memory with a known non-zero value when it is allocated, forcing the programmer to initialise data explicitly.

NewTrack currently only attempts to replace the new and delete calls, because to replace malloc(), farmalloc(), etc would require a private memory suballocation scheme. As in the standard new and delete functions, memory is allocated and deallocated by using farmalloc() and free() respectively.
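The guard-and-fill scheme described above can be sketched in portable C++. This is only an illustration, not the NewTrack source: the names guarded_alloc() and guarded_free(), the guard value 0xA5, the 16-byte guard size and the 16-byte size prefix are all assumptions made for the example. Only the fill values 0x23 (fresh memory) and 0xFF (deleted memory) come from this document.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

namespace sketch {

const unsigned char GUARD_BYTE = 0xA5;  // assumed guard pattern
const unsigned char FRESH_BYTE = 0x23;  // fill for newly allocated memory
const unsigned char DEAD_BYTE  = 0xFF;  // fill applied just before release
const std::size_t   GUARD_SIZE = 16;    // assumed size of each guard area
const std::size_t   PREFIX     = 16;    // assumed room to store the size

enum Status { BLOCK_OK, BLOCK_UNDERRUN, BLOCK_OVERRUN };

// Layout: [size prefix][front guard][user block][rear guard]
void* guarded_alloc(std::size_t size) {
    unsigned char* raw = static_cast<unsigned char*>(
        std::malloc(PREFIX + GUARD_SIZE + size + GUARD_SIZE));
    if (raw == 0) return 0;
    std::memcpy(raw, &size, sizeof size);               // remember the size
    std::memset(raw + PREFIX, GUARD_BYTE, GUARD_SIZE);  // front guard
    unsigned char* user = raw + PREFIX + GUARD_SIZE;
    std::memset(user, FRESH_BYTE, size);                // force initialisation
    std::memset(user + size, GUARD_BYTE, GUARD_SIZE);   // rear guard
    return user;                                        // offset pointer
}

static bool guard_intact(const unsigned char* guard) {
    for (std::size_t i = 0; i < GUARD_SIZE; ++i)
        if (guard[i] != GUARD_BYTE) return false;
    return true;
}

// Checks both guards, scribbles over the user block, releases the memory.
Status guarded_free(void* p) {
    assert(p != 0);
    unsigned char* user = static_cast<unsigned char*>(p);
    unsigned char* raw = user - GUARD_SIZE - PREFIX;    // real block start
    std::size_t size;
    std::memcpy(&size, raw, sizeof size);
    Status status = BLOCK_OK;
    if (!guard_intact(raw + PREFIX))    status = BLOCK_UNDERRUN;
    else if (!guard_intact(user + size)) status = BLOCK_OVERRUN;
    std::memset(user, DEAD_BYTE, size);                 // mark as deleted
    std::free(raw);
    return status;
}

}  // namespace sketch
```

Copying a 5-character string plus its terminator into a block from guarded_alloc(5) damages the rear guard, so guarded_free() reports BLOCK_OVERRUN; the same copy into a 6-byte block reports BLOCK_OK.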
Please note that NewTrack has only been tested (although very extensively) in Large memory model under Borland C++ v3.0 and 3.1.

How NewTrack can ease development

Multiple or Invalid Deletions

During the development of any large program, a number of bugs will creep into the code regardless of the best efforts (or BS standards!), and it's not surprising if sometimes a pointer is deleted more than once, or a critical block of code is never called to initialise a pointer.

Sometimes this is easy to spot: under Windows an instant GP Fault or "UAE" may occur, because the free() function attempts to write on a header block of memory that is no longer available, and the fix is simple and quick. But sometimes that memory will already have been reallocated, and free() will erroneously write on memory that is in use and inexplicably corrupt data.

A further problem is that Windows (like all good operating systems) can often corrupt its own internal data structures, leaving you (or rather, the user) with an unstable system that is likely to crash at any time, possibly even after your application has terminated cleanly.

NewTrack will detect and prevent any attempt to delete an invalid pointer (including NULL), by keeping a list of all the allocations which have been made and checking the pointer against this list. If the pointer is not in the list, the user will be allowed to continue or abort the program using the dialogue box described below.

Pointer Overruns and Underruns

A not so common pitfall is to overrun or underrun a block of memory. An Overrun is simply to use too much memory and exceed the allocated block; an Underrun is to exceed the limits in the other direction and use memory in front of the allocated block.
Probably the most common example of an Overrun is allocating a block that is strlen(my_string) bytes long to hold a copy of a string; when strcpy() is used to copy my_string into the allocated block, it adds a trailing NUL character, overrunning the available space by just 1 byte.

Again, these bugs are sometimes easy to spot in a "protected" environment such as Windows; a GP Fault will occur and you can hopefully still trace it in the debugger. However, it is quite likely that the next time you run the program (eg under the debugger), the Overrun or Underrun writes on memory which you have already allocated, and therefore quietly corrupts memory without telling you.

NewTrack will detect these bugs by always allocating a small amount (12 bytes) of extra memory to act as a header and footer to the block which you allocate. It fills these areas with a known value, and checks them again when the pointer is deleted. If a value in the header or footer has changed by the time the pointer is deleted, an Overrun or Underrun has occurred, and the user will be allowed to continue or abort the program using the dialogue box described below.

Unfreed Memory

A much more frequent and harder bug is failing to free up allocated memory. This is less noticeable under Windows because of the virtual memory it provides, but it is still very difficult to diagnose and to track down. NewTrack was initially developed explicitly to solve this problem, during a project that eventually grew to over 2.5Mb of C++ source code before the first release.

Even if you do know that "somewhere" inside 2.5Mb of event-driven code in executables and DLLs you're losing approximately 2K a minute (for example), or that after running for three days Windows says it's low on memory, that knowledge does nothing for the programmer except cause a very big headache.

When new is called to allocate memory, NewTrack will record the allocation and the address of the code that called it.
This means that, using a debugger, you can set a breakpoint at the hexadecimal address recorded, restart the program, and the debugger will instantly show you where the unfreed allocation was made, often reducing the debugging time from hours or days to minutes.

At the end of a program, when NewTrack is terminated, it will check its list for any unfreed allocations and report the number in a dialogue box. So that the list is easy to read in the debugger, NewTrack will then collate the allocations into an array.

Uninitialised Data

Uninitialised data is a bug whose source is quite obvious when it appears, but which is often difficult to reproduce. A program always works fine until the first day at the customer site, when the very first thing it does is print two-and-a-half pages of garbage across the screen and die, because you've just tried to print an uninitialised string.

As programmers, we tend to keep pretty much the same working environment; under Windows or DesqView, I can move between my 3 apps with my eyes shut, because they are always in the same configuration (BC++, MSDOS, & Program Manager). I also reboot fairly often. This means that I have large expanses of zeroed memory in my 16Mb PC, and uninitialised data won't show up as easily as on a well-used user's PC which is short on RAM, where memory quickly stops being zeroed.

NewTrack solves this problem by always initialising allocated memory to garbage. An arbitrary value (in fact 0x23, or '#') is used to fill any allocated block before it is returned to the application.

Using Deleted Memory

Just as using uninitialised allocated data can cause a problem, using initialised de-allocated memory can also be very destructive, just when you don't want it. The Windows protected-mode kernel will stop you from using areas of memory which are no longer in use by the suballocation scheme, but it won't otherwise do anything for you.
And while you inadvertently carry on using this area of memory, it may become corrupt at any time (from your point of view), and if you write to it, you corrupt it for another piece of code. Just as with uninitialised data, it's easy not to notice that you're using a deleted block because the program happens to work on your machine.

NewTrack can only partially solve this problem, by setting all the data in a block to a known garbage value (0xFF) immediately before calling free(). This makes use of a deleted block far easier to detect (pointers especially).

Out of Memory

Because new is used in C++ much more than malloc() and its equivalents were in C, it is very inconvenient, as well as time and space consuming, to always check that the allocation succeeded. This is the only function of NewTrack that is also provided as standard in any other C++ compiler; the function set_new_handler() can be used to set a pointer to a function which is called when the default new function runs out of memory when calling malloc().

NewTrack handles this condition itself, and ignores the function set by set_new_handler(). As with other errors, it will notify the user with a dialogue box.

How to Use NewTrack

Adding NewTrack to the build

NewTrack comprises two source files. The first, NEWTRACK.CPP, contains replacement versions of new and delete and must be compiled and linked into every .DLL as well as the .EXE. These replacements call functions in the NewTrack .DLL so that NewTrack can share its list of allocations; if it could not do this, data new'ed by one module (ie .DLL or .EXE) could not be delete'ed by another (this does not mean that tasks are allowed to free each other's memory).

When NewTrack allocates additional memory for overrun and underrun checking, it allocates a single block and returns a pointer offset (by the size of the header) from that returned by malloc().
This pointer cannot be freed by any function other than the NewTrack version of delete, so NEWTRACK.CPP must be built into each and every module that makes up an application during its development.

The .DLL is made up of a single source file called NEWTDLL.CPP. The import library generated from this .DLL, together with NEWTRACK.CPP, should then simply be added to each module.

Starting and Stopping NewTrack

NewTrack has to be explicitly started and stopped by calling the following two functions:

    void NT_Initialise(void);
    void NT_Terminate(void);

The best place for these is at the start and end of main() respectively. All calls to new and delete will be passed through NewTrack if you have NewTrack linked in; however, NewTrack only runs (ie performs validation and allocates extra memory) after NT_Initialise() has been called and before NT_Terminate() is called. If NewTrack is not running, new and delete will have the same effect as the default.

In order to detect memory overruns and underruns, NewTrack will allocate a larger amount of memory than asked for, fill in the known values, and return the pointer returned by malloc() plus the size of the header. Because NewTrack has added extra bytes to the pointer, the pointer returned by new cannot be passed directly to free(), either by you or by the default delete, because it no longer points to the start of a memory block.

Therefore, a pointer allocated before NT_Terminate() cannot be freed after the call to NT_Terminate(), whether by calling free() or by calling delete. Remember that this includes memory deallocated by a global destructor. Any memory new'ed while NewTrack is running must also be delete'ed while NewTrack is running, and vice versa.
Remember that this includes memory freed by a global destructor, and memory owned by automatic variables: the compiler implicitly constructs automatics where you declare them but destroys them after the last statement in the block, so anything their constructors new is delete'ed at the end of the block. You should therefore not declare automatics in the same block as the NT_Initialise()/NT_Terminate() pair. If necessary, simply declare a block around the code between the two functions.

Temporarily Stopping NewTrack

Sometimes you do not want calls to new and delete to be passed to NewTrack; this is true of anything which is allocated while NewTrack is running but freed while it is not, or vice versa. NewTrack includes macros which can temporarily disable NewTrack for the current module.

As described above, anything allocated when NewTrack is running must also be freed when NewTrack is running, and vice versa. This is also true of temporarily stopping NewTrack, because while it is stopped the calls to new and delete are passed on directly to malloc() and free().

There are four macros, defined in NEWTRACK.HPP:

    NEWTRACK_ON()
    NEWTRACK_OFF()
    NEWTRACK_PUSH()
    NEWTRACK_POP()

NEWTRACK_ON() and NEWTRACK_OFF() unconditionally enable and disable NewTrack respectively. NEWTRACK_PUSH() and NEWTRACK_POP() save and restore the state in a temporary variable. The NEWTRACK_PUSH() macro declares a variable in the code where it is called, and must therefore be called from either the same block as NEWTRACK_POP() or an enclosing one; both must be called within the same function.

Dialogue Box

In order to report the errors it detects and to determine the next course of action, NewTrack uses the MessageBox() function to throw up a system-modal dialogue box with a short description of the error. The box has two buttons - OK and Cancel. OK will allow the error to be ignored (whilst preventing any destructive action, such as an invalid pointer deletion), and Cancel will cause the program to abort.
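Returning to the NEWTRACK_PUSH()/NEWTRACK_POP() macros described under Temporarily Stopping NewTrack: the sketch below shows one way such save-and-restore macros could be built, and why the pair has to share a function. It is a stand-in only. The real definitions live in NEWTRACK.HPP and may well differ; the SK_ names, the sk_tracking flag and the sk_saved_state variable are all hypothetical.

```cpp
#include <cassert>

// Stand-in for the per-module flag that switches tracking on and off.
static int sk_tracking = 1;

#define SK_ON()   (sk_tracking = 1)
#define SK_OFF()  (sk_tracking = 0)
// PUSH declares a local variable, so POP must be able to see it:
// same block or a nested one, and always the same function.
#define SK_PUSH() int sk_saved_state = sk_tracking
#define SK_POP()  (sk_tracking = sk_saved_state)

// Runs a section of code with tracking off, then restores whatever
// state the caller had. Returns the state seen inside the section.
int allocate_untracked() {
    SK_PUSH();                        // remember the caller's state
    SK_OFF();                         // tracking is now off
    assert(sk_tracking == 0);
    int state_inside = sk_tracking;   // allocate and free untracked data here
    SK_POP();                         // back to the caller's state
    return state_inside;
}
```

Using the push/pop pair rather than SK_OFF() followed by SK_ON() means the function behaves correctly whether or not tracking was already off when it was called.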
Debugging with NewTrack

As described in the previous section, when NewTrack discovers an error it displays a dialogue box. If the user presses OK, NewTrack then executes INT 3, the standard method of communicating a breakpoint to a debugger. The programmer can then trace back through the stack to the particular call to delete that failed, and find out why.

When NT_Terminate() is called, it will also execute INT 3 after it has generated the list of unfreed allocations. After the debugger has broken in for the first time, you should "Run" or "Go" (not step or trace) to reach the second INT 3.

The list is pointed to in the NewTrack code by a local variable called blist, each of whose members is a pointer to a structure identifying the memory allocated. Each structure contains a far void pointer called caller, which identifies the address of the call to new that allocated the pointer. By using the debugger to set a breakpoint at that address (or addresses) and restarting, you can see where the memory was allocated from the next time the program breaks. The structure also contains a pointer to the real address of the allocation (ie the pointer returned by malloc(), which will contain a 12-byte header), the size, etc.

Pitfalls and Programming Considerations

Malloc()

NewTrack only recognises allocations passed through its new function as valid for its delete function, so always make sure that allocations are deallocated with the function that corresponds to the allocation. Memory obtained from functions such as malloc(), calloc(), realloc(), etc, may only be released with free(); calling delete will result in an "Invalid Pointer Deletion" error message. Note also that strdup() uses malloc(), not new.

Automatics

The compiler implicitly constructs automatic variables where they are declared and destroys them after the last statement in the block, so any memory they new is delete'ed at the end of the block. Take care not to declare automatic variables between the calls to NT_Initialise() and NT_Terminate() in the same block as the calls.
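The Automatics pitfall can be illustrated with a small self-contained sketch. It does not use NewTrack itself: initialise() and terminate() are stand-ins for NT_Initialise() and NT_Terminate() that merely toggle a flag, and the Buffer class and freed_while_stopped counter are hypothetical names invented for the example.

```cpp
#include <cassert>

namespace sketch {

bool running = false;          // stand-in for "NewTrack is running"
int  freed_while_stopped = 0;  // counts deletes made after terminate()

void initialise() { running = true; }   // stand-in for NT_Initialise()
void terminate()  { running = false; }  // stand-in for NT_Terminate()

// An automatic object that news in its constructor and deletes in its
// destructor, as many C++ classes do.
struct Buffer {
    char* data;
    Buffer() : data(new char[16]) { assert(running); }
    ~Buffer() {
        if (!running) ++freed_while_stopped;  // this is the mismatch
        delete[] data;
    }
private:
    Buffer(const Buffer&);             // copying not needed here
    Buffer& operator=(const Buffer&);
};

// Wrong: b lives in the same block as the calls, so its destructor
// runs after terminate().
void wrong_order() {
    initialise();
    Buffer b;
    terminate();
}

// Right: the extra block ends, and b is destroyed, before terminate().
void right_order() {
    initialise();
    {
        Buffer b;
    }
    terminate();
}

}  // namespace sketch
```

Calling right_order() leaves freed_while_stopped at 0, while wrong_order() increments it, because the destructor runs at the closing brace of the function, after the stand-in for NT_Terminate().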
Porting to other Environments and Compilers

There are several areas which could cause problems when porting NewTrack, but these are pretty much off the top of my head, so apologies for any mistakes.

Memory Model

NewTrack has only ever been used in Borland C++ v3.0 and v3.1 in Large memory model, under protected mode Windows 3.1. The current BC++ version is v3.1, but the only difference from BC++ v3.0 is an additional version of new. NewTrack should port easily to other memory models, with judicious additions of the far keyword for pointers in small models.

Versions of operator new()

Firstly, starting with Borland C++ v3.1, there are two versions of new - one for small (0 to 64Kb) allocations, and one for huge (64Kb to 4Gb) allocations. It doesn't matter to NewTrack which is used; it handles 0 to 4Gb (theoretically - when I have 4Gb in my PC I'll try it!). If your compiler only has one version, just exclude the other with conditional compilation.

Getting the caller's address

Secondly, the caller's address is obtained by getting the 4-byte address from the stack as BP- 2 - . I worked this out through trial and error, by tracing the assembler under the debugger and looking at BP before the call and at the first statement in operator new(). This will vary between memory models.

Inline assembler

The debugger hook (ie executing INT 3) is done by inline assembler; not all compilers support this, and some have a different syntax. If the compiler does not support inline assembler, call the Windows function DebugBreak(). This will place the debugger in the assembler code of that function, on a RET instruction. Simply trace through the RET to return to the source. NewTrack uses inline assembler to avoid the inconvenience of tracing back out of the function.

Under DOS, the function can be created by allocating a two-byte area and filling it with the values 0xC3 0xCC (eg unsigned short fred = 0xC3CC;).
These two bytes are the opcodes for INT 3 (0xCC) and RET (0xC3) in reverse order (Intel chips reverse the bytes when storing integers). If a function pointer is created to point to this value, it can be called to generate the interrupt (eg void (*myint03function)(void) = (void (*)(void))&fred;).

Shared libraries/DLLs

You must only have one copy of NEWTDLL.CPP in the entire executable. This probably won't be a problem except where, as under Windows, DLLs are linked as separate modules with almost all the qualities of a separate executable. The actual requirement (which is solved for Windows by having NEWTDLL.CPP in a DLL) is that there must be a single global 'collector' or administrating piece of code for all allocations that are made, whether by the main executable or by another shared library or DLL. Having a separate copy of NEWTDLL.CPP for each library will mean that all allocations made by one library are collected by that library, and must then be deleted by the same library or the allocation will not be recognised.

Remember that the calls to new and delete made by NEWTDLL.CPP must not be validated by the NEWTDLL code, because the allocation and deallocation would become infinitely recursive. This is achieved by turning off the __newtrack flag (which is what the NEWTRACK_XXX() macros access) in that module.

User I/O

You will need to put in conditional compilation to replace the call to MessageBox(), but little else is required.

Comments, and where to find us

Please let us know what you think, even if it's just to say you use it. We'd really like to know whether it's good or bad, what can be improved, what works and what doesn't. Also, if you send code changes to improve it or port it to another platform/compiler (for example), we'll try and put them in if we have a suitable environment to develop in, but no promises.

This is only one of a series of C++ tools that we're working on.
Soon to arrive (but not shareware) is an inter-process communication class called IPC (!) which allows transparent communication across any network, no matter how far or wide. Also on the way is a portable filing system class library (called FileServ) which supports data dictionaries, automated file version control, transparent interchange of filing systems, retrieving records as of a given date and time (eg the entire record that was in use at 4:23pm on 23 April 1992, but has since been deleted or modified beyond recognition) and much more. FileServ will support many popular databases such as Paradox, Btrieve (already in use) and C-ISAM, with the ability to easily add others.

NEWTRACK IS NOT A REGISTERABLE PRODUCT - sorry, but we can make no warranties for NewTrack. We will probably maintain a list of NewTrack users who send mail and tell them about new versions (unless they ask otherwise), but it depends on volume, so no guarantees. The upside is that there is no charge (this is completely free software), although things may change in the future. On the other hand, if you feel it's a useful product and/or you feel so inclined, we suggest something around the UK£10 or US$15 mark is about right.

Contact John Spackman on Compuserve at 100034,207, on CIX as nasrudin or as nasrudin@cix.compulink.co.uk, or snail mail at:

    Cavendish Software Ltd
    50 Avenue Road
    Trowbridge
    Wilts BA14 0AQ
    England

You can also ring or fax us on 0225-763598 or 0225-777359 respectively; international callers will have to replace the leading 0 with 44.

Copyright, Use & Copying

NewTrack is copyright to Cavendish Software Limited, from 1991, and is released to the public free of charge or restriction of use, except that the package may not be sold in part or in whole except in a compiled form as part of an executable that cannot form part of another executable. NewTrack may be freely distributed as source so long as the entire package is given.
NewTrack may be modified as required, but will remain copyright of Cavendish Software Ltd, until such changes have been made to render the source code and function of NewTrack unrecognisable from the original.

This software is provided by Cavendish Software Limited "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. In no event shall Cavendish Software Limited be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.