new object causes corruption on the heap - c++

I've been struggling with a heap corruption problem for a few days. I was first warned by the VS 2005 debugger that I may have corrupted the heap after deleting an object I had previously new'ed. Researching this problem led me to gflags and the page heap setting. After enabling this setting for my particular image, it supposedly pointed me to the line that is actually causing the corruption.
Gflags identified the constructor for the object in question as the culprit. The object derives as follows:
class POPUPS_EXPORT MLUNumber : public MLUBase
{
    ...
};

class POPUPS_EXPORT MLUBase : public BusinessLogicUnit
{
    ...
};
I can instantiate an MLUNumber in a separate thread, and no heap corruption occurs.
I can instantiate a different class, that also inherits from MLUBase, that does not cause heap corruption.
The access violation raised by the corruption occurs on the opening brace of the constructor, which appears to point at the implicit initialization of the object (?).
The base class constructor (MLUBase) successfully finishes.
From digging with the memory window in vs 2005, it appears that there was not enough space allocated for the actual object. My guess is that enough was allocated for the base class only.
The line causing the fault:
BusinessLogicUnit* biz = new MLUNumber();
I'm hoping for either a reason that might cause this, or another troubleshooting step to follow.

Unfortunately, with the information given, it's not possible to definitively diagnose the problem.
Some things you may want to check:
Make sure BusinessLogicUnit has a virtual destructor. When deleting objects through a base pointer, a virtual destructor must be present in the base class for the subclass to be destructed properly.
Make sure you're building all source files with the same preprocessor flags and compiler options. A difference in flags (perhaps between debug/release flags?) could result in a change in structure size, and thus an inconsistency between sizes reported in different source files.
It's possible for some types of heap corruption to go undetected, even with your gflags settings. Audit your other heap uses to try to find the source of your issues as well. Ideally you should put together a minimal test case that will reliably crash, but with a minimum amount of activity, so you can narrow down the cause.
Try a clean solution and rebuild; I've occasionally seen timestamps getting screwed up, letting an old object file with an out-of-date structure definition sneak in. Worth checking at least :)

BusinessLogicUnit* biz = new MLUNumber();
How do you delete the memory? Using the base-class pointer? Have you made the destructor of BusinessLogicUnit virtual? It must be virtual.
class BusinessLogicUnit
{
public:
    //..
    virtual ~BusinessLogicUnit(); // it must be virtual!
};
Otherwise deleting the derived class object through the base-class pointer invokes undefined behavior as per the C++ Standard.
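For illustration, here is a minimal self-contained sketch of that rule, reusing the names from the question (export macro and everything else stripped out):
#include <iostream>

struct BusinessLogicUnit
{
    virtual ~BusinessLogicUnit() { std::cout << "~BusinessLogicUnit\n"; }
};

struct MLUNumber : BusinessLogicUnit
{
    ~MLUNumber() { std::cout << "~MLUNumber\n"; }
};

int main()
{
    BusinessLogicUnit* biz = new MLUNumber;
    // With the virtual destructor both destructors run, derived first,
    // and the full MLUNumber is deallocated. Without 'virtual' this
    // delete would be undefined behavior.
    delete biz;
}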

BusinessLogicUnit is not an MLUNumber. Why would you allocate this way? Instead
BusinessLogicUnit* biz = new BusinessLogicUnit();

Or maybe you do something like this?
struct A
{
    SomeType & m_param;
    A(SomeType & param) : m_param(param)
    {
        // ... use m_param here ...
    }
};
A a(SomeType()); // passing a temporary by reference
Then that's undefined behaviour, because the referenced temporary dies at the end of the full expression, so m_param dangles from then on. (Binding a temporary to a non-const reference is itself non-standard; MSVC allows it as an extension, which is why this even compiles.)
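A sketch of the usual fix, assuming SomeType is copyable: store a copy rather than a reference, so the object owns its own data:
struct SomeType { /* ... */ };

struct A
{
    SomeType m_param; // by value: A owns its own copy
    explicit A(const SomeType & param) : m_param(param) {}
};

int main()
{
    A a((SomeType())); // fine now: the temporary is copied before it dies
}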

I agree with bdonlan that there isn't enough information yet to figure out what's wrong. There are a lot of good suggestions here, but just guessing possible reasons why an application is crashing is not a smart way to root cause an issue.
You've done the right thing by enabling instrumentation (pageheap) to help you narrow down the problem. I would continue down this path by finding out exactly which memory address is causing the access violation (and where the address came from).

Related

Should I reset a primitive member variable in the destructor?

Please see the following code:
#include <iostream>
#include <windows.h> // for Sleep
using namespace std;

class MyClass
{
public:
    int i;
    MyClass()
    {
        i = 10;
    }
};

MyClass* pObj = nullptr;

int main()
{
    {
        MyClass obj;
        pObj = &obj;
    }
    while (1)
    {
        cout << pObj->i; // pObj is a dangling pointer, still no crash
        Sleep(1000);
    }
    return 0;
}
obj will die once it goes out of scope. But when I tested this in VS 2017, I saw no crash even after using it.
Is it good practice to reset the int member variable i?
Accessing a member after an object has been destroyed is undefined behavior. It may seem like a good idea to set members in a destructor to a predictable and most likely unexpected value, e.g., a rather large value or a value with a specific bit pattern, making it easy to recognize the value in a debugger.
However, this idea is flawed and dwarfed by the system:
All classes would need to play along, and instead of concentrating on creating correct code, developers would spend time (both development time and run time) making pointless changes.
Compilers have become rather smart and can detect that changes made in a destructor are not needed. Since a correct program cannot detect whether the change was made, they may not make the change at all. This effect is an actual issue for security applications where, e.g., a password should be erased from memory so it cannot be read (by some non-portable means).
Even if the value gets set to a specific value, memory gets reused and the values get overwritten. Especially with objects on the stack, it is quite likely that the memory is used for something else before you see the bad value in a debugger.
Even when resetting values you wouldn't necessarily see a "crash": a crash is caused by something being set up to protect against something invalid. In your example you are accessing an int on the stack: the stack will remain accessible from the CPU's point of view, and at best you'd get an unexpected value. Use of unusual pointer values typically leads to a crash because the memory management system tries to access a location which isn't mapped, but even that isn't guaranteed: on a busy 32-bit system pretty much all memory may be in use. That is, trying to rely on undefined behavior being detected is also futile.
Correspondingly, it is much better to use good coding practices which avoid dangling references in the first place, and to concentrate on those. For example, I always initialize members in the member initializer list, even in the rare cases where they end up getting changed in the body of the constructor (i.e., you'd write your constructor as MyClass(): i() {}).
As a debugging tool it may be reasonable to replace the allocation functions (ideally the allocator object, but potentially the global operator new()/operator delete() and family) with a version which doesn't quickly hand out released memory and instead fills released memory with a predictable pattern. Since these actions slow down the program, you'd only use this code in a debug build, but it is relatively simple to implement once and easy to enable/disable centrally, so it may be worth the effort. In practice I don't think even such a system pays off, as use of managed pointers and proper design of ownership and lifetime avoid most errors due to dangling references.
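For what it's worth, here is a minimal sketch of such a replacement (the 0xDD pattern and the size-stashing header are my own choices for illustration; a production version would also pad the header to max_align_t, and quarantine freed blocks instead of handing them straight back to free()):
#include <cstdlib>
#include <cstring>
#include <new>

namespace {
    const unsigned char kFreedPattern = 0xDD; // recognizable "dead" filler

    // Stash the requested size in a header so the matching delete
    // knows how many bytes to poison.
    struct Header { std::size_t size; };
}

void* operator new(std::size_t size) {
    Header* h = static_cast<Header*>(std::malloc(sizeof(Header) + size));
    if (!h) throw std::bad_alloc();
    h->size = size;
    return h + 1;
}

void operator delete(void* p) noexcept {
    if (!p) return;
    Header* h = static_cast<Header*>(p) - 1;
    std::memset(p, kFreedPattern, h->size); // poison the released payload
    std::free(h);
}
The default operator new[]/operator delete[] forward to these two, so array allocations are covered as well.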
The behaviour of the code you gave is undefined. One particular case of undefined behaviour is working as expected, so it is nothing strange that the code works. The code can work now and it can break at any time, depending on the compiler version, compiler options, stack contents and the phase of the moon.
So the first and most important thing is to avoid dangling pointers (and all other kinds of undefined behaviour) everywhere.
As for clearing variables in the destructor, here are the best practices I have found:
Follow coding rules that save me from mistakenly accessing unallocated or destroyed objects. I cannot describe them in a few words, but the rules are pretty common (see here and elsewhere).
Have the code analyzed by humans (code review) or by static analyzers (like cppcheck, PVS-Studio or others) to catch cases similar to the one you described above.
Do not call delete manually; better to use scoped_ptr or a similar object-lifetime manager. When a manual delete is reasonable, I usually (usually) set the pointer to nullptr after deletion to keep myself from mistakes.
Use pointers as rarely as possible; prefer references.
When objects of my class are used outside my code and I suspect somebody could access them after deletion, I can put a signature field inside, set it to something like 0xDEAD in the destructor, and check it on entry to every public method (see the sketch below). Be careful here not to slow your code down to an unacceptable degree.
After all of this, setting i from your example to 0 or -1 is redundant. In my opinion it's not something you should focus your attention on.
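For reference, a sketch of the signature-field idea from the last point (the class and the values are made up; strictly speaking the check is still undefined behaviour on a destroyed object, it just tends to catch the mistake in practice):
#include <cassert>

class Tracked
{
    static const unsigned kAlive = 0xA11CE; // arbitrary "alive" marker
    static const unsigned kDead  = 0xDEAD;
    unsigned signature_;

public:
    Tracked() : signature_(kAlive) {}
    ~Tracked() { signature_ = kDead; } // mark the corpse

    void doWork()
    {
        // Entry check in every public method; assert compiles away in
        // release builds, so the speed cost stays in debug builds.
        assert(signature_ == kAlive && "method called on a destroyed object");
        // ... real work ...
    }
};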

Segmentation fault at the end of destructor

I don't know if this question is going to be clear, since I can't give too many details (I'm using a TPL and wrote a huge number of lines myself). But I'll give it a try.
I am experiencing a segmentation fault which I can't understand. There is a structure (which I didn't design but should be well tested) whose destructor looks like this
Data::~Data()
{
    if (A_ != 0) {
        delete A_;
        A_ = 0;
    }
    if (B_ != 0) {
        delete B_;
        B_ = 0;
    }
    if (C_ != 0) {
        delete C_;
        C_ = 0;
    }
} // HERE
What's bothering me is that, while debugging, I see that the segfault happens at the line marked with 'HERE'. The class Data has only A_, B_ and C_ as dynamically allocated attributes. I also tried to explicitly call the destructor on the other non-dynamic composite attributes, to see if something went wrong during their destruction, but again the segfault happens at the end of the destructor. What kind of errors can give a segfault at that point?
I hope the question is clear enough, I will add details if needed.
Edit: thanks for the replies. I know it's a sparse piece of code, but the whole library is of course too big (it comes from Trilinos, by the way, but I think the error is not their fault; it must be my mistake in handling their structures. I used short names to keep the problem more compact). Some remarks on points raised in the comment replies:
about the checks before the delete(s) and the raw pointers: as I said, it's not my choice. I guess it's double protection in case something goes wrong and A_, B_ or C_ has already been deleted by some other owner of the data structure. The choice of raw pointers over shared_ptr or other safe/smart pointers is probably due to the fact that this class is almost never used directly, but only through an object of class Map that holds a pointer to Data. This class Map is implemented in the same library, so they probably chose raw pointers since they knew what they were handling and how.
yes, the data structure is shared by all the copies of the same object. In particular, there is a Map class that contains a pointer to a Data object. All the Map's that are copies of one another share the same Data. A reference counter keeps track of how many Map's are holding a pointer to the data. The last Map to be destroyed deletes the data.
the reference counter of the Data structure works correctly, I checked it.
I am not calling the destructor of this class. It is called automatically by the destructor of an object of class Map that has a pointer to Data as attribute.
Data inherits from BaseData, whose (virtual) destructor doesn't do anything, since it's just an interface defining class.
It's hard to post code that reproduces the problem, for many reasons. The error appears only with more than 2 processes (it's an MPI program), and my guess is that a process has some list that is empty and tries to access some element.
about the error details. I can give you here the last items in the backtrace of the error during debugging (I apologize for the bad format, but I don't know how to put it nicely):
0x00007ffff432fba5 in raise (sig=) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
0x00007ffff43336b0 in abort () at abort.c:92
0x00007ffff436965b in __libc_message (do_abort=, fmt=) at ../sysdeps/unix/sysv/linux/libc_fatal.c:189
0x00007ffff43736d6 in malloc_printerr (action=3, str=0x7ffff4447780 "free(): corrupted unsorted chunks", ptr=) at malloc.c:6283
0x00007ffff4379ea3 in __libc_free (mem=) at malloc.c:3738
0x0000000000c21f71 in Epetra_BlockMapData::~Epetra_BlockMapData ( this=0x1461690, __in_chrg=) at /home/bartgol/LifeV/trilinos/trilinos-10.6.4-src/packages/epetra/src/Epetra_BlockMapData.cpp:110
To conclude, let me restate my doubt: what kind of errors can appear AT THE END of the destructor, even if ALL the attributes have been deleted already? Thanks again!
One problem that can cause a segfault at a function exit is heap or stack corruption.
It is possible that some other part of your program is causing problems. Something like double-destruction, or buffer overrun can cause memory corruption.
Often, debug builds of programs will include a check at function exit to ensure that the stack is intact. If it's not, well, you see the results.
When the explicit body of the class destructor completes, it proceeds to perform some implicit actions: it calls base-class and member destructors (in case you have base classes and members with non-trivial destructors) and, if necessary, it calls the raw memory deallocation function operator delete (yes, in a typical implementation operator delete is actually called from inside the destructor). One of these two implicit processes apparently caused the crash in your case. There's no way to say precisely without more information.
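To make that ordering visible, here is a small self-contained sketch (the class-level operator new/operator delete exist only to log the raw deallocation step; all names are illustrative):
#include <cstdio>
#include <cstdlib>
#include <new>

struct Member { ~Member() { std::puts("2. member destructor"); } };
struct Base   { virtual ~Base() { std::puts("3. base destructor"); } };

struct Derived : Base {
    Member m_;
    ~Derived() { std::puts("1. end of ~Derived body"); }

    static void* operator new(std::size_t n) {
        void* p = std::malloc(n);
        if (!p) throw std::bad_alloc();
        return p;
    }
    // Invoked as the final implicit step of the deleting destructor,
    // after all the destructors above have already run.
    static void operator delete(void* p) {
        std::puts("4. operator delete (raw deallocation)");
        std::free(p);
    }
};

int main() {
    Base* p = new Derived;
    delete p; // prints 1, 2, 3, 4 in that order
}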
P.S. Stylistically the code is awful. Why are they checking for null before doing delete? (Applying delete to a null pointer is already a safe no-op.) And what is the point of nulling deleted pointers in the destructor?
It's hard to tell from the scarce code you show. It could easily be that you have already released resources that one of your class members or your base class uses in its own destructor.

Why is vector deleting destructor being called as a result of a scalar delete?

I have some code that is crashing in a large system.
However, the code essentially boils down to the following pseudo-code.
I've removed much of the detail, as I have tried to boil this down to the bare bones;
I don't think this misses anything crucial though.
// in a DLL:
#ifdef _DLL
#define DLLEXP __declspec(dllexport)
#else
#define DLLEXP __declspec(dllimport)
#endif

class DLLEXP MyClass // base class; virtual
{
public:
    MyClass() {};
    virtual ~MyClass() {};
    virtual void some_method() = 0; // pure virtual
    // no member data
};

class DLLEXP MyClassImp : public MyClass
{
public:
    MyClassImp( some_parameters )
    {
        // some assignments...
    }
    virtual ~MyClassImp() {};
private:
    // some member data...
};
and:
// in the EXE:
MyClassImp* myObj = new MyClassImp ( some_arguments ); // scalar new
// ... and literally next (as part of my cutting-down)...
delete myObj; // scalar delete
Note that matching scalar new and scalar delete are being used.
In a Debug build in Visual Studio (2008 Pro),
in Microsoft's <dbgheap.c>,
the following assertion fails:
_ASSERTE(_CrtIsValidHeapPointer(pUserData));
Near the top of the stack are the following items:
mydll_d.dll!operator delete()
mydll_d.dll!MyClassImp::`vector deleting destructor'()
I think this ought to be
mydll_d.dll!MyClassImp::`scalar deleting destructor'()
That is, the program is behaving as if I'd written
MyClassImp* myObj = new MyClassImp ( some_arguments );
delete[] myObj; // array delete
The address in pUserData is that of myObj itself (as opposed to a member).
The memory around that address looks like this:
... FD FD FD FD
(address here)
VV VV VV VV MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
FD FD FD FD AB AB AB AB AB AB AB AB EE FE EE FE
...
where the four VVs are presumably the address of the virtual function table,
the MM...MM is recognisable member data,
and the other bytes are various special markers put in place by the debugger
(e.g. the FD FDs are 'guard bytes' around the object's storage).
Shortly before the assertion failure I do see the VVs change,
and wonder if that is due to a switch to the base class's virtual function table.
I'm aware of the problem of the wrong level in the class hierarchy undergoing destruction.
That's not the problem here; my destructors are all virtual.
I note Microsoft's page
"BUG: Wrong Operator Delete Called for Exported Class"
http://support.microsoft.com/kb/122675
but that seems to be regarding the wrong executable (with the wrong heap) attempting to take responsibility for destruction of the data.
In my case, it's that the wrong 'flavour' of deleting destructor appears to be being applied:
i.e. vector rather than scalar.
I am in the process of trying to produce minimal cut-down code that still exhibits the problem.
However, any hints or tips to help with how to investigate this problem further would be much appreciated.
Perhaps the biggest clue here is the mydll_d.dll!operator delete() on the stack.
Should I expect this to be myexe_d.exe!operator delete(),
indicating that the DLLEXPs have been 'lost'?
I suppose this could be an instance of a double-delete (but I don't think so).
Is there a good reference I can read regarding what _CrtIsValidHeapPointer checks for?
Sounds like this could be an issue of allocating from one heap and trying to delete on another. This can be an issue when allocating objects from a DLL, as the DLL has its own heap. From the code you're showing it doesn't seem like this would be the problem, but maybe something was lost in the simplification? In the past I've seen code like this use factory functions and virtual destroy methods on the objects to make sure that the allocation and deletion both happen in the DLL code.
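A sketch of that factory/destroy pattern (Widget and CreateWidget are invented names, and DLLEXP stands for the usual dllexport/dllimport macro):
// --- public header, seen by both the DLL and the EXE ---
class DLLEXP Widget {
public:
    virtual void DoSomething() = 0;
    virtual void Destroy() = 0;  // replaces 'delete' on the client side
protected:
    virtual ~Widget() {}         // non-public: clients cannot delete directly
};

extern "C" DLLEXP Widget* CreateWidget();

// --- implementation file, compiled into the DLL ---
class WidgetImpl : public Widget {
public:
    virtual void DoSomething() { /* ... */ }
    virtual void Destroy() { delete this; } // delete runs on the DLL's heap
};

Widget* CreateWidget() { return new WidgetImpl; } // new runs on the DLL's heap

// --- in the EXE ---
// Widget* w = CreateWidget();
// w->DoSomething();
// w->Destroy(); // never 'delete w' from the EXE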
Microsoft provides the source for their C runtime; you can check there to see what _CrtIsValidHeapPointer does. On my installation, it's under C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\crt\src\dbgheap.c.
One other suggestion is to check the disassembly of
delete myObj; // scalar delete
and compare it to the disassembly generated for
delete[] myObj;
and
delete pointerToClassLikeMyClassThatIsInExeAndNotDll;
to test your theory about delete[] being called. Similarly, you could check the call stack for
delete pointerToClassLikeMyClassThatIsInExeAndNotDll;
to test your theory about mydll_d.dll!operator delete() versus myexe_d.exe!operator delete().
Thank you for all the answers and comments.
All have been useful and relevant.
Any further information is still welcome.
The following was a comment on my question from Hans Passant:
Once you start exporting classes from DLLs,
compiling with /MD becomes very important.
Looks like /MT to me.
As a result of that, I took a closer look at the linkage settings throughout the project.
I found a 'buried' instance of /MT and /MTd that should have been /MD and /MDd,
plus some related inconsistencies in other settings.
Having corrected those,
no assertion is now thrown, and the code appears to be behaving correctly.
Here are some of the things to check when experiencing crashes or assertion failures as execution leaves scopes and destructors are called.
Ensure that throughout all projects (including dependencies)
and in all configurations (especially in the problematic one):
(Here the *.vcproj paths are relative to </VisualStudioProject/Configurations/Configuration/>.)
The correct runtime is selected in
C/C++ | Code Generation | Runtime Library
<Tool[#Name="VCCLCompilerTool"]/#RuntimeLibrary>;
Appropriate definitions (if any) are made in
C/C++ | Preprocessor | Preprocessor Definitions
<Tool[#Name="VCCLCompilerTool"]/#PreprocessorDefinitions>
especially relating to the use of static versus dynamic libraries
(e.g. _STLP_USE_STATIC_LIB versus _STLP_USE_DYNAMIC_LIB for STLport);
Appropriate versions of libraries are selected in
Linker | Input | Additional Dependencies
<Tool[#Name="VCLinkerTool"]/#AdditionalDependencies>
especially relating to static runtime libraries versus 'wrappers' for DLLs
(e.g. stlport_static.lib versus stlport.N.M.lib).
Interestingly, the scalar 'flavour' of delete I'd expect still does not appear to be called (the breakpoint is never hit).
That is, I still only see the vector deleting destructor.
Therefore, that may have been a 'red herring'.
Perhaps this is just a Microsoft implementation issue,
or perhaps there's still some other subtlety I've missed.
This behavior is specific to MSVC 9, where the deleting destructor for an exported class with a virtual destructor is implicitly generated and mangled as the vector deleting destructor, with an associated flag where 1 means scalar and 3 means vector.
The real problem with this is that it breaks the canonical form of new/delete: the client coder cannot disable the vector delete operator in his own code if he thinks it is a bad idea to use it.
Moreover, the vector dtor also seems to be executed incorrectly if the new happens in a module other than the one the implementation resides in, and the object is then held in a static variable via a reference count, which executes a delete this (here the vector dtor comes into play) on process shutdown.
This matches the heap problem 'bshields' already mentioned: the dtor is executed on the wrong heap, and on shutdown the code crashes with either 'cannot read that memory location' or 'access violation'. Such problems seem to be very common.
The only way around this bug is to forgo the virtual destructor and do its work yourself, by enforcing the use of a delete_this function in the base class; stupid as it is, you imitate what the virtual dtor should do for you. Then the scalar dtor is executed, and on shutdown any ref-counted objects shared between modules can be cleaned up safely, because the heap is always addressed from the correct origin module.
To check whether you have such problems, simply forbid the use of the vector delete operator.
In my case, it's that the wrong 'flavour' of deleting destructor
appears to be being applied: i.e. vector rather than scalar.
That's not the problem here. As per the pseudocode in Mismatching scalar and vector new and delete, the scalar deleting destructor simply calls through to the vector deleting destructor with a flag saying "do scalar destruction rather than vector destruction".
Your actual problem, as noted by other posters, is that you're allocating on one heap and deleting on another. The clearest solution is to give your classes overloads of operator new and operator delete, as I described in an answer to a similar question: Error deleting std::vector in a DLL using the PIMPL idiom
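In outline, that overload approach would look something like the following sketch (reusing the question's DLLEXP macro; the point is that both functions are compiled into the DLL, so the same CRT heap handles both ends of the object's lifetime):
// in the header, seen by both modules
#include <cstddef>
#include <new>

class DLLEXP MyClass {
public:
    MyClass();
    virtual ~MyClass();
    static void* operator new(std::size_t size);
    static void operator delete(void* p);
};

// in a source file compiled into the DLL
#include <cstdlib>

void* MyClass::operator new(std::size_t size) {
    void* p = std::malloc(size);
    if (!p) throw std::bad_alloc();
    return p;
}

void MyClass::operator delete(void* p) {
    std::free(p); // the DLL CRT's free(), matching the malloc above
}
Since operator new/operator delete are inherited, MyClassImp picks them up as well, and a new expression in the EXE still calls into the DLL for the raw memory.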

Access violation exception when calling a method

I've got a strange problem here. Assume that I have a class with some virtual methods. Under certain circumstances an instance of this class should call one of those methods. Most of the time no problems occur at that stage, but sometimes it turns out that the virtual method cannot be called, because the pointer to that method is NULL (as shown in VS), so a memory access violation occurs. How could that happen?
The application is pretty large and complicated, so I don't really know which low-level steps lead to this situation. Posting raw code wouldn't be useful.
UPD: Ok, I see that my presentation of the problem is rather vague, so schematically the code looks like this:
void MyClass::FirstMethod() const { /* Do stuff */ }

void MyClass::SecondMethod() const
{
    // This is where the exception occurs;
    // during runtime in VS the description of this method looks like 0x000000
    FirstMethod();
}
No constructors or destructors involved.
Heap corruption is a likely candidate. The v-table pointer in the object is vulnerable; it is usually the first field in the object. A buffer overflow in some other object that happens to be adjacent to this one will wipe the v-table pointer. The call to a virtual method, often much later, will then blow up.
Another classic case is having a bad "this" pointer, usually NULL or a low value. That happens when the object reference on which you call the method is bad. The method will run as usual but blow up as soon as it tries to access a class member. Again, heap corruption or using a pointer that was deleted will cause this. Good luck debugging this; it is never easy.
Possibly you're calling the function (directly or indirectly) from a constructor of a base class which itself doesn't have that function.
Possibly there's a broken cast somewhere (such as a reinterpret_cast of a pointer when there's multiple inheritance involved) and you're looking at the vtable for the wrong class.
Possibly (but unlikely) you have somehow trashed the vtable.
Is the pointer to the function null just for this object, or for all other objects of the same type? If the former, then the vtable pointer is broken, and you're looking in the wrong place. If the latter, then the vtable itself is broken.
One scenario where this could happen is if you try to call a virtual method from a constructor or destructor. At that point the virtual table may not be fully set up yet, so a call that lands on a method that is pure virtual at that level crashes.
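A minimal sketch of that scenario (the indirection through a non-virtual helper is there because compilers typically reject or warn about a direct call to a pure virtual function from a constructor; the names are made up):
struct Base {
    Base() { setup(); }            // runs while the object is still a Base
    virtual ~Base() {}
    void setup() { frobnicate(); } // virtual dispatch uses Base's vtable here
    virtual void frobnicate() = 0;
};

struct Derived : Base {
    virtual void frobnicate() { /* never reached from Base's constructor */ }
};

int main() {
    Derived d; // aborts with a "pure virtual function call" error (R6025 on MSVC)
}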
Is it possible the "this" pointer is getting deleted during SecondMethod's processing?
Another possibility is that SecondMethod is actually being called with an invalid pointer right up front, and that it just happens to work (by undefined behavior) up to the nested function call, which then fails. If you're able to add print code, check whether this and/or the other pointers in use look like 0xcdcdcdcd or 0xfdfdfdfd at various points during execution of those methods. Those values are (I believe) used by VS on memory alloc/dealloc, which may be why it works when compiled in debug mode.
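If you want to automate that spot check, a rough sketch (these are the MSVC debug-heap fill conventions: 0xCD for freshly allocated uninitialized memory, 0xDD for freed memory, 0xFD for the guard bytes around an allocation; shown for 32-bit pointer values):
#include <cstddef>
#include <cstdio>

void CheckPointer(const void* p, const char* name)
{
    const std::size_t v = reinterpret_cast<std::size_t>(p);
    if (v == 0xCDCDCDCD)
        std::printf("%s looks uninitialized (0xCDCDCDCD)\n", name);
    else if (v == 0xDDDDDDDD)
        std::printf("%s looks freed (0xDDDDDDDD)\n", name);
    else if (v == 0xFDFDFDFD)
        std::printf("%s points into guard bytes (0xFDFDFDFD)\n", name);
}

// usage inside a suspect method:
// CheckPointer(this, "this");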
What you are most likely seeing is a side effect of the actual problem: heap or memory corruption, or referencing a previously freed object or a null pointer.
If you can consistently make it crash at the same place, and can figure out where the null pointer is being loaded from, then I suggest using the debugger to put a breakpoint on 'write' at that memory location. Once the breakpoint is triggered, you are most likely looking at the code that actually caused the corruption.
If the memory access violation happens only when Studio fails to show the method address, it could be caused by missing debug information. You are probably debugging code compiled with release (non-debug) compiler/linker flags.
Try enabling some debug info in the C++ properties of the project, rebuild and restart the debugger. If that helps, you will see all the normal traceable things like the stack, variables, etc.
If your this pointer is NULL, corruption is unlikely, unless you're zeroing memory you shouldn't be.
You didn't say whether you're debugging a Debug (non-optimized) or Release (optimized) build. Typically, in a Release build the optimizer will remove the this pointer if it is not needed. So if you're debugging an optimized build, seeing this as 0 doesn't mean anything; you have to rely on the disassembly to tell you what's going on. Try turning off optimization in your Release build if you cannot reproduce the problem in the Debug build. When debugging an optimized build, you're debugging assembly, not C++.
If you're already debugging a non-optimized build, make sure you have a clean rebuild before spending too much time debugging corrupted images. Debug builds are typically linked incrementally, and incremental linkers are known to produce problems like this. If you're running a Debug build from a clean rebuild and still can't figure out what went wrong, post the stack dump and more code. I'm sure we can help you figure it out.

Can objects be unwound before they are created on the stack?

We have been debugging a strange case for some days now, and have somewhat isolated the bug, but it still doesn't make any sense. Perhaps someone here can give me a clue about what is going on.
The problem is an access violation that occurs in a part of the code.
Basically we have something like this:
void aclass::somefunc() {
    try {
        erroneous_member_function(*someptr);
    }
    catch (AnException) {
    }
}

void aclass::erroneous_member_function(const SomeObject& ref) {
    // { // <-- add a scope here and the error goes away
    LargeObject obj = Singleton()->Object.someLargeObj; // <- remove this and the error goes away
    // DummyDestruct dummy1; // <-- this is not destroyed before the unreachable
    throw AnException();
    // } // <-- end the scope here and the error goes away

    UnreachableClass unreachable; // <- remove this, and the error goes away
    DummyDestruct dummy2; // <- destructor of this object is called!
}
While in the debugger it actually looks like it is destructing the UnreachableClass, and when I insert the DummyDestruct object, that object does not get destroyed before the strange destructor is called. So it does not seem like the destruction of the LargeObject is going awry.
All this is in the middle of production code, and it is very hard to isolate it to a small example.
My question is: does anyone have a clue about what is causing this, and what is happening? I have a fairly full-featured debugger available (Embarcadero RAD Studio), but right now I am not sure what to do with it.
Can anyone give me some advice on how to proceed?
Update:
I placed a DummyDestruct object beneath the throw clause and put a breakpoint in its destructor. The destructor for this object is entered (and its only use is in this piece of code).
With the information you have provided, and if everything is as you state, the only possible answer is a bug in the compiler/optimizer. Just add the extra scope with a comment (This is, again, if everything is exactly as you have stated).
Stuff like this sometimes happens due to writing through uninitialized pointers, out-of-bounds array access, etc. The point at which the error is caused may be quite removed from the place where it manifests. However, based on the symptoms you describe, it seems to be localized in this function. Could the copy constructor of LargeObject be misbehaving? Is ref being used? Perhaps someptr isn't pointing to a valid SomeObject. Is Singleton() returning a pointer to a valid object?
A compiler bug is also a possibility, especially with aggressive optimization turned on. I would try to recreate the bug with no optimizations.
Time to practice my telepathic debugging skills:
My best guess is that your application has a stack corruption bug. This can write junk over the call stack, which means the debugger is incorrectly reporting the function when you break, and it's not really in the destructor. Either that, or you are misinterpreting the debugger's information and the object really is being destructed correctly, but you don't know why!
If stack corruption is the case, you're going to have a really tough time working out the root cause. This is why it's important to implement tonnes of diagnostics (e.g. asserts) throughout your program, so you can catch the stack corruption when it happens rather than getting stuck on its weird side effects.
This might be a real long shot but I'm going to put it out there anyway...
You say you use Borland: what version? And you say you see the error in a string (STL?). Do you include winsock2 at all in your project?
The reason I ask is that I had a problem when using Borland 6 (2002) and winsock: the header seemed to mess up the structure packing, which meant different translation units had different ideas of the memory layout of std::string, depending on which headers each translation unit included, with predictably disastrous results.
Here's another wild guess, since you mentioned strings. I know of at least one implementation where (STL) string copying is done lazily (i.e., no actual copying of the string contents takes place until a change is made; the "copy" simply points the target string object at the same buffer as the source). In that particular implementation (GNU) there is reportedly a bug whereby excessive copying causes the reference counter (how many objects are using the same actual string memory after supposedly copying it) to roll over to 0, resulting in all sorts of mischief. I haven't encountered this bug myself, but was told about it by someone who has. (I say this because one would think the ref counter would be a 32-bit number, and the chances of it ever rolling over are pretty slim, to say the least, so I may not be describing the problem properly.)