Using STL containers without exception handling in low-memory situations - C++

I've been trying to deal with low-memory situations in my VC++ code.
I've used std::nothrow and checked the return value of the new operator for NULL. The application works fine.
But the problem is that at very low system memory it crashes abruptly, especially inside STL container calls (map, vector, queue, etc.), with the error "Exception bad_alloc". Obviously these containers cannot allocate the required memory, so they simply throw bad_alloc.
Now, since I've used these containers liberally in my code, I just don't want to wrap each and every function in a "try...catch" block; it would clutter the code. (Moreover, the code uses an event-based library, so many of the functions are callbacks. Hence it's not as if there were one or a few parent caller functions I could wrap in a try/catch block to solve this problem.)
Without using try/catch, how can this problem be addressed?
At the very least, can someone please tell me which of these containers and methods throw bad_alloc? (That way I can try putting only that particular code in a try/catch block.)

If you're not using dynamic_cast or any of the other features RTTI gives you, you can turn off RTTI - that might save you a bit of memory, but probably not enough.
The only other option I can offer is to profile your memory usage, and optimize your code so that you free things you no longer need earlier.

You ask: "how can this problem be addressed?"
Well, what is the problem?
"I don't have enough memory to run my program" — procure more
"My program uses too much memory" — use less
You can't magically work around it any other way.
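That said, if the goal is simply to avoid wrapping every call site in try/catch, the standard library does provide one central hook for allocation failure: std::set_new_handler. STL containers allocate through operator new, which invokes the installed handler before bad_alloc propagates. A minimal sketch, assuming a clean shutdown is an acceptable policy (the handler name and the policy itself are illustrative, not from the question):

#include <cstdio>
#include <cstdlib>
#include <new>

// Called by the runtime whenever operator new cannot satisfy a request.
// A new-handler must either free some memory and return (the allocation
// is then retried), throw bad_alloc, or terminate the program.
void onOutOfMemory()
{
    std::fputs("out of memory, shutting down cleanly\n", stderr);
    std::exit(EXIT_FAILURE);
}

int main()
{
    std::set_new_handler(onOutOfMemory);
    // ... rest of the program; no per-call-site try/catch required ...
}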

Related

How to manage mixed use of raw pointers and unique_ptr in different classes? (Exceptions?)

I have a container of objects stored with unique_ptr; for simplicity, say I have only one object:

class Container {
    std::unique_ptr<A> ptrA;
};

I also have classes that use the object. These classes take a raw pointer to the object when they are constructed:

class B {
    A* a;
    B(A* param) : a(param) {}
};

They are created with:

B b = B(container.ptrA.get());
The Container class is supposed to outlive the class B. However, I'd like my whole program not to crash in case there is an issue or a bug in my Container class and the unique_ptr goes out of scope and gets deleted.
My question is about the design you would choose to manage this 1% case, so that my program can try to reload the data and avoid crashing suddenly. Would you use exceptions? If so, where would you put the try/catch?
Thanks!
When you use std::unique_ptr you're making a design decision: Container owns the pointer. Trying to work around that fact is only going to make your life harder.
But in fact you said Container outlives B. Why don't you just enforce that instead of being overly defensive against bugs that would probably break your program in several other ways?
I would say don't use shared_ptr to hide bugs. If your unique_ptr is designed to outlive the raw pointer then I would want the program to crash if there is a bug. Then I have something to fix. It's much worse when the bugs go undetected because they are hidden from you. Remember, a crash gives you a point of failure to investigate. But if the bugs go undetected you may not be able to find what's making things go wrong.
If you'd like your program not to crash, then use std::shared_ptr for both pointers.
That would be the easiest solution.
Otherwise, you will need to put in some kind of mechanism by which the Container class tracks the number of instances of the B class that use the same pointer, then throw an exception in the destructor if the Container is getting destroyed while there is still an instance of B somewhere. If its unique_ptr is getting blown away for some other reason, other than the destructor getting invoked, the same check would apply there as well.
That's presuming that throwing an exception is what you would like to do to handle this edge case. It's not clear what you mean by "can try to reload the data", but as the designer and implementer of your application you need to decide how you are going to handle this situation. Nobody else can make the call for you; you know more about your overall application than anyone else. There is no universal, single answer here that will work best for every application in every situation.
But whatever you decide should be an appropriate course of action: throw an exception; or create a new instance of the object, stuff it into the unique_ptr, and then update the raw pointers in all the B instances that you're keeping track of, somehow. That would be your call to make; which approach is best is a subjective call, and there is no objective answer for that part.
Now, getting back to the technical aspects: keeping track of how many instances of the B class exist can be as simple as keeping a counter in the container, and having B's constructor and destructor update it accordingly. Or maybe have Container keep a container of pointers to all instances of B. In either case, don't forget to do the right thing in the copy constructor and the assignment operator.
But I think it's just easier to use a std::shared_ptr in both classes, and not worry about any of this. Even though doing this kind of class bookkeeping is not rocket science, why bother when you can simply have std::shared_ptr do it for you?
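A minimal sketch of that shared_ptr design, reusing the class names from the question (everything else is illustrative):

#include <memory>

class A { /* ... */ };

class Container {
public:
    std::shared_ptr<A> ptrA = std::make_shared<A>();
};

class B {
    std::shared_ptr<A> a; // co-owns A: it cannot be deleted out from under B
public:
    explicit B(std::shared_ptr<A> p) : a(std::move(p)) {}
};

int main()
{
    Container container;
    B b(container.ptrA); // even if container is destroyed first, b keeps A alive
}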
Philosophically: this is not a great idea, at least in C++.
The Container class is supposed to outlive the class B. However I'd like my whole program not to crash in the case there is an issue or a bug ...
It sounds like you want a "safer" language.
The idea that you can write code that "should" work but is robust against ownership/lifetime errors is...pretty much anathema to the goals of low-level languages like C++ with explicit lifetime management, I think.
If you really want to write a program that simply doesn't crash, use a language with a runtime that manages memory and lifetimes for you—that is, a garbage-collected language like Java or Python. These languages are designed to "protect you from yourself," so to speak. In theory, they prevent you from encountering the sorts of errors you're describing by managing memory for you.
But part of the point of using low-level languages is to take advantage of explicit memory management. With C++ you can (in theory) write software that, in comparison to software written in managed languages, runs faster, has a smaller memory footprint, and releases system resources (such as filehandles) sooner.
The right approach in C++ is the one you're already using.
Explicitly letting your container class own the underlying objects and representing this ownership using unique_ptr is exactly correct in modern C++, and there is no reason to expect this approach not to work if your system is carefully engineered.
The key question, though, is how can you guarantee that your container class will stay alive and keep your owned objects alive throughout the entire lifetime of the "user" objects (in this case class B instances)? Your question doesn't provide enough details about your architecture for us to answer this, because different designs will require different approaches. But if you can explain how your system (in theory) provides this guarantee, then you are probably on the right track.
If you still have concerns, there are some approaches for dealing with them.
There are many reasons to have valid concerns about lifetime management in C++; a major one is if you are inheriting a legacy codebase and you're not sure it manages lifetimes appropriately.
This can happen even with modern C++ features such as unique_ptr. I'm working on a project that only got started last year, and we've been using C++14 features, including <memory>, since the beginning, and I definitely consider it a "legacy" project:
- Multiple engineers who were on the project have now left; 60,000+ lines are "unowned" in the sense that their original author is no longer on the project
- There are very few unit tests
- There are occasional segfaults :D
Note that a bug in your lifetime management may not cause a crash; if it did, that would be fantastic, because, as Galik says in their answer, this would give you a point of failure to investigate. Unfortunately, there's no way to guarantee that dereferencing a stale pointer will cause a crash, because this is (obviously) undefined behavior. Thus your program could keep running and silently do something utterly disastrous.
Signal-catching
However, a crash—specifically, a segfault—is the most likely result of the error you describe, and a segfault is something you can (sort of) program around.
This is the weakest approach in terms of what kinds of fault-handling behavior you can implement: simply catch the SIGSEGV signal and try to recover from it. Signal-catching functions have some pretty severe limitations, and in general, if your lifetime management is screwed up, there's probably no way to make reasonable guarantees about what memory you can trust and what memory you can't, so your program might be doomed no matter what you do when you catch the signal.
This is not a good approach to "fixing" broken software; however, it is a very reasonable way to provide a clean exit path for unrecoverable errors (e.g. it will allow you to emulate the classic "memory error at " error messages). Additionally, if all you want to do is to restart your entire application and hope for the best, you can probably implement this using a signal-catcher, although a better approach may be to implement a second "watcher" application that restarts your software when it crashes.
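For illustration, a minimal POSIX sketch of such a clean exit path (only async-signal-safe functions such as write() and _exit() may be called inside the handler; the message and exit code are illustrative):

#include <csignal>
#include <unistd.h>

extern "C" void onSegfault(int)
{
    // No printf, no new, no C++ exceptions here: async-signal-safe calls only.
    const char msg[] = "fatal: invalid memory access, exiting\n";
    write(STDERR_FILENO, msg, sizeof msg - 1);
    _exit(1); // exit immediately; do not attempt to keep running
}

int main()
{
    std::signal(SIGSEGV, onSegfault);
    // ... application code ...
}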
std::shared_ptr
Joachim Pileborg is correct that a std::shared_ptr will work in this case, but (1) shared_ptr has some overhead compared to raw pointers (if you care about that) and (2) it requires changing your entire lifetime-management scheme.
Also, as pointed out by Galik in the comments, when there is a lifetime-management bug, the lifetime of the owned object will be extended; the object will still exist after the shared_ptr has been removed from the container if any shared_ptrs in your B class instances are still active.
std::weak_ptr
Your best bet might be a weak_ptr. This also requires changing your lifetime-management scheme to use shared_ptr, but it has the benefit of not keeping old objects around just because a shared_ptr to them exists somewhere outside of your lifetime-managing containers.
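A sketch of what that could look like, again reusing the class names from the question (the use() method and the recovery comment are illustrative):

#include <memory>

class A { /* ... */ };

class Container {
public:
    std::shared_ptr<A> ptrA = std::make_shared<A>();
};

class B {
    std::weak_ptr<A> a; // observes A without extending its lifetime
public:
    explicit B(std::weak_ptr<A> p) : a(std::move(p)) {}

    void use()
    {
        if (std::shared_ptr<A> alive = a.lock()) {
            // Container still owns A; *alive is safe to use within this scope
        } else {
            // Container released A: recover (e.g. reload the data) instead of crashing
        }
    }
};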
Not all low-level languages are this unforgiving, though.
I'm a bit biased because I love the philosophies behind the Rust language, so this is a bit of a plug. Rust enforces correct lifetime-management at compile-time. Rust is just as low-level as C++ in the sense that it gives full control over memory management, memory access, etc., but it's a "modern" high-level language in that it's closer to a redesigned version of C++ than it is to, say, C.
But the key point for our purposes is that the limitations Rust puts on you in terms of what it considers an "ownership" or lifetime-management error enable far better guarantees of program correctness than any possible static analysis of a C or C++ program ever could.

C++ memory allocation errors without use of new

I am having issues with my program throwing a large number of memory allocation exceptions and I am having a very hard time diagnosing the problem...I would post code, but my program is very large and I have proprietary information concerns, so I am hoping to get some help without posting the code. If you plan on responding with some form of SSCCE comment, just stop reading now and save both of us some time. This is a case where I cannot post succinct code - I will try to be as clear and concise as possible with my problem description and some specific questions.
Program Background - my program is basically a data cruncher. It takes a bunch of data tables as inputs, performs calculations on them, and spits out new data tables based on the calculation results. All of my data structures are user-defined classes (consisting of int, double, and string types, with vector containers for arrays). In all cases, I create instances of my classes without the use of new and delete.
Problem Description - my program compiles without warnings and runs fine on smaller datasets. However, once I increase the dataset (from a 20x80 array to 400x80), I start getting bad_alloc exceptions (once I've processed the first 35 entries or so). The large dataset runs fine in 17 of my 18 modules - I have isolated one function where the errors are occurring. The calculations needed for this function would result in about 30,000 rows of data being created, whereas other functions in my code generate 800,000+ rows without incident.
The only real unique attribute of this module is that I am using resize a lot (about 100 times per function call), and that the function uses recursive loops during the resize operation (the function is allocating square feet out of a building one tenant at a time, then updating the remaining feet to be allocated after each tenant's lease size and duration is simulated, until all square feet are allocated). Also, the error is happening at nearly the same place each time (but not the exact same location, because I have a random number generator throwing some variation into the outcomes). What really confounds me is that the first ~34 calls to this function work fine, and the ~35th call does not require more memory than the previous 34, yet I am getting these bad_alloc exceptions on the 35th call nonetheless...
I know it's difficult to help without code. Please just try to give me some direction. My specific questions are as follows:
If I am not using "new" and "delete", and all of my variables are being initialized INSIDE of local functions, is it possible to have memory leaks / allocation problems through repeated function calls? Is there anything I can or should do to manage memory when initializing variables inside of local functions, using "vector<Type> Instance;" to declare my variables?
Is there any chance I am running low on stack memory, if I am doing the whole program through the stack? Is it possible I need to load some of my big lookup tables (maps, etc.) on the heap and then just use the stack for my iterations where speed is important?
Is there a problem with using resize a lot related to memory? Could this be an instance where I should use "new" and "delete" (I've been warned in many instances not to use those unless there is a very strong, specific reason to do so)?
[Related to 3] Within the problem function, I am creating a class variable, then writing over that variable about 20 times (once for each "iteration" of my model). I don't need the data from the previous iteration when I do this...so I could ostensibly create a new instance of the variable for each iteration, but I don't understand how this would help necessarily (since clearly I am able to do all 20 iterations on one instance on the first ~34 data slices)
Any thoughts would be appreciated. I can try to post some code, but I already tried that once and everyone seemed to get distracted by the fact that it wasn't compilable. I can post the function in question but it doesn't compile by itself.
Here is the class that is causing the problem:
// Class definition
class SpaceBlockRentRoll
{
public:
    double RBA;
    string Tenant;
    int TenantNumber;
    double DefaultTenantPD;
    int StartMonth;
    int EndMonth;
    int RentPSF;
    vector<double> OccupancyVector;
    vector<double> RentVector;
};
// Class variable declaration (occurring inside a function)
vector<SpaceBlockRentRoll> RentRoll;
Also, here is a snippet from the function where the recursion occurs
for (int s = 1; s <= NumofPaths; ++s) {
    TenantCounter = 0;
    RemainingTenantSF = t1SF;
    if (RemainingTenantSF > 0) {
        while (RemainingTenantSF > 0) {
            TenantCounter = TenantCounter + 1;
            // Resize relevant RentRoll vectors
            ResizeRentRoll(TenantCounter, NumofPaths, NumofPeriods, RentRoll);
            // Assign values for current tenant
            RentRoll[TenantCounter] = AssignRentRollValues(MP, RR);
            // Update the square feet yet to be allocated
            RemainingTenantSF = RemainingTenantSF - RentRoll[TenantCounter].RBA;
        }
    }
}
bad_alloc comes from heap problems of some kind, and can be thrown by any code that indirectly allocates or frees heap memory, which includes all the standard library collections (std::vector, std::map, etc) as well as std::string.
If your programs do not use a lot of heap memory (so they're not running out of heap), bad_allocs are likely caused by heap corruption, which is generally caused by using dangling pointers into the heap.
You mention that your code does a lot of resize operations -- resize on most collections will invalidate all iterators on the collection, so if you reuse any iterator after a resize, that may cause heap corruption that manifests bad_alloc exceptions. If you use unchecked vector element accesses (std::vector::operator[]), and your indexes are out of range, that can cause heap corruption as well.
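For example (an illustrative snippet, not the asker's code), this pattern corrupts the heap even though each line looks plausible on its own:

#include <vector>

int main()
{
    std::vector<int> v(10, 0);
    std::vector<int>::iterator it = v.begin();

    v.resize(1000); // may reallocate: every iterator into v is now dangling

    *it = 42;       // undefined behavior: writes through a stale pointer and
                    // can corrupt the heap bookkeeping, surfacing later as bad_alloc
}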
The best way to track down heap corruption and memory errors in general is to use a heap debugger such as valgrind.
Classes like std::vector and std::string are allowed to throw bad_alloc or other exceptions. After all, they have to use some memory that comes from somewhere, and any computer only has so much memory to go around.
Standard 17.6.5.12/4:
Destructor operations defined in the C++ standard library shall not throw exceptions. Every destructor in the C++ standard library shall behave as if it had a non-throwing exception specification. Any other functions defined in the C++ standard library that do not have an exception-specification may throw implementation-defined exceptions unless otherwise specified. [Footnote 1] An implementation may strengthen this implicit exception-specification by adding an explicit one.
Footnote 1: In particular, they can report a failure to allocate storage by throwing an exception of type bad_alloc, or a class derived from bad_alloc (18.6.2.1). Library implementations should report errors by throwing exceptions of or derived from the standard exception classes (18.6.2.1, 18.8, 19.2).
If I am not using "new" and "delete", and all of my variables are being initialized INSIDE of local functions, is it possible to have memory leaks / allocation problems through repeated function calls?
Unclear. If all the variables you refer to are local, no. If you're using malloc(), calloc(), and free(), yes.
Is there any chance I am running low on stack memory, if I am doing the whole program through the stack?
Not if you get bad_alloc. If you got a 'stack overflow' error, yes.
Is it possible I need to load some of my big lookup tables (maps, etc.) on the heap and then just use the stack for my iterations where speed is important?
Well, it's hard to believe that you need a local copy of a lookup table in every stack frame of a recursive method.
Is there a problem with using resize a lot related to memory?
Of course. You can run out.
Could this be an instance where I should use "new" and "delete"
Impossible to say without knowing more about your data structures.
(I've been warned in many instances not to use those unless there is a very strong, specific reason to do so)?
By whom? Why?
Within the problem function, I am creating a class variable,
You are creating an instance of the class on the stack. I think. Please clarify.
then writing over that variable about 20 times (once for each "iteration" of my model).
With an assignment? Does the class have an assignment operator? Is it correct? Does the class itself use heap memory? Is it correctly allocated and deleted on construction, destruction, and assignment?
Since, as you said, you are using std::vector with the default allocator, the problem occurs when you use std::vector::resize(...) a lot, and it appears only after some iterations, my guess is that you are running into heap fragmentation.
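If that is the case, reserving the expected capacity up front avoids most of the incremental reallocations; a sketch, assuming the SpaceBlockRentRoll class from the question and its ~30,000-row estimate:

#include <vector>

class SpaceBlockRentRoll { /* ... fields as defined in the question ... */ };

void buildRentRoll()
{
    std::vector<SpaceBlockRentRoll> RentRoll;

    // One large allocation instead of ~100 resize()-driven reallocations
    // per call: less copying, less fragmentation, and fewer chances for
    // outstanding iterators to dangle.
    RentRoll.reserve(30000);

    // ... grow RentRoll with resize()/push_back() as before ...
}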

Creating a scoped custom memory pool/allocator?

Would it be possible in C++ to create a custom allocator that works simply like this:
{
    // Limit memory to 1024 KB
    ScopedMemoryPool memoryPool(1024 * 1024);

    // From here on, all heap allocations ('new', 'malloc', ...) take memory from the pool.
    // If the pool is depleted, these calls result in an exception being thrown.

    // Examples:
    std::vector<int> integers(10);
    int* a = new int[10];
}
I couldn't find something like this in the boost libraries, or anywhere else.
Is there a fundamental problem that makes this impossible?
You would need to create a custom allocator that you pass in as a template parameter to vector. This custom allocator would essentially wrap the access to your pool and do whatever size validations it wants.
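A minimal C++11 allocator along those lines might look like this (a sketch, not production code; the budget mechanism is illustrative, and the per-type static counter is a simplification of a real shared pool):

#include <cstddef>
#include <new>
#include <vector>

template <typename T>
struct BudgetAllocator
{
    using value_type = T;

    static std::size_t used;                       // bytes handed out so far
    static const std::size_t budget = 1024 * 1024; // 1024 KB cap (illustrative)

    BudgetAllocator() = default;
    template <typename U> BudgetAllocator(const BudgetAllocator<U>&) {}

    T* allocate(std::size_t n)
    {
        if (used + n * sizeof(T) > budget)
            throw std::bad_alloc(); // pool depleted
        used += n * sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }

    void deallocate(T* p, std::size_t n)
    {
        used -= n * sizeof(T);
        ::operator delete(p);
    }
};

template <typename T> std::size_t BudgetAllocator<T>::used = 0;

template <typename T, typename U>
bool operator==(const BudgetAllocator<T>&, const BudgetAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const BudgetAllocator<T>&, const BudgetAllocator<U>&) { return false; }

int main()
{
    std::vector<int, BudgetAllocator<int>> integers(10); // throws bad_alloc once over budget
}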
Yes, you can make such a construct; it's used in many games. But you'll basically need to implement your own containers that call the memory allocation methods of the pool you've created.
You could also experiment with writing a custom allocator for the STL containers, although it seems that that sort of work is generally advised against. (I've done it before and it was tedious, but I don't remember any specific problems.)
Mind you, writing your own memory allocator is not for the faint of heart. You could take a look at Doug Lea's malloc, which provides "memory spaces" that you could use in your scoping construct somehow.
I will answer a different question: look at the 'Efficient C++' book. One of the things it discusses is implementing this kind of thing, in that case for a web server.
For this particular thing you can either work at the C++ layer, by overriding new and supplying custom allocators to the STL, or you can work at the malloc level: start with a custom malloc and build from there (like dmalloc).
Is there a fundamental problem that makes this impossible?
Reasoning about program behavior would become fundamentally impossible. All sorts of weird issues will come up. Certain sections of the code may or may not execute, though this will seemingly have no effect on the following sections, which may work unhindered. Certain sections may always fail. Dealing with the standard library or any other third-party library will become extremely difficult. There may be fragmentation at run-time at some times and not at others.
If the intent is that all allocations within that scope occur with that allocator object, then it's essentially a thread-local variable.
So there will be multithreading issues if you use a static or global variable to implement it. Otherwise, it's not a bad workaround for the statelessness of allocators.
(Of course, you'll need to pass a second template argument, e.g. vector<int, UseScopedPool>.)

How to detect if an address will cause an access violation?

I'm creating a class for a Lua binding which holds a pointer that can be changed by the scripter. It will include a few functions such as :ReadString and :ReadBool, but I don't want the application to crash when the address the scripter supplied would cause an access violation.
Is there a good way to detect whether an address is outside of readable/writable memory? Thanks!
One function library that may be useful is the Win32 "Virtual" family of functions, for example VirtualQuery.
I'm not really looking for a foolproof design; I just want to rule out the obvious (null pointers, pointers far outside any possible memory location).
I understand how unsafe this library is, and I'm not looking for safety, just sanity.
There are ways, but they do not serve the purpose you intend. That is; yes, you can determine whether an address appears to be valid at the current moment in time. But; no, you cannot determine whether that address will be valid a few clock cycles from now. Another thread could change the virtual memory map and a formerly valid address would become invalid.
The only way to properly handle the possibility of accessing suspect pointers is to use whatever native exception handling is available on your platform. This may involve handling the SIGBUS signal, or it may involve using the proprietary __try and __except extensions.
The idiom to use is the one wherein you attempt the access, and explicitly handle the resulting exception, if any does happen to occur.
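On Windows with MSVC, that idiom might look like the following sketch (MSVC-specific structured exception handling; note that a function using __try cannot also contain C++ objects with destructors, and as discussed below, the result is only valid for that instant):

#include <windows.h>

bool probeRead(const char* p)
{
    __try {
        volatile char c = *p; // attempt the access
        (void)c;
        return true;          // readable right now...
    }
    __except (EXCEPTION_EXECUTE_HANDLER) {
        return false;         // ...but this says nothing about the next access
    }
}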
You also have the problem of ensuring that the pointers you are handed point into your memory rather than some other memory. For that, you could make your own data structure (a tree springs to mind) which stores the valid address ranges your "pointers" can reach. Otherwise, the script code can hand you arbitrary absolute addresses, and you will end up changing memory structures maintained by the operating system for the process environment.
The application you write about is highly suspect and you should probably start over with a less explosive design. But I thought I would tell you how to play with fire if you really want to.
Check out Raymond Chen's blog post, which goes more deeply into why this practice is bad. Most interestingly, he points out that, once a page is tested by IsBadReadPtr, further accesses to that page will not raise exceptions!
There is no good way, and that's why you should never do things like this.
Perhaps try using segvcatch, which can convert segfaults into C++ exceptions.

What's the right approach to error handling in C++?

One is to use C++ exceptions: try/catch blocks. But freeing dynamic memory will be an issue when an exception is raised.
Second is to use C style: errno variable
Third is just to return -1 on error and 0 on success :)
Which way should be chosen for a mid-size project, and why? Is there any other, better approach?
But freeing dynamic memory will be an issue when an exception is raised.
No it's not. std::vector<int> v(100); Done.
The concept here is called Scope-Bound Resource Management (SBRM), also known by the much more common (and awkward) name Resource Acquisition Is Initialization (RAII). Basically, every resource is contained in some object which cleans up the resource in its destructor (which is always guaranteed to run for an automatically allocated object). So whether the function exits normally or via an exception, the destructor is run and your resource is cleaned up.
Never do an allocation where you need to free it explicitly; use containers and smart pointers.
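For instance, a short illustrative comparison of what that buys you:

#include <memory>
#include <stdexcept>
#include <vector>

void mayThrow() { throw std::runtime_error("something failed"); }

void work()
{
    std::vector<int> v(100);                  // heap-backed, frees itself
    std::unique_ptr<int[]> buf(new int[100]); // ditto

    mayThrow(); // whether this throws or returns, v and buf are released
                // by their destructors on scope exit; no cleanup code needed
}

int main()
{
    try { work(); } catch (const std::exception&) { /* handled once, here */ }
}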
Second is to use C style: errno variable
Third is just to return -1 on error and 0 on success :)
And how do they help solve your problem of freeing dynamic memory? They also use an early-exit strategy, the same as throw.
So in summary, they don’t have an advantage over C++ exceptions (according to you).
In the first place, you should strive for a program with minimum error cases. (Because errors are not cool.)
Exceptions are a nice tool but should be used conservatively: reserve them for "exceptional cases", do not use them to control the flow of your program.
For example, do not use exceptions to test whether a user input is correct or not. (For such a case, return an error code.)
One is to use C++ exceptions: try catch blocks. But freeing dynamic memory will be an issue when an exception is raised.
#see RAII.
Exceptions should be your preferred method of dealing with exceptional runtime situations like running out of memory. Note that something like std::map::find doesn't throw (and it shouldn't) because it's not necessarily an error or particularly exceptional case to search for a key that doesn't exist: the function can inform the client whether or not the key exists. It's not like a violation of a pre-condition or post-condition like requiring a file to exist for a program to operate correctly and finding that the file isn't there.
The beauty of exception-handling, if you do it correctly (again, #see RAII), is that it avoids the need to litter error-handling code throughout your system.
Let's consider a case where function A calls function B which calls C then D and so on, all the way up to 'Z'. Z is the only function that can throw, and A is the only one interested in recovering from an error (A is the entry point for a high-level operation, e.g., like loading an image). If you stick to RAII which will be helpful for more than exception-handling, then you only need to put a line of code in Z to throw an exception and a little try/catch block in A to catch the exception and, say, display an error message to the user.
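Sketched in code, with the placeholder function names from the paragraph above (the intermediate functions are collapsed into one):

#include <iostream>
#include <stdexcept>

void Z()
{
    // Deep inside the call chain: just report the failure.
    throw std::runtime_error("could not load image");
}

void B() { Z(); } // B..Y contain no error-handling code at all,
                  // provided they manage their resources via RAII

void A()
{
    try {
        B();
    } catch (const std::exception& e) {
        std::cerr << "operation failed: " << e.what() << '\n';
    }
}

int main() { A(); }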
Unfortunately, a lot of people don't adhere to RAII as strictly as they should in practice, so a lot of real-world code has more try/catch blocks than should be necessary, dealing with manual resource cleanup (which shouldn't have to be manual). Nevertheless, this is the ideal you should strive for in your code, and it's quite practical for a mid-sized project. Likewise, in real-world scenarios, people often ignore error codes returned by functions. If you're going to go the extra mile in favor of robustness, you might as well start with RAII, because that will help your application regardless of whether you use exception handling or error-code handling.
There is a caveat: you should not throw exceptions across module boundaries. If you do, you should consider a hybrid between error codes (as in returning error codes, not using a global error status like errno) and exceptions.
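A common shape for that hybrid is to translate at the boundary (a sketch; the exported function and the error codes are illustrative):

#include <exception>
#include <new>

// C-compatible boundary: exceptions must not cross it,
// so convert them to error codes here.
extern "C" int do_work()
{
    try {
        // ... internal C++ code may throw freely ...
        return 0;  // success
    } catch (const std::bad_alloc&) {
        return -2; // out of memory
    } catch (...) {
        return -1; // generic failure
    }
}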
It is worth noting that if you use operator new in your code without specifying nothrow everywhere, e.g.:
int* p = new int(123); // can throw std::bad_alloc
int* p = new(std::nothrow) int(123); // returns a null pointer on failure
... then you already need to catch and handle bad_alloc exceptions in your code for it to be robust against out of memory exceptions.
Have a look at Herb Sutter's comments on try/catch in his C++ GotW articles, and do go through his whole set of articles. He has a lot to say about when and how to check for error conditions, how to guard against them, and how to handle them in the best way possible.
Throw an exception. Destructors of variables are always called when an exception is thrown, and if your stack-based variables don't clean up after themselves (if for example you used a raw pointer when you need to delete the result), then you get what you deserve. Use smart pointers, no memory leaks.
But freeing dynamic memory will be an issue when an exception is raised.
Freeing memory (or any other resource, for that matter) doesn't suddenly become a non-issue because you don't use exceptions. The techniques that make these problems easy to deal with when exceptions can be thrown also make them easier to deal with when errors are reported through return codes.
Exceptions are good for passing control from one context to another.
You let the compiler do the work of unwinding the stack between the contexts; then, in the new context, you compensate for the exception (and then hopefully continue).
If your error happens and can be corrected in the same context, then error codes are a good method of error handling and cleanup (don't take this to mean you should not be using RAII; you still need that). For example, if an error occurs in a member function and the calling function can correct for that type of error (then it is probably not an exceptional circumstance, so no exceptions), error codes are useful.
You should not use error codes when you have to pass information out of a library or subsystem, because you are then relying on the developer using the code to actually check and handle the error code to make sure it works correctly, and more often than not they will ignore error codes.