Is it possible to prevent or detect the following bug, in any way (static analysis), where the stack allocated object is not captured and goes out of scope on the same line that it was constructed?
Resource resourceA, resourceB;
void someFunction()
{
    ScopedResourceBinder resourceBindA( resourceA );
    ScopedResourceBinder( resourceB ); // <--- BUG
}
The first ScopedResourceBinder is correct, but the second one doesn't do anything, since it immediately 'unbinds' right after it 'binds' (hypothetically speaking).
This is clearly a programmer's error, but I have debugged this a few times now (for hours on a couple of occasions) and it is extremely difficult to spot. Once you see it you think "ah that was a stupid one", but in practice, it is easy to make the mistake and the compiler is defenseless... or is it?
Background info: I work with a library that makes heavy use of RAII classes for pushing and popping state, for example OpenGL resources. Managing the bindings with scope is a big improvement over manually calling bind() / unbind() functions, but the potential bug listed here comes up with this new pattern.
ScopedResourceBinder( resourceB ); is the same as ScopedResourceBinder resourceB;, i.e. it declares a named variable with name resourceB, calling the default constructor of ScopedResourceBinder.
This persists until the end of the function, however it is still a bug because it does not carry out your intent of binding to the variable resourceB.
To prevent this particular case you could make sure that ScopedResourceBinder does not have a default constructor.
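A minimal sketch of that suggestion. Resource and ScopedResourceBinder here are hypothetical stand-ins (the real classes are not shown in the question), and the bindCount member exists only to make the effect observable. With no default constructor available, the line ScopedResourceBinder( resourceB ); can no longer declare a default-constructed variable, so the compiler rejects it:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the real resource type.
struct Resource {
    int bindCount = 0;  // greater than zero while bound (illustration only)
};

class ScopedResourceBinder {
public:
    // No default constructor: "ScopedResourceBinder( r );" fails to compile.
    ScopedResourceBinder() = delete;
    explicit ScopedResourceBinder(Resource& r) : res_(r) { ++res_.bindCount; }
    ~ScopedResourceBinder() { --res_.bindCount; }
    ScopedResourceBinder(const ScopedResourceBinder&) = delete;
    ScopedResourceBinder& operator=(const ScopedResourceBinder&) = delete;
private:
    Resource& res_;
};

static_assert(!std::is_default_constructible<ScopedResourceBinder>::value,
              "the accidental declaration form must not compile");

// Returns the bind count observed while a named binder is alive.
int bindCountInsideScope(Resource& r) {
    ScopedResourceBinder bind(r);  // named, so it lives until the function returns
    return r.bindCount;
}
```

This only catches the parenthesized form from the question. A brace form such as ScopedResourceBinder{ resourceB }; is an expression that creates a temporary, which still binds and unbinds immediately.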
As a general coding practice, we should express our intent explicitly in the class definition itself.
For example:
If we want a singleton class, then all of its constructors should be made private.
If we want an object to be constructed only from certain variables, then only the required constructor should be exposed publicly; all the others should be declared private.
This way, wrong usage is flagged at compile time.
I have a hand-rolled and simplistic log system, but the way it works means it has to track some global state. This is done through a Meyers singleton that gets initialized on first use. However, this seems to have a drawback: it is possible to ask for something to be logged after the singleton has been destroyed (unless the order is known, which can be difficult to assert in a larger program), leading to UB (most likely a crash on shutdown).
The low-level log function looks something like this:
void logImpl(const char* log, const std::string& message, Severity::Type level) {
    static LogSys& logSys = LogSys::instance();
    ...
}
I could of course force the problem onto the 'user' of the library, but that doesn't really solve the issue (it is still manual handling). Will making it an inline static in the .h file solve anything? (I guess not.) The destructor of the singleton does run, but is it meaningful to write to anything to indicate that it was destroyed? Another Meyers singleton? What happens if you initialize a Meyers singleton during static destruction?
Single responsibility principle: Create a function whose sole purpose is to create the singleton on first use, and return it.
The only case where a static object could be destroyed before logging is if the log message is sent from the destructor of a(nother) static object. If a destructor (or any other member function) may want to log, then call the singleton getter in the constructor. This enforces the correct order of destruction.
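A sketch of that convention. The LogSys interface below is hypothetical, since the question only shows LogSys::instance(). Objects with static storage duration are destroyed in the reverse order in which their construction completed, so touching the singleton inside the constructor means it outlives any static instance of the class:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical log system; only instance() appears in the original question.
class LogSys {
public:
    // Sole purpose: create the singleton on first use and return it.
    static LogSys& instance() {
        static LogSys sys;
        return sys;
    }
    void log(const std::string& msg) { messages_.push_back(msg); }
    std::size_t count() const { return messages_.size(); }
private:
    LogSys() = default;
    std::vector<std::string> messages_;
};

// A class whose destructor logs. Calling the getter in the constructor
// guarantees LogSys finishes construction first, so for a static Worker
// the LogSys singleton is destroyed afterwards.
class Worker {
public:
    Worker() : log_(LogSys::instance()) {}
    ~Worker() { log_.log("Worker destroyed"); }
private:
    LogSys& log_;
};
```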
There is still a bit of a problem if the destructor calls a function and logging happens indirectly and unexpectedly. The above convention solves the problem when calling any member function, but the problem remains if there is a call to a free function, which cannot enforce the order. This could be avoided if you restrict yourself from ever using the singleton in a free function. If you cannot bring yourself to accept such a constraint (understandable), then you could simply add a call to the singleton getter in the constructor of every static object to enforce the order, just in case.
None of these constraints can be enforced within the language, however, which is unfortunate. This is why dependencies between static objects are to be avoided like the plague - and static objects in general to some lesser degree.
I have "inherited" a design where we are using a few global objects for doing stuff when the application exits (updating application status log files, etc ... not important to the question).
Basically the application creates dummy helper objects of specific classes and lets their destructor do these extra works when the application exits either normally or when an error was encountered (and the application knows what to do in all the cases, again not relevant to the question).
But now I have encountered a situation where I do not want to call these destructors; I just want to leave the application without executing these "termination jobs". How can I do this in a decent, platform independent way? I do not want a solution such as dividing by zero :)
Edit: I know the design is broken :) We are working on fixing it.
Edit2: I would like to avoid any "trace" of abnormal exit... Sorry for late specification.
Edit3: Obtaining access to the source code for the destructors in order to modify them is very difficult. This happens when politicians take over the keyboard and try to write programs. We just know, that "their" destructor will run on exit...
How can I do this in a decent, platform independent way?
You can't. At least not in a decent way.
You can accomplish this by throwing an exception and not catching it. std::terminate is then called, which by default calls abort(), so your application will terminate quite ungracefully. The destructors of the global objects will not be called. This is quite ugly and very hackish. If your design relies on this behavior to function correctly, well, not only is your design completely demented, but it will be near-impossible to maintain.
I would prefer setting a boolean flag in the objects you don't want to run the destructors for. If this flag is set, then the destruction code would not be run. The destructors will still fire, but the actual code you want to avoid running can be skipped.
If you control the construction of the global, you might be able to leverage placement new. Create a global char buffer that is big enough and suitably aligned for your global, then placement-new your global there. Since objects constructed this way must be destroyed by explicitly calling the destructor, simply don't call the destructor on shutdown for the global.
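A minimal sketch of the placement-new idea, assuming you control where the object is created. ExitJob is a hypothetical stand-in for one of the helper classes, and its destructorRuns counter exists only to show that the destructor never fires:

```cpp
#include <cassert>
#include <new>

// Hypothetical helper whose destructor performs the unwanted termination job.
struct ExitJob {
    static int destructorRuns;  // instrumentation for this example only
    int value = 42;
    ~ExitJob() { ++destructorRuns; }
};
int ExitJob::destructorRuns = 0;

// Raw storage with static lifetime, sized and aligned for an ExitJob.
alignas(ExitJob) unsigned char exitJobStorage[sizeof(ExitJob)];

// Constructs the object in the buffer on first call. Nothing ever calls
// ~ExitJob() on it, so the termination job does not run at exit.
ExitJob& exitJob() {
    static ExitJob* job = new (exitJobStorage) ExitJob;
    return *job;
}
```

If the termination job should run after all in some exit paths, that path would have to call exitJob().~ExitJob() explicitly.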
abort();
Aborts the current process, producing an abnormal program termination.
The function raises the SIGABRT signal (as if raise(SIGABRT) was
called). This, if uncaught, causes the program to terminate returning
a platform-dependent unsuccessful termination error code to the host
environment.
The program is terminated without destroying any object and without
calling any of the functions passed to atexit or at_quick_exit.
Taking your comment of "we know it's broken"...
Put a global bool somewhere
namespace global { bool isExiting = false; }
Set this to true when exiting.
In your destructors do
if( !global::isExiting )
{
    // destruction code here
}
Hide somewhere, and be thoroughly ashamed.
The easiest is to replace the global objects by heap-allocated pointers to the objects. That way, in order to run their destructors, you have to manually delete them. Be aware that this is of course very, very nasty. A better way than using a raw pointer is to use a std::unique_ptr and, in case the destructor shouldn’t be called, release it.
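A sketch of the std::unique_ptr variant. StatusWriter and skipTerminationJob are hypothetical names, and the counter is only there to make the behavior visible:

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper whose destructor performs the termination job.
struct StatusWriter {
    static int destructorRuns;  // instrumentation for this example only
    ~StatusWriter() { ++destructorRuns; }
};
int StatusWriter::destructorRuns = 0;

// A global owner instead of a global object. If nothing else happens,
// the unique_ptr deletes the object at exit and the job runs as before.
std::unique_ptr<StatusWriter> g_statusWriter(new StatusWriter);

// Call this before exiting when the termination job must be skipped.
// release() gives up ownership without deleting, so ~StatusWriter never runs.
void skipTerminationJob() {
    (void)g_statusWriter.release();  // intentional leak
}
```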
If you don’t want to change the client code which is interacting with the global objects (and expect a non-pointer type), just wrap the actual pointer into a global proxy object.
(PS: There are few but very legitimate reasons for such a design. Sometimes it’s entirely safe not to call destructors, and calling them may be detrimental, e.g. because it’s known that they only release memory, but will take very long due to many nested destructor calls. This still needs to be done very carefully but is not at all “bad design”.)
It is ugly but it works:
Create a simple maker class.
The maker's constructor should allocate the object of your liking and assign it to a static pointer.
Make sure that nothing is done in the maker's destructor.
Create a single static instance of the maker object.
To make things look better, make the static pointer private and write an inline accessor function that converts the static pointer into a public reference.
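The steps above might look roughly like this. Service and ServiceMaker are hypothetical names, and the counter is instrumentation for the example:

```cpp
#include <cassert>

// Hypothetical object that must never be destroyed.
struct Service {
    static int destructorRuns;  // instrumentation for this example only
    int id = 7;
    ~Service() { ++destructorRuns; }
};
int Service::destructorRuns = 0;

class ServiceMaker {
public:
    ServiceMaker() { instance_ = new Service; }  // allocate and store in the static pointer
    ~ServiceMaker() {}                           // deliberately does nothing
    // Inline accessor that exposes the private pointer as a reference.
    static Service& get() {
        assert(instance_ != nullptr);
        return *instance_;
    }
private:
    static Service* instance_;
};
Service* ServiceMaker::instance_ = nullptr;

static ServiceMaker g_serviceMaker;  // the single static maker instance
```

One caveat of this sketch: code in another translation unit that calls ServiceMaker::get() during its own static initialization may run before g_serviceMaker has been constructed.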
This may be a subjective question, but I'm more or less asking it and hoping that people share their experiences. (As that is the biggest thing which I lack in C++)
Anyways, suppose I have -for some obscure reason- an initialize function that initializes a datastructure from the heap:
void initialize() {
    initialized = true;
    pointer = new T;
}
Now, if I call the initialize function twice, a memory leak would happen (right?). I can prevent this in multiple ways:
ignore the call (just check whether I am initialized, and if I am, don't do anything)
throw an error
automatically "clean up" and then reinitialize the thing.
Now what is generally the "best" method, the one that helps keep my code manageable in the future?
EDIT: thank you for the answers so far. However, I'd like to know how people handle this in a more generic way. - How do people handle "simple" errors which can be ignored (like calling the same function twice when it only makes sense to call it once)?
You're the only one who can truly answer the question: do you consider that the initialize function could legitimately be called twice, or would this mean that your program followed an unexpected execution flow?
If the initialize function can be called multiple times: just ignore the call by testing whether the allocation has already taken place.
If the initialize function has no decent reason to be called several times: I believe that would be a good candidate for an exception.
Just to be clear, I don't believe cleaning up and regenerating is a viable option (or you should seriously consider renaming the function to reflect this behavior).
This pattern is not unusual for on-demand or lazy initialization of costly data structures that might not always be needed. Singleton is one example, or for a class data member that meets those criteria.
What I would do is just skip the init code if the struct is already in place.
void initialize() {
    if (!initialized)
    {
        initialized = true;
        pointer = new T;
    }
}
If your program has multiple threads you would have to include locking to make this thread-safe.
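One way the locking could look, as a sketch using std::mutex. The globals mirror the names in the snippet above; T is a placeholder type and the allocations counter is an assumption added only to observe the behavior:

```cpp
#include <cassert>
#include <mutex>

struct T { int value = 0; };  // placeholder for the question's T

static std::mutex initMutex;
static bool initialized = false;
static T* pointer = nullptr;
static int allocations = 0;   // instrumentation for this example only

// Safe to call from several threads: the check and the allocation happen
// under the same lock, so only the first caller allocates.
void initialize() {
    std::lock_guard<std::mutex> lock(initMutex);
    if (!initialized) {
        pointer = new T;
        ++allocations;
        initialized = true;
    }
}
```

std::call_once with a std::once_flag is another standard option for this kind of one-time setup.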
I'd look at using boost or STL smart pointers.
I think the answer depends entirely on T (and other members of this class). If they are lightweight and there is no side-effect of re-creating a new one, then by all means cleanup and re-create (but use smart pointers). If on the other hand they are heavy (say a network connection or something like that), you should simply bypass if the boolean is set...
You should also investigate boost::optional, this way you don't need an overall flag, and for each object that should exist, you can check to see if instantiated and then instantiate as necessary... (say in the first pass, some construct okay, but some fail..)
The idea of setting a data member later than the constructor is quite common, so don't worry you're definitely not the first one with this issue.
There are two typical use cases:
On demand / Lazy instantiation: if you're not sure it will be used and it's costly to create, then better NOT to initialize it in the constructor
Caching data: to cache the result of a potentially expensive operation so that subsequent calls need not compute it once again
You are in the "Lazy" category, in which case the simplest way is to use a flag or a nullable value:
flag + value combination: reuses the existing class without heap allocation, but requires default construction
smart pointer: this bypasses the default construction issue, at the cost of a heap allocation. Check the copy semantics you need...
boost::optional<T>: similar to a pointer, but with deep copy semantics and no heap allocation. It requires the type to be fully defined though, so it is heavier on dependencies.
I would strongly recommend the boost::optional<T> idiom, or if you wish to provide dependency insulation you might fall back to a smart pointer like std::unique_ptr<T> (or boost::scoped_ptr<T> if you do not have access to a C++0x compiler).
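A sketch of the optional idiom. It uses std::optional (C++17) rather than boost::optional, on the assumption that a C++17 compiler is available; the usage pattern is the same. Device, DeviceInfo and the probe counter are hypothetical:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical value that is costly to compute and has no default constructor.
struct DeviceInfo {
    explicit DeviceInfo(std::string n) : name(std::move(n)) {}
    std::string name;
};

class Device {
public:
    // Computes the info on first use; later calls return the stored value.
    const DeviceInfo& info() {
        if (!info_) {
            info_.emplace("probe-result");  // stands in for the expensive query
            ++probes_;
        }
        return *info_;
    }
    int probes() const { return probes_; }
private:
    std::optional<DeviceInfo> info_;  // empty until first use, no separate flag
    int probes_ = 0;                  // instrumentation for this example only
};
```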
I think that this could be a scenario where the Singleton pattern could be applied.
I'm struggling with some advice I have in the back of my mind but for which I can't remember the reasoning.
I seem to remember at some point reading some advice (can't remember the source) that C++ constructors should not do real work. Rather, they should initialize variables only. The advice went on to explain that real work should be done in some sort of init() method, to be called separately after the instance was created.
The situation is I have a class that represents a hardware device. It makes logical sense to me for the constructor to call the routines that query the device in order to build up the instance variables that describe the device. In other words, once new instantiates the object, the developer receives an object which is ready to be used, no separate call to object->init() required.
Is there a good reason why constructors shouldn't do real work? Obviously it could slow allocation time, but that wouldn't be any different if calling a separate method immediately after allocation.
Just trying to figure out what gotchas I am not currently considering that might have led to such advice.
I remember that Scott Meyers in More Effective C++ recommends against having a superfluous default constructor. In that item, he also touched on using methods like Init() to 'create' the objects. Basically, you have introduced an extra step which places the responsibility on the client of the class. Also, if you want to create an array of said objects, Init() would have to be called manually on each of them. You can have an Init function which the constructor calls internally to keep the code tidy, or which the object calls if you implement a Reset(), but from experience it is better to delete an object and recreate it rather than try to reset its values to their defaults, unless the object is created and destroyed many times in real time (say, particle effects).
Also, note that constructors can use member initializer lists, which ordinary functions cannot.
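To illustrate the last point with a small sketch (Sensor and Bus are hypothetical): const and reference members can only be set in the constructor's initializer list, so an Init() method called afterwards could not set them up.

```cpp
#include <cassert>

struct Bus { int address = 3; };  // hypothetical dependency

class Sensor {
public:
    // id_ and bus_ must be initialized here; assigning them in the
    // constructor body or in a later Init() would not compile.
    Sensor(int id, Bus& bus) : id_(id), bus_(bus) {}
    int id() const { return id_; }
    Bus& bus() const { return bus_; }
private:
    const int id_;  // const member: initialize once, never assign
    Bus& bus_;      // reference member: must be bound at construction
};
```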
One reason why one may caution against using constructors to do heavy allocation of resources is that it can be hard to catch exceptions in constructors. However, there are ways around it. Otherwise, I think constructors are meant to do what they are supposed to do: prepare an object for its initial state of execution (and resource allocation is an important part of object creation).
The one reason not to do "work" in the constructor is that if an exception is thrown from there, the class destructor won't get called. But if you use RAII principles and don't rely on your destructor to do clean up work, then I feel it's better not to introduce a method which isn't required.
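A small sketch of why RAII members make this safe. Device and Handle are hypothetical, and the counters are instrumentation. When the constructor throws, ~Device() does not run, but members that were already fully constructed are destroyed automatically:

```cpp
#include <cassert>
#include <stdexcept>

static int handlesOpen = 0;     // instrumentation for this example only
static int deviceDtorRuns = 0;  // instrumentation for this example only

// RAII member: acquires in its constructor, releases in its destructor.
struct Handle {
    Handle() { ++handlesOpen; }
    ~Handle() { --handlesOpen; }
};

class Device {
public:
    explicit Device(bool fail) {
        if (fail) throw std::runtime_error("device not responding");
    }
    ~Device() { ++deviceDtorRuns; }  // skipped if the constructor throws
private:
    Handle handle_;  // constructed before the constructor body runs
};

// Returns true if constructing a failing Device threw as expected.
bool constructionFails() {
    try {
        Device d(true);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```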
Depends on what you mean by real work. The constructor should put the object into a usable state, even if that state is a flag meaning it hasn't yet been initialised :-)
The only rationale I've ever come across for not doing real work would be the fact that the only way a constructor can fail is with an exception (and the destructor won't be called in that case). There is no opportunity to return a nice error code.
The question you have to ask yourself is:
Is the object usable without calling the init method?
If the answer to that is "No", I would be doing all that work in the constructor. Otherwise you'll have to catch the situation when a user has instantiated but not yet initialised and return some sort of error.
Of course, if you can re-initialise the device, you should provide some sort of init method but, in that case, I would still call that method from the constructor if the condition above is met.
In addition to the other suggestions regarding exception handling, one thing to consider when connecting to a hardware device is how your class will handle the situation where a device is not present or communication fails.
In the situation where you can't communicate with the device, you may need to provide some methods on your class to perform later initialization anyway. In that case, it may make more sense to just instantiate the object and then run through an initialization call. If the initialization fails, you can just keep the object around and try to initialize communication again at a later time. Or you may need to handle the situation where communication is lost after initialization. In either case, you will probably want to think about how you will design the class to handle communication problems in general and that may help you decide what you want to do in the constructor versus an initialization method.
When I've implemented classes that communicate with external hardware, I've found it easier to instantiate a "disconnected" object and provide methods for connecting and setting up the initial status. This has generally provided more flexibility for connecting/disconnecting/reconnecting with the device.
The only real reason is Testability. If your constructors are full of "real work", that usually means the objects can only be instantiated within a fully initialized, running application. It's a sign the object/class needs further decomposition.
When using a constructor and an Init() method you have a source of error. In my experience you will encounter situations where someone forgets to call it, and you might have a subtle bug on your hands. I would say you shouldn't do much work in your constructor, but if any init method is needed, then you have a non-trivial construction scenario, and it is about time to look at the creational patterns. A builder function or a factory would be wise to have a look at. A private constructor makes sure that no one except your factory or builder function actually builds the objects, so you can be sure that they are always constructed correctly.
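A sketch of such a factory, with a hypothetical Device class. The port check stands in for whatever can go wrong during setup; here failure is reported by returning a null pointer, which is one design choice among several:

```cpp
#include <cassert>
#include <memory>

class Device {
public:
    // The only way to obtain a Device. Callers never see an object that
    // was constructed but not initialized.
    static std::unique_ptr<Device> create(int port) {
        if (port <= 0) return nullptr;  // setup failed: no object at all
        std::unique_ptr<Device> d(new Device(port));
        d->init();
        return d;
    }
    bool ready() const { return ready_; }
    int port() const { return port_; }
private:
    explicit Device(int port) : port_(port) {}  // private: only create() can build
    void init() { ready_ = true; }              // the step nobody can forget now
    int port_;
    bool ready_ = false;
};
```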
If your design allows for mistakes in implementation, someone will make those mistakes. My friend Murphy told me that ;)
In my field we work with loads of similar hardware-related situations. Factories give us testability, security and better ways of failing construction.
It is worth considering lifetime issues and connecting/reconnecting, as Neal S. points out.
If you fail to connect to a device at the other end of a link then it is often the case that the 'device' at your end is usable and will be later if the other end gets its act together. Examples being network connections etc.
On the other hand if you try and access some local hardware device that does not exist and will never exist within the scope of your program (for example a graphics card that is not present) then I think this is a case where you want to know this in the constructor, so that the constructor can throw and the object can not exist. If you don't then you may end up with an object that is invalid and will always be so. Throwing in the constructor means that the object will not exist, and thus functions can't be called on that object. Obviously you need to be aware of cleanup issues if you throw in a constructor, but if you don't in cases like this then you typically end up with validation checks in all functions that may be called.
So I think that you should do enough in the constructor to ensure you have a valid, usable, object created.
I'd like to add my own experience there.
I won't say much about the traditional Constructor/Init debate... for example, the Google guidelines advise against doing anything in the constructor, but that's because they advise against exceptions, and the two go together.
I can speak about a Connection class I use though.
When the Connection class is created, it will attempt to actually connect itself (at least, if not default constructed). If the Connection fails... the object is still constructed and you don't know about it.
When you try to use the Connection class you are thus in one of 3 cases:
no parameter has ever been specified > exception or error code
the object is actually connected > fine
the object is not connected, so it will attempt to connect > if this succeeds, fine; if this fails, you get an exception or an error code
I think it's quite useful to have both. However, it means that in every single method actually using the connection, you need to test whether or not it works.
It's worth it though because of disconnection events. When you are connected, you may lose the connection without the object knowing about it. By encapsulating the connection self-check in a reconnect method that is called internally by all methods needing a working connection, you really isolate the developers from dealing with these issues... or at least as much as you can, since when everything fails you have no other solution than letting them know :)
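A rough sketch of that structure. This is not the actual class described above; every name here is hypothetical, and setReachable is a test hook that stands in for a real network:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

class Connection {
public:
    Connection() = default;  // no parameters: never connected
    explicit Connection(std::string host) : host_(std::move(host)) {
        tryConnect();        // a failure here is not reported to the caller
    }
    // Every operation that needs the link goes through ensureConnected().
    std::string query(const std::string& q) {
        ensureConnected();
        return "result of " + q;
    }
    // Test hook simulating the remote end going away or coming back.
    void setReachable(bool reachable) {
        reachable_ = reachable;
        if (!reachable) connected_ = false;
    }
private:
    void tryConnect() { connected_ = reachable_ && !host_.empty(); }
    void ensureConnected() {
        if (host_.empty()) throw std::logic_error("no connection parameters");
        if (!connected_) tryConnect();  // transparent reconnect attempt
        if (!connected_) throw std::runtime_error("cannot connect to " + host_);
    }
    std::string host_;
    bool reachable_ = true;
    bool connected_ = false;
};
```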
Doing "real work" in a constructor is best avoided.
If I set up database connections, open files, etc. inside a constructor and one of those operations raises an exception, then any resource already acquired through a raw handle is leaked, because the destructor of the class does not run. This will compromise your application's exception safety.
Another reason to avoid doing work in a constructor is that it would make your application less testable. Suppose you are writing a credit-card payment processor. If the CreditCardProcessor class's constructor does all the work of connecting to a payment gateway, authenticating and billing the credit card, how do I ever write unit tests for the CreditCardProcessor class?
Coming to your scenario, if the routines that query the device do not raise any exceptions and you are not going to test the class in isolation, then it is probably preferable to do the work in the constructor and avoid calls to that extra init method.
There are a couple reasons I would use separate constructor/init():
Lazy/Delayed initialization. This allows you to create the object quickly, fast user response, and delay a more lengthy initialization for later or background processing. This is also a part of one or more design patterns concerning reusable object pools to avoid expensive allocation.
Not sure if this has a proper name, but perhaps when the object is created, the initialization information is unavailable or not understood by whoever is creating the object (for example, bulk generic object creation). Another section of code has the know-how to initialize it, but not create it.
As a personal reason, the destructor should be able to undo everything the constructor did. If that involves using internal init/deinit(), no problem, so long as they are mirror images of each other.
We're trying to build a class that gives the MFC CRecordset (or, really, the CODBCRecordset class) thread safety. Everything actually seems to work out pretty well for the various functions like opening and moving through the recordset (we enclose these calls in critical sections). However, one problem remains, a problem that seems to introduce deadlocks in practice.
The problem seems to lie in our constructor, like this:
CThreadSafeRecordset::CThreadSafeRecordset(void) : CODBCRecordset(g_db)
{   // <-- Deadlock!
}
The other thread might be one having ended up in CThreadSafeRecordset::Close() despite us guarding the enclosed Close call, but that doesn't really matter since the constructor is threading unaware. I assume the original CRecordset class is the culprit, doing bad things at construction time. I've looked around for programming techniques to work around this problem, but I'm unsure what could be the best solution? Since we have no code and can't control other code in our constructor, we can't wrap anything special in a critical section...?
Update: Thanks for the input; I've marked what I ended up with as the answer to my question. That, in combination with returning a shared_ptr as the returned instance for ease of updating the existing thread-unaware code.
You can make the CThreadSafeRecordset constructor private, then provide a public factory method that participates in your locking and returns an instance.
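A sketch of that factory in portable C++. UnsafeRecordset is a stand-in for CODBCRecordset, std::mutex stands in for the critical section, and dbMutex() is a hypothetical accessor; in real code it should be the same lock that already guards Open, Close and the other calls:

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Stand-in for the thread-unaware base class (CODBCRecordset in the question).
class UnsafeRecordset {
public:
    UnsafeRecordset() { ++constructed; }  // imagine this touches shared state
    static int constructed;               // instrumentation for this example only
};
int UnsafeRecordset::constructed = 0;

class ThreadSafeRecordset : public UnsafeRecordset {
public:
    // The base-class constructor now always runs while the lock is held.
    static std::shared_ptr<ThreadSafeRecordset> create() {
        std::lock_guard<std::mutex> lock(dbMutex());
        return std::shared_ptr<ThreadSafeRecordset>(new ThreadSafeRecordset);
    }
private:
    ThreadSafeRecordset() {}  // private: construction only through create()
    static std::mutex& dbMutex() {
        static std::mutex m;
        return m;
    }
};
```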
If there's no way to make CODBCRecordset move its thread-unsafe operations out of the constructor (a default constructor followed by an Initialize() call, say), you can always use composition instead of inheritance. Let CThreadSafeRecordset be a wrapper around CODBCRecordset instead of a subclass of it. That way, you can explicitly construct your recordset whenever you like, and can defend it with whatever rigor is appropriate.
The drawback, of course, is that you'll have to wrap every CODBCRecordset method you wish to expose, even the ones that don't relate to your threading guarantees. A C macro in the cpp file (so that it can't escape and afflict your clients) may help.
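The composition alternative might be sketched as follows, again with a stand-in for CODBCRecordset and std::mutex in place of a critical section. All names are hypothetical. The wrapped object is created inside the wrapper's constructor body, under the lock, and each exposed method has to be forwarded by hand:

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Stand-in for the thread-unaware recordset class.
class UnsafeRecordset {
public:
    void open() { open_ = true; }
    void close() { open_ = false; }
    bool isOpen() const { return open_; }
private:
    bool open_ = false;
};

class ThreadSafeRecordset {
public:
    ThreadSafeRecordset() {
        std::lock_guard<std::mutex> lock(dbMutex());
        inner_.reset(new UnsafeRecordset);  // constructed while holding the lock
    }
    // Hand-written forwarding wrappers, one per exposed method.
    void open()  { std::lock_guard<std::mutex> lock(dbMutex()); inner_->open(); }
    void close() { std::lock_guard<std::mutex> lock(dbMutex()); inner_->close(); }
    bool isOpen() const {
        std::lock_guard<std::mutex> lock(dbMutex());
        return inner_->isOpen();
    }
private:
    static std::mutex& dbMutex() {
        static std::mutex m;  // one lock shared by all recordsets
        return m;
    }
    std::unique_ptr<UnsafeRecordset> inner_;
};
```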