Is there a way to implement a singleton object in C++ that is:
Lazily constructed in a thread safe manner (two threads might simultaneously be the first user of the singleton - it should still only be constructed once).
Doesn't rely on static variables being constructed beforehand (so the singleton object is itself safe to use during the construction of static variables).
(I don't know my C++ well enough, but is it the case that integral and constant static variables are initialized before any code is executed (ie, even before static constructors are executed - their values may already be "initialized" in the program image)? If so - perhaps this can be exploited to implement a singleton mutex - which can in turn be used to guard the creation of the real singleton..)
Excellent, it seems that I have a couple of good answers now (shame I can't mark 2 or 3 as being the answer). There appears to be two broad solutions:
Use static initialisation (as opposed to dynamic initialisation) of a POD static variable, and implementing my own mutex with that using the builtin atomic instructions. This was the type of solution I was hinting at in my question, and I believe I knew already.
Use some other library function like pthread_once or boost::call_once. These I certainly didn't know about - and am very grateful for the answers posted.
Basically, you're asking for synchronized creation of a singleton, without using any synchronization (previously-constructed variables). In general, no, this is not possible. You need something available for synchronization.
As for your other question, yes, static variables which can be statically initialized (i.e. no runtime code necessary) are guaranteed to be initialized before other code is executed. This makes it possible to use a statically-initialized mutex to synchronize creation of the singleton.
From the 2003 revision of the C++ standard:
Objects with static storage duration (3.7.1) shall be zero-initialized (8.5) before any other initialization takes place. Zero-initialization and initialization with a constant expression are collectively called static initialization; all other initialization is dynamic initialization. Objects of POD types (3.9) with static storage duration initialized with constant expressions (5.19) shall be initialized before any dynamic initialization takes place. Objects with static storage duration defined in namespace scope in the same translation unit and dynamically initialized shall be initialized in the order in which their definition appears in the translation unit.
If you know that you will be using this singleton during the initialization of other static objects, I think you'll find that synchronization is a non-issue. To the best of my knowledge, all major compilers initialize static objects in a single thread, so thread-safety during static initialization. You can declare your singleton pointer to be NULL, and then check to see if it's been initialized before you use it.
However, this assumes that you know that you'll use this singleton during static initialization. This is also not guaranteed by the standard, so if you want to be completely safe, use a statically-initialized mutex.
Edit: Chris's suggestion to use an atomic compare-and-swap would certainly work. If portability is not an issue (and creating additional temporary singletons is not a problem), then it is a slightly lower overhead solution.
Unfortunately, Matt's answer features what's called double-checked locking which isn't supported by the C/C++ memory model. (It is supported by the Java 1.5 and later — and I think .NET — memory model.) This means that between the time when the pObj == NULL check takes place and when the lock (mutex) is acquired, pObj may have already been assigned on another thread. Thread switching happens whenever the OS wants it to, not between "lines" of a program (which have no meaning post-compilation in most languages).
Furthermore, as Matt acknowledges, he uses an int as a lock rather than an OS primitive. Don't do that. Proper locks require the use of memory barrier instructions, potentially cache-line flushes, and so on; use your operating system's primitives for locking. This is especially important because the primitives used can change between the individual CPU lines that your operating system runs on; what works on a CPU Foo might not work on CPU Foo2. Most operating systems either natively support POSIX threads (pthreads) or offer them as a wrapper for the OS threading package, so it's often best to illustrate examples using them.
If your operating system offers appropriate primitives, and if you absolutely need it for performance, instead of doing this type of locking/initialization you can use an atomic compare and swap operation to initialize a shared global variable. Essentially, what you write will look like this:
MySingleton *MySingleton::GetSingleton() {
if (pObj == NULL) {
// create a temporary instance of the singleton
MySingleton *temp = new MySingleton();
if (OSAtomicCompareAndSwapPtrBarrier(NULL, temp, &pObj) == false) {
// if the swap didn't take place, delete the temporary instance
delete temp;
}
}
return pObj;
}
This only works if it's safe to create multiple instances of your singleton (one per thread that happens to invoke GetSingleton() simultaneously), and then throw extras away. The OSAtomicCompareAndSwapPtrBarrier function provided on Mac OS X — most operating systems provide a similar primitive — checks whether pObj is NULL and only actually sets it to temp to it if it is. This uses hardware support to really, literally only perform the swap once and tell whether it happened.
Another facility to leverage if your OS offers it that's in between these two extremes is pthread_once. This lets you set up a function that's run only once - basically by doing all of the locking/barrier/etc. trickery for you - no matter how many times it's invoked or on how many threads it's invoked.
Here's a very simple lazily constructed singleton getter:
Singleton *Singleton::self() {
static Singleton instance;
return &instance;
}
This is lazy, and the next C++ standard (C++0x) requires it to be thread safe. In fact, I believe that at least g++ implements this in a thread safe manner. So if that's your target compiler or if you use a compiler which also implements this in a thread safe manner (maybe newer Visual Studio compilers do? I don't know), then this might be all you need.
Also see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2513.html on this topic.
You can't do it without any static variables, however if you are willing to tolerate one, you can use Boost.Thread for this purpose. Read the "one-time initialisation" section for more info.
Then in your singleton accessor function, use boost::call_once to construct the object, and return it.
For gcc, this is rather easy:
LazyType* GetMyLazyGlobal() {
static const LazyType* instance = new LazyType();
return instance;
}
GCC will make sure that the initialization is atomic. For VC++, this is not the case. :-(
One major issue with this mechanism is the lack of testability: if you need to reset the LazyType to a new one between tests, or want to change the LazyType* to a MockLazyType*, you won't be able to. Given this, it's usually best to use a static mutex + static pointer.
Also, possibly an aside: It's best to always avoid static non-POD types. (Pointers to PODs are OK.) The reasons for this are many: as you mention, initialization order isn't defined -- neither is the order in which destructors are called though. Because of this, programs will end up crashing when they try to exit; often not a big deal, but sometimes a showstopper when the profiler you are trying to use requires a clean exit.
While this question has already been answered, I think there are some other points to mention:
If you want lazy-instantiation of the singleton while using a pointer to a dynamically allocated instance, you'll have to make sure you clean it up at the right point.
You could use Matt's solution, but you'd need to use a proper mutex/critical section for locking, and by checking "pObj == NULL" both before and after the lock. Of course, pObj would also have to be static ;)
.
A mutex would be unnecessarily heavy in this case, you'd be better going with a critical section.
But as already stated, you can't guarantee threadsafe lazy-initialisation without using at least one synchronisation primitive.
Edit: Yup Derek, you're right. My bad. :)
You could use Matt's solution, but you'd need to use a proper mutex/critical section for locking, and by checking "pObj == NULL" both before and after the lock. Of course, pObj would also have to be static ;) . A mutex would be unnecessarily heavy in this case, you'd be better going with a critical section.
OJ, that doesn't work. As Chris pointed out, that's double-check locking, which is not guaranteed to work in the current C++ standard. See: C++ and the Perils of Double-Checked Locking
Edit: No problem, OJ. It's really nice in languages where it does work. I expect it will work in C++0x (though I'm not certain), because it's such a convenient idiom.
read on weak memory model. It can break double-checked locks and spinlocks. Intel is strong memory model (yet), so on Intel it's easier
carefully use "volatile" to avoid caching of parts the object in registers, otherwise you'll have initialized the object pointer, but not the object itself, and the other thread will crash
the order of static variables initialization versus shared code loading is sometimes not trivial. I've seen cases when the code to destruct an object was already unloaded, so the program crashed on exit
such objects are hard to destroy properly
In general singletons are hard to do right and hard to debug. It's better to avoid them altogether.
I suppose saying don't do this because it's not safe and will probably break more often than just initializing this stuff in main() isn't going to be that popular.
(And yes, I know that suggesting that means you shouldn't attempt to do interesting stuff in constructors of global objects. That's the point.)
Related
It has been asked whether set_new_handler is safe in a multi-threaded environment. I would like to know if C++11 and later implementations of the C++ standard library use the thread_local feature to store the std::new_handler. If implementations do not use thread_local, why not? That would seem to make the feature a bit more robust in multi-threaded programs.
I also don't really see how set_new_handler would work for a class that overloads new and sets a std::new_handler and creates (and owns reference to) an object that also overloads new and sets its own std::new_handler. I would expect that would be especially egregious if the owning class has options to reap/free memory, while the owned object decides to call abort/terminate.
I expect the usual advice is to use nothrowversion of new and forget set_new_handler. If that is a universal opinion, why isn't set_new_handler a deprecated feature in the standard library?
The new_handler is not, and was never expected to be, a thread-local construct. It was always global and that is by design.
The new handler is intended to be set once and left alone; it's not meant for temporary changes that are local to a single execution thread. You can build such a facility for yourself, by creating a new handler that defers its operations to a thread_local variable which your code can set. But the C++ new handler feature itself is not meant for such use cases.
Now, one could say that this was a legacy decision. C++98/03's memory model does not consider the possibility of threading. So any pre-C++11 threaded implementations essentially decided for themselves which operations were thread-safe, were thread-local, and so forth. Implementations that decided to implement the new-handler as a global construct were no less correct than implementations that made it per-thread. Indeed, implementations probably settled on the global option at some point, and C++11 just adopted it. And changing it to be thread-local would break peoples' code.
But at the same time, the point of the new handler is to allow the user to change the default handling of allocation failures. Default error handing is not a thread-local concept; it's something that is universal throughout the program. Much like the termination function and so forth, there is precisely one that is expected to be used.
To put it simply, code that used a class-local new handler was always wrong to some degree, even in the C++98/03 days. It's just that until C++11 standardized threading, it was something that you could get away with (at least as far as the standard was concerned). And it isn't deprecated because it is useful for its intended purpose. nothrow is only useful locally; set_new_handler is useful globally.
I recently came across the Nifty Counter Idiom. My understanding is that this is used to implement globals in the standard library like cout, cerr, etc. Since the experts have chosen it, I assume that it's a very strong technique.
I'm trying to understand what the advantage is over using something more like a Meyer Singleton.
For example, one could just have, in a header file:
inline Stream& getStream() { static Stream s; return s; }
static Stream& stream = getStream();
The advantage is you don't have to worry about reference counting, or placement new, or having two classes, i.e. the code is much simpler. Since it's not done this way, I'm sure there's a reason:
Is this not guaranteed to have a single global object across shared and static libraries? It seems like the ODR should guarantee that there can only be one static variable.
Is there some kind of performance cost? It seems like in both my code and the Nifty Counter, you are following one reference to get to the object.
Is there some situations where the reference counting is actually useful? It seems like it will still just lead to the object being constructed if the header is included, and destroyed at program end, like the Meyer Singleton.
Does the answer involve dlopen'ing something manually? I don't have too much experience with that.
Edit: I was prompted to write the following bit of code while reading Yakk's answer, I add it to the original question as a quick demo. It's a very minimal example that shows how using the Meyer Singleton + a global reference leads to initialization before main: http://coliru.stacked-crooked.com/a/a7f0c8f33ba42b7f.
The static local/Meyer's singleton + static global reference (your solution) is nearly equivalent to the nifty counter.
The differences are as follows:
No .cpp file is required in your solution.
Technically the static Steam& exists in every compilation unit; the object being referred to does not. As there is no way to detect this in the current version of C++, under as-if this goes away. But some implementations might actually create that reference instead of eliding it.
Someone could call getStream() prior to the static Stream& being created; this would cause difficulty in destruction order (with the stream being destroyed later than expected). This can be avoided by making that against the rules.
The standard is mandated to make creating the static Stream local in the inline getStream thread safe. Detecting that this is not going to happen is challenging for the compiler, so some redundant thread-safety overhead may exist in your solution. The nifty counter does not support thread safety explicitly; this is considered safe as it runs at static initialization time, prior to when threads are expected.
The call to getStream() must occur in each and every compilation unit. Only if it is proven that it cannot do anything may it be optimized out, which is difficult. The nifty counter has a similar cost, but the operation may or may not be be simpler to optimize out or in runtime cost. (Determining this will require inspecting resulting assembly output on a variety of compilers)
"magic statics" (statics locals without race conditions) where introduced in C++11. There could be other issues prior to C++11 magic statics with your code; the only one I can think of is someone calling getStream() directly in another thread during static initialization, which (as mentioned above) should be banned in general.
Outside the realm of the standard, your version will automatically and magically create a new singleton in each dynamicly linked chunk of code (DLL, .so, etc). The nifty counter will only create the singleton in the cpp file. This may give the library writer tighter control over accidentally spawning new singletons; they can stick it into the dynamic library, instead of spawning duplicates.
Avoiding having more than one singleton is sometimes important.
Summarizing the answers and comments:
Let's compare 3 different options for a library, wishing to present a global Singleton, as a variable or via a getter function:
Option 1 - the nifty counter pattern, allowing the use of a global variable that is:
assured to be created once
assured to be created before the first usage
assured to be created once across all shared objects that are dynamically linked with the library creating this global variable.
Option 2 - the Meyers singleton pattern with a reference variable (as presented in the question):
assured to be created once
assured to be created before the first usage
However, it will create a copy of the singleton object in shared objects, even if all shared objects and the main are linked dynamically with the library. This is because the Singleton reference variable is declared static in a header file and must have its initialization ready at compile time wherever it is used, including in shared objects, during compile time, before meeting the program they will be loaded to.
Option 3 - the Meyers singleton pattern without a reference variable (calling a getter for retrieving the Singleton object):
assured to be created once
assured to be created before the first usage
assured to be created once across all shared objects that are dynamically linked with the library creating this Singleton.
However, in this option there is no global variable nor inline call, each call for retrieving the Singleton is a function call (that can be cached on the caller side).
This option would look like:
// libA .h
struct A {
A();
};
A& getA();
// some other header
A global_a2 = getA();
// main
int main() {
std::cerr << "main\n";
}
// libA .cpp - need to be dynamically linked! (same as libstdc++ is...)
// thus the below shall be created only once in the process
A& getA() {
static A a;
return a;
}
A::A() { std::cerr << "construct A\n"; }
All of your questions about utility / performance of Nifty Counter aka Schwartz Counter were basically answered by Maxim Egorushkin in this answer (but see also the comment threads).
Global variables in modern C++
The main issue is that there is a trade-off taking place. When you use Nifty Counter your program startup time is a bit slower (in large projects), since all these counters have to run before anything can happen. That doesn't happen in Meyer's singleton.
However, in the Meyer's singleton, every time you want to access the global object, you have to check if it's null, or, the compiler emits code that checks if the static variable was already constructed before any access is attempted. In the Nifty Counter, you have your pointer already and you just fire away, since you can assume the init happened at startup time.
So, Nifty Counter vs. Meyer's singleton is basically a trade-off between program startup time and run-time.
With the solution you have here, the global stream variable gets assigned at some point during static initialization, but it is unspecified when. Therefore the use of stream from other compilation units during static initialization may not work. Nifty counter is a way to guarantee that a global (e.g. std::cout) is usable even during static initialization.
#include <iostream>
struct use_std_out_in_ctor
{
use_std_out_in_ctor()
{
// std::cout guaranteed to be initialized even if this
// ctor runs during static initialization
std::cout << "Hello world" << std::endl;
}
};
use_std_out_in_ctor global; // causes ctor to run during static initialization
int main()
{
std::cout << "Did it print Hello world?" << std::endl;
}
Recently a fellow worker showed to me a code like this:
void SomeClass::function()
{
static bool init = false;
if (!init)
{
// hundreds of lines of ugly code
}
init = true;
}
He wants to check if SomeClass is initialized in order to execute some piece of code once per Someclass instance but the fact is that only one instance of SomeClass will exist in all the lifetime of the program.
His question were about the init static variable, about when it's initialized. I've answered that the initialization occurs once, so the value will be false at first call and true the rest of its lifetime. After answering I've added that such use of static variables is bad practice but I haven't been able to explain why.
The reasons that I've been thinking so far are the following:
The behaviour of static bool init into SomeClass::function could be achieved with a non-static member variable.
Other functions in SomeClass couldn't check the static bool init value because it's visibility is limited to the void SomeClass::function() scope.
The static variables aren't OOPish because they define a global state instead of a object state.
This reasons looks poor, unclever and not very concrete to me so I'm asking for more reasons to explain why the use of static variables in function and member-function space are a bad practice.
Thanks!
This is certainly a rare occurrence, at least, in good quality code, because of the narrow case for which it's appropriate. What this basically does is a just-in-time initialization of a global state (to deliver some global functionality). A typical example of this is having a random number generator function that seeds the generator at the first call to it. Another typical use of this is a function that returns the instance of a singleton, initialized on the first call. But other use-case examples are few and far between.
In general terms, global state is not desirable, and having objects that contain self-sufficient states is preferred (for modularity, etc.). But if you need global state (and sometimes you do), you have to implement it somehow. If you need any kind of non-trivial global state, then you should probably go with a singleton class, and one of the preferred ways to deliver that application-wide single instance is through a function that delivers a reference to a local static instance initialized on the first call. If the global state needed is a bit more trivial, then doing the scheme with the local static bool flag is certainly an acceptable way to do it. In other words, I see no fundamental problem with employing that method, but I would naturally question its motivations (requiring a global state) if presented with such code.
As is always the case for global data, multi-threading will cause some problems with a simplistic implementation like this one. Naive introductions of global state are never going to be inherently thread-safe, and this case is no exception, you'd have to take measures to address that specific problem. And that is part of the reasons why global states are not desirable.
The behaviour of static bool init into SomeClass::function could be achieved with a non-static member variable.
If there is an alternative to achieve the same behavior, then the two alternatives have to be judged on the technical issues (like thread-safety). But in this case, the required behavior is the questionable thing, more so than the implementation details, and the existence of alternative implementations doesn't change that.
Second, I don't see how you can replace a just-in-time initialization of a global state by anything that is based on a non-static data member (a static data member, maybe). And even if you can, it would be wasteful (require per-object storage for a one-time-per-program-execution thing), and on that ground alone, wouldn't make it a better alternative.
Other functions in SomeClass couldn't check the static bool init value because it's visibility is limited to the void SomeClass::function() scope.
I would generally put that in the "Pro" column (as in Pro/Con). This is a good thing. This is information hiding or encapsulation. If you can hide away things that shouldn't be a concern to others, then great! But if there are other functions that would need to know that the global state has already been initialized or not, then you probably need something more along the lines of a singleton class.
The static variables aren't OOPish because they define a global state instead of a object state.
OOPish or not, who cares? But yes, the global state is the concern here. Not so much the use of a local static variable to implement its initialization. Global states, especially mutable global states, are bad in general and should never be abused. They hinder modularity (modules are less self-sufficient if they rely on global states), they introduce multi-threading concerns since they are inherently shared data, they make any function that use them non-reentrant (non-pure), they make debugging difficult, etc... the list goes on. But most of these issues are not tied to how you implement it. On the other hand, using a local static variable is a good way to solve the static-initialization-order-fiasco, so, they are good for that reason, one less problem to worry about when introducing a (well-justified) global state into your code.
Think multi-threading. This type of code is problematic when function() can be called concurrently by multiple threads. Without locking, you're open to race conditions; with locking, concurrency can suffer for no real gain.
Global state is probably the worst problem here. Other functions don't have to be concerned with it, so it's not an issue. The fact that it can be achieved without static variable essentially means you made some form of a singleton. Which of course introduces all problems that singleton has, like being totally unsuitable for multithreaded environment, for one.
Adding to what others said, you can't have multiple objects of this class at the same time, or at least would they not behave as expected. The first instance would set the static variable and do the initialization. The ones created later though would not have their own version of init but share it with all other instances. Since the first instance set it to true, all following won't do any initialization, which is most probably not what you want.
My specific question is that when implementing a singleton class in C++, is there any substantial differences between the two below codes regarding performance, side issues or something:
class singleton
{
// ...
static singleton& getInstance()
{
// allocating on heap
static singleton* pInstance = new singleton();
return *pInstance;
}
// ...
};
and this:
class singleton
{
// ...
static singleton& getInstance()
{
// using static variable
static singleton instance;
return instance;
}
// ...
};
(Note that dereferencing in the heap-based implementation should not affect performance, as AFAIK there is no extra machine-code generated for dereferencing. It's seems only a matter of syntax to distinguish from pointers.)
UPDATE:
I've got interesting answers and comments which I try to summarize them here. (Reading detailed answers is recommended for those interested.):
In the singleton using static local variable, the class destructor is automatically invoked at process termination, whereas in the dynamic allocation case, you have to manage object destruction someway at sometime, e.g. by using smart pointers:
static singleton& getInstance() {
static std::auto_ptr<singleton> instance (new singleton());
return *instance.get();
}
The singleton using dynamic allocation is "lazier" than the static singleton variable, as in the later case, the required memory for the singleton object is (always?) reserved at process start-up (as part of the whole memory required for loading program) and only calling of the singleton constructor is deferred to getInstance() call-time. This may matter when sizeof(singleton) is large.
Both are thread-safe in C++11. But with earlier versions of C++, it's implementation-specific.
The dynamic allocation case uses one level of indirection to access the singleton object, whereas in the static singleton object case, direct address of the object is determined and hard-coded at compile-time.
P.S.: I have corrected the terminology I'd used in the original posting according to the #TonyD's answer.
the new version obviously needs to allocate memory at run-time, whereas the non-pointer version has the memory allocated at compile time (but both need to do the same construction)
the new version won't invoke the object's destructor at program termination, but the non-new version will: you could use a smart pointer to correct this
you need to be careful that some static/namespace-scope object's destructors don't invoke your singleton after its static local instance's destructor has run... if you're concerned about this, you should perhaps read a bit more about Singleton lifetimes and approaches to managing them. Andrei Alexandrescu's Modern C++ Design has a very readable treatment.
under C++03, it's implementation-defined whether either will be thread safe. (I believe GCC tends to be, whilst Visual Studio tends not -comments to confirm/correct appreciated.)
under C++11, it's safe: 6.7.4 "If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization." (sans recursion).
Discussion re compile-time versus run-time allocation & initialisation
From the way you've worded your summary and a few comments, I suspect you're not completely understanding a subtle aspect of the allocation and initialisation of static variables....
Say your program has 3 local static 32-bit ints - a, b and c - in different functions: the compiler's likely to compile a binary that tells the OS loader to leave 3x32-bits = 12 bytes of memory for those statics. The compiler decides what offsets each of those variables is at: it may put a at offset 1000 hex in the data segment, b at 1004, and c at 1008. When the program executes, the OS loader doesn't need to allocate memory for each separately - all it knows about is the total of 12 bytes, which it may or may not have been asked specifically to 0-initialise, but it may want to do anyway to ensure the process can't see left over memory content from other users' programs. The machine code instructions in the program will typically hard-code the offsets 1000, 1004, 1008 for accesses to a, b and c - so no allocation of those addresses is needed at run-time.
Dynamic memory allocation is different in that the pointers (say p_a, p_b, p_c) will be given addresses at compile time as just described, but additionally:
the pointed-to memory (each of a, b and c) has to be found at run-time (typically when the static function first executes but the compiler's allowed to do it earlier as per my comment on the other answer), and
if there's too little memory currently given to the process by the Operating System for the dynamic allocation to succeed, then the program library will ask the OS for more memory (e.g. using sbreak()) - which the OS will typically wipe out for security reasons
the dynamic addresses allocated for each of a, b and c have to be copied back into the pointers p_a, p_b and p_c.
This dynamic approach is clearly more convoluted.
The main difference is that using a local static the object will be destroyed when closing the program, instead heap-allocated objects will just be abandoned without being destroyed.
Note that in C++ if you declare a static variable inside a function it will be initialized the first time you enter the scope, not at program start (like it happens instead for global static duration variables).
In general over the years I switched from using lazy initialization to explicit controlled initialization because program startup and shutdown are delicate phases and quite difficult to debug. If your class is not doing anything complex and just cannot fail (e.g. it's just a registry) then even lazy initialization is fine... otherwise being in control will save you quite a lot of problems.
A program that crashes before entering the first instruction of main or after executing last instruction of main is harder to debug.
Another problem of using lazy construction of singletons is that if your code is multithread you've to pay attention to the risk of having concurrent threads initializing the singleton at the same time. Doing initialization and shutdown in a single thread context is simpler.
The possible races during initialization of function-level static instances in multithreaded code has been resolved since C++11, when the language added official multithreading support: for normal cases proper synchronization guards are automatically added by the compiler so this is not a concern in C++11 or later code. However if initialization of a static in function a calls function b and vice-versa you can risk a deadlock if the two functions are called the first time at the same time by different threads (this is not an issue only if the compiler uses a single mutex for all statics). Note also that calling the function that contains a static object from within the initialization code of the static object recursively is not permitted.
This may be a subjective question, but I'm more or less asking it and hoping that people share their experiences. (As that is the biggest thing which I lack in C++)
Anyways, suppose I have -for some obscure reason- an initialize function that initializes a datastructure from the heap:
void initialize() {
initialized = true;
pointer = new T;
}
now When I would call the initialize function twice, an memory leak would happen (right?). So I can prevent this is multiple ways:
ignore the call (just check wether I am initialized, and if I am don't do anything)
Throw an error
automatically "cleanup" the code and then reinitialize the thing.
Now what is generally the "best" method, which helps keeping my code manegeable in the future?
EDIT: thank you for the answers so far. However I'd like to know how people handle this is a more generic way. - How do people handle "simple" errors which can be ignored. (like, calling the same function twice while only 1 time it makes sense).
You're the only one who can truly answer the question : do you consider that the initialize function could eventually be called twice, or would this mean that your program followed an unexpected execution flow ?
If the initialize function can be called multiple times : just ignore the call by testing if the allocation has already taken place.
If the initialize function has no decent reason to be called several times : I believe that would be a good candidate for an exception.
Just to be clear, I don't believe cleanup and regenerate to be a viable option (or you should seriously consider renaming the function to reflect this behavior).
This pattern is not unusual for on-demand or lazy initialization of costly data structures that might not always be needed. Singleton is one example, or for a class data member that meets those criteria.
What I would do is just skip the init code if the struct is already in place.
void initialize() {
if (!initialized)
{
initialized = true;
pointer = new T;
}
}
If your program has multiple threads you would have to include locking to make this thread-safe.
I'd look at using boost or STL smart pointers.
I think the answer depends entirely on T (and other members of this class). If they are lightweight and there is no side-effect of re-creating a new one, then by all means cleanup and re-create (but use smart pointers). If on the other hand they are heavy (say a network connection or something like that), you should simply bypass if the boolean is set...
You should also investigate boost::optional, this way you don't need an overall flag, and for each object that should exist, you can check to see if instantiated and then instantiate as necessary... (say in the first pass, some construct okay, but some fail..)
The idea of setting a data member later than the constructor is quite common, so don't worry you're definitely not the first one with this issue.
There are two typical use cases:
On demand / Lazy instantiation: if you're not sure it will be used and it's costly to create, then better NOT to initialize it in the constructor
Caching data: to cache the result of a potentially expensive operation so that subsequent calls need not compute it once again
You are in the "Lazy" category, in which case the simpler way is to use a flag or a nullable value:
flag + value combination: reuse of existing class without heap allocation, however this requires default construction
smart pointer: this bypass the default construction issue, at the cost of heap allocation. Check the copy semantics you need...
boost::optional<T>: similar to a pointer, but with deep copy semantics and no heap allocation. Requires the type to be fully defined though, so heavier on dependencies.
I would strongly recommend the boost::optional<T> idiom, or if you wish to provide dependency insulation you might fall back to a smart pointer like std::unique_ptr<T> (or boost::scoped_ptr<T> if you do not have access to a C++0x compiler).
I think that this could be a scenario where the Singleton pattern could be applied.