How does getInstance() work? - c++

Recent I read some C++ code using extensively following getInstance() method:
class S
{
private:
int some_int = 0;
public:
static S& getInstance()
{
static S instance; / (*) /
return instance;
}
};
From how this code fragment being used, I learned the getInstance() works like return this, returning the address(or ref) of the instance of class S. But I got confused.
1) where does the static variable S defined in line(*) allocated in memory? And why it can work like return this?
2) what if there exist more than one instance of class S, whose reference will be returned?

This is the so-called Singleton design pattern. Its distinguishing feature is that there can only ever be exactly one instance of that class and the pattern ensures that. The class has a private constructor and a statically-created instance that is returned with the getInstance method. You cannot create an instance from the outside and thus get the object only through said method.
Since instance is static in the getInstance method it will retain its value between multiple invocations. It is allocated and constructed somewhen before it's first used. E.g. in this answer it seems like GCC initializes the static variable at the time the function is first used. This answer has some excerpts from the C++ standard related to that.

Static variable with a function scope is allocated first time the function is called. The compiler keeps track of the fact that is is initialized the first time and avoids further creation during the next visit to the function. This property would be ideal to implement a singleton pattern since this ensures we just maintain a single copy of object. In addition newer compilers makes sure this allocation is also thread safe giving a bonus thread safe singleton implementation without using any dynamic memory allocation. [As pointed out in comments, it is c++11 standard which guarantees the thread safety so do your check about thread safety if you are using a different compiler]
1) where does the static variable S defined in line(*) allocated in
memory? And why it can work like return this?
There is specific regions in memory where static variables are stored, like heap for dynamically allocated variables and stack for the typical compile time defined variables. It could vary with compilers but for GCC there are specific sections called DATA and BSS segments with in the executable generated by the compiler where initialized and uninitialized static variables are stored.
2) what if there exist more than one instance of class S, whose
reference will be returned?
As mentioned on the top, since it is a static variable the compiler ensures there can only be one instance created the first time function is visited. Also since it has the scope with in the function it cannot clash with any other instances existing else where and getInstance ensures you see the same single instance.

Related

Does a reference singleton go on the stack or the heap?

I've read a lot of articles here about singletons, but none really touch my issue. I understand Singletons should only be use when needed and, in my game, I am using them for specific parts of the engine.
That said, I originally had my singletons as pointers like this:
static MapReader* Instance()
{
if (instance == 0)
{
instance = new MapReader();
return instance;
}
return instance;
}
However I always felt like using too many pointers is bad for leaks and I prefer not using them if I can help it (or smart pointers if I have to). So I've changed all my singletons to references like this:
static MapReader& Instance()
{
static MapReader instance;
return instance;
}
However, now I've notice my game to lag at odd times and then speed up, like the FPS are a little wonky.
My question is that; does the reference singleton all pile up on the stack? or do they still get allocated on the heap? And should I change them back to pointers using smart pointers?
It's almost certainly not on the heap by virtue of the fact it's not created with new.
However, the standard is silent on where static variables are placed, mentioning only their behaviour. See for example, C++11 3.7.1:
1/ All variables which do not have dynamic storage duration, do not have thread storage duration, and are not local have static storage duration. The storage for these entities shall last for the duration of the program (3.6.2, 3.6.3).
2/ If a variable with static storage duration has initialization or a destructor with side effects, it shall not be eliminated even if it appears to be unused, except that a class object or its copy/move may be eliminated as specified in 12.8.
3/ The keyword static can be used to declare a local variable with static storage duration.
4/ The keyword static applied to a class data member in a class definition gives the data member static storage duration.
That's pretty much the extent of what the standard itself imposes on them.
Most implementations would likely have an area separate from both heap and stack for storing variables of static storage duration.
It's also almost certainly not the use of statics and references that are slowing your code down. Having looked deeply into compilers and how they work, statics tend to be at least as fast as other variables since they're very quick to locate in memory.
As an aside, your pointer variant of the singleton has two issues. The first is a potential race condition if you're working in a threaded environment. There's a possibility that more than one object may be created if distinct threads call Instance().
Specifically, if thread A enters the if statement then thread B starts running, B can go through, create an object then return. If A then continues, it will create a new object.
If you're single-threaded, or you create the instance only from one thread, or you create the instance before other threads are running, you should be fine.
The second issue is just eye candy. There's no need to return the object from within the if block, since it will be returned when you get to the bottom of the function:
static MapReader* Instance() {
if (instance == 0)
instance = new MapReader();
return instance;
}
Neither. Static variables inside functions reside in the data section. For the record, so do static variables outside functions, and global ones.

Heap/dynamic vs. static memory allocation for C++ singleton class instance

My specific question is that when implementing a singleton class in C++, is there any substantial differences between the two below codes regarding performance, side issues or something:
class singleton
{
// ...
static singleton& getInstance()
{
// allocating on heap
static singleton* pInstance = new singleton();
return *pInstance;
}
// ...
};
and this:
class singleton
{
// ...
static singleton& getInstance()
{
// using static variable
static singleton instance;
return instance;
}
// ...
};
(Note that dereferencing in the heap-based implementation should not affect performance, as AFAIK there is no extra machine-code generated for dereferencing. It's seems only a matter of syntax to distinguish from pointers.)
UPDATE:
I've got interesting answers and comments which I try to summarize them here. (Reading detailed answers is recommended for those interested.)‎:
In the singleton using static local variable, the class destructor is automatically invoked at process termination, whereas in the dynamic allocation case, you have to manage object destruction someway at sometime, e.g. by using smart pointers:
static singleton& getInstance() {
static std::auto_ptr<singleton> instance (new singleton());
return *instance.get();
}
The singleton using dynamic allocation is "lazier" than the static singleton variable, as in the later case, the required memory for the singleton object is (always?) reserved at process start-up (as part of the whole memory required for loading program) and only calling of the singleton constructor is deferred to getInstance() call-time. This may matter when sizeof(singleton) is large.
Both are thread-safe in C++11. But with earlier versions of C++, it's implementation-specific.
The dynamic allocation case uses one level of indirection to access the singleton object, whereas in the static singleton object case, direct address of the object is determined and hard-coded at compile-time.
P.S.: I have corrected the terminology I'd used in the original posting according to the #TonyD's answer.
the new version obviously needs to allocate memory at run-time, whereas the non-pointer version has the memory allocated at compile time (but both need to do the same construction)
the new version won't invoke the object's destructor at program termination, but the non-new version will: you could use a smart pointer to correct this
you need to be careful that some static/namespace-scope object's destructors don't invoke your singleton after its static local instance's destructor has run... if you're concerned about this, you should perhaps read a bit more about Singleton lifetimes and approaches to managing them. Andrei Alexandrescu's Modern C++ Design has a very readable treatment.
under C++03, it's implementation-defined whether either will be thread safe. (I believe GCC tends to be, whilst Visual Studio tends not -comments to confirm/correct appreciated.)
under C++11, it's safe: 6.7.4 "If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization." (sans recursion).
Discussion re compile-time versus run-time allocation & initialisation
From the way you've worded your summary and a few comments, I suspect you're not completely understanding a subtle aspect of the allocation and initialisation of static variables....
Say your program has 3 local static 32-bit ints - a, b and c - in different functions: the compiler's likely to compile a binary that tells the OS loader to leave 3x32-bits = 12 bytes of memory for those statics. The compiler decides what offsets each of those variables is at: it may put a at offset 1000 hex in the data segment, b at 1004, and c at 1008. When the program executes, the OS loader doesn't need to allocate memory for each separately - all it knows about is the total of 12 bytes, which it may or may not have been asked specifically to 0-initialise, but it may want to do anyway to ensure the process can't see left over memory content from other users' programs. The machine code instructions in the program will typically hard-code the offsets 1000, 1004, 1008 for accesses to a, b and c - so no allocation of those addresses is needed at run-time.
Dynamic memory allocation is different in that the pointers (say p_a, p_b, p_c) will be given addresses at compile time as just described, but additionally:
the pointed-to memory (each of a, b and c) has to be found at run-time (typically when the static function first executes but the compiler's allowed to do it earlier as per my comment on the other answer), and
if there's too little memory currently given to the process by the Operating System for the dynamic allocation to succeed, then the program library will ask the OS for more memory (e.g. using sbreak()) - which the OS will typically wipe out for security reasons
the dynamic addresses allocated for each of a, b and c have to be copied back into the pointers p_a, p_b and p_c.
This dynamic approach is clearly more convoluted.
The main difference is that using a local static the object will be destroyed when closing the program, instead heap-allocated objects will just be abandoned without being destroyed.
Note that in C++ if you declare a static variable inside a function it will be initialized the first time you enter the scope, not at program start (like it happens instead for global static duration variables).
In general over the years I switched from using lazy initialization to explicit controlled initialization because program startup and shutdown are delicate phases and quite difficult to debug. If your class is not doing anything complex and just cannot fail (e.g. it's just a registry) then even lazy initialization is fine... otherwise being in control will save you quite a lot of problems.
A program that crashes before entering the first instruction of main or after executing last instruction of main is harder to debug.
Another problem of using lazy construction of singletons is that if your code is multithread you've to pay attention to the risk of having concurrent threads initializing the singleton at the same time. Doing initialization and shutdown in a single thread context is simpler.
The possible races during initialization of function-level static instances in multithreaded code has been resolved since C++11, when the language added official multithreading support: for normal cases proper synchronization guards are automatically added by the compiler so this is not a concern in C++11 or later code. However if initialization of a static in function a calls function b and vice-versa you can risk a deadlock if the two functions are called the first time at the same time by different threads (this is not an issue only if the compiler uses a single mutex for all statics). Note also that calling the function that contains a static object from within the initialization code of the static object recursively is not permitted.

C++ - Prevent global instantiation?

Is there a way to force a class to be instantiated on the stack or at least prevent it to be global in C++?
I want to prevent global instantiation because the constructor calls C APIs that need previous initialization. AFAIK there is no way to control the construction order of global objects.
Edit: The application targets an embedded device for which dynamic memory allocation is also prohibited. The only possible solution for the user to instanciate the class is either on the stack or through a placement new operator.
Edit2: My class is part of a library which depends on other external libraries (from which come the C APIs). I can't modify those libraries and I can't control the way libraries are initialized in the final application, that's why I am looking for a way to restrict how the class could be used.
Instead of placing somewhat arbitrary restrictions on objects of your class I'd rather make the calls to the C API safe by wrapping them into a class. The constructor of that class would do the initialization and the destructor would release acquired resources.
Then you can require this class as an argument to your class and initialization is always going to work out.
The technique used for the wrapper is called RAII and you can read more about it in this SO question and this wiki page. It originally was meant to combine encapsulate resource initialization and release into objects, but can also be used for a variety of other things.
Half an answer:
To prevent heap allocation (so only allow stack allocation) override operator new and make it private.
void* operator new( size_t size );
EDIT: Others have said just document the limitations, and I kind of agree, nevertheless and just for the hell of it: No Heap allocation, no global allocation, APIs initialised (not quite in the constructor but I would argue still good enough):
class Boogy
{
public:
static Boogy* GetBoogy()
{
// here, we intilialise the APIs before calling
// DoAPIStuffThatRequiresInitialisationFirst()
InitAPIs();
Boogy* ptr = new Boogy();
ptr->DoAPIStuffThatRequiresInitialisationFirst();
return ptr;
}
// a public operator delete, so people can "delete" what we give them
void operator delete( void* ptr )
{
// this function needs to manage marking array objects as allocated
// or not
}
private:
// operator new returns STACK allocated objects.
void* operator new( size_t size )
{
Boogy* ptr = &(m_Memory[0]);
// (this function also needs to manage marking objects as allocated
// or not)
return ptr;
}
void DoAPIStuffThatRequiresInitialisationFirst()
{
// move the stuff that requires initiaisation first
// from the ctor into HERE.
}
// Declare ALL ctors private so no uncontrolled allocation,
// on stack or HEAP, GLOBAL or otherwise,
Boogy(){}
// All Boogys are on the STACK.
static Boogy m_Memory[10];
};
I don't know if I'm proud or ashamed! :-)
You cannot, per se, prevent putting objects as globals. And I would argue you should not try: after all, why cannot build an object that initialize those libraries, instantiate it globally, and then instantiate your object globally ?
So, let me rephrase the question to drill down to its core:
How can I prevent my object from being constructed before some initialization work has been done ?
The response, in general, is: depends.
It all boils down at what the initialization work is, specifically:
is there a way to detect it has not been called yet ?
are there drawbacks to calling the initialization functions several times ?
For example, I can create the following initializer:
class Initializer {
public:
Initializer() { static bool _ = Init(); (void)_; }
protected:
// boilerplate to prevent slicing
Initializer(Initializer&&) = default;
Initializer(Initializer const&) = default;
Initializer& operator=(Initializer) = default;
private:
static bool Init();
}; // class Initializer
The first time this class is instantiated, it calls Init, and afterwards this is ignored (at the cost of a trivial comparison). Now, it's trivial to inherit (privately) from this class to ensure that by the time your constructor's initializer list or body is called the initialization required has been performed.
How should Init be implemented ?
Depends on what's possible and cheaper, either detecting the initialization is done or calling the initialization regardless.
And if the C API is so crappy you cannot actually do either ?
You're toast. Welcome documentation.
You can try using the Singleton pattern
Is there a way to force a class to be instantiated on the stack or at least prevent it to be global in C++?
Not really. You could make constructor private and create said object only using factory method, but nothing would really prevent you from using said method to create global variable.
If global variables were initialized before application enters "main", then you could throw exception from the constructor before "main" sets some flag. However, it is up to implementation to decide when initialize global variables. So, they could be initialized after application enters "main". I.e. that would be relying on undefined behavior, which isn't a good idea.
You could, in theory, attempt to walk the call stack and see from there it is called. However, compiler could inline constructor or several functions, and this will be non-portable, and walking call stack in C++ will be painful.
You could also manually check "this" pointer and attempt to guess where it is located. However, this will be non-portable hack specific to this particular compiler, OS and architecture.
So there is no good solution I can think of.
As a result, the best idea would be to change your program behavior, as others have already suggeste - make a singleton class that initializes your C api in constructor, deinitializes it in destructor, and request this class when necessary via factory method. This will be the most elegant solution to your problem.
Alternatively, you could attempt to document program behavior.
To allocate a class on the stack, you simply say
FooClass foo; // NOTE no parenthesis because it'd be parsed
// as a function declaration. It's a famous gotcha.
To allocate in on the heap, you say
std::unique_ptr<FooClass> foo(new FooClass()); //or
FooClass* foop = new FooClass(); // less safe
Your object will only be global if you declare it at program scope.

initialization of the static variables

I've just read that if I want to be sure about the initialization order it will be better to use some function which will turn global variable into local(but still static), my question, do I need to keep some identifier which tells me that my static object has already been created(the identifier inside function which prevent me from the intialization of the static object one more time) or not? cause I can use this function with initialization in different places, thanks in advance for any help
The first question is do your static lifetime objects care about the order they are initialized?
If true the second question is why?
The initialization is only a problem if a global object uses another global object during its initialization (i.e. when the constructor is running). Note: This is horrible proactive and should be avoided (globals should not be used and if they are they should be interdependent).
If they must be linked then they should be related (in which case you could potentially make a new object that includes the two old ones so that you can control their creation more precisely). If that is not possible you just put them in the same compilation unit (read *.cpp file).
As far as the standard is concerned, initialization of a function-scope static variable only happens once:
int *gettheint(bool set_to_four) {
static int foo = 3; // only happens once, ever
if (set_to_four) {
foo = 4; // happens as many times as the function is called with true
}
return &foo;
}
So there's no need for gettheint to check whether foo has already been initialized - the value won't be overwritten with 3 on the second and subsequent calls.
Threads throw a spanner in the works, being outside the scope of the standard. You can check the documentation for your threading implementation, but chances are that the once-onliness of the initialization is not thread-safe in your implementation. That's what pthread_once is for, or equivalent. Alternatively in a multi-threaded program, you could call the function before creating any extra threads.

C++ singleton vs. global static object

A friend of mine today asked me why should he prefer use of singleton over global static object?
The way I started it to explain was that the singleton can have state vs. static global object won't...but then I wasn't sure..because this in C++.. (I was coming from C#)
What are the advantages one over the other? (in C++)
Actually, in C++ preferred way is local static object.
Printer & thePrinter() {
static Printer printer;
return printer;
}
This is technically a singleton though, this function can even be a static method of a class. So it guaranties to be constructed before used unlike with global static objects, that can be created in any order, making it possible to fail unconsistently when one global object uses another, quite a common scenario.
What makes it better than common way of doing singletons with creating new instance by calling new is that object destructor will be called at the end of a program. It won't happen with dynamically allocated singleton.
Another positive side is there's no way to access singleton before it gets created, even from other static methods or from subclasses. Saves you some debugging time.
In C++, the order of instantiation of static objects in different compilation units is undefined. Thus it's possible for one global to reference another which is not constructed, blowing up your program. The singleton pattern removes this problem by tying construction to a static member function or free function.
There's a decent summary here.
A friend of mine today asked me why should he prefer use of singleton over global static object? The way I started it to explain was that the singleton can have state vs. static global object won't...but then I wasn't sure..because this in C++.. (I was coming from C#)
A static global object can have state in C# as well:
class myclass {
// can have state
// ...
public static myclass m = new myclass(); // globally accessible static instance, which can have state
}
What are the advantages one over the other? (in C++)
A singleton cripples your code, a global static instance does not.
There are countless questions on SO about the problems with singletons already. Here's one, and another, or another.
In short, a singleton gives you two things:
a globally accessible object, and
a guarantee that only one instance can be created.
If we want just the first point, we should create a globally accessible object.
And why would we ever want the second? We don't know in advance how our code may be used in the future, so why nail it down and remove what may be useful functionality? We're usually wrong when we predict that "I'll only need one instance". And there's a big difference between "I'll only need one instance" (correct answer is then to create one instance), and "the application can't under any circumstances run correctly if more than one instance is created. It will crash, format the user's harddrive and publish sensitive data on the internet" (the answer here is then: Most likely your app is broken, but if it isn't, then yes, a singleton is what you need)
Reason 1:
Singletons are easy to make so they are lazy build.
While you can do this with globals it take extra work by the developer. So by default globals are always initialized (apart from some special rules with namespaces).
So if your object is large and/or expensive to build you may not want to build it unless you really have to use it.
Reason 2:
Order of initialization (and destruction) problem.
GlobalRes& getGlobalRes()
{
static GlobalRes instance; // Lazily initialized.
return instance;
}
GlobalResTwo& getGlobalResTwo()
{
static GlobalResTwo instance; // Lazy again.
return instance;
}
// Order of destruction problem.
// The destructor of this object uses another global object so
// the order of destruction is important.
class GlobalResTwo
{
public:
GlobalResTwo()
{
getGlobalRes();
// At this point globalRes is fully initialized.
// Because it is fully initialized before this object it will be destroyed
// after this object is destroyed (Guaranteed)
}
~GlobalResTwo()
{
// It is safe to use globalRes because we know it will not be destroyed
// before this object.
getGlobalRes().doStuff();
}
};
Another benefit of the Singleton over the global static object is that because the constructor is private, there is a very clear, compiler enforced directive saying "There can only be one".
In comparison, with the global static object, there will be nothing stopping a developer writing code that creates an additional instance of this object.
The benefit of the extra constraint is that you have a guarantee as to how the object will be used.
Using Singleton("construct on first use") idiom, you can avoid static initialization order fiasco
In C++, there's not a huge amount of difference between the two in terms of actual usefulness. A global object can of course maintain its own state (possibly with other global variables, though I don't recommend it). If you're going to use a global or a singleton (and there are many reasons not to), the biggest reason to use a singleton over a global object is that with a singleton, you can have dynamic polymorphism by having several classes inheriting from a singleton base class.
OK, there are two reasons to go with a singleton really. One is the static order thing everyone's talking about.
The other is to prevent someone from doing something like this when using your code:
CoolThing blah;
gs_coolGlobalStaticThing = blah;
or, even worse:
gs_coolGlobalStaticThing = {};
The encapsulation aspect will protect your instance from idiots and malicious jerks.