Calling operator new at global scope - c++

A colleague and I were arguing about whether this would compile at global scope:
int* g_pMyInt = new int;
My arguments revolved around the fact that calling a function (which new is)
at global scope was impossible. To my surprise, the above line compiled just fine
(MS-VC8 & Apple's LLVM 3).
So I went on and tried:
int* foo()
{
return new int;
}
int* g_pMyInt = foo(); // Still global scope.
And that compiled as well and worked like a charm (tested later with a class
whose constructor/destructor printed out a message. The ctor's message
went through, the dtor's didn't. Less surprised that time.)
While this appears very wrong to me (no orderly/right way/time to call delete),
it's not prohibited by the compiler. Why?

Why shouldn't it be allowed? All you're doing is initializing a global variable, which you are perfectly welcome to do, even if the initialization involves a function call:
int i = 5 + 6;
double j(std::sin(1.25));
const Foo k = get_my_foo_on(i, 11, true);
std::ostream & os(std::cout << "hello world\n");
int * p(new int); // fine but very last-century
std::unique_ptr<int> q(new int); // ah, welcome to the real world
int main() { /* ... */ }
Of course you'll need to worry about deleting dynamically allocated objects, whether they were allocated at global scope or not... a resource-owning wrapper class such as unique_ptr would be the ideal solution.
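As a minimal sketch of the difference (assuming C++11 or later; Tracker, g_raw and g_owned are hypothetical names, not from the question), a counting class makes it visible that both globals are constructed before main(), while only the unique_ptr one will ever be destroyed:

```cpp
#include <memory>

// Hypothetical type that counts its constructions and destructions.
struct Tracker {
    static int constructed;
    static int destroyed;
    Tracker() { ++constructed; }
    ~Tracker() { ++destroyed; }
};
int Tracker::constructed = 0;
int Tracker::destroyed = 0;

// Raw pointer: the object is created before main() and never deleted.
Tracker* g_raw = new Tracker;

// Owning pointer: the unique_ptr's own destructor runs after main()
// returns and deletes the Tracker it holds.
std::unique_ptr<Tracker> g_owned(new Tracker);
```

By the time main() starts, Tracker::constructed is 2 and Tracker::destroyed is 0. The object behind g_raw is never destroyed unless someone calls delete on it explicitly.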

C++ allows processing to happen before and after the main function, in particular for static objects with constructors and destructors (their constructors have to run before main, their destructors after it). And indeed, the execution order is not well defined across translation units.
If you are using GCC, see also its constructor function attribute (which may help to give an order).

Of course you can call functions from global scope, as part of the initialization of global objects. If you couldn't, you couldn't define global variables of types with constructors, because constructors also are functions. However, be aware that the initialization order between different translation units is not well defined, so if your function relies on a global variable from another translation unit, you will be in trouble unless you take special precautions.

Related

Reasons for putting C/C++ variables in an unnamed scope? [duplicate]

I came across some C++ code that resembled:
int main(void) {
int foo;
float qux;
/* do some stuff */
{
int bar;
bar = foo * foo;
qux = some_func(bar);
}
/* continue doing some more stuff */
}
Initially I thought that perhaps the original author was using braces to group some related variables, but since the system under design doesn't have an abundance of memory, I thought the author might have had the intention of having bar's scope resolve and any variables within go away rather than have them around for the entire enclosing (foo's) scope.
Is there any reason to do this? It seems to me this shouldn't be necessary and that any modern compiler makes this unnecessary?
"It seems to me this shouldn't be necessary and that any modern compiler makes this unnecessary?"
Yes, modern compilers will optimize the memory usage in cases like this. The extra scope won't make the code faster or more memory efficient.
However, they can't optimize objects with destructors with side-effects as that would change the behavior of the program. Thus, this makes sense to do for such objects.
"Is there any reason to do this?"
It is useful to group related code together. You know that variables declared inside the braces won't be used anywhere else which is extremely helpful to know.
If you do that multiple times in a single method, that could result in that method taking up less stack space, depending on your compiler. If you're resource limited, you might be on a microcontroller, and their compilers aren't always as full-featured as x86 compilers.
Also, if you do that with full classes (instead of ints & floats), it lets you control where the destructor is called.
class MyClass { /* ... */ }; // some class with a non-trivial destructor
int main(void) {
int foo;
float qux;
/* do some stuff */
{
MyClass bar;
qux = some_func(bar);
} // <-- ~MyClass() called here.
/* continue doing some more stuff */
}
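A common practical case of this is a scoped lock. The following is a sketch only, assuming C++11; g_mutex, g_values and append_and_process are hypothetical names used for illustration. The inner braces decide exactly where the guard's destructor releases the mutex:

```cpp
#include <mutex>
#include <vector>

std::mutex g_mutex;         // hypothetical lock protecting g_values
std::vector<int> g_values;  // hypothetical shared data

void append_and_process(int v) {
    {
        std::lock_guard<std::mutex> lock(g_mutex);
        g_values.push_back(v);
    }  // <-- ~lock_guard() runs here and unlocks g_mutex

    // Any slow work placed here runs without holding the lock.
}
```

Without the inner braces the mutex would stay locked until the function returns.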
In C/C++, one can use this to limit conflicts between names (although having a function so long that it requires scoping variables this way is a bad idea...). That is, if there are multiple variables named bar in the same function, scoping them this way makes sure they don't collide with or override each other.
Normally, a scope inside a function does not affect the stack allocation size: the stack is pre-allocated for all local variables irrespective of scope.
If the code is really like you've shown, it's probably pretty much pointless. Most compilers I've seen allocate the space for all local variables on entry to the function, and release it on exit from the function. There are a few other possibilities though.
If what you've shown as bar was an object of some class type (especially something with a destructor), the destructor would run on exit from the scope, even though the space wasn't released until later.
Another possibility is that there were really two inner scopes:
int main() {
// ...
{
// some variables
}
// ...
{
// other variables
}
}
In this case, the space for the local variables will be allocated on entry to main -- but, some variables and other variables will (typically) share the same space. I.e., the space that's allocated will be enough to accommodate the larger of the two, but will not (normally) be the sum of the two, as you'd use if you defined all the variables in main's scope.
"intention of having bar's scope resolve and any variables within go
away rather than have them around for the entire enclosing (foo's)
scope"
This could be one (unimportant & legacy) reason.
The other reason is to convey to the reader that int bar is used only within this scope, for a very small piece of functionality (a kind of function inside a function). After that, bar is not used.
Your code is equivalent to:
inline void Update (int &foo, float &qux)
{
int bar = foo * foo;
qux = some_func(bar);
}
int main ()
{
...
Update(foo, qux);
...
}
Most compilers will inline the call to Update() inside main(), which generates code similar to what you posted.
It is likely meant to aid the programmer, not optimize the output, as a modern compiler is certainly smart enough to see a temporary variable is only used once.
On the other hand, for the programmer it adds a logical separation, it means "these variables are only needed for this code," and perhaps also "this code does not affect other code in this function."

Crazy talk (paranoid about initialization)

I learned long ago that the only reliable way for a static member to be initialized for sure is to do it in a function. Now, what I'm about to do is to start returning static data by non-const reference and I need someone to stop me.
int& dataSlot()
{
static int dataMember = 0;
return dataMember;
}
To my knowledge this is the only way to ensure that the static member is initialized to zero. However, it creates obscure code like this:
dataSlot() = 7; // perfectly normal?
The other way is to put the definition in a translation unit and keep the stuff out of the header file. I have nothing against that per se but I have no idea what the standard says regarding when and under what circumstances that is safe.
The absolute last thing I wanna end up doing is accidentally accessing uninitialized data and losing control of my program.
(With the usual cautions against indiscriminate use of globals...) Just declare the variable at global scope. It is guaranteed to be zero-initialized before any code runs.
You have to be more cunning when it comes to types with non-trivial constructors, but ints will work fine as globals.
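A small sketch of that guarantee (the variable names are made up for illustration): objects with static storage duration and no explicit initializer are zero-initialized, so each of these has a well-defined value before any other code runs.

```cpp
// Namespace-scope variables without initializers are zero-initialized.
int g_counter;    // starts at 0
double g_ratio;   // starts at 0.0
int* g_pointer;   // starts as a null pointer
```

The same declarations inside a function, without static, would leave the values indeterminate.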
Returning a non-const reference in itself is fairly harmless, for example it's what vector::at() does, or vector::iterator::operator*.
If you don't like the syntax dataSlot() = 7;, you could define:
void setglobal(int i) {
dataSlot() = i;
}
int getglobal() {
return dataSlot();
}
Or you could define:
int *dataSlot() {
static int dataMember = 0;
return &dataMember;
}
*dataSlot() = 7; // better than dataSlot() = 7?
std::cout << *dataSlot(); // worse than std::cout << dataSlot()?
If you want someone to stop you, they need more information in order to propose an alternative to your use of mutable global state!
It is called a Meyers singleton, and it is almost perfectly safe.
You have to keep in mind that the object is created when the function dataSlot() is first called, but it is going to be destroyed when the program exits (at some point while global variables are being destructed), so you have to take special care. Using this function in destructors is especially dangerous and might cause random crashes.
"I learned long ago that the only reliable way for a static member to be initialized for sure is to do it in a function."
No, it isn't. The standard guarantees that:
All objects with static storage (both block and file or class-static scope) with trivial constructors are initialized before any code runs. Any code of the program at all.
All objects with file/global/class-static scope and non-trivial constructors are then initialized before the main function is called. It is guaranteed that if objects A and B are defined in the same translation unit and A is defined before B, then A is initialized before B. However, the order of construction of objects defined in different translation units is unspecified and will often differ between compilations.
Any block-static objects are initialized when their declaration is reached for the first time. Since the C++03 standard does not have any support for threads, this is NOT thread-safe there! (C++11 and later guarantee that this initialization is thread-safe.)
All objects with static storage (both block and file/global/class-static scoped) are destroyed in the reverse order of their constructors completing, after the main() function returns or the application terminates by calling exit().
Neither of the methods is usable and reliable in all cases!
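The same-translation-unit ordering guarantee can be observed with a short sketch. The names initLog, Named, g_first and g_second are hypothetical; the log itself is a block-static so that it exists whenever the first constructor needs it:

```cpp
#include <string>
#include <vector>

// Block-static log: constructed the first time initLog() is called.
std::vector<std::string>& initLog() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical type whose constructor records its name in the log.
struct Named {
    explicit Named(const char* name) { initLog().push_back(name); }
};

// Defined in this order in one translation unit,
// so g_first is initialized before g_second.
Named g_first("first");
Named g_second("second");
```

When main() starts, the log holds "first" followed by "second". If the two globals lived in different translation units, either order would be possible.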
"Now, what I'm about to do is to start returning static data by non-const reference and I need someone to stop me."
Nobody is going to stop you. It's a legal and perfectly reasonable thing to do. But make sure you don't fall into the threads trap.
E.g. any reasonable unit-test library for C++ automatically registers all test cases. It does it by having something like:
std::vector<TestCase *> &testCaseList() {
static std::vector<TestCase *> test_cases;
return test_cases;
}
TestCase::TestCase() {
...
testCaseList().push_back(this);
}
Because that's one of only two ways to do it. The other is:
class TestCase; // forward declaration so the pointer below compiles
TestCase *firstTest = NULL;
class TestCase {
...
TestCase *nextTest;
};
TestCase::TestCase() {
...
nextTest = firstTest;
firstTest = this;
}
this time using the fact that firstTest has a trivial constructor and therefore will be initialized before any of the TestCases, which have non-trivial ones.
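A self-contained sketch of this second approach might look as follows. It assumes C++11 for nullptr; the name_ member, the two test objects and countTests() are hypothetical additions for illustration, not part of any real unit-test library:

```cpp
class TestCase;  // forward declaration for the pointer below

// A plain pointer with a constant initializer: it is set before
// any TestCase constructor can run.
TestCase* firstTest = nullptr;

class TestCase {
public:
    explicit TestCase(const char* name)
        : name_(name), nextTest(firstTest) {  // link in front of the current head
        firstTest = this;
    }
    const char* name_;
    TestCase* nextTest;
};

// Two hypothetical test cases that register themselves as globals.
TestCase g_testA("A");
TestCase g_testB("B");

// Walks the intrusive list and counts the registered test cases.
int countTests() {
    int n = 0;
    for (TestCase* t = firstTest; t != nullptr; t = t->nextTest) {
        ++n;
    }
    return n;
}
```

Because each constructor pushes to the front, the list ends up in reverse registration order: firstTest points at g_testB, whose nextTest points at g_testA.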
"dataSlot() = 7; // perfectly normal?"
Yes. But if you really want, you can do either:
The old C thing of
#define dataSlot _dataSlot()
the way the errno "variable" is usually defined,
Or you can wrap it in a struct like
struct dataSlot {
static Type &getSlot() {
static Type slot;
return slot;
}
operator const Type &() const { return getSlot(); }
dataSlot &operator=(const Type &newValue) { getSlot() = newValue; return *this; }
};
(the disadvantage here is that the compiler won't look for Type's methods if you try to invoke them on dataSlot directly; that's why it needs the operator=)
You could make yourself 2 functions, dataslot() and set_dataslot() which are wrappers round the actual dataslot, a bit like this:
int &_dataslot() { static int val = 0; return val; }
int dataslot() { return _dataslot(); }
void set_dataslot(int n) { _dataslot() = n; }
You probably wouldn't want to inline that lot in a header, but I've found some C++ implementations do rather badly if you try that sort of thing anyway.

Global variable inside cpp file

I have a global variable inside an anonymous namespace.
namespace {
std::unordered_map<std::string, std::string> m;
}
A::A() { m.insert(std::make_pair("1", "2")); } // crashes
void A::insert() { m.insert(std::make_pair("1", "2")); } // ok
If I try to use the map inside the constructor I get "Access violation reading location".
But if I use it after A has been initialized it works. Is this behavior correct?
What is the scope of the A object whose constructor invocation is causing the crash?
There are no guarantees as to the order in which static initializers in different translation units are executed, so if your A object is also a global or static (as m is), it's quite possible that m does not exist yet as a validly constructed object. Your call to std::unordered_map::insert() would then be invoked on uninitialized memory, leading to your crash.
A solution is to make sure that all of your A instances that depend on m are constructed explicitly by you and not statically/globally (or as the commenter added, if they are in the same TU, to order them properly), or to change the structure of A such that you can call a function on an instance later in order to do the insert. Whether or not this is a valid solution depends more on the overarching usage of A.
You are probably creating an object of type A in a static context somewhere in your application, i.e. before your main() function is executed, and therefore before m has been initialized.
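One possible way around this, sketched here under the assumption that A may be instantiated as a global, is to move the map into a function as a block-static so that it is constructed on first use. The accessor name table() and the size() member are hypothetical:

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

namespace {
// The map is constructed the first time table() is called,
// regardless of when other globals are initialized.
std::unordered_map<std::string, std::string>& table() {
    static std::unordered_map<std::string, std::string> m;
    return m;
}
}  // namespace

class A {
public:
    A() { table().insert(std::make_pair("1", "2")); }  // safe during static init
    std::size_t size() const { return table().size(); }
};

// A global A no longer depends on the map having been initialized first.
A g_a;
```

The trade-off is that the map is destroyed during static destruction, so destructors of other globals should not rely on it still being alive.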

C++ initialization of global variables

In C++ what would be the best way to have an object that needs to be initialized in main(), but has to be global, so it can be accessed from other functions throughout the program? And how can I make sure the destructor is called so that it gets cleaned up properly?
struct foo {};
foo *x_ptr;
int main() {
foo x;
x_ptr = &x;
// the rest
}
You can also use std::reference_wrapper if you don't want to access members via operator->.
But really, don't do that. Pass it along if it's needed, instead of making it global, e.g.
void needs_foo1(foo&);
void needs_foo2(foo&, int, int, int);
int main() {
foo x;
needs_foo1(x);
needs_foo2(x, 1, 2, 3);
// et cetera
}
I suspect that "global" is a solution rather than a requirement. As it has been suggested you could always pass your object around explicitly.
If you don't want to do that I'd probably use a std::shared_ptr, possibly wrapped in a Singleton implementation. Your shared_ptr would be initialized to null at program start-up and set to a valid value in main().
Beware that you may encounter order of destruction problems if you have global variables that depend on other global variables. There's also a large body of literature about the drawbacks of the Singleton pattern.
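A rough sketch of the shared_ptr variant, assuming C++11. The value member, init_globals() and use_foo() are hypothetical names added for illustration:

```cpp
#include <memory>

struct foo {
    int value = 42;  // hypothetical member, only here to have something to read
};

// Null at program start-up; no foo exists yet.
std::shared_ptr<foo> g_foo;

// Meant to be called at the top of main().
void init_globals() {
    g_foo = std::make_shared<foo>();
}

// Any function can reach the object, but has to cope with it being unset.
int use_foo() {
    return g_foo ? g_foo->value : -1;
}
```

main() would call init_globals() first. The foo is destroyed when the last shared_ptr owning it goes away: either during static destruction of g_foo, or earlier if main() calls g_foo.reset() before returning.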

C++ static members

I have the following code:
void Foo() {
static std::vector<int>(3);
// Vector object is constructed every function call
// The destructor of the static vector is invoked at
// this point (the debugger shows so)
// <-------------------
int a;
}
Then somewhere I call Foo several times in a sequence.
Why does the vector object get constructed on every Foo() call, and why is the destructor called right after the static ... declaration?
Update:
I was trying to implement a call-once mechanism for functions and I thought that writing something like
static core::CallOnce(parameters) where CallOnce is a class name would be very nice.
To my mind writing static core::CallOnce call_once(parameters) looks worse, but okay, it seems there is nothing I can do about it.
Thank you.
Your variable needs a name:
static std::vector<int> my_static_vector(3);
You forgot to give the vector a name, so without any variable referring to it, it's destroyed immediately after it's created.
Because std::vector<int>(3) creates an unnamed temporary, which lives only to the end of its containing expression. The debugger can't show destruction in the same line as construction though, so it shows it on the next line.
Give the item a name and normal static semantics will apply.
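To illustrate the named case, here is a minimal sketch with a hypothetical Counted class and a global counter (neither is from the question). The named static local is constructed on the first call only:

```cpp
int g_constructions = 0;  // hypothetical counter, incremented by each construction

struct Counted {
    Counted() { ++g_constructions; }
};

void Foo() {
    static Counted once;  // named: constructed on the first call only
    (void)once;           // silence unused-variable warnings
}
```

Calling Foo() any number of times leaves g_constructions at 1, which is the call-once behaviour the question was after.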