Why does the following code print "xxY"? Shouldn't local variables live for the whole scope of the function? Can I rely on this behavior, or will it change in a future C++ standard?
I thought that according to C++ Standard 3.3.2 "A name declared in a block is local to that block. Its potential scope begins at its point of declaration and ends at the end of its declarative region."
#include <iostream>
using namespace std;
class MyClass
{
public:
MyClass( int ) { cout << "x" << endl; };
~MyClass() { cout << "x" << endl; };
};
int main(int argc,char* argv[])
{
MyClass (12345);
// changing it to the following will change the behavior
//MyClass m(12345);
cout << "Y" << endl;
return 0;
}
Based on the responses, I assume that MyClass(12345); is an expression (and a scope of its own). That makes sense. So I expect that the following code will always print "xYx":
MyClass (12345), cout << "Y" << endl;
And I assume the following replacement is allowed:
// this many lines with an explicit scope
{
boost::scoped_lock lock(my_mutex);
int x = some_func(); // should be protected in multi-threaded program
}
// mutex released here
//
// I can replace it with the following single line:
int x = boost::scoped_lock (my_mutex), some_func(); // still multi-thread safe
// mutex released here
The object created in your
MyClass(12345);
is a temporary object which is only alive in that expression;
MyClass m(12345);
is an object which is alive for the entire block.
You're actually creating an object without keeping it in scope, so it is destroyed right after it is created. Hence the behavior you're experiencing.
You can't access the created object so why would the compiler keep it around?
To answer your other questions. The following is the invocation of the comma operator. It creates a MyClass temporary, which includes calling its constructor. It then evaluates the second expression cout << "Y" << endl which will print out the Y. It then, at the end of the full expression, will destroy the temporary, which will call its destructor. So your expectations were right.
MyClass (12345), cout << "Y" << endl;
For the following to work, you should add parentheses, because the comma has a predefined meaning in declarations. It would start declaring a function some_func returning an int and taking no parameters and would assign the scoped_lock object to x. Using parentheses, you say that the whole thing is a single comma operator expression instead.
int x = (boost::scoped_lock (my_mutex), some_func()); // still multi-thread safe
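The ordering described above can be checked with a small sketch. FakeLock and the string log are hypothetical stand-ins for boost::scoped_lock and a real mutex, and some_func here takes the log as a parameter only so the call order can be recorded; none of this is from the original question.

```cpp
#include <string>

// Hypothetical stand-in for a scoped lock: records lock/unlock in a log.
struct FakeLock {
    std::string& log;
    explicit FakeLock(std::string& l) : log(l) { log += "lock;"; }
    ~FakeLock() { log += "unlock;"; }
};

// Hypothetical protected function: records that it was called.
int some_func(std::string& log) {
    log += "call;";
    return 7;
}

// Builds the log for a parenthesized comma expression used as an initializer.
std::string comma_demo() {
    std::string log;
    // The temporary FakeLock lives until the end of the full expression,
    // so some_func runs while the "lock" is still held.
    int x = (FakeLock(log), some_func(log));
    (void)x;
    log += "after;";
    return log;
}
```

Under these assumptions, comma_demo() records the lock, then the call, then the unlock, and only then the "after" marker.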
It should be noted that the following two lines are equivalent. The first does not create a temporary unnamed object using my_mutex as the constructor argument, but instead the parentheses around the name are redundant. Don't let the syntax confuse you.
boost::scoped_lock(my_mutex);
boost::scoped_lock my_mutex;
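A minimal sketch of that pitfall, using a hypothetical Probe type (not part of the original code) that counts which constructor ran. Since no variable named p exists beforehand, the line Probe(p); can only compile if it is a declaration.

```cpp
// Hypothetical tracer: counts how many objects each constructor built.
struct Probe {
    static int default_built;
    static int copy_built;
    Probe() { ++default_built; }
    Probe(const Probe&) { ++copy_built; }
};
int Probe::default_built = 0;
int Probe::copy_built = 0;

// Returns true if `Probe(p);` behaved as a declaration of a new variable p.
bool parenthesized_name_is_a_declaration() {
    Probe::default_built = 0;
    Probe::copy_built = 0;
    {
        Probe(p);  // same as `Probe p;`: declares p and default-constructs it
        (void)p;   // p is an ordinary named local variable here
    }
    return Probe::default_built == 1 && Probe::copy_built == 0;
}
```

Some compilers warn about the redundant parentheses in such a declaration, which is a useful hint when a lock appears not to lock anything.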
I've seen misuse of the terms scope and lifetime.
Scope is where you can refer to a name without qualifying its name. Names have scopes, and objects inherit the scope of the name used to define them (thus sometimes the Standard says "local object"). A temporary object has no scope, because it's got no name. Likewise, an object created by new has no scope. Scope is a compile time property. This term is frequently misused in the Standard, see this defect report, so it's quite confusing to find a real meaning.
Lifetime is a runtime property. It means when the object is set up and ready for use. For a class type object, the lifetime begins when the constructor ends execution, and it ends when the destructor begins execution. Lifetime is often confused with scope, although these two things are completely different.
The lifetime of temporaries is precisely defined. Most of them end lifetime after evaluation of the full expression they are contained in (like, the comma operator of above, or an assignment expression). Temporaries can be bound to const references which will lengthen their lifetime. Objects being thrown in exceptions are temporaries too, and their lifetime ends when there is no handler for them anymore.
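To make the two cases concrete, here is a small sketch. Traced is a hypothetical helper type that appends to a string so the order of construction and destruction becomes visible; it is an illustration, not code from the question.

```cpp
#include <string>

// Hypothetical tracer that appends to a shared log.
struct Traced {
    std::string& log;
    explicit Traced(std::string& l) : log(l) { log += "ctor;"; }
    ~Traced() { log += "dtor;"; }
};

// Plain temporary: destroyed at the end of the full expression.
std::string plain_temporary() {
    std::string log;
    Traced{log};      // created and destroyed on this line
    log += "next;";
    return log;
}

// Temporary bound to a const reference: lives as long as the reference.
std::string bound_temporary() {
    std::string log;
    {
        const Traced& ref = Traced{log};  // lifetime extended to the end of this block
        (void)ref;
        log += "next;";
    }
    return log;
}
```

In the first function the destructor runs before "next" is logged; in the second it runs after, when the reference goes out of scope.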
You quoted standard correctly. Let me emphasize:
A name declared in a block is local to that block. Its potential scope begins at its point of declaration and ends at the end of its declarative region.
You didn't declare any name, actually. Your line
MyClass (12345);
does not even contain a declaration! What it contains is an expression that creates an instance of MyClass, computes the expression (however, in this particular case there's nothing to compute), and casts its result to void, and destroys the objects created there.
A less confusing thing would sound like
call_a_function(MyClass(12345));
You saw it many times and know how it works, don't you?
If a variable is declared as static in a function's scope it is only initialized once and retains its value between function calls. What exactly is its lifetime? When do its constructor and destructor get called?
void foo()
{
static string plonk = "When will I die?";
}
The lifetime of function static variables begins the first time[0] the program flow encounters the declaration and it ends at program termination. This means that the run-time must perform some bookkeeping in order to destruct it only if it was actually constructed.
Additionally, since the standard says that the destructors of static objects must run in the reverse order of the completion of their construction[1], and the order of construction may depend on the specific program run, the order of construction must be taken into account.
Example
struct emitter {
string str;
emitter(const string& s) : str(s) { cout << "Created " << str << endl; }
~emitter() { cout << "Destroyed " << str << endl; }
};
void foo(bool skip_first)
{
if (!skip_first)
static emitter a("in if");
static emitter b("in foo");
}
int main(int argc, char*[])
{
foo(argc != 2);
if (argc == 3)
foo(false);
}
Output:
C:>sample.exe
Created in foo
Destroyed in foo
C:>sample.exe 1
Created in if
Created in foo
Destroyed in foo
Destroyed in if
C:>sample.exe 1 2
Created in foo
Created in if
Destroyed in if
Destroyed in foo
[0] Since C++98[2] makes no reference to multiple threads, how this behaves in a multi-threaded environment is unspecified, and can be problematic as Roddy mentions.
[1] C++98 section 3.6.3.1 [basic.start.term]
[2] In C++11 statics are initialized in a thread safe way, this is also known as Magic Statics.
Motti is right about the order, but there are some other things to consider:
Compilers typically use a hidden flag variable to indicate if the local statics have already been initialized, and this flag is checked on every entry to the function. Obviously this is a small performance hit, but what's more of a concern is that this flag is not guaranteed to be thread-safe.
If you have a local static as above, and foo is called from multiple threads, you may have race conditions causing plonk to be initialized incorrectly or even multiple times. Also, in this case plonk may get destructed by a different thread than the one which constructed it.
Despite what the standard says, I'd be very wary of the actual order of local static destruction, because it's possible that you may unwittingly rely on a static being still valid after it's been destructed, and this is really difficult to track down.
The existing explanations aren't really complete without the actual rule from the Standard, found in 6.7:
The zero-initialization of all block-scope variables with static storage duration or thread storage duration is performed before any other initialization takes place. Constant initialization of a block-scope entity with static storage duration, if applicable, is performed before its block is first entered. An implementation is permitted to perform early initialization of other block-scope variables with static or thread storage duration under the same conditions that an implementation is permitted to statically initialize a variable with static or thread storage duration in namespace scope. Otherwise such a variable is initialized the first time control passes through its declaration; such a variable is considered initialized upon the completion of its initialization. If the initialization exits by throwing an exception, the initialization is not complete, so it will be tried again the next time control enters the declaration. If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization. If control re-enters the declaration recursively while the variable is being initialized, the behavior is undefined.
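The "tried again" part of the quoted rule can be sketched as follows. make_value, get_value and count_initializer_runs are hypothetical names introduced only for this illustration.

```cpp
#include <stdexcept>

// Hypothetical helper: fails on its first call, succeeds afterwards.
int make_value(int& attempts) {
    if (++attempts == 1)
        throw std::runtime_error("first attempt fails");
    return 42;
}

// The static's initializer is re-run until it completes once without throwing.
int get_value(int& attempts) {
    static int value = make_value(attempts);
    return value;
}

// Calls get_value three times and reports how often the initializer ran.
int count_initializer_runs() {
    int attempts = 0;
    for (int i = 0; i < 3; ++i) {
        try {
            get_value(attempts);
        } catch (const std::runtime_error&) {
            // Initialization did not complete; it is retried on the next call.
        }
    }
    return attempts;
}
```

On the first use in a program, the initializer runs twice (one failure, one success) and is skipped on the third call, because the variable counts as initialized only once its initialization has completed.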
FWIW, Codegear C++Builder doesn't destruct in the expected order according to the standard.
C:\> sample.exe 1 2
Created in foo
Created in if
Destroyed in foo
Destroyed in if
... which is another reason not to rely on the destruction order!
Storage for static variables exists from the start of program execution until the program ends.
On typical implementations, static variables are placed in the data segment of the program's memory.
int main()
{
int x;
int x;
return 0;
}
This snippet will give an error:
error: redeclaration of 'int x'
But this one, works just fine:
int main()
{
while(true)
{
int x;
{...}
}
return 0;
}
Why, in the second example, does declaring x in the loop not redeclare it on every iteration? I was expecting the same error as in the first case.
You're smashing together two related but different concepts and thus your confusion. But it's not your fault, as most of the didactic material on the matter doesn't necessarily make the distinction between the two concepts clear.
Variable scope: This is the region of the source code where a symbol (a variable) is visible.
Object lifetime: this is the time during the runtime of the program that an object exists.
This brings us to two other concepts we need to understand and differentiate between:
A variable is a compile-time concept: it is a name (a symbol) that refers to objects
An object is an "entity" at runtime, an instance of a type.
Let's go back to your examples:
int main()
{
int x{};
int x{};
}
Here you try to declare 2 different variables inside the same scope. Those two variables would have the same name inside the function scope, so when you would "say" the name x (when you would write the symbol x) you wouldn't know to which variable you would refer. So it is not allowed.
int main()
{
while(true)
{
int x{};
}
}
Here you declare one variable inside the while body scope. When you write x inside this scope you refer to this variable. No ambiguity. No problems. Valid code. Note that this discussion about declarations and variable scope applies at compile time, i.e. we are discussing the meaning of the code that you write.
When we discuss object lifetime, however, we are talking about runtime, i.e. the moment when your compiled binary runs. Yes, at runtime, multiple objects will be created and destroyed in succession. All of these objects are referred to by the symbol x inside the while body scope. But the lifetimes of these objects don't overlap. I.e. when you run your program the first object is created. In the source code it is named x inside the while body scope. Then the object is destroyed, the loop is re-entered and a new object is created. It is also named x in the source code inside the while body scope. Then it is destroyed, the while is re-entered, a new object is created and so on.
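The succession of non-overlapping lifetimes can be observed with a counting type. Counted, LoopStats and run_loop are hypothetical names used only for this sketch, and a bounded for loop replaces while(true) so that the function returns.

```cpp
// Hypothetical counter type: tracks live and total instances.
struct Counted {
    static int alive;
    static int created;
    static int max_alive;
    Counted() {
        ++created;
        ++alive;
        if (alive > max_alive) max_alive = alive;
    }
    ~Counted() { --alive; }
};
int Counted::alive = 0;
int Counted::created = 0;
int Counted::max_alive = 0;

struct LoopStats {
    int created;      // objects constructed in total
    int max_alive;    // largest number alive at the same time
    int alive_after;  // objects still alive once the loop is done
};

LoopStats run_loop(int iterations) {
    Counted::alive = Counted::created = Counted::max_alive = 0;
    for (int i = 0; i < iterations; ++i) {
        Counted x;  // one declaration, but a fresh object on every iteration
        (void)x;
    }               // each object dies here before the next one is created
    return {Counted::created, Counted::max_alive, Counted::alive};
}
```

For three iterations this reports three constructions, at most one object alive at any time, and none left afterwards: one variable, three objects.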
To give you an expanded view on the matter, consider you can have:
A variable which never refers to an object
{ // not global scope
int a; // <-- not initialized
}
The variable a is not initialized, so an object will never be created at runtime.
An object without a name:
int get_int();
{
int sum = get_int() + get_int();
}
There are two objects returned by the two calls to the function get_int(). Those objects are temporaries. They are never named.
Multiple objects instantiated inside the scope of a variable.
This is an advanced, contrived example, at the fringe of C++. Just showing that it is technically possible:
{
using T = int; // alias for the explicit destructor call: `x.~int()` is not valid syntax
int x;
// no object
new (&x) int{11}; // <-- 1st object created. It is named `x`. Start of its lifetime
// 1st object is alive. Named x
x.~T(); // <-- 1st object destructed. End of its lifetime
// no object
new (&x) int{24}; // <-- 2nd object created. Also named `x`
// 2nd object alive. Named x
} // <-- implicit end of the lifetime of 2nd object.
The scope of x is the whole block delimited by the curly brackets. However, there are two objects with different, non-overlapping lifetimes inside this scope.
Declarations don't happen at runtime, they happen at compile-time.
In your code int x; is declared once, because it appears in the code once. It doesn't matter if it's in a loop or not.
If the loop runs more than once, x will be created and then destroyed more than once. It's allowed, of course.
In C++, the curly braces represent the beginning {, and end }, of a scope. If you have a scope nested inside another scope, for example a while loop inside a function, then the previously declared variables from the outer scope are available inside the new loop scope.
You are not allowed to declare a variable with the same name inside the same scope twice. That's why the compiler creates the first error
error: redeclaration of 'int x'
But in the case of the loop, the variable is only declared once. It doesn't matter that the loop will reuse that declaration multiple times. Just like a function being called multiple times doesn't create a redeclaration error for the variables it declares.
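As a small illustration of the "same scope" wording: the same name may be declared again in a nested scope, where it hides the outer variable instead of redeclaring it. shadow_demo is a hypothetical function written for this sketch.

```cpp
// Returns a two-digit number: tens digit is the inner x, ones digit the outer x.
int shadow_demo() {
    int x = 1;            // outer variable
    int seen_inside = 0;
    {
        int x = 2;        // nested scope: a new variable that hides the outer x
        seen_inside = x;  // refers to the inner x
    }
    // The inner x is gone here; x refers to the outer variable again.
    return seen_inside * 10 + x;
}
```

Declaring int x twice directly in the body of shadow_demo would still be rejected, because both declarations would then share one scope.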
Variables in loops stay in loops, and are not redeclared. This is because, to the best of my knowledge, loops are just sets of instructions with jump points, and not actually the same code written over and over again in the .exe file.
If you try to make a for loop:
for(int x = 0; x < 10000; ++x);
The loop just reuses the same variable, then removes the variable after use. This is helpful so that loops, including do{}while(condition) loops, can actually hold values instead of having to redeclare and reset each variable.
Back to the original question, I am going to ask my own: Why are you trying to redeclare a variable? You could just do this:
int main(void){
int variable = 0;
...
variable = 2;
}
Instead of this:
int main(void){
int variable = 0;
...
int variable = 2;
}
Curly brackets in C/C++ represent blocks of code. These blocks of code do not transfer information to other blocks. I recommend looking further into resources on blocks of code, but one resource has been linked here.
Unlike code that is written in "interpreted languages", variables require declaration. Moreover, your code is read sequentially in compiled languages.
Block example:
while (true) {
int i = 0;
}
This declaration is stored with the block somewhere in memory.
The storage is assigned to an "int" variable type. This data member has a certain memory capacity. Redeclaring, in essence, tries to override information stored in that particular block. These blocks are set aside at compilation time.
Look at my code:
#include <iostream>
using namespace std;
class MyClass{
public:
char ch[50] = "abcd1234";
};
MyClass myFunction(){
MyClass myClass;
return myClass;
}
int main()
{
cout<<myFunction().ch;
return 0;
}
I can't understand where my return value is stored. Is it stored on the stack? On the heap? And does it remain in memory until my program finishes?
If it is stored on the stack, can I be sure that my class values never change?
Please explain the mechanism of this return, and whether returning a structure is different from returning a class.
MyClass myClass; is stored on the stack. It's destroyed immediately after myFunction() exits.
When you return it, a copy is made on the stack. This copy exists until the end of the enclosing expression: cout << myFunction().ch;
Note that if your compiler is smart enough, the second object shouldn't be created at all. Rather, the first object will live until the end of the enclosing expression. This is called NRVO, named return value optimization.
Also note that the standard doesn't define "stack". But any common implementation will use a stack in this case.
if returning structure is different to returning class?
There are no structures in C++; keyword struct creates classes. The only difference between class and struct is the default member access, so the answer is "no".
It's up to the implementation to find a sensible place to store that value. While it's usually on the stack, the language definition does not impose any requirements on where it's actually stored. The returned value is a temporary object, and it gets destroyed at the end of the full statement where it is created; that is, it gets destroyed at the ; at the end of the line that calls myFunction().
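The destruction point can be made visible with a tracing type. Noisy, make and trace are hypothetical names for this sketch; how many copies appear in the log depends on whether the compiler applies NRVO, so only the relative order is checked.

```cpp
#include <string>

// Hypothetical tracer: logs construction, copying and destruction.
struct Noisy {
    static std::string log;
    char ch[8] = "abcd";
    Noisy() { log += "ctor;"; }
    Noisy(const Noisy&) { log += "copy;"; }
    ~Noisy() { log += "dtor;"; }
};
std::string Noisy::log;

// Mirrors myFunction(): returns a named local by value.
Noisy make() {
    Noisy n;
    return n;  // the copy may be elided (NRVO); that is up to the compiler
}

// Reads a member of the returned temporary, then logs a marker.
std::string trace(std::string& text) {
    Noisy::log.clear();
    text = make().ch;        // the temporary is alive while ch is read
    Noisy::log += "after;";  // by now the temporary has been destroyed
    return Noisy::log;
}
```

Whatever the compiler does about copies, the last destructor call is logged before the "after" marker, i.e. at the end of the full expression that called make().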
When you create an object in a function, it is destroyed as soon as the function finishes executing, just like any other local variable.
But when you return an object from a function, the compiler first creates an unnamed temporary copy of it (typically on the stack, not on the heap) and then destroys the object you created. The contents of that unnamed temporary are used at the call site, and then the temporary is destroyed as well.
Anything you create as a local variable without the keyword new will typically be created on the stack.
Yes, the contents of your variable ch will not change unless you access that variable and change it yourself.
The instance returned by myFunction is a temporary; it disappears when it stops being useful, so it no longer exists after the cout << ... statement. Just add a destructor and you will see when it is called.
What do you mean by "can I be sure that my class values never change?" You get a copy of the instance.
Is returning a structure different from returning a class? A struct is like a class where everything is public by default; this is the only difference.
Your function is returning a copy of an object. It will be stored in the stack in memory.
The object being returned exists until the end of that function's scope. After that, it is destroyed. Your expression cout<<function(); then holds the copy of that object returned by the function. It is completely destroyed once the expression cout<<function(); has finished running.
I want to use unique_ptr in a method. I want to rely on the fact that it is destroyed at the closing brace of the method (if that is indeed true).
The reason why I want to rely on this fact is that I want to write a simple class for logging that says when it has entered/exited a method, something like:
class MethodTracing
{
string _signature;
public:
MethodTracing(string signature)
{
_signature=signature;
BOOST_LOG_TRIVIAL(trace) << "ENTERED " << _signature ;
}
~MethodTracing()
{
BOOST_LOG_TRIVIAL(trace) << "EXITED " << _signature;
}
};
I can then use it like so:
void myMethod( )
{
auto _ = unique_ptr<MethodTracing>(new MethodTracing(__FUNCSIG__) ) ;
/// ...
}
Is it true (and consistent) that a unique_ptr, when created in a method, is destroyed at the end of the method (assuming it's not passed around)?
Are there any other hidden (or otherwise!) pitfalls that I should be aware of?
Update:
As most of the answers suggested, I could have used local variable scoping. I tried this with MethodTracing(__FUNCSIG__);, but of course, I didn't assign it to a local variable, so it was destroyed immediately. I thought the runtime was being clever, but no, it was me being stupid (too long in C#!)
You don't need to do that - rely on automatic storage, e.g.
void myMethod( )
{
MethodTracing __sig(__FUNCSIG__);
// do stuff
}
__sig will be destroyed at the end of the function scope automatically
(yes __sig is bad form, call it something else if you want)
Yes, unique_ptr is destroyed at the end of the scope it's created in. However you don't need unique_ptr to get this functionality, because all C++ classes have this. You might as well just create your MethodTracing object directly:
void myMethod( )
{
MethodTracing _(__FUNCSIG__);
/// ...
}
You can usually rely on it. The exception would be code that explicitly calls terminate(). In general, destructors of objects are called when they go out of scope (for local variables, that is the end of the method). This is the foundation of RAII.
Yes that is true. But it is not necessarily when control flows through the closing brace. It can be because of a return, an exception or a goto out of its block.
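A sketch of those exit paths, assuming a simplified ScopeTrace class that writes to a string instead of Boost.Log. ScopeTrace, work and run are hypothetical names for this illustration, not the asker's MethodTracing.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for MethodTracing that logs to a string.
class ScopeTrace {
    std::string& log_;
    std::string name_;
public:
    ScopeTrace(std::string& log, std::string name)
        : log_(log), name_(std::move(name)) {
        log_ += "ENTERED " + name_ + ";";
    }
    ~ScopeTrace() { log_ += "EXITED " + name_ + ";"; }
    ScopeTrace(const ScopeTrace&) = delete;
    ScopeTrace& operator=(const ScopeTrace&) = delete;
};

// mode 0: normal flow, mode 1: early return, mode 2: exception.
void work(std::string& log, int mode) {
    ScopeTrace trace(log, "work");
    if (mode == 1) return;
    if (mode == 2) throw std::runtime_error("failed");
    log += "body;";
}

// Runs work() and returns everything that was logged.
std::string run(int mode) {
    std::string log;
    try {
        work(log, mode);
    } catch (const std::runtime_error&) {
        log += "caught;";  // runs after stack unwinding destroyed the tracer
    }
    return log;
}
```

In all three modes the EXITED entry is logged, and in the exception case it appears before the handler's own entry, since unwinding destroys the local before the catch block runs.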
However, care should be taken when calling exit() to terminate the program. Local automatic objects like your unique_ptr will not be destroyed then.
You ought to ensure that your ~MethodTracing destructor is no-throw if it is more complex than what you describe above, otherwise your class might only be partially destroyed.
Personally, I'd just declare it on the stack as mentioned above.