I have encountered a problem in my learning of C++, where a local variable in a function is being passed to the local variable with the same name in another function, both of these functions run in main().
When this is run,
#include <iostream>
using namespace std;
void next();
void again();
int main()
{
int a = 2;
cout << a << endl;
next();
again();
return 0;
}
void next()
{
int a = 5;
cout << a << endl;
}
void again()
{
int a;
cout << a << endl;
}
it outputs:
2
5
5
I expected that again() would say null or 0 since 'a' is declared again there, and yet it seems to use the value that 'a' was assigned in next().
Why does next() pass the value of local variable 'a' to again() if 'a' is declared another time in again()?
http://en.cppreference.com/w/cpp/language/ub
You're correct, an uninitialized variable is a no-no. However, you are allowed to declare a variable and not initialize it until later. Memory is set aside to hold the integer, but what value happens to be in that memory until you do so can be anything at all. Some compilers will auto-initialize variables to junk values (to help you catch bugs), some will auto-initialize to default values, and some do nothing at all. C++ itself promises nothing, hence it's undefined behavior. In your case, with your simple program, it's easy enough to imagine how the compiler created assembly code that reused that exact same piece of memory without altering it. However, that's blind luck, and even in your simple program isn't guaranteed to happen. These types of bugs can actually be fairly insidious, so make it a rule: Be vigilant about uninitialized variables.
An uninitialized non-static local variable of *built-in type (phew! that was a mouthful) has an indeterminate value. Except for the char types, using that value yields formally Undefined Behavior, a.k.a. UB. Anything can happen, including the behavior that you see.
Apparently with your compiler and options, the stack area that was used for a in the call of next, was not used for something else until the call of again, where it was reused for the a in again, now with the same value as before.
But you cannot rely on that. With UB anything, or nothing, can happen.
* Or more generally of POD type, Plain Old Data. The standard's specification of this is somewhat complicated. In C++11 it starts with §8.5/11, “If no initializer is specified for an object, the object is default-initialized; if no initialization is performed, an object with automatic or dynamic storage duration has indeterminate value.”. Where “automatic … storage duration” includes the case of local non-static variable. And where the “no initialization” can occur in two ways via §8.5/6 that defines default initialization, namely either via a do-nothing default constructor, or via the object not being of class or array type.
This is completely coincidental and undefined behavior.
What's happened is that you have two functions called immediately after one another. Both will have more or less identical function prologs and both reserve a variable of exactly the same size on the stack.
Since there are no other variables in play and the stack is not modified between the calls, you just happen to end up with the local variable in the second function "landing" in the same place as the previous function's local variable.
Clearly, this is not good to rely upon. In fact, it's a perfect example of why you should always initialize variables!
Related
#include<iostream>
using namespace std;
void fun(int a) {
int x;
cout << x << endl;
x = a;
}
int main() {
fun(12);
fun(1);
return 0;
}
The output of this code is as follows:
178293 //garbage value
12
why are we getting 12 and not garbage value instead??
why are we getting 12 and not garbage value instead??
In theory, the value of x could be anything. However, what's happening in practice is that two calls to fun one after the other is responsible for the previous value of x to be still on the stack frame.
Let's say the stack frame is structured as below:
arguments
return value
local variables
In your case,
Memory used for arguments is equal to sizeof(int).
The compiler may omit using any memory for the return value since the return type is void.
Memory used for local variables is equal to sizeof(int).
When the function call is made the first time, the value in the arguments part is set to 12. The value in the local variables is garbage, as you noticed. However, before you return from the function, you set the value of the local variable to 12.
The second time the function is made, the the value of the argument is set to 1. The value in the local variable is still left over from the previous call. Hence, it is still 12. If you call the function a third time, you are likely to see the value 1 in the local variable.
Anyway, that's one plausible explanation. Once again, remember that this is undefined behavior. Don't count on any specific behavior. The compiler could decide to scrub the stack frame before using it. The compiler could decide to scrub the stack frame immediately after it is used. The compiler could do whatever it wants to do with the stack frame after it is used. If there is another call between the calls to fun, you will most likely get completely different values.
You have not initialized the value x when printing it. Reading from uninitialized memory is UB, i.e., there are no guarantees as to what will happen. It could output a random number, or invoke an unlikely combination of bits that will do something unexpected.
Reading an uninitialized integer is UNDEFINED BEHAVIOR, that means that it can do literally anything, can print anything. It can format your hard drive or collapse the observable universe! Basically I dont know compiler implementations that do so but theoretically they can!
I was recently reviewing some code and I came across something that I was confused about. Say I have a functions, int getNewNumber(int num, int dir), implemented like this:
int getNewNumber(int num, int dir) {
int newNum = num;
if(dir == 1) {
newNum++;
} else {
newNum--;
}
return newNum;
}
Now, when calling the function, I have something like this:
int number = getNewNumber(number, 1);
Is it initialized to 0 before being passed into newNum? I'm confused about how you can use the variable as an argument when it's being initialized.
Is it initialized to 0 before being passed into newNum?
Maybe. It depends on context. If the variable is a global static, then it is zero initialized before dynamic initialization.
If it is an automatic variable, then the value passed into getNewNumber is indeterminate and using that value has undefined behaviour. A decent compiler will warn you.
I'm confused about how you can use the variable as an argument when it's being initialized.
If the variable wasn't initialized statically, then you can't use its value in its own initialization in a way that would have defined behaviour.
If the variable was zero initialized before dynamic initialization, then you can use the value, but you might as well use literal zero, and that would be clearer to the reader of the program. I don't think there is any useful way to use the value of a variable in its own initialization.
I really think it depends on the compiler. In general I'd call it unsafe - in the best case you'll get a a value, that has the same type, or can be converted to this type. In the worst case - the program will simply crash.
I've checked myself, I wrote a program like this
int main() {
int i;
cout << i;
return 0;
}
I ran the program a few times and the result was same all the time, zero.
I've tried it in C and the result was the same.
But my textbook says
If you don’t initialize an variable that’s defined
inside a function, the variable value remain undefined.That means the element takes on
whatever value previously resided at that location in memory.
How's this possible when the program always assign a free memory location to a variable? How could it be something other than zero(I assume that default free memory value is zero)?
How's this possible when the program always assign a free memory
location to a variable? How could it be something rather than zero?
Let's take a look at an example practical implementation.
Let's say it utilizes stack to keep local variables.
void
foo(void)
{
int foo_var = 42;
}
void
bar(void)
{
int bar_var;
printf("%d\n", bar_var);
}
int
main(void)
{
bar();
foo();
bar();
}
Totally broken code above illustrates the point. After we call foo, certain location on the stack where foo_var was placed is set to 42. When we call bar, bar_var occupies that exact location. And indeed, executing the code results in printing 0 and 42, showing that bar_var value cannot be relied upon unless initialized.
Now it should be clear that local variable initialisation is required. But could main be an exception? Is there anything which could play with the stack and in result give us a non-zero value?
Yes. main is not the first function executed in your program. In fact there is tons of work required to set everything up. Any of this work could have used the stack and leave some non-zeros on it. Not only you can't expect the same value on different operating systems, it may very well suddenly change on the very system you are using right now. Interested parties can google for "dynamic linker".
Finally, the C language standard does not even have the term stack. Having a "place" for local variables is left to the compiler. It could even get random crap from whatever happened to be in a given register. It really can be totally anything. In fact, if an undefined behaviour is triggered, the compiler has the freedom to do whatever it feels like.
If you don’t initialize an variable that’s defined inside a function, the variable value remain undefined.
This bit is true.
That means the element takes on whatever value previously resided at that location in memory.
This bit is not.
Sometimes in practice this will occur, and you should realise that getting zero or not getting zero perfectly fits this theory, for any given run of your program.
In theory your compiler could actually assign a random initial value to that integer if it wanted, so trying to rationalise about this is entirely pointless. But let's continue as if we assumed that "the element takes on whatever value previously resided at that location in memory"…
How could it be something rather than zero(I assume that default free memory value is zero)?
Well, this is what happens when you assume. :)
This code invokes Undefined Behavior (UB), since the variable is used uninitialized.
The compiler should emit a warning, when a warning flag is used, like -Wall for example:
warning: 'i' is used uninitialized in this function [-Wuninitialized]
cout << i;
^
It just happens, that at this run, on your system, it had the value of 0. That means that the garbage value the variable was assigned to, happened to be 0, because the memory leftovers there suggested so.
However, notice that, kernel zeroes appear relatively often. That means that it is quite common that I can get zero as an output of my system, but it is not guaranteed and should not be taken into a promise.
Static variables and global variables are initialized to zero:
Global:
int a; //a is initialized as 0
void myfunc(){
static int x; // x is also initialized as 0
printf("%d", x);}
Where as non-static variables or auto variables i.e. the local variables are indeterminate (indeterminate usually means it can do anything. It can be zero, it can be the value that was in there, it can crash the program). Reading them prior to assigning a value results in undefined behavior.
void myfunc2(){
int x; // value of x is assigned by compiler it can even be 0
printf("%d", x);}
It mostly depends on compiler but in general most cases the value is pre assumed as 0 by the compliers
Out of curiosity, I've tried this code, resulting from an interview question[*]
int main(int argc, char *argv[])
{
int a = 1234;
printf("Outer: %d\n", a);
{
int a(a);
printf("Inner: %d\n", a);
}
}
When compiled on Linux (both g++ 4.6.3 and clang++ 3.0) it outputs:
Outer: 1234
Inner: -1217375632
However on Windows (VS2010) it prints:
Outer: 1234
Inner: 1234
The rationale would be that, until the copy-constructor of the second 'a' variable has finished, the first 'a' variable is still accessible. However I'm not sure if this is standard behaviour, or just a(nother) Microsoft quirk.
Any idea?
[*] The actual question was:
How you'd initialise a variable within a scope with the value of an identically named variable in the containing scope without using a temporary or global variable?
{
// Not at global scope here
int a = 1234;
{
int a;
// how do you set this a to the value of the containing scope a ?
}
}
How you'd initialise a variable within a scope with the value of an identically named variable in the containing scope without using a temporary or global variable?
Unless the outer scope can be explicitly named you cannot do this. You can explicitly name the global scope, namespace scopes, and class scopes, but not function or block statement scopes.
C++11 [basic.scope.pdecl 3.3.2 p1 states:
The point of declaration for a name is immediately after its complete declarator (Clause 8) and before its initializer (if any), except as noted below. [ Example:
int x = 12;
{ int x = x; }
Here the second x is initialized with its own (indeterminate) value. —end example ]
MSVC correctly implements this example, however it does not correctly implement this when the initializer uses parentheses instead of assignment syntax. There's a bug filed about this on microsoft connect.
Here's an example program with incorrect behavior in VS as a result of this bug.
#include <iostream>
int foo(char) { return 0; }
int foo(int) { return 1; }
int main()
{
char x = 'a';
{
int x = foo(static_cast<decltype(x)>(0));
std::cout << "'=' initialization has correct behavior? " << (x?"Yes":"No") << ".\n";
}
{
int x(foo(static_cast<decltype(x)>(0)));
std::cout << "'()' initialization has correct behavior? " << (x?"Yes":"No") << ".\n";
}
}
C++ includes the following note.
[ Note: Operations involving indeterminate values may cause undefined behavior. —end note ]
However, this note indicates that operations may cause undefined behavior, not that they necessarily do. The above linked bug report includes an acknowledgement from Microsoft that this is a bug and not that the program triggers undefined behavior.
Edit: And now I've changed the example so that the object with indeterminate value is only 'used' in an unevaluated context, and I believe that this absolutely rules out the possibility of undefined behavior on any platform, while still demonstrating the bug in Visual Studio.
How you'd initialise a variable within a scope with the value of an identically named variable in the containing scope without using a temporary or global variable?
If you want to get technical about the wording, it's pretty easy. A "temporary" has a specific meaning in C++ (see §12.2); any named variable you create is not a temporary. As such, you can just create a local variable (which is not a temporary) initialized with the correct value:
int a = 1234;
{
int b = a;
int a = b;
}
An even more defensible possibility would be to use a reference to the variable in the outer scope:
int a = 1234;
{
int &ref_a = a;
int a = ref_a;
}
This doesn't create an extra variable at all -- it just creates an alias to the variable at the outer scope. Since the alias has a different name, we retain access to the variable at the outer scope, without defining a variable (temporary or otherwise) to do so. Many references are implemented as pointers internally, but in this case (at least with a modern compiler and optimization turned on) I'd expect it not to be -- that the alias really would just be treated as a different name referring to the variable at the outer scope (and a quick test with VC++ shows that it works this way -- the generated assembly language doesn't use ref_a at all).
Another possibility along the same lines would be like this:
const int a = 10;
{
enum { a_val = a };
int a = a_val;
}
This is somewhat similar to the reference, except that in this case there's not even room for argument about whether a_val could be called a variable -- it absolutely is not a variable. The problem is that an enumeration can only be initialized with a constant expression, so we have to define the outer variable as const for it to work.
I doubt any of these is what the interviewer really intended, but all of them answer the question as stated. The first is (admittedly) a pure technicality about definitions of terms. The second might still be open to some argument (many people think of references as variables). Though it restricts the scope, there's no room for question or argument about the third.
What you are doing, initializing a variable with itself, is undefined behavior. All your test cases got it right, this is not a quirk. An implementation could also initialize a to 123456789 and it would still be standard.
Update: The comments on this answer point that initializing a variable with itself is not undefined behavior, but trying to read such variable is.
How you'd initialise a variable within a scope with the value of an identically named variable in the containing scope without using a temporary or global variable?
You can't. As soon as the identical name is declared, the outer name is inaccessible for the rest of the scope. You'd need a copy or an alias of the outer variable, which means you'd need a temporary variable.
I'm surprised that, even with the warning level cranked up, VC++ doesn't complain on this line:
int a(a);
Visual C++ will sometimes warn you about hiding a variable (maybe that's only for members of derived classes). It's also usually pretty good about telling you you're using a value before it has been initialized, which is the case here.
Looking at the code generated, it happens to initialize the inner a to the same value of the outer a because that's what's left behind in a register.
I had a look at the standard, it's actually a grey area but here's my 2 cents...
3.1 Declarations and definitions [basic.def]
A declaration introduces names into a translation unit or redeclares names introduced by previous declarations.
A declaration is a definition unless... [non relevant cases follow]
3.3.1 Point of declaration
The point of declaration for a name is immediately after its complete declarator and before its initializer (if any), except as noted below [self-assignment example].
A nonlocal name remains visible up to the point of declaration of the local name that hides it.
Now, if we assume that this is the point of declaration of the inner 'a' (3.3.1/1)
int a (a);
^
then the outer 'a' should be visible up to that point (3.3.1/2), where the inner 'a' is defined.
Problem is that in this case, according to 3.1/2, a declaration IS a definition. This means the inner 'a' should be created. Until then, I can't understand from the standard whether the outer 'a' is still visible or not. VS2010 assumes that it is, and all that falls within the parentheses refers to the outer scope. However clang++ and g++ treat that line as a case of self-assignment, which results in undefined behaviour.
I'm not sure which approach is correct, but I find VS2010 to be more consistent: the outer scope is still visible until the inner 'a' is fully created.
I just realised that this program compiles and runs (gcc version 4.4.5 / Ubuntu):
#include <iostream>
using namespace std;
class Test
{
public:
// copyconstructor
Test(const Test& other);
};
Test::Test(const Test& other)
{
if (this == &other)
cout << "copying myself" << endl;
else
cout << "copying something else" << endl;
}
int main(int argv, char** argc)
{
Test a(a); // compiles, runs and prints "copying myself"
Test *b = new Test(*b); // compiles, runs and prints "copying something else"
}
I wonder why on earth this even compiles. I assume that (just as in Java) arguments are evaluated before the method / constructor is called, so I suspect that this case must be covered by some "special case" in the language specification?
Questions:
Could someone explain this (preferably by referring to the specification)?
What is the rationale for allowing this?
Is it standard C++ or is it gcc-specific?
EDIT 1: I just realised that I can even write int i = i;
EDIT 2: Even with -Wall and -pedantic the compiler doesn't complain about Test a(a);.
EDIT 3: If I add a method
Test method(Test& t)
{
cout << "in some" << endl;
return t;
}
I can even do Test a(method(a)); without any warnings.
The reason this "is allowed" is because the rules say an identifiers scope starts immediately after the identifier. In the case
int i = i;
the RHS i is "after" the LHS i so i is in scope. This is not always bad:
void *p = (void*)&p; // p contains its own address
because a variable can be addressed without its value being used. In the case of the OP's copy constructor no error can be given easily, since binding a reference to a variable does not require the variable to be initialised: it is equivalent to taking the address of a variable. A legitimate constructor could be:
struct List { List *next; List(List &n) { next = &n; } };
where you see the argument is merely addressed, its value isn't used. In this case a self-reference could actually make sense: the tail of a list is given by a self-reference. Indeed, if you change the type of "next" to a reference, there's little choice since you can't easily use NULL as you might for a pointer.
As usual, the question is backwards. The question is not why an initialisation of a variable can refer to itself, the question is why it can't refer forward. [In Felix, this is possible]. In particular, for types as opposed to variables, the lack of ability to forward reference is extremely broken, since it prevents recursive types being defined other than by using incomplete types, which is enough in C, but not in C++ due to the existence of templates.
I have no idea how this relates to the specification, but this is how I see it:
When you do Test a(a); it allocates space for a on the stack. Therefore the location of a in memory is known to the compiler at the start of main. When the constructor is called (the memory is of course allocated before that), the correct this pointer is passed to it because it's known.
When you do Test *b = new Test(*b);, you need to think of it as two steps. First the object is allocated and constructed, and then the pointer to it is assigned to b. The reason you get the message you get is that you're essentially passing in an uninitialized pointer to the constructor, and the comparing it with the actual this pointer of the object (which will eventually get assigned to b, but not before the constructor exits).
The second one where you use new is actually easier to understand; what you're invoking there is exactly the same as:
Test *b;
b = new Test(*b);
and you're actually performing an invalid dereference. Try to add a << &other << to your cout lines in the constructor, and make that
Test *b = (Test *)0xFOOD1E44BADD1E5;
to see that you're passing through whatever value a pointer on the stack has been given. If not explicitly initialized, that's undefined. But even if you don't initialize it with some sort of (in)sane default, it'll be different from the return value of new, as you found out.
For the first, think of it as an in-place new. Test a is a local variable not a pointer, it lives on the stack and therefore its memory location is always well defined - this is very much unlike a pointer, Test *b which, unless explicitly initialized to some valid location, will be dangling.
If you write your first instantiation like:
Test a(*(&a));
it becomes clearer what you're invoking there.
I don't know a way to make the compiler disallow (or even warn) about this sort of self-initialization-from-nowhere through the copy constructor.
The first case is (perhaps) covered by 3.8/6:
before the lifetime of an object has
started but after the storage which
the object will occupy has been
allocated or, after the lifetime of an
object has ended and before the
storage which the object occupied is
reused or released, any lvalue which
refers to the original object may be
used but only in limited ways. Such an
lvalue refers to allocated storage
(3.7.3.2), and using the properties of
the lvalue which do not depend on its
value is well-defined.
Since all you're using of a (and other, which is bound to a) before the start of its lifetime is the address, I think you're good: read the rest of that paragraph for the detailed rules.
Beware though that 8.3.2/4 says, "A reference shall be initialized to refer to a valid object or function." There is some question (as a defect report on the standard) what "valid" means in this context, so possibly you can't bind the parameter other to the unconstructed (and hence, "invalid"?) a.
So, I'm uncertain what the standard actually says here - I can use an lvalue, but not bind it to a reference, perhaps, in which case a isn't good, while passing a pointer to a would be OK as long as it's only used in the ways permitted by 3.8/5.
In the case of b, you're using the value before it's initialized (because you dereference it, and also because even if you got that far, &other would be the value of b). This clearly is not good.
As ever in C++, it compiles because it's not a breach of language constraints, and the standard doesn't explicitly require a diagnostic. Imagine the contortions the spec would have to go through in order to mandate a diagnostic when an object is invalidly used in its own initialization, and imagine the data flow analysis that a compiler might have to do to identify complex cases (it may not even be possible at compile time, if the pointer is smuggled through an externally-defined function). Easier to leave it as undefined behavior, unless anyone has any really good suggestions for new spec language ;-)
If you crank your warning levels up, your compiler will probably warn you about using uninitialized stuff. UB doesn't require a diagnostic, many things that are "obviously" wrong may compile.
I don't know the spec reference, but I do know that accessing an uninitialized pointer always results in undefined behaviour.
When I compile your code in Visual C++ I get:
test.cpp(20): warning C4700:
uninitialized local variable 'b' used