I've been coding for a while in other languages and am pretty proficient, but now I am diving more deeply into C++ and have come across some weird problems that I never had in other languages. The most frustrating one, which a google search hasn't been able to answer, is with two different code orders.
The background is that I have an array of integers and a pointer to an element in the array. When I go to print the value the pointer points to, one method prints correctly and the other prints nonsense.
An example of the first code order is:
#include <iostream>
using namespace std;
void main(){
int *pAry;
int Ary[5]={2,5,2,6,8};
pAry=&Ary[3];
cout<<*pAry<<endl;
system("pause");
}
and it works as expected. However, this simple order won't work for the full project, as I want other modules to access pAry, so I thought a global definition should work, since it works in other languages. Here is the example:
#include <iostream>
using namespace std;
int *pAry;
void evaluate();
void main(){
evaluate();
cout<<*pAry<<endl;
system("pause");
}
void evaluate(){
int Ary[5]={2,5,2,6,8};
pAry=&Ary[3];
}
When I use this second method the output is nonsense. Specifically, 1241908.... when the answer should be 6.
First I would like to know why my global method isn't working, and secondly I would like to know how to make it work. Thanks
In the second example, Ary is local to the function evaluate. When evaluate returns, Ary goes out of scope, and accessing its memory region results in undefined behaviour.
To avoid this, declare Ary in a scope where it will still be valid at the time you try to access it.
It's not working because your pAry is pointing into a local array, which is destroyed when you return from evaluate(). That's undefined behavior.
One possible fix is to make your local array static:
static int Ary[5]={2,5,2,6,8};
In the evaluate() function, you're aiming pAry at a variable (actually an array element) local to that function. A pointer can only be dereferenced as long as the object to which it points still exists. Local objects cease to exist when they go out of scope; in this case, this means all elements of Ary cease to exist when evaluate() ends, and thus pAry becomes a dangling pointer (it doesn't point anywhere valid).
Dereferencing a dangling pointer gives undefined behaviour; in your particular case, it outputs a garbage value, but it might just as well crash or (worst of all) appear to work fine until the program changes later.
To solve this issue, you can either make Ary global as well, or make it static:
void evaluate(){
static int Ary[5]={2,5,2,6,8};
pAry=&Ary[3];
}
A static local variable persists across function calls (it exists from first initialisation until program termination), so the pointer will remain valid.
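As a minimal sketch of the static-local fix, the following variant returns the pointer instead of storing it in a global, so the result is easy to check from a caller. The function names element_pointer and read_element are hypothetical and not part of the question's code:

```cpp
// Returns a pointer into a static local array. The array has static storage
// duration, so the pointer stays valid after the function returns.
int* element_pointer() {
    static int Ary[5] = {2, 5, 2, 6, 8};
    return &Ary[3];
}

// Dereferencing is fine here because Ary still exists.
int read_element() {
    return *element_pointer();
}
```

Every call to element_pointer() yields the same address, because there is only one Ary for the whole program.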
The problem is that you need to declare Ary with file scope.
#include <cstdlib>
#include <iostream>
int Ary[] = {1,2,3,4,5};
void print_ary(void); // The void parameter is a style thingy. :-)
int main(void)
{
print_ary();
return EXIT_SUCCESS;
}
void print_ary(void)
{
for (unsigned int i = 0; i < (sizeof(Ary) / sizeof(Ary[0])); ++i)
{
std::cout << Ary[i] << std::endl;
}
}
A better solution would be to pass a std::vector around to your functions rather than using a global variable.
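A rough sketch of what passing a std::vector around could look like, assuming the caller owns the data and other functions only borrow it by const reference. The names make_values and element_at are made up for illustration:

```cpp
#include <cstddef>
#include <vector>

// The caller owns the returned vector; no global state is involved.
std::vector<int> make_values() {
    return {2, 5, 2, 6, 8};
}

// Borrows the vector by const reference, so nothing is copied.
// at() is bounds-checked and throws std::out_of_range for a bad index.
int element_at(const std::vector<int>& values, std::size_t index) {
    return values.at(index);
}
```

Because the vector is an ordinary object with a clear owner, there is no pointer that can outlive the data it points to.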
I did a bit of an experiment to try to understand references in C++:
#include <iostream>
#include <vector>
#include <set>
struct Description {
int a = 765;
};
class Resource {
public:
Resource(const Description &description) : mDescription(description) {}
const Description &mDescription;
};
void print_set(const std::set<Resource *> &resources) {
for (auto *resource: resources) {
std::cout << resource->mDescription.a << "\n";
}
}
int main() {
std::vector<Description> descriptions;
std::set<Resource *> resources;
descriptions.push_back({ 10 });
resources.insert(new Resource(descriptions.at(0)));
// Same as description (prints 10)
print_set(resources);
// Same as description (prints 20)
descriptions.at(0).a = 20;
print_set(resources);
// Why? (prints 20)
descriptions.clear();
print_set(resources);
// Object is written to the same address (prints 50)
descriptions.push_back({ 50 });
print_set(resources);
// Create new array
descriptions.reserve(100);
// Invalid address
print_set(resources);
for (auto *res : resources) {
delete res;
}
return 0;
}
https://godbolt.org/z/TYqaY6Tz8
I don't understand what is going on here. I have found this excerpt from C++ FAQ:
Important note: Even though a reference is often implemented using an address in the underlying assembly language, please do not think of a reference as a funny looking pointer to an object. A reference is the object, just with another name. It is neither a pointer to the object, nor a copy of the object. It is the object. There is no C++ syntax that lets you operate on the reference itself separate from the object to which it refers.
This creates some questions for me. So, if a reference is the object itself and I create a new object at the same memory address, does this mean that the reference "becomes" the new object? In the example above, vectors are linear arrays; so, as long as the array points to the same memory range, the object will be valid. However, this becomes a lot trickier when other data structures are being used (e.g. sets, maps, linked lists) because each "node" typically points to different parts of memory.
Should I treat references as undefined if the original object is destroyed? If yes, is there a way to identify that the reference is destroyed other than a custom mechanism that tracks the references?
Note: Tested this with GCC, LLVM, and MSVC
The note is misleading; treating references as syntactic sugar for pointers is fine as a mental model. In every way a pointer might dangle, a reference will dangle too. Accessing a dangling pointer or reference is undefined behaviour (UB).
int* p = new int{42};
int& i = *p;
delete p;
void f(int);
f(*p); // UB
f(i); // UB, with the exact same reason
This also extends to the standard containers and their rules about pointer/reference invalidation. The reason any surprising behaviour happens in your example is simply UB.
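On the question of detecting a destroyed object: a plain reference or pointer cannot tell you, but the standard library's std::shared_ptr / std::weak_ptr pair can. The following is only a sketch under the assumption that the Description objects can be owned by shared_ptr instead of living directly inside a vector; the read member function is hypothetical:

```cpp
#include <memory>

struct Description {
    int a = 765;
};

// Holds a non-owning handle instead of a reference, so it can ask whether
// the Description is still alive before touching it.
class Resource {
public:
    explicit Resource(const std::shared_ptr<Description>& description)
        : mDescription(description) {}

    // Returns true and writes the value if the object still exists.
    bool read(int& out) const {
        if (auto alive = mDescription.lock()) {
            out = alive->a;
            return true;
        }
        return false;
    }

private:
    std::weak_ptr<Description> mDescription;
};
```

lock() returns an empty shared_ptr once the last owner has released the object, so read() reports failure instead of touching freed memory.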
The way I explain this to myself is:
A pointer is like a finger on your hand. It can point to memory blocks; think of them as the keys on a keyboard. So a pointer literally points to a key that holds something or does something.
A reference is a nickname for something. Your name may be, for example, Michael Johnson, but people may call you Mike, MJ, Mikeson etc. Any time you hear your nickname, the person who called REFERRED to the same thing: you. If you do something to yourself, the reference will show the change too. If you point your finger at something else, that won't affect what you previously pointed at (unless you're doing something weird); you simply point at something new. That being said, as in the accepted answer above, if you do something weird with your fingers and your nicknames, you'll see weird things happening.
References are likely the most important C++ feature for beginners to get right. Many schools today start with MATLAB, which is very slow when you want to do serious work. One of the reasons is that MATLAB gives you less control over references than C++ does (yes, it has them: make a class and derive it from handle; search for it).
Look at these two functions:
double fun1(std::valarray<double> &array)
{
return array.max();
}
double fun2(std::valarray<double> array)
{
return array.max();
}
These two simple functions are very different. When you have a std::valarray and call fun1, the function expects a nickname for that array and processes it directly without making a copy. fun2, on the other hand, takes the input array, creates a copy, and processes the copy.
Naturally, it is usually more efficient to pass large inputs by reference in C++. That being said, you must be careful not to change the input unintentionally, because a change affects the original array in the code that created it: you are processing the same thing, just under a different name.
This makes references useful for a somewhat controversial coding practice: side effects.
In C++ a function returns a single value, so it cannot return multiple outputs directly without packing them into some data type. One workaround is a side effect, as in this example:
#include <stdio.h>
#include <valarray>
#include <iostream>
double fun3(std::valarray<double> &array, double &min)
{
min = array.min();
return array.max();
}
int main()
{
std::valarray<double> a={1, 2, 3, 4, 5};
double sideEffectMin;
double max = fun3(a,sideEffectMin);
std::cout << "max of array is " << max << " min of array is " <<
sideEffectMin<<std::endl;
return 0;
}
So fun3 expects a reference to a double. In other words, it wants the second argument to be a nickname for another double variable. The function then assigns through the reference, and this alters the caller's variable as well. Both name and nickname change, because they are the same "thing".
In the main function, the variable sideEffectMin is left uninitialized, but it gets a value when fun3 is called. Therefore, you get 2 outputs from fun3.
The example shows the side-effect trick, but also that you should beware of altering your inputs, especially when they are references to something else, unless you know what you are doing.
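For comparison, here is a sketch of the same computation without a side effect, assuming it is acceptable to return both results packed in a std::pair. The function name min_max is invented for this example, and the array is assumed to be non-empty (min() and max() are not defined for an empty valarray):

```cpp
#include <utility>
#include <valarray>

// Takes the array by const reference: no copy is made and the function
// cannot modify the caller's data. Both results come back by value,
// with the minimum in .first and the maximum in .second.
std::pair<double, double> min_max(const std::valarray<double>& values) {
    return {values.min(), values.max()};
}
```

The const qualifier addresses the warning above: the compiler rejects any attempt to alter the input through this reference.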
I am new to C++ and trying to convert a string into an integer. I was using atoi, but it has some restrictions, so I started using strtol, which works perfectly. However, I would like to learn more about *temp and &temp (I have googled and learned that temp is a temporary space for storage); I would like to learn the difference and when to use which.
char *temp;
int m = strtol (argv[1],&temp,10);
if (*temp != '\0')
char *temp declares temp as a pointer to char. In an expression, *temp is the char that temp points to, and &temp is the address of the pointer variable temp itself.
First of all, jessycaaaa, welcome to Stack Overflow.
I am new to C++ and trying to convert a string into an integer.
For me this looks like plain C-code. You can compile this with a C++ compiler though.
I was using atoi, but it has some restrictions, so I started using strtol, which works perfectly.
atoi gives you no way to detect errors: it returns 0 when argv[1] does not start with a number, and the behaviour is undefined when the value is out of range for int. So strtol is the approach to go for. If you share a bit more code, we can help you better with your questions.
However, I would like to learn more about *temp and &temp (I have googled and learned that temp is a temporary space for storage); I would like to learn the difference and when to use which.
First of all, you have to distinguish between use and declaration.
char *temp;
Here you declare (*-symbol in declaration) a pointer to char named temp. A pointer is a variable which stores a memory address (the location it points to). Here you did not give it an address, so its value is indeterminate and it may point anywhere, but then
int m = strtol (argv[1],&temp,10);
you pass the address of the pointer (&-symbol, use-case, address-of operator) to strtol, so that strtol can store in temp the address of the first character of argv[1] that is not part of the number; that is all fine. The function also returns the numerical value of the parsed string as a long, which is then converted to an int.
if (*temp != '\0')
Here you access the value stored at the address the pointer holds (*-symbol, use-case, dereference operator). \0 is the character that terminates a C string. So you are checking whether parsing stopped at the end of the string, i.e. whether the whole argument was a number.
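Putting these three steps together, a hedged sketch of a checked conversion might look as follows. The helper name parse_int and its bool-plus-output-parameter interface are assumptions made for this example and do not appear in the question:

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Converts text to int. Returns false if text is empty, has trailing
// characters after the number, or the value does not fit into an int.
bool parse_int(const char* text, int& out) {
    char* end = nullptr;                  // strtol will set this for us
    errno = 0;
    const long value = std::strtol(text, &end, 10);
    if (end == text) return false;        // no digits were consumed
    if (*end != '\0') return false;       // parsing stopped before the end
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) return false;
    out = static_cast<int>(value);
    return true;
}
```

The extra end == text check covers the empty string, where *end is '\0' even though nothing was parsed.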
You know what: in C++ there are more elegant ways to accomplish that using stringstreams:
std::stringstream
Just an idea if you don't want to handle too much string manipulation in C and annoyances with pointers.
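A minimal sketch of the stream-based idea, assuming the whole argument should be a single integer. The helper name parse_int_stream is hypothetical. Unlike the strtol version, formatted stream extraction skips whitespace, so surrounding spaces are accepted:

```cpp
#include <sstream>
#include <string>

// Returns true only if text holds one integer and nothing else
// (apart from whitespace).
bool parse_int_stream(const std::string& text, int& out) {
    std::istringstream in(text);
    if (!(in >> out)) {
        return false;              // no number at the start
    }
    char leftover;
    return !(in >> leftover);      // any further non-space character is an error
}
```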
Also, I would read a good book about C (not C++). C++ also has references; don't get confused by those. If you master the pointer concept of C, I'm pretty sure everything else will be very clear to you.
Best regards
* and & are among the first hurdles that programmers new to C and C++ have to clear.
To really understand these concepts, it helps to know a bit more about how memory works in these languages.
First of all: C++ started out as C with classes and has since gained many other features. Most C programs are also valid C++ programs, or can be made so with small changes. Early C++ compilers even translated C++ to C first.
Memory is, roughly speaking, divided into two parts, a 'stack' and a 'heap'. There are also other places for the code itself and compile-time constants (and maybe a few more), but that doesn't matter for now. Ordinary (non-static) variables declared within a function live on the stack. Let's see this in action with a simple example and analyse how memory is organized to build a mental model.
#include <iostream>
void MyFunction() {
int intOnStack = 5;
int* intPtrOnStack = new int(6); // This int pointer points to an int on the heap
std::cout << intOnStack << *intPtrOnStack;
delete intPtrOnStack;
}
int main() { MyFunction(); }
This program prints 56 when executed. So what happens when MyFunction() gets called? First, a part of the stack is reserved for this function to work with. When the variable intOnStack is declared within the function, it is placed in this part of the stack and it is initialized with (filled with) the int value 5.
Next, the variable intPtrOnStack is declared. intPtrOnStack is of type int*. int*'s point to int's by containing their memory-address. So an int* is placed on the stack and it is initialized with the value that results from the expression new int(6). This expression creates a new int on the heap and returns its memory-address (an int*). So intPtrOnStack now points to the int on the heap, although the pointer itself lives on the stack.
The heap is a part of memory that is 'shared' by all functions and objects within the program. The stack isn't. Every function has its own part of the stack and when the function ends, its part of the stack is deallocated.
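To illustrate that a heap object outlives the function that created it, here is a small sketch. It assumes the use of std::unique_ptr, which the answer itself does not mention, so that the int is freed automatically; the name make_heap_int is made up:

```cpp
#include <memory>

// The int lives on the heap, so it survives the end of this function,
// whose own stack variables are gone once it returns.
// std::unique_ptr deletes the int when its owner goes out of scope.
std::unique_ptr<int> make_heap_int(int value) {
    return std::unique_ptr<int>(new int(value));
}
```

The caller can keep dereferencing the returned pointer for as long as it holds the unique_ptr.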
So int*'s are just memory-addresses of int's. It doesn't matter where the int lives. int*'s can also point to int's on the stack:
#include <iostream>
void MyFunction() {
int intOnStack = 5;
int* intPtrOnStack = &intOnStack; // This int pointer points to intOnStack
std::cout << intOnStack << *intPtrOnStack;
}
int main() { MyFunction(); }
This prints 55. In this example we also see the &-operator in action (there are several uses of & like the bit-wise-and, I'm not going into them).
& simply returns the memory-address (a pointer!) of its operand. In this case its operand is intOnStack so it returns its memory-address and assigns it to intPtrOnStack.
So far, we've seen only int* as types of pointers but there exist pointer-types for each type of object that has a memory-address, including pointers. That means that a thing like int** exists and simply means 'pointer to a pointer to an int'. How would you get one? Like this: &intPtrOnStack.
Can pointers only live on the stack? No: new int*(&intOnStack). Or new int*(new int(5)).
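A short sketch of the 'pointer to a pointer' idea; the function and variable names are chosen for this illustration only:

```cpp
// Writes through two levels of indirection and returns the final value.
int write_through_double_pointer() {
    int value = 5;
    int* ptr = &value;       // points to value
    int** ptrToPtr = &ptr;   // points to ptr
    **ptrToPtr = 7;          // follows both pointers and overwrites value
    return value;
}
```

Each * in the expression undoes one & from the declarations above it.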
How can the variable int a exist without an object being created? It is not static either.
#include <iostream>
using namespace std;
class Data
{
public:
int a;
void print() { cout << "a is " << a << endl; }
};
int main()
{
Data *cp;
int Data::*ptr = &Data::a;
cp->*ptr = 5;
cp->print();
}
Your code shows some undefined behavior, let's go through it:
Data *cp;
Creates a pointer on the stack but does not initialize it. On its own that is not a problem, but it should be initialized at some point. Right now, it can contain 0x0badc0de for all we know.
int Data::*ptr=&Data::a;
Nothing wrong with this, it simply creates a pointer to a member.
cp->*ptr=5;
Very dangerous code: you are now using cp without it being initialized. You are assigning 5 to the memory pointed to by cp. As cp was not initialized, you are writing somewhere arbitrary in the memory space. In the best case, this is memory you don't own or memory without write access, and your program crashes. In the worst case, this actually writes to memory that you do own, resulting in corruption of data.
cp->print();
Less dangerous, but still undefined, as this reads the memory. If you reach this statement, the memory most likely belongs to your program and this will print 5.
It becomes worse
This program might actually just work, you might be able to execute it because your compiler has optimized it. It noticed you did a write, followed by a read, after which the memory is ignored. So, it could actually optimize your program to: cout << "a is "<< 5 <<endl;, which is totally defined.
So if this actually, for some unknown reason, would work, you have a bug in your program which in time will corrupt or crash your program.
Please write the following instead:
int main()
{
Data stackStorage{};
Data *cp = &stackStorage;
int Data::*ptr=&Data::a;
cp->*ptr=5;
cp->print();
}
I'd like to add a bit more on the types used in this example.
int Data::*ptr=&Data::a;
For me, ptr is a pointer to an int member of Data. Data::a is not an instance, so the address operator yields something like the offset of a within Data, typically 0.
cp->*ptr=5;
This dereferences cp, a pointer to Data, and applies the offset stored in ptr, namely 0, i.e., a;
So the two lines
int Data::*ptr=&Data::a;
cp->*ptr=5;
are just an obfuscated way of writing
cp->a = 5;
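Pointers to members become less of an obfuscation when the member is chosen at run time. The following is a sketch only: the second member b and the helper set_member are hypothetical additions to the Data type from the question:

```cpp
struct Data {
    int a = 0;
    int b = 0;   // hypothetical second member, added for illustration
};

// Writes value into whichever int member the pointer-to-member designates.
// The target object must exist; the member pointer alone names no storage.
void set_member(Data& target, int Data::*member, int value) {
    target.*member = value;
}
```

A call such as set_member(d, &Data::b, 9) picks the member without the function knowing its name.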
I am pretty new to C++, and while getting started I got stuck on a frustrating problem concerning pointers. Consider the following code:
#include <iostream>
using namespace std;
int main (){
int* mypointer;
*mypointer = 1;
cout << "Whats wrong";
}
It crashes during runtime. I suspect it has to do with the pointer assignment. But after commenting out the cout statement, the program executes. By assigning the pointer as
int* mypointer, myvar;
myvar = 1;
mypointer = &myvar;
the program runs, and I can print the value of the pointer as:
cout << "value of pointer: " << *mypointer;
I draw the conclusion that this would be the correct usage of a pointer.
BUT: Why does the following code execute??:
#include <iostream>
#include <stdio.h>
using namespace std;
int main (){
int* mypointer;
*mypointer = 1;
printf("This works!\n");
printf("I can even print the value mypointer is pointing to: %i",*mypointer);
}
Simply using printf?? I would really appreciate an explanation guys!
The code executes because, just by chance, your compiler has managed to optimise the program down enough that the 1 is "hardcoded" into the printf call.
It would probably have done this anyway, rendering both the original int and the pointer irrelevant, but in this instance it's not reflecting the fact that the pointer was broken and there wasn't an original int.
So, strictly speaking, this doesn't even reflect the semantics of the program: as you've spotted, assigning a value to an int that doesn't exist (through an uninitialised or otherwise invalid pointer) is nonsense and results in undefined behaviour.
But that's the nature of undefined behaviour: anything can happen! The authors of your compiler are making the most of that, by realising that they don't have to write any code to make this case work logically. Because it's you who violated the C++ contract. :)
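One defensive habit that follows from this is to initialise pointers to nullptr and check before writing. The sketch below uses a made-up helper, store_if_valid. Note its limit: the check only catches null pointers, so it helps only if pointers are never left uninitialised in the first place:

```cpp
// Stores value through target only if the pointer is non-null.
// An uninitialised pointer cannot be detected this way.
bool store_if_valid(int* target, int value) {
    if (target == nullptr) {
        return false;
    }
    *target = value;
    return true;
}
```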
I am a C++ learner. Others told me "an uninitialized pointer may point anywhere". How can I prove that with code? I made a little test program, but my uninitialized pointers always point to 0. In which case do they not point to 0? Thanks
#include <iostream>
using namespace std;
int main() {
int* p;
printf("%d\n", p);
char* p1;
printf("%d\n", p1);
return 0;
}
Any uninitialized variable by definition has an indeterminate value until a value is supplied, and even accessing it is undefined. Because this is the grey-area of undefined behaviour, there's no way you can guarantee that an uninitialized pointer will be anything other than 0.
Anything you write to demonstrate this would be dictated by the compiler and system you are running on.
If you really want to, you can try writing a function that fills up a local array with garbage values, and create another function that defines an uninitialized pointer and prints it. Run the second function after the first in your main() and you might see it.
Edit: For your curiosity, I exhibited the behavior with VS2015 on my system with this code:
#include <cstdint>
#include <iostream>
void f1()
{
// junk
char arr[24];
for (char& c : arr) c = 1;
}
void f2()
{
// uninitialized
int* ptr[4];
std::cout << (std::uintptr_t)ptr[1] << std::endl;
}
int main()
{
f1();
f2();
return 0;
}
Which prints 16843009 (0x01010101). But again, this is all undefined behaviour.
Well, I think it is not worth proving this, because you should use a good coding style, which says: initialise all variables! One example: if you "free" a pointer, just give it a value afterwards, as in this example:
char *p=NULL; // yes, this is not needed but do it! later you may change your program and add code beneath this line...
p=(char *)malloc(512);
...
free(p);
p=NULL;
That is a safe and good style. Also, if you call free(p) again by accident, it will not crash your program, because free(NULL) does nothing. In this example, if you don't set p to NULL after calling free(), you can use the pointer again by mistake, and your program would try to access already freed memory; this will crash your program or (worse) may end in strange results.
So don't waste time on your question about a case where pointers do not point to NULL. Just set values to your variables (pointers)! :-)
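The free-then-null habit can be wrapped in a small helper so the second step is never forgotten. This is a sketch with a hypothetical name, free_and_null, written in C++ because it takes the pointer by reference:

```cpp
#include <cstdlib>

// Frees the block and nulls the caller's pointer in one step.
// Calling it twice is harmless because free(nullptr) does nothing.
void free_and_null(char*& p) {
    std::free(p);
    p = nullptr;
}
```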
It depends on the compiler. Your code executed on an old MSVC2008 displays in release mode (plain random):
1955116784
1955116784
and in debug mode (after complaining about the use of an uninitialized pointer):
-858993460
-858993460
because that implementation sets uninitialized pointers to 0xcccccccc in debug mode to detect their usage.
The standard says that using an uninitialized pointer leads to undefined behaviour. That means that, as far as the standard is concerned, anything can happen. But a particular implementation is free to do whatever it wants:
yours happens to set the pointers to 0 (but you should not rely on it unless it is documented in the implementation documentation)
MSVC in debug mode sets the pointer to 0xcccccccc but AFAIK does not document it (*), so we still cannot rely on it
(*) at least I could not find any reference...
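In contrast to the cases above, there is one form where the language itself guarantees a null pointer: value-initialisation with empty braces. A minimal sketch, with a function name invented for this example:

```cpp
// Value-initialisation with {} is guaranteed to produce a null pointer,
// unlike a plain declaration such as `int* p;` inside a function.
bool value_initialized_is_null() {
    int* p{};
    char* q{};
    return p == nullptr && q == nullptr;
}
```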