I did a bit of an experiment to try to understand references in C++:
#include <iostream>
#include <vector>
#include <set>
struct Description {
int a = 765;
};
class Resource {
public:
Resource(const Description &description) : mDescription(description) {}
const Description &mDescription;
};
void print_set(const std::set<Resource *> &resources) {
for (auto *resource: resources) {
std::cout << resource->mDescription.a << "\n";
}
}
int main() {
std::vector<Description> descriptions;
std::set<Resource *> resources;
descriptions.push_back({ 10 });
resources.insert(new Resource(descriptions.at(0)));
// Same as description (prints 10)
print_set(resources);
// Same as description (prints 20)
descriptions.at(0).a = 20;
print_set(resources);
// Why? (prints 20)
descriptions.clear();
print_set(resources);
// Object is written to the same address (prints 50)
descriptions.push_back({ 50 });
print_set(resources);
// Create new array
descriptions.reserve(100);
// Invalid address
print_set(resources);
for (auto *res : resources) {
delete res;
}
return 0;
}
https://godbolt.org/z/TYqaY6Tz8
I don't understand what is going on here. I have found this excerpt from C++ FAQ:
Important note: Even though a reference is often implemented using an address in the underlying assembly language, please do not think of a reference as a funny looking pointer to an object. A reference is the object, just with another name. It is neither a pointer to the object, nor a copy of the object. It is the object. There is no C++ syntax that lets you operate on the reference itself separate from the object to which it refers.
This creates some questions for me. So, if reference is the object itself and I create a new object in the same memory address, does this mean that the reference "becomes" the new object? In the example above, vectors are linear arrays; so, as long as the array points to the same memory range, the object will be valid. However, this becomes a lot trickier when other data sets are being used (e.g sets, maps, linked lists) because each "node" typically points to different parts of memory.
Should I treat references as undefined if the original object is destroyed? If yes, is there a way to identify that the reference is destroyed other than a custom mechanism that tracks the references?
Note: Tested this with GCC, LLVM, and MSVC
The note is misleading, treating references as syntax sugar for pointers is fine as a mental model. In all the ways a pointer might dangle, a reference will also dangle. Accessing dangling pointers/references is undefined behaviour (UB).
int* p = new int{42};
int& i = *p;
delete p;
void f(int);
f(*p); // UB
f(i); // UB, with the exact same reason
This also extends to the standard containers and their rules about pointer/reference invalidation. The reason any surprising behaviour happens in your example is simply UB.
The way I explain this to myself is:
Pointer is like a finger on your hands. It can point to memory blocks, think of them as a keyboard. So pointer literally points to a keypad that holds something or does something.
Reference is a nickname for something. Your name may be for example Michael Johnson, but people may call you Mike, MJ, Mikeson etc. Anytime you hear your nickname, person who called REFERED to the same thing - you. If you do something to yourself, reference will show the change too. If you point at something else, it won't affect what you previously pointed on (unless you're doing something weird), but rather point on something new. That being said, as in the accepted answer above, if you do something weird with your fingers and your nicknames, you'll see weird things happening.
References are likely the most important feature that C++ has that is critical in coding for beginners. Many schools today start with MATLAB which is insanely slow when you wish to do things seriously. One of the reasons is the lack of controlling references in MATLAB (yes it has them, make a class and derive from the handle - google it out) as you would in C++.
Look these two functions:
double fun1(std::valarray<double> &array)
{
return array.max();
}
double fun2(std::valarray<double> array)
{
return array.max();
}
These simple two functions are very different. When you have some STL array and use fun1, function will expect nickname for that array, and will process it directly without making a copy. fun2 on the other hand will take the input array, create its copy, and process the copy.
Naturally, it is much more efficient to use references when making functions to process inputs in C++. That being said, you must be certain not to change your input in any way, because that will affect original input array in another piece of code where you generated it - you are processing the same thing, just called differently.
This makes references useful for a bit controversial coding, called side-effects.
In C++ you can't make a function with multiple outputs directly without making a custom data type. One workaround is a side effect in example like this:
#include <stdio.h>
#include <valarray>
#include <iostream>
double fun3(std::valarray<double> &array, double &min)
{
min = array.min();
return array.max();
}
int main()
{
std::valarray<double> a={1, 2, 3, 4, 5};
double sideEffectMin;
double max = fun3(a,sideEffectMin);
std::cout << "max of array is " << max << " min of array is " <<
sideEffectMin<<std::endl;
return 0;
}
So fun3 is expecting a reference to a double data type. In other words, it wants the second input to be a nickname for another double variable. This function then goes to alter the reference, and this will also alter the input. Both name and nickname get altered by the function, because it's the same "thing".
In main function, variable sideEffectMin is initialized to 0, but it will get a value when fun3 function is called. Therefore, you got 2 outputs from fun3.
The example shows you the trick with side effect, but also to be ware not to alter your inputs, specially when they are references to something else, unless you know what you are doing.
Related
How variable int a is in existence without object creation? It is not of static type also.
#include <iostream>
using namespace std;
class Data
{
public:
int a;
void print() { cout << "a is " << a << endl; }
};
int main()
{
Data *cp;
int Data::*ptr = &Data::a;
cp->*ptr = 5;
cp->print();
}
Your code shows some undefined behavior, let's go through it:
Data *cp;
Creates a pointer on the stack, though, does not initialize it. On it's own not a problem, though, it should be initialized at some point. Right now, it can contain 0x0badc0de for all we know.
int Data::*ptr=&Data::a;
Nothing wrong with this, it simply creates a pointer to a member.
cp->*ptr=5;
Very dangerous code, you are now using cp without it being initialized. In the best case, this crashes your program. You are now assigning 5 to the memory pointed to by cp. As this was not initialized, you are writing somewhere in the memory space. This can include in the best case: memory you don't own, memory without write access. In both cases, your program can crash. In the worst case, this actually writes to memory that you do own, resulting in corruption of data.
cp->print();
Less dangerous, still undefined, so will read the memory. If you reach this statement, the memory is most likely allocated to your program and this will print 5.
It becomes worse
This program might actually just work, you might be able to execute it because your compiler has optimized it. It noticed you did a write, followed by a read, after which the memory is ignored. So, it could actually optimize your program to: cout << "a is "<< 5 <<endl;, which is totally defined.
So if this actually, for some unknown reason, would work, you have a bug in your program which in time will corrupt or crash your program.
Please write the following instead:
int main()
{
int stackStorage = 0;
Data *cp = &stackStorage;
int Data::*ptr=&Data::a;
cp->*ptr=5;
cp->print();
}
I'd like to add a bit more on the types used in this example.
int Data::*ptr=&Data::a;
For me, ptr is a pointer to int member of Data. Data::a is not an instance, so the address operator returns the offset of a in Data, typically 0.
cp->*ptr=5;
This dereferences cp, a pointer to Data, and applies the offset stored in ptr, namely 0, i.e., a;
So the two lines
int Data::*ptr=&Data::a;
cp->*ptr=5;
are just an obfuscated way of writing
cp->a = 5;
It seems one can pass an array by reference to the first element:
void passMe(int& firstElement)
{
int* p=&firstElement;
cout << p[0] << p[1];
}
Main program:
int hat[]={ 5,2,8 };
passMe(hat[0]);
Usually, instead of the above function definition, I would do void passMe(int* myArray) or void passMe(int[] myArray). But would the method above cause any issues? Knowing the answer would allow me to gain a better understanding of the things at play here, if nothing else.
As far as the language-lawyers and the compiler are concerned, it's fine. The main issue will be with the human programmers (including future versions of yourself). When your standard C++ programmer sees this in a program:
void passMe(int& firstElement); // function declaration (in a .h file)
int hat[]={ 5,2,8 };
passMe(hat[0]);
He is going to expect that passMe() might read and/or modify hat[0]. He is definitely not going to expect that passMe() might read and/or modify hat[1]. And when he eventually figures out what you are doing, he's going to be very upset with you for misleading him via unnecessarily "clever" code. :)
Furthermore, if another programmer (who doesn't know about your trick) tries to call your function from his own code, after seeing only the function declaration in the header, he's likely to try to do something like this:
int cane = 5;
passMe(cane);
... which will lead directly to mysterious undefined behavior at runtime when passMe() tries to reference the second item in the array after cane, which doesn't actually exist because cane is not actually in an array.
I've been coding for a while in other languages and am pretty proficient, but now I am diving more deeply into C++ and have come across some weird problems that I never had in other languages. The most frustrating one, which a google search hasn't been able to answer, is with two different code orders.
The background is that I have an array of integers, and a pointer an element in the array. When I go to print the pointer one method prints correctly, and the other prints nonsense.
An example of the first code order is:
#include <iostream>
using namespace std;
void main(){
int *pAry;
int Ary[5]={2,5,2,6,8};
pAry=&Ary[3];
cout<<*pAry<<endl;
system("pause");
}
and it works as expected. However this simple order wont work for the full project as I want other modules to access pAry, so I thought a global define should work, since it works in other languages. Here is the example:
#include <iostream>
using namespace std;
int *pAry;
void evaluate();
void main(){
evaluate();
cout<<*pAry<<endl;
system("pause");
}
void evaluate(){
int Ary[5]={2,5,2,6,8};
pAry=&Ary[3];
}
When I use this second method the output is nonsense. Specifically 1241908....when the answer should be 6.
First I would like to know why my global method isn't working, and secondly I would like to know how to make it work. Thanks
In the second example, Ary is local to the function evaluate. When evaluate returns, Ary goes out of scope, and accessing it's memory region results in undefined behaviour.
To avoid this, declare Ary in a scope where it will still be valid at the time you try to access it.
It's not working because your pAry is pointing into a local array, which is destroyed when you return from evaluate(). That's undefined behavior.
One possible fix is to make your local array static:
static int Ary[5]={2,5,2,6,8};
In the evaluate() function, you're aiming pAry at a variable (actually array element) local to that function. A pointer can only be dereferenced as long as the object to which it points still exists. Local objects cease to exist when they go out of scope; in this case, this means all elements of Ary cease to exist when evalute() ends, and thus pAry becomes a dangling pointer (it doesn't point anywhere valid).
Dereferncing a dangling pointer gives undefined behaviour; in your particular case, it outputs a garbage value, but it might just as well crash or (worst of all) appear to work fine until the program changes later.
To solve this issue, you can either make Ary global as well, or make it static:
void evaluate(){
static int Ary[5]={2,5,2,6,8};
pAry=&Ary[3];
}
A static local variable persists across function calls (it exists from first initialisation until program termination), so the pointer will remaing valid.
The problem is that you need to declare Ary with file scope.
int Ary[] = {1,2,3,4,5};
void print_ary(void); // The void parameter is a style thingy. :-)
int main(void)
{
print_ary();
return EXIT_SUCCESS;
}
void print_ary(void)
{
for (unsigned int i = 0; i < (sizeof(Ary) / sizeof(Ary[0]); ++i)
{
std::cout << Ary[i] << std::endl;
}
}
A better solution would be to pass a std::vector around to your functions rather than using a global variable.
I am learning C++ from a game development standpoint coming from long time development in C# not related to gaming, but am having a fairly difficult time grasping the concept/use of pointers and de-referencing. I have read the two chapters in my current classes textbook literally 3 times and even googled some different pages relating to them, but it doesn't seem to be coming together all that well.
I think I get this part:
#include <iostream>
int main()
{
int myValue = 5;
int* myPointer = nullptr;
std::cout << "My value: " << myValue << std::endl; // Returns value of 5.
std::cout << "My Pointer: " << &myValue << std::endl; // Returns some hex address.
myPointer = &myValue; // This would set it to the address of memory.
*myPointer = 10; // Essentially sets myValue to 10.
std::cout << "My value: " << myValue << std::endl; // Returns value of 10.
std::cout << "My Pointer: " << &myValue << std::endl; // Returns same hex address.
}
I think what I'm not getting is, why? Why not just say myValue = 5, then myValue = 10? What is the purpose in going through the added layer for another variable or pointer? Any helpful input, real life uses or links to some reading that would help make sense of this would be GREATLY appreciated!
The purpose of pointers is something you will not fully realize until you actually need them for the first time. The example you provide is a situation where pointers are not needed, but can be used. It is really just to show how they work. A pointer is a way to remember where memory is without having to copy around everything it points to. Read this tutorial because it may give you a different view than the class book does:
http://www.cplusplus.com/doc/tutorial/pointers/
Example: If you have an array of game entities defined like this:
std::vector<Entity*> entities;
And you have a Camera class that can "track" a particular Entity:
class Camera
{
private:
Entity *mTarget; //Entity to track
public:
void setTarget(Entity *target) { mTarget = target; }
}
In this case, the only way for a Camera to refer to an Entity is by the use of pointers.
entities.push_back(new Entity());
Camera camera;
camera.setTarget(entities.front());
Now whenever the position of the Entity changes in your game world, the Camera will automatically have access to the latest position when it renders to the screen. If you had instead not used a pointer to the Entity and passed a copy, you would have an outdated position to render the Camera.
TL;DR: pointers are useful when multiple places need access to the same information
In your example they aren't doing much, like you said it's just showing how they can be used. One thing pointers are used for is to connect nodes like in a tree. If you have a node structure like so...
struct myNode
{
myNode *next;
int someData;
};
You can create several nodes and link each one to the previous myNode's next member. You can do this without pointers, but the neat thing with pointers is because they are all linked together, when you pass around the myNode list you only need to pass the first (root) node.
The cool thing about pointers is that if two pointers are referencing the same memory address, any changes to the memory address are recognized by everything referencing that memory address. So if you did:
int a = 5; // set a to 5
int *b = &a; // tell b to point to a
int *c = b; // tell c to point to b (which points to a)
*b = 3; // set the value at 'a' to 3
cout << c << endl; // this would print '3' because c points to the same place as b
This has some practical uses. Consider you have a list of nodes linked together. The data in each node defines some sort of task that needs to be done that will be handled by some function. As new tasks are added to the list, they get appended to the end. Since the function has a pointer to the node list, as tasks are added on it receives those as well. On the other hand, the function can also remove tasks as it completes them, which are then reflected back across any other pointers that are looking at the node list.
Pointers are also used for dynamic memory. Say you want the user to enter in a series of numbers, and they tell you how many numbers they want to use. You could define an array of 100 elements to allow for up to 100 numbers, or you could use dynamic memory.
int count = 0;
cout << "How many numbers do you want?\n> ";
cin >> count;
// Create a dynamic array with size 'count'
int *myArray = new int[count];
for(int i = 0; i < count; i++)
{
// Ask for numbers here
}
// Make sure to delete it afterwars
delete[] myArray;
If you pass an int by value, you will not be able to change the callers value. But if you pass a pointer to the int, you can change it. This is how C changed parameters. C++ can pass values by reference so this is less useful.
f(int i)
{
i= 10;
std::cout << "f value: " << i << std::endl;
}
f2(int *pi)
{
*pi = 10;
std::cout << "f2 value: " << pi << std::endl;
}
main()
{
i = 5
f(i)
std::cout << "main f value: " << i << std::endl;
f2(&i)
std::cout << "main f2 value: " << i << std::endl;
}
in main the first print should still be 5. The second one should be 10.
What is the purpose in going through the added layer for another variable or pointer?
There isn't one. It's a deliberately contrived example to show you how the mechanism works.
In reality, objects are often stored, or accessed from distant parts of your codebase, or allocated dynamically, or otherwise cannot be scope-bound. In any of these scenarios you may find yourself in need of indirectly referring to objects, and this is achieved using pointers and/or references (depending on your need).
For example some objects have no name. It can be an allocated memory or an address returned from a function or it can be an iterator.
In your simple example of course there is no need to declare the pointer. However in many cases as for example when you deal with C string functions you need to use pointers. A simple example
char s[] = "It is pointer?";
if ( char *p = std::strchr( s, '?' ) ) *p = '!';
We use pointers mainly when we need to allocate memory dynamically. For example,To implement some data structures like Linked lists,Trees etc.
From C# point of view pointer is quite same as Object reference in C# - it is just an address in memory there actual data is stored, and by dereferencing it you can manipulate with this data.
First of non-pointer data like int in your example is allocated on the stack. This means that then it goes out of the scope it's used memory will be set free. On the other hand data allocated with operator new will be placed in heap (just like then you create any Object in C#) resulting that this data will not be set free that you loose it's pointer. So using data in heap memory makes you do one of the following:
use garbage collector to remove data later (as done in C#)
manually free memory then you don't need it anymore (in C++ way
with operator delete).
Why is it needed?
There are basically three use-cases:
stack memory is fast but limited, so if you need to store big
amount of data you have to use heap
copying big data around is
expensive. Then you pass simple value between functions on the stack
it does copying. Then you pass pointer the only thing copied is just
it's address (just like in C#).
some objects in C++ might be
non-copyable, like threads for example, due to their nature.
Take the example where you have a pointer to a class.
struct A
{
int thing;
double other;
A() {
thing = 4;
other = 7.2;
}
};
Let's say we have a method which takes an 'A':
void otherMethod()
{
int num = 12;
A mine;
doMethod(num, mine);
std::cout << "doobie " << mine.thing;
}
void doMethod(int num, A foo)
{
for(int i = 0; i < num; ++i)
std::cout << "blargh " << foo.other;
foo.thing--;
}
When the doMethod is called, the A object is passed by value. This means a NEW A object is created (as a copy). The foo.thing-- line won't modify the mine object at all as they're two separate objects.
What you need to do is to pass in a pointer to the original object. When you pass in a pointer, then the foo.thing-- will modify the original object instead of creating a copy of the old object into a new one.
Pointers (or references) are vital for the use of dynamic polymorphism in C++. They are how you use a class hierarchy.
Shape * myShape = new Circle();
myShape->Draw(); // this draws a circle
// in fact there is likely no implementation for Shape::Draw
Attempts to use a derived class through a value (instead of pointer or reference) to a base class will often result in slicing and losing the derived data portion of the object.
It makes a lot more sense when you're passing the pointer to a function, see this example:
void setNumber(int *number, int value) {
*number = value;
}
int aNumber = 5;
setNumber(&aNumber, 10);
// aNumber is now 10
What we're doing here is setting the value of *number, this would not be possible without the use of pointers.
If you defined it like this instead:
void setNumber(int number, int value) {
number = value;
}
int aNumber = 5;
setNumber(aNumber, 10);
// aNumber is still 5 since you're only copying its value
It also gives better performance and you're not wasting as much memory when you're passing a reference to a larger object (such as a class) to a function, instead of passing the whole object.
Well to use pointers in programming is a pretty concept. And for dynamically allocating the memory it is essentiall to use pointers to store the adress of the first location of the memory which we have reserved and same is the case for releasing the memory, we need pointers. It's true as some one said in above answer that u cannot understand the use of poinetrs until u need it. One example is that u can make a variable size array using pointers and dynamic memoru allocation. And one thing important is that using pointers we can change the actual value of the location becaues we are accessing the location indirectly. More ever, when we need to pass our value by refernce there are times when references do not work so we need pointers.
And the code u have written is using dereference operator. As i have said that we access the loaction of memory indirectly by using pointers so it changes the actuall value of the location like reference objects that is why it is printing 10.
I have a C++ class; this class is as follows:
First, the header:
class PageTableEntry {
public:
PageTableEntry(bool modified = true);
virtual ~PageTableEntry();
bool modified();
void setModified(bool modified);
private:
PageTableEntry(PageTableEntry &existing);
PageTableEntry &operator=(PageTableEntry &rhs);
bool _modified;
};
And the .cpp file
#include "PageTableEntry.h"
PageTableEntry::PageTableEntry(bool modified) {
_modified = modified;
}
PageTableEntry::~PageTableEntry() {}
bool PageTableEntry::modified() {
return _modified;
}
void PageTableEntry::setModified(bool modified) {
_modified = modified;
}
I set a breakpoint on all 3 lines in the .cpp file involving _modified so I can see exactly where they are being set/changed/read. The sequence goes as follows:
Breakpoint in constructor is triggered. _modified variable is confirmed to be set to true
Breakpoint in accessor is triggered. _modified variable is FALSE!
This occurs with every instance of PageTableEntry. The class itself is not changing the variable - something else is. Unfortunately, I don't know what. The class is created dynamically using new and is passed around (as pointers) to various STL structures, including a vector and a map. The mutator is never called by my own code (I haven't gotten to that point yet) and the STL structures shouldn't be able to, and since the breakpoint is never called on the mutator I can only assume that they aren't.
Clearly there's some "gotcha" where private variables can, under certain circumstances, be changed without going through the class's mutator, triggered by who-knows-what situation, but I can't imagine what it could be. Any thoughts?
UPDATE:
The value of this at each stage:
Constructor 1: 0x100100210
Constructor 2: 0x100100400
Accessor 1: 0x1001003f0
Accessor 2: 0x100100440
UPDATE2:
(code showing where PageTableEntry is accessed)
// In constructor:
_tableEntries = std::map<unsigned int, PageTableEntry *>();
// To get an entry in the table (body of testAddr() function, address is an unsigned int:
std::map<unsigned int, PageTableEntry *>::iterator it;
it = _tableEntries.find(address);
if (it == _tableEntries.end()) {
return NULL;
}
return (PageTableEntry *)&(*it);
// To create a new entry:
PageTableEntry *entry = testAddr(address);
if (!entry) {
entry = new PageTableEntry(_currentProcessID, 0, true, kStorageTypeDoesNotExist);
_tableEntries.insert(std::pair<unsigned int, PageTableEntry *>(address, entry));
}
Those are the only points at which PageTableEntry objects are stored and retrieved from STL structures in order to cause the issue. All other functions utilize the testAddr() function to retrieve entries.
UNRELATED: Since C++ now has 65663 questions, and so far 164 have been asked today, that means that only just today the number of C++ tagged questions exceeded a 16-bit unsigned integer. Useful? No. Interesting? Yes. :)
Either your debugger isn't reporting the value correctly (not unheard of, and even expected in optimized builds) or you have memory corruption elsewhere in your program. The code you've shown is more-or-less fine and should behave as you expect.
EDIT corresponding to your 'UPDATE2':
The problem is in this line:
return (PageTableEntry *)&(*it);
The type of *it is std::pair<unsigned const, PageTableEntry*>&, so you're effectively reinterpret-casting a std::pair<unsigned const, PageTableEntry*>* to a PageTableEntry*. Change that line to:
return it->second;
Keep an eye out for other casts in your codebase. Needing a cast in the first place is a code smell, and the result of doing a cast incorrectly can be undefined behavior, including manifesting as memory corruption as you're seeing here. Using C++-style casts instead of C-style casts makes it trivial to find where casts occur in your codebase so they can be reviewed easily (hint, hint).
std::map<>::find() returns an iterator which when dereferenced returns a std::map<>::value_type. The value_type in this case is a std::pair<>. You're returning the address of that pair rather than the PageTableEntry. I believe you want the following:
// To get an entry in the table (body of testAddr() function, address is an unsigned int:
std::map<unsigned int, PageTableEntry *>::iterator it;
it = _tableEntries.find(address);
if (it == _tableEntries.end()) {
return NULL;
}
return (*it).second;
P.S.: C-style casts are evil. The compiler would have issued a diagnostic with a C++ cast in place. :)
Try looking at the value of this at each breakpoint.
That copy constructor and assignment operator are both going to be used a lot if you're using STL containers. Maybe if you show us the code for those, we would see something wrong.
Could you add another unique value to the class to track the PageTableEntry s?
I know that I've had problems like this where the real problem was that there were multiple entries that looked the same and the breakpoints might switch PageTableEntry s without me realizing it.