I wrote this simple C++ program and got some strange results that I don't understand (the results are described in the line comments):
int arr[3] {1, 2, 3};
int* p{ nullptr };
p = arr;
std::cout << p[0] << " " << p[1] << " " << p[2]; // prints 1 2 3, OK
p = arr;
std::cout << *(p++) << " " << *(p++) << " " << *(p); // prints 2 1 3 ??
p = arr;
std::cout << *p << " " << *(++p) << " " << *(++p); // prints 3 3 3 ??
p = arr;
std::cout << *p << " "; ++p;
std::cout << *p << " "; ++p;
std::cout << *p; // prints 1 2 3, OK
It seems that pointer increments inside a chained std::cout expression don't work.
What's wrong with my reasoning?
I assumed it would work.
best
Final edit: I was using C++14; I switched to C++20 and now it works as expected.
thank you everybody!
int* p{ nullptr };
std::cout << p[0] << " " << p[1] << " " << p[2];
This is Undefined Behavior, as you are dereferencing nullptr; p does not point at valid memory yet.
p = arr;
std::cout << p[0] << " " << p[1] << " " << p[2];
This is well-defined behavior. p points at valid memory and is never modified: each subscript only reads at a fixed offset from p. This is the same as if you had written the following instead:
std::cout << *(p+0) << " " << *(p+1) << " " << *(p+2);
p = arr;
std::cout << *(p++) << " " << *(p++) << " " << *(p);
p = arr;
std::cout << *p << " " << *(++p) << " " << *(++p);
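Both of these are Undefined Behavior prior to C++17. The operator<< calls themselves are chained left to right, but earlier versions do not specify the order in which their operands are evaluated, so the increments of p are unsequenced relative to the other reads and writes of p in the same expression. The compiler is free to evaluate them in whatever order it wants. From C++17 onward, the operands of << are evaluated left to right, so this is no longer the case.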
p = arr;
std::cout << *p << " "; ++p;
std::cout << *p << " "; ++p;
std::cout << *p;
This is well-defined behavior. p points at valid memory, is always dereferenced before incremented, and is incremented in a deterministic and valid manner.
Related
I am trying to understand std::find(). Below is my code.
std::set::find searches the container for an element equivalent to
val and returns an iterator to it if found, otherwise it returns an
iterator to set::end.
But when I gave find(100) I am getting 7 rather than 20.
#include <iostream>
#include <set>
using namespace std;
int main()
{
set <int> s1{20, 7, 2};
s1.insert(10);
s1.insert(5);
s1.insert(15);
s1.insert(1);
cout << "size() : " << s1.size() << endl;
cout << "max_size() : " << s1.max_size() << endl;
cout << "empty() : " << s1.empty() << endl;
for(auto itr = s1.begin(); itr != s1.end(); itr++)
cout << *itr << " ";
cout << endl;
cout << endl << "---- find(value) ----" << endl;
auto a1 = s1.find(10);
//cout << "find(10) : " << a1 << " " << *a1 << endl;
cout << "find(10) : " << *a1 << endl;
auto a2 = s1.find(100);
cout << "find(100) : " << *a2 << endl;
cout << endl << "---- count(value) ----" << endl;
cout << "s1.count(10) : " << s1.count(10) << endl;
cout << "s1.count(100) : " << s1.count(100) << endl;
return 0;
}
Output:
size() : 7
max_size() : 107374182
empty() : 0
1 2 5 7 10 15 20
---- find(value) ----
find(10) : 10
find(100) : 7
---- count(value) ----
s1.count(10) : 1
s1.count(100) : 0
The problem is that you're dereferencing the iterator a2, which is equal to s1.end(); that is undefined behavior. It arose because you dereference the iterator without first checking whether the element was found.
To solve this, add an explicit check before dereferencing the iterator.
//dereference only if the element was found
if(a2!=s1.end())
{
std::cout << "find(100) : " << *a2 << std::endl;
}
//otherwise print a message saying element not found
else
{
std::cout<<"element not found"<<std::endl;
}
auto a2 = s1.find(100);
cout << "find(100) : " << *a2 << endl;
Here you dereference (*a2) the end iterator. That is undefined behaviour - remember that s1.end() points to one past the last element and must not be dereferenced.
You're unlucky that you got a value from that dereference - it would be more convenient if your program crashed or otherwise reported the problem. But UB doesn't have to be diagnosed in any way.
You might have spotted the problem if you had run your program using Valgrind's memory checker (or your preferred equivalent). But there's a good chance that's unable to detect it (if the set has over-allocated, which is likely).
The value 100 is not present in the set. So this call
auto a2 = s1.find(100);
returns the iterator s1.end(). You may not dereference the iterator. This statement
cout << "find(100) : " << *a2 << endl;
invokes undefined behavior.
I was playing around with pointers and got results I did not expect:
#include <iostream>
#include <vector>
int main() {
int arr[4] = { 1, 2, 3, 4 };
int* pArr = arr;
std::cout << "First number: " << *pArr << " at address: " << pArr;
pArr++;
std::cout << "\nSecond number: " << *pArr << " at address: " << pArr;
pArr++;
std::cout << "\nThird number: " << *pArr << " at address: " << pArr;
pArr++;
std::cout << "\nFourth number: " << *pArr << " at address: " << pArr;
int* pArr2 = arr;
std::cout << "\n"
<< *pArr2++ << "\n"
<< *pArr2++ << "\n"
<< *pArr2++ << "\n"
<< *pArr2++ << "\n";
/*
int* pArr2 = arr;
std::cout << "\n"
<< ++ * pArr2 << "\n"
<< * ++pArr2 << "\n";
*/
}
The two different results:
1 2 3 4 - as expected using the first method
4 3 2 1 - using a single cout statement with multiple chained arguments (I do not know the proper name).
So my question is: why does this happen? Using multiple cout statements gives the result I expected, while using a single cout statement prints the values backwards.
As a side note, another confusing thing to me is that pre-increment results in all values being equal. In the commented bit of code, the result is 3 3, no matter the ++ placement with respect to the *.
This code:
std::cout << "\n" << *pArr2++ << "\n";
std::cout << "\n" << *pArr2++ << "\n";
has a well-defined order of modifications to pArr2 and will print
1
2
But this code:
std::cout << "\n" << *pArr2++ << "\n" << *pArr2++ << "\n";
invokes undefined behavior before C++17, because there are multiple modifications to pArr2 that are unsequenced. The program has UB, so it could print anything.
From C++17, the left operand of each << is sequenced before its right operand, so the modifications happen in order and the above code is guaranteed to print:
1
2
If we make a reference to a vector element and then resize the vector, the reference is no longer valid. The same happens with an iterator:
std::vector<int> vec{0, 1, 2, 3, 4, 5};
int& ref = vec[0];
auto itr = vec.begin();
cout << ref << " " << *itr << endl;
vec[0] = 7;
cout << ref << " " << *itr << endl;
vec.resize(100);
vec[0] = 3;
cout << ref << " " << *itr << endl;
Prints out:
0 0
7 7
0 0 // We expected a 3 here
And I know that it would be more practical to just keep a reference to the vector itself and call vec[0], but just for the sake of questioning, is it possible to keep an object that will always be vec[0] even if the object is moved?
I've tried writing a small helper class to help with this, but I'm unsure if this is the best method or if it can even fail?
template<typename T>
struct HelperClass
{
std::vector<T>& vec;
size_t element;
HelperClass(std::vector<T>& vec_, size_t element_) : vec(vec_) , element(element_) {}
// Either define an implicit conversion from HelperClass to T
// or a 'dereference' operator that returns vec[element]
operator T&() { return vec.at(element); }
T& operator*() { return vec.at(element); }
};
And use it by either the implicit conversion to T& or by the 'dereference' operator:
std::vector<int> vec{0, 1, 2, 3, 4, 5};
int& ref = vec[0];
auto itr = vec.begin();
HelperClass<int> hlp = HelperClass<int>(vec, 0); // HelperClass
cout << ref << " " << *itr << " " << hlp << " " << *hlp << endl;
vec[0] = 7;
cout << ref << " " << *itr << " " << hlp << " " << *hlp << endl;
vec.resize(100);
vec[0] = 3;
cout << ref << " " << *itr << " " << hlp << " " << *hlp << endl;
Which prints what was expected:
0 0 0 0
7 7 7 7
0 0 3 3
So is there a better way to do this aside from having a helper class and can the helper class be unreliable in some cases?
I've also come across this thread in reddit, but it seems that they do not discuss the helper class there
The one thing you could do is have a vector of pointers rather than a vector of instances. That of course has its own passel of issues but if you must have object references survive a vector resize that will do it.
Any reallocation of the vector will invalidate any pointers, references and iterators.
In your example, your HelperClass is useless in the sense that this:
cout << ref << " " << *itr << " " << hlp << " " << *hlp << endl;
is the same as:
cout << ref << " " << *itr << " " << vec[0] << " " << vec[0] << endl;
If a reallocation happens, just call .begin() and .end() again to obtain fresh, valid iterators.
The code works as expected until lines 22-24, where we print 8 followed by the address. Incrementing the pointer there increases the address by only one byte, whereas it should increase by 4 bytes. The problem does not occur with arrays, or if lines 22-24 are run separately.
#include<iostream>
using namespace std;
void main()
{
int *p;
//int a[10] = { 0 };
//p = a;
int a = 100;
p=&a;
cout << "1. "<<p <<" "<<*p<< endl;
p++;
cout << "2. " << p << " " << *p << endl;
++p;
cout << "3. " << p << " " << *p << endl;
++*p;
cout << "4. " << p << " " << *p << endl;
++(*p);
cout << "5. " << p << " " << *p << endl;
++*(p);
cout << "6. " << p << " " << *p << endl;
*p++;
cout << "7. " << p << " " << *p << endl;
(*p)++; //This is the problem, increments the address by 1, even though its in int type
cout << "8. " << p << " " << *p << endl;
*(p)++;
cout << "9. " << p << " " << *p << endl;
*++p;
cout << "10. " << p << " " << *p << endl;
*(++p);
cout << "11. " << p << " " << *p << endl;
cin.get();
}
Initially you set p to point to an integer variable on the stack. When you subsequently increment the pointer, it points to an area of stack memory that is likely to change when a function is called (cout for example). When the function returns, it will probably have changed the memory that your incremented pointer p points to, and this probably explains your issue.
You should declare an array large enough to accommodate the range of pointer addresses that you are going to step through. I notice that you commented out the array code which would have worked as you expected.
Your code is:
p = &a;
p++;
Now p is pointing past the end of a. This is still OK, however on the next line:
cout << "2. " << p << " " << *p << endl;
when you write *p this tries to read the memory past the end of a, which causes undefined behaviour.
When undefined behaviour has happened, the definition of the C++ language no longer covers what the program does. Anything can happen.
To put it another way: when generating the executable, the compiler can make assumptions based on the premise that your program only does things which are well-defined.
Your output could perhaps be explained by the compiler making such an assumption, which would be justified if your program were conforming, but is actually false because your program is invalid.
One explanation that comes to mind is that you advanced p until it happened to point to the memory location in which p itself is stored. The compiler implements (*p)++ by emitting an instruction that increments the int stored at the location where p is pointing. On your system, the result of applying this instruction to the location where p is actually stored is to increase the address value of p by one.
I'm struggling a little bit with my understanding concerning the actual mechanism in dereferencing pointers (what the compiler actually does).
I read a lot through Google and here on Stack Overflow, but I couldn't quite get it yet :-(
I wrote a simple program with multiple pointers:
#include <iostream>
int main()
{
int a = 5;
int* ptr = &a;
int** pptr = &ptr;
int*** ppptr = &pptr;
int**** p4tr = &ppptr;
std::cout << "a = 5 \nint*ptr = &a \nint** pptr = *ptr\nint*** ppptr = &pptr\nint**** p4tr= &ppptr\n" << std::endl;
std::cout << "a: " << a << std::endl;
std::cout << "&a: " << &a << std::endl << std::endl;
std::cout << "ptr: " << ptr << std::endl;
std::cout << "*ptr: " << *ptr << std::endl;
std::cout << "&ptr: " << &ptr << std::endl << std::endl;
std::cout << "pptr: " << pptr << std::endl;
std::cout << "*ptr: " << *pptr << std::endl;
std::cout << "**pptr: "<< **pptr << std::endl;
std::cout << "&pptr: " << &pptr << std::endl << std::endl;
std::cout << "ppptr: " << ppptr << std::endl;
std::cout << "*ppptr: " << *ppptr << std::endl;
std::cout << "**pptr: " << **ppptr << std::endl;
std::cout << "***pptr: " << ***ppptr << std::endl;
std::cout<< "&pptr: " << &ppptr << std::endl << std::endl;
std::cout << "p4tr: " << p4tr<< std::endl;
std::cout << "*p4tr: " << *p4tr<< std::endl;
std::cout << "**p4tr: " << **p4tr<< std::endl;
std::cout << "***p4tr: " << ***p4tr<< std::endl;
std::cout << "****p4tr: " << ****p4tr<< std::endl;
std::cout << "&p4tr: " << &p4tr<< std::endl << std::endl;
return 0;
}
Which gives me this on my machine:
a = 5
int*ptr = &a
int** pptr = *ptr
int*** ppptr = &pptr
int**** p4tr= &ppptr
a: 5
&a: 0x7fffe4db870c
ptr: 0x7fffe4db870c
*ptr: 5
&ptr: 0x7fffe4db8700
pptr: 0x7fffe4db8700
*ptr: 0x7fffe4db870c
**pptr: 5
&pptr: 0x7fffe4db86f8
ppptr: 0x7fffe4db86f8
*ppptr: 0x7fffe4db8700
**pptr: 0x7fffe4db870c
***pptr: 5
&pptr: 0x7fffe4db86f0
p4tr: 0x7fffe4db86f0
*p4tr: 0x7fffe4db86f8
**p4tr: 0x7fffe4db8700
***p4tr: 0x7fffe4db870c
****p4tr: 5
&p4tr: 0x7fffe4db86e8
What I figured out, how dereference works is:
int* ptr = &a; tells the compiler that the "variable" ptr is of type "int*" (pointer to an integer; i.e. memory address)
Hence, when I write *ptr, the compiler takes the value of ptr, treats it as an address and interprets what is stored at that address as type int.
So far, so good.
But what does int** pptr = &ptr actually mean to the compiler ?
Does it mean pptr is of type " int** " ?
or does it still mean pptr is of type " int* " (i mean, &ptr is as good a memory address as &a)
What does the second asterisk actually mean to the compiler, and why can't I write: "int* pptr = &ptr"
(at least g++ won't let me do that)
thank you so much for your efforts,
It hurts my brain when things seem illogical to me :-))
But what does int** pptr = &ptr actually mean to the compiler? Does it mean pptr is of type int**?
Yes, pptr is of type int**. And int** is pointer to pointer to int. So *pptr has type int*, and **pptr has type int.
Why can't I write: int* pptr = &ptr?
Well, ptr has type int*. And so &ptr has type int** which is not assignment compatible with a variable of type int*. A pointer to int is a different type of thing to a pointer to pointer to int.