I am trying to understand std::set::find(). Below is my code.
std::set::find searches the container for an element equivalent to
val and returns an iterator to it if found, otherwise it returns an
iterator to set::end.
But when I call find(100) and print the result, I get 7 rather than 20.
#include <iostream>
#include <set>
using namespace std;
int main()
{
set <int> s1{20, 7, 2};
s1.insert(10);
s1.insert(5);
s1.insert(15);
s1.insert(1);
cout << "size() : " << s1.size() << endl;
cout << "max_size() : " << s1.max_size() << endl;
cout << "empty() : " << s1.empty() << endl;
for(auto itr = s1.begin(); itr != s1.end(); itr++)
cout << *itr << " ";
cout << endl;
cout << endl << "---- find(value) ----" << endl;
auto a1 = s1.find(10);
//cout << "find(10) : " << a1 << " " << *a1 << endl;
cout << "find(10) : " << *a1 << endl;
auto a2 = s1.find(100);
cout << "find(100) : " << *a2 << endl;
cout << endl << "---- count(value) ----" << endl;
cout << "s1.count(10) : " << s1.count(10) << endl;
cout << "s1.count(100) : " << s1.count(100) << endl;
return 0;
}
Output:
size() : 7
max_size() : 107374182
empty() : 0
1 2 5 7 10 15 20
---- find(value) ----
find(10) : 10
find(100) : 7
---- count(value) ----
s1.count(10) : 1
s1.count(100) : 0
The problem is that you're dereferencing the iterator a2, which is equal to s1.end(), and that leads to undefined behavior. It happens because you don't check whether the element was found before dereferencing the iterator.
To fix this, add an explicit check before dereferencing the iterator.
//dereference only if the element was found
if(a2!=s1.end())
{
std::cout << "find(100) : " << *a2 << std::endl;
}
//otherwise print a message saying element not found
else
{
std::cout<<"element not found"<<std::endl;
}
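As a minimal self-contained sketch of the same check, the lookup can be wrapped in a small function. The name describe_find is hypothetical and only used for illustration; the point is that the iterator is compared against end() before it is dereferenced.

```cpp
#include <set>
#include <string>

// Hypothetical helper (not part of the standard library): reports the result
// of s.find(value) without ever dereferencing the end iterator.
std::string describe_find(const std::set<int>& s, int value)
{
    auto it = s.find(value);
    if (it == s.end()) {
        // find() returned end(): the element is absent, so *it must not be used
        return "not found";
    }
    // Safe: it refers to a real element of the set
    return "found " + std::to_string(*it);
}
```

With the set from the question, describe_find(s1, 10) would give "found 10" and describe_find(s1, 100) would give "not found".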
auto a2 = s1.find(100);
cout << "find(100) : " << *a2 << endl;
Here you dereference (*a2) the end iterator. That is undefined behaviour - remember that s1.end() points to one past the last element and must not be dereferenced.
You're unlucky that you got a value from that dereference - it would be more convenient if your program crashed or otherwise reported the problem. But UB doesn't have to be diagnosed in any way.
You might have spotted the problem if you had run your program under Valgrind's memory checker (or your preferred equivalent). But there's a good chance that it would be unable to detect it, because the end iterator of a set often refers to memory that the set legitimately owns.
The value 100 is not present in the set. So this call
auto a2 = s1.find(100);
returns the iterator s1.end(). You may not dereference the iterator. This statement
cout << "find(100) : " << *a2 << endl;
invokes undefined behavior.
I wrote this simple c++ program and I got some strange results that I don't understand (results are described in the line comments)
int arr[3] {1, 2, 3};
int* p{ nullptr };
p = arr;
std::cout << p[0] << " " << p[1] << " " << p[2]; // prints 1 2 3, OK
p = arr;
std::cout << *(p++) << " " << *(p++) << " " << *(p); // prints 2 1 3 ??
p = arr;
std::cout << *p << " " << *(++p) << " " << *(++p); // prints 3 3 3 ??
p = arr;
std::cout << *p << " "; ++p;
std::cout << *p << " "; ++p;
std::cout << *p; // prints 1 2 3, OK
It seems that pointer increments inside a chained std::cout expression don't work.
What's wrong with my idea?
I assumed it would work.
best
final edit: I was using c++14, I switched to c++20 and now it works properly
thank you everybody!
int* p{ nullptr };
std::cout << p[0] << " " << p[1] << " " << p[2];
This is Undefined Behavior: you are dereferencing nullptr, because p does not point at valid memory yet.
p = arr;
std::cout << p[0] << " " << p[1] << " " << p[2];
This is well-defined behavior. p points at valid memory and is never modified; each index only reads at a fixed, valid offset from p. This is the same as if you had written the following instead:
std::cout << *(p+0) << " " << *(p+1) << " " << *(p+2);
p = arr;
std::cout << *(p++) << " " << *(p++) << " " << *(p);
p = arr;
std::cout << *p << " " << *(++p) << " " << *(++p);
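Both of these are Undefined Behavior prior to C++17. The operator<< calls themselves run left to right, but in earlier versions the order in which their operands are evaluated is not specified, so the modifications of p are unsequenced relative to the other reads and writes of p in the same expression. From C++17 onward the operands of << are evaluated left to right, and both statements are well-defined.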
p = arr;
std::cout << *p << " "; ++p;
std::cout << *p << " "; ++p;
std::cout << *p;
This is well-defined behavior. p points at valid memory, is always dereferenced before incremented, and is incremented in a deterministic and valid manner.
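A sketch of another form that is well-defined in every standard version: a loop that performs exactly one increment per iteration. The helper name join_with_spaces is an assumption made for illustration, and it writes to a string stream rather than std::cout so the result can be inspected.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: walks [first, first + count) with one increment per
// iteration, so each read of *p is sequenced separately from each ++p.
std::string join_with_spaces(const int* first, std::size_t count)
{
    std::ostringstream out;
    const int* last = first + count;   // one past the end, never dereferenced
    for (const int* p = first; p != last; ++p) {
        if (p != first) {
            out << ' ';                // separator between elements only
        }
        out << *p;                     // p is only read in this statement
    }
    return out.str();
}
```

For int arr[3] {1, 2, 3}, join_with_spaces(arr, 3) yields "1 2 3" regardless of whether the code is built as C++14 or C++20.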
I was playing around with pointers and got results I did not expect:
#include <iostream>
#include <vector>
int main() {
int arr[4] = { 1, 2, 3, 4 };
int* pArr = arr;
std::cout << "First number: " << *pArr << " at address: " << pArr;
pArr++;
std::cout << "\nSecond number: " << *pArr << " at address: " << pArr;
pArr++;
std::cout << "\nThird number: " << *pArr << " at address: " << pArr;
pArr++;
std::cout << "\nFourth number: " << *pArr << " at address: " << pArr;
int* pArr2 = arr;
std::cout << "\n"
<< *pArr2++ << "\n"
<< *pArr2++ << "\n"
<< *pArr2++ << "\n"
<< *pArr2++ << "\n";
/*
int* pArr2 = arr;
std::cout << "\n"
<< ++ * pArr2 << "\n"
<< * ++pArr2 << "\n";
*/
}
The two different results:
1 2 3 4 - as expected, using the first method
4 3 2 1 - using a single cout statement with several chained << operators (I do not know the proper name).
So my question is: why does this happen? Using multiple cout statements gives the output I expect, while using just one cout statement prints the values backwards.
As a side note, another thing that confuses me is that pre-increment results in all values being equal. In the commented-out bit of code, the result is 3 3, no matter where the ++ is placed with respect to the *.
This code:
std::cout << "\n" << *pArr2++ << "\n";
std::cout << "\n" << *pArr2++ << "\n";
has a well-defined order of modifications to pArr2 and will print
1
2
But this code:
std::cout << "\n" << *pArr2++ << "\n" << *pArr2++ << "\n";
invokes undefined behavior before C++17, because the two modifications of pArr2 are unsequenced relative to each other. The program has UB, so it could print anything.
From C++17, the operands of a chained << are evaluated left to right, so the modifications are sequenced one after the other and the above code is guaranteed to print:
1
2
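If the code has to behave the same before and after C++17, one option is to read the values into locals first, so each increment sits in its own full expression. A minimal sketch, assuming a hypothetical helper named first_two_lines that returns the text instead of printing it:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: reads two consecutive ints through a pointer, with each
// post-increment in its own statement, then streams the copies.
std::string first_two_lines(const int* pArr2)
{
    const int first = *pArr2++;    // reads the current element, then advances
    const int second = *pArr2++;   // sequenced after the statement above

    std::ostringstream out;
    out << first << "\n" << second << "\n";  // no modification of pArr2 here
    return out.str();
}
```

The caller must pass a pointer to at least two valid ints. For int arr[4] = { 1, 2, 3, 4 }, first_two_lines(arr) produces "1\n2\n".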
In the code below I declared a vector as {1,2,3,4,5}.
Using the STL std::find(), I am trying to find 5 in the vector in the range from arr.begin() to arr.end()-1, or from arr.begin() to arr.begin()+4, which is the same range covering 1 to 4.
But in both cases the returned iterator points to 5. Why is that, since the range only covers 1 to 4?
#include <iostream>
#include <vector>
#include <array>
#include <algorithm>
using namespace std;
int main () {
vector<int> arr {1,2,3,4,5};
// TEST
for_each(arr.begin(), arr.begin()+4, [](const int &x) { cerr << x << " "; }); cerr << endl;
for_each(arr.begin(), arr.end()-1, [](const int &x) { cerr << x << " "; }); cerr << endl;
auto it1 {std::find(arr.begin(), arr.begin()+4, 5)};
auto it2 {std::find(arr.begin(), arr.end()-1, 5)};
if (it1 != arr.end())
cout << *it1 << " Found!" << endl;
else
cout << "NOT Found!" << endl;
if (it2 != arr.end())
cout << *it2 << " Found!" << endl;
else
cout << "NOT Found!" << endl;
return 0;
}
OUTPUT:
1 2 3 4
1 2 3 4
5 Found!
5 Found!
std::find just returns the iterator passed as the 2nd argument when the element is not found. So in your code it returns arr.begin()+4 and arr.end()-1 respectively, both of which point to the element 5.
You should compare the result with the end iterator you passed to std::find, not with arr.end(), e.g.
if (it1 != arr.begin()+4)
cout << *it1 << " Found!" << endl;
else
cout << "NOT Found!" << endl;
if (it2 != arr.end()-1)
cout << *it2 << " Found!" << endl;
else
cout << "NOT Found!" << endl;
This is because when std::find does not find the requested value (as happens here), it returns the end iterator you gave it (not the end iterator of the full vector), which in this case happens to point to the element you are looking for.
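One way to make this mistake harder to repeat is to keep the search and the comparison together, so the result is always compared with the same end iterator that was passed in. The following is a sketch with hypothetical helper names (found_in_range and found_before_last), not a standard facility:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: true if value occurs in [first, last).
// The result of std::find is compared with last, the end of the searched
// range, never with the end of the whole container.
template <typename Iterator, typename T>
bool found_in_range(Iterator first, Iterator last, const T& value)
{
    return std::find(first, last, value) != last;
}

// Hypothetical convenience wrapper: searches every element except the last one.
bool found_before_last(const std::vector<int>& v, int value)
{
    if (v.empty()) {
        return false;              // v.end() - 1 would be invalid here
    }
    return found_in_range(v.begin(), v.end() - 1, value);
}
```

For the vector {1,2,3,4,5} from the question, found_before_last(arr, 5) is false while found_in_range(arr.begin(), arr.end(), 5) is true.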
If we take a reference to a vector element and then resize the vector so that it reallocates, the reference is no longer valid. The same happens with an iterator:
std::vector<int> vec{0, 1, 2, 3, 4, 5};
int& ref = vec[0];
auto itr = vec.begin();
cout << ref << " " << *itr << endl;
vec[0] = 7;
cout << ref << " " << *itr << endl;
vec.resize(100);
vec[0] = 3;
cout << ref << " " << *itr << endl;
Prints out:
0 0
7 7
0 0 // We expected a 3 here
And I know that it would be more practical to just keep a reference to the vector itself and call vec[0], but just for the sake of questioning, is it possible to keep an object that will always be vec[0] even if the object is moved?
I've tried writing a small helper class to help with this, but I'm unsure if this is the best method or if it can even fail?
template<typename T>
struct HelperClass
{
std::vector<T>& vec;
size_t element;
HelperClass(std::vector<T>& vec_, size_t element_) : vec(vec_) , element(element_) {}
// Either define an implicit conversion from HelperClass to T
// or a 'dereference' operator that returns vec.at(element)
operator T&() { return vec.at(element); }
T& operator*() { return vec.at(element); }
};
And use it by either the implicit conversion to T& or by the 'dereference' operator:
std::vector<int> vec{0, 1, 2, 3, 4, 5};
int& ref = vec[0];
auto itr = vec.begin();
HelperClass<int> hlp = HelperClass<int>(vec, 0); // HelperClass
cout << ref << " " << *itr << " " << hlp << " " << *hlp << endl;
vec[0] = 7;
cout << ref << " " << *itr << " " << hlp << " " << *hlp << endl;
vec.resize(100);
vec[0] = 3;
cout << ref << " " << *itr << " " << hlp << " " << *hlp << endl;
Which already prints what was expected:
0 0 0 0
7 7 7 7
0 0 3 3
So is there a better way to do this aside from having a helper class and can the helper class be unreliable in some cases?
I've also come across this thread in reddit, but it seems that they do not discuss the helper class there
The one thing you could do is have a vector of pointers rather than a vector of instances. That of course has its own passel of issues but if you must have object references survive a vector resize that will do it.
Any reallocation of the vector will invalidate any pointers, references and iterators.
In your example, your HelperClass is useless in the sense that this:
cout << ref << " " << *itr << " " << hlp << " " << *hlp << endl;
is the same as:
cout << ref << " " << *itr << " " << vec[0] << " " << vec[0] << endl;
If a reallocation happens, just call .begin() and .end() again to obtain fresh, valid iterators.
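A minimal sketch of that advice, assuming a hypothetical function first_after_resize: it keeps only an index across the resize and fetches a new iterator afterwards, instead of holding on to a reference or iterator taken earlier.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: resizes the vector, writes to element 0 and reads it
// back through an iterator obtained after the resize.
// Assumes new_size > 0 so that element 0 exists.
int first_after_resize(std::vector<int>& vec, std::size_t new_size, int new_value)
{
    const std::size_t index = 0;   // an index stays meaningful across reallocation

    vec.resize(new_size);          // may reallocate: older references/iterators are invalid
    vec[index] = new_value;

    auto itr = vec.begin();        // fetched after the resize, so it is valid
    return *itr;
}
```

Starting from {0, 1, 2, 3, 4, 5}, first_after_resize(vec, 100, 3) returns 3, which is the value the question expected to see.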
The code works as expected until lines 22-24, where we print "8." followed by the address. There the address appears to change by one byte only, whereas incrementing an int pointer should move the address by 4 bytes. The problem does not occur with arrays or if lines 22-24 are run separately.
#include<iostream>
using namespace std;
void main()
{
int *p;
//int a[10] = { 0 };
//p = a;
int a = 100;
p=&a;
cout << "1. "<<p <<" "<<*p<< endl;
p++;
cout << "2. " << p << " " << *p << endl;
++p;
cout << "3. " << p << " " << *p << endl;
++*p;
cout << "4. " << p << " " << *p << endl;
++(*p);
cout << "5. " << p << " " << *p << endl;
++*(p);
cout << "6. " << p << " " << *p << endl;
*p++;
cout << "7. " << p << " " << *p << endl;
(*p)++; //This is the problem, increments the address by 1, even though its in int type
cout << "8. " << p << " " << *p << endl;
*(p)++;
cout << "9. " << p << " " << *p << endl;
*++p;
cout << "10. " << p << " " << *p << endl;
*(++p);
cout << "11. " << p << " " << *p << endl;
cin.get();
}
Initially you set p to point to an integer variable on the stack. When you subsequently increment the pointer, it points to an area of the stack that is likely to change when a function is called (cout's operator<< for example). When the function returns, it will probably have changed the memory that your incremented pointer p is pointing to, and this probably explains your issue.
You should declare an array large enough to accommodate the range of pointer addresses that you are going to step through. I notice that you commented out the array code which would have worked as you expected.
Your code is:
p = &a;
p++;
Now p is pointing past the end of a. This is still OK, however on the next line:
cout << "2. " << p << " " << *p << endl;
when you write *p this tries to read the memory past the end of a, which causes undefined behaviour.
When undefined behaviour has happened, the definition of the C++ language no longer covers what the program does. Anything can happen.
To put it another way: when generating the executable, the compiler can make assumptions based on the premise that your program only does things which are well-defined.
Your output could perhaps be explained by the compiler making such an assumption, which would be justified if your program were conforming, but is actually false because your program is invalid.
One explanation that comes to mind is that you advanced p until it happened to point to the memory location in which p itself is stored. The compiler implements (*p)++ by emitting an instruction that increments an int stored at the location p points to. On your system, applying this instruction to the location where p itself is stored increases the address held in p by one.
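For comparison, here is a sketch of the two operations on an array, where every dereference stays in bounds. The struct Outcome and both function names are hypothetical and exist only to make the effect observable: (*p)++ changes the pointed-to int and leaves the address alone, while *p++ advances the pointer by one whole int.

```cpp
#include <cstddef>

// Hypothetical result type used only for this illustration.
struct Outcome {
    std::ptrdiff_t pointer_offset;  // how many ints p moved from the array start
    int first_element;              // value of a[0] afterwards
};

Outcome increment_pointee()
{
    int a[3] = {100, 200, 300};
    int* p = a;
    (*p)++;                         // increments a[0]; p itself does not move
    return {p - a, a[0]};
}

Outcome increment_pointer()
{
    int a[3] = {100, 200, 300};
    int* p = a;
    (void)*p++;                     // parsed as *(p++): p advances, the read value is discarded
    return {p - a, a[0]};
}
```

increment_pointee() leaves the offset at 0 and changes a[0] to 101, whereas increment_pointer() moves p by one element and leaves a[0] at 100.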