vector<bool>::operator[] misbehavior? [duplicate] - c++

Possible Duplicate:
Why vector<bool>::reference doesn’t return reference to bool?
I used to think that with std::vector::operator[] we get deep copies of the accessed item, but it seems that it is not always true. At least, with vector<bool> the following test code gives a different result:
#include <iostream>
#include <vector>
using namespace std;
template <typename T>
void Test(const T& oldValue, const T& newValue, const char* message)
{
cout << message << '\n';
vector<T> v;
v.push_back(oldValue);
cout << " before: v[0] = " << v[0] << '\n';
// Should be a deep-copy (?)
auto x = v[0];
x = newValue;
cout << " after: v[0] = " << v[0] << '\n';
cout << "-------------------------------\n";
}
int main()
{
Test<int>(10, 20, "Testing vector<int>");
Test<double>(3.14, 6.28, "Testing vector<double>");
Test<bool>(true, false, "Testing vector<bool>");
}
Output (source code compiled with VC10/VS2010 SP1):
Testing vector<int>
before: v[0] = 10
after: v[0] = 10
-------------------------------
Testing vector<double>
before: v[0] = 3.14
after: v[0] = 3.14
-------------------------------
Testing vector<bool>
before: v[0] = 1
after: v[0] = 0
-------------------------------
I would have expected that v[0] after the x = newValue assignment would still be equal to its previous value, but this seems not true.
Why is that?
Why is vector<bool> special?

vector<bool> is a hideous abomination and special. The Committee specialized it to pack bits, so it cannot support proper reference semantics: you cannot form a reference to a single bit. As a result it has a non-conforming interface and does not actually qualify as a Standard Container. The solution that most people use is simply to never, ever, use vector<bool>.

vector<bool>::operator[] neither yields a bool nor a reference to a bool. It just returns a little proxy object that acts like a reference. This is because there are no references to single bits and vector<bool> actually stores the bools in a compressed way. So by using auto you just created a copy of that reference-like object. The problem is that C++ does not know that this object acts as a reference. You have to force the "decay to a value" here by replacing auto with T.
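A minimal sketch of the difference between copying through auto and copying through T, assuming C++11 or later. The helper names first_after_auto_copy and first_after_value_copy are made up for illustration and are not part of the question's code.

```cpp
#include <vector>

// Hypothetical helper: copy v[0] with `auto`, overwrite the copy,
// then report what v[0] holds afterwards.
template <typename T>
T first_after_auto_copy(std::vector<T> v, const T& newValue) {
    auto x = v[0];   // T for most types; a proxy object for vector<bool>
    x = newValue;    // for vector<bool> this writes through to the stored bit
    return v[0];
}

// Same experiment, but naming T forces the proxy to convert to a plain value.
template <typename T>
T first_after_value_copy(std::vector<T> v, const T& newValue) {
    T x = v[0];      // always an independent copy
    x = newValue;    // only the local copy changes
    return v[0];
}
```

With these helpers, first_after_auto_copy<int>({10}, 20) returns 10, first_after_auto_copy<bool>({true}, false) returns false, and first_after_value_copy<bool>({true}, false) returns true.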

operator[] returns a T& for every value of T except for bool, where it gives a reference proxy. See this old column by Herb Sutter on why using vector<bool> in generic code is a bad idea (and why it is not even a container). There is also a special Item about it in Effective STL by Scott Meyers, and tons of questions on it here at SO.

Related

Is it Safe to Assign vector With operator= vector() in C++

Is it safe to assign to a vector later, after it has been initialized?
Let's say I have a global vector variable, but I don't want to give it its values at the beginning.
#include <iostream>
#include <vector>
using namespace std;
vector<int> globalVector;
int myNumber=123;
void setVector()
{
// Is it safe to set the vector as shown below ?
globalVector = vector<int>{1,2,3,4};
}
int main(int, char**) {
setVector();
for (int x=0; x<globalVector.size();x++)
{
cout << "Val = " << globalVector[x] << endl;
}
std::cout << "Hello, world! : " << myNumber << endl;
return 0;
}
In VS Code I can see this information:
std::vector<int> &std::vector<int>::operator=(std::vector<int> &&__x)
+2 overloads
%Vector move assignment operator.
Parameters:
__x – A %vector of identical element and allocator types. The contents of __x are moved into this %vector (without copying, if the allocators permit it). Afterwards __x is a valid, but unspecified %vector. Whether the allocator is moved depends on the allocator traits.
The description says "move without copying". Will globalVector be corrupted when the program exits from the function setVector?
Yes that is safe, although
The notation globalVector = {1,2,3,4}; is clearer.
It's not thread-safe.
Use globalVector.at(x) rather than globalVector[x] unless performance really matters, as the behaviour of the latter can be undefined for some values of x. In this particular case, a range-for loop would be better still: for (auto&& i: globalVector).
The description says "move without copying". Will globalVector be corrupted when the program exits from the function setVector?
It means that operator= moved the contents of the temporary vector into globalVector without copying the elements. The temporary is left in a valid but unspecified state after that operation. That doesn't affect you, since the temporary exists only for the assignment and is destroyed right after it.
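The following sketch illustrates that point, assuming C++11 or later. The names target, assign_from_temporary and moved_from_is_reusable are hypothetical; target merely stands in for globalVector.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

std::vector<int> target;  // hypothetical stand-in for globalVector

// Move-assign from a temporary, as setVector() does in the question.
std::size_t assign_from_temporary() {
    target = std::vector<int>{1, 2, 3, 4};
    return target.size();   // the temporary is already destroyed here
}

// Move-assign from a named vector, then keep using the moved-from object.
bool moved_from_is_reusable() {
    std::vector<int> source{5, 6, 7};
    target = std::move(source);
    source.clear();         // fine: clear() has no preconditions
    source.push_back(42);   // the moved-from vector is usable again
    return target.size() == 3 && source.size() == 1 && source[0] == 42;
}
```

The target keeps its elements after the function returns because it owns them; only the source of the move is left in an unspecified (but valid) state.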
Yes, it is totally safe, but I recommend not using a global vector.
Instead, write:
#include <cstddef>
#include <iostream>
#include <vector>
int main() {
    const int myNumber = 123;
    // A local vector, initialized directly where it is needed
    std::vector<int> localVector{1, 2, 3, 4};
    for (std::size_t i = 0; i < localVector.size(); i++) {
        std::cout << "Val = " << localVector.at(i) << '\n';
    }
    std::cout << "Hello, world! : " << myNumber << '\n';
    return 0;
}

is there a cleaner way to write operator[]() for a vector? [duplicate]

If I define a pointer to an object that defines the [] operator, is there a direct way to access this operator from a pointer?
For example, in the following code I can directly access Vec's member functions (such as empty()) by using the pointer's -> operator, but if I want to access the [] operator I need to first get a reference to the object and then call the operator.
#include <vector>
int main(int argc, char *argv[])
{
std::vector<int> Vec(1,1);
std::vector<int>* VecPtr = &Vec;
if(!VecPtr->empty()) // this is fine
return (*VecPtr)[0]; // is there some sort of ->[] operator I could use?
return 0;
}
I might very well be wrong, but it looks like doing (*VecPtr).empty() is less efficient than doing VecPtr->empty(). Which is why I was looking for an alternative to (*VecPtr)[].
You could do any of the following:
#include <vector>
int main () {
std::vector<int> v(1,1);
std::vector<int>* p = &v;
p->operator[](0);
(*p)[0];
p[0][0];
}
By the way, in the particular case of std::vector, you might also choose: p->at(0), even though it has a slightly different meaning.
return VecPtr->operator[](0);
...will do the trick. But really, the (*VecPtr)[0] form looks nicer, doesn't it?
(*VecPtr)[0] is perfectly OK, but you can use the at function if you want:
VecPtr->at(0);
Keep in mind that this (unlike operator[]) will throw an std::out_of_range exception if the index is not in range.
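A small sketch of the checked and unchecked spellings side by side. The helpers at_throws and spellings_agree are invented for this illustration; they take the vector by value only to keep the example self-contained.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: true if VecPtr->at(index) throws std::out_of_range.
bool at_throws(std::vector<int> vec, std::size_t index) {
    std::vector<int>* VecPtr = &vec;
    try {
        (void)VecPtr->at(index);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// All three spellings reach the same element when the index is valid.
bool spellings_agree(std::vector<int> vec, std::size_t index) {
    std::vector<int>* VecPtr = &vec;
    return (*VecPtr)[index] == VecPtr->operator[](index)
        && (*VecPtr)[index] == VecPtr->at(index);
}
```

For a one-element vector, at_throws({7}, 0) is false and at_throws({7}, 1) is true, whereas (*VecPtr)[1] on the same vector would be undefined behavior rather than an exception.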
There's another way, you can use a reference to the object:
#include <iostream>
#include <vector>
using namespace std;
int main()
{
vector<int> v = {7};
vector<int> *p = &v;
// Reference to the vector
vector<int> &r = *p;
cout << (*p)[0] << '\n'; // Prints 7
cout << r[0] << '\n'; // Prints 7
return 0;
}
This way, r is the same as v and you can substitute all occurrences of (*p) by r.
Caveat: This will only work if you won't modify the pointer (i.e. change which object it points to).
Consider the following:
#include <iostream>
#include <vector>
using namespace std;
int main()
{
vector<int> v = {7};
vector<int> *p = &v;
// Reference to the vector
vector<int> &r = *p;
cout << (*p)[0] << '\n'; // Prints 7
cout << r[0] << '\n'; // Prints 7
// Caveat: When you change p, r is still the old *p (i.e. v)
vector<int> u = {3};
p = &u; // Doesn't change who r references
//r = u; // Wrong, see below why
cout << (*p)[0] << '\n'; // Prints 3
cout << r[0] << '\n'; // Prints 7
return 0;
}
r = u; is wrong because you can't reseat a reference: the assignment would modify the vector referenced by r (v) instead of making r refer to another vector (u).
So, again, this only works if the pointer won't change while still using the reference.
The examples need C++11 only because of vector<int> ... = {...};
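To make the r = u point concrete, here is a hedged sketch. The function assignment_copies_into_referent is a made-up name; it binds r to v, performs the assignment, and reports whether r still refers to v while v now holds u's elements.

```cpp
#include <vector>

// Hypothetical helper: bind r to v, run `r = u`, and report what happened.
// Returns true if r still refers to v and v now holds a copy of u's elements.
bool assignment_copies_into_referent(std::vector<int> v, const std::vector<int>& u) {
    std::vector<int>& r = v;
    r = u;                       // copy assignment into v; r is not reseated
    return &r == &v && &r != &u && v == u;
}
```

Calling assignment_copies_into_referent({7}, {3}) returns true: the original contents of v are overwritten, and r never starts referring to u.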
You can use it as VecPtr->operator[](0), but I'm not sure you'll find it less obscure.
It is worth noting that in C++11 std::vector has a member function 'data' that returns a pointer to the underlying array (both const and non-const versions), allowing you to write the following:
VecPtr->data()[0];
This might be an alternative to
VecPtr->at(0);
which incurs a small runtime overhead, but more importantly its use implies you aren't checking the index for validity before calling it, which is not true in your particular example.
See std::vector::data for more details.
People are advising you to use ->at(0) because of range checking. But here is my advice (from another point of view):
NEVER use ->at(0)! It is really slower. Would you sacrifice performance just because you are lazy enough to not check range by yourself? If so, you should not be programming in C++.
I think (*VecPtr)[0] is ok.

Weird behavior of reference to vector.back() after vector is modified

Let's start with this sample code in C++:
#include <vector>
#include <iostream>
int main()
{
std::vector<int> vec;
vec.push_back(0);
for (int i = 1; i < 5; i++)
{
const auto &x = vec.back();
std::cout << "Before: " << x << ", ";
vec.push_back(i);
std::cout << "After: " << x << std::endl;
}
return 0;
}
The code is compiled with g++ test.cc -std=c++11 -O0 and below is the result:
Before: 0, After: 0
Before: 1, After: 0
Before: 2, After: 2
Before: 3, After: 3
I was expecting the second line of output to be
Before: 1, After: 1
since x is a reference to an item in the vector, which shall not be modified by appending items to the vector.
However, I haven't read the disassembled code or done any other investigation yet. I also don't know whether this is undefined behavior according to the language standard.
I want this to be explained. Thanks.
push_back can cause reallocation. The draft C++ standard, section 23.3.6.5 (vector modifiers), says:
void push_back(const T& x);
void push_back(T&& x);
Remarks: Causes reallocation if the new size is greater than the old capacity. If no reallocation happens, all the iterators and references before the insertion point remain valid.
back() gives us a reference, so if there is a reallocation that reference is no longer valid, and reading through it afterwards is undefined behavior.
vector iterators (and previous references to elements) can possibly be invalidated after the vector is modified. Using them is unsafe.
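Two common ways to stay clear of the dangling reference, sketched under the assumption that the vector is non-empty. The function names are hypothetical and the vector is passed by value only to keep the example self-contained.

```cpp
#include <vector>

// Option 1: copy the value before modifying the vector.
// Assumes vec is non-empty.
int last_by_value(std::vector<int> vec, int appended) {
    const int last = vec.back();   // independent copy
    vec.push_back(appended);       // may reallocate; `last` is unaffected
    return last;
}

// Option 2: reserve first so that push_back cannot reallocate.
// Assumes vec is non-empty.
int last_by_reference_after_reserve(std::vector<int> vec, int appended) {
    vec.reserve(vec.size() + 1);   // capacity now covers the new element
    const int& last = vec.back();  // taken after reserve, which may itself reallocate
    vec.push_back(appended);       // no reallocation, so `last` stays valid
    return last;
}
```

Both return the element that was last before the push_back. The first is usually the simpler choice for cheap-to-copy element types.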

How to insert a duplicate element into a vector?

I'm trying to insert a copy of an existing vector element to double it up. The following code worked in previous versions but fails in Visual Studio 2010.
#include <iostream>
#include <vector>
using namespace std;
int main(int argc, char* argv[])
{
vector<int> test;
test.push_back(1);
test.push_back(2);
test.insert(test.begin(), test[0]);
cout << test[0] << " " << test[1] << " " << test[2] << endl;
return 0;
}
Output is -17891602 1 2, expected 1 1 2.
I've figured out why it's happening - the vector is being reallocated, and the reference becomes invalid before it's copied to the insertion point. The older Visual Studio apparently did things in a different order, thus proving that one possible outcome of undefined behavior is to work correctly and also proving that it's never something you should rely on.
I've come up with two different ways to fix this problem. One is to use reserve to make sure that no reallocation takes place:
test.reserve(test.size() + 1);
test.insert(test.begin(), test[0]);
The other is to make a copy from the reference so that there's no dependency on the reference remaining valid:
template<typename T>
T make_copy(const T & original)
{
return original;
}
test.insert(test.begin(), make_copy(test[0]));
Although both work, neither one feels like a natural solution. Is there something I'm missing?
The issue is that vector::insert takes its second parameter by reference, not by value. You don't need the template to make a copy; just use a copy constructor to create another object, which will be passed by reference. This copy remains valid even if the vector is reallocated.
#include <iostream>
#include <vector>
using namespace std;
int main(int argc, char* argv[])
{
vector<int> test;
test.push_back(1);
test.push_back(2);
test.insert(test.begin(), int(test[0]));
cout << test[0] << " " << test[1] << " " << test[2] << endl;
return 0;
}
I believe this is defined behavior. In §23.2.3 of the 2011 C++ standard, table 100 lists sequence container requirements and there is an entry for this case. It gives the example expression
a.insert(p,t)
where a is a value of X which is a sequence container type containing elements of type T, p is a const iterator to a, and t is an lvalue or const rvalue of type X::value_type, i.e. T.
The assertion for this expression is:
Requires: T shall be CopyInsertable into X. For vector and deque, T shall also be CopyAssignable.
Effects: Inserts a copy of t before p.
The only relevant vector specific quote I could find is in §23.3.6.5 paragraph 1:
Remarks: Causes reallocation if the new size is greater than the old capacity. If no reallocation happens, all the iterators and references before the insertion point remain valid.
Although this does mention the vector being reallocated, it doesn't make an exception to the previous requirements for insert on sequence containers.
As for working around this issue, I agree with @EdChum's suggestion of just making a copy of the element and inserting that copy.
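The copy-first workaround can be wrapped in a small function. This is only a sketch; duplicate_front is a hypothetical name, and it assumes the vector is non-empty.

```cpp
#include <vector>

// Hypothetical helper: insert a copy of the first element at the front.
// Assumes v is non-empty.
std::vector<int> duplicate_front(std::vector<int> v) {
    const int copy = v[0];        // value lives outside the vector's storage
    v.insert(v.begin(), copy);    // reallocation cannot invalidate `copy`
    return v;
}
```

For the question's input, duplicate_front({1, 2}) yields a vector holding 1, 1, 2.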

sending back a vector from a function

How to translate properly the following Java code to C++?
Vector v;
v = getLargeVector();
...
Vector getLargeVector() {
Vector v2 = new Vector();
// fill v2
return v2;
}
So here v is a reference. The function creates a new Vector object and returns a reference to it. Nice and clean.
However, let's see the following C++ mirror-translation:
vector<int> v;
v = getLargeVector();
...
vector<int> getLargeVector() {
vector<int> v2;
// fill v2
return v2;
}
Now v is a vector object, and if I understand correctly, v = getLargeVector() will copy all the elements from the vector returned by the function to v, which can be expensive. Furthermore, v2 is created on the stack and returning it will result in another copy (but as I know modern compilers can optimize it out).
Currently this is what I do:
vector<int> v;
getLargeVector(v);
...
void getLargeVector(vector<int>& vec) {
// fill vec
}
But I don't find it an elegant solution.
So my question is: what is the best practice to do it (by avoiding unnecessary copy operations)? If possible, I'd like to avoid normal pointers. I've never used smart pointers so far, I don't know if they could help here.
Most C++ compilers implement return value optimization, which means you can efficiently return an object from a function without the overhead of copying all its elements.
I would also recommend that you write:
vector<int> v(getLargeVector());
That way the object is constructed directly from the returned value instead of being default-constructed and then assigned to.
void getLargeVector(vector<int>& vec) {
// fill the vector
}
Is a better approach for now. With C++0x, the problem with the first approach goes away, because move operations are used instead of copy operations.
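A sketch of the return-by-value version under C++11 move semantics. getLargeVector is the name from the question, but its size parameter n is an assumption added for illustration, and assignment_reuses_buffer is a hypothetical check. Handing over the buffer on move assignment is what mainstream standard library implementations do with the default allocator.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Return by value; the size parameter n is an assumption for illustration.
std::vector<int> getLargeVector(std::size_t n) {
    std::vector<int> v2;
    v2.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        v2.push_back(static_cast<int>(i));
    }
    return v2;   // elided or moved in C++11 and later, not deep-copied
}

// Checks that move assignment hands over the buffer instead of copying it.
bool assignment_reuses_buffer(std::size_t n) {
    std::vector<int> result = getLargeVector(n);
    const int* buffer = result.data();   // where the elements live now
    std::vector<int> v;
    v = std::move(result);               // move assignment
    return v.size() == n && v.data() == buffer;
}
```

If the check returns true, no element was copied by the assignment, which is the cost the question was worried about.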
RVO can be relied upon to make this code simple to write, but relying on RVO can also bite you. RVO is a compiler-dependent feature, but more importantly an RVO-capable compiler can disable RVO depending on the code itself. For example, if you were to write:
MyBigObject Gimme(bool condition)
{
if( condition )
return MyBigObject( oneSetOfValues );
else
return MyBigObject( anotherSetOfValues );
}
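...then you are depending on the compiler eliding the copy on both paths. Returning unnamed temporaries like this is the case compilers handle best (and since C++17 the elision is guaranteed), but the named variant, where different local objects are returned on different paths, typically cannot be elided. There are many other conditions under which the compiler won't be able to optimize, and so by my reckoning any code that by design relies on RVO for performance or functionality smells.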
If you buy into the idea that one function should have one job (I only sorta do), then your dilemma as to how to return a populated vector becomes much simpler when you realize that your code is broken at the design level. Your function really does two jobs: it instantiates the vector, then it fills it in. Even with all this pedantry aside, however, a more generic and reliable solution exists than relying on RVO. Simply write a function that populates an arbitrary vector. For example:
#include <cstdlib>
#include <vector>
#include <algorithm>
#include <iostream>
#include <iterator> // back_inserter, ostream_iterator
using namespace std;
template<typename Iter> Iter PopulateVector(Iter it, size_t howMany)
{
for( size_t n = 0; n < howMany; ++n )
{
*(it++) = n;
}
return it;
}
int main()
{
vector<int> ints;
PopulateVector(back_inserter(ints), 42);
cout << "The vector has " << ints.size() << " elements" << endl << "and they are..." << endl;
copy(ints.begin(), ints.end(), ostream_iterator<int>(cout, " "));
cout << endl << endl;
static const size_t numOtherInts = 42;
int otherInts[numOtherInts] = {0};
PopulateVector(&otherInts[0], numOtherInts);
cout << "The other vector has " << numOtherInts << " elements" << endl << "and they are..." << endl;
copy(&otherInts[0], &otherInts[numOtherInts], ostream_iterator<int>(cout, " "));
return 0;
}
Why would you like to avoid normal pointers? Is it because you don't want to worry about memory management, or is it because you are not familiar with pointer syntax?
If you don't want to worry about memory management, then a smart pointer is the best approach. If you are uncomfortable with pointer syntax, then use references.
You have the best solution. Pass by reference is the way to handle that situation.
Sounds like you could do this with a class... but this could be unnecessary.
#include <vector>
using std::vector;
class MySpecialArray
{
vector<int> v;
public:
MySpecialArray()
{
//fill v
}
vector<int> const * getLargeVector()
{
return &v;
}
};