Difference in std::vector::emplace_back between GCC and VC++ [duplicate] - c++

This question already has answers here:
Can std::vector emplace_back copy construct from an element of the vector itself?
I have heard that one of the recommendations of Modern C++ is to use emplace_back instead of push_back when appending to containers (emplace_back accepts the arguments of any constructor of the element type stored in the container).
The standard draft N3797, 23.3.6.5 (1), says:
Remarks: Causes reallocation if the new size is greater than the old capacity. If no reallocation happens, all the iterators and references before the insertion point remain valid. If an exception is thrown other than by the copy constructor, move constructor, assignment operator, or move assignment operator of T or by any InputIterator operation there are no effects. If an exception is thrown by the move constructor of a non-CopyInsertable T, the effects are unspecified.
This specifies what happens when no reallocation is needed, but leaves open what happens when the container needs to grow.
Consider this code:
#include <iostream>
#include <vector>

int main() {
    std::vector<unsigned char> buff {1, 2, 3, 4};
    buff.emplace_back(buff[0]);
    buff.push_back(buff[1]);
    for (const auto& c : buff) {
        std::cout << std::hex << static_cast<long>(c) << ", ";
    }
    std::cout << std::endl;
    return 0;
}
I compiled it with VC++ (Visual Studio 2013 Update 4) and with GCC 4.9.1 (MinGW), both in Debug mode on Windows 8.1.
When compiled with VC++ the output is:
1, 2, 3, 4, dd, 2
When compiled with GCC the output is:
1, 2, 3, 4, 1, 2
Looking at the implementation of emplace_back in VC++, the difference is that its first lines check whether the container needs to grow, and grow it if so. When the container grows, the reference to the first element (buff[0]) passed to emplace_back is invalidated, so by the time the new element is actually constructed from it, the referenced value is no longer valid.
With push_back it works; my understanding is that the element to append is created during parameter binding, before the container can grow.
My question is: when a call to emplace_back forces the container to grow and the argument is a reference into the same container, is the behavior implementation-defined, unspecified, or is this a bug in one of the implementations (presumably VC++, since GCC's behavior is closer to what I expected)?

Calling operator[] returns a reference. You then called emplace_back, which caused a reallocation and thus invalidated all existing references. Both of these rules are well defined. The standard does not require any diagnostic when an invalidated reference is used, but I would expect the VC++ version to report an error if you run the debug build under the debugger.
push_back is subject to the same two rules, so I would expect it to behave the same way. I am almost certain that swapping the emplace_back and push_back lines will result in the same behavior.
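If you want code that does not depend on how a given library handles an argument that aliases the vector's own storage, one option is to take the copy yourself before calling emplace_back. This is only a minimal sketch; append_copy_of is a hypothetical helper name, not something from the standard library or from either implementation discussed here.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: appends a copy of v[index] without relying on how
// emplace_back treats arguments that refer into the vector itself.
template <typename T>
void append_copy_of(std::vector<T>& v, std::size_t index) {
    assert(index < v.size());
    T copy = v[index];                // the copy is made before any reallocation
    v.emplace_back(std::move(copy));  // the argument no longer aliases v's storage
}
```

Calling append_copy_of(buff, 0) on the vector from the question appends 1 on any conforming implementation, at the cost of one extra move of the element.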

Related

Move-insert/emplace from a vector into itself

Should this code print empty (Clang & GCC), or not empty (Visual C++)?
Should that change if I remove reserve()? What about using emplace() instead of insert()?
#include <stdio.h>
#include <memory>
#include <vector>

int main()
{
    std::vector<std::shared_ptr<int> > v;
    v.reserve(10);  // Should this affect the output?
    v.push_back(std::make_shared<int>(0));
    v.insert(v.begin(), std::move(v.front()));  // Does 'emplace' vs. 'insert' matter?
    printf("Element 0 is: %s\n", v[0] ? "not empty" : "empty");
    return 0;
}
Logically not empty makes more sense to me for insert, because the caller of insert is requesting movement from a particular object, and we need to ensure that's what we're moving from. Moreover, it seems silly for reserve() to affect which element ends up where.
However, that would imply both libstdc++ (GCC) and libc++ (Clang) are buggy, which doesn't seem likely.
Perhaps more interestingly, I'm less certain about emplace. I'm inclined to think using emplace should logically output empty, because the caller is requesting construction to occur in the object's correct place ("emplace"), which requires the existing elements to be moved out of the way before construction.
What is the correct behavior of moving from a container into itself? Does it depend on the specifics of which of insert/emplace/reserve are called? Is it implementation-defined?
For what it's worth, cppreference.com says the following for emplace:
If the required location has been occupied by an existing element, the inserted element is constructed at another location at first, and then move assigned into the required location.
but that goes against what both GCC and Clang do for emplace, as well as against my expectation above (which differs from my expectation for insert).
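Whatever the standard turns out to require, the ambiguity can be sidestepped by moving the element into a local object first, so that the argument passed to insert no longer refers into the vector. The following is a sketch under that assumption; insert_moved_front is a hypothetical helper name.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical helper: moves the current front element into a temporary and
// inserts that temporary at the front. The old front element stays in the
// vector (now at index 1) in its moved-from state.
template <typename T>
void insert_moved_front(std::vector<T>& v) {
    assert(!v.empty());
    T tmp = std::move(v.front());         // move out while the reference is valid
    v.insert(v.begin(), std::move(tmp));  // tmp does not alias any element of v
}
```

With std::shared_ptr<int> elements, v[0] is then non-empty and v[1] is empty, regardless of whether reserve() was called, because a moved-from std::shared_ptr is guaranteed to be empty.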

Is it safe to call size() method on moved-from vector? [duplicate]

This question already has answers here:
What constitutes a valid state for a "moved from" object in C++11?
The standard specifies that STL containers, after being moved from (here we are talking about std::move enabling move construction / assignment), are in a valid but unspecified state.
I believe that means we can only apply operations that have no preconditions. I recall that someone here on Stack Overflow claimed that to be true, and after some checking I agreed. Unfortunately, I cannot recall which sources I checked. Furthermore, I was not able to find the relevant information in the standard.
From [container.requirements.general/4], table 62 ([tab:container.req]), we can see that a.size() has no preconditions. Does that mean this code is safe?
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v1 = {1, 2, 3};
    std::vector<int> v2 = std::move(v1);
    std::cout << v1.size();  // displaying size of the moved-from vector
}
It's unspecified what this code will print, but is it safe? Meaning, do we have undefined behaviour here?
EDIT: I don't believe this question will be too broad if I ask about other containers. Will the answer be consistent among all other STL containers, including std::string?
There is no undefined behavior here, because of the lack of pre-conditions. The Standard guarantees that a moved-from container will be left in a valid but unspecified state. A valid state implies that anything that doesn't have preconditions can be invoked, but the result will be unpredictable.
So yeah, this is not UB, but definitely useless and a bad idea.
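If the moved-from vector needs to be reused, a common approach is to call an operation without preconditions that also establishes a known state, such as clear(). A minimal sketch follows; take_and_reset is a hypothetical helper name used only for illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: moves the contents of src into the returned vector,
// then puts src back into a known state.
std::vector<int> take_and_reset(std::vector<int>& src) {
    std::vector<int> dst = std::move(src);
    // src.size() may be called here (it has no preconditions), but the value
    // it returns should not be relied upon.
    src.clear();          // clear() has no preconditions either
    assert(src.empty());  // after clear() the state is known: empty
    return dst;
}
```

After the call, src can be used like any freshly emptied vector, for example with push_back.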

Taking reference to vector element before adding more element to vector

Consider this code:
#include <iostream>
#include <string>
#include <vector>

int main()
{
    std::vector<std::string> v;
    v.push_back("hello");
    v.push_back("stack");
    std::string &s = v[0];
    v.push_back("overflow");
    std::cout << s << std::endl;
    return 0;
}
After running (using g++ (Ubuntu 4.8.4-2ubuntu1~14.04.1) 4.8.4) this prints only an empty line, hello is not printed. If I comment out v.push_back("stack"); then a segmentation fault appears.
Now I understand why this is happening. Adding more elements to vector is triggering a grow operation under the hood and my old reference becomes invalid after that. This is not my question.
My question is whether this behavior (modifying a vector or other STL container after taking a reference or pointer to one of its elements) is defined as undefined behavior in the C++ standard. If yes, where? If not, what does the standard say about this type of situation?
C++14 [vector.capacity]/6:
Reallocation invalidates all the references, pointers, and iterators referring to the elements in the sequence.
[vector.modifiers]/6 covers that push_back may cause reallocation, with iterators not being invalidated only if it did not reallocate.
I can't actually find any text that defines what it means for a reference to be invalidated, but it is clearly implied that using the referred-to value after invalidation would be undefined behaviour.
The act of modifying the container is not prohibited just because you acquired an iterator, reference, or pointer by some means. It is the iterator, reference, or pointer itself that is potentially invalidated.
§23.3.6.6 [vector.modifiers] (includes the push_back member family)
Remarks: Causes reallocation if the new size is greater than the old capacity. If no reallocation happens, all the iterators and references before
the insertion point remain valid. If an exception is thrown other than
by the copy constructor, move constructor, assignment operator, or
move assignment operator of T or by any InputIterator operation there
are no effects. If an exception is thrown while inserting a single
element at the end and T is CopyInsertable or
is_nothrow_move_constructible<T>::value is true, there are no effects.
Otherwise, if an exception is thrown by the move constructor of a
non-CopyInsertable T, the effects are unspecified.
If no reallocation happens, only references, pointers, and iterators (including the end iterator) at or past the insertion point are invalidated. Great, but what happens if a reallocation does happen? Interestingly, we find that in:
§23.3.6.3 [vector.capacity]
Remarks: Reallocation invalidates all the references, pointers, and
iterators referring to the elements in the sequence. No reallocation
shall take place during insertions that happen after a call to
reserve() until the time when an insertion would make the size of the
vector greater than the value of capacity().
I'm not entirely convinced this completely answers your question, however. If you're wondering what happened to the memory that previously held the vector's elements, that's up to the standard library, but it no longer contains viable content. The container no longer owns the memory (as far as you know), and neither do you.
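In practice, the usual way to avoid the dangling reference from the question is to remember an index rather than a reference, since an index stays meaningful across a reallocation. The sketch below also shows one way to observe whether a push_back reallocated, by comparing capacity() before and after. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: reports whether appending `value` changed the capacity,
// which for push_back happens exactly when the vector reallocates.
bool push_back_reallocates(std::vector<std::string>& v, const std::string& value) {
    const std::size_t old_capacity = v.capacity();
    v.push_back(value);
    return v.capacity() != old_capacity;
}

// Hypothetical helper: appends `value` and then reads the element at `index`.
// The index is looked up after the push_back, so no reference is held across
// a possible reallocation.
std::string element_after_append(std::vector<std::string>& v,
                                 std::size_t index,
                                 const std::string& value) {
    assert(index < v.size());
    v.push_back(value);
    return v[index];
}
```

With the vector from the question, element_after_append(v, 0, "overflow") returns "hello" whether or not the vector had to grow.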

Is it safe to push_back an element from the same vector?

vector<int> v;
v.push_back(1);
v.push_back(v[0]);
If the second push_back causes a reallocation, the reference to the first integer in the vector will no longer be valid. So this isn't safe?
vector<int> v;
v.push_back(1);
v.reserve(v.size() + 1);
v.push_back(v[0]);
This makes it safe?
It looks like http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-closed.html#526 addressed this problem (or something very similar to it) as a potential defect in the standard:
1) Parameters taken by const reference can be changed during execution
of the function
Examples:
Given std::vector v:
v.insert(v.begin(), v[2]);
v[2] can be changed by moving elements of vector
The proposed resolution was that this was not a defect:
vector::insert(iter, value) is required to work because the standard
doesn't give permission for it not to work.
Yes, it's safe, and standard library implementations jump through hoops to make it so.
I believe implementers trace this requirement back to 23.2/11 somehow, but I can't figure out how, and I can't find something more concrete either. The best I can find is this article:
http://www.drdobbs.com/cpp/copying-container-elements-from-the-c-li/240155771
Inspection of libc++'s and libstdc++'s implementations shows that they are also safe.
The standard guarantees even your first example to be safe. Quoting C++11
[sequence.reqmts]
3 In Tables 100 and 101 ... X denotes a sequence container class, a denotes a value of X containing elements of type T, ... t denotes an lvalue or a const rvalue of X::value_type
16 Table 101 ...
Expression: a.push_back(t)
Return type: void
Operational semantics: Appends a copy of t. Requires: T shall be CopyInsertable into X.
Container: basic_string, deque, list, vector
So even though it's not exactly trivial, the implementation must guarantee it will not invalidate the reference when doing the push_back.
It is not obvious that the first example is safe, because the simplest implementation of push_back would be to first reallocate the vector, if needed, and only then copy from the reference, which by that point would be dangling.
But at least it seems to be safe with Visual Studio 2010. Its implementation of push_back has special handling for the case where the argument is an element of the vector itself.
The code is structured as follows:
void push_back(const _Ty& _Val)
    {   // insert element at end
    if (_Inside(_STD addressof(_Val)))
        {   // push back an element
        ...
        }
    else
        {   // push back a non-element
        ...
        }
    }
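The body of _Inside is not shown above, so the following is only a guess at what such a check could look like, not the actual Visual Studio code. It tests whether a pointer falls within the vector's current element range, using std::less because the built-in < is not guaranteed to give a meaningful order for unrelated pointers. The name points_inside is hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper, loosely modelled on the idea of the _Inside check:
// returns true if p points at one of the elements currently stored in v.
template <typename T>
bool points_inside(const std::vector<T>& v, const T* p) {
    assert(p != nullptr);
    const T* first = v.data();
    const T* last = first + v.size();
    std::less<const T*> before;  // total order over pointers, unlike raw <
    return !before(p, first) && before(p, last);
}
```

An implementation that knows the argument lives inside its own buffer can, for instance, remember the element's index before reallocating and copy from the new location afterwards.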
This isn't a guarantee from the standard, but as another data point, v.push_back(v[0]) is safe for LLVM's libc++.
libc++'s std::vector::push_back calls __push_back_slow_path when it needs to reallocate memory:
void __push_back_slow_path(_Up& __x) {
    allocator_type& __a = this->__alloc();
    __split_buffer<value_type, allocator_type&> __v(__recommend(size() + 1),
                                                    size(),
                                                    __a);
    // Note that we construct a copy of __x before deallocating
    // the existing storage or moving existing elements.
    __alloc_traits::construct(__a,
                              _VSTD::__to_raw_pointer(__v.__end_),
                              _VSTD::forward<_Up>(__x));
    __v.__end_++;
    // Moving existing elements happens here:
    __swap_out_circular_buffer(__v);
    // When __v goes out of scope, __x will be invalid.
}
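The ordering used in the libc++ snippet can be illustrated with a toy container. This is a simplified sketch, not libc++'s code: TinyVec is a hypothetical class, it ignores exception safety and over-aligned types, and it always doubles its capacity. The point is only that the new element is constructed in the fresh buffer while the old buffer (and therefore the argument) is still alive.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical toy vector showing "construct the new element first, then
// move the old ones". Not exception safe; for illustration only.
template <typename T>
class TinyVec {
public:
    TinyVec() = default;
    TinyVec(const TinyVec&) = delete;
    TinyVec& operator=(const TinyVec&) = delete;
    ~TinyVec() { release(data_, size_); }

    void push_back(const T& x) {
        if (size_ == cap_) {
            const std::size_t new_cap = cap_ ? 2 * cap_ : 1;
            T* fresh = static_cast<T*>(::operator new(new_cap * sizeof(T)));
            // 1. Copy x into its final slot. x may refer into data_, which
            //    has not been touched yet, so the reference is still valid.
            ::new (static_cast<void*>(fresh + size_)) T(x);
            // 2. Only now move the existing elements into the new buffer.
            for (std::size_t i = 0; i < size_; ++i)
                ::new (static_cast<void*>(fresh + i)) T(std::move(data_[i]));
            // 3. Destroy and free the old buffer; x may dangle from here on.
            release(data_, size_);
            data_ = fresh;
            cap_ = new_cap;
        } else {
            ::new (static_cast<void*>(data_ + size_)) T(x);
        }
        ++size_;
    }

    T& operator[](std::size_t i) { assert(i < size_); return data_[i]; }
    std::size_t size() const { return size_; }

private:
    static void release(T* p, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) p[i].~T();
        ::operator delete(p);
    }

    T* data_ = nullptr;
    std::size_t size_ = 0;
    std::size_t cap_ = 0;
};
```

With this ordering, v.push_back(v[0]) appends a correct copy even when the call has to reallocate.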
The first version is definitely NOT safe:
Operations on iterators obtained by calling a standard library container or string member function may access the underlying container, but shall not modify it. [ Note: In particular, container operations that invalidate iterators conflict with operations on iterators associated with that container. — end note ]
from section 17.6.5.9
Note that this is the section on data races, which people normally think of in conjunction with threading... but the actual definition involves "happens before" relationships, and I don't see any ordering relationship between the multiple side-effects of push_back in play here, namely the reference invalidation seems not to be defined as ordered with respect to copy-constructing the new tail element.
It is completely safe.
In your second example you have
v.reserve(v.size() + 1);
which is not needed, because the vector grows its capacity automatically when an insertion would exceed it.
Vector is responsible for this stuff, not you.
Both are safe since push_back will copy the value, not the reference. If you are storing pointers, that is still safe as far as the vector is concerned, but just know that you'll have two elements of your vector pointing to the same data.
Section 23.2.1 General Container Requirements
16
a.push_back(t) Appends a copy of t. Requires: T shall be CopyInsertable into X.
a.push_back(rv) Appends a copy of rv. Requires: T shall be MoveInsertable into X.
Implementations of push_back must therefore ensure that a copy of v[0] is inserted. By counter example, assuming an implementation that would reallocate before copying, it would not assuredly append a copy of v[0] and as such violate the specs.
From 23.3.6.5/1: Causes reallocation if the new size is greater than the old capacity. If no reallocation happens, all the iterators and references before the insertion point remain valid.
Since we're inserting at the end, no references will be invalidated if the vector isn't resized. So if the vector's capacity() > size() then it's guaranteed to work, otherwise it's guaranteed to be undefined behavior.

Wrong results when appending vector to itself using copy and back_inserter [duplicate]

This question already has answers here:
Nice way to append a vector to itself
Inspired by this question, asking how to append a vector to itself, my first thought was the following (and yes, I realize insert is a better option now):
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

int main() {
    std::vector<int> vec {1, 2, 3};
    std::copy (std::begin (vec), std::end (vec), std::back_inserter (vec));
    for (const auto &v : vec)
        std::cout << v << ' ';
}
However, this prints:
1 2 3 1 * 3
The * is a different number every time the program is run. The fact that it's only the 2 being replaced is peculiar, and if there actually is an explanation for that, I'd be interested to hear it. Continuing, if I append to a different vector (a copy of the original), it outputs correctly. It also outputs correctly if I add the following line before the copy one:
vec.reserve (2 * vec.size());
I was under the impression std::back_inserter was a safe way to add elements onto the end of a container, despite not reserving memory beforehand. If my understanding is correct, what's wrong with the copying line?
I assume it's nothing to do with the compiler, but I'm using GCC 4.7.1.
std::back_inserter creates an inserting iterator that inserts elements into a container. Each time a value is assigned through this iterator, it calls push_back on the container to append a new element to the container.
For a std::vector container, a call to push_back where v.size() == v.capacity() will result in a reallocation: a new array is created to store the contents of the vector, its current contents are copied into the new array, and the old array is destroyed. Any iterators into the vector at this time are invalidated, meaning they can no longer be used.
In your program, this includes the input range defined by begin(vec) and end(vec) from which the copy algorithm is copying. The algorithm continues to use these iterators, even though they are invalidated, thus your program exhibits undefined behavior.
Even if your container had sufficient capacity, its behavior would still be undefined: the specification states that, upon insertion, "if no reallocation happens, all the iterators and references before the insertion point remain valid" (C++11 §23.3.6.5/1).
The call to push_back is equivalent to insertion at the end, so the end iterator (std::end(vec)) that you passed into std::copy is invalidated after a single call to push_back. If the input range is nonempty, the program therefore exhibits undefined behavior.
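Note that iterator invalidation rules differ between containers. A std::list<int> does not invalidate any iterators when elements are appended, whereas a std::deque<int> invalidates all iterators (though not references to elements) on push_back. Even with a std::list<int>, copying the range onto itself this way is not useful: end() always refers to the position past the newly appended elements, so the copy loop never reaches it.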
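One way to append a vector to itself without holding iterators across push_back is to fix the element count up front, reserve the final size, and copy by index. This is a minimal sketch; append_to_self is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: appends a copy of the vector's current contents to itself.
template <typename T>
void append_to_self(std::vector<T>& v) {
    const std::size_t n = v.size();  // number of original elements, fixed up front
    v.reserve(2 * n);                // guarantees no reallocation in the loop below
    for (std::size_t i = 0; i < n; ++i)
        v.push_back(v[i]);           // index-based, no iterators kept across push_back
    assert(v.size() == 2 * n);
}
```

Because the capacity is reserved before the loop, v[i] refers to storage that stays put for the duration of each push_back call.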