Move-insert/emplace from a vector into itself - c++

Should this code print empty (Clang & GCC), or not empty (Visual C++)?
Should that change if I remove reserve()? What about using emplace() instead of insert()?
#include <stdio.h>
#include <memory>
#include <vector>
int main()
{
    std::vector<std::shared_ptr<int> > v;
    v.reserve(10); // Should this affect the output?
    v.push_back(std::make_shared<int>(0));
    v.insert(v.begin(), std::move(v.front())); // Does 'emplace' vs. 'insert' matter?
    printf("Element 0 is: %s\n", v[0] ? "not empty" : "empty");
    return 0;
}
Logically not empty makes more sense to me for insert, because the caller of insert is requesting movement from a particular object, and we need to ensure that's what we're moving from. Moreover, it seems silly for reserve() to affect which element ends up where.
However, that would imply both libstdc++ (GCC) and libc++ (Clang) are buggy, which doesn't seem likely.
Perhaps more interestingly, I'm less certain about emplace. I'm inclined to think using emplace should logically output empty, because the caller is requesting construction to occur in the object's correct place ("emplace"), which requires movement to occur before construction.
What is the correct behavior of moving from a container into itself? Does it depend on the specifics of which of insert/emplace/reserve are called? Is it implementation-defined?
For what it's worth, cppreference.com says the following for emplace:
If the required location has been occupied by an existing element, the inserted element is constructed at another location at first, and then move assigned into the required location.
but that goes against what both GCC and Clang do for emplace, as well as against my expectation above (which differs from my expectation for insert).

Related

Why does const_casting a heap.top() of priority_queue have undefined behavior?

I've made a simple Huffman encoding program to output individual encodings for characters and save the encoded file. This was for an assignment, and I was told that using const_cast on heap.top() is considered undefined behavior if we heap.pop() afterwards, but I'm not sure I understand why.
I've read cppreference regarding the std::pop_heap which is the underlying function called when we call heap.pop() and I believe that a nullptr in the comparison is still defined and understood. It doesn't seem to function abnormally to me when I debugged it.
Here's an example
#include <functional>
#include <queue>
#include <vector>
#include <iostream>
#include <memory>
template<typename T> void print_queue_constcast(T& q) {
    while(!q.empty()) {
        auto temp = std::move(const_cast<int&>(q.top()));
        std::cout << temp << " ";
        q.pop();
    }
    std::cout << '\n';
}
template<typename T> void print_queue(T& q) {
    while(!q.empty()) {
        std::cout << q.top() << " ";
        q.pop();
    }
    std::cout << '\n';
}
int main() {
    std::priority_queue<int> q1;
    std::priority_queue<int> q2;
    for(int n : {1,8,5,6,3,4,0,9,7,2}){
        q1.push(n);
        q2.push(n);
    }
    print_queue(q1);
    print_queue_constcast(q2);
}
Could anyone explain what is actually going on in the background that would be undefined behavior, or that would cause this to fail under certain circumstances?
tl;dr: Maybe; maybe not.
Language-level safety
Like a set, a priority_queue is in charge of ordering its elements. Any modification to an element would potentially "break" the ordering, so the only safe way to do that is via the container's own mutating methods. (In fact, neither one actually provides such a thing.) Directly modifying elements is dangerous. To enforce this, these containers only expose const access to your elements.
Now, at the language level, the objects won't actually have a static type of const T; most likely they're just Ts. So modifying them (after a const_cast to cheat the type system) doesn't have undefined behaviour in that sense.
Library-level safety
However, you are potentially breaking a condition of using the container. The rules for priority_queue don't ever actually say this, but since its mutating operations are defined in terms of functions like push_heap and pop_heap, your use of such operations will break preconditions of those functions if the container's ordering is no longer satisfied after your direct mutation.
Thus your program will have undefined behaviour if you break the ordering and later mutate the priority_queue in such a way that depends on the ordering being intact. If you don't, technically your program's behaviour is well-defined; however, in general, you'd still be playing with fire. A const_cast should be a measure of last resort.
So, where do we stand?
The question is: did you break the ordering? What's the state of the element after moving from it, and is the ordering satisfied by having an object in that state at the top of the queue?
Your original example uses shared_ptrs, and we know from the documentation that a moved-from shared_ptr turns safely into a null pointer.
The default priority_queue ordering is defined by std::less, which yields a strict total order over raw pointers; std::less on a shared_ptr will actually invoke its base case of operator<, but that in turn is defined to invoke std::less on its raw pointer equivalent.
Unfortunately, that doesn't mean that a null shared_ptr is ordered "first": though std::less's pointer ordering is strict and total, where null pointers land in this ordering is unspecified.
So, it is unspecified as to whether your mutation will break the ordering, and therefore it is unspecified as to whether your pop() will have undefined behaviour.
(The MCVE example with int is safe because std::move on an int has no work to do: it'll just copy the int. So, the ordering remains unaffected.)
Conclusion
I would agree with what was presumably your driving rationale, that it is unfortunate pop() doesn't return you the popped thing, which you could then move from. Similar restrictions with sets and maps are why we now have node splicing features for those containers. There is no such thing for a priority_queue, which is just a wrapper around another container like a vector. If you need more fine-grained control, you can replace it with your own wrapper that has the features you need.
Anyway, for the sake of a shared_ptr increment (as in your original code), I'd probably just take the hit of the copy, unless you have some really extreme performance requirements. That way, you know everything will be well-defined.
Certainly, for the sake of an int copy (as in your MCVE), a std::move is entirely pointless (there are no indirect resources to steal!) and you're doing a copy anyway, so the point is rather moot and all you've done is to create more complex code for no reason.
I would also recommend not writing code where you have to ask whether it's well-defined, even if it turns out it is. That's not ideal for readability or maintainability.
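As a rough illustration of the "write your own wrapper" option mentioned above, here is a minimal sketch built on std::push_heap and std::pop_heap over a std::vector. The class name MovableHeap and its member pop_value are hypothetical, chosen for this example only. Since std::pop_heap moves the top element to the back of the range, it can be moved out there without any const_cast and without disturbing the heap ordering of the remaining elements.

```cpp
#include <algorithm>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical wrapper: a max-heap whose pop returns the removed element by value.
template <typename T, typename Compare = std::less<T>>
class MovableHeap {
public:
    bool empty() const { return data_.empty(); }

    void push(T value) {
        data_.push_back(std::move(value));
        std::push_heap(data_.begin(), data_.end(), comp_);
    }

    // Precondition: !empty().
    T pop_value() {
        // Moves the top element to data_.back(); the rest remains a valid heap.
        std::pop_heap(data_.begin(), data_.end(), comp_);
        T top = std::move(data_.back());
        data_.pop_back();
        return top;
    }

private:
    std::vector<T> data_;
    Compare comp_;
};
```

For an int or a shared_ptr this is probably overkill, as noted above; it mainly pays off for move-only or expensive-to-copy element types.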

std::vector::reserve allows random access before push_back

I recently learned about std::vector::reserve online. The websites say that reserving memory inside a std::vector does NOT change the size of the vector, but instead increases the std::vector's capacity. After reserving, attempting to access random elements should crash.
However, when I run this code:
#include <iostream>
#include <vector>
using namespace std;
int main(){
    vector <int> v;
    v.reserve(1000000);
    v[4] = 5;
    cout << v[4] << endl; // this line and the above line should cause errors
    return 0;
}
Nothing happens. The program runs and prints 5 to the screen, and I don't get any errors at all.
I'm not sure if I'm making a mistake here, so can somebody tell me why the above program runs?
After reserving, attempting to access random elements should crash.
No, it would be more correct to say that "after reserving, attempting to access random elements will result in undefined behaviour"(a).
And undefined behaviour means exactly that, undefined. It may work, it may not. It may seem to work but set up conditions for spectacular failure later on in your program. It may not work in another implementation, it may even not work in the same implementation on certain days of the week.
Bottom line, don't do it.
(a) Table 69 in ISO C++20 has the two element access operations stating exactly the same thing:
a[n] - returns reference, or const reference for constant a. Semantics: *(a.begin() + n).
a.at(n) - returns reference, or const reference for constant a. Semantics: *(a.begin() + n).
But the note immediately after that clarifies the difference:
The member function at() provides bounds-checked access to container elements. at() throws out_of_range if n >= a.size().
Hence, if you need it to "crash" (quoted since it's really raising an exception rather than crashing), use the latter.
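To make the difference concrete, here is a small sketch; both function names are made up for this example. The first shows that at() reports the out-of-range access after reserve(), since size() is still 0. The second shows the well-defined alternative: use resize() so that the elements actually exist before indexing.

```cpp
#include <stdexcept>
#include <vector>

// Returns true if at() throws after reserve() alone.
bool at_throws_after_reserve() {
    std::vector<int> v;
    v.reserve(1000);           // capacity grows, size stays 0
    try {
        v.at(4) = 5;           // bounds-checked: 4 >= v.size()
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

// resize() creates the elements, so indexing them is well-defined.
std::vector<int> resized_then_written() {
    std::vector<int> v;
    v.resize(1000);            // 1000 value-initialized ints (all 0)
    v[4] = 5;
    return v;
}
```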

Difference in std::vector::emplace_back between GCC and VC++ [duplicate]

This question already has answers here:
Can std::vector emplace_back copy construct from an element of the vector itself?
(3 answers)
Closed 8 years ago.
I have heard that one of the recommendations of Modern C++ is to use emplace_back instead of push_back to append to containers (emplace_back accepts the arguments of any constructor of the type stored in the container).
The standard draft N3797, 23.3.6.5 (1), says:
Remarks: Causes reallocation if the new size is greater than the old capacity. If no reallocation happens, all the iterators and references before the insertion point remain valid. If an exception is thrown other than by the copy constructor, move constructor, assignment operator, or move assignment operator of T or by any InputIterator operation there are no effects. If an exception is thrown by the move constructor of a non-CopyInsertable T, the effects are unspecified.
This specifies what happens when no reallocation is needed, but leaves open what happens when the container needs to grow.
In this piece of code:
#include <iostream>
#include <vector>
int main() {
    std::vector<unsigned char> buff {1, 2, 3, 4};
    buff.emplace_back(buff[0]);
    buff.push_back(buff[1]);
    for (const auto& c : buff) {
        std::cout << std::hex << static_cast<long>(c) << ", ";
    }
    std::cout << std::endl;
    return 0;
}
I compiled this with VC++ (Visual Studio 2013 Update 4) and GCC 4.9.1 (MinGW), in Debug mode on Windows 8.1.
When compiled with VC++ the output is:
1, 2, 3, 4, dd, 2
When compiled with GCC the output is:
1, 2, 3, 4, 1, 2
Looking at the implementation of emplace_back in VC++, the difference is that its first lines check whether the container needs to grow (and grow it if so). When the container does grow, the reference to the first element (buff[0]) passed to emplace_back is invalidated, so by the time the new element is actually constructed, the value read through that reference is invalid.
push_back works because the element to append is created during parameter binding (before the container possibly grows).
My question is:
When a call to emplace_back forces the container to grow and the argument is a reference into the same container, is this behavior implementation-defined or unspecified, or is there a bug in one of the implementations (presumably VC++, since GCC's behavior is closer to what I expected)?
When you used &operator[], that returned a reference. You then used emplace_back which caused reallocation, and thus invalidated all past references. Both of these rules are well defined. The correct thing that should happen is an exception. I actually expect the VC++ version to throw an exception if you are running the debug version under the debugger.
push_back has the same two rules, which means it will also do the same. I am almost certain, swapping the two lines emplace_back/push_back will result in the same behavior.
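Whichever implementation is right, the aliasing can be avoided altogether by copying the element before the call, so that nothing passed to emplace_back refers into the vector. A minimal sketch, with append_copy_of as a hypothetical helper name:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: append a copy of v[index] without handing
// emplace_back a reference into v itself.
template <typename T>
void append_copy_of(std::vector<T>& v, std::size_t index) {
    T copy = v[index];                // copy made while the reference is still valid
    v.emplace_back(std::move(copy));  // safe even if this call reallocates
}
```

This costs one extra move for non-trivial types, which is usually negligible next to a reallocation.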

std::move on two deques - input vs output iterators

Take a look at the following code:
#include <algorithm>
#include <deque>
#include <iostream>
using namespace std;
int main()
{
    deque<int> in {1,2,3};
    deque<int> out;
    // line in question
    move(in.begin(), in.end(), out.begin());
    for(auto i : out)
        cout << i << endl;
    return 0;
}
This will not move anything. Looking at the example here, one must write the line in question like this:
move(in.begin(), in.end(), std::back_inserter(out));
This makes sense in a way, as std::move expects its first two arguments to be InputIterators (which is satisfied here) and the third one to be an OutputIterator (which out.begin() is not).
What does actually happen if the original code is executed and move is passed an iterator that is not an OutputIterator? Why does C++'s type-safety not work here? And why is the construction of an output-iterator delegated to an external function, i.e. why does out.backInserter() not exist?
The original code tries to dereference and increment out.begin(). Since out is empty, that's a past-the-end iterator, and it can't be dereferenced or incremented. Doing so gives undefined behaviour.
std::move expects [...] the third one to be an OutputIterator (which out.begin() is not).
Yes it is. Specifically, it's a mutable random access iterator, which supports all the operations required of an output iterator, and more.
What does actually happen if the original code is executed and move is passed an iterator that is not an OutputIterator?
That would cause a compile error if the iterator didn't support the operations required of an output iterator needed by the function; or undefined behaviour if the operations existed but did something other than that required of an output iterator.
Why does C++'s type-safety not work here?
Because the type is correct. The incorrect runtime state (being a past-the-end iterator, not the start of a sequence with at least as many elements as the input range) can't be detected through the static type system.
why does out.backInserter() not exist?
That would have to be written separately for all sequence containers: both the standard ones, and any others you might define yourself. The generic function only has to be implemented once, in the standard library, to be usable for any container that supports push_back.
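The two well-defined ways to write the call can be sketched as follows; the function names are invented for this example. Either let std::back_inserter grow the destination through push_back, or give the destination enough elements up front so that out.begin() really is the start of a large enough range.

```cpp
#include <algorithm>
#include <deque>
#include <iterator>

// Destination grows as needed: each assignment becomes a push_back.
std::deque<int> move_with_back_inserter(std::deque<int>& in) {
    std::deque<int> out;
    std::move(in.begin(), in.end(), std::back_inserter(out));
    return out;
}

// Destination is pre-sized, so writing through out.begin() is valid.
std::deque<int> move_into_presized(std::deque<int>& in) {
    std::deque<int> out(in.size());
    std::move(in.begin(), in.end(), out.begin());
    return out;
}
```

The second form requires the element type to be default-constructible, which is one reason the back_inserter form tends to be preferred.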

pointer delegate in STL set

I'm kinda stuck with using a set with a pointer delegate. My code is as follows:
void Graph::addNodes (NodeSet& nodes)
{
    for (NodeSet::iterator pos = nodes.begin(); pos != nodes.end(); ++pos)
    {
        addNode(*pos);
    }
}
Here NodeSet is defined as:
typedef std::set<Node_ptr, Node_ptr_Sorting_Predicate> NodeSet;
The above piece of code works perfectly on my Windows machine, but when I run the same piece of code on a Mac, it gives me the following error:
no matching function for call to 'Graph::addNode(const boost::shared_ptr<Node>&)'
FYI, Node_ptr is of type: typedef boost::shared_ptr<Node> Node_ptr;
Can somebody please tell me why this is happening?
Ok, from your added information, the problem seems to be that addNode takes a Node_ptr per non-const reference, while what the compiler has to call the function is a const boost::shared_ptr<Node>& (note the const). Let me explain:
std::set is an associative container. Associative containers store their elements in some order, using the key element to define the ordering. If you were allowed to change the key without the container knowing, you would invalidate the container's internal order. That's why I believe dereferencing a std::set<T>::iterator does not return a modifiable lvalue. (Which means you cannot alter the reference returned. For example, if you have an iterator pos into a std::set<int>, *pos=42 should not compile.)
The catch with this is that only modifiable lvalues will bind to a non-const reference. But what *pos returns isn't a modifiable lvalue and thus won't. (So, in my example, int& r = *pos; won't compile.) The reason is that, if this was allowed, you could change the sorting key through that non-const reference behind the container's back and mess up the container's internal ordering.
That is why the result of your *pos won't bind to a Node_ptr&. And that in turn is why the compiler cannot call your function.
Does your addNode() member function really alter the Node it's given? If not, it should take a const Node_ptr&.
If it does, you have a design problem. You cannot alter an element that's in a set. The only thing you can do is to remove it from the set, change it, and add it back in.
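The "remove it, change it, add it back" approach might look like the sketch below for a std::set<int>; replace_value is a hypothetical helper name. Because the element is outside the set while it changes, the set's ordering is never violated.

```cpp
#include <set>

// Hypothetical helper: replace old_value with new_value in the set.
// Returns false if old_value was not present.
bool replace_value(std::set<int>& s, int old_value, int new_value) {
    std::set<int>::iterator pos = s.find(old_value);
    if (pos == s.end()) {
        return false;
    }
    s.erase(pos);         // take the old element out of the set
    s.insert(new_value);  // the set places the new value in sorted position
    return true;
}
```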
On a side note: VC9 indeed compiles the following piece of code:
#include <iostream>
#include <set>
#include <typeinfo>
#include <iterator>
int main()
{
    std::set<int> set;
    set.insert(5);
    std::cout << *set.begin() << '\n';
    *set.begin() = 3; // this is an error!
    std::cout << *set.begin() << '\n';
    return (0);
}
I believe this is an error in VC9. Comeau rejects it.
Here's how to solve riddles with a compiler not calling a function you think it should call or calling the wrong function from a set of overloads.
The function you thought it should call is Graph::addNode(Node_ptr&). The code that you thought should call it is
addNode(*pos);
Change that code so that it provides the exact parameter(s) required:
Node_ptr& tmp = *pos;
addNode(tmp);
Now the call should definitely compile (or call the right overload), and the compiler should bark if it thinks *pos cannot be bound to a Node_ptr&.
Usually this tactic helps me to find out what's wrong in such situations.
If memory serves, the original C++ spec (1998) permits std::set to return modifiable iterators. This carries with it a risk--the iterator might be used to modify the stored value, such that the ordering of the set is now broken. I believe subsequent versions of the spec have changed this, and now all set iterators are non-modifiable.
VC++ 2010 respects the new behaviour, and has non-modifiable set iterators (which is annoying, as it prevents making changes that don't change the ordering and which ought to be legal).
Prior versions, however, did not. This means that you can create functions that are not suitably annotated with const, which will cause problems on switching to different compilers. The solution is to add the necessary const changes. VC++ will still work (since non-const values can be implicitly made const anyway), and so will everything else.
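A const-correct version of the code from the question might look roughly like this. It is only a sketch under several assumptions: std::shared_ptr stands in for boost::shared_ptr, and the Node layout, the comparator NodePtrLess, and the body of addNode are all invented for illustration, since the question does not show them.

```cpp
#include <cstddef>
#include <memory>
#include <set>
#include <vector>

struct Node { int id; };                 // assumed layout
using NodePtr = std::shared_ptr<Node>;   // stand-in for boost::shared_ptr<Node>

// Assumed ordering predicate: order nodes by id.
struct NodePtrLess {
    bool operator()(const NodePtr& a, const NodePtr& b) const {
        return a->id < b->id;
    }
};
using NodeSet = std::set<NodePtr, NodePtrLess>;

class Graph {
public:
    // Takes a const reference, so it accepts what a set iterator yields.
    void addNode(const NodePtr& node) { nodes_.push_back(node); }

    void addNodes(const NodeSet& nodes) {
        for (NodeSet::const_iterator pos = nodes.begin(); pos != nodes.end(); ++pos) {
            addNode(*pos);
        }
    }

    std::size_t size() const { return nodes_.size(); }

private:
    std::vector<NodePtr> nodes_;
};
```

Note that the const applies to the shared pointer, not to the Node it points to, so addNode can still modify the node through the pointer if it needs to.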