I know std::array doesn't do move semantics because it's not dynamically allocated. Do compilers do proper NRVO for it? What about in the context of the calling code being a constructor initializer list?
Code is below and on godbolt here: https://godbolt.org/z/je1Mvj15P
NB The share link passes argv[1] = "12345678901234567890". Also it is using only -Og to keep the assembly readable. At -O3 it starts unrolling loops etc, but this does not affect the question I believe.
The constructor init list call arr_(get_arr(s)) has no choice but to copy, because move is not available. Unless the compiler is doing full NRVO (which is "not mandatory", see comments below).
The compiler explorer output seems to show no copying to my eyes.
Is NRVO saving this?
Is this idiomatic / good? Or is std::array the wrong choice here? Or maybe this way of initialising a non-movable aggregate member with a function call, and therefore relying on RVO, is not sensible/reliable?
Would it be better to leave arr_ uninitialised in the constructor init list and move the code from get_arr into the constructor body? Like this: https://godbolt.org/z/jxv69YvK3
#include <algorithm>
#include <array>
#include <cstddef>
#include <numeric>
#include <string>
std::array<std::byte, 20> get_arr(const std::string& s) {
std::array<std::byte, 20> a;
// a silly proxy algorithm for the real thing
std::transform(s.begin(), s.end(), a.begin(), [](auto b) { return std::byte(unsigned(b) << 1U); });
return a;
}
struct S {
explicit S(const std::string& s) : arr_(get_arr(s)) {}
std::array<std::byte, 20> arr_;
};
int main(int /*argc*/, char* argv[]) {
// in reality we are reading about 600'000'000 strings from a file
S s(argv[1]);
return static_cast<int>(s.arr_[19]); // use it to avoid optimising away
}
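One way to check this without reading assembly is to count copies. The sketch below is only a stand-in for the code above, not a measurement of it: Payload, make_payload, Holder and copies_when_initialising_member are hypothetical names, and the array holds unsigned char instead of std::byte to keep it short. The assumption is C++17 or later, where the prvalue returned by make_payload initialises the member directly, so the only copy that can still occur is from the named local into the return object, which is exactly where (optional) NRVO applies.

```cpp
#include <array>

// Hypothetical wrapper around the array so that copies can be counted.
struct Payload {
    static int copies;
    std::array<unsigned char, 20> bytes{};
    Payload() {}
    // User-declared copy constructor: suppresses the implicit move
    // constructor, so every non-elided transfer shows up as a copy.
    Payload(const Payload& other) : bytes(other.bytes) { ++copies; }
};
int Payload::copies = 0;

Payload make_payload(unsigned char seed) {
    Payload p;            // named local: NRVO candidate
    p.bytes.fill(seed);
    return p;             // elision is permitted here, not required
}

struct Holder {
    // The member is initialised from a prvalue; since C++17 this step
    // itself does not create a temporary that must be copied.
    explicit Holder(unsigned char seed) : payload_(make_payload(seed)) {}
    Payload payload_;
};

// Returns how many copy constructions happened while building a Holder.
int copies_when_initialising_member() {
    Payload::copies = 0;
    Holder h(42);
    return Payload::copies;
}
```

A result of 0 means the compiler applied NRVO inside make_payload; 1 means the local was copied once into the member. Both outcomes are conforming, so this is something to measure on the compiler and flags you actually ship with.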
Related
I've been wondering about this question for a long time. What is the most idiomatic and / or efficient way to assign a new value to a data member (mutation)? I can think of 3 options:
Mutate directly from within method
Use a reference to mutate object
Assign the return value of the method to the data member (RVO)
Here's a demo. Please consider while reading the assembly that the compiler probably optimizes away most of the differences in this contrived example, but I just wanted to showcase the options in the simplest way. Please answer for the case where these methods are more involved.
Demo
#include <string>
#include <cstdio>
struct some_struct
{
auto assign_direct() {
str_ = "Hello World!";
}
auto assign_through_ref(std::string& ref) {
ref = "Hello World!";
}
auto assign_through_RVO() {
const std::string ret = "Hello World!";
return ret;
}
void internal_func() {
assign_direct();
assign_through_ref(str_);
str_ = assign_through_RVO();
}
std::string str_;
};
int main()
{
some_struct s;
s.internal_func();
}
My thought is that direct assignment and copy assignment must be equally efficient, as both dereference the this pointer and then dereference the effective address of the data member. So two dereferences are involved, while the assign_through_ref method only ever uses one dereference (except that this must be dereferenced to even call the method, but maybe that can be optimized away by an intelligent compiler).
Also what I want to know is what is most idiomatic / clear and least error prone? Maybe someone with some more years than me can give me some insights here!
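To make the third option a bit more concrete, here is a hedged sketch that counts special member calls instead of guessing from assembly. Counted, Counts, Holder and the helper functions are hypothetical names introduced only for this illustration. It shows one detail that is easy to miss in the demo above: returning a const local, as assign_through_RVO does, rules out a move on return, so if the compiler does not elide the return, a copy is made.

```cpp
// Hypothetical instrumented type that counts special member calls.
struct Counted {
    static int copy_ctor, move_ctor, copy_assign, move_assign;
    Counted() {}
    Counted(const Counted&) { ++copy_ctor; }
    Counted(Counted&&) noexcept { ++move_ctor; }
    Counted& operator=(const Counted&) { ++copy_assign; return *this; }
    Counted& operator=(Counted&&) noexcept { ++move_assign; return *this; }
};
int Counted::copy_ctor = 0;
int Counted::move_ctor = 0;
int Counted::copy_assign = 0;
int Counted::move_assign = 0;

struct Counts { int copy_ctor, move_ctor, copy_assign, move_assign; };

void reset_counts() {
    Counted::copy_ctor = Counted::move_ctor = 0;
    Counted::copy_assign = Counted::move_assign = 0;
}

Counts snapshot_counts() {
    return Counts{Counted::copy_ctor, Counted::move_ctor,
                  Counted::copy_assign, Counted::move_assign};
}

struct Holder {
    // const local: the return can be elided or copied, but never moved.
    Counted make_const() { const Counted ret; return ret; }
    // non-const local: the return can be elided or moved.
    Counted make_plain() { Counted ret; return ret; }
    Counted member_;
};

// Assigns the returned temporary to the member (always a move assignment).
Counts assign_from_const_local() {
    reset_counts();
    Holder h;
    h.member_ = h.make_const();
    return snapshot_counts();
}

Counts assign_from_plain_local() {
    reset_counts();
    Holder h;
    h.member_ = h.make_plain();
    return snapshot_counts();
}
```

In both cases the assignment into member_ is a move assignment from the returned temporary. The difference is in the return statement: with the const local any non-elided return is a copy, with the plain local it is a move.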
If I have the following code that makes use of execution policies, do I need to synchronize all accesses to Foo::value even when I'm just reading the variable?
#include <algorithm>
#include <execution>
#include <vector>
struct Foo { int value; int getValue() const { return value; } };
int main() {
std::vector<Foo> foos;
//fill foos here...
std::sort(std::execution::par, foos.begin(), foos.end(), [](const Foo & left, const Foo & right)
{
return left.getValue() > right.getValue();
});
return 0;
}
My concern is that std::sort() will move (or copy) elements asynchronously which is effectively equivalent to asynchronously writing to Foo::value and, therefore, all read and write operations on that variable need to be synchronized. Is this correct or does the sort function itself take care of this for me?
What if I were to use std::execution::par_unseq?
If you follow the rules, i.e. you don't modify anything or rely on the identity of the objects being sorted inside your callback, then you're safe.
The parallel algorithm is responsible for synchronizing access to the objects it modifies.
See [algorithms.parallel.exec]/2:
If an object is modified by an element access function, the algorithm will perform no other unsynchronized accesses to that object. The modifying element access functions are those which are specified as modifying the object. [ Note: For example, swap(), ++, --, @=, and assignments modify the object. For the assignment and @= operators, only the left argument is modified. — end note ]
In case of std::execution::par_unseq, there's the additional requirement on the user-provided callback that it isn't allowed to call vectorization-unsafe functions, so you can't even lock anything in there.
This is OK. After all, you have told std::sort what you want of it and you would expect it to behave sensibly as a result, given that it is presented with all the relevant information up front. There's not a lot of point to the execution policy parameter at all, otherwise.
Where there might be an issue (although not in your code, as written) is if the comparison function has side effects. Suppose we innocently wrote this:
int numCompares;
std::sort(std::execution::par, foos.begin(), foos.end(), [](const Foo & left, const Foo & right)
{
++numCompares;
return left.getValue() > right.getValue();
});
Now we have introduced a race condition, since two threads of execution might be passing through that code at the same time and access to numCompares is not synchronised (or, as I would put it, serialised).
But, in my slightly contrived example, we don't need to be so naive, because we can simply say:
std::atomic_int numCompares;
and then the problem goes away (and this particular example would also work with what appears to me to be the spectacularly useless std::execution::par_unseq, because std::atomic_int is lock-free on any sensible platform, thank you Rusty).
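Put together, the counting comparator might look like the sketch below. Two assumptions to be loud about: sort_descending_and_count is a hypothetical helper name, and the execution policy argument is deliberately left out so the sketch builds on toolchains without a parallel backend. Where <execution> is available, std::execution::par would go in as the first argument to std::sort.

```cpp
#include <algorithm>
#include <atomic>
#include <vector>

struct Foo { int value; int getValue() const { return value; } };

// Sorts foos in descending order and returns how many comparisons ran.
// The counter is atomic so that concurrent increments from a parallel
// policy would not race; the sort call itself is serial in this sketch.
long sort_descending_and_count(std::vector<Foo>& foos) {
    std::atomic<long> numCompares{0};
    std::sort(foos.begin(), foos.end(),
              [&numCompares](const Foo& left, const Foo& right) {
                  // Relaxed ordering is enough for a plain tally.
                  numCompares.fetch_add(1, std::memory_order_relaxed);
                  return left.getValue() > right.getValue();
              });
    return numCompares.load();
}
```

The comparator still only reads the elements it is handed; the one piece of shared state it writes is the atomic counter.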
So, in summary, don't be too concerned about what std::sort does (although I would certainly knock up a quick test program and hammer it a bit to see if it does actually work as I am claiming). Instead, be concerned about what you do.
More here.
Edit And while Rusty was digging that up, I did in fact write that quick test program (had to fix your lambda) and, sure enough, it works fine. I can't find an online compiler that supports execution (MSVC seems to think it is experimental) so I can't offer you a live demo, but when run on the latest version of MSVC, this code:
#define _SILENCE_PARALLEL_ALGORITHMS_EXPERIMENTAL_WARNING
#include <algorithm>
#include <execution>
#include <vector>
#include <cstdlib>
#include <iostream>
constexpr int num_foos = 100000;
struct Foo
{
Foo (int value) : value (value) { }
int value;
int getValue() const { return value; }
};
int main()
{
std::vector<Foo> foos;
foos.reserve (num_foos);
// fill foos
for (int i = 0; i < num_foos; ++i)
foos.emplace_back (rand ());
std::sort (std::execution::par, foos.begin(), foos.end(), [](const Foo & left, const Foo & right)
{
return left.getValue() < right.getValue();
});
int last_foo = 0;
for (auto foo : foos)
{
if (foo.getValue () < last_foo)
{
std::cout << "NOT sorted\n";
break;
}
last_foo = foo.getValue ();
}
return 0;
}
Generates the following output every time I run it:
<nothing>
QED.
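As a side note, the hand-written verification loop can be expressed with std::is_sorted. This is only a sketch of the check, with is_sorted_ascending as a hypothetical helper name; it does not exercise the parallel sort itself.

```cpp
#include <algorithm>
#include <vector>

struct Foo
{
    Foo (int value) : value (value) { }
    int value;
    int getValue() const { return value; }
};

// True if foos is in non-decreasing order of getValue().
bool is_sorted_ascending (const std::vector<Foo>& foos)
{
    return std::is_sorted (foos.begin(), foos.end(),
        [](const Foo & left, const Foo & right)
        {
            return left.getValue() < right.getValue();
        });
}
```

Calling is_sorted_ascending (foos) after the sort and printing "NOT sorted" when it returns false gives the same check as the loop above.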
To learn about the intricacies of C++11 I am playing around with unique_ptr a bit.
I wonder, is there any way to use iota to initialize a container of unique_ptr?
I started with the unique-ptr-less solution which works fine:
std::vector<int> nums(98); // 98 x 0
std::iota(begin(nums), end(nums), 3); // 3..100
Now let's do it as far as we can using unique_ptr
std::vector<std::unique_ptr<int>> nums(98); // 98 x nullptr
std::unique_ptr<int> three{ new int{3} };
std::iota(begin(nums), end(nums), std::move(three));
This fails obviously. Reasons:
Although I marked three with move as a && this may not be sufficient to copy/move the initial value into the container.
++initValue will also not work, because initValue is of type unique_ptr<int>, and there is no operator++ defined. But: we could define a free function unique_ptr<int> operator++(const unique_ptr<int>&); and that would take care of that at least.
But to copy/move the results from that operation is again not permitted in unique_ptr and this time I can not see how I could trick the compiler into using move.
Well, that's where I stopped. And I wonder if I am missing some interesting idea on how to tell the compiler that it may move the results of the operator++. Or are there other hindrances, too?
In order to end up with 98 instances of unique_ptr, there must be 98 calls to new. You attempt to get away with just one - that can't possibly fly.
If you are really intent on pounding a square peg into a round hole, you could do something like this:
#include <algorithm>
#include <iostream>
#include <memory>
#include <numeric>
#include <vector>
class MakeIntPtr {
public:
explicit MakeIntPtr(int v) : value_(v) {}
operator std::unique_ptr<int>() {
return std::unique_ptr<int>(new int(value_));
}
MakeIntPtr& operator++() { ++value_; return *this; }
private:
int value_;
};
int main() {
std::vector<std::unique_ptr<int>> nums(98);
std::iota(begin(nums), end(nums), MakeIntPtr(3));
std::cout << *nums[0] << ' ' << *nums[1] << ' ' << *nums[2];
return 0;
}
Maybe std::generate_n is a better algorithm for this?
std::vector<std::unique_ptr<int>> v;
{
v.reserve(98);
int n = 2;
std::generate_n(std::back_inserter(v), 98,
[&n]() { return std::make_unique<int>(++n); });
}
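Since the question is about C++11 and std::make_unique only arrived in C++14, here is a hedged variant of the same idea that sticks to C++11. make_sequence is a hypothetical helper name, not something from the standard library.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <memory>
#include <vector>

// Builds count unique_ptr<int> elements holding first, first + 1, ...
// Each element gets its own allocation, one call to new per pointer.
std::vector<std::unique_ptr<int>> make_sequence(int first, std::size_t count) {
    std::vector<std::unique_ptr<int>> v;
    v.reserve(count);
    int next = first;
    std::generate_n(std::back_inserter(v), count,
                    [&next]() { return std::unique_ptr<int>(new int(next++)); });
    return v;
}
```

make_sequence(3, 98) yields the 3..100 range from the question. The generator returns a temporary, so back_inserter moves it into the vector and no copy of a unique_ptr is ever needed.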
When using emplace_back a constructor must exist for the parameters passed (k,v), thus I need the constructor below. However, since I use unique_ptr it complains about not being able to access 'delete', which I believe means I'm doing something that allows me to have more than one pointer.
I can't figure out the syntax. How do I write this constructor the right way?
struct KV{
unique_ptr<string> k, v;
KV(){}
KV (unique_ptr<string> k_,unique_ptr<string> v_):k(move(k_)),v(move(v_)){}
};
Your constructor is OK. A possible problem is that you are not moving the two unique_ptrs when supplying them to your constructor:
#include <memory>
#include <string>
#include <utility>
#include <vector>
using namespace std;
struct KV{
unique_ptr<string> k, v;
KV(){}
KV (unique_ptr<string> k_,unique_ptr<string> v_):k(move(k_)),v(move(v_)){}
};
int main()
{
unique_ptr<string> p1(new string());
unique_ptr<string> p2(new string());
// KV v(p1, p2); // ERROR!
KV kv(move(p1), move(p2)); // OK
vector<KV> v;
v.emplace_back(move(p1), move(p2)); // OK
}
UPDATE:
When VS2012 shipped, VC11 did not support variadic templates. A correct implementation of emplace_back() should be variadic, but MS provided a dummy one. When the CTP shipped, only the compiler was updated with support for variadic templates; the STL was not. Therefore, you still get the error.
There is not much to do about this if you can't change your compiler, apart from waiting for the next release of the product to be shipped. In the meanwhile, avoid using emplace_back() and use push_back() instead.
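A minimal sketch of the push_back() route, under a couple of assumptions: build_one is a hypothetical helper name, and the move operations on KV are spelled out by hand in case the compiler in use does not generate them implicitly (as far as I know, older MSVC releases did not).

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct KV {
    std::unique_ptr<std::string> k, v;
    KV() {}
    KV(std::unique_ptr<std::string> k_, std::unique_ptr<std::string> v_)
        : k(std::move(k_)), v(std::move(v_)) {}
    // Explicit move operations; copying stays unavailable because
    // unique_ptr members cannot be copied.
    KV(KV&& other) : k(std::move(other.k)), v(std::move(other.v)) {}
    KV& operator=(KV&& other) {
        k = std::move(other.k);
        v = std::move(other.v);
        return *this;
    }
};

// Builds a one-element vector using push_back with a temporary KV.
std::vector<KV> build_one(const std::string& key, const std::string& value) {
    std::unique_ptr<std::string> p1(new std::string(key));
    std::unique_ptr<std::string> p2(new std::string(value));
    std::vector<KV> out;
    out.push_back(KV(std::move(p1), std::move(p2)));  // moves the temporary in
    return out;
}
```

The temporary KV binds to the rvalue overload of push_back, so the element is moved into the vector rather than copied.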
You haven't mentioned what container you're trying to emplace_back into, but assuming it is a vector, if your KV struct is really that simple, there's no need to declare any constructors. Just use aggregate initialization.
#include <memory>
#include <string>
#include <utility>
#include <vector>
using namespace std;
struct KV
{
unique_ptr<string> k, v;
// KV(){}
// KV (unique_ptr<string> k_,unique_ptr<string> v_):k(move(k_)),v(move(v_)){}
};
int main()
{
unique_ptr<string> p1(new string());
unique_ptr<string> p2(new string());
KV v{move(p1), move(p2)}; // initialize an instance
// this step is not necessary, you can skip it
vector<KV> vec;
vec.emplace_back(KV{move(v.k), move(v.v)});
}
This question arose after reading this tutorial:
http://www.cprogramming.com/tutorial/auto_ptr.html
There you can find the following statement: A subtle consequence of this behavior is that auto_ptrs don't work well in all scenarios. For instance, using auto_ptr objects with the standard template library can lead to problems as some functions in the STL may make copies of the objects in containers such as the vector container class. One example is the sort function, which makes copies of some of the objects in the container being sorted. As a consequence, this copy can blithely delete the data in the container!
Most of the papers concerning 'auto_ptr' tell us something like following:
"Never use 'auto_ptr' with STL containers! They often copy their elements while performing intrinsic operations. For example consider sort on std::vector".
So my goal is to write a code sample that illustrates this point, or to prove that such examples are only theoretically true and unlikely in practice.
P.S. #everybody_who_also_knows_that_auto_ptr_is_deprecated
I also know this. But don't you consider technical reasons (legacy code or old compiler) that may not allow new pointer containers usage? And moreover this question is about old and bad (if you'd like) auto_ptr.
I don't have MSVC right now, but judging from the error from g++, I guess this is the reason:
auto_ptr<T> only has a "copy constructor" which takes mutable references (§D.10.1.1[auto.ptr.cons]/2–6):
auto_ptr(auto_ptr& a) throw();
template<class Y> auto_ptr(auto_ptr<Y>& a) throw();
But vector::push_back will accept a const reference (§23.3.6.1[vector.overview]/2).
void push_back(const T& x);
So it is impossible to construct an auto_ptr via push_back because no constructor takes a const reference.
From what you write, it seems that you already know everything that there is to know about containers of auto_ptrs and why they are unsafe.
Therefore, I assume that your interest in containers of auto_ptrs is purely teaching oriented. I understand your frustration in attempting to build a deliberate counter-example: in fact, most implementers of standard containers have put in place work-arounds to avoid accidentally triggering the broken semantics of auto_ptrs.
So, here's an example that I have written myself precisely for teaching:
#include <algorithm>
#include <cstdlib>
#include <memory>
#include <vector>
class MyClass {
int a;
public:
MyClass (int i) : a(i) { }
int get() const { return a; }
};
int main() {
constexpr unsigned size = 10;
std::vector< std::auto_ptr<MyClass> > coap;
coap.resize(size);
for (unsigned u=0; u<size; u++)
coap[u] = std::auto_ptr<MyClass>( new MyClass( rand() % 50 ));
std::sort( coap.begin(), coap.end(),
[]( std::auto_ptr<MyClass> a,
std::auto_ptr<MyClass> b) { return a->get() < b->get(); });
}
Compiling it with g++ 4.9.2 will lead to an executable that will segfault nicely.
You can rewrite the example above even more concisely by using type deduction:
std::sort( coap.begin(), coap.end(),
[]( auto a, auto b) { return a->get() < b->get(); });
Note that the problem is not in the specific implementation of std::sort, which seems to be auto_ptr-safe. It is rather in the comparison lambda function I am passing to std::sort, that deliberately accepts its arguments by value, thus destroying the objects in the container every time a comparison is performed.
If you changed the lambda so that it receives its arguments by reference, as shown below, most STL implementations would actually behave correctly, even if you are doing something that is conceptually wrong.
std::sort( coap.begin(), coap.end(),
[]( const std::auto_ptr<MyClass> & a,
const std::auto_ptr<MyClass> & b) { return a->get() < b->get(); });
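Since std::auto_ptr is no longer available in every toolchain, the by-value versus by-reference difference can also be reproduced with a small stand-in. StealingPtr below is a hypothetical class written only for this illustration; it mimics the transfer-on-copy behaviour of auto_ptr and is not a replacement for it.

```cpp
// Hypothetical stand-in for auto_ptr: "copying" transfers ownership.
template <typename T>
class StealingPtr {
public:
    explicit StealingPtr(T* p = nullptr) : p_(p) {}
    // Takes a mutable reference and empties the source, like auto_ptr.
    StealingPtr(StealingPtr& other) : p_(other.p_) { other.p_ = nullptr; }
    StealingPtr& operator=(StealingPtr& other) {
        if (this != &other) {
            delete p_;
            p_ = other.p_;
            other.p_ = nullptr;
        }
        return *this;
    }
    ~StealingPtr() { delete p_; }
    T* get() const { return p_; }
    T& operator*() const { return *p_; }
private:
    T* p_;
};

// By value: the parameter steals the pointee and deletes it on return.
int read_by_value(StealingPtr<int> p) { return *p; }
// By const reference: ownership stays with the caller.
int read_by_ref(const StealingPtr<int>& p) { return *p; }

bool by_value_empties_source() {
    StealingPtr<int> p(new int(7));
    int seen = read_by_value(p);
    return seen == 7 && p.get() == nullptr;
}

bool by_ref_keeps_source() {
    StealingPtr<int> p(new int(7));
    int seen = read_by_ref(p);
    return seen == 7 && p.get() != nullptr;
}
```

A comparator that takes its arguments by value does to every compared element what read_by_value does to p, which is why the first lambda leaves the container full of null pointers.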
Good luck!
STEP 1
Let's solve this problem in a straightforward way:
#include <iostream>
#include <vector>
#include <algorithm>
#include <memory>
#include <functional>
template<> struct std::less<std::auto_ptr<int>>: public std::binary_function<std::auto_ptr<int>, std::auto_ptr<int>, bool> {
bool operator()(const std::auto_ptr<int>& _Left, const std::auto_ptr<int>& _Right) const
{ // apply operator< to operands
return *_Left < *_Right;
}
};
int wmain() {
using namespace std;
auto_ptr<int> apai(new int(1)), apai2(new int(2)), apai3(new int(3));
vector<auto_ptr<int>> vec;
vec.push_back(apai3);
vec.push_back(apai);
vec.push_back(apai2);
for ( vector<auto_ptr<int>>::const_iterator i(vec.cbegin()) ; i != vec.cend() ; ++i )
wcout << i->get() << L'\t';
vector<int> vec2;
vec2.push_back(3);
vec2.push_back(2);
vec2.push_back(5);
sort(vec2.begin(), vec2.end(), less<int>());
sort(vec.begin(), vec.end(), less<auto_ptr<int>>());
return 0;
}
On MSVCPP11 the error text is as follows:
Error 1 error C2558: class 'std::auto_ptr<_Ty>': no copy constructor available or copy constructor is declared 'explicit' c:\program files (x86)\microsoft visual studio 11.0\vc\include\xmemory0 608
The conclusion is: I even cannot compile such example. Why do they prevent me to do something that I cannot compile?? Their preventions are not always true.
STEP 2
We cannot use auto_ptr as a vector element type directly due to the design of auto_ptr. But we can wrap auto_ptr in the way presented below.
#include <iostream>
#include <vector>
#include <algorithm>
#include <memory>
#include <functional>
template<typename T> class auto_ptr_my: public std::auto_ptr<T> {
public:
explicit auto_ptr_my(T *ptr = 0) {
this->reset(ptr);
}
auto_ptr_my<T> &operator=(const auto_ptr_my<T> &right) {
*(static_cast<std::auto_ptr<T> *>(this)) = *(static_cast<std::auto_ptr<T> *>(const_cast<auto_ptr_my *>(&right)));
return *this;
}
auto_ptr_my(const auto_ptr_my<T>& right) {
*this = right;
}
};
namespace std
{
template<> struct less<auto_ptr_my<int> >: public std::binary_function<auto_ptr_my<int>, auto_ptr_my<int>, bool> {
bool operator()(const auto_ptr_my<int>& _Left, const auto_ptr_my<int>& _Right) const
{ // apply operator< to operands
return *_Left < *_Right;
}
};
}
int wmain() {
using namespace std;
auto_ptr_my<int> apai(new int(1)), apai2(new int(2)), apai3(new int(3));
vector<auto_ptr_my<int>> vec;
vec.push_back(apai3);
vec.push_back(apai);
vec.push_back(apai2);
for ( vector<auto_ptr_my<int>>::const_iterator i(vec.cbegin()) ; i != vec.cend() ; ++i )
wcout << **i << L'\t';
sort(vec.begin(), vec.end(), less<auto_ptr_my<int>>());
for ( vector<auto_ptr_my<int>>::const_iterator i(vec.cbegin()) ; i != vec.cend() ; ++i )
wcout << **i << L'\t';
return 0;
}
This code works well showing that auto_ptr can be used with vector and sort with no memory leaks and crashes.
STEP 3
As KennyTM posted below:
add this code before return 0; statement:
std::vector<auto_ptr_my<int>> vec2 = vec;
for ( vector<auto_ptr_my<int>>::const_iterator i(vec2.cbegin()) ; i != vec2.cend() ; ++i )
wcout << **i << L'\t';
wcout << std::endl;
for ( vector<auto_ptr_my<int>>::const_iterator i(vec.cbegin()) ; i != vec.cend() ; ++i )
wcout << **i << L'\t';
wcout << std::endl;
...and get memory leaks!
CONCLUSION
Sometimes we can use auto_ptr with containers without a visible crash, sometimes not. Either way, it is bad practice.
But don't forget that auto_ptr is designed in such a way that you cannot use it directly with STL containers and algorithms: instead you have to write some wrapper code. In the end, using auto_ptr with STL containers is at your own risk. For example, some implementations of sort will not crash while processing vector elements, but other implementations will.
This question has academic purposes.
Thanks to KennyTM for providing STEP 3 crash example!
The conclusion is: I even cannot compile such example. Why do they prevent me to do something that I cannot compile??
IIRC, it is the other way around: the compiler vendor takes steps to prevent you from compiling something that you shouldn't be able to do. The way the standard is written, they could implement the library in a way that the code compiles, and then fails to work properly. They can also implement it this way, which is seen as superior because it's one of those few times where the compiler is actually allowed to prevent you from doing something stupid :)
The right answer is "never use auto_ptr at all": it was deprecated in C++11 and removed from the standard in C++17, for precisely the reasons outlined here. Use std::unique_ptr instead.
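For comparison, a hedged sketch of the same sort with std::unique_ptr. sort_through_unique_ptrs is a hypothetical helper name, and int stands in for MyClass to keep it short.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Wraps each input value in a unique_ptr, sorts the pointers by pointee,
// and returns the pointees in sorted order.
std::vector<int> sort_through_unique_ptrs(const std::vector<int>& input) {
    std::vector<std::unique_ptr<int>> owned;
    owned.reserve(input.size());
    for (int x : input)
        owned.push_back(std::unique_ptr<int>(new int(x)));

    // The comparator takes const references: no ownership changes hands.
    std::sort(owned.begin(), owned.end(),
              [](const std::unique_ptr<int>& a, const std::unique_ptr<int>& b) {
                  return *a < *b;
              });

    std::vector<int> out;
    out.reserve(owned.size());
    for (const auto& p : owned)
        out.push_back(*p);
    return out;
}
```

std::sort rearranges the elements by moving them, which unique_ptr supports. A comparator that took its arguments by value would simply fail to compile here, because unique_ptr cannot be copied, so the mistake that crashes with auto_ptr is caught at compile time.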