Please consider this code:
#include <iostream>
#include <vector>
#include <utility>
std::vector<int> vecTest;
int main()
{
int someRval = 3;
vecTest.push_back(someRval);
vecTest.push_back(std::move(someRval));
return 0;
}
So as far as I understand, someRval's value will be copied into vecTest on the first call to push_back(), but on the second call std::move(someRval) produces an xvalue. My question is: will there ever be any performance benefit? Probably not with int, but would there be some benefit when working with much larger objects?
The performance benefit from moving usually comes from avoiding a dynamic allocation (and the copy into it).
Consider an over-simplified (and naive) string (missing a copy-assignment operator and a move-assignment operator):
#include <cstring> // strlen, strcpy
class MyString
{
public:
MyString() : data(nullptr) {}
~MyString()
{
delete[] data;
}
MyString(const MyString& other) //copy constructor
{
data = new char[strlen(other.c_str()) + 1]; // another allocation
strcpy(data, other.c_str()); // copy over the old string buffer
}
void set(const char* str)
{
char* newString = new char[strlen(str) + 1];
strcpy(newString, str);
delete[] data;
data = newString;
}
const char* c_str() const
{
return data;
}
private:
char* data;
};
This is all fine and dandy, but the copy constructor here is possibly expensive if your string becomes long. The copy constructor is however required to copy over everything because it's not allowed to touch the other object; it must do exactly what its name says: copy the contents. This is the price you have to pay if you need a copy of the string, but if you just want to use the string's state and don't care about what happens to it afterwards, you might as well move it.
Moving only requires leaving the other object in some valid state, so we can use everything in other, which is exactly what we want. Instead of copying the content our data pointer is pointing to, all we have to do is re-assign our data pointer to that of other. We're basically stealing the contents of other. We'll also be nice and set the original data pointer to nullptr:
MyString(MyString&& other)
{
data = other.data;
other.data = nullptr;
}
There, this is all we have to do. This is obviously way faster than copying the whole buffer over like the copy constructor is doing.
Example.
Moving "primitive" types like int or even char* does nothing different than copying them.
Complex types, like std::string, can use the information that you are willing to sacrifice the source-object state to make moving far more efficient than copying.
Yes, but it depends on the details of your application: the size of the object and the frequency of the operation.
Casting it to an rvalue with std::move() and moving from it avoids a copy. If the object is large enough, this saves time (consider for example a std::vector holding 1 000 000 doubles: with the usual 8-byte double, copying it means copying about 8 MB of memory).
The other point is frequency: if your code does the respective operation very often, it can add up considerably.
Note that the source object is left in a valid but unspecified state (its contents are typically gone), and this might or might not be acceptable for your logic; you need to understand it and code accordingly. If you still need the source object's value afterwards, it obviously would not work.
Generally, don't optimize unless you need to optimize.
I've just finished learning about lvalues, rvalues, move constructor and move operator.
I can see the added value of using move constructor with rvalues performance and usability wise.
But I can't see the added value, performance-wise, of using the move operator with the move constructor on lvalues. Sure, there is an added value usability-wise, but can't we achieve the same functionality for lvalues using some other technique, like pointers for example?
So my question is: what is the added value, performance-wise, of using the move operator with the move constructor on lvalues?
Thanks for reading my question.
example:
#include <cstring> // std::strlen, std::memcpy
#include <utility> // std::move
class string{
public:
char* data;
string(){
data = nullptr;
}
string(const char* p)
{
size_t size = std::strlen(p) + 1;
data = new char[size];
std::memcpy(data, p, size);
}
~string()
{
delete[] data;
}
string(const string& that)
{
size_t size = std::strlen(that.data) + 1;
data = new char[size];
std::memcpy(data, that.data, size);
}
string(string&& that)
{
data = that.data;
that.data = nullptr;
}
void movee(string &that){ // returns nothing, so the return type is void
data = that.data;
that.data = nullptr;
}
};
what is the difference performance-wise:
// Option 1: move constructor
string s1("test");
string s2(std::move(s1));
// Option 2: hand-written movee()
string s1("test");
string s2 = string();
s2.movee(s1);
In the case of rvalues, the compiler spares us the time and memory needed to create a new object, assign the rvalue to it, and then use that new object to change the values in the moved-to object, thus increasing performance. But in the case of an lvalue object, the move operator is not increasing performance. It increases usability and readability of course, but not performance as it does for an rvalue. Am I wrong?
You could achieve the same functionality using indirection with pointers. Indeed, we used to, and even now that is exactly what is happening inside your move constructors/assignment operators!
Yes, most of the benefit when you std::move a variable is in the cleanliness of the code. But this is not pointless. Stealing resources with move constructors/assignment operators allows us to do so neatly and quickly, without hacks or any performance penalties. Your code will be more maintainable and easier to reason about. You will be much less tempted to add an extra layer of indirection, dynamically allocating things that didn't need to be dynamically allocated, just to get code that's in any way understandable, because now there won't be any benefit in doing so.
But! Don't forget that your standard containers are doing this for you behind the scenes, e.g. when std::vector resizes. That's very valuable. If we could only move from temporaries, that would not be possible and we'd have a massive wasted opportunity on our hands.
Doing it with temporaries isn't really any different; it's just that it's done for you without having to type std::move.
Since C++11, when using the move assignment operator, should I std::swap all my data, including POD types? I guess it doesn't make a difference for the example below, but I'd like to know what the generally accepted best practice is.
Example code:
class a
{
double* m_d;
unsigned int n;
public:
/// Another question: Should this be a const reference return?
const a& operator=(a&& other)
{
std::swap(m_d, other.m_d); /// correct
std::swap(n, other.n); /// correct ?
/// or
// n = other.n;
// other.n = 0;
return *this;
}
};
You might like to consider a constructor of the following form, so that n and m_d always hold meaningful, defined values:
a() : m_d(nullptr), n(0)
{
}
I think this should be rewritten this way.
class a
{
public:
a& operator=(a&& other)
{
if (this != &other) // guard against self-move
{
delete[] this->m_d; // avoid leaking; assumes m_d was allocated with new[]
this->m_d = other.m_d;
other.m_d = nullptr;
this->n = other.n;
other.n = 0; // n may represent the array size
}
return *this;
}
private:
double* m_d;
unsigned int n;
};
should I std::swap all my data
Not generally. Move semantics are there to make things faster, and swapping data that's stored directly in the objects will normally be slower than copying it, and possibly assigning some value to some of the moved-from data members.
For your specific scenario...
class a
{
double* m_d;
unsigned int n;
...it's not enough to consider just the data members to know what makes sense. For example, if you use your postulated combination of swap for non-POD members and assignment otherwise...
std::swap(m_d, other.m_d);
n = other.n;
other.n = 0;
...in the move constructor or assignment operator, then it might still leave your program state invalid if say the destructor skipped deleting m_d when n was 0, or if it checked n == 0 before overwriting m_d with a pointer to newly allocated memory, old memory may be leaked. You have to decide on the class invariants: the valid relationships of m_d and n, to make sure your move constructor and/or assignment operator leave the state valid for future operations. (Most often, the moved-from object's destructor may be the only thing left to run, but it's valid for a program to reuse the moved-from object - e.g. assigning it a new value and working on it in the next iteration of a loop....)
Separately, if your invariants allow a non-nullptr m_d while n == 0, then swapping m_ds is appealing as it gives the moved-from object ongoing control of any buffer the moved-to object may have had: that may save time allocating a buffer later; counter-balancing that pro, if the buffer's not needed later you've kept it allocated longer than necessary, and if it's not big enough you'll end up deleting and newing a larger buffer, but at least you're being lazy about it which tends to help performance (but profile if you have to care).
No, if efficiency is any concern, don't swap PODs. There is just no benefit compared to normal assignment, it just results in unnecessary copies. Also consider if setting the moved from POD to 0 is even required at all.
I wouldn't even swap the pointer. If this is an owning relationship, use unique_ptr and move from it, otherwise treat it just like a POD (copy it and set it to nullptr afterwards or whatever your program logic requires).
If you don't have to set your PODs to zero and you use smart pointers, you don't even have to implement your move operator at all.
Concerning the second part of your question:
As Mateusz already stated, the assignment operator should always return a normal (non-const) reference.
I've read countless articles on copy constructors and move semantics. I feel like I 'sort of' understand what's going on, but a lot of the explanations leave out what's actually occurring under the hood (which is what is causing me confusion).
For example:
string b(x + y);
string(string&& that)
{
data = that.data;
that.data = 0;
}
What is actually happening in memory with the objects? So you have some object 'b' that takes x + y which is an rvalue and then that invokes the move constructor. This is really causing me confusion... Why do that?
I understand the benefit is to 'move' the data instead of copy it, but where I'm lost here is when I try to piece together what happens to each object/parameter at a memory level.
Sorry if this sounds confusing, talking about it is even confusing myself.
EDIT:
In summary, I understand the 'why' of the copy constructors and move constructors... I just don't understand the 'how'.
What's going on is a complex object will normally not be entirely stack based. Let's take an example object:
class String {
public:
// happy fun API
private:
size_t size;
char* data;
};
Like most strings, our string is a character array. It essentially is an object that keeps around a character array and a proper size.
In the case of a copy, there are two steps involved. First you copy size, then you copy data. But data is just a pointer. So if we only copy the pointer and then modify the original, both objects point to the same data, so our copy changes too. This is not what we want.
So instead what must be done is to do the same thing we did when we first made the object, new the data to the proper size.
So when we're copying the object we need to do something like:
String::String(String const& copy) {
size = copy.size;
data = new char[size]; // char buffer, matching the type of data
memcpy(data, copy.data, size);
}
But on the other hand, if we only need to move the data, we can do something like:
String::String(String&& copy) {
size = copy.size;
data = copy.data;
copy.size = 0;
copy.data = nullptr; // So copy's dtor doesn't try to free our data.
}
Now behind the scenes, the pointer was just kinda... passed to us. We didn't have to allocate any more memory. This is why moves are preferred. Allocating and copying memory on the heap can be a very expensive operation because it's not happening locally on the stack, it's happening somewhere else, so that memory has to be fetched, it might not be in cache, etc.
... (x + y);
Let's assume Short-String-Optimisation is not in play - either because the string implementation doesn't use it or the string values are too long. operator+ returns by value, so has to create a temporary with a new buffer totally unrelated to the x and y strings...
[ string { const char* _p_data; ... } ]
\
\-------------------------(heap)--------[ "hello world!" ];
Sans optimisation, that's done to prepare the argument for the string constructor - "before" considering what that constructor will do with the argument.
string b(x + y);
Here the string(string&&) constructor is invoked, as the compiler understands that the temporary above is suitable for moving from. When the constructor starts running, its pointer to text is uninitialised - something like the diagram below with the temporary shown again for context:
[ string { const char* _p_data; ... } ]
\
\-------------------------(heap)--------[ "hello world!" ];
[ string b { const char* _p_data; ... } ]
\
\----? uninitialised
What the move constructor for b then does is steal the existing heap buffer from the temporary.
nullptr
/
[ string { const char* _p_data; ... } ]
-------------------------(heap)--------[ "hello world!" ];
/
/
[ string b { const char* _p_data; ... } ]
It also needs to set the temporary's _p_data to nullptr to make sure that when the temporary's destructor runs it doesn't delete[] the buffer now considered to be owned by b. (The move constructor will "move" other data members too - the "capacity" value, either a pointer to the "end" position or a "size" value etc.).
All this avoids having b's constructor create a second heap buffer, copy all the text over into it, only to then do extra work to delete[] the temporary's buffer.
(x + y) gives you a string value. You want to store it in b without copying it. This was made possible long before C++11 and move semantics, by the Return Value Optimization (RVO).
I'm surprised that s and s2 internal pointer to "sample" are not equal, what is the explanation ?
#include <string>
#include <cassert>
int main()
{
std::string s("sample");
std::string s2(std::move(s));
assert(
reinterpret_cast<int*>(const_cast<char*>(s.data())) ==
reinterpret_cast<int*>(const_cast<char*>(s2.data()))
); // assertion failure here
return 1;
}
Why do you assume that they should be the same? You are constructing s2 from s using its move constructor. This transfers the data ownership from s over to s2 and leaves s in an “empty” state. The standard doesn’t specify in detail what this entails: s remains valid, but what s.data() returns after the move (without re-assigning s first) is unspecified.
A much simplified (and incomplete) version of string could look as follows:
class string {
char* buffer;
public:
string(char const* from)
: buffer(new char[std::strlen(from) + 1])
{
std::strcpy(buffer, from);
}
string(string&& other)
: buffer(other.buffer)
{
other.buffer = nullptr; // (*)
}
~string() {
delete[] buffer;
}
char const* data() const { return buffer; }
};
I hope that shows why the data() members are not equal. If we had omitted the line marked by (*) we would delete the internal buffer twice at the end of main: once for s and once for s2. Resetting the buffer pointer ensures that this doesn’t happen.
Each std::string has its own buffer that it points to. Moving one into the other doesn't make them share a buffer. When you initialize s2, it takes over the buffer from s and becomes the owner of that buffer. To keep s from also "owning" that buffer, the move simply points s at a new empty one (which s is now responsible for).
Technically, there are also some optimizations involved, most likely there isn't a real buffer that got explicitly allocated for empty or very small strings, but instead the implementation of std::string will use a part of the std::string's memory itself. This is usually known as the small-string-optimization in the STL.
Also note that s has been moved from, so its contents are unspecified, meaning your code's access to its data could return anything.
You should not use the moved-from string before replacing its value with some known value:
The library code is required to leave a valid value in the argument, but unless the type or function documents otherwise, there are no other constraints on the resulting argument value. This means that it's generally wisest to avoid using a moved from argument again. If you have to use it again, be sure to re-initialize it with a known value before doing so.
The library can stick anything it wants into the string, but it's very likely that you would end up with an empty string. That's what running an example from cppreference produces. However, one should not expect to find anything in particular inside a moved-from object.
so I have a structure like
struct GetResultStructure
{
int length;
char* ptr;
};
I need a way to make a full copy of it, meaning the copy should have a new ptr pointing to a copy of the data the original structure had. Is it possible somehow? Any structure I have that contains pointers will also have fields with their lengths, so I need a function that copies my structure, copying all pointers and the data they point to, given an array of lengths... Is there any cool boost function for it? Or any way to create such a function?
For the specific scenario you describe, use a std::vector or some other sequence container. If you do so, then simply copying objects of type GetResultStructure will make copies of the pointed-to data as well:
struct GetResultStructure {
std::vector<char> data;
};
GetResultStructure a = GetData();
GetResultStructure b = a; // a and b have distinct, equivalent data vectors
In general, when you do need to implement this yourself, you do so by implementing a copy constructor and a copy assignment operator. The easiest way to do that is to use the copy-and-swap idiom, covered in great detail in What is the copy-and-swap idiom?
It's pretty much up to you to implement that. Normally you want to do it as a copy constructor so you only have to do it in one place. Unfortunately, there's no real magic to avoid telling the computer about how to copy your structure.
Of course, that only applies if your structure really is substantially different from something that's already written. The one you've given looks a lot like a string or (possibly) vector. Unless you really need to implement something new, you're probably better off just using one of those that's already provided.
Both a copy constructor and an assignment operator should be implemented (in the way stated above). A technique which may aid in this process is dereferencing the pointers (with * or []) when copying pointer data. This copies the pointed-to data rather than the memory locations. If you write ptr1 = ptr2, it simply makes ptr1 point to the same memory location as ptr2, which is why we dereference.
For instance, I'll just show a quick example for a copy constructor:
GetResultStructure(const GetResultStructure& other)
: length(other.length), ptr(new char[other.length]) // <--- VERY _important_ - initialization of pointer
{
// Alternatively, put your initialization here like so:
// ptr = new char[length];
for(int i=0;i<length;++i)
{
ptr[i] = other.ptr[i]; // Copy the values - not the memory locations
}
}
And then obviously be sure to clean up in your destructor to prevent memory leaks.
Regards,
Dennis M.
GetResultStructure doCopy(GetResultStructure const& copy) {
GetResultStructure newstruct;
newstruct.length = copy.length;
newstruct.ptr = new char[newstruct.length];
memcpy(newstruct.ptr, copy.ptr, newstruct.length*sizeof(char));
return newstruct;
}
Should be simple. Yes, the sizeof(char) isn't really necessary, but there to show what to do for other data types.
Since you tagged it as C++: Write a copy constructor and an assignment operator,
within which you implement your deep copy code:
struct GetResultStructure
{
GetResultStructure(const GetResultStructure& other)
{
// Deep copy code in here
}
GetResultStructure& operator=(const GetResultStructure& other)
{
if (this != &other) {
// Deep copy code in here
}
return *this;
}
int length;
char* ptr;
};