What mechanism makes a string empty after std::move()? - c++

I'm confused about how std::move() actually empties something.
I wrote some code:
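#include <iostream>
#include <string>
#include <utility>
int main()
{
    std::string str1("this is a string");
    std::cout << std::boolalpha << str1.empty() << std::endl;
    std::string str2(std::move(str1));
    std::cout << "str1: " << str1.empty() << std::endl;
    std::cout << "str2: " << str2.empty() << std::endl;
}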
The output is:
false
str1: true //this means the original string is emptied
str2: false
Why is the original string emptied every time?
I have read some material about move semantics, including the original proposal (this one), which said:
The difference between a copy and a move is that a copy leaves the source unchanged. A move on the other hand leaves the source in a state defined differently for each type. The state of the source may be unchanged, or it may be radically different. The only requirement is that the object remain in a self consistent state (all internal invariants are still intact). From a client code point of view, choosing move instead of copy means that you don't care what happens to the state of the source.
So, according to these words, the original content of str1 above should be some kind of undefined. But why is it emptied every time it is moved from? (Actually, I have tested this behavior on both std::string and std::vector, and the result is the same.)
To learn more, I defined my own string class to test, as below:
class mstring
{
private:
    char *arr;
    unsigned size;
public:
    mstring():arr(nullptr),size(0){}
    mstring(const char *init):size(50)
    {
        arr = new char[size]();
        strncpy(arr,init,size);
        while(arr[size-1] != '\0') //grow by 50 bytes until the terminator fits
        {
            char *tmp = arr;
            arr = new char[size+=50]();
            memcpy(arr,tmp,size-50); //keep everything copied so far
            delete[] tmp;
            strncpy(arr+(size-50),init+(size-50),50);
        }
    }
    bool empty(){ return size==0;}
};
Doing the same thing:
int main()
{
mstring str("a new string");
std::cout<<std::boolalpha<<str.empty()<<std::endl;
mstring anotherStr(std::move(str));
std::cout<<"Original: "<<str.empty()<<std::endl;
std::cout<<"Another: "<<anotherStr.empty()<<std::endl;
}
The output is:
false
Original: false //means that the original string is still there
Another: false
Even if I add a move constructor like this:
mstring(mstring&& rvalRef)
{
*this = rvalRef;
}
The result is still the same.
My question is: why does std::string get emptied but my self-defined class doesn't?

Because that's how the std::string move constructor is implemented. It takes ownership of the old string's contents (i.e. the dynamically-allocated char array), leaving the old string with nothing.
Your mstring class, on the other hand, doesn't actually implement move semantics. It has a move constructor, but all it does is copy the string using operator=. A better implementation would be:
mstring(mstring&& rvalRef): arr(rvalRef.arr), size(rvalRef.size)
{
rvalRef.arr = nullptr;
rvalRef.size = 0;
}
This transfers the contents to the new string and leaves the old one in the same state that the default constructor would have created it. This avoids the need to allocate another array and copy the old one into it; instead, the existing array just gets a new owner.
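To make the idea above concrete, here is a minimal sketch of a string class whose move constructor steals the buffer. It is a simplified rewrite, not the asker's exact class: the exact-size allocation, the copy constructor, the by-value assignment operator, the destructor, and the helper move_empties_source() are all assumptions added so the example owns its memory safely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical, simplified rewrite of the question's mstring.
class mstring {
    char* arr;
    std::size_t size;   // bytes allocated; 0 for a default-constructed or moved-from object
public:
    mstring() : arr(nullptr), size(0) {}

    explicit mstring(const char* init)
        : arr(new char[std::strlen(init) + 1]), size(std::strlen(init) + 1) {
        std::memcpy(arr, init, size);   // copies the terminating '\0' too
    }

    // Copy constructor: allocate a fresh buffer and duplicate the bytes.
    mstring(const mstring& other)
        : arr(other.size ? new char[other.size] : nullptr), size(other.size) {
        if (size) std::memcpy(arr, other.arr, size);
    }

    // Move constructor: take over the buffer and leave the source empty.
    mstring(mstring&& other) noexcept : arr(other.arr), size(other.size) {
        other.arr = nullptr;
        other.size = 0;
    }

    // The by-value parameter is copy- or move-constructed, then swapped in.
    mstring& operator=(mstring other) noexcept {
        std::swap(arr, other.arr);
        std::swap(size, other.size);
        return *this;
    }

    ~mstring() { delete[] arr; }   // delete[] on nullptr is a no-op

    bool empty() const { return size == 0; }
    const char* c_str() const { return arr ? arr : ""; }
};

// Returns true when moving empties the source and fills the destination.
bool move_empties_source() {
    mstring str("a new string");
    mstring another(std::move(str));
    return str.empty() && !another.empty();
}
```

With this version, the test from the question would report the original as empty after the move, because the move constructor explicitly resets the source.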

So, according to these words, the original content of str1 above should be some kind of undefined.
The quote says absolutely nothing about what the state should be. It says what the state could be: anything well-defined enough that the object can be destroyed or assigned a new value.
Empty qualifies as anything, so the state can be empty.
It also makes most sense in this case. A string is, essentially, something like
class string {
char *_M_data;
size_t _M_size;
size_t _M_alloc;
public:
...
};
where the _M_data is allocated with new and must be deleted with delete (this is customizable with the allocator parameter, but the default allocator does just that).
Now if you don't care about the state of the source, the fastest thing that can be done is to assign the buffer to the destination and replace the buffer with nullptr in the source (so it does not get deleted twice). A string with no buffer is empty.
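A small sketch of what client code may safely do with a moved-from std::string: the only thing relied on here is that the source can be assigned a new value, not that it is empty. The function name move_then_reuse is made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Moves `source` into a new string, then gives `source` a fresh value.
// Returns the string that received the original contents.
std::string move_then_reuse(std::string& source, const std::string& fresh) {
    std::string destination(std::move(source));
    // `source` is now valid but unspecified: do not rely on it being empty.
    source = fresh;   // assigning to a moved-from string is always allowed
    return destination;
}
```

Code written this way behaves the same whether the implementation leaves the source empty, unchanged, or anything in between.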

Related

How to efficiently move underlying data from std::string to a variable of another type?

EDIT: Sorry everyone, I don't think this toy example really reflected my problem. What I should have asked is if there is a way to release a std::string object's buffer. There is not, and that makes sense. Thanks!
Suppose I have the following (broken) code:
void get_some_data(MyCustomContainer& val)
{
std::string mystr = some_function();
val.m_data = &mystr[0];
}
This won't work, because the memory pointed to by mystr is freed at the end of get_some_data, and the memory referenced by val.m_data will be invalid.
How can I tell a std::string "Don't free your memory buffer in your destructor!"? I don't want to copy the data. The MyCustomContainer object will handle freeing the memory in its destructor.
You cannot do this without breaking the rules. The std::string class offers no way to release ownership of its buffer. In fact, a std::string might not even have any heap memory allocated, thanks to the small buffer optimization (SBO):
std::string str1 = "not allocating";
std::string str2 = "allocating on the heap, the string is too large";
This behavior is completely platform- and implementation-dependent. If a string doesn't allocate its buffer on the heap, the data is stored inside the string object itself (on the stack, for a local variable), which doesn't need de-allocation.
{
std::string str1 = "not allocating";
} // no buffer freed
So even if there were a way to tell the string not to de-allocate its buffer, there is no way to tell if the buffer is managed on the heap or not.
Even if there were a way to tell if the string uses the stack, you'd have to allocate a buffer in place as a class member and copy its content.
The idea of transferring a string's data and stealing its ownership over that string's memory resource is fundamentally broken as you can't get away without copying, simply because there might be no ownership to steal.
What I recommend is for you to copy the string content in all cases if you don't want to change how MyCustomContainer works:
void get_some_data(MyCustomContainer& val)
{
    std::string mystr = some_function();
    // +1 so the copy keeps the terminating '\0'
    val.m_data = new char[mystr.size() + 1];
    std::memcpy(val.m_data, mystr.c_str(), mystr.size() + 1);
}
In contrast, if you allow MyCustomContainer to store a std::string, you could actually get away without copying when a buffer is allocated by moving the string:
void get_some_data(MyCustomContainer& val)
{
// let m_data be a std::string
val.m_data = some_function();
// The above is equivalent to this:
// std::string mystr = some_function();
// val.m_data = std::move(mystr);
}
Moving a string will invoke move assignment. With move assignment, the string implementation will transfer ownership of mystr's buffer to m_data. This will prevent any additional allocation.
If mystr didn't allocate, then the move assignment will simply copy the data (so no allocation there, either).
The right way to fix this problem is:
class MyCustomContainer {
public:
std::string m_data;
};
void get_some_data(MyCustomContainer& val) {
val.m_data = some_function();
}
The get_some_data could even be made into a member function, which would make the usage even easier at the callsite, and perhaps allow m_data to be private instead of exposed.
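As a rough sketch of that suggestion, the container below keeps m_data private and fills it through member functions. The body of some_function() and the member names set_data, load, and data are placeholders invented for the example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the question's some_function().
std::string some_function() {
    return "a string long enough that most implementations heap-allocate it";
}

class MyCustomContainer {
    std::string m_data;   // owns the characters; released automatically
public:
    // Takes the string by value so callers can pass a temporary or std::move.
    void set_data(std::string s) { m_data = std::move(s); }

    // Move-assigns from the temporary returned by some_function().
    void load() { m_data = some_function(); }

    const std::string& data() const { return m_data; }
};
```

A caller would write container.load(), or container.set_data(std::move(mystr)) when it already holds a string it no longer needs.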
If .m_data is an std::string you can take advantage of std::string's move-assignment operator:
val.m_data = std::move(mystr);
If m_data is not an std::string you are pretty much out of luck, the internal buffer is inaccessible (as it should be).
No, you cannot. std containers will only give up their managed memory (and then only sometimes) to std containers of the same type.
For string this would be impossible regardless, as most implementations do a short string optimization and store short strings internally.
You could throw the std string into a global buffer somewhere and reap it at cleanup, but that gets insanely complex.
If you want you can use this code that causes Undefined Behavior, so it should not be used, but if you are working on some toy project of your own that will be quickly abandoned you can see if it works for you.
// REQUIRES: str is long enough so that it is using heap,
// std::string implementation does not use CoW implementation...
// ...
char* steal_memory(string&& str){
alignas(string) char buff[sizeof(string)];
char* stolen_memory = const_cast<char*>(str.data());
new(buff) string(move(str));
return stolen_memory;
}
If you want to handle short string you should add malloc and copy from buffer for that case.
Main idea here is to use placement new that takes the ownership from our input string, and not calling the destructor on string in buff. No destructor means no calls to free, so we can steal memory from the string.
Unfortunately const_cast is UB in this case so like I said you should never use this code in serious code.
You can make mystr static
void get_some_data(MyCustomContainer& val)
{
static std::string mystr;
mystr = some_function();
val.m_data = &mystr[0];
}
but, in this way, you have only one mystr for all get_some_data() calls; so
get_some_data(mcc1);
get_some_data(mcc2);
// now both `mcc1.m_data` and `mcc2.m_data` point to the same value,
// obtained from the second `some_function()` call
If you can compile-time enumerate the calls to get_some_data(), you can differentiate your mystr using a template index
template <std::size_t>
void get_some_data(MyCustomContainer& val)
{
static std::string mystr;
mystr = some_function();
val.m_data = &mystr[0];
}
get_some_data<0U>(mcc1);
get_some_data<1U>(mcc2);
// now `mcc1.m_data` and `mcc2.m_data` point to different values

C++ program attempting to double dealloc custom object

For future visitors
It turns out I didn't have a copy assignment operator defined in my custom class, and therefore the compiler defaulted to "copy the object's pointer" behavior.
If you're confused about what a copy assignment operator is, just like I was, this resource may help you figure out what a "copy assignment operator" looks like in C++.
The original problem prompt is below. (Links to the original source code will expire in a month; sorry!)
I'm working on a console application which simulates a bookstore, but keep on getting a _BLOCK_TYPE_IS_VALID(pHead->nBlockUse) error message in my program during execution. After doing some searching around (and some frustrating debugging), I've come to the conclusion that my object's destructor is getting called twice. I hope the code snippets below will make it a little clearer what I mean. (Clicking on the file names will open a pastie with the relevant code for easier reading/decluttering.)
bookdata.h
#ifndef BOOKDATA_H
#define BOOKDATA_H
class bookData {
private:
char* bookTitle;
char* isbn;
char* author;
char* publisher;
char* dateAdded;
int qtyOnHand;
double wholesale;
double retail;
public:
bookData();
bookData(char* title, char* isbn, char* author, char* publisher, char* date, int qty, double wholesale, double retail);
bookData(bookData& book); // Meant to be called during memberwise assignment
~bookData();
// Various setter & getter funcs
};
#endif
bookData.cpp
#include "globals.h"
#include "bookData.h"
using namespace std;
// Variables in caps are const ints defined in globals.h
bookData::bookData() {
bookTitle = new char[TITLE_LENGTH];
isbn = new char[ISBN_LENGTH];
author = new char[AUTHOR_LENGTH];
publisher = new char[PUBLISHER_LENGTH];
dateAdded = new char[DATE_LENGTH];
qtyOnHand = 0;
wholesale = 0;
retail = 0;
char emptyTitle[2];
emptyTitle[0] = '\0';
setTitle(emptyTitle);
}
// Other constructors are overloaded version of bookData & copy data;
// See example setter function below destructor
bookData::~bookData() {
if (bookTitle)
delete [] bookTitle;
else
return;
delete [] isbn;
delete [] author;
delete [] publisher;
delete [] dateAdded;
}
// Setter functions are of this form (excl. ints & doubles)
void bookData::setTitle(const char* input) {
for (int len = 0; len < TITLE_LENGTH - 1; len++) {
*(bookTitle + len) = *(input + len);
if (*(input + len) == '\0')
break;
else if (len == TITLE_LENGTH - 2)
*(bookTitle + ++len) = '\0';
}
}
// Getter functions are of this form (excl. ints & doubles)
const char* bookData::getTitle() { return bookTitle; }
reports.cpp (the file that's calling the destructor twice)
void repQty() {
// Again, variables in all caps are defined in globals.h if you don't see
// their declaration
bookData bookArray[MAX_RECORDS];
// Global function which populates bookArray from a datafile
bookData* books = getBooks(bookArray);
// Some code to find the memory address of the first and last book in the records
bookData* HEAD = books;
// Keep advancing until books no longer points to a non-empty bookData object
// "Empty" defined as book's bookTitle variable starting with '\0'
bookData* TAIL = --books;
// Need all 3 pointers for a naive, in place insertion/linear sort routine
// Outputs book data following the sort
// Before returning, calls the destructors for the books in bookArray
// Also calls the destructor for books, HEAD, and TAIL as well
// ...which were already called as part of the bookArray's destructor calls
// Which is where I have my problem now
}
Bonus: globals.h
As you may have noticed, I've already attempted to check whether the bookData object has already been deleted by using if (bookTitle) in the destructor function, but it still evaluates as true when I'm running it through VS's Step Into functionality. Short of nuking the destructor all together, what can I do to get around this problem and make the destructor prematurely exit if the object in question has already been deallocated?
I've already attempted to check whether the bookData object has already been deleted by using if (bookTitle) in the destructor function
Since delete[] doesn't set the pointer to NULL, the check is effectively a no-op.
Even if you set the pointer to NULL manually, you'd be tackling the symptoms of the problem rather than the root cause.
The root cause is that you're not implementing the copy assignment operator, thereby violating the rule of three. What happens is that you're using the implicitly-generated copy assignment operator:
swap = *books;
*books = *(books - 1);
*(books - 1) = swap;
and that operator doesn't do the right thing: it copies the pointers instead of copying the data. The double deletes are a direct consequence of that.
Additionally, the implementation of the copy constructor could be buggy, but it's hard to be sure without seeing its source code.
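For illustration, here is a cut-down sketch of the rule of three applied to a class with one owned C string. The class name Book, the value of TITLE_LENGTH, and the helper swap_books are assumptions; the real bookData has more fields and takes its constants from globals.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const std::size_t TITLE_LENGTH = 51;   // assumed value, for the sketch only

// Hypothetical, cut-down version of bookData with a single owned buffer.
class Book {
    char* title;
public:
    Book() : title(new char[TITLE_LENGTH]()) {}

    // Copy constructor: allocate a separate buffer and copy the characters.
    Book(const Book& other) : title(new char[TITLE_LENGTH]) {
        std::memcpy(title, other.title, TITLE_LENGTH);
    }

    // Copy assignment: both buffers have the same fixed size, so copy in place.
    Book& operator=(const Book& other) {
        if (this != &other)
            std::memcpy(title, other.title, TITLE_LENGTH);
        return *this;
    }

    ~Book() { delete[] title; }

    void setTitle(const char* input) {
        std::strncpy(title, input, TITLE_LENGTH - 1);
        title[TITLE_LENGTH - 1] = '\0';
    }
    const char* getTitle() const { return title; }
};

// Swaps two books with the same three assignments the sort routine uses.
void swap_books(Book& a, Book& b) {
    Book tmp = a;
    a = b;
    b = tmp;
}
```

Because every Book keeps its own buffer, each destructor frees a distinct array and the swap no longer leads to a double delete.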
P.S. You'd do yourself a massive favour by using std::string instead of C strings. Also, std::vector is to be preferred to C arrays.

const char * overwritten in next iteration of while loop

First of all, everything is happening in an if{} statement in a do{}while loop. I have a struct that contains some const char pointers. I'm trying to get info into a temp struct with new string values each iteration, then push this struct into a vector of said structs, so that when the function exits, the vector is populated with different struct objects.
do{
if()
{
sound_device_t newDevice; //<--- Why is this the same mem address each iteration?
//I thought it would be destroyed when it's scope was (the if block)
const char * newPath;
someFunction(&newPath); //puts a string into newPath
newDevice.firstString = newPath; //<-- This works.
QString otherPath(const char *);
//...some QString manipulation...//
newDevice.secondString = otherPath.toLocal8Bit().data(); //<--this doesn't
vector_of_structs -> push_back(newDevice);
}
}while (...)
I was under the impression that push_back copied the argument struct's values into its own version. Why is the QString giving me problems? I am using QString because it has some good string manipulation functions (i.e. insert and section), but I'll exchange it if I need to for something that works.
I have also tried putting the QString's data into a char * and then strcpy'ing it into the struct, but that has the same result. Every iteration rewrites newDevice.secondString.
The pointer returned by QByteArray::data() is only valid as long as the QByteArray is alive and unmodified. Destroying the temporary invalidates it.
In other words, after the semicolon of the line newDevice.secondString = otherPath.toLocal8Bit().data(); the QByteArray returned by toLocal8Bit is destroyed and the stored array is deleted.
There are a couple of issues with your code:
if statement without condition (!)
invalid construction: QString otherPath(const char *); You probably want an "otherPath" variable there similarly to "newPath".
You are mixing Qt types with std containers. You should take a look at QStringList.
Needless pointer usage: newDevice.secondString = otherPath.toLocal8Bit().data();
The last one is especially critical since the temporary QByteArray is destroyed at the end of that statement, which leaves the stored pointer dangling. The solution is to make a deep copy there.
I would write something like this:
do {
if(cond) {
sound_device_t newDevice;
const char * newPath;
someFunction(&newPath);
newDevice.firstString = newPath;
// Get other path
QString otherPath(otherPath);
//...some QString manipulation...
newDevice.secondQString = otherPath;
// or: strcpy( newDevice.secondString, otherPath.toLocal8Bit().data());
vector_of_structs->push_back(newDevice);
}
} while (...)
That being said, depending on what you are trying to do, QtMultiMedia might be better used overall for your sound device purpose. As far as dbus goes, there is also a QtDBus add-on module.
Thanks for all the help guys. I got the original code working with just one tweak:
newDevice.secondString = otherPath.toLocal8Bit().data();
should be changed to
newDevice.secondString = strdup(otherPath.toLocal8Bit().data());
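This allocates a new buffer and copies the bytes into it, as @ratchet freak was suggesting. strcpy() on its own doesn't help because it only copies into a buffer that already exists; it does not allocate one for newDevice.secondString. toLatin1().data() has the same lifetime problem as toLocal8Bit().data().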
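An alternative to strdup() is to let the struct own its strings by value, so nothing has to be freed by hand. The sketch below uses std::string only so that it compiles without Qt; the struct sound_device and the functions make_other_path and collect are made up for the example, and in a Qt program the members could be QString or QByteArray instead.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical struct: the strings are stored by value, not as borrowed pointers.
struct sound_device {
    std::string firstString;
    std::string secondString;
};

// Stand-in for the "QString manipulation" step; returns a temporary.
std::string make_other_path(const std::string& path) {
    return path + "/ctl";
}

std::vector<sound_device> collect(const std::vector<std::string>& paths) {
    std::vector<sound_device> devices;
    for (const std::string& p : paths) {
        sound_device d;
        d.firstString = p;
        // The member takes over the temporary's characters, so nothing dangles
        // once the temporary is destroyed at the end of the statement.
        d.secondString = make_other_path(p);
        devices.push_back(d);   // copies both strings into the vector
    }
    return devices;
}
```

Each element of the returned vector keeps its own text, so a later iteration cannot overwrite an earlier entry.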

Strange this-> behaviour

So I have the following class
class Community
{
private:
char* Name;
char foundationDate[11];
Person* founder;
int maxMembersCount;
int membersCount;
Person* members;
static int communitiesCount;
.....
and I want to implement a copy constructor:
Community::Community(const Community& other)
{
this->Name = new char[strlen(other.Name)+1];
strcpy(this->Name,other.Name);
strcpy(this->foundationDate,other.foundationDate);
this->founder = other.founder;
this->maxMembersCount = other.maxMembersCount;
this->membersCount = other.membersCount;
this->members = new Person[this->maxMembersCount];
this->members = other.members;
communitiesCount++;
}
but this code crashes whenever I write Community A=B;
To me this code seems legit, but when I start debugging there is the message: this-> "unable to read memory". Please help; if you need more code, let me know.
Community::Community(const char* name , char foundDate[],Person* founder,int maxMembers) {
this->Name = new char[strlen(name)+1];
strcpy(this->Name,name);
strcpy(this->foundationDate,foundDate);
this->founder = new Person(founder->getName(),founder->getEGN(),founder->getAddress());
this->maxMembersCount = maxMembers;
this->membersCount = 2;
this->members = new Person[this->maxMembersCount];
communitiesCount++;
}
this is the main constructor of the class which works just fine....
There are multiple problems here, any of which could be part or all of the problem.
If Name or foundationDate is not null-terminated on the right-hand side, it will run off and copy bad memory.
If founder or members are owned by the object, you will either leak memory if you don't delete them in the destructor, or cause a whole variety of memory-related problems when you shallow-copy and then delete twice, etc.
To fix this, just make your Name and foundationDate std::string, and then make founder and members be owned by value rather than by pointer. If you absolutely have to allocate them on the heap use a smart pointer such as shared_ptr to hold it instead of a bug-prone raw pointer.
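A minimal sketch of that by-value design follows. The Person struct is a placeholder with a single field, and the member functions addMember, membersCount, getName, and getFounder are invented for the example; the static communitiesCount counter from the question is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal Person; the real class has more fields.
struct Person {
    std::string name;
};

class Community {
    std::string name;
    Person founder;                 // held by value, copied with the object
    std::size_t maxMembersCount;
    std::vector<Person> members;    // owns its elements
public:
    Community(const std::string& n, const Person& f, std::size_t maxMembers)
        : name(n), founder(f), maxMembersCount(maxMembers) {}

    // No user-written copy constructor, assignment operator, or destructor:
    // the compiler-generated ones copy every member correctly.

    bool addMember(const Person& p) {
        if (members.size() >= maxMembersCount) return false;
        members.push_back(p);
        return true;
    }
    std::size_t membersCount() const { return members.size(); }
    const std::string& getName() const { return name; }
    const Person& getFounder() const { return founder; }
};
```

With this layout, Community A = B; produces two fully independent objects without any manual memory management.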
First of all, check that other.Name is filled with a pointer to a null-terminated string, that other.foundationDate contains a null-terminated string. That is, you pass good pointers to strlen and strcpy.
If that's true, check that B in the assignment is accessible altogether.
If that's true too, printf everything. And debug where exactly the exception occurs. Or post whole code that is compilable and which reproduces the error.
Also note that here:
this->members = new Person[this->maxMembersCount];
this->members = other.members;
the first assignment does nothing (leaks memory, in fact) while the second double deletes your memory upon object destruction (if you properly delete[] members).
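If the raw array has to stay, the copy needs to allocate a new array and then copy the elements, not the pointer. The sketch below isolates just that part; the class MemberList and the one-field Person are stand-ins made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

struct Person { std::string name; };   // hypothetical minimal Person

// Hypothetical holder for just the members array from the question.
class MemberList {
    std::size_t capacity;
    Person* members;
public:
    explicit MemberList(std::size_t cap) : capacity(cap), members(new Person[cap]) {}

    // Deep copy: allocate our own array, then copy the elements into it.
    MemberList(const MemberList& other)
        : capacity(other.capacity), members(new Person[other.capacity]) {
        std::copy(other.members, other.members + capacity, members);
    }

    // Build the new array first, then release the old one.
    MemberList& operator=(const MemberList& other) {
        if (this != &other) {
            Person* fresh = new Person[other.capacity];
            std::copy(other.members, other.members + other.capacity, fresh);
            delete[] members;
            members = fresh;
            capacity = other.capacity;
        }
        return *this;
    }

    ~MemberList() { delete[] members; }

    Person& at(std::size_t i) { return members[i]; }
    const Person& at(std::size_t i) const { return members[i]; }
};
```

Each object now deletes only the array it allocated itself, so destroying a copy leaves the original intact.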

std::vector overwriting final value, rather than growing?

I'm having an issue where using vector.push_back(value) is overwriting the final value, rather than appending to the end. Why might this happen? I have a sample item in the vector, so its size never hits zero. Below is the code.
void UpdateTable(vector<MyStruct> *Individuals, MyStruct entry)
{
MyStruct someEntry;
bool isNewEntry = true;
for (int i = 0; i < Individuals->size(); i++)
{
if (!(strcmp(Individuals->at(i).sourceAddress, entry.sourceAddress)))
{
isNewEntry = false;
//snip. some work done here.
}
}
if(isNewEntry)
{
Individuals->push_back(entry);
}
}
This lets my first "sample" value stay in, and will allow for just one more item in the vector. When 2 new entries are added, the second overwrites the first, so the size is never larger than 2.
edit: More code, since this is apparently not the issue?
void *TableManagement(void *arg)
{
//NDP table to store discovered devices.
//Filled with a row of sample data.
vector<MyStruct> discoveryTable;
MyStruct sample;
sample.sourceAddress = "Sample";
sample.lastSeen = -1;
sample.beaconReceived = 1;
discoveryTable.push_back(sample);
srand(time(NULL));
while(1)
{
int sleepTime = rand() % 3;
sleep(sleepTime);
MyStruct newDiscovery = ReceivedValue();
if (newDiscovery.lastSeen != -1000) //no new value from receivedValue()
{
UpdateTable(&discoveryTable, newDiscovery);
}
printTable(&discoveryTable);
}
return NULL;
}
I'm going to hazard a guess:
Suppose MyStruct is declared like
struct MyStruct
{
const char *sourceAddress;
// Other Gubbins ...
};
And that ReceivedValue does something like
MyStruct ReceivedValue()
{
static char nameBuffer[MAX_NAME_LEN];
// Do some work to get the value, put the name in the buffer
MyStruct s;
s.sourceAddress = nameBuffer;
// Fill out the rest of MyStruct
return s;
}
Now, every structure you push into your vector has sourceAddress pointing to the same global buffer, every time you call ReceivedValue it overwrites that buffer with the new string - so every entry in your vector ends up with the same string.
I can't be sure without seeing the rest of your code, but I can be sure that if you follow some of the good C++ style suggestions in the comments to your question this possibility would go away.
Edit for clarification: there's no need to heap allocate your structures, simply declaring sourceAddress as a std::string would be sufficient to eliminate this possibility.
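The guess above can be reproduced in a few lines. This sketch is built on the same assumption as the answer (a shared static buffer); the names PointerEntry, StringEntry, receive_name, collect_pointers, and collect_strings are invented for the demonstration.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

struct PointerEntry { const char* sourceAddress; };   // borrows the text
struct StringEntry  { std::string sourceAddress; };   // owns a copy of the text

// Writes "node-<id>" into one shared static buffer and returns a pointer to it.
const char* receive_name(int id) {
    static char nameBuffer[32];
    std::snprintf(nameBuffer, sizeof nameBuffer, "node-%d", id);
    return nameBuffer;
}

std::vector<PointerEntry> collect_pointers(int count) {
    std::vector<PointerEntry> table;
    for (int id = 1; id <= count; ++id) {
        PointerEntry e;
        e.sourceAddress = receive_name(id);   // every entry aliases nameBuffer
        table.push_back(e);
    }
    return table;
}

std::vector<StringEntry> collect_strings(int count) {
    std::vector<StringEntry> table;
    for (int id = 1; id <= count; ++id) {
        StringEntry e;
        e.sourceAddress = receive_name(id);   // std::string copies the characters
        table.push_back(e);
    }
    return table;
}
```

In the pointer version all entries show the most recently received name, which would also make the strcmp check in UpdateTable treat every new value as already present. The std::string version keeps each name distinct.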
The scope for the items you are pushing into the database is expiring. They're being destructed when you leave the {} in which they were created - and as such the reference to them is no longer valid.
You need to change it from vector<MyStruct> to vector<MyStruct*> (preferably using safe pointers from Boost:: instead of pointers, but you get the idea).
You are creating the item within the (limited) scope and pushing it onto the vector (while the struct is copied, the strings in it are not!) it then reuses the same memory location (most likely if properly optimized) to store the next "new" struct and the one after that and so on and so forth.
Instead, within the limited scope create MyStruct *myObject = new MyStruct and assign its values, then push the pointer to the vector.
Remember to delete all values from the vector before clearing it/destroying it!!
Or, of course, you could use std::string/CString/whatever instead of a char array and avoid the issue entirely by having a safe-to-copy struct.
ComputerGuru's answer works; however, there is another alternative. You can create a copy constructor and overload operator= for MyStruct. In these operations, you need to copy the actual string into the new struct. In C++, structs are nothing more than classes with default public access instead of default private access. Another alternative is to use std::string instead of char* for the string value. C++ strings already have this behavior.
struct MyStruct {
std::string sourceAddress;
int lastSeen;
int beaconReceived;
};
Seems odd to me: Maybe there is something wrong with the //snip part of the code?
Try to log the size of the vector before and after the push_back call (either in the debugger or using cout) and also have a look at the isNewEntry variable.
Your code looks alright to me. Is it possible that you are not passing the right vector in? What I mean is that the behaviour you describe would appear if somehow your Individuals vector is being reset to its original 1-entry state before you try to add the 3rd entry; it would then appear as if your 2nd entry was being overwritten.
Here is what I mean:
void test_table()
{
    string SampleAddresses[] = {"Sample Address 1", "Sample Address 2"};
    for (int i = 0; i < 2; i++)
    {
        // All this work to build the table *should* be done outside the loop; but we've accidentally put it inside
        // So the 2nd time around we will destroy all the work we did the 1st time
        vector<MyStruct> Individuals;
        MyStruct Sample;
        Sample.sourceAddress = "Sample Address 0";
        Individuals.push_back(Sample);
        // this is all we meant to have in the loop
        MyStruct NewEntry;
        NewEntry.sourceAddress = SampleAddresses[i].c_str();
        UpdateTable(&Individuals, NewEntry);
        // On the last pass the table has 2 entries - Sample Address 0 and Sample Address 2.
    }
}
If this was all your code then the problem would be obvious. But it might be concealed in some other pieces of code.