C++ std::string syntax "new (&y) std::string(x);" - c++

How to understand the following std::string init syntax?
#include <iostream>
#include <string>
int main ()
{
std::string y;
std::string x = "x str";
new (&y) std::string(x);
std::cout << y << std::endl;
return 0;
}
Output:
x str
Can we split the statement into 2 steps?
1. string* temp = new std::string(x);
2. (&y) = temp
So the original statement is just a shortcut for step 1 + 2.
Reference:
1. https://en.cppreference.com/w/cpp/string/basic_string/basic_string
2. https://en.cppreference.com/w/cpp/string/basic_string

This is called "placement new". The basic idea is that you supply an address, and it invokes the constructor for that type to create an object at that address.
In a typical case, you'd give it an address that's "raw" memory though, not the address of an existing object. In fact, I'm not sure the code above really has defined behavior (though I'd have to look at the standard carefully to be sure whether it does or not).
It's most often used for things like collection classes, which allocate raw memory; then, when you do something like insert or push_back, they construct an object in that memory. The standard collection classes go through an allocator object to handle the construction, but in the end it will (at least usually) be placement new doing the real work.
The arguments in the parentheses after the type are passed through to the constructor for the object being created; the address in the first pair of parentheses only says where to build it. So in your case, it'll copy x into a new string at y's address (i.e., it'll replace y with a copy of x).

Related

Can you call the destructor without calling the constructor?

I've been trying not to initialize memory when I don't need to, and am using malloc arrays to do so:
This is what I've run:
#include <cstdlib>
#include <iostream>
struct test
{
int num = 3;
test() { std::cout << "Init\n"; }
~test() { std::cout << "Destroyed: " << num << "\n"; }
};
int main()
{
test* array = (test*)malloc(3 * sizeof(test));
for (int i = 0; i < 3; i += 1)
{
std::cout << array[i].num << "\n";
array[i].num = i;
//new(array + i) i; placement new is not being used
std::cout << array[i].num << "\n";
}
for (int i = 0; i < 3; i += 1)
{
(array + i)->~test();
}
free(array);
return 0;
}
Which outputs:
0 ->- 0
0 ->- 1
0 ->- 2
Destroyed: 0
Destroyed: 1
Destroyed: 2
Despite not having constructed the array indices. Is this "healthy"? That is to say, can I simply treat the destructor as "just a function"?
(besides the fact that the destructor has implicit knowledge of where the data members are located relative to the pointer I specified)
Just to specify: I'm not looking for warnings on the proper usage of c++. I would simply like to know if there's things I should be wary of when using this no-constructor method.
(footnote: the reason I don't wanna use constructors is because many times, memory simply does not need to be initialized and doing so is slow)
No, this is undefined behaviour. An object's lifetime starts after the call to a constructor is completed, hence if a constructor is never called, the object technically never exists.
This likely "seems" to behave correctly in your example because your struct only holds an int, and destroying an int is a no-op.
Also keep in mind that a destructor only destroys the given object; the memory originally allocated via malloc still needs to be released with free.
Edit: You might want to look at this question as well, as this is an extremely similar situation, simply using stack allocation instead of malloc. This gives some of the actual quotes from the standard around object lifetime and construction.
I'll add this as well: in the case where you don't use placement new and it clearly is required (e.g. struct contains some container class or a vtable, etc.) you are going to run into real trouble. In this case, omitting the placement-new call is almost certainly going to gain you 0 performance benefit for very fragile code - either way, it's just not a good idea.
Yes, the destructor is nothing more than a function. You can call it at any time. However, calling it without a matching constructor is a bad idea.
So the rule is: If you did not initialize memory as a specific type, you may not interpret and use that memory as an object of that type; otherwise it is undefined behavior. (with char and unsigned char as exceptions).
Let us do a line by line analysis of your code.
test* array = (test*)malloc(3 * sizeof(test));
This line initializes the pointer array with a memory address provided by the system. Note that the memory is not initialized as any particular type. This means you should not treat this memory as any object (not even as a scalar like int, let alone your test class type).
Later, you wrote:
std::cout << array[i].num << "\n";
This uses the memory as test type, which violates the rule stated above, leading to undefined behavior.
And later:
(array + i)->~test();
You used the memory as a test object again! Calling the destructor also uses the object, so this is UB as well.
In your case you are lucky that nothing harmful happens and you get something reasonable. However, what UB does depends entirely on your compiler's implementation. It could even decide to format your disk, and that would still be standard-conforming.
That is to say, can I simply treat the destructor as "just a function"?
No. While it is like other functions in many ways, there are some special features of the destructor. These boil down to a pattern similar to manual memory management. Just as memory allocation and deallocation need to come in pairs, so do construction and destruction. If you skip one, skip the other. If you call one, call the other. If you insist upon manual memory management, the tools for construction and destruction are placement new and explicitly calling the destructor. (Code that uses new and delete combine allocation and construction into one step, while destruction and deallocation are combined into the other.)
Do not skip the constructor for an object that will be used. This is undefined behavior. Furthermore, the less trivial the constructor, the more likely that something will go wildly wrong if you skip it. That is, as you save more, you break more. Skipping the constructor for a used object is not a way to be more efficient — it is a way to write broken code. Inefficient, correct code trumps efficient code that does not work.
One bit of discouragement: this sort of low-level management can become a big investment of time. Only go this route if there is a realistic chance of a performance payback. Do not complicate your code with optimizations simply for the sake of optimizing. Also consider simpler alternatives that might get similar results with less code overhead. Perhaps a constructor that performs no initializations other than somehow flagging the object as not initialized? (Details and feasibility depend on the class involved, hence extend outside the scope of this question.)
One bit of encouragement: If you think about the standard library, you should realize that your goal is achievable. I would present vector::reserve as an example of something that can allocate memory without initializing it.
You currently have UB because you access a field of an object that does not exist.
You can leave the field uninitialized by writing a constructor that does nothing with it; the compiler can then easily skip the initialization. For example:
struct test
{
int num; // no = 3
test() { std::cout << "Init\n"; } // num not initialized
~test() { std::cout << "Destroyed: " << num << "\n"; }
};
For readability, you should probably wrap it in dedicated class, something like:
struct uninitialized_tag {};
struct uninitializable_int
{
uninitializable_int(uninitialized_tag) {} // No initialization
uninitializable_int(int num) : num(num) {}
int num;
};

Why the parameter doesn't change in the class constructor?

Here is my code. I don't get why it doesn't print 3, even though in the class constructor param1 becomes 3.
#include <iostream>
using namespace std;
class A{
int valoare;
public:
A(int param1 = 3):valoare(param1){}
int getValoare(){return this -> valoare;}
};
int main()
{
A vector[] = {*(new A(3)), *(new A(4)), *(new A(5)), *(new A(6))};
cout << vector[2].getValoare();
return 0;
}
You might want to read about default arguments: https://en.cppreference.com/w/cpp/language/default_arguments
When you pass an argument for a parameter that has a default argument, it overrides that default value. Thus, your code will print 5.
As a side note, your code has a memory leak because you allocated memory with the new keyword and never deleted it. You should change the declaration of your array, that is, allocate the objects on the stack, as follows:
A vector[] = {A(3), A(4), A(5), A(6)};
The element at index 2 in the array was constructed as A(5), so its value ("valoare") is 5. The = 3 in the function definition is a default argument, which is used only if you don't specify one yourself. So if you were to write:
std::cout << A().getValoare();
that would print 3.
But a few more observations are in order:
Prefer English-language names. valoare means "value" in some Latin or European language, right? Romanian perhaps? But - people who don't speak that language won't know that. Since you have to know English to program anyways, that's a safe choice for names.
Try not to use names for variables which are also names of classes in a different namespace. For example, your vector has the same name as std::vector, a class, or rather a class template, in the standard library. Try vec or my_vector or something else that's more distinctive.
You're leaking memory! Why are you using new to create values? Just use the construct, i.e.
A vector[] = { A(3), A(4), A(5), A(6) };
is just fine.
More generally, you should avoid calling new and delete explicitly, and instead prefer RAII-style classes, which allocate on construction and deallocate on destruction. The simplest thing to do is to switch to using smart pointers.
Why the parameter doesn't change in the class constructor?
It doesn't print the value 3 because you're giving it another value.
from cppreference:
Default arguments are used in place of the missing trailing arguments in a function call:
void point(int x = 3, int y = 4);
point(1,2); // calls point(1,2)
point(1); // calls point(1,4)
point(); // calls point(3,4)

when we return a c++ container like vector,list. What happens?

vector<int> function(...)
{
.......
.......
vector<int> C = some value;
return C;
}
int main()
{
X = function(...);
}
What would the value of X be? Will it be the address of C, as when we return an array, or will the returned value be the complete vector C copied into X? In what cases should you dynamically allocate a container? Would there be any difference in the final X if I pass the vector by reference or by value?
Normally, whatever you return from a function is copied, and if its type has a copy constructor, that constructor is invoked. Arrays are a special case: an array cannot be returned by value, so what you return is just a pointer to some block of memory, and only the pointer itself is copied. Vectors, on the other hand, are quite complicated classes, and their copy constructor copies the entire content. This does not scale well.
But here's the thing: you almost never allocate vectors dynamically. In many cases (for example in your pseudo-code) the content is not copied due to (Named) Return Value Optimization or other copy elision optimization. It is literally the same vector.
If in some case copy elision does not fire (or you are not sure), it is still better to pass the vector by ref to the function rather than dynamically allocate it on the heap. Allocating memory in a function and then returning it to the caller is an anti-pattern (even though sometimes necessary). This causes a big problem: who is responsible for freeing the memory? You need to know the source code of the function (or at least the docs) to know that.
Another option (when copy elision does not apply) is to rely on move semantics: a local vector returned by name is moved rather than copied, and std::move avoids copies in other places where you hand a vector over.
On the other hand passing vector by value will create a copy of that vector. Most certainly you want to pass it by ref or const ref.
Also I encourage you to check all those things yourself. Try printing raw pointers &C and &X to see if it is the same object.
From a purely semantic standpoint, regardless of its type name, C is a plain local variable (that most likely contains three pointers: to the beginning, the end and the capacity limit of the dynamic buffer containing the data).
That variable is moved into the temporary object produced by the return statement, which is in turn moved into the X variable to replace its content.
RVO can skip the first move by making the C variable itself live inside the stack frame of the calling function (main, in your case).
Since you did not declare any type for X, I must assume it is an already existing std::vector<int>, so the = is actually an assignment.
Since std::vector implements move semantics:
The actual inner pointers in X and C are swapped (so that X holds C's content and C holds the old X content)
C is destroyed
C's destructor will destroy the retained "old X content"
Standard library containers are themselves dynamic content managers. Allocating them dynamically and passing them around as pointers is the biggest nonsense a C++ programmer can commit, at least from 2011 onward.
If "function" does not access X in any way, further optimization can even remove the assignment altogether by making C an alias of X, so that the content of X is replaced by that of C at the time of its construction. The return and the = are simply removed.

std::string loses value when passed in function inside a class object

I am really confused how compiler allocates STL objects. Consider the following code:
#include <cstdio>
#include <string>
using namespace std ;
class s {
public:
string k ;
s(string k) : k(k) {}
} ;
void x ( s obj ) {
string k = (obj.k) ;
k += "haha" ;
}
int main () {
std::string mystr ("laughter is..") ;
s mys(mystr) ;
x(mys) ;
printf ("%s", mystr.c_str() ) ;
}
The output of this program is laughter is.. and I expect the output to be:
laughter is haha
Why doesn't the string mystr get haha appended? I need to store it in a class as part of my code.
If I had passed mystr by value to function x, the string mystr would have got haha into it.
a) How and when do STL objects get allocated? I supposed mystr is on a stack and must be accessible to all functions called from main() .
b) What if I need to store STL objects in an old-fashioned linked list which needs "void*"? Can't I just do:
std::string mystr ("mystring.." );
MyList.Add((void*)&mystr) ;
fun(MyList) ;
Can the function fun, now use and modify mystr by accessing MyList ?
c) As an alternative to (b), can I pass by reference? The issue is whether I can declare a class that keeps a reference to mystr. I mean, the constructor of MyList could look like this:
class MyList {
string& mStr ;
...
};
MyList::MyList ( string& mystr ) {
mStr = mystr ;
}
Is that constructor valid ? Is that class valid?
Your class is just complicating the situation for you. You have exactly the same problem here:
void x ( string str ) {
str += "haha" ;
}
int main () {
std::string mystr ("laughter is..") ;
x(mystr) ;
printf ("%s", mystr.c_str() ) ;
}
I've gotten rid of the class. Instead of putting mystr into an s object and passing the s object to x, I just pass mystr directly. x then attempts to add "haha" to the string.
The problem is that x takes its argument by value. If you pass an object by value, you are going to get a copy of it. That is, the str object is a different object to mystr. It's a copy of it, but it's a different object. If you modify str, you're not going to affect mystr at all.
If you wanted x to be able to modify its argument, you'd need to make it take a reference:
void x ( string& str ) {
str += "haha" ;
}
However, I understand why you introduced the class. You're thinking "Well if I give the string to another object and then pass that object along, the string should be the same both outside and inside the function." That's not the case because your class is storing a copy of the string. That is, your class has a member string k; which will be part of any object of that class type. The string k isn't the same object as mystr.
If you want to modify objects between functions, then you need some form of reference semantics. That means using pointers or references.
As for your questions:
Yes, the string object mystr is on the stack. That has nothing to do with it coming from the standard library though. If you write a declaration inside a function, that object is going to be on the stack, whether it's int x;, string s;, SomeClass c;, or whatever.
The internal storage of data inside mystr is, on the other hand, dynamically allocated. It has to be because the size of a std::string can vary, but objects in C++ always have fixed size. Some dynamic allocation is necessary. However, you shouldn't need to care about this. This allocation is encapsulated by the class. You can just treat mystr as a string.
Please don't use a linked list that stores void*s. Use std::list instead. If you want a linked list of strings, you want std::list<std::string>. But yes, if you have an object that stores pointers to some other objects and you pass that object around by value, the pointers in the copies will still be pointing at the same locations, so you can still modify the objects that they point to.
If you have a std::list<std::string> and you want to pass it to a function so that the function can modify the contents of the container, then you need to pass it by reference. If you also need the elements of the list to be references to the objects you created outside the list, you need to use a std::list<std::reference_wrapper<std::string>> instead.
As far as initialising a reference member is concerned, you need to use a member initialisation list:
MyList::MyList(string& mystr)
: mStr(mystr)
{ }
The string k that you manipulate in your function x is a copy of the string k in your object obj. And obj itself is already a copy of what you pass, and the string you pass and store in obj is also already a copy. So it's very, very far from the original mystr that you expect to be altered.
To your other questions:
a) Yes, objects in this way are stack allocated. Not just stl, any objects. Otherwise you need to use new.
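b) Only while mystr is still alive. Since it is stack allocated, its memory becomes invalid as soon as its function returns, and any pointer stored in the list then dangles. If the list has to outlive that function, you need to heap allocate the string using new.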
c) Yes you can pass by reference, but again, it's important where you allocate things.
As others point out, those are some very basic questions and you need to read about heap vs stack allocation and pass by reference vs pass by value first, and then have a look at some basic STL classes and containers.
Strictly speaking, your question has nothing to do with the STL, even if you accept "STL" as a synonym for the correct "containers, iterators and algorithms of the C++ standard library". std::string was at one point of its history made to appear like a container (a container of characters, that is), but it is generally used in quite a different fashion than "real" container classes like std::vector or std::set.
Anyway,
Why doesnt mystr string get "haha"
Because you don't use references. x modifies a copy of the argument; likewise, string k = (obj.k) creates a copy of the string. Here is the code with references:
void x ( s &obj ) {
string &k = (obj.k) ;
k += "haha" ;
}
a) How and when do STL objects get allocated?
The container object itself is allocated as you define it. How it allocates memory internally is defined by its allocator template parameter, by default std::allocator. You don't really want to know the internals of std::allocator - it almost always does the right thing. And I don't think your question is about internal allocations, anyway.
I supposed mystr is on a stack and must be accessible to all functions called from main()
Yes.
b) What if I need to store STL objects in a old fashioned Linked list
which needs "void*".
Use std::list<void*>.
But you don't have to do this. Use std::list<std::string> and you likely won't need pointers in your code at all.
As for your further code examples:
std::string mystr ("mystring.." );
MyList.Add((void*)&mystr) ;
fun(MyList) ;
Can the function fun, now use and modify mystr by accessing MyList ?
Yes. However, the code has two problems. The smaller one is (void*)&mystr. Generally, you should avoid C-style casts but use one of static_cast, reinterpret_cast, const_cast or dynamic_cast, depending on which conversion you need. And in this piece of code, you don't need a cast at all, anyway.
The bigger problem is adding the address of a local variable to something which looks like it expects dynamically allocated objects. If you return MyList from a function, mystr will be destroyed and the copied list will contain a pointer to a dead object, eventually leading to undefined results.
In order to solve this, you have to learn more about new, delete and, possibly, smart pointers. This is beyond the scope of a simple answer, and the outcome would probably still be worse than std::list<std::string>.
The issue is can I declare a class to keep a reference of mystr?
Yes, but you should generally avoid it, because it easily leads to dangling references, i.e. references to dead objects, for the reasons explained above.
class MyList {
string& mStr ;
...
};
MyList::MyList ( string& mystr ) {
mStr = mystr ;
}
Is that constructor valid ?
No, it won't compile. You'd need to use an initialisation list:
MyList::MyList ( string& mystr ) : mStr(mystr) {}
I can only repeat my recommendation from above. Use std::list<std::string>.

Qt variable re-assignment

I have two examples I have a question about. Let me explain via some code:
Question 1:
QStringList qsl; // Create a list and store something in it
qsl << "foo";
QString test = "this is a test";
qsl = test.split(" ", QString::SkipEmptyParts); // Memory Leak?
When I re-assign the qsl variable, what happens to "foo" and the original data allocated on the first line?
Question 2:
class Foo
{
public:
void MyFunc(QStringList& mylist)
{
this->m_mylist = mylist; // copies the list into the member
}
void AddString(QString str)
{
m_mylist << str;
}
private:
QStringList m_mylist;
};
int main()
{
Foo f;
QStringList *qsl = new QStringList();
f.MyFunc(*qsl);
delete qsl;
f.AddString("this is a test"); // Segfault?
}
Here I'm passing a list by reference to a class which is then stored in said class. I then delete the original object.
It basically all comes down to what happens when you assign a QObject to a QObject. I assume a copy of the object is made, even if the object was passed in via reference (not via pointer of course, that would just be a pointer copy).
I also assume that something like QStringList performs a deepcopy...is this correct?
Assigning to a QStringList variable works the same as assigning to any other variable in C++. For objects, the assignment operator of the object on the left is called to copy the content of the object on the right into the object on the left. Usually this does just a memberwise assignment:
struct A {
int x;
QString y;
A& operator=(const A &other) {
// do the assignment:
x = other.x;
y = other.y;
return *this;
}
};
The object on the left of the assignment "adapts itself" to contain the same things as the object on the right. There is no new object allocated, just the existing one is modified.
If the class is more complicated and, for example, contains pointers to dynamically allocated data (as is probably the case for QStringList), the assignment operator might be more complicated to implement. But this is an implementation detail of the QStringList class and you should not have to worry about that. The QStringList object on the left of the assignment will be modified to be equal to the object on the right.
In Question 2 you assign an object to a member variable, which causes the object in the member variable to be modified so that it contains the same things as the object that is assigned to it. That this other object later is deleted doesn't matter to the member variable.
Semantically this is the same as when assigning simple integers:
int i, j;
i = j;
The memory where i is stored is modified, so that it contains the same value as j. What happens to j later on doesn't matter to the value of i.
What happens when I re-assign the qsl variable what happens to "foo" and the original data allocated on the first line?
You can't redeclare qsl as something else within the same scope.
Once it goes out of scope, the memory will be reclaimed in its destructor.
You can put different data into qsl, in which case it will replace "foo"; more memory might be allocated if necessary.
Edit: for example, you can't have "QStringList qsl;" and then "int qsl;" in the same code block.
You can replace the strings in qsl with a different list, and the container will handle the memory for you.
I also assume that something like QStringList performs a deepcopy
Yes. Actually, it's a little more complicated: to save time and memory, Qt only does the copy when it needs to, i.e. when one of the copies changes. If you copy "a string" to lots of different string lists, Qt will just keep one copy and share it around; when one of them changes, it will allocate a new copy for the changed one. This is called "copy on write", but it happens automatically and you don't need to care.