I am really confused about how the compiler allocates STL objects. Consider the following code:
#include <cstdio>
#include <string>
using namespace std ;
class s {
public:
string k ;
s(string k) : k(k) {}
} ;
void x ( s obj ) {
string k = (obj.k) ;
k += "haha" ;
}
int main () {
std::string mystr ("laughter is..") ;
s mys(mystr) ;
x(mys) ;
printf ("%s", mystr.c_str() ) ;
}
The output of this program is laughter is.. and I expect the output to be:
laughter is haha
Why doesn't the string mystr get haha? I need to store it in a class as a part of my code.
If I had passed mystr by value to function x, the string mystr would have got haha into it.
a) How and when do STL objects get allocated? I supposed mystr is on a stack and must be accessible to all functions called from main() .
b) What if I need to store STL objects in an old-fashioned linked list which needs "void*"? Can't I just do:
std::string mystr ("mystring.." );
MyList.Add((void*)&mystr) ;
fun(MyList) ;
Can the function fun, now use and modify mystr by accessing MyList ?
c) As an alternative to (b), can I use pass by reference? The issue is whether I can declare a class that keeps a reference to mystr. I mean the constructor of MyList can be like this:
class MyList {
string& mStr ;
...
};
MyList::MyList ( string& mystr ) {
mStr = mystr ;
}
Is that constructor valid ? Is that class valid?
Your class is just complicating the situation for you. You have exactly the same problem here:
void x ( string str ) {
str += "haha" ;
}
int main () {
std::string mystr ("laughter is..") ;
x(mystr) ;
printf ("%s", mystr.c_str() ) ;
}
I've gotten rid of the class. Instead of putting mystr into an s object and passing the s object to x, I just pass mystr directly. x then attempts to add "haha" to the string.
The problem is that x takes its argument by value. If you pass an object by value, you are going to get a copy of it. That is, the str object is a different object to mystr. It's a copy of it, but it's a different object. If you modify str, you're not going to affect mystr at all.
If you wanted x to be able to modify its argument, you'd need to make it take a reference:
void x ( string& str ) {
str += "haha" ;
}
However, I understand why you introduced the class. You're thinking "Well if I give the string to another object and then pass that object along, the string should be the same both outside and inside the function." That's not the case because your class is storing a copy of the string. That is, your class has a member string k; which will be part of any object of that class type. The string k isn't the same object as mystr.
If you want to modify objects between functions, then you need some form of reference semantics. That means using pointers or references.
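To make the difference concrete, here is a minimal sketch. The names append_by_value, append_by_reference, Holder and demo_value_vs_reference are made up for illustration and are not part of the original code.

```cpp
#include <cassert>
#include <string>

// Takes a copy: the change stays local to the function.
void append_by_value(std::string str) { str += "haha"; }

// Takes a reference: the change is visible to the caller.
void append_by_reference(std::string& str) { str += "haha"; }

// Hypothetical class that stores a reference instead of a copy.
struct Holder {
    std::string& k;
    explicit Holder(std::string& s) : k(s) {}
};

// The holder itself is passed by value, but the reference inside it
// still refers to the caller's string.
void append_through_holder(Holder h) { h.k += "haha"; }

std::string demo_value_vs_reference() {
    std::string mystr("laughter is..");

    append_by_value(mystr);
    assert(mystr == "laughter is..");          // unchanged

    append_by_reference(mystr);
    assert(mystr == "laughter is..haha");      // modified

    append_through_holder(Holder(mystr));
    return mystr;                              // modified again
}
```

The usual caveat applies to Holder: it must not outlive the string it refers to.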
As for your questions:
Yes, the string object mystr is on the stack. That has nothing to do with it coming from the standard library though. If you write a declaration inside a function, that object is going to be on the stack, whether it's int x;, string s;, SomeClass c;, or whatever.
The internal storage of data inside mystr is, on the other hand, dynamically allocated. It has to be because the size of a std::string can vary, but objects in C++ always have fixed size. Some dynamic allocation is necessary. However, you shouldn't need to care about this. This allocation is encapsulated by the class. You can just treat mystr as a string.
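A small sketch of the "fixed size object, dynamic contents" point. The function name object_size_is_fixed is hypothetical, and the actual value of sizeof(std::string) depends on the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true if growing the string left the size of the object unchanged.
bool object_size_is_fixed() {
    std::string s;                          // the object itself is a local variable
    const std::size_t before = sizeof(s);   // fixed at compile time

    s.assign(10000, 'x');                   // the character data lives elsewhere
    assert(s.size() == 10000);

    const std::size_t after = sizeof(s);
    return before == after && after == sizeof(std::string);
}
```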
Please don't use a linked list that stores void*s. Use std::list instead. If you want a linked list of strings, you want std::list<std::string>. But yes, if you have an object that stores pointers to some other objects and you pass that object around by value, the pointers in the copies will still be pointing at the same locations, so you can still modify the objects that they point to.
If you have a std::list<std::string> and you want to pass it to a function so that the function can modify the contents of the container, then you need to pass it by reference. If you also need the elements of the list to be references to the objects you created outside the list, you need to use a std::list<std::reference_wrapper<std::string>> instead.
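As a rough sketch of the reference_wrapper variant (add_suffix and demo_reference_list are hypothetical names used only for this example):

```cpp
#include <cassert>
#include <functional>
#include <list>
#include <string>

// The list is taken by reference, so no copy of the container is made.
void add_suffix(std::list<std::reference_wrapper<std::string>>& items) {
    for (std::string& s : items) {   // reference_wrapper converts to std::string&
        s += "haha";
    }
}

std::string demo_reference_list() {
    std::string mystr("laughter is..");

    std::list<std::reference_wrapper<std::string>> items;
    items.push_back(std::ref(mystr));   // stores a reference, not a copy
    assert(items.size() == 1);

    add_suffix(items);
    return mystr;                       // the original string was modified
}
```

The referenced strings have to stay alive for as long as the list is used.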
As far as initialising a reference member is concerned, you need to use a member initialisation list:
MyList::MyList(string& mystr)
: mStr(mystr)
{ }
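Put together, a stripped-down version of such a class could look like the sketch below. The append member and demo_reference_member are assumptions added for the example; the elided members of the original MyList are left out.

```cpp
#include <cassert>
#include <string>

class MyList {
    std::string& mStr;   // a reference must be bound when the object is constructed
public:
    explicit MyList(std::string& mystr) : mStr(mystr) {}

    // Hypothetical helper: modifies the string the list refers to.
    void append(const std::string& suffix) { mStr += suffix; }
};

std::string demo_reference_member() {
    std::string mystr("mystring..");
    MyList list(mystr);          // refers to mystr, does not copy it
    list.append("haha");
    assert(mystr == "mystring..haha");
    return mystr;
}
```

Note that a class with a reference member cannot be copy-assigned by default, because a reference cannot be rebound.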
The string k that you manipulate in your function x is a copy of the string k in your object obj. And obj itself is already a copy of what you pass, and the string you pass and store in obj is also already a copy. So it's very, very far from the original mystr that you expect to be altered.
To your other questions:
a) Yes, objects declared this way are stack allocated. Not just STL ones, any objects. Otherwise you need to use new.
b) This only works while mystr is still in scope. Since it's stack allocated, the memory becomes invalid once the function that declared it returns. If the list has to outlive that scope, you need to heap allocate the string using new.
c) Yes, you can pass by reference, but again, it's important where you allocate things.
As others point out, those are some very basic questions, and you need to read about heap vs stack allocation and pass by reference vs pass by value first, and then have a look at some basic STL classes and containers.
Strictly speaking, your question has nothing to do with the STL, even if you accept "STL" as a synonym for the correct "containers, iterators and algorithms of the C++ standard library". std::string was at one point of its history made to appear like a container (a container of characters, that is), but it is generally used in quite a different fashion than "real" container classes like std::vector or std::set.
Anyway,
Why doesn't mystr string get "haha"
Because you don't use references. x modifies a copy of the argument; likewise, string k = (obj.k) creates a copy of the string. Here is the code with references:
void x ( s &obj ) {
string &k = (obj.k) ;
k += "haha" ;
}
a) How and when do STL objects get allocated?
The container object itself is allocated as you define it. How it allocates memory internally is defined by its allocator template parameter, by default std::allocator. You don't really want to know the internals of std::allocator - it almost always does the right thing. And I don't think your question is about internal allocations, anyway.
I supposed mystr is on a stack and must be accessible to all functions called from main()
Yes.
b) What if I need to store STL objects in a old fashioned Linked list which needs "void*".
Use std::list<void*>.
But you don't have to do this. Use std::list<std::string> and you likely won't need pointers in your code at all.
As for your further code examples:
std::string mystr ("mystring.." );
MyList.Add((void*)&mystr) ;
fun(MyList) ;
Can the function fun, now use and modify mystr by accessing MyList ?
Yes. However, the code has two problems. The smaller one is (void*)&mystr. Generally, you should avoid C-style casts and instead use one of static_cast, reinterpret_cast, const_cast or dynamic_cast, depending on which conversion you need. And in this piece of code, you don't need a cast at all, anyway, because any object pointer converts to void* implicitly.
The bigger problem is adding the address of a local variable to something which looks like it expects dynamically allocated objects. If you return MyList from a function, mystr will be destroyed and the copied list will contain a pointer to a dead object, eventually leading to undefined results.
In order to solve this, you have to learn more about new, delete and, possibly, smart pointers. This is beyond the scope of a simple answer, and the outcome would probably still be worse than std::list<std::string>.
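For comparison, here is a minimal sketch of the std::list<std::string> approach. make_list and demo_owning_list are hypothetical names, and fun is only a stand-in for the function in the question.

```cpp
#include <cassert>
#include <list>
#include <string>

// The list owns its strings, so returning it by value is safe:
// nothing in it points at a local variable.
std::list<std::string> make_list() {
    std::string mystr("mystring..");
    std::list<std::string> result;
    result.push_back(mystr);   // stores a copy of mystr
    return result;             // mystr is destroyed here; the copy lives on
}

// Stand-in for fun(): takes the list by reference and modifies its elements.
void fun(std::list<std::string>& items) {
    for (std::string& s : items) {
        s += "haha";
    }
}

std::string demo_owning_list() {
    std::list<std::string> items = make_list();
    fun(items);
    assert(items.size() == 1);
    return items.front();
}
```

No new, delete or casts are needed, and there is no pointer that could dangle.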
The issue is can I declare a class to keep a reference of mystr?
Yes, but you should generally avoid it, because it easily leads to dangling references, i.e. references to dead objects, for the reasons explained above.
class MyList {
string& mStr ;
...
};
MyList::MyList ( string& mystr ) {
mStr = mystr ;
}
Is that constructor valid ?
No, it won't compile. You'd need to use an initialisation list:
MyList::MyList ( string& mystr ) : mStr(mystr) {}
I can only repeat my recommendation from above. Use std::list<std::string>.
Say I have something like this
extern "C" void make_foo (char** tgt) {
*tgt = (char*) malloc(4*sizeof(char));
strncpy(*tgt, "foo", 4);
}
int main() {
char* foo;
make_foo(&foo);
std::string foos{{foo}};
free(foo);
...
return 0;
}
Now, I would like to avoid using and then deleting the foo buffer. I.e., I'd like to change the initialisation of foos to something like
std::string foos{{std::move(foo)}};
and use no explicit free.
Turns out this actually compiles and seems to work, but I have a rather suspicious feel about it: does it actually move the C-defined string and properly free the storage? Or does it just ignore the std::move and leak the storage once the foo pointer goes out of scope?
It's not that I worry too much about the extra copy, but I do wonder if it's possible to write this in modern move-semantics style.
std::string constructor #5:
Constructs the string with the contents initialized with a copy of
the null-terminated character string pointed to by s. The length of
the string is determined by the first null character. The behavior is
undefined if s does not point at an array of at least
Traits::length(s)+1 elements of CharT, including the case when s is a
null pointer.
Your C-string is copied (the std::move doesn't matter here) and thus it is up to you to call free on foo.
A std::string will never take ownership.
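If the goal is only to get rid of the explicit free, one option is to hand the buffer to a std::unique_ptr with a deleter that calls free. This is a sketch of an alternative, not something std::string offers: the characters are still copied into the string. FreeDeleter and demo_adopt_buffer are hypothetical names, and make_foo is reproduced without extern "C" to keep the example short.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Same C-style producer as in the question.
void make_foo(char** tgt) {
    *tgt = static_cast<char*>(std::malloc(4 * sizeof(char)));
    if (*tgt) std::strncpy(*tgt, "foo", 4);
}

// Deleter that releases memory obtained from malloc.
struct FreeDeleter {
    void operator()(char* p) const { std::free(p); }
};

std::string demo_adopt_buffer() {
    char* raw = nullptr;
    make_foo(&raw);

    // The unique_ptr takes ownership of the malloc'ed buffer.
    std::unique_ptr<char, FreeDeleter> owner(raw);
    if (!owner) return std::string();   // allocation failed

    std::string foos(owner.get());      // still a copy of the characters
    assert(foos.size() == 3);
    return foos;
}   // owner goes out of scope here and calls free on the buffer
```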
tl;dr: Not really.
Pointers don't have any special move semantics. x = std::move(my_char_ptr) is the same as x = my_char_ptr. In that regard they are not similar to, say, a std::vector, where moving takes away the allocated space.
However, in your case, if you want to keep existing heap buffers and treat them as strings, it can't be done with std::string, as a std::string can't be constructed as a wrapper around an existing buffer (and there's small-string optimization etc.). Instead, consider implementing a custom container, e.g. with some string data buffer (std::vector<char>) and a std::vector<std::string_view> whose elements point into that buffer.
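A very rough sketch of that buffer-plus-views idea, assuming C++17 for std::string_view. StringPool and demo_string_pool are hypothetical names. The buffer is filled completely before any view is created, because growing the vector afterwards would invalidate the views.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical container: one contiguous character buffer plus
// views that point into it.
struct StringPool {
    std::vector<char> buffer;
    std::vector<std::string_view> views;

    explicit StringPool(const std::vector<std::string>& words) {
        // First pass: copy all characters into the buffer.
        for (const std::string& w : words) {
            buffer.insert(buffer.end(), w.begin(), w.end());
        }
        // Second pass: the buffer no longer changes, so views are stable.
        std::size_t offset = 0;
        for (const std::string& w : words) {
            views.emplace_back(buffer.data() + offset, w.size());
            offset += w.size();
        }
    }

    // A default copy would leave the copied views pointing at the old buffer.
    StringPool(const StringPool&) = delete;
    StringPool& operator=(const StringPool&) = delete;
};

std::string demo_string_pool() {
    const std::vector<std::string> words = {"foo", "barbaz"};
    StringPool pool(words);
    assert(pool.views.size() == 2);
    assert(pool.views[0] == "foo");
    return std::string(pool.views[1]);
}
```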
Say I have a variable
std::string str; // initialized with some value
And a struct defined as:
struct test
{
public:
const char* name;
};
I know this can be done :
test t1;
t1.name = str.c_str();
But this will store the address of the variable str in t1.name
Instead I want the values of str to put in a char array member of the structure which should be of exact same size as variable str.
Is there a way that can be achieved or a better design
Thanks in advance!
But this will store the address of the variable str in t1.name
Not exactly. str.c_str() does not return the address of variable str. It returns the address of the character array owned by str.
Instead I want the values of str to put in a char array member of the structure
To do that, the structure must have a char array member. Your structure does not; it has a pointer member.
char array member of the structure which should be of exact same size as variable str.
This is not possible. The size of the string is dynamic i.e. it may change at run time. The size of a member array must be known at compile time.
You can instead allocate an array dynamically. As the name implies, the size of dynamic allocation may be determined at run time. However, dynamic allocations must be manually deallocated, or else your program will leak memory.
or a better design
A popular design pattern for dynamic allocation is RAII. The standard library already has a RAII container for character strings: std::string. So, to copy a string into a member of a struct, a good design is to have a string as the member:
struct test {
std::string name;
};
test t1;
t1.name = str;
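A short sketch showing that the member really is an independent copy, and that a const char* is still available when a C API needs one (demo_string_member is a hypothetical name):

```cpp
#include <cassert>
#include <cstring>
#include <string>

struct test {
    std::string name;   // owns its own copy of the characters
};

std::string demo_string_member() {
    std::string str("original");

    test t1;
    t1.name = str;      // copies the characters; the size is decided at run time
    str = "changed";    // does not affect t1.name

    assert(t1.name.size() == 8);
    // c_str() gives a null-terminated const char* view of the member.
    assert(std::strcmp(t1.name.c_str(), "original") == 0);
    return t1.name;
}
```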
There is no reason to use const char *. Since it's more error-prone and harder to implement, you should use std::string instead.
std::string also allows you to get const char * to string using c_str() method.
But if you have to implement name as C-style string, here is what you have to do:
Allocate enough space on heap using new.
Cast to non-const
Copy strings using strcpy
Free memory in destructor
Constructor:
test(const std::string& str) : name(new char[str.length() + 1])
{
strcpy(const_cast<char*>(name), str.c_str());
}
Also, as @Pixelchemist correctly noted, there are the important rules of zero/three/five. If your class contains resources which aren't copied/destructed correctly by themselves, as raw pointers aren't (if you used smart pointers, it would be a different story), you have to implement these as well:
copy constructor
copy assignment operator
destructor
move constructor
move assignment operator
For more information, read this excellent answer about the rule of three.
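Here is one way the five special members could be written for such a struct. It is only a sketch under a few assumptions: name is declared as char* rather than const char* to avoid the cast, and the helper clone and the function demo_rule_of_five are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <utility>

struct test {
    char* name;

    // Hypothetical helper: duplicates a C string (null stays null).
    static char* clone(const char* s) {
        if (!s) return nullptr;
        char* p = new char[std::strlen(s) + 1];
        std::strcpy(p, s);
        return p;
    }

    explicit test(const std::string& str) : name(clone(str.c_str())) {}
    ~test() { delete[] name; }                                   // destructor

    test(const test& other) : name(clone(other.name)) {}         // copy constructor
    test& operator=(const test& other) {                         // copy assignment
        if (this != &other) {
            char* copy = clone(other.name);   // allocate first, then release
            delete[] name;
            name = copy;
        }
        return *this;
    }

    test(test&& other) noexcept : name(other.name) {             // move constructor
        other.name = nullptr;
    }
    test& operator=(test&& other) noexcept {                     // move assignment
        if (this != &other) {
            delete[] name;
            name = other.name;
            other.name = nullptr;
        }
        return *this;
    }
};

std::string demo_rule_of_five() {
    test a(std::string("alpha"));
    test b(a);                    // copy constructor
    test c(std::string("gamma"));
    c = b;                        // copy assignment
    test d(std::move(a));         // move constructor: a.name is now null
    assert(a.name == nullptr);
    b = std::move(d);             // move assignment: d.name is now null
    assert(d.name == nullptr);
    return std::string(b.name) + "," + c.name;
}
```

With a std::string member none of this code is needed, which is the point of the rule of zero.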
I'm fairly novice with C++'s strings so the following pattern may be a little fugly. I'm reviewing some code I've written before beginning integration testing with a larger system. What I'd like to know is if it is safe, or if it would be prone to leaking memory?
string somefunc( void ) {
string returnString;
returnString.assign( "A string" );
return returnString;
}
void anotherfunc( void ) {
string myString;
myString.assign( somefunc() );
// ...
return;
}
The understanding I have is that the value of returnString is assigned to a new object myString and then the returnString object is destroyed as part of resolving the call to somefunc. At some point in the future when myString goes out of scope, it too is destroyed.
I would have typically passed a pointer to myString into somefunc() and directly assigned to values to myString but I'm striving to be a little clearer in my code ( and relying on the side effect function style less ).
Yes, returning a string this way (by value) is safe, although I would prefer assigning it this way:
string myString = somefunc();
This is easier to read, and is also more efficient (saving the construction of an empty string, which would then be overwritten by the next call to assign).
std::string manages its own memory, and it has properly written copy constructor and assignment operator, so it is safe to use strings this way.
Yes by doing
return returnString
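You are initializing the temporary (an rvalue) that takes the place of "somefunc()" in the calling expression from returnString. Before C++11 this invokes the string's copy constructor, which performs a copy* of returnString; since C++11 the string is moved instead, or the operation is elided entirely: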
myString.assign( somefunc() /*somefunc()'s return becomes temporary*/);
This is in turn passed to assign and used by assign to perform a copy into myString.
So in your case, the copy constructor of string guarantees a deep copy and ensures no memory leaks.
* Note this may or may not be a true deep copy, the behavior of the copy constructor is implementation specific. Some string libraries implement copy-on-write which has some internal bookkeeping to prevent copying until actually needed.
You're completely safe because you're returning the string by value, where the string will be "copied", and not by reference. If you were to return a std::string &, then you'd be doing it wrong, as you'd have a dangling reference. Some compilers, even, might perform return value optimization, which won't even really copy the string upon return. See this post for more information.
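A minimal sketch of the by-value pattern discussed in these answers (demo_return_by_value is a hypothetical name):

```cpp
#include <cassert>
#include <string>

std::string somefunc() {
    std::string returnString("A string");
    return returnString;   // copied, moved or elided; never a dangling reference
}

std::string demo_return_by_value() {
    // Initialized directly from the result, no empty string is built first.
    std::string myString = somefunc();
    assert(myString == "A string");

    myString += "!";       // myString is an independent object
    return myString;
}
```

Changing the return type of somefunc to std::string& would return a reference to a destroyed local, which is the dangling reference mentioned above.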
Yes, it's (at least normally) safe. One of the most basic contributions of almost any reasonable string class is the ability to act like a basic value for which normal assignment, returns, etc., "just work".
As you said, a string returnString is created inside somefunc and a copy is given back when the function returns. This is perfectly safe.
What you want is to give a reference to myString to somefunc (don't use pointer). It will be perfectly clear:
void somefunc( string& myString ) {
myString.assign( "A string" );
}
void anotherfunc( void ) {
string myString;
somefunc(myString);
// ...
return;
}
Is there any way to know if an object is a const object or regular object, for instance consider the following class
class String
{
String(const char* str);
};
If the user creates a const object from String, then there is no reason to copy the passed native string, because he will not manipulate it; the only things he will do are get the string size, search the string and call other functions that will not change the string.
There is a very good reason for copying - you can't know that the lifetime of the const char * is the same as that of the String object. And no, there is no way of knowing that you are constructing a const object.
Unfortunately, C++ does not provide a way to do what you are attempting. Simply passing a const char * does not guarantee the lifetime of the memory being pointed to. Consider:
char * a = new char[10];
char const *b = a;
String c (b);
delete[] a;
// c is now broken
There is no way for you to know. You could write a class that tightly interacts with String and that creates a constant string pointing to an external buffer (by making the corresponding constructor private and making the interacting class a nested class or a friend of String).
If all you worry about is doing dynamic memory management on a potentially small constant string, you can implement the Small String Optimization (also Small Object/Buffer Optimization). It works by having an embedded buffer in your string class, and copying each string up to some predefined size into that buffer, and each string that's larger to a dynamically allocated storage (the same technique is used by boost::function for storing small sized function objects).
class String {
union {
char *dynamicptr;
char buffer[16];
};
bool isDynamic;
};
There are clever techniques for storing even the length of the embedded string into the buffer itself (storing its length as buffer[15] and similar trickeries).
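Filled in a little further, the idea could look like the sketch below. It is deliberately incomplete: copying is disabled rather than implemented, and the members c_str and uses_heap as well as demo_small_string are assumptions added for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class String {
    union {
        char* dynamicptr;   // used when the text does not fit in buffer
        char buffer[16];    // used for short strings, no allocation needed
    };
    bool isDynamic;
public:
    explicit String(const char* str) {
        const std::size_t len = std::strlen(str);
        isDynamic = (len + 1 > sizeof(buffer));
        if (isDynamic) {
            dynamicptr = new char[len + 1];
            std::memcpy(dynamicptr, str, len + 1);
        } else {
            std::memcpy(buffer, str, len + 1);
        }
    }
    ~String() {
        if (isDynamic) delete[] dynamicptr;
    }

    // Copying is left out of this sketch.
    String(const String&) = delete;
    String& operator=(const String&) = delete;

    const char* c_str() const { return isDynamic ? dynamicptr : buffer; }
    bool uses_heap() const { return isDynamic; }
};

bool demo_small_string() {
    String small("short");                      // fits into the embedded buffer
    String large("this string is too long");    // 23 characters, needs the heap
    assert(std::strcmp(small.c_str(), "short") == 0);
    assert(std::strcmp(large.c_str(), "this string is too long") == 0);
    return !small.uses_heap() && large.uses_heap();
}
```

Either way the characters are copied, so the String stays valid after the caller's buffer is gone.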
You could use const_string to do what you're looking for. However, even with const string you have to "tell" it that the string doesn't need to be copied.
const char* foo = "c-string";
boost::const_string bar(foo); // will copy foo
boost::const_string baz(boost::ref(foo)); // assumes foo will always be a valid pointer.
If the user creates a const object from String, then there is no reason to copy the passed native string, because he will not manipulate it; the only things he will do are get the string size, search the string and call other functions that will not change the string.
Oh yes there is. Just because it is passed as const doesn't mean that it actually is const outside of the constructor call, and it especially doesn't mean it won't be destroyed while the string object still exists. The keyword const for a function argument only means that the function won't modify or delete it (trying to implement a function that modifies a const argument will result in a compiler error), but there's no way for the function to know what happens outside.
What you're looking for is basically a COW (copy on write) string. Such things are entirely possible, but getting them to work well is somewhat non-trivial. In a multithreaded environment, getting good performance can go beyond non-trivial into the decidedly difficult range.
EDIT: I know in this case, if it were an actual class I would be better off not putting the string on the heap. However, this is just sample code to make sure I understand the theory. The actual code is going to be a red-black tree, with all the nodes stored on the heap.
I want to make sure I have these basic ideas correct before moving on (I am coming from a Java/Python background). I have been searching the net, but haven't found a concrete answer to this question yet.
When you reassign a pointer to a new object, do you have to call delete on the old object first to avoid a memory leak? My intuition is telling me yes, but I want a concrete answer before moving on.
For example, let say you had a class that stored a pointer to a string
class MyClass
{
private:
std::string *str;
public:
MyClass (const std::string &_str)
{
str=new std::string(_str);
}
void ChangeString(const std::string &_str)
{
// I am wondering if this is correct?
delete str;
str = new std::string(_str);
/*
* or could you simply do it like:
* str = _str;
*/
}
....
In the ChangeString method, which would be correct?
I think I am getting hung up on the fact that if you don't use the new keyword for the second way, it will still compile and run like you expected. Does this just overwrite the data that this pointer points to? Or does it do something else?
Any advice would be greatly appreciated :D
If you must deallocate the old instance and create another one, you should first make sure that creating the new object succeeds:
void reset(const std::string& str)
{
std::string* tmp = new std::string(str);
delete m_str;
m_str = tmp;
}
If you call delete first, and then creating a new one throws an exception, then the class instance will be left with a dangling pointer. E.g, your destructor might end up attempting to delete the pointer again (undefined behavior).
You could also avoid that by setting the pointer to NULL in-between, but the above way is still better: if resetting fails, the object will keep its original value.
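Embedded in a class, the allocate-first pattern might look like this sketch. The get accessor, the deleted copy operations and demo_reset are assumptions added to keep the example small and safe.

```cpp
#include <cassert>
#include <string>

class MyClass {
    std::string* m_str;
public:
    explicit MyClass(const std::string& str) : m_str(new std::string(str)) {}
    ~MyClass() { delete m_str; }

    // Copying is disabled here; a real class would implement it.
    MyClass(const MyClass&) = delete;
    MyClass& operator=(const MyClass&) = delete;

    void reset(const std::string& str) {
        std::string* tmp = new std::string(str);   // may throw; m_str is untouched
        delete m_str;                              // only reached on success
        m_str = tmp;
    }

    const std::string& get() const { return *m_str; }
};

std::string demo_reset() {
    MyClass obj("first");
    obj.reset("second");
    assert(obj.get() == "second");
    return obj.get();
}
```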
As to the question in the code comment.
*str = _str;
This would be the correct thing to do. It is normal string assignment.
str = &_str;
This would be assigning pointers and completely wrong. You would leak the string instance previously pointed to by str. Even worse, it is quite likely that the string passed to the function isn't allocated with new in the first place (you shouldn't be mixing pointers to dynamically allocated and automatic objects). Furthermore, you might be storing the address of a string object whose lifetime ends with the function call (if the const reference is bound to a temporary).
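The difference is easy to check: assigning through the pointer keeps the same heap object and only replaces its contents. A small sketch (demo_assign_through_pointer is a hypothetical name):

```cpp
#include <cassert>
#include <string>

std::string demo_assign_through_pointer() {
    std::string* str = new std::string("old");
    const std::string* before = str;
    const std::string other("new");

    *str = other;             // string assignment: copies the characters
    assert(str == before);    // still the same heap object, nothing leaked
    assert(*str == "new");

    std::string result = *str;
    delete str;               // exactly one delete for the one new
    return result;
}
```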
Why do you think you need to store a pointer to a string in your class? Pointers to C++ collections such as string are actually very rarely necessary. Your class should almost certainly look like:
class MyClass
{
private:
std::string str;
public:
MyClass (const std::string & astr) : str( astr )
{
}
void ChangeString(const std::string & astr)
{
str = astr;
}
....
};
Just pinpointing here, but
str = _str;
would not compile (you're trying to assign _str, which is the value of a string passed by reference, to str, which is the address of a string). If you wanted to do that, you would write :
str = &_str;
(and you would have to change either _str or str so that the constness matches).
But then, as your intuition told you, you would have leaked the memory of whatever string object was already pointed to by str.
As pointed earlier, when you add a variable to a class in C++, you must think of whether the variable is owned by the object, or by something else.
If it is owned by the object, then you're probably better off storing it as a value and copying stuff around (but then you need to make sure that copies don't happen behind your back).
If it is not owned, then you can store it as a pointer, and you don't necessarily need to copy things all the time.
Other people will explain this better than me, because I am not really comfortable with it.
What I end up doing a lot is writing code like this :
class Foo {
private :
Bar & dep_bar_;
Baz & dep_baz_;
Bing * p_bing_;
public:
Foo(Bar & dep_bar, Baz & dep_baz) : dep_bar_(dep_bar), dep_baz_(dep_baz) {
p_bing_ = new Bing(...);
}
~Foo() {
delete p_bing_;
}
};
That is, if an object depends on something in the 'Java' / 'IoC' sense (the object exists elsewhere, you're not creating it, and you only want to call methods on it), I would store the dependency as a reference, using the dep_ prefix.
If I create the object, I would use a pointer, with a p_ prefix.
This is just to make the code more "immediate". Not sure it helps.
Just my 2c.
Good luck with the memory management; you're right that it is the tricky part when coming from Java. Don't write code until you're comfortable, or you're going to spend hours chasing segfaults.
Hoping this helps !
The general rule in C++ is that for every object created with "new" there must be a "delete". Making sure that always happens is the hard part ;) Modern C++ programmers avoid allocating memory on the heap (i.e. with "new") like the plague and use stack objects instead. Really consider whether you need to be using "new" in your code. It's rarely needed.
If you're coming from a background with garbage collected languages and find yourself really needing to use heap memory, I suggest using the boost shared pointers. You use them like this:
#include <boost/shared_ptr.hpp>
...
boost::shared_ptr<MyClass> myPointer = boost::shared_ptr<MyClass>(new MyClass());
myPointer has pretty much the same language semantics as a regular pointer, but shared_ptr uses reference counting to determine when to delete the object it's referencing. It's basically do-it-yourself garbage collection. The docs are here: http://www.boost.org/doc/libs/1_42_0/libs/smart_ptr/smart_ptr.htm
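The same idea is available in the standard library since C++11 as std::shared_ptr, which is what this sketch uses instead of the Boost class. The alive counter and demo_shared_ptr are made up purely to make the deletion observable.

```cpp
#include <cassert>
#include <memory>

struct MyClass {
    static int alive;          // counts live instances, for illustration only
    MyClass() { ++alive; }
    ~MyClass() { --alive; }
    MyClass(const MyClass&) = delete;
    MyClass& operator=(const MyClass&) = delete;
};
int MyClass::alive = 0;

// Returns the number of live MyClass objects after all owners are gone.
int demo_shared_ptr() {
    {
        std::shared_ptr<MyClass> first = std::make_shared<MyClass>();
        std::shared_ptr<MyClass> second = first;   // shares ownership, no copy of MyClass
        assert(first.use_count() == 2);
        assert(MyClass::alive == 1);
    }   // both owners go out of scope; the object is deleted exactly once
    return MyClass::alive;
}
```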
I'll just write a class for you.
class A
{
Foo * foo; // private by default
public:
A(Foo * foo_): foo(foo_) {}
A(): foo(0) {} // in case you need a no-arguments ("default") constructor
A(const A &a):foo(a.foo ? new Foo(*a.foo) : 0) {} // this is tricky; explanation below
A& operator=(const A &a) { if (this != &a) { Foo * tmp = a.foo ? new Foo(*a.foo) : 0; delete foo; foo = tmp; } return *this; }
void setFoo(Foo * foo_) { delete foo; foo = foo_; }
~A() { delete foo; }
};
For classes that hold resources like this, the copy constructor, assignment operator, and destructor are all necessary. The tricky part of the copy constructor and assignment operator is that you need to delete each Foo precisely once. If the copy constructor initializer had said :foo(a.foo), then that particular Foo would be deleted once when the object being initialized was destroyed and once when the object being initialized from (a) was destroyed.
The class, the way I've written it, needs to be documented as taking ownership of the Foo pointer it's being passed, because Foo * f = new Foo(); A a(f); delete f; will also cause double deletion.
Another way to do that would be to use Boost's smart pointers (which were the core of the next standard's smart pointers) and have boost::shared_ptr<Foo> foo; instead of Foo * foo; in the class definition. In that case, the copy constructor should be A(const A &a):foo(a.foo) {}, since the smart pointer will take care of deleting the Foo when all the copies of the shared pointer pointing at it are destroyed. (There are problems you can get into here, too, particularly if you mix shared_ptr<>s with any other form of pointer, but if you stick to shared_ptr<> throughout you should be OK.)
Note: I'm writing this without running it through a compiler. I'm aiming for accuracy and good style (such as the use of initializers in constructors). If somebody finds a problem, please comment.
Three comments:
You need a destructor as well.
~MyClass()
{
delete str;
}
You really don't need to use heap allocated memory in this case. You could do the following:
class MyClass {
private:
std::string str;
public:
MyClass (const std::string &_str) {
str= _str;
}
void ChangeString(const std::string &_str) {
str = _str;
}
};
You can't do the commented out version. That would be a memory leak. Java takes care of that because it has garbage collection. C++ does not have that feature.
When you reassign a pointer to a new object, do you have to call delete on the old object first to avoid a memory leak? My intuition is telling me yes, but I want a concrete answer before moving on.
Yes. If it's a raw pointer, you must delete the old object first.
There are smart pointer classes that will do this for you when you assign a new value.
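As an illustration, here is a sketch of the class from the question with std::unique_ptr (C++11; std::make_unique needs C++14). The get accessor and demo_unique_ptr are hypothetical additions.

```cpp
#include <cassert>
#include <memory>
#include <string>

class MyClass {
    std::unique_ptr<std::string> str;
public:
    explicit MyClass(const std::string& s)
        : str(std::make_unique<std::string>(s)) {}

    void ChangeString(const std::string& s) {
        // The new string is created first; the assignment then deletes
        // the old one automatically.
        str = std::make_unique<std::string>(s);
    }

    const std::string& get() const { return *str; }
    // No destructor needed: unique_ptr deletes the string.
};

std::string demo_unique_ptr() {
    MyClass obj("before");
    obj.ChangeString("after");
    assert(obj.get() == "after");
    return obj.get();
}
```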