String management in C++ - c++

I've decided to come back to C++ after some time spent in Java and now I'm quite confused about how strings work in C++.
To start with, suppose we have a function:
void fun() {
int a = 1;
Point b(1,2);
char c[] = "c-string";
}
As I understand, a and b are allocated on the stack. c (the pointer) is allocated on the stack too, but the contents ("c-string") live happily on the heap.
Q1: Are the contents of c automatically deallocated when the function fun ends?
Secondly let's suppose we have a c++ string:
void fun2() {
(1) string s = "c++ string";
(2) s += "append";
(3) s = "new contents";
(4) s = "a" + s + "c";
}
String documentation isn't too specific about how the strings work, so here are the questions:
Q2: Are the contents of s automatically deallocated after fun2 ends?
Q3: What does happen when we concatenate two strings? Should I care about memory usage? (line 2)
Q4: What happens when we overwrite the contents of a string (line 3) - what about memory, should I worry? Is the originally allocated space reused?
Q5: What if I construct a string like this (line 4). Is it expensive? Are string literals ("a","c") pooled (like in Java) or repeated throughout the final executable?
What I am ultimately trying to learn is how to correctly use strings in C++.
Thanks for reading this,
Queequeg

As I understand, a and b are allocated on the stack. c (the pointer) is allocated on the stack too, but the contents ("c-string") live happily on the heap.
That's wrong, they all live in automatic memory (the stack). Even the char array. In C++, a string is an object of type std::string.
Q1: Are the contents of c automatically deallocated when the function fun ends?
Yes.
Q2: Are the contents of s automatically deallocated after fun2 ends?
Yes.
Q3: What does happen when we concatenate two strings? Should I care about memory usage? (line 2)
They are concatenated, and the memory is managed automatically (assuming we're talking about std::string and not char[] or char*).
Q4: What happens when we overwrite the contents of a string (line 3) - what about memory, should I worry? Is the originally allocated space reused?
Implementation detail. It can be reused, it can be re-allocated if the previous memory can't hold the new contents.
Q5: What if I construct a string like this (line 4). Is it expensive? Are string literals ("a","c") pooled (like in Java) or repeated throughout the final executable?
String literals can be pooled, but it's not required. For large concatenation, it's usual to use a std::stringstream instead (similar to Java). But profile first, don't do premature optimizations. Note that neither c nor s is itself a string literal, though; a pointer initialized like this does refer to one:
const char* pStr = "this is a string literal";
This resides in read-only memory and can't be modified.
What I am ultimately trying to learn is how to correctly use strings in C++.
Use a std::string.
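As a rough sketch of the std::stringstream suggestion for large concatenations, the helper below joins several pieces through one output stream instead of chaining operator+. The function name join and its parameters are made up for illustration; whether this is actually faster than += in a given program is something only profiling can tell.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: joins the parts with a separator using a single
// std::ostringstream rather than repeated operator+ temporaries.
std::string join(const std::vector<std::string>& parts, const std::string& sep) {
    std::ostringstream out;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) out << sep;   // separator goes between elements only
        out << parts[i];
    }
    return out.str();             // the stream owns and releases its buffer
}
```

A call such as join({"a", "b", "c"}, ", ") yields one std::string, and no manual memory management is involved.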

c is not a pointer. It is an array. You can tell that it's an array because it has square brackets, while pointers have a star. Since c is an automatic variable, it does not require any manual lifetime or memory management.
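A small sketch can make the array-versus-pointer point concrete. The function name local_array_size is invented for this example; it only relies on sizeof and std::strcmp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns the size in bytes of a local char array initialised from a literal.
std::size_t local_array_size() {
    char c[] = "c-string";   // array of 9 chars: 8 characters plus '\0'
    c[0] = 'C';              // allowed: c is the function's own writable copy
    assert(std::strcmp(c, "C-string") == 0);
    return sizeof(c);        // size of the whole array, not of a pointer
}                            // c ceases to exist here; there is nothing to free
```

If c were a pointer, sizeof(c) would be the size of a pointer instead of the length of the text plus its terminator.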
Ad Q2: s is also an automatic variable, and since it is of a well-designed class type, this means you don't need to take care of anything manually.
Ad Q3: Local string objects are suitably modified to contain the new string; in the process, there may or may not exist temporary string objects for the duration of the concatenation expression. (This applies only to line 4; there's no temporary in line 2.)
Ad Q4: Everything is fine and works as expected; see Q2. The original memory may or may not be reused, depending on the details of the assignment. In your example, the original memory would probably be overwritten; in a case like s = std::string("hello");, the buffers of the two strings would probably be swapped.
Ad Q5: String literals are read-only global constants, which the compiler may implement any way it likes. The details are not so important; you will definitely end up with the desired string object in s. See Q3 re temporary objects.
To "learn how to use strings in C++", just go and use them. Treat them like integers. It'll be correct. The beauty of the standard library is that you really don't need to know "how things work"; when you use standard library classes in an idiomatic C++ fashion, all resource management is done automatically and efficiently for you.
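For reference, here is the question's fun2 turned into a function that returns its string, with comments summarising what the answers say about each line. The name fun2_result is an assumption for this sketch, and the comments describe typical behaviour, not a guarantee about any particular implementation.

```cpp
#include <cassert>
#include <string>

// The four statements from the question, returning the final value.
std::string fun2_result() {
    std::string s = "c++ string";  // (1) s owns a copy of the characters
    s += "append";                 // (2) appended in place; may reallocate if capacity is short
    s = "new contents";            // (3) old contents replaced; the buffer may be reused
    s = "a" + s + "c";             // (4) a temporary string is built, then assigned to s
    return s;
}                                  // any storage s still owns is released automatically
```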

Q1: Yes, it is de-allocated. The character array resides on the function's stack.
Q2: Yes, std::string takes care of releasing all of its resources on destruction, which happens on leaving scope, as it does for all automatically allocated variables.
Q3: No, you should not worry, unless profiling tells you you should.
Q4: You should not worry, and the original space may or may not be re-used. In any case, all space used by the strings is de-allocated on exiting the function.
Q5: Given the optimizations the compiler is allowed to make, the only way to determine for sure whether X is more expensive than Y is to profile both of them.

Regarding Q1, it should be noted that the literals "c++ string", "append", "new contents", "a" and "c" are in static memory.

Related

Why does the delete[] syntax exist in C++?

Every time somebody asks a question about delete[] on here, there is always a pretty general "that's how C++ does it, use delete[]" kind of response. Coming from a vanilla C background what I don't understand is why there needs to be a different invocation at all.
With malloc()/free() your options are to get a pointer to a contiguous block of memory and to free a block of contiguous memory. Something in implementation land comes along and knows what size the block you allocated was based on the base address, for when you have to free it.
There is no function free_array(). I've seen some crazy theories on other questions tangentially related to this, such as calling delete ptr will only free the top of the array, not the whole array. Or the more correct, it is not defined by the implementation. And sure... if this was the first version of C++ and you made a weird design choice that makes sense. But why with $PRESENT_YEAR's standard of C++ has it not been overloaded???
It seems that the only extra bit C++ adds is going through the array and calling destructors, and I think maybe this is the crux of it: it literally is using a separate function to save us a single runtime length lookup, or a nullptr at the end of the list, in exchange for torturing every new C++ programmer, or any programmer who had a fuzzy day and forgot that there is a different reserved word.
Can someone please clarify once and for all if there is a reason besides "that's what the standard says and nobody questions it"?
Objects in C++ often have destructors that need to run at the end of their lifetime. delete[] makes sure the destructors of each element of the array are called. But doing this has unspecified overhead, while delete does not. This is why there are two forms of delete expressions. One for arrays, which pays the overhead and one for single objects which does not.
In order to only have one version, an implementation would need a mechanism for tracking extra information about every pointer. But one of the founding principles of C++ is that the user shouldn't be forced to pay a cost that they don't absolutely have to.
Always delete what you new and always delete[] what you new[]. But in modern C++, new and new[] are generally not used anymore. Use std::make_unique, std::make_shared, std::vector or other more expressive and safer alternatives.
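A minimal sketch of the alternatives named above, assuming C++14 or later for std::make_unique. The function total_length is a made-up example; the point is only that no delete or delete[] appears anywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical helper: sums string lengths while every allocation is owned
// by a library type.
std::size_t total_length(std::size_t n) {
    auto single = std::make_unique<std::string>("one");  // instead of new std::string
    std::vector<std::string> many(n, "ab");              // instead of new std::string[n]
    std::size_t total = single->size();
    for (const auto& s : many) total += s.size();
    return total;
}   // single and many are destroyed here, each releasing its own memory
```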
Basically, malloc and free allocate memory, and new and delete create and destroy objects. So you have to know what the objects are.
To elaborate on the unspecified overhead that François Andrieux's answer mentions, you can see my answer on this question, in which I examined what a specific implementation does (Visual C++ 2013, 32-bit). Other implementations may or may not do a similar thing.
In case the new[] was used with an array of objects with a non-trivial destructor, what it did was allocating 4 bytes more, and returning the pointer shifted by 4 bytes ahead, so when delete[] wants to know how many objects are there, it takes the pointer, shifts it 4 bytes prior, and takes the number at that address and treats it as the number of objects stored there. It then calls a destructor on each object (the size of the object is known from the type of the pointer passed). Then, in order to release the exact address, it passes the address that was 4 bytes prior to the passed address.
On this implementation, passing an array allocated with new[] to a regular delete results in calling a single destructor, of the first element, followed by passing the wrong address to the deallocation function, corrupting the heap. Don't do it!
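The bookkeeping described above can be imitated by hand. The following is an illustration only, not how any specific compiler is implemented: array_new, array_delete, Tracked and kHeader are all invented names, and the header is sized for alignment instead of the 4 bytes mentioned for that 32-bit compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Element type that counts how many objects are currently alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Header size chosen so the elements that follow stay suitably aligned.
constexpr std::size_t kHeader = alignof(std::max_align_t);
static_assert(kHeader >= sizeof(std::size_t), "header must hold the count");

// Hypothetical stand-in for new Tracked[n]: stores n in front of the elements.
Tracked* array_new(std::size_t n) {
    void* raw = std::malloc(kHeader + n * sizeof(Tracked));
    if (!raw) throw std::bad_alloc();
    *static_cast<std::size_t*>(raw) = n;   // the element count "cookie"
    Tracked* first = reinterpret_cast<Tracked*>(static_cast<char*>(raw) + kHeader);
    for (std::size_t i = 0; i < n; ++i) new (first + i) Tracked();
    return first;                          // caller sees the shifted pointer
}

// Hypothetical stand-in for delete[] p: reads the count, destroys, frees.
void array_delete(Tracked* p) {
    char* raw = reinterpret_cast<char*>(p) - kHeader;      // shift back to the header
    const std::size_t n = *reinterpret_cast<std::size_t*>(raw);
    for (std::size_t i = n; i > 0; --i) p[i - 1].~Tracked();  // reverse order
    std::free(raw);                        // free the address malloc returned
}
```

Passing the shifted pointer straight to std::free here would be the hand-written analogue of calling plain delete on memory obtained from new[].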
Something not mentioned in the other (all good) answers is that the root cause of this is that arrays - inherited from C - have never been a "first-class" thing in C++.
They have primitive C semantics and do not have C++ semantics, and therefore C++ compiler and runtime support, which would let you or the compiler runtime systems do useful things with pointers to them.
In fact, they're so unsupported by C++ that a pointer to an array of things looks just like a pointer to a single thing. That, in particular, would not happen if arrays were proper parts of the language - even as part of a library, like string or vector.
This wart on the C++ language happened because of this heritage from C. And it remains part of the language - even though we now have std::array for fixed-length arrays and (have always had) std::vector for variable-length arrays - largely for purposes of compatibility: Being able to call out from C++ to operating system APIs and to libraries written in other languages using C-language interop.
And ... because there are truckloads of books and websites and classrooms out there teaching arrays very early in their C++ pedagogy, because of a) being able to write useful/interesting examples early on that do in fact call OS APIs, and of course because of the awesome power of b) "that's the way we've always done it".
Generally, C++ compilers and their associated runtimes build on top of the platform's C runtime. In particular in this case the C memory manager.
The C memory manager allows you to free a block of memory without knowing its size, but there is no standard way to get the size of the block from the runtime and there is no guarantee that the block that was actually allocated is exactly the size you requested. It may well be larger.
Thus the block size stored by the C memory manager can't usefully be used to enable higher-level functionality. If higher-level functionality needs information on the size of the allocation then it must store it itself. (And C++ delete[] does need this for types with destructors, to run them for every element.)
C++ also has an attitude of "you only pay for what you use"; storing an extra length field for every allocation (separate from the underlying allocator's bookkeeping) would not fit well with this attitude.
Since the normal way to represent an array of unknown (at compile time) size in C and C++ is with a pointer to its first element, there is no way the compiler can distinguish between a single object allocation and an array allocation based on the type system. So it leaves it up to the programmer to distinguish.
The cover story is that delete is required because of C++'s relationship with C.
The new operator can make a dynamically allocated object of almost any object type.
But, due to the C heritage, a pointer to an object type is ambiguous between two abstractions:
being the location of a single object, and
being the base of a dynamic array.
The delete versus delete[] situation just follows from that.
However, that does not ring true, because, in spite of the above observations being true, a single delete operator could be used. It does not logically follow that two operators are required.
Here is informal proof. The new T operator invocation (single object case) could implicitly behave as if it were new T[1]. So that is to say, every new could always allocate an array. When no array syntax is mentioned, it could be implicit that an array of [1] will be allocated. Then, there would just have to exist a single delete which behaves like today's delete[].
Why isn't that design followed?
I think it boils down to the usual: it's a goat that was sacrificed to the gods of efficiency. When you allocate an array with new [], extra storage is allocated for meta-data to keep track of the number of elements, so that delete [] can know how many elements need to be iterated for destruction. When you allocate a single object with new, no such meta-data is required. The object can be constructed directly in the memory which comes from the underlying allocator without any extra header.
It's a part of "don't pay for what you don't use" in terms of run-time costs. If you're allocating single objects, you don't have to "pay" for any representational overhead in those objects to deal with the possibility that any dynamic object referenced by pointer might be an array. However, you are burdened with the responsibility of encoding that information in the way you allocate the object with the array new and subsequently delete it.
An example might help. When you allocate a C-style array of objects, those objects may have their own destructor that needs to be called. The delete operator does not do that. It works on container objects, but not C-style arrays. You need delete[] for them.
Here is an example:
#include <iostream>
#include <stdlib.h>
#include <string>
using std::cerr;
using std::cout;
using std::endl;
class silly_string : private std::string {
public:
    silly_string(const char* const s) :
        std::string(s) {}
    ~silly_string() {
        cout.flush();
        cerr << "Deleting \"" << *this << "\"."
             << endl;
        // The destructor of the base class is now implicitly invoked.
    }
    friend std::ostream& operator<< ( std::ostream&, const silly_string& );
};
std::ostream& operator<< ( std::ostream& out, const silly_string& s )
{
    // Cast to a reference so the base string is printed without copying it.
    return out << static_cast<const std::string&>(s);
}
int main()
{
    constexpr size_t nwords = 2;
    silly_string *const words = new silly_string[nwords]{
        "hello,",
        "world!" };
    cout << words[0] << ' '
         << words[1] << '\n';
    delete[] words;
    return EXIT_SUCCESS;
}
That test program explicitly instruments the destructor calls. It’s obviously a contrived example. For one thing, a program does not need to free memory immediately before it terminates and releases all its resources. But it does demonstrate what happens and in what order.
Some compilers, such as clang++, are smart enough to warn you if you leave out the [] in delete[] words;, but if you force it to compile the buggy code anyway, you get heap corruption.
delete is an operator that destroys objects created by a new expression, both arrays and non-array objects.
It comes in two forms: delete for single objects and delete[] for arrays.
The new operator performs dynamic memory allocation, which places objects in heap memory.
This means that delete returns memory to the heap.
The pointer itself is not destroyed; the object or memory block it points to is.
A delete expression has type void, so it does not return a value.

Does char* with text automatically reserves memory as if malloc is used?

// Example program
#include <iostream>
#include <string>
using namespace std;
int main()
{
char **p;
p = (char **)malloc(100);
p[0] = (char *)"Apple"; // or write *p, points to location of 'A'
p[1] = (char *)"Banana"; // or write *(p+1), points to location of 'B'
cout << *p << endl; //Prints the first pointer
}
In the above code :
p[0] = (char *)"Apple";
seems to reserve memory automatically. There is no malloc. Is this C/C++ standard or compiler specific?
UPDATE 1
I am actually interested how it is in C and then in C++. It is just that I did not have a C compiler installed for the code above, so I used C++.
So p is allocated on the STACK pointing to a block of memory (array) in the HEAP where each element points (is a pointer) to a literal in the DATA segment? Wow!
seems to reserve memory automatically. There is no malloc.
Dynamic allocation is not the only way to acquire memory in C++.
The variable p has automatic storage. The string literals are arrays that have static storage. Objects with automatic, static or thread local storage are destroyed automatically. All variables have one of these three storage durations.
Automatic objects are destroyed at the end of the (narrowest surrounding) scope, static objects are destroyed after main returns and thread local objects are destroyed when the thread exits.
Is this C/C++ standard or compiler specific?
The example program is mostly standard, except:
You haven't included a header that is guaranteed to declare malloc.
You haven't created char* objects into the dynamically allocated memory, so the behaviour of the program is technically undefined in C++. See P.P.P.S. below for how to fix this.
P.S. It is quite unsafe to point at string literals with non-const pointers to char. Attempting to modify the literal through such pointer would be syntactically correct, but the behaviour of the program would be undefined at runtime. Use const char* instead. Conveniently, you can get rid of some of the explicit conversions.
P.P.S. C-style explicit conversions are not recommended in C++. Use static_cast, const_cast or reinterpret_cast or their combination instead.
P.P.P.S. It is not recommended to use malloc in C++. Use new or new[] instead... or even better, see next point.
P.P.P.P.S. It is not recommended to have bare owning pointers to dynamic memory. Using a RAII container such as std::vector here would be a good idea.
P.P.P.P.P.S. Your example program leaks the dynamic allocation. This is one of the reasons to avoid bare owning pointers.
So p is allocated on the STACK pointing to a block of memory (array) in the HEAP where each element points (is a pointer) to a literal in the DATA segment?
The language itself is agnostic to concepts such as stack and heap memory and data segment. These are details specific to the implementation of the language on the system that you are using.
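Putting the postscripts above together, the question's program might be reworked along these lines. This is a sketch, not the only reasonable fix: fruit_names is an invented name, and std::vector is just one of the RAII containers that would do.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Returns pointers to string literals. The vector owns only the array of
// pointers; the literals themselves have static storage duration.
std::vector<const char*> fruit_names() {
    std::vector<const char*> p;   // replaces the malloc'ed char** block
    p.push_back("Apple");         // const char*: no cast needed, no writes possible
    p.push_back("Banana");
    return p;
}                                 // no free or delete: the vector cleans up after itself
```

Printing the first entry is then std::cout << fruit_names()[0], and nothing leaks when the vector goes out of scope.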
malloc does dynamic memory allocation. Here you have classic static memory allocation, where string constants will be allocated in data section of your binary (if I'm not mistaken). Compiler knows in advance how many bytes you need, so it will just allocate it during compilation. This is as opposed to malloc, where you can ask for any number of bytes calculated in runtime and unknown in advance.
Same with arrays that you declare with constant length, without using malloc.
[This is the C answer, since the question was originally tagged [c] also. But it mostly applies to C++ as well.]
When you say
char *str = "text"
or, as in your code,
p[0] = "Apple";
the compiler does allocate memory for those strings "text" and "Apple". It definitely is not as if malloc was called, however. In particular, the memory where those strings are stored is not guaranteed to be (and, these days, typically is not) writable. And you can't pass those pointers to free or realloc -- because, again, they did not come from malloc in the first place.
This is a longstanding aspect of C (and, by extension, C++), true since forever and under any compiler.
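When a writable, freeable string is really needed, the usual approach is to copy the literal into memory that did come from malloc. The helper name heap_copy below is hypothetical; the sketch is written as C++ but uses only the C allocation functions discussed here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Returns a malloc'ed copy of text, or a null pointer if allocation fails.
// The caller owns the result and must release it with std::free.
char* heap_copy(const char* text) {
    const std::size_t n = std::strlen(text) + 1;   // include the terminator
    char* copy = static_cast<char*>(std::malloc(n));
    if (copy) std::memcpy(copy, text, n);
    return copy;
}
```

Unlike the literal itself, the returned buffer may be modified and may be passed to free or realloc.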

"New" and "delete" in a function

Writing a dll for file manipulation, I'm running into some issue.
To read bytes from a file via file.read I require a char* array of the desired length.
Since the length is variable, I cannot use
char* ret_chars[next_bytes];
It gives the error that next_bytes is not a constant.
Another topic here in StackOverflow says to use:
char* ret_chars = new char[next_bytes];
Creating it with "new" requires to use "delete" later though, as far as I know.
Now, how am I supposed to delete the array if the return-value of this function is supposed to be exactly this array?
Isn't it a memory leak if I don't use "delete" anywhere?
If that helps anything: This is a DLL I'll be calling from "Game Maker". Therefore I don't have the possibility to delete anything afterwards.
Hope someone can help me!
When you're writing a callback which will be invoked by existing code, you have to follow its rules.
Assuming that the authors of "Game Maker" aren't complete idiots, they will free the memory you return. So you have to check the documentation to find out what function they will use to free the memory, and then you have to call the matching allocator.
In these cases, the framework usually will provide an allocation function which is specially designed for you to use to allocate a return buffer.
Another common approach is that you never return a buffer allocated by the callback. Instead, the framework passes a buffer to your callback, and you simply fill it in. Check the documentation for that possibility as well.
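A caller-supplied buffer interface of that kind might look roughly like the following. The name read_into and its parameters are assumptions made for this sketch and are not taken from any Game Maker documentation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical callback: copies at most capacity bytes of source into the
// caller's buffer and returns how many bytes were written. Nothing is
// allocated here, so there is nothing for either side to free.
std::size_t read_into(char* buffer, std::size_t capacity, const std::string& source) {
    const std::size_t n = std::min(capacity, source.size());
    if (n != 0) std::memcpy(buffer, source.data(), n);
    return n;
}
```

The caller decides where the buffer lives (stack, static, or heap) and therefore also how it is released.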
Is there no sample code for writing "Game Maker" plugins/extensions?
It looks like the developers are indeed complete idiots, at least when it comes to design of plugin interfaces, but they do provide some guidance.
Note that you have to be careful with memory management. That is why I declared the resulting string global.
This implies that the Game Maker engine makes no attempt to free the returned buffer.
You too can use a global, or indeed any variable with static storage duration such as a function-local static variable. std::vector<char> would be a good choice, because it's easy to resize. This way, every time the function is called, the memory allocated for the previous call will be reused or freed. So your "leak" will be limited to the amount you return at once.
char* somefunc( void )
{
static std::vector<char> ret_buffer;
ret_buffer.resize(next_bytes);
// fill it in, blah blah
return &ret_buffer[0];
}
// std::string and return ret_string.c_str(); is another reasonable option
Your script in Game Maker Language will be responsible for making a copy of that result string before it calls your function again and overwrites it.
The new char[ n ] trick works with a runtime value, and yes - you need to delete[] the array when you're done with it or it leaks.
If you are unable to change how "Game Maker" (whatever that is) works, then the memory will be leaked.
If you can change "Game Maker" to do the right thing, then it must manage the lifetime of the returned array.
That's the real problem here - the DLL code can't know when it's no longer needed, so the calling code needs to delete it when it's done, but the calling code cannot delete it directly - it must call back to the DLL to delete it, since it was the DLL's memory manager that allocated it in the first place.
Since you say the return value must be a char[], you therefore need to export a second function from your DLL that takes the char[], and calls delete[] on it. The calling code can then call that function when it's finished with the array returned previously.
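A bare-bones sketch of such a pair of exported functions follows. The names make_buffer and free_buffer are invented, and platform-specific export decoration (for example __declspec(dllexport) on Windows) is left out to keep the example portable.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Allocates a zero-filled buffer inside the DLL; returns a null pointer on failure.
extern "C" char* make_buffer(std::size_t size) {
    return new (std::nothrow) char[size]();
}

// Releases a buffer from make_buffer with the DLL's own deallocator, so the
// allocation and deallocation always use the same memory manager.
extern "C" void free_buffer(char* buffer) {
    delete[] buffer;   // delete[] on a null pointer is a no-op
}
```

This only helps if the calling side can be made to call free_buffer when it is done with the array.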
Use vector <char *> (or vector <char> depending on which you really want - the question isn't entirely clear), that way, you don't need to delete anything.
You can not use new inside a function, without calling delete, or your application will be leaking memory (which is a bad thing, because EVENTUALLY, you'll have no memory left). There is no EASY solution for this, that doesn't have some relatively strict restrictions in some way or another.
The first code sample you quoted allocates memory on the stack.
The second code sample you quote allocates memory on the heap. (Two totally different concepts).
If you are returning the array, then the function allocating the memory does not free it. It is up to the caller to delete the memory. If the caller forgets, then yes, it is a memory leak.
First, if you allocate with new char[], you can't use delete; you have to use delete[].
But like you said, if you use new[] in this function without a matching delete[], your program will leak.
If you want a kind of garbage collection, you can use the smart pointers that are now in the standard C++ library.
I think a shared_ptr would be good to achieve what you want.
Shared ptr: Manages the storage of a pointer, providing a limited garbage-collection facility, possibly sharing that management with other objects.
Here is some documentation about it: http://www.cplusplus.com/reference/memory/shared_ptr/
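For a character array specifically, the array form of std::unique_ptr is a simple option, because it calls delete[] for you. The sketch below uses it in place of shared_ptr; copy_text is an invented name. Note that a smart pointer only helps within C++ code and cannot by itself manage memory handed across to a non-C++ caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Returns an owning pointer to a new[]-allocated copy of text.
std::unique_ptr<char[]> copy_text(const char* text) {
    const std::size_t n = std::strlen(text) + 1;   // include the terminator
    std::unique_ptr<char[]> chars(new char[n]);
    std::memcpy(chars.get(), text, n);
    return chars;   // ownership moves to the caller
}                   // delete[] runs automatically when the last owner is destroyed
```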
Ok, I'll jump in as well.
If the Game Maker doesn't explicitly say it will delete this memory, then you should check to see just how big a buffer it wants and pass in a static buffer of that size instead. This avoids all sorts of nastiness relating to cross dll versioning issues with memory management. There has to be some documentation on this in their code or their API and I strongly suggest you find and read it. Game Maker is a pretty large and well known API so Google should work for info if you don't have the docs yourself.
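If you're returning a char pointer to memory allocated with new[], which it looks as though you are, then the caller can simply call delete[] on that pointer.
Example:
#include <cstddef>
#include <cstring>
#include <iostream>
// Placeholder size for this example; it must hold the text plus its '\0'.
const std::size_t next_bytes = 32;
char* getString()
{
    char* ret_chars = new char[next_bytes];
    std::strcpy(ret_chars, "Hello world");
    return ret_chars;
}
void displayChars()
{
    char* chars = getString();
    std::cout << chars;
    delete[] chars; // matches the new[] in getString()
}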
Just be sure to deallocate (delete) all allocated (new'd) pointers, or else you'll have memory leaks, where memory is allocated but never released and becomes unusable for as long as the program runs. A quick and dirty check is to count your new's and count your deletes; they should be 1-to-1 unless some appear in conditional or looped blocks.

std::vector - how to free the memory of char* elements in a vector?

Consider the following C++ code:
using namespace std;
vector<char*> aCharPointerRow;
aCharPointerRow.push_back("String_11");
aCharPointerRow.push_back("String_12");
aCharPointerRow.push_back("String_13");
for (int i=0; i<aCharPointerRow.size(); i++) {
cout << aCharPointerRow[i] << ",";
}
aCharPointerRow.clear();
After the aCharPointerRow.clear(); line, the character pointer elements in aCharPointerRow should all be removed.
Is there a memory leak in the above C++ code ? Do I need to explicitly free the memory allocated to the char* strings ? If yes, how ?
Thanks for any suggestion.
Is there a memory leak in the above C++ code?
There is no memory leak.
Since you never used new you do not need to call delete. You only need to deallocate dynamic memory if it was allocated in the first place.
Note that ideally, you should be using a vector of std::string.
std::vector<std::string> str;
str.push_back("String_11");
str.push_back("String_12");
str.push_back("String_13");
You could use std::string::c_str() in case you need to get the underlying character pointer (const char*), which a lot of C APIs expect as a parameter.
You are pushing in your vector string literals (the strings in "..."). These aren't allocated by you. They are given to you by the C++ compiler/runtime and they have a lifetime equal to the lifetime of the app, so you can't/mustn't free them.
See for example Scope of (string) literals
Note that everything I told you was based on the fact that you are using string literals. If you need to allocate your strings' memory, then you will have to use some automatic deallocators like std::unique_ptr (of C++11) or boost::unique_ptr or boost::shared_ptr (of Boost) or better use the std::string class as suggested by Als
The sample has no leak, since the pointers you store don't refer to dynamic memory.
But it is also badly written code: string literals are constant, but older C++ standards allow referring to them through char* for backward compatibility with C libraries. If you intend to refer to string literals, you had better use const char* instead of char* (an attempt to modify them is then a compile error rather than undefined behaviour at run time).
Another bad thing here is that in more extensive code you sooner or later lose track of what the char* stored in the vector actually point to: are they guaranteed to always be string literals, or can they also be dynamically allocated char[] buffers? And who is responsible for their allocation and deallocation?
std::vector says nothing about that, and if you cannot give a clean answer to the above questions (each buffer referred to by a const char* may or may not outlive the vector), you are probably better off using std::vector<std::string> and treating the strings as values (not referenced objects), letting the string class do the dirty job.
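To illustrate the "strings as values" point, the sketch below stores text from a short-lived buffer in a std::vector<std::string>. The function name collect is made up; the behaviour shown (push_back copying the characters) is standard.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds a vector whose elements own their characters.
std::vector<std::string> collect() {
    std::vector<std::string> rows;
    {
        char buffer[] = "String_11";  // short-lived, modifiable buffer
        rows.push_back(buffer);       // the vector stores its own copy
        buffer[0] = 'X';              // changing the buffer does not affect the copy
    }                                 // buffer is gone; rows[0] is still valid
    rows.push_back("String_12");      // literals are copied in the same way
    return rows;
}
```

With std::vector<const char*> the first element would have been left dangling once buffer went out of scope.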
There is no leak. As long as you're not making a copy of those strings, you don't need to explicitly delete or free() them.

Performance on strings initialization in C++

I have following questions regarding strings in C++:
1>> which is a better option(considering performance) and why?
1.
string a;
a = "hello!";
OR
2.
string *a;
a = new string("hello!");
...
delete(a);
2>>
string a;
a = "less";
a = "moreeeeeee";
how exactly memory management is handled in c++ when a bigger string is copied into a smaller string? Are c++ strings mutable?
It is almost never necessary or desirable to say
string * s = new string("hello");
After all, you would (almost) never say:
int * i = new int(42);
You should instead say
string s( "hello" );
or
string s = "hello";
And yes, C++ strings are mutable.
All the following is what a naive compiler would do. Of course as long as it doesn't change the behavior of the program, the compiler is free to make any optimization.
string a;
a = "hello!";
First you initialize a to contain the empty string. (set length to 0, and one or two other operations). Then you assign a new value, overwriting the length value that was already set. It may also have to perform a check to see how big the current buffer is, and whether or not more memory should be allocated.
string *a;
a = new string("hello!");
...
delete(a);
Calling new requires the OS and the memory allocator to find a free chunk of memory. That's slow. Then you initialize it immediately, so you don't assign anything twice or require the buffer to be resized, like you do in the first version.
Then something bad happens, and you forget to call delete, and you have a memory leak, in addition to a string that is extremely slow to allocate. So this is bad.
string a;
a = "less";
a = "moreeeeeee";
Like in the first case, you first initialize a to contain the empty string. Then you assign a new string, and then another. Each of these may require a call to new to allocate more memory. Each line also requires length, and possibly other internal variables to be assigned.
Normally, you'd allocate it like this:
string a = "hello";
One line, perform initialization once, rather than first default-initializing, and then assigning the value you want.
It also minimizes errors, because you don't have a nonsensical empty string anywhere in your program. If the string exists, it contains the value you want.
About memory management, google RAII.
In short, string calls new/delete internally to resize its buffer. That means you never need to allocate a string with new. The string object has a fixed size, and is designed to be allocated on the stack, so that the destructor is automatically called when it goes out of scope. The destructor then guarantees that any allocated memory is freed. That way, you don't have to use new/delete in your user code, which means you won't leak memory.
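The same pattern can be written by hand in a few lines. The Buffer class below is a toy invented for this explanation, far simpler than a real std::string, but it shows the mechanism: the constructor acquires memory and the destructor releases it when the object leaves scope.

```cpp
#include <cassert>
#include <cstddef>

// Toy RAII wrapper around a new[]-allocated block (not a real string class).
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new char[n]()), size_(n) { ++live; }
    ~Buffer() { delete[] data_; --live; }          // runs automatically at scope exit
    Buffer(const Buffer&) = delete;                // copying omitted to keep the sketch short
    Buffer& operator=(const Buffer&) = delete;
    std::size_t size() const { return size_; }
    static int live;                               // number of Buffers currently alive
private:
    char* data_;
    std::size_t size_;
};
int Buffer::live = 0;

// Returns how many Buffers exist while a local one is in scope.
int live_inside_scope() {
    Buffer b(16);         // the object is on the stack; its 16 bytes are on the heap
    return Buffer::live;
}                         // b's destructor frees the bytes here
```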
Is there a specific reason why you constantly use assignment instead of initialization? That is, why don't you write
string a = "Hello";
etc.? This avoids a default construction and just makes more sense semantically. Creating a pointer to a string just for the sake of allocating it on the heap is never meaningful, i.e. your case 2 doesn't make sense and is slightly less efficient.
As to your last question, yes, strings in C++ are mutable unless declared const.
string a;
a = "hello!";
2 operations: calls the default constructor std:string() and then calls the operator::=
string *a; a = new string("hello!"); ... delete(a);
Only one operation: this calls the constructor std::string(const char*), but you must not forget to delete the pointer.
What about
string a("hello");
In case 1.1, the string's members (which include a pointer to the data) are held on the stack, and the memory occupied by the class instance is freed when a goes out of scope.
In case 1.2, the memory for the members is allocated dynamically from the heap as well.
When you assign a char* constant to a string, the memory that holds the data is reallocated, if the current capacity is too small, to fit the new data.
You may see how much memory is allocated by calling string::capacity().
When you call string a("hello"), memory gets allocated in the constructor.
Both the constructor and the assignment operator call the same methods internally to allocate memory and copy the new data there.
If you look at the docs for the STL string class (I believe the SGI docs are compliant with the spec), many of the methods list complexity guarantees. I believe many of the complexity guarantees are intentionally left vague to allow different implementations. Some implementations have used a copy-on-modify approach, such that assigning one string to another is a constant-time operation, but you may incur an unexpected cost when you try to modify one of those instances. Since C++11, though, the standard's requirements on std::string effectively rule out copy-on-write, so this mainly applies to older implementations.
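Whatever the implementation does internally, a copy behaves as an independent value. A small sketch of that guarantee, using a made-up helper name (copy_then_modify_original is just for illustration):

```cpp
#include <string>

// Hypothetical helper: copies a string, overwrites the original,
// and returns the copy so the caller can inspect it.
std::string copy_then_modify_original() {
    std::string a = "Hello!";
    std::string b = a;          // copy; may or may not share storage internally
    a = "Something else!";      // changing a must never be visible through b
    return b;
}
```

The returned string still holds "Hello!". What can differ between implementations is only when the actual character copy happens, not the result.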
You should also check out the capacity() function, which will tell you the maximum length string you can put into a given string instance before it will be forced to reallocate memory. You can also use reserve() to cause a reallocation to a specific amount if you know you're going to be storing a large string in the variable at a later time.
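A short sketch of reserve() and capacity() in use. Both helper functions are made up for illustration; only reserve(), capacity(), size() and operator+= are real std::string members.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds a string of `count` copies of `ch`,
// reserving the space up front instead of growing step by step.
std::string repeat_char(char ch, std::size_t count) {
    std::string out;
    out.reserve(count);                    // ask for room for at least `count` chars
    for (std::size_t i = 0; i < count; ++i) {
        out += ch;                         // fits in the reserved capacity
    }
    return out;
}

// Hypothetical helper: true if `extra` more characters fit
// without exceeding the current capacity.
bool fits_without_realloc(const std::string& s, std::size_t extra) {
    return s.size() + extra <= s.capacity();
}
```

After reserve(n), capacity() is guaranteed to be at least n, although an implementation is free to give you more.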
As others have said, as far as your examples go, you should really favor initialization over other approaches to avoid the creation of temporary objects.
Most likely
string a("hello!");
is faster than anything else.
You're coming from Java, right? In C++, objects are treated the same (in most ways) as the basic value types. Objects can live on the stack or in static storage, and be passed by value. When you declare a string in a function, that allocates on the stack however many bytes the string object takes. The string object itself does use dynamic memory to store the actual characters, but that's transparent to you. The other thing to remember is that when the function exits and the string you declared is no longer in scope, all of the memory it used is freed. No need for garbage collection (RAII is your best friend).
In your example:
string a;
a = "less";
a = "moreeeeeee";
This puts a block of memory on the stack and names it a, then the constructor is called and a is initialized to an empty string. The compiler stores the bytes for "less" and "moreeeeeee" in (I think) the .rdata section of your exe. String a will have a few fields, like a length field and a char* (I'm simplifying greatly). When you assign "less" to a, the operator=() method is called. It dynamically allocates memory to store the input value if needed, then copies it in. When you later assign "moreeeeeee" to a, the operator=() method is called again; it reallocates enough memory to hold the new value if necessary, then copies it into the internal buffer.
When string a's scope exits, the string destructor is called and the memory that was dynamically allocated to hold the actual characters is freed. Then the stack pointer is adjusted and the memory that held a is no longer "on" the stack.
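One way to see that the string object on the stack is separate from the characters it manages: sizeof gives the same answer no matter how long the contents are. This is a small sketch with a made-up helper name (object_size); the actual number of bytes depends on your standard library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: size in bytes of the string object itself,
// not of the characters it manages.
std::size_t object_size(const std::string& s) {
    return sizeof(s);          // sizeof of a reference is sizeof(std::string)
}
```

object_size("less") and object_size(std::string(1000, 'x')) return the same value, because the object's footprint is fixed at compile time and longer contents live in a separately managed buffer.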
Creating a string directly on the heap is usually not a good idea, just as it isn't for built-in types. It's not worth it, since the object can easily stay on the stack and it has the copy constructor and assignment operator needed for an efficient copy.
The std::string itself has a buffer on the heap that may be shared by several strings, depending on the implementation.
For instance, with Microsoft's STL implementation you could do this:
string a = "Hello!";
string b = a;
And both strings would share the same buffer until you changed one of them:
a = "Something else!";
That's why it was very bad to store the result of c_str() for later use; the pointer returned by c_str() is only guaranteed to stay valid until the next non-const member function is called on that string object.
This led to very nasty concurrency bugs, which required this sharing functionality to be turned off with a define if you used such strings in a multithreaded application.
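If you need the old contents later, a safer habit is to keep an owning copy rather than the raw pointer. A minimal sketch, with a made-up helper name (snapshot_before_change is for illustration only):

```cpp
#include <string>

// Hypothetical helper: saves the current contents of s as an owning copy,
// then overwrites s and returns the saved copy.
std::string snapshot_before_change(std::string& s, const std::string& replacement) {
    const char* raw = s.c_str();   // only valid until s is modified
    std::string saved(raw);        // copy the characters while raw is still valid
    s = replacement;               // raw may dangle from here on; do not use it
    return saved;
}
```

The returned string owns its characters, so it stays valid no matter what happens to s afterwards.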