Initialize a pointer to struct syntax - c++

Is there any difference in terms of memory allocation between
struct_type * mystruct = new struct_type();
and
struct_type *mystruct = new struct_type[1];
?

It depends on what you mean by "difference in memory allocation".
Firstly, new and new[] are two independent memory allocation mechanisms, which can (and often do) allocate memory with different internal layouts, e.g. with different implementation-dependent housekeeping information associated with the allocated memory block. It is important to remember that the first allocation has to be paired with delete and the second with delete[]. For the same reason, in a typical implementation the second allocation might consume more memory than the first.
Secondly, the initializer syntax () you used in the first allocation triggers value-initialization of the allocated object. Meanwhile, in your second allocation you supplied no initializer at all. Depending on the specifics of the struct_type class, this can lead to significant differences in initialization. For example, if struct_type is defined as struct struct_type { int x; }, the first allocation is guaranteed to set mystruct->x to zero, while the second one leaves mystruct->x with an indeterminate (garbage) value. You have to write new struct_type[1]() to eliminate this (probably unintended) difference.
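A minimal sketch of the initialization point, assuming struct_type is the one-member struct from the example above. The helper names are hypothetical, and only the value-initialized forms are read back, because reading the uninitialized member would be undefined behaviour.

```cpp
#include <cassert>

struct struct_type { int x; };

// new struct_type() value-initializes the object, so x is guaranteed to be 0.
int single_value_initialized() {
    struct_type* p = new struct_type();
    int x = p->x;
    delete p;                  // paired with new
    return x;
}

// new struct_type[1]() value-initializes every element of the array.
int array_value_initialized() {
    struct_type* p = new struct_type[1]();
    int x = p[0].x;
    delete[] p;                // paired with new[]
    return x;
}

// Usage: both forms yield a zeroed member.
inline void check_value_initialization() {
    assert(single_value_initialized() == 0);
    assert(array_value_initialized() == 0);
}
```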

They will allocate the same amount of visible/usable memory, namely the memory required to hold one object. But the semantics are different: the former is a pointer to a single object, while the latter is a pointer to the first element of an array containing one object. When deallocating, you should use
delete mystruct;
in the first case and
delete[] mystruct;
in the second case.
Another difference is that the implementation may have to keep some bookkeeping information in the latter case, for example the number of items in the array, so that it can be deleted correctly. And of course your structure must have a default constructor to be used in the latter case.

The first line would create a structure object and return its address to your pointer.
The second line would create an array of one structure object and return the starting address of the array to your pointer.

I think there is no difference in terms of memory allocation between these two lines of code.

Related

Getting dynamically allocated array size

In the book "The C++ Programming Language", Stroustrup says:
"To deallocate space allocated by new, delete and delete[] must be able to determine the size of the object allocated. This implies that an object allocated using the standard implementation of new will occupy slightly more space than a static object. Typically, one word is used to hold the object’s size."
That means every object allocated by new has its size located somewhere in the heap. Is the location known and if it is how can I access it?
In fact, typical memory allocator implementations store some other information too.
There is no standard way to access this information. In fact, nothing in the standard says WHAT information is stored either (the size in bytes, the number of elements and their size, a pointer to the last element, etc.).
Edit:
If you have the base-address of the object and the correct type, I suspect the size of the allocation could be relatively easily found (not necessarily "at no cost at all"). However, there are several problems:
It assumes you have the original pointer.
It assumes the memory is allocated exactly with that runtime library's allocation code.
It assumes the allocator doesn't "round" the allocation address in some way.
To illustrate how this could go wrong, let's say we do this:
// allocated_length() is hypothetical: no such standard function exists.
size_t get_len_array(int *mem)
{
    return allocated_length(mem);
}
...
void func()
{
    int *p = new int[100];
    cout << get_len_array(p);
    delete [] p;
}
void func2()
{
    int buf[100];
    cout << get_len_array(buf); // Ouch! buf was never allocated by new[]
}
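Since no portable lookup function like the one imagined above exists, a common alternative is to keep the size together with the data. A minimal sketch using std::vector; get_len and demo_len are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The container carries its own element count, so no allocator internals are needed.
std::size_t get_len(const std::vector<int>& v) {
    return v.size();
}

std::size_t demo_len() {
    std::vector<int> p(100);      // dynamically allocated storage for 100 ints
    assert(get_len(p) == 100);
    return get_len(p);
}                                 // the storage is released automatically here
```

Unlike the raw-pointer version, there is no way to pass a buffer whose size is unknown.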
That means every object allocated by new has its size located somewhere in the heap. Is the location known and if it is how can I access it?
Not really, that is not needed for all cases. To simplify the reasoning, there are two levels at which the sizes could be needed. At the language level, the compiler needs to know what to destroy. At the allocator level, the allocator needs to know how to release the memory given only a pointer.
At the language level, only the array versions new[] and delete[] need to handle any size. When you allocate with new, you get a pointer with the type of the object, and that type has a given size.
To destroy the object the size is not needed. When you delete, either the pointer is to the correct type, or the static type of the pointer is a base and the destructor is virtual. All other cases are undefined behavior, and thus can be ignored (anything can happen). If it is the correct type, then the size is known. If it is a base with a virtual destructor, the dynamic dispatch will find the final overrider, and at that point the type is known.
Different strategies exist for managing this. The Itanium C++ ABI (used by multiple compilers on multiple platforms, although not by Visual Studio), for example, generates up to three different destructors per type, one of which also takes care of releasing the memory. So although delete ptr is defined in terms of calling the appropriate destructor and then releasing the memory, in this particular ABI delete ptr calls a special destructor that both destroys the object and releases the memory.
When you use new[] the type of the pointer is the same regardless of the number of elements in the dynamic array, so the type cannot be used to retrieve that information back. A common implementation is allocating an extra integral value and storing the size there, followed by the real objects, then returning a pointer to the first object. delete[] would then move the received pointer one integer back, read the number of elements, call the destructor for all of them and then release the memory (using the pointer originally returned by the allocator, not the pointer given to the program). This is really only needed if the type has a non-trivial destructor; if the destructor is trivial, the implementation does not need to store the count at all.
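The "count stored in front of the elements" idea can be sketched by hand. This is only an illustration under stated assumptions, not what any particular compiler does: alloc_counted, stored_count and free_counted are hypothetical helpers, and the header is padded to the maximum alignment so the elements stay properly aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Header size: padded so the elements that follow remain maximally aligned.
constexpr std::size_t kHeader = alignof(std::max_align_t);
static_assert(kHeader >= sizeof(std::size_t), "header must be able to hold the count");

// Allocate n ints preceded by a hidden element count.
int* alloc_counted(std::size_t n) {
    void* raw = std::malloc(kHeader + n * sizeof(int));
    if (raw == nullptr) throw std::bad_alloc();
    new (raw) std::size_t(n);                                   // store the count first
    unsigned char* first = static_cast<unsigned char*>(raw) + kHeader;
    int* elems = reinterpret_cast<int*>(first);
    for (std::size_t i = 0; i < n; ++i) new (elems + i) int();  // zero each element
    return elems;                                               // caller sees only the elements
}

// Step back over the header to read the count, as a delete[] might.
std::size_t stored_count(const int* p) {
    const unsigned char* raw = reinterpret_cast<const unsigned char*>(p) - kHeader;
    return *reinterpret_cast<const std::size_t*>(raw);
}

// Release using the address the allocator originally returned.
void free_counted(int* p) {
    assert(p != nullptr);
    std::free(reinterpret_cast<unsigned char*>(p) - kHeader);
}
```

A pointer that did not come from alloc_counted has no header in front of it, which is exactly why mismatched allocation and deallocation functions break.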
Out of the language level, the real memory allocator (think of malloc) needs to know how much memory was allocated so that the same amount can be released. In some cases that can be done by attaching the metadata to the memory buffer in the same way that new[] stores the size of the array, by acquiring a larger block, storing the metadata there and returning a pointer beyond it. The deallocator would then undo the transformation to get to the metadata.
This is, on the other hand, not always needed. A common implementation for allocators of small size is to allocate pages of memory to form pools from which the small allocations are then obtained. To make this efficient, the allocator considers only a few different sizes, and allocations that don't fit one of the sizes exactly are bumped to the next size. If you request, for example, 65 bytes, the allocator might actually give you 128 bytes (assuming pools of 64 and 128 bytes). Thus given one of the larger blocks managed by the allocator, all pointers that were allocated from it have the same size. The allocator can then find the block from which pointer was allocated and infer the size from it.
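The rounding step can be sketched as follows, assuming the two pool sizes from the example (64 and 128 bytes). The table and the function name are hypothetical; real allocators use many more classes and a separate path for large requests.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical pool sizes, matching the 64/128-byte example in the text.
constexpr std::size_t kSizeClasses[] = {64, 128};

// Return the smallest size class that fits the request, or 0 if none does
// (a real allocator would fall back to a different strategy for large blocks).
std::size_t size_class_for(std::size_t request) {
    for (std::size_t cls : kSizeClasses) {
        if (request <= cls) return cls;
    }
    return 0;
}

// Usage: a 65-byte request is bumped to the 128-byte pool.
inline void check_size_classes() {
    assert(size_class_for(65) == 128);
    assert(size_class_for(64) == 64);
}
```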
Of course, these are all implementation details that are not accessible to a C++ program in a standard, portable way, and the exact implementation can differ based not just on the program but also on the execution environment. If you are interested in knowing how the information is really kept in your environment, you might be able to find the information, but I would think twice before trying to use it for anything other than learning purposes.
You are not deleting an object directly; instead, you pass a pointer to the delete operator.
Reference (C++):
You use delete by following it with a pointer to a block of memory originally allocated with new:
int * ps = new int; // allocate memory with new
. . . // use the memory
delete ps; // free memory with delete when done
This removes the memory to which ps points; it doesn’t remove the pointer ps itself.
You can reuse ps, for example, to point to another new allocation.

C++ dynamically allocated memory

I don't quite get the point of dynamically allocated memory and I am hoping you guys can make things clearer for me.
First of all, every time we allocate memory we simply get a pointer to that memory.
int * dynInt = new int;
So what is the difference between doing what I did above and:
int someInt;
int* dynInt = &someInt;
As I understand, in both cases memory is allocated for an int, and we get a pointer to that memory.
So what's the difference between the two. When is one method preferred to the other.
Further more why do I need to free up memory with
delete dynInt;
in the first case, but not in the second case.
My guesses are:
When dynamically allocating memory for an object, the object doesn't get initialized, while if you do something like in the second case, the object gets initialized. If this is the only difference, is there any motivation behind this apart from the fact that dynamically allocating memory is faster?
The reason we don't need to use delete for the second case is that the fact that the object was initialized creates some kind of automatic destruction routine.
Those are just guesses; I would love it if someone corrected me and clarified things for me.
The difference is in storage duration.
Objects with automatic storage duration are your "normal" objects that automatically go out of scope at the end of the block in which they're defined.
Create them like int someInt;
You may have heard of them as "stack objects", though I object to this terminology.
Objects with dynamic storage duration have something of a "manual" lifetime; you have to destroy them yourself with delete, and create them with the keyword new.
You may have heard of them as "heap objects", though I object to this, too.
The use of pointers is actually not strictly relevant to either of them. You can have a pointer to an object of automatic storage duration (your second example), and you can have a pointer to an object of dynamic storage duration (your first example).
But it's rare that you'll want a pointer to an automatic object, because:
you don't have one "by default";
the object isn't going to last very long, so there's not a lot you can do with such a pointer.
By contrast, dynamic objects are often accessed through pointers, simply because the syntax comes close to enforcing it. new returns a pointer for you to use, you have to pass a pointer to delete, and (aside from using references) there's actually no other way to access the object. It lives "out there" in a cloud of dynamicness that's not sitting in the local scope.
Because of this, the usage of pointers is sometimes confused with the usage of dynamic storage, but in fact the former is not causally related to the latter.
An object created like this:
int foo;
has automatic storage duration - the object lives until the variable foo goes out of scope. This means that in your second example, dynInt will be an invalid pointer once someInt goes out of scope (for example, at the end of a function).
An object created like this:
int* foo = new int;
has dynamic storage duration - the object lives until you explicitly call delete on it.
Initialization of the objects is an orthogonal concept; it is not directly related to which type of storage-duration you use. See here for more information on initialization.
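The two storage durations can be observed with a small counter. This is a sketch only; Tracked and alive_after_scope are hypothetical names introduced for the illustration.

```cpp
#include <cassert>

// Counts how many Tracked objects are currently alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Returns the number of live objects right after the inner scope has ended.
int alive_after_scope() {
    Tracked* dyn = nullptr;
    {
        Tracked automatic;        // automatic storage duration
        dyn = new Tracked;        // dynamic storage duration
        assert(Tracked::alive == 2);
    }                             // 'automatic' is destroyed here, '*dyn' is not
    int remaining = Tracked::alive;
    delete dyn;                   // the dynamic object ends only now
    return remaining;
}
```

The function returns 1: only the dynamically allocated object survives the end of the block.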
Your program gets an initial chunk of memory at startup. This memory is called the stack. Its size is platform-dependent, typically a few megabytes (for example, 1 MB by default on Windows and 8 MB on many Linux systems).
Your program can ask the OS for additional memory. This is called dynamic memory allocation. This allocates memory on the free store (C++ terminology) or the heap (C terminology). You can ask for as much memory as the system is willing to give (multiple gigabytes).
The syntax for allocating a variable on the stack looks like this:
{
int a; // allocate on the stack
} // automatic cleanup on scope exit
The syntax for allocating a variable using memory from the free store looks like this:
int * a = new int; // request memory from the free store for an int
delete a; // user is responsible for deleting the object
To answer your questions:
When is one method preferred to the other.
Generally stack allocation is preferred.
Dynamic allocation is required when you need to store a polymorphic object using its base type.
Always use a smart pointer to automate deletion:
C++03: boost::scoped_ptr, boost::shared_ptr or std::auto_ptr.
C++11: std::unique_ptr or std::shared_ptr.
For example:
// stack allocation (safe)
Circle c;
// heap allocation (unsafe: the delete is skipped if an exception is thrown first)
Shape * shape = new Circle;
delete shape;
// heap allocation with smart pointers (safe)
std::unique_ptr<Shape> safeShape(new Circle);
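A self-contained sketch of the smart-pointer case. The Shape and Circle definitions here are assumptions made up for the illustration (the original does not show them), including the destructor counter.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Shape {
    virtual ~Shape() {}                    // virtual: deleting through Shape* is safe
    virtual std::string name() const = 0;
};

struct Circle : Shape {
    static int destroyed;                  // counts destructor calls, for illustration
    ~Circle() override { ++destroyed; }
    std::string name() const override { return "circle"; }
};
int Circle::destroyed = 0;

// The unique_ptr deletes the object when it goes out of scope.
std::string name_via_smart_pointer() {
    std::unique_ptr<Shape> shape(new Circle);
    assert(shape != nullptr);
    return shape->name();
}                                          // ~Circle runs here, no explicit delete
```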
Further more why do I need to free up memory in the first case, but not in the second case.
As I mentioned above stack allocated variables are automatically deallocated on scope exit.
Note that you must not delete stack memory. Doing so is undefined behavior and will most likely crash your application.
For a single integer it only makes sense if you need to keep the value after, for example, returning from a function. Had you declared someInt as you said, it would have been invalidated as soon as it went out of scope.
However, in general there is a greater use for dynamic allocation. There are many things that your program doesn't know before allocation and depends on input. For example, your program needs to read an image file. How big is that image file? We could say we store it in an array like this:
unsigned char data[1000000];
But that would only work if the image size was less than or equal to 1000000 bytes, and would also be wasteful for smaller images. Instead, we can dynamically allocate the memory:
unsigned char* data = new unsigned char[file_size];
Here, file_size is determined at runtime. You couldn't possibly tell this value at the time of compilation.
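The same run-time sizing can be sketched with std::vector, which releases its storage automatically. make_buffer is a hypothetical helper, and file_size stands for whatever value the program determines at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Allocate a zero-filled byte buffer whose size is only known at run time.
std::vector<unsigned char> make_buffer(std::size_t file_size) {
    std::vector<unsigned char> data(file_size);
    assert(data.size() == file_size);
    return data;
}
```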
Read more about dynamic memory allocation and also garbage collection
You really need to read a good C or C++ programming book.
Explaining in detail would take a lot of time.
The heap is the memory inside which dynamic allocation (with new in C++ or malloc in C) happens. There are system calls involved with growing and shrinking the heap. On Linux, they are mmap & munmap (used to implement malloc and new etc...).
You can call the allocation primitive many times. So you could put int *p = new int; inside a loop, and get a fresh location every time you loop!
Don't forget to release memory (with delete in C++ or free in C). Otherwise, you'll get a memory leak (a naughty kind of bug). On Linux, valgrind helps to catch them.
Whenever you use new in C++, memory is typically allocated through malloc, which in turn calls the sbrk system call (or similar). Therefore only the runtime allocator and the OS, not the compiler, know about the requested block. So you'll have to use delete (which typically calls free, which may go to sbrk again) to give the memory back. Otherwise you'll get a memory leak.
Now, when it comes to your second case, the compiler has knowledge about the size of the allocated memory. That is, in your case, the size of one int. Setting a pointer to the address of this int does not change anything in the knowledge of the needed memory. In other words, the compiler is able to take care of freeing the memory. In the first case with new this is not possible.
In addition to that, new and malloc do not need to allocate exactly the requested size, which makes things a bit more complicated.
Edit
Two more common phrases: the compiler-managed case (your second example) is sometimes loosely called static memory allocation, while the new case (your first example) is dynamic memory allocation, done by the runtime system.
What happens if your program is supposed to let the user store any number of integers? Then you'll need to decide during run-time, based on the user's input, how many ints to allocate, so this must be done dynamically.
In a nutshell, a dynamically allocated object's lifetime is controlled by you and not by the language. This allows you to let it live as long as it is required (as opposed to the end of the scope), possibly determined by a condition that can only be calculated at run time.
Also, dynamic memory is typically much more "scalable" - i.e. you can allocate more and/or larger objects compared to stack-based allocation.
The allocation essentially "marks" a piece of memory so no other object can be allocated in the same space. De-allocation "unmarks" that piece of memory so it can be reused for later allocations. If you fail to deallocate memory after it is no longer needed, you get a condition known as a "memory leak" - your program is occupying memory it no longer needs, leading to possible failure to allocate new memory (due to the lack of free memory), and just generally putting an unnecessary strain on the system.

How does delete differentiate between built-in data types and user defined ones?

If I do this:
// (1.)
int* p = new int;
//...do something
delete p;
// (2.)
class sample
{
public:
sample(){}
~sample(){}
};
sample* pObj = new sample;
//...do something
delete pObj;
Then how does C++ compiler know that object following delete is built-in data type or a class object?
My other question is that if I new a pointer to an array of int's and then I delete [] then how does compiler know the size of memory block to de-allocate?
The compiler knows the type of the pointed-to object because it knows the type of the pointer:
p is an int*, therefore the pointed-to object will be an int.
pObj is a sample*, therefore the pointed-to object will be a sample.
The compiler does not know if your int* p points to a single int object or to an array (int[N]). That's why you must remember to use delete[] instead of delete for arrays.
The size of the memory block to de-allocate and, most importantly, the number of objects to destroy, are known because new[] stores them somewhere, and delete[] knows where to retrieve these values. This question from C++ FAQ Lite shows two common techniques to implement new[] and delete[].
It knows the difference between them because of the type of the pointer you pass to it: It is undefined behavior to pass a different pointer type than you allocated with (except you may pass a pointer to a base class, if the destructor is virtual, of course).
The size of an array will be stored somewhere. It's like in C where you can malloc a certain amount of memory, and free afterwards - the runtime will have to manage to know the size allocated previously.
For example it can store the count of elements prior to the buffer allocated. The Standard explicitly allows the compiler to pass a different request size to the allocation function (operator new[]) in case of array allocations - this can be used by the compiler to stick the count into, and offset the address returned by the new expression by the size of that counter.
It doesn't!
All that delete does is invoke the destructor of the type, which is "no action" in the case of primitive types. Then it passes the pointer to ::operator delete (or an overloaded version if you like), and that operator returns the memory (a memory manager issue). That is, you can easily write your own memory manager in C++ if you like; the language provides one by default!
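The two steps (destructor first, then operator delete) can be made visible with class-specific allocation functions. A minimal sketch: Traced, g_log and trace_new_delete are hypothetical names, and malloc/free stand in for a real memory manager.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

std::string g_log;   // records the order of events (illustration only)

struct Traced {
    Traced()  { g_log += "ctor "; }
    ~Traced() { g_log += "dtor "; }

    // Class-specific allocation functions, used by "new Traced" and "delete p".
    static void* operator new(std::size_t size) {
        g_log += "new ";
        void* p = std::malloc(size);
        if (p == nullptr) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void* p) {
        g_log += "delete ";
        std::free(p);
    }
};

std::string trace_new_delete() {
    g_log.clear();
    Traced* t = new Traced;   // operator new, then the constructor
    assert(t != nullptr);
    delete t;                 // the destructor, then operator delete
    return g_log;
}
```

The returned log reads "new ctor dtor delete ", matching the order described above.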
The compiler knows the type of the object being deleted and writes different code for you to achieve the right results:
delete p can call the run time delete with the size of an int.
delete pObj can call pObj->~sample() first, then delete with the size of sample
I think with arrays, there is a hidden value for the size of the array, so it could be that the whole array is deleted in one go.
Then how does C++ compiler know that object following delete is built-in data type or a class object?
Because at compile time the compiler tracks the type of each object and plants the appropriate code.
My other question is that if I new a pointer to an array of int's and then I delete [] then how does compiler know the size of memory block to de-allocate?
It does not. This is kept track of by the runtime system.
When you dynamically allocate an array, the runtime library associates the size of the object with the object, so when it deletes the array it knows the size (by looking up the associated value).
But I guess you want to know how it does the association?
This depends on the system and is an implementation detail. But a simple strategy is to allocate an extra 4 bytes, store the size in the first four bytes, and then return a pointer to the byte just past them. When you delete a pointer, you know that the size is in the 4 bytes before the pointer. Note: I am not saying your system is using this technique, but it is one strategy.
For the first (non-array) part of the question, the answers above indicating that the compiler inserts code to de-allocate the appropriate number of bytes based on the pointer type, don't quite provide a clear answer for me... the delete operator 1) calls a destructor if applicable and then 2) calls the "operator delete()" function... it is operator delete which actually de-allocates. I can see compiler-generated code playing a role in part (1), ie. the destination address of the destructor must be inserted. But in part (2), it is a pre-existing library function handling the de-allocation, so how will it know the size of the data? The global operator delete--which, I believe is used in all cases unless a class-member/overloaded-global version is defined by the programmer--accepts only a void* argument spec'ing the start of the data, so it can't even be passed the data size.
I've read things indicating the compiler-generated code idea, as well as things suggesting that the global operator delete for non-arrays simply uses free(), ie. it knows the data size not by the pointer type, but by looking a few bytes before the data itself, where the size will have been stashed by new/malloc. The latter is the only solution that makes sense to me, but maybe someone can enlighten me differently...

Is there any danger in calling free() or delete instead of delete[]? [duplicate]

Does delete deallocate the elements beyond the first in an array?
char *s = new char[n];
delete s;
Does it matter in the above case seeing as all the elements of s are allocated contiguously, and it shouldn't be possible to delete only a portion of the array?
For more complex types, would delete call the destructor of objects beyond the first one?
Object *p = new Object[n];
delete p;
How can delete[] deduce the number of Objects beyond the first, wouldn't this mean it must know the size of the allocated memory region? What if the memory region was allocated with some overhang for performance reasons? For example one could assume that not all allocators would provide a granularity of a single byte. Then any particular allocation could exceed the required size for each element by a whole element or more.
For primitive types, such as char, int, is there any difference between:
int *p = new int[n];
delete p;
delete[] p;
free(p);
Except for the routes taken by the respective calls through the delete->free deallocation machinery?
It's undefined behaviour (most likely will corrupt heap or crash the program immediately) and you should never do it. Only free memory with a primitive corresponding to the one used to allocate that memory.
Violating this rule may lead to proper functioning by coincidence, but the program can break once anything is changed - the compiler, the runtime, the compiler settings. You should never rely on or expect such proper functioning.
delete[] uses compiler-specific service data for determining the number of elements. Usually a bigger block is allocated when new[] is called, the number is stored at the beginning and the caller is given the address behind the stored number. Anyway delete[] relies on the block being allocated by new[], not anything else. If you pair anything except new[] with delete[] or vice versa you run into undefined behaviour.
Read the FAQ: 16.3 Can I free() pointers allocated with new? Can I delete pointers allocated with malloc()?
Does it matter in the above case seeing as all the elements of s are allocated contiguously, and it shouldn't be possible to delete only a portion of the array?
Yes it does.
How can delete[] deduce the number of Objects beyond the first, wouldn't this mean it must know the size of the allocated memory region?
The compiler needs to know. See FAQ 16.11
Because the compiler stores that information.
What I mean is the compiler needs different deletes to generate appropriate book-keeping code. I hope this is clear now.
Yes, this is dangerous!
Don't do it!
It will lead to program crashes or even worse behavior!
For objects allocated with new you MUST use delete;
For objects allocated with new [] you MUST use delete [];
For objects allocated with malloc() or calloc() you MUST use free();
Be aware also that in all these cases it is illegal to delete/free an already deleted/freed pointer a second time. Calling free(), delete or delete[] with a null pointer is legal and does nothing.
Yes, there's a real practical danger. Even implementation details aside, remember that operator new/operator delete and operator new[]/operator delete[] functions can be replaced completely independently. For this reason, it is wise to think of new/delete, new[]/delete[], malloc/free etc. as different, completely independent methods of memory allocation, which have absolutely nothing in common.
Raymond Chen (Microsoft developer) has an in-depth article covering scalar vs. vector deletes, and gives some background to the differences. See:
http://blogs.msdn.com/oldnewthing/archive/2004/02/03/66660.aspx
Does delete deallocate the elements beyond the first in an array?
There is no guarantee either way. Using delete on memory obtained from new[] is undefined behaviour, so what actually gets deallocated depends on the implementation. It may work in some cases, but that is coincidental.
Does it matter in the above case seeing as all the elements of s are allocated contiguously, and it shouldn't be possible to delete only a portion of the array?
It depends on how the memory is marked as free. Again, this is implementation dependent.
For more complex types, would delete call the destructor of objects beyond the first one?
No. Try this:
#include <cstdio>
class DelTest {
    static int next;
    int i;
public:
    DelTest() : i(next++) { printf("Allocated %d\n", i); }
    ~DelTest(){ printf("Deleted %d\n", i); }
};
int DelTest::next = 0;
int main(){
    DelTest *p = new DelTest[5];
    delete p; // undefined behaviour: should be delete[] p; the output (or a crash) depends on the implementation
    return 0;
}
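For contrast, a sketch of the correctly paired version, which is well defined and can be checked by counting destructor calls. Counted and the function name are hypothetical.

```cpp
#include <cassert>

// Counts destructor calls so the effect of delete[] can be observed.
struct Counted {
    static int destroyed;
    ~Counted() { ++destroyed; }
};
int Counted::destroyed = 0;

// Allocate n objects with new[] and release them with the matching delete[].
int destructors_run_by_array_delete(int n) {
    assert(n >= 0);
    Counted::destroyed = 0;
    Counted* p = new Counted[n];
    delete[] p;                 // runs the destructor of every element
    return Counted::destroyed;
}
```

With delete[] the destructor runs once per element, so the count equals n.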
How can delete[] deduce the number of Objects beyond the first, wouldn't this mean it must know the size of the allocated memory region?
Yes, the size is stored some place. Where it is stored depends on the implementation. For example, the allocator could store the size in a header preceding the allocated address.
What if the memory region was allocated with some overhang for performance reasons? For example one could assume that not all allocators would provide a granularity of a single byte. Then any particular allocation could exceed the required size for each element by a whole element or more.
It is for this reason that the returned address is made to align to word boundaries. The "overhang" can be seen using the sizeof operator and applies to objects on the stack as well.
For primitive types, such as char, int, is there any difference between ...?
Yes. malloc and new could be using separate blocks of memory. Even if this were not the case, it's a good practice not to assume they are the same.
It's undefined behavior. Hence, the answer is: yes, there could be danger. And it's impossible to predict exactly what will trigger problems. Even if it works one time, will it work again? Does it depend on the type? Element count?
For primitive types, such as char, int, is there any difference between:
I'd say you'll get undefined behaviour. So you shouldn't count on stable behaviour. You should always use new/delete, new[]/delete[] and malloc/free pairs.
Although it might seem logical that you can mix new[] with free, or use delete instead of delete[], this rests on the assumption that the compiler is fairly simplistic, i.e., that it will always use malloc() to implement the memory allocation for new[].
The problem is that if your compiler has a smart enough optimizer it might see that there is no "delete[]" corresponding to the new[] for the object you created. It might therefore assume that it can fetch the memory for it from anywhere, including the stack in order to save the cost of calling the real malloc() for the new[]. Then when you try to call free() or the wrong kind of delete on it, it is likely to malfunction hard.
Step 1 read this: what-is-the-difference-between-new-delete-and-malloc-free
You are only looking at what you see on the developer side.
What you are not considering is how the std lib does memory management.
The first difference is that new and malloc allocate memory from two different areas (new from the free store and malloc from the heap; don't focus on the names, they are both basically heaps, those are just their official names from the standard). If you allocate from one and de-allocate to the other you will mess up the data structures used to manage the memory (there is no guarantee they will use the same structure for memory management).
When you allocate a block like this:
int* x= new int; // 0x32
Memory may look like this (it probably won't, since I made this up without thinking that hard):
Memory   Value        Comment
0x08     0x40         // Chunk Size
0x16     0x10000008   // Free list for Chunk size 40
0x24     0x08         // Block Size
0x32     ??           // Address returned by New.
0x40     0x08         // Pointer back to head block.
0x48     0x32         // Link to next item in a chain of something.
The point is that there is a lot more information in the allocated block than just the int you allocated to handle memory management.
The standard does not specify how this is done because (in C/C++ style) they did not want to impinge on the compiler/library manufacturers' ability to implement the most efficient memory management method for their architecture.
Taking this into account, you want the manufacturer to be able to distinguish array allocation/deallocation from normal allocation/deallocation, so that it is possible to make both as efficient as possible independently. As a result you cannot mix and match, as internally they may use different data structures.
If you actually analyse the memory allocation differences between C and C++ applications you find that they are very different. And thus it is not unreasonable to use completely different techniques of memory management to optimise for the application type. This is another reason to prefer new over malloc() in C++, as it will probably be more efficient (the more important reason, though, will always be reducing complexity (IMO)).

Performance on strings initialization in C++

I have following questions regarding strings in C++:
1>> which is a better option(considering performance) and why?
1.
string a;
a = "hello!";
OR
2.
string *a;
a = new string("hello!");
...
delete(a);
2>>
string a;
a = "less";
a = "moreeeeeee";
how exactly memory management is handled in c++ when a bigger string is copied into a smaller string? Are c++ strings mutable?
It is almost never necessary or desirable to say
string * s = new string("hello");
After all, you would (almost) never say:
int * i = new int(42);
You should instead say
string s( "hello" );
or
string s = "hello";
And yes, C++ strings are mutable.
All the following is what a naive compiler would do. Of course as long as it doesn't change the behavior of the program, the compiler is free to make any optimization.
string a;
a = "hello!";
First you initialize a to contain the empty string. (set length to 0, and one or two other operations). Then you assign a new value, overwriting the length value that was already set. It may also have to perform a check to see how big the current buffer is, and whether or not more memory should be allocated.
string *a;
a = new string("hello!");
...
delete(a);
Calling new requires the OS and the memory allocator to find a free chunk of memory. That's slow. Then you initialize it immediately, so you don't assign anything twice or require the buffer to be resized, like you do in the first version.
Then something bad happens, and you forget to call delete, and you have a memory leak, in addition to a string that is extremely slow to allocate. So this is bad.
string a;
a = "less";
a = "moreeeeeee";
Like in the first case, you first initialize a to contain the empty string. Then you assign a new string, and then another. Each of these may require a call to new to allocate more memory. Each line also requires length, and possibly other internal variables to be assigned.
Normally, you'd allocate it like this:
string a = "hello";
One line, perform initialization once, rather than first default-initializing, and then assigning the value you want.
It also minimizes errors, because you don't have a nonsensical empty string anywhere in your program. If the string exists, it contains the value you want.
About memory management, google RAII.
In short, string calls new/delete internally to resize its buffer. That means you never need to allocate a string with new. The string object has a fixed size, and is designed to be allocated on the stack, so that the destructor is automatically called when it goes out of scope. The destructor then guarantees that any allocated memory is freed. That way, you don't have to use new/delete in your user code, which means you won't leak memory.
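A small sketch of the RAII idea described above. Tracker is a hypothetical helper (not part of any library) that just counts live instances, so you can see that the destructor runs at the end of the scope without any delete in user code.

```cpp
#include <string>

// Hypothetical helper: counts how many instances are currently alive.
struct Tracker {
    static int alive;
    std::string name;  // owns its own buffer; freed by std::string's destructor

    explicit Tracker(const std::string& n) : name(n) { ++alive; }
    ~Tracker() { --alive; }

    // Copying is disabled only to keep the counting simple.
    Tracker(const Tracker&) = delete;
    Tracker& operator=(const Tracker&) = delete;
};
int Tracker::alive = 0;

// While the object is in scope, exactly one instance is alive.
int alive_inside_scope() {
    Tracker t("hello");
    return Tracker::alive;
}

// After the inner scope ends, the destructor has already run.
int alive_after_scope() {
    {
        Tracker t("hello");
    }  // t is destroyed here, no delete needed
    return Tracker::alive;
}
```

std::string behaves the same way: the string object lives on the stack, and its destructor releases whatever buffer it allocated internally.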
Is there a specific reason why you constantly use assignment instead of initialization? That is, why don't you write
string a = "Hello";
etc.? This avoids a default construction and just makes more sense semantically. Creating a pointer to a string just for the sake of allocating it on the heap is never meaningful, i.e. your case 2 doesn't make sense and is slightly less efficient.
As to your last question, yes, strings in C++ are mutable unless declared const.
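For illustration, a short sketch of what "mutable" means in practice (the function names here are made up): the same string object can be reassigned to a longer value or edited in place, and it grows its buffer as needed.

```cpp
#include <string>

// Reassigning a longer value: the string resizes its own buffer if it has to.
std::string assign_bigger_value() {
    std::string a;
    a = "less";
    a = "moreeeeeee";
    return a;
}

// Editing in place through operator[] and operator+=.
std::string modify_in_place() {
    std::string a = "less";
    a[0] = 'm';   // "mess"
    a += "y";     // "messy"
    return a;
}

// A const string allows reads only; uncommenting the write would not compile.
char first_char_of_const() {
    const std::string c = "less";
    // c[0] = 'm';  // error: c is const
    return c[0];
}
```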
string a;
a = "hello!";
Two operations: this calls the default constructor std::string() and then the assignment operator operator=.
string *a; a = new string("hello!"); ... delete(a);
Only one operation: this calls the constructor std::string(const char*), but you must not forget to delete the pointer.
What about
string a("hello");
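If you want to see the "two operations versus one" difference for yourself, here is a rough sketch using a hypothetical Counted class that mimics the relevant part of the string interface. It only counts calls; it says nothing about how expensive each call is in a real std::string, and an optimizing compiler may narrow the gap.

```cpp
// Hypothetical stand-in for a string-like class; counts constructor and
// assignment calls so the two styles can be compared.
struct Counted {
    static int operations;  // total constructor + assignment calls so far
    const char* text;

    Counted() : text("") { ++operations; }
    explicit Counted(const char* s) : text(s) { ++operations; }
    Counted& operator=(const char* s) { text = s; ++operations; return *this; }
};
int Counted::operations = 0;

// Default-construct, then assign: two calls.
int operations_default_then_assign() {
    Counted::operations = 0;
    Counted a;       // default constructor
    a = "hello!";    // assignment operator
    return Counted::operations;
}

// Direct initialization: one call.
int operations_direct_init() {
    Counted::operations = 0;
    Counted a("hello!");  // converting constructor only
    (void)a;              // silence unused-variable warnings
    return Counted::operations;
}
```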
In case 1.1, your string's members (which include a pointer to the data) are held on the stack, and the memory occupied by the class instance is freed when a goes out of scope.
In case 1.2, the memory for the members is allocated dynamically from the heap as well.
When you assign a char* constant to a string, the memory that holds the data is reallocated if needed to fit the new data.
You may see how much memory is allocated by calling string::capacity().
When you call string a("hello"), memory gets allocated in the constructor.
Both the constructor and the assignment operator call the same methods internally to allocate memory and copy the new data there.
If you look at the docs for the STL string class (I believe the SGI docs are compliant with the spec), many of the methods list complexity guarantees. I believe many of the complexity guarantees are intentionally left vague to allow different implementations. I think some implementations actually use a copy-on-modify (copy-on-write) approach such that assigning one string to another is a constant-time operation, but you may incur an unexpected cost when you try to modify one of those instances. Not sure if that's still true in modern STL though.
You should also check out the capacity() function, which will tell you the maximum length string you can put into a given string instance before it will be forced to reallocate memory. You can also use reserve() to cause a reallocation to a specific amount if you know you're going to be storing a large string in the variable at a later time.
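A minimal sketch of capacity() and reserve() in use. The helper name is made up, and the exact capacity you get back is implementation-dependent; the only thing assumed here is that reserve(n) leaves room for at least n characters.

```cpp
#include <cstddef>
#include <string>

// Reserves room for `count` characters up front, appends that many, and
// reports whether the capacity stayed the same (i.e. no regrowth was needed).
bool reserve_avoids_regrowth(std::size_t count) {
    std::string s;
    s.reserve(count);                       // capacity() is now at least count
    const std::size_t before = s.capacity();

    for (std::size_t i = 0; i < count; ++i) {
        s.push_back('x');                   // stays within the reserved space
    }

    return before >= count && s.capacity() == before;
}
```

Without the reserve() call the string would typically grow its buffer several times during the loop, which is the cost you are avoiding.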
As others have said, as far as your examples go, you should really favor initialization over other approaches to avoid the creation of temporary objects.
Most likely
string a("hello!");
is faster than anything else.
You're coming from Java, right? In C++, objects are treated the same (in most ways) as the basic value types. Objects can live on the stack or in static storage, and be passed by value. When you declare a string in a function, that allocates on the stack however many bytes the string object takes. The string object itself does use dynamic memory to store the actual characters, but that's transparent to you. The other thing to remember is that when the function exits and the string you declared is no longer in scope, all of the memory it used is freed. No need for garbage collection (RAII is your best friend).
In your example:
string a;
a = "less";
a = "moreeeeeee";
This puts a block of memory on the stack and names it a, then the constructor is called and a is initialized to an empty string. The compiler stores the bytes for "less" and "moreeeeeee" in (I think) the .rdata section of your exe. String a will have a few fields, like a length field and a char* (I'm simplifying greatly). When you assign "less" to a, the operator=() method is called. It dynamically allocates memory to store the input value, then copies it in. When you later assign "moreeeeeee" to a, the operator=() method is again called and it reallocates enough memory to hold the new value if necessary, then copies it in to the internal buffer.
When string a's scope exits, the string destructor is called and the memory that was dynamically allocated to hold the actual characters is freed. Then the stack pointer is adjusted and the memory that held a is no longer "on" the stack.
Creating a string directly on the heap is usually not a good idea, just as with built-in types. It's not worth it, since the object can easily stay on the stack and it has all the copy constructors and assignment operators needed for efficient copying.
The std::string itself has a buffer on the heap that may be shared by several strings, depending on the implementation.
For instance, with Microsoft's STL implementation you could do this:
string a = "Hello!";
string b = a;
And both strings would share the same buffer until you changed one of them:
a = "Something else!";
That's why it was very bad to store the result of c_str() for later use; the pointer returned by c_str() is only guaranteed to be valid until the next non-const call on that string object.
This led to very nasty concurrency bugs, which required the sharing functionality to be turned off with a define if you used such strings in a multithreaded application.
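Whether or not a given implementation shares buffers behind the scenes, the observable values of the two strings stay independent. A small sketch (function names are made up) that also shows the safer alternative to holding on to a c_str() pointer:

```cpp
#include <string>

// The copy keeps its value no matter what happens to the original afterwards.
std::string copy_after_original_changes() {
    std::string a = "Hello!";
    std::string b = a;        // may or may not share a buffer internally
    a = "Something else!";    // must not change what b contains
    return b;
}

// Instead of storing the const char* from c_str(), copy the characters
// into another std::string right away.
std::string keep_copy_instead_of_pointer() {
    std::string a = "Hello!";
    std::string saved = a.c_str();  // characters are copied here
    a = "Something else!";          // a pointer saved from c_str() could be stale now
    return saved;
}
```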