Strings in memory - c++

struct Example
{
char* string;
int x;
};
When I allocate a new instance of Example 8 bytes are allocated (assuming that sizeof(char*)=4). So when I call this:
Example* sp = new Example();
sp->string = "some text";
How is the string allocated? Is it placed in a random empty memory location, or is there some kind of relation between sp and the member string?
So, does "some text" cause a dynamic memory allocation?

String literals like that are put wherever the compiler wants to put them, they have a static storage duration (they last for the life of the entire program), and they never move around in memory.
The compiler usually stores them in the executable file itself in a read-only portion of memory, so when you do something = "some text"; it just makes something point to that location in memory.

The string is in the executable when you compile it.
sp->string = "some text";
This line is just setting the pointer in the struct to that string. (note: you have a double typo there, it's sp, and it's a pointer so you need ->)

In this case, the constant string value should be put into the program's data area, and the pointer in your struct will point to that area rather unambiguously as long as it has the value. In your words, it is placed into a random area of memory (in that it is unrelated to where your struct instance goes).

This way you create a constant string. It lives in the program's static data (neither the heap nor the stack). You do not need to manage it (allocate or free its memory), and it cannot be freed dynamically.

Related

C++ std::string* s; Memory reclaimed?

Given a function foo with a statement in it:
void foo() {
std::string * s;
}
Is memory reclaimed after this function returns?
I am assuming yes because this pointer isn't pointing to anything but some people are saying no - that it is a dangling pointer.
std::string* s is just an uninitialized pointer to a string. The pointer will be destroyed when function foo returns (because the pointer itself is a local variable allocated on the stack). No std::string was ever created, hence you won't have any memory leak.
If you say
void foo() {
std::string * s = new std::string;
}
Then you will have a memory leak.
This code is typical when people learn about strings a-la C, and then start using C++ through C idioms.
C++ classes (in particular standard library classes) treat objects as values, and manage the memory they need themselves.
std::string, in this sense, is no different from an int. If you need a "text container", just declare a std::string (not a std::string*) and initialize it accordingly (a default-constructed std::string is empty by definition), then use it in expressions with its methods, operators and related functions, as you would with other simple types.
std::string* itself is a symptom of a badly designed environment.
Explicit dynamic memory in C++ is typically used in two situations:
You don't know at compile time the size of an object (typical with unknown size arrays, like C strings are)
You don't know at compile time the runtime-type of an object (since its class will be decided on execution, based on some other input)
Now, std::string handles the first point itself, and does not support the second (it has no virtual methods), so allocating it dynamically adds no value: it only adds the complication of managing the memory for the string object yourself, when that object is itself a manager of other memory that holds its actual text.
This code just creates a pointer. It does not allocate a new string, and because the pointer is uninitialized it does not point to any valid string either.
Only the memory for the pointer value itself is allocated (on the stack), and after the function returns it is no longer valid.

Is string.c_str() deallocation necessary?

My code converts C++ strings to C strings somewhat often, and I am wondering if the original string is allocated on the stack. Will the C string be allocated on the stack as well? For instance:
string s = "Hello, World!";
char* s2 = s.c_str();
Will s2 be allocated on the stack, or in the heap? In other words, will I need to delete s2?
Conversely, if I have this code:
string s = new string("Hello, mr. heap...");
char* s2 = s.c_str();
Will s2 now be on the heap, as its origin was on the heap?
To clarify, when I ask if s2 is on the heap, I know that the pointer is on the stack. I'm asking if what it points to will be on the heap or the stack.
string s = "Hello world";
char* s2 = s.c_str();
Will s2 be allocated on the stack, or in the heap? In other words... Will I need to delete s2?
No, don't delete s2!
s2 is on the stack if the above code is inside a function; if the code's at global or namespace scope then s2 will be in some statically-allocated dynamically-initialised data segment. Either way, it is a pointer to a character (which in this case happens to be the first 'H' character in the null-terminated string representation of the text content of s). That text itself is wherever the s object felt like constructing that representation. Implementations are allowed to do that however they like, but the crucial implementation choice for std::string is whether it provides a "short-string optimisation" that allows very short strings to be embedded directly in the s object and whether "Hello world" is short enough to benefit from that optimisation:
if so, then s2 would point to memory inside s, which will be stack- or statically-allocated as explained for s2 above
otherwise, inside s there would be a pointer to dynamically allocated (free-store / heap) memory wherein the "Hello world\0" content whose address is returned by .c_str() would appear, and s2 would be a copy of that pointer value.
Note that c_str() is const, so for your code to compile you need to change to const char* s2 = ....
You must not delete s2. The data to which s2 points is still owned and managed by the s object, and will be invalidated by any call to non-const methods of s or by s going out of scope.
string s = new string("Hello, mr. heap...");
char* s2 = s.c_str();
Will s2 now be on the heap, as its origin was on the heap?
This code doesn't compile, as s is not a pointer and a string doesn't have a constructor like string(std::string*). You could change it to either:
string* s = new string("Hello, mr. heap...");
...or...
string s = *new string("Hello, mr. heap...");
The latter creates a memory leak and serves no useful purpose, so let's assume the former. Then:
char* s2 = s.c_str();
...needs to become...
const char* s2 = s->c_str();
Will s2 now be on the heap, as its origin was on the heap?
Yes. In all the scenarios, specifically if s itself is on the heap, then:
even if there's a short string optimisation buffer inside s to which c_str() yields a pointer, it must be on the heap, otherwise
if s uses a pointer to further memory to store the text, that memory will also be allocated from the heap.
But again, even knowing for sure that s2 points to heap-allocated memory, your code does not need to deallocate that memory - it will be done automatically when s is deleted:
string* s = new string("Hello, mr. heap...");
const char* s2 = s->c_str();
// <...use s2 for something...>
delete s; // "destruct" s and deallocate the heap used for it...
Of course, it's usually better just to use string s("xyz"); unless you need a lifetime beyond the local scope, and a std::unique_ptr<std::string> or std::shared_ptr<std::string> otherwise.
c_str() returns a pointer to an internal buffer in the string object. You don't ever free()/delete it.
It is only valid as long as the string it points into is in scope. In addition, if you call a non-const method of the string object, it is no longer guaranteed to be valid.
See std::string::c_str
std::string::c_str() returns a const char*, not a char *. That's a pretty good indication that you don't need to free it. Memory is managed by the instance (see some details in this link, for example), so it's only valid while the string instance is valid.
Firstly, even your original string is not allocated on the stack, as you seem to believe. At least not entirely. If your string s is declared as a local variable, only the string object itself is "allocated on the stack". The controlled sequence of that string object is allocated somewhere else. You are not supposed to know where it is allocated, but in most cases it is allocated on the heap. I.e. the actual string "Hello world" stored by s in your first example is generally allocated on the heap, regardless of where you declare your s.
Secondly, about c_str().
In the original specification of C++ (C++98) c_str was allowed to return a pointer to an independent buffer allocated somewhere. Again, you are not supposed to know where it is allocated, but in the general case it was expected to be on the heap. In practice, most implementations of std::string made sure that their controlled sequence was always zero-terminated, so their c_str returned a direct pointer to the controlled sequence.
In the new specification of C++ (C++11) it is now required that c_str returns a direct pointer to the controlled sequence.
In other words, in the general case the result of c_str will point to heap-allocated memory even for local std::string objects. Your first example is no different from your second example in that regard. However, in any case the memory pointed to by c_str() is not owned by you. You are not supposed to deallocate it. You are not supposed to even know where it is allocated.
s2 will be valid as long as s remains in scope. It's a pointer to memory that s owns. See e.g. this MSDN documentation: "the string has a limited lifetime and is owned by the class string."
If you want to use std::string inside a function as a factory for string manipulation, and then return C-style strings, you must allocate heap storage for the return value. Get space using malloc or new, and then copy the contents of s.c_str().
Will s2 be allocated on the stack, or in the heap?
Could be in either. For example, if the std::string class does small string optimization, the data will reside on the stack if its size is below the SSO threshold, and on the heap otherwise. (And this is all assuming the std::string object itself is on the stack.)
Will I need to delete s2?
No, the character array object returned by c_str is owned by the string object.
Will s2 now be on the heap, as its origin was on the heap?
In this case the data will likely reside in the heap anyway, even when doing SSO. But there's rarely a reason to dynamically allocate a std::string object.
That depends. If I remember correctly, CString makes a copy of the input string, so no, you wouldn't need to have any special heap allocation routines.

Pointers to objects in C++ - what's on the stack/heap?

I started out with Java, so I am a bit confused on what's going on with the stack/heap on the following line:
string *x = new string("Hello");
where x is a local variable. In C++, does ANYTHING happen on the stack at all in regards to that statement? I know from reading it says that the object is on the heap, but what about x? In Java, x would be on the stack just holding the memory address that points to the object, but I haven't found a clear source that says what's happening in C++.
Any local variable you declare, e.g. x in your example, is on the stack. The object x is just a pointer, though, which points to a heap-allocated string that you created using new string("Hello"). Typically, you wouldn't create a string like this in C++, however. Instead you would use
string x("Hello");
This would still allocate x on the stack. Whether the characters representing x's value also live on the stack or rather on the heap depends on the string implementation. As a reasonable model you should assume that they are on the heap (some std::string implementations put short strings into the stack object, avoiding any heap allocation and helping with locality).
Yes, x is on the stack: it is a local variable, and local variables live on the stack.
The new operator provokes the allocation of memory on the heap.
It's the same as in Java. The string or String is on the heap, and the pointer (or reference, in Java) is on the stack.
In Java, all the Objects are on the heap, and the stack is only made up of primitive types and the references themselves. (The stack has other stuff like return addresses and so on, but never mind that).
The main difference between the C++ stack and the Java stack is that, in C++, you can put the entire object directly onto the stack. e.g. string x = string("Hello");
It's also possible, in C++, to put primitive types directly onto the heap. e.g. int * x = new int();. (In other words, "if autoboxing is the solution, then what was the problem?")
In short, Java has rigid distinctions between primitive types and Objects, and the primitives are very much second-class. C++ is much more relaxed.
It depends on where this line is located. If it is located somewhere at file scope (i.e. outside of a function), then x is a global variable which definitely is not on the stack (well, in theory it could be put on the stack before calling main(), but I strongly doubt any compiler does that), but also not on the heap. If, however, the line is part of a function, then indeed x is on the stack.
Since x is of type string* (i.e. pointer to string), it indeed just contains the address of the string object allocated with new. Since that string object is allocated with new, it indeed lives on the heap.
Note however that, unlike in Java, there's no need that built-in types live on the stack and class objects live on the heap. The following is an example of a pointer living on the heap, pointing to an object living in the stack:
#include <iostream>
#include <string>

int main()
{
    std::string str("Hello"); // a string object on the stack
    std::string** ptr = new std::string*(&str); // a pointer to string living on the heap, pointing to str
    (**ptr) += " world"; // this adds " world" to the string on the stack
    delete ptr; // get rid of the pointer on the heap
    std::cout << str << std::endl; // prints "Hello world"
} // the compiler automatically destroys str here

Heap or Stack? When a constant string is referred in function call in C++

Consider the function:
char *func()
{
return "Some thing";
}
Is the constant string (char array) "Some thing" stored in the stack as local to the function call or as global in the heap?
I'm guessing it's in the heap.
If the function is called multiple times, how many copies of "Some thing" are in the memory? (And is it the heap or stack?)
The string literal "Some thing" has type const char[11] (an array of const char, which decays to a pointer to its first element when returned). It is neither on the heap nor on the stack but in a read-only location, which is an implementation detail.
From Wikipedia
Data
The data area contains global and static variables used by the program
that are initialized. This segment can be further classified into
initialized read-only area and initialized read-write area. For
instance the string defined by char s[] = "hello world" in C and a C
statement like int debug=1 outside the "main" would be stored in
initialized read-write area. And a C statement like const char* string
= "hello world" makes the string literal "hello world" to be stored in
initialized read-only area and the character pointer variable string
in initialized read-write area. Ex: static int i = 10 will be stored
in data segment and global int i = 10 will be stored in data segment
Constant strings are usually placed with program code, which is neither heap nor stack (this is an implementation detail). Typically only one copy exists, and in practice each call returns the same pointer value, although the C++ standard leaves unspecified whether repeated evaluations of a string literal yield the same object. Since the string is in program memory, it is possible that it will never be loaded into memory, and if you run two copies of the program then they will share the same copy in RAM (this only works for read-only strings, which includes string constants in C).
Neither, it's in the static section of the program. Similar to having the string as a global variable. There is only ever one copy of the string within the translation unit.
Neither on the heap, nor on stack, it is part of the so-called init section in the executable image (COFF). This is loaded into memory and contains stuff like strings.

When does the memory pointed to by a (char *) function parameter get deleted?

code 1 :
void foo(char * text) {}
foo("Test");
As far as I understand, this will happen:
memory is allocated for "Test"
a pointer is created and its value is copied to (char * text), so (char * text) points to the place in memory where "Test" is (or rather, to the first char of "Test")
after the function is done, it destroys the pointer (char * text) that points to the beginning of "Test". Doesn't this create a memory leak?
And the question is: when does "Test" get deleted, if the function destroys only the pointer?
Isn't it better to do something like this?
char * _text = "Test";
foo(_text);
delete[] _text;
You can think of string literals as being part of the code. They aren't dynamically allocated, they have so-called "static storage duration", which means they exist for the duration of the program, and they don't need to be freed (indeed, must not be freed).
It is always wrong to delete[] something that wasn't created with new[], so your second code snippet has undefined behavior.
"Test" is a string literal, which has static storage duration. It will not be deleted for as long as the program runs. And you should not delete it yourself.
Actually it doesn't get deleted.
That string is allocated in data segment and the call to foo("Test") just pushes on the stack the pointer to that string, without "copying" it as you are saying.
It is not leaked memory because the string is part of the final binary file and it will always be there, in a section of the binary that is just for that kind of things (constants and so on).
The string itself (the bytes for "Test") is placed in the data segment in a read-only section, while the pointer (e.g. char *_test = "Test") is stored in read-write memory (stack, heap or static data, depending on how the pointer is declared). You are allowed to modify the pointer, but that won't delete the string from the data segment.
Hard coded values in C are actually compiled into the binary and as such are not allocated. More correctly they appear in the "data" section of the executable and live as long as the program does.
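Also, pointers are not "destroyed" in any way that affects what they point to. Remember that a pointer is just the address of memory that may be anywhere (stack, heap or static data); when the pointer itself goes away, the memory it pointed to is left untouched.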