memory allocation for array of strings - c++

As far as I know arrays are allocated statically but strings are dynamic as their length frequently changes during runtime.
What happens when I define an array like this:
std::string array[] = {"abc", "defghi", "jk", "lmnop", "qrstuvwxyz"};?
Is there a limited amount of memory allocated for every string? Or is the array allocated dynamically?

Don't overthink things. When you say T x[N];, you declare an automatic (i.e. scope-local) array. This is very similar to just declaring T x1;, T x2;, ..., T xN;. An instance of std::string always occupies the same, small size in its declaration context (e.g. on the stack); it is only the memory which it manages (by default on the free store) that is dynamic.
Note that when you write std::string s("Hello"); in your code, then the string literal (which is passed to the constructor) is of course stored in your program binary somewhere, and it gets loaded into the program memory (usually a read-only data segment). So if you really just need to read those strings (as opposed to, say, modify them), then you might as well just declare an array of char pointers and save some memory:
const char * strings[] = { "Hello", "World", "!" }; // just one copy in r/o memory
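To make the "same, small size" point concrete, here is a minimal sketch. The helper names element_count and array_footprint_is_fixed are made up for this illustration. It checks that the array from the question occupies exactly N * sizeof(std::string) bytes, however long the individual strings are.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of elements in a built-in array.
template <class T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N]) {
    return N;
}

// Returns true if the array from the question occupies exactly
// N * sizeof(std::string) bytes, regardless of the string lengths.
bool array_footprint_is_fixed() {
    std::string array[] = {"abc", "defghi", "jk", "lmnop", "qrstuvwxyz"};
    return sizeof(array) == element_count(array) * sizeof(std::string);
}
```

Calling array_footprint_is_fixed() returns true: the characters themselves are managed separately by each string and do not count towards sizeof(array).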

The literal strings are stored in a data segment of the binary (typically a read-only one).
Edit:
Good point in the comment by ildjarn. The std::string elements of the array are constructed from the literals, and each one copies its characters. Where those copies are stored is implementation-dependent. Some implementations store short strings (commonly up to 15 or 22 characters, depending on the library) inline in the string object itself. Longer strings are allocated through std::allocator, which typically gets its memory from the heap.
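If you are curious what your own library does, a rough sketch like the following can tell you whether a string's characters live inside the string object. The helper stored_inline is a made-up name, and the result for short strings depends entirely on the implementation, so treat it as an experiment rather than a guarantee.

```cpp
#include <functional>
#include <string>

// Hypothetical helper: true if the character buffer of s lies inside the
// std::string object itself, i.e. no separate heap block is in use.
bool stored_inline(const std::string& s) {
    const char* begin = reinterpret_cast<const char*>(&s);
    const char* end = begin + sizeof(std::string);
    std::less<const char*> before;  // gives a total order, even across unrelated objects
    return !before(s.data(), begin) && before(s.data(), end);
}
```

For a string such as std::string(1000, 'x') the helper returns false, because 1000 characters cannot fit inside the object. For something like "abc" the answer varies between standard libraries.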

In this case the compiler places those literals in a data section of your ELF binary (typically the read-only .rodata section). So in this case they are allocated statically in the executable (or library) file that you create.
Such sections are usually reserved for compile-time constants that aren't hardcoded into the machine instructions themselves. As you can see, it's much cheaper to store the string once and refer to it by address than to embed a copy everywhere it is used (as is done with some integer constants and macro-defined values).

Related

How do strings allocate memory in c++?

I know that dynamic memory has advantages over setting up a fixed-size array and using only a portion of it. But with dynamic memory you have to specify the amount of data that you want to store in the array. When using strings you can type as many letters as you want (you can even use strings for numbers and then use a function to convert them). This fact makes me think that dynamic memory for character arrays is obsolete compared to strings.
So I want to know: what are the advantages and disadvantages of using strings? When is the space occupied by strings freed? Is the option to free your dynamically allocated memory with delete maybe an advantage over strings? Please explain.
The short answer is "no, there are no drawbacks, only advantages" with std::string over character arrays.
Of course, std::string does USE dynamic memory; it just hides that fact behind the scenes so you don't have to worry about it.
In answer to your question: When is the space occupied by strings freed? this post may be helpful. Basically, a std::string releases its memory when it goes out of scope. The compiler inserts the destructor call for you, so you never have to call delete yourself.
std::string usually contains an internal dynamically allocated buffer. When you assign data, or if you push back new data, and the current buffer size is not sufficient, a new buffer is allocated with an increased size and the old data is copied or moved to the new buffer. The old buffer is then deallocated.
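The reallocation behaviour described above can be observed through capacity(). The following is only a sketch: count_buffer_growths is a hypothetical helper, and how often the capacity changes depends on the growth policy of your standard library.

```cpp
#include <cstddef>
#include <string>

// Appends n characters one at a time and counts how often capacity()
// changes, i.e. how often the string had to switch to a larger buffer.
std::size_t count_buffer_growths(std::size_t n) {
    std::string s;
    std::size_t growths = 0;
    std::size_t last_capacity = s.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        s.push_back('x');
        if (s.capacity() != last_capacity) {
            ++growths;
            last_capacity = s.capacity();
        }
    }
    return growths;
}
```

With common implementations, count_buffer_growths(10000) comes out far smaller than 10000, because the buffer is enlarged by more than one character each time it runs out of room.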
The main buffer is deallocated when the string goes out of scope. If the string object is a local variable in a function (on the stack), its buffer is deallocated at the end of the enclosing code block. If it's a function parameter, when the function exits. If it's a class member, when the object that contains it is destroyed.
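One way to watch this happen is to give the string an allocator that counts its calls. This is a sketch under assumptions: CountingAllocator, CountedString and the global counters are names invented for the example, and the counters are not thread-safe.

```cpp
#include <cstddef>
#include <new>
#include <string>

// Global counters, for illustration only (not thread-safe).
static std::size_t g_allocations = 0;
static std::size_t g_deallocations = 0;

// Minimal C++11 allocator that counts calls; the name is hypothetical.
template <class T>
struct CountingAllocator {
    using value_type = T;
    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}
    T* allocate(std::size_t n) {
        ++g_allocations;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept {
        ++g_deallocations;
        ::operator delete(p);
    }
};
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

using CountedString =
    std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;

// Creates a long string in an inner scope and returns true if every buffer
// that was allocated had been released by the time the scope ended.
bool buffers_released_at_scope_exit() {
    const std::size_t allocs_before = g_allocations;
    const std::size_t frees_before = g_deallocations;
    {
        CountedString s(1000, 'x');  // too long for any inline buffer
        s += "more text";            // may move the data to a larger buffer
    }  // s is destroyed here and releases its buffer
    const std::size_t allocs = g_allocations - allocs_before;
    const std::size_t frees = g_deallocations - frees_before;
    return allocs >= 1 && allocs == frees;
}
```

No delete appears anywhere in buffers_released_at_scope_exit(), yet the allocation and deallocation counts match once the inner block ends.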
The advantage of strings is flexibility (increases in size automatically) and safety (harder to go over the bounds of an array). A fixed-size char array on the stack is faster as no dynamic allocation is required. But you should worry about that if you have a performance problem, and not before.
Well, your question got me thinking, and then I understood that you are talking about syntax differences, because both ways dynamically allocate char arrays. The only difference is in the need:
if you need to create a string containing a sentence, then you can do that without malloc, and that's fine
if you want an array to "play" with, meaning changing or setting the cells according to some method, or changing its size, then creating it with malloc would be the appropriate way
The only reason I see for a statically allocated char a[17] (for example) is a single-purpose string, meaning only when you know the exact size you'll need and it won't change.
And one important point that I found:
In dynamic memory allocation, if memory is continually allocated but the memory of objects that are no longer in use is never released, the result is a memory leak and eventually memory exhaustion, which is a big disadvantage.

C++ string / container allocation

This is probably obvious to a C++ non-noob, but it's stumping me a bit - does a string member of a class allocate a variable amount of space in that class? Or does it just allocate a pointer internally to some other space in memory? E.g. in this example:
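#include <string>
#include <vector>
using namespace std;

class Child {   // defined first so that Parent can hold a vector<Child>
public:
    string Name;
};

class Parent {
public:
    vector<Child> Children;
};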
How is that allocated on the heap if I create a "new Parent()" and add some children with varying length strings? Is Parent 4 bytes, Child 4 bytes (or whatever the pointer size, plus fixed size internal data), and then a random pile of strings somewhere else on the heap? Or is it all bundled together in memory?
I guess in general, are container types always fixed size themselves, and just contain pointers to their variable-sized data, and is that data always on the heap?
Classes in C++ are always fixed size. When there is a variable sized component, e.g., the elements of a vector or the characters in a string, they may be allocated on the heap (for small strings they may also be embedded in the string itself; this is known as the small string optimization). That is, your Parent object would contain a std::vector<Child> where the Child objects are allocated on the heap (the std::vector<...> object itself probably keeps three words to its data but there are several ways things may be laid out). The std::string objects in Child allocate their own memory. That is, there may be quite a few memory allocations.
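The layout described above can be probed with a small experiment. This is only a sketch: lives_inside and children_stored_outside_parent are hypothetical helpers, and Child and Parent are redeclared as structs so the example stands on its own. It checks that the vector object is part of the Parent, while the Child elements are stored somewhere else.

```cpp
#include <functional>
#include <string>
#include <vector>

struct Child {
    std::string Name;
};

struct Parent {
    std::vector<Child> Children;
};

// Hypothetical helper: true if address p falls within the bytes of obj.
template <class T>
bool lives_inside(const void* p, const T& obj) {
    const char* begin = reinterpret_cast<const char*>(&obj);
    const char* end = begin + sizeof(T);
    const char* q = static_cast<const char*>(p);
    std::less<const char*> before;  // well-defined order for unrelated pointers
    return !before(q, begin) && before(q, end);
}

// Builds a Parent with two children and reports whether the vector member
// is part of the Parent while the first Child is stored outside of it.
bool children_stored_outside_parent() {
    Parent parent;
    parent.Children.push_back(Child{"Al"});
    parent.Children.push_back(Child{std::string(500, 'x')});
    return lives_inside(&parent.Children, parent) &&
           !lives_inside(&parent.Children[0], parent);
}
```

sizeof(Parent) and sizeof(Child) stay the same no matter how many children are added or how long their names are.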
The C++ 2011 standard thoroughly defines allocators to support passing an allocation mechanism to an object and all its children. Of course, the classes also need to support this mechanism. If your Parent and Child classes had suitable constructors taking an allocator and passed this allocator on to all members doing allocations, it would be propagated through the system. This way, objects that belong together can be allocated in reasonably close proximity.
Classes in C++ always have a fixed size. Therefore vector and string can only contain pointers to heap-allocated memory* (although they typically contain more data than one pointer, since they also need to store the length). Therefore the object itself always has a fixed size.
*For string this is not entirely correct. Often an optimization technique called short string optimization is used. In that case small strings are embedded inside the object (in the place where otherwise the pointer to heap data would be stored) and heap memory is only allocated if the string is too long.
Yes. Using your words: container types are always fixed size themselves, and just contain pointers to their variable-sized data.
If we have vector<int> vi;, the size of vi is always fixed, sizeof(vector<int>) to be exact, irrespective of the number of int's in vi.
does a string member of a class allocate a variable amount of space in that class?
No, it does not.
Or does it just allocate a pointer internally to some other space in memory?
No, it does not.
A std::string occupies whatever sizeof(std::string) is.
Do not confuse:
the size of an object
the size of the resources that an object is responsible for.

(Why) does an empty string have an address?

I guessed no, but the output of something like this shows that it does:
string s="";
cout<<&s;
What is the point of an empty string having an address?
Do you think it should not cost any memory at all?
Yes, every variable that you keep in memory has an address. As for what the "point" is, there may be several:
Your (literal) string is not actually "empty", it contains a single '\0' character. The std::string object that is created to contain it may allocate its own character buffer for holding this data, so it is not necessarily empty either.
If you are using a language in which strings are mutable (as is the case in C++), then there is no guarantee that an empty string will remain empty.
In an object-oriented language, a string instance with no data associated with it can still be used to call various instance methods on the string class. This requires a valid object instance in memory.
There is a difference between an empty string and a null string. Sometimes the distinction can be important.
And yes, I very much agree with the implementation of the language that an "empty" variable should still exist in and consume memory. In an object-oriented language an instance of an object is more than just the data that it stores, and there's nothing wrong with having an instance of an object that is not currently storing any actual data.
Following your logic, int i; would also not allocate any memory space, since you are not assigning any value to it. But how is it possible then, that this subsequent operation i = 10; works after that?
When you declare a variable, you are actually allocating memory space of a certain size (depending on the variable's type) to store something. Whether you use this space right away or not is up to you, but the declaration of the variable is what triggers memory allocation for it.
Some coding practices say you shouldn't declare a variable until the moment you need to use it.
An 'empty' string object is still an object - there may be more to its internal implementation than just the memory required to store the literal string itself. Besides that, most C-style strings (like the ones used in C++) are null-terminated, meaning even that "empty" string still uses one byte for the terminator.
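The terminator mentioned above is easy to check. A minimal sketch, with empty_string_is_terminated being a name made up for the example:

```cpp
#include <string>

// The literal "" is an array of exactly one char: the terminator.
static_assert(sizeof("") == 1, "an empty literal still holds one char");

// Since C++11, c_str() on an empty std::string also points at a '\0'.
bool empty_string_is_terminated() {
    std::string s = "";
    return s.size() == 0 && s.c_str()[0] == '\0';
}
```

So the string reports a length of zero, but there is still one readable character behind c_str().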
Every named object in C++ has an address. There is even a specific requirement that the size of every type be at least 1, so that consecutive array elements a[N] and a[N+1] have different addresses, and so that in T a, b; both variables have distinct addresses.
In your case, s is a named object of type std::string, so it has an address. The fact that you constructed s from a particular value is immaterial. What matters is that s has been constructed, so it is an object, so it has an address.
s is a string object so it has an address. It has some internal data structures keeping track of the string. For example, current length of the string, current storage reserved for string, etc.
More generally, the C++ standard requires all objects to have a nonzero size. This helps ensure that every object has a unique address.
9 Classes
Complete objects and member subobjects of class type shall have nonzero size.
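The nonzero-size rule can be illustrated in a few lines. This is just a sketch, and the names Empty, empty_objects_have_distinct_addresses and empty_string_is_a_real_object are invented for it:

```cpp
#include <string>

struct Empty {};  // a class with no data members at all

// Even an empty class must occupy at least one byte.
static_assert(sizeof(Empty) >= 1, "complete objects have nonzero size");

// Two distinct objects of the same type never share an address.
bool empty_objects_have_distinct_addresses() {
    Empty a, b;
    return &a != &b;
}

// An empty std::string is still a complete object with an address and a size.
bool empty_string_is_a_real_object() {
    std::string s = "";
    const std::string* address = &s;
    return address != nullptr && s.empty() && sizeof(s) >= 1;
}
```

Both functions return true: holding no characters does not make the object itself disappear.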
In C++, every class has a specific, unchanging size (it varies by compiler and library, but is fixed at compile time). A std::string usually consists of a pointer, an allocated length, and a used length. That is roughly three machine words (about 12 bytes on a 32-bit platform), no matter how long the string is, and here the std::string s itself lives on the call stack. When you print the address of the std::string, cout displays the location of that object in memory.
If the string doesn't point at anything, it won't allocate any space from the heap, which is like what you're thinking. But all C strings end in a trailing null character, so the C string "" is one character long, not zero. This means that when you assign the C string "" to the std::string, the std::string needs room for 1 (or more) bytes to hold the trailing null character '\0'. Depending on the implementation, that room may be a heap allocation or a small buffer inside the object.
If there truly was no point to the empty string, then the programmer would not write the instruction at all. The language is loyal and trusting! And will never assume memory you allocate to be "wasted". Even if you are lost and heading over a cliff, it will hold your hand to the bitter end.
I think it'd be interesting to know, just as a curiosity though, that if you create a variable that isn't 'used' later, such as your empty string, the compiler may very well optimize it away so it incurs no cost to begin with. I guess compilers aren't as trusting...

c++ maximum std::string length is dictated by stack size or heap size?

As asked in the title.
std::string myVar; Is the maximum number of characters it can hold dictated by the stack or the heap?
Thank you
By default, the memory allocated for std::string is allocated dynamically.
Note that std::string has a max_size() function returning the maximum number of characters supported by the implementation. The usefulness of this is questionable, though, as it's an implementation maximum and doesn't take other resources, like memory, into consideration. Your real limit is much lower. (Try allocating 4GB of contiguous memory, or take into account memory exhaustion elsewhere.)
A std::string object will be allocated the same way an int or any other type must be: on the stack if it's a local variable, or it might be static, or on the heap if new std::string is used or new X where X contains the string, etc.
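As a rough illustration of max_size() being a hard upper bound of the implementation rather than a promise, asking a string to reserve more than that throws std::length_error. The function name exceeding_max_size_throws is made up for this sketch.

```cpp
#include <stdexcept>
#include <string>

// Returns true if asking for more than max_size() characters is rejected
// with std::length_error, as reserve() is specified to do.
bool exceeding_max_size_throws() {
    std::string s;
    if (s.max_size() == std::string::npos) {
        return true;  // nothing larger can even be requested
    }
    try {
        s.reserve(s.max_size() + 1);
    } catch (const std::length_error&) {
        return true;
    }
    return false;
}
```

Requests below max_size() can still fail with std::bad_alloc when the memory is simply not available, which is the practical limit mentioned above.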
But, that std::string object may contain at least a pointer to additional memory provided by the allocator with which basic_string<> was instantiated - for the std::string typedef that means heap-allocated memory. Either directly in the original std::string object memory or in pointed-to heap you can expect to find:
a string size member,
possibly some manner of reference counter or links,
the textual data the string stores (if any)
Some std::string implementations have "short string" optimisations where they pack strings of only a few characters directly into the string object itself (for memory efficiency, often using some kind of union with fields that are used for other purposes when the strings are longer). But, for other string implementations, and even for those with short-string optimisations when dealing with strings that are too long to fit directly in the std::string object, they will have to follow pointers/references to the textual data which is stored in the allocator-provided (heap) memory.

Can C++ automatic variables vary in size?

In the following C++ program:
#include <string>
using namespace std;
int main()
{
string s = "small";
s = "bigger";
}
is it more correct to say that the variable s has a fixed size or that the variable s varies in size?
It depends on what you mean by "size".
The static size of s (as returned by sizeof(s)) will be the same.
However, the size occupied on the heap will vary between the two cases.
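A small sketch of that distinction follows. Footprint and footprint_after_assign are hypothetical names, and the capacity values you see depend on your standard library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical record of the two kinds of "size" being discussed.
struct Footprint {
    std::size_t object_bytes;  // sizeof(s): fixed at compile time
    std::size_t capacity;      // characters the current buffer can hold
};

// Starts with "small", assigns text, and reports both sizes afterwards.
Footprint footprint_after_assign(const std::string& text) {
    std::string s = "small";
    s = text;
    return Footprint{sizeof(s), s.capacity()};
}
```

Comparing footprint_after_assign("bigger") with footprint_after_assign(std::string(5000, 'x')) gives the same object_bytes in both cases, while capacity is at least 5000 in the second.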
What do you want to do with the information?
I'll say yes and no.
s will be the same string instance, but its internal buffer (which may be preallocated, depending on your standard library implementation) will contain a copy of the constant string you assigned to it.
Should the constant string (or any other char* or string) be bigger than the internal preallocated buffer of s, the buffer will be reallocated according to the growth algorithm of your standard library implementation.
This is going to lead to a dangerous discussion because the concept of "size" is not well defined in your question.
The size of the object s is known at compile time: it's simply the sum of the sizes of its members plus whatever extra information needs to be kept for the class, such as padding (I'll admit I don't know all the details). The important thing to take away, however, is that sizeof(s) will NOT change between assignments.
HOWEVER, the memory footprint of s can change at runtime through heap allocations. So as you assign the bigger string to s, its memory footprint may increase, because it may need more space allocated on the heap. You should probably try to specify what you want.
The std::string variable never changes its size. It just refers to a different piece of memory with a different size and different data.
Neither, exactly. The variable s is a string object.
#include <string>
using namespace std;
int main()
{
string s = "small"; // s is a new string object initialized with a copy of "small"
s = "bigger"; // s is modified in place by the overloaded operator=
}
Edit, corrected some details and clarified point
See: http://www.cplusplus.com/reference/string/string/ and in particular http://www.cplusplus.com/reference/string/string/operator=/
The assignment results in the original content being dropped and the content of the right side of the operation being copied into the object. This is similar to doing s.assign("bigger"), but assign has a broader range of acceptable parameters.
To get to your original question, the contents of the object s can have variable size. See http://www.cplusplus.com/reference/string/string/resize/ for more details on this.
A variable is an object we refer to by a name. The "physical" size of an object, sizeof(s) in this case, doesn't change, ever. The type is still std::string and the size of a std::string is always constant. However, things like strings and vectors (and other containers for that matter) have a "logical size" that tells us how many elements of some type they store. A string "logically" stores characters. I say "logically" because a string object doesn't really contain the characters directly. Usually it has only a couple of pointers as "physical members". Since the string object manages a dynamically allocated array of characters and provides proper copy semantics and convenient access to the characters, we can think of those characters as members ("logical members"). Since growing a string is a matter of reallocating memory and updating pointers, we don't even need sizeof(s) to change.
I would say this is a string object, and it has the capability to grow and shrink dynamically.