std::string capacity size - c++

Is the capacity of a std::string always a multiple of 15?
For example, in all of the following cases the capacity is 15:
string s1 = "Hello";
string s2 = "Hi";
string s3 = "Hey";
or is it random?

Is the capacity of a std::string always a multiple of 15?
No; the only guarantee about the capacity of a std::string is that s.capacity() >= s.size().
A good implementation will probably grow the capacity exponentially so that it doubles in size each time a reallocation of the underlying array is required. This is required for std::vector so that push_back can have amortized constant time complexity, but there is no such requirement for std::string.
In addition, a std::string implementation can perform small string optimizations where strings smaller than some number of characters are stored in the std::string object itself, not in a dynamically allocated array. This is useful because many strings are short and dynamic allocation can be expensive. Usually a small string optimization is performed if the number of bytes required to store the string is smaller than the number of bytes required to store the pointers into a dynamically allocated buffer.
Whether or not your particular implementation performs small string optimizations, I don't know.
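One way to check whether a given implementation applies the small string optimization is to test whether the character buffer lies inside the string object itself. The following is a minimal sketch; the helper name stored_inline is hypothetical, and it assumes the implementation keeps short strings in an internal buffer, which the standard does not require.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: returns true if the character buffer of s lives
// inside the std::string object itself, which is how a small string
// optimization shows up in practice.
bool stored_inline(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    const char* buf = s.data();
    // std::less gives a well-defined order even for unrelated pointers.
    std::less<const char*> before;
    return !before(buf, obj_begin) && before(buf, obj_end);
}
```

Calling stored_inline on a short string such as "Hi" and on std::string(1000, 'x') shows the difference: the long string can never fit inside the object, while the result for the short one depends on the implementation.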

It is implementation specific. std::string usually allocates a small starting buffer; 16 bytes is common.
It's a compromise between not having to do a realloc and move for very short strings and not wasting space.

It's an implementation detail that you're not supposed to rely on; to see exactly how std::string grows in your implementation, you can look at the sources of its standard library. In general the growth is exponential.
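Instead of reading the library sources, the growth pattern can also be observed directly. This is a rough sketch, not a statement about any particular implementation; the function name capacity_steps is made up for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Appends n characters one at a time and records every distinct capacity
// the string passes through. Each change corresponds to one reallocation
// (or to leaving an inline buffer, if the implementation has one).
std::vector<std::size_t> capacity_steps(std::size_t n) {
    std::string s;
    std::vector<std::size_t> steps{s.capacity()};
    for (std::size_t i = 0; i < n; ++i) {
        s.push_back('x');
        if (s.capacity() != steps.back()) {
            steps.push_back(s.capacity());
        }
    }
    return steps;
}
```

Printing the result of capacity_steps(100000) shows the sequence of capacities your library chooses; with exponential growth the list stays short even for large n.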

Related

What is difference between adding a character and pushing a character in a string?

I was doing a question where I was adding a character using for loop in string like this:
for(int i=0;i<n;i++){
str = ch + str;
}
This code was running fine for small inputs. But when the input became quite large the memory limit exceeded.
Then I switched my code with:
for(int i=0;i<n;i++){
str.push_back(ch);
}
reverse(str.begin(),str.end());
And it worked. I want to know the reason why for my own understanding.
The first adds a character to the beginning of the string. In order to do this it creates a new string containing just ch, then copies all of str into it, then moves this temporary string back into str. This operation requires at least enough memory to hold two copies of the string.
The second adds a character to the end of the existing string. The string likely already has space for this character, so it just appends the character without using any additional memory. At some point the string will need to grow, and at that point you'll again need more memory: typically enough for the current size of the string plus the new size (e.g. if the string implementation decides to double in size, you'll need enough memory for three times the current size of the string).
I would say this is due to memory fragmentation and being very close to the out-of-memory boundary.
In str = ch + str; you are allocating a new string that is 1 char longer every time and then freeing the old string. If there are no other allocations it will need about 3 times the total length of the string of memory. If there are any other allocations in the loop then it can go as high as O(n^2).
With str.push_back(ch); the string will grow by some factor when it reaches capacity causing a lot fewer allocations. Without other allocations this could still be something like 2.3 times the total string size. With allocations it could be O(n * log(n)) where log is to the base of the growth factor.
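The two approaches from the question can be put side by side as follows. This is only a sketch: the function names are hypothetical, and the input is a sequence of distinct characters (rather than one repeated ch) so that the effect of the final reverse is visible.

```cpp
#include <algorithm>
#include <string>

// Builds the result by prepending: every iteration creates a temporary
// string that is one character longer and copies the old contents into it.
std::string build_by_prepend(const std::string& chars) {
    std::string str;
    for (char ch : chars) {
        str = ch + str;
    }
    return str;
}

// Builds the same result by appending in place and reversing once at the end.
std::string build_by_append(const std::string& chars) {
    std::string str;
    for (char ch : chars) {
        str.push_back(ch);
    }
    std::reverse(str.begin(), str.end());
    return str;
}
```

Both functions return the same string; they differ only in how many temporary buffers are allocated along the way.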

Trimming C++ string in constant time

Is there an STL/library method to reduce the string size (trim it) in constant time?
In C this can be done in constant time by just writing a '\0' past the last index to keep.
The complexity of C++ resize is unspecified and is most likely O(N):
http://www.cplusplus.com/reference/string/string/resize/
@SamVarshavchik is being coy in the comments, but it's worth spelling out: in many implementations, including libstdc++, std::string::resize() will reduce the size of a string in constant time, by reducing the length of the string and not reallocating/copying the data:
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/basic_string.tcc . The estimate of O(n) is if you increase the size, forcing the string to be reallocated and copied over.
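As an illustration of trimming with resize(), here is a small sketch. The helper name truncate is hypothetical, and the constant-time behavior is an assumption about the implementation, not something the code can enforce.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: shortens s to at most n characters. When shrinking,
// resize() only has to update the stored length and the terminator; it is
// never asked to grow the string here, so no reallocation is requested.
void truncate(std::string& s, std::size_t n) {
    if (n < s.size()) {
        s.resize(n);
    }
}
```

For instance, applying truncate(s, 5) to a string holding "hello world" leaves "hello" in s, and comparing s.capacity() before and after the call shows whether your implementation kept the buffer.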
Alternatively, in C++17, std::string_view is a rough immutable counterpart to "trimming" a C string by slapping a null byte into it. It takes a slice of a string (pointer + size) without copying its content, and you can then pass it around to various STL functions or print it to a stream.
std::string hello_world = "hello world";
auto hello = std::string_view(hello_world.data(), 5);
std::cout << hello; // prints hello
The caveat is that the string_view doesn't own the data, so you can't use it once the original string goes out of scope, and you don't want to modify the original string in a way that might cause it to reallocate.
In C++17, std::string_view lets us perform the substr operation in O(1):
https://www.modernescpp.com/index.php/c-17-avoid-copying-with-std-string-view
std::string_view does not allocate memory on the heap, even for large strings.
std::string allocates memory on the heap, except for short strings: up to 15 characters on MSVC and GCC, and about 22 on Clang's libc++ (64-bit builds). Strings no longer than these limits are stored inside the string object and need no heap allocation.
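A short sketch of trimming through std::string_view, assuming a C++17 compiler. The helper names first_n and drop_last_n are made up for illustration; both only adjust a pointer/length pair and never copy characters.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: view of the first n characters (or fewer if text is
// shorter). substr clamps the count, so this never reads past the end.
std::string_view first_n(std::string_view text, std::size_t n) {
    return text.substr(0, n);
}

// Hypothetical helper: view with the last n characters dropped. The count is
// clamped because remove_suffix requires n <= size().
std::string_view drop_last_n(std::string_view text, std::size_t n) {
    text.remove_suffix(n < text.size() ? n : text.size());
    return text;
}
```

Both views point into the original buffer, so the caveat about the lifetime of the underlying string applies to them as well.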

std::string capacity remains same after deleting elements, so is it holding up some memory?

The following piece of code:
string a = "abc";
cout << a.capacity();
a.erase(a.begin() + 1, a.end());
cout << a.capacity();
...outputs:
33
Even if I remove some elements from the string, the capacity remains the same. So my questions are:
1) Is some memory being held up because of capacity? What if I have not explicitly reserve()'d?
2) If I use reserve() and don't end up using the entire capacity, am I wasting memory?
3) Will this extra memory (which I am not using) be allocated to something else if required?
EDIT:
Suppose I have
string a= "something";
a = "ab";
Now I know that a won't ever have more than two characters. So is it wise to call reserve(2) so that memory is not wasted?
I'll answer your questions first:
1) The memory belongs to the string, but isn't used entirely. If you don't reserve, you can't control the capacity. You just know it is sufficiently large.
2) Correct.
3) As said in 1): no. It belongs to the string. Nothing else can use this memory. The string itself could use it for additional characters though.
For further details I'd recommend the documentation of string::reserve.
One final remark: If you never reserve, everything will work fine, although it might be unnecessarily slow. That only happens if you frequently append a few characters at a time and the string has to reallocate a lot (much like a vector). Reserving is basically intended to avoid this situation.
On the addendum: Calling reserve can help to save memory. If you call reserve(n), this ensures the string has an internal capacity for at least n characters. Note that reserve is not required to set the capacity to exactly n nor to reduce the capacity at all for small n (cf. reserve documentation). Since C++20, reserve never reduces the capacity.
Back to your example: If you call reserve it can never do any harm. It's the best you can do in general. (In case you have C++11 features, I'd recommend shrink_to_fit).
I tested with (older) versions of gcc / clang, in which case the capacity of a was changed to exactly 2. Since I'm not 100% sure what the added question refers to, here is what I ran:
auto a = string{"something"};
a = "ab";
cout << a << " " << a.capacity() << endl;
a.reserve(2);
cout << a << " " << a.capacity() << endl;
Which produces:
ab 9
ab 2
1) Is some memory being held up because of capacity? What if I have not
explicitly reserve()'d?
Even if you did not call reserve, a std::string object (even a default-constructed one) can still hold some memory [1]. This is true of std::string implementations that use Short String Optimization.
2) If I use reserve() and don't end up using the entire capacity
then am I wasting memory?
It depends on what you mean by wasting memory; std::string dynamically resizes its buffer to accommodate changes in the size of the string. Well, in the case of Short String Optimized std::string implementations, there is nothing you can do about it. The memory is in the string object itself.
3) Will this extra memory which I am not using be allocated to
something else, if required?
No. A std::string object manages the memory it allocated, and it may or may not give it up [2], wholly or partly, until it's destroyed. See David Schwartz's comment.
EDIT: Suppose I have
string a= "something";
Now I edit the string and know that a won't have more than two
characters. So is it wise to call a.reserve(2) so that memory is not
wasted?
If you modified a in a way that reduces a.size(), such as calling the resize method or assigning it a new string of length 2, then a subsequent reserve call can [2] be beneficial.
Note that calling reserve would not reduce the string's contents. std::string::reserve is not permitted to truncate the string. It is only permitted to work on unused memory. If you call std::string::reserve(new_capacity_intended) with new_capacity_intended less than the size() of the string, the best that could possibly happen is the same effect as std::string::shrink_to_fit.
To shrink the string to its first two characters and reduce its memory (if the implementation honors the non-binding shrink_to_fit request):
string a= "something";
//resize it first
a.resize(2); //or by some assignment such as a = "so";
//then
a.reserve(2); // or better still a.shrink_to_fit();
1: by memory, I assume Virtual Memory
2: std::string::reserve or std::string::shrink_to_fit may or may not give up the string's unused memory.
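The resize-then-shrink sequence above can be wrapped so that the effect on capacity is visible. This is a sketch under the assumption that shrink_to_fit may be ignored by the implementation; the struct CapacityReport and the function shrink_to_prefix are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Capacity observed before and after the shrink request.
struct CapacityReport {
    std::size_t before;
    std::size_t after;
};

// Hypothetical helper: keeps at most the first n characters of s, then asks
// the string to release unused capacity. The request is non-binding, so
// `after` may equal `before`.
CapacityReport shrink_to_prefix(std::string& s, std::size_t n) {
    if (n < s.size()) {
        s.resize(n);
    }
    const std::size_t before = s.capacity();
    s.shrink_to_fit();
    return {before, s.capacity()};
}
```

Comparing the two fields of the returned report tells you whether your implementation actually released memory for a given string.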

When does std::string reallocate memory?

When using an std::string object and I want to add characters to it, would it preallocate some memory, or would it only allocate as much as I need?
To be precise:
std::string s;
s.reserve(20);
char c = 'a';
s = "";
for(int i = 0; i < 25; i++)
s += c;
In the above example I reserve an amount of memory. Now when I clear the string, will it cause the reserved memory to be discarded?
In the loop would it fill up the reserved memory and then reallocate for the extra 5 characters each time?
There is no requirement that std::string release allocated memory when you assign an empty string to it. Nor when you assign a short string to it. The only requirement is that when it allocates memory to hold a larger string, the allocation must be done in a way that achieves amortized constant time. A simple implementation would be to grow by a factor of 2 each time more space is needed.
If you want the string's capacity to be minimized, you can use string::shrink_to_fit() in C++11. Before C++11 some people resorted to the "swap trick" when they needed to reduce capacity.
string doesn't "remember" that you said 20 characters, it just knows its current capacity. So your reserve(20) call increases the capacity to at least 20 and has no further effect.
When you add 25 characters to the string, the capacity increases to at least 25 along the way. Then it remains at this new level unless you do something that reduces the capacity. Calling clear() does not alter the capacity.
No, reserved memory isn't discarded; swap with an empty object for that.
Yes, when your reserve space is full, the next append, etc will cause a reallocation.
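The scenario from the question can be traced step by step. The sketch below is illustrative only: the struct and function names are hypothetical, it uses clear() where the question assigns "", and the exact capacity values depend on the implementation.

```cpp
#include <cstddef>
#include <string>

// Capacities observed at three points of the question's scenario.
struct Capacities {
    std::size_t after_reserve;
    std::size_t after_clear;
    std::size_t after_append;
};

Capacities trace_capacity() {
    Capacities c{};
    std::string s;
    s.reserve(20);            // capacity becomes at least 20
    c.after_reserve = s.capacity();
    s.clear();                // empties the string
    c.after_clear = s.capacity();
    for (int i = 0; i < 25; ++i) {
        s += 'a';             // may reallocate once the capacity is exhausted
    }
    c.after_append = s.capacity();
    return c;
}
```

Printing the three values shows whether clearing kept the reserved buffer and how far the capacity grew once the string needed more room than was reserved.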

Can std::string capacity be changed for optimisation?

Can the std::string capacity be changed to optimise it?
For example:
std::string name0 = "ABCDEABCDEABCDEF";
int cap = name0.capacity(); //cap = 31
int size = name0.size(); //size = 16
Okay, this is perfectly fine for a couple of strings in memory, but what if there are thousands? This wastes a lot of memory. Isn't it then better to use char* so you can control how much memory is allocated for the specific string?
(I know some people will ask why are there thousands of strings in memory, but I would like to stick to my question of asking if the string capacity can be optimised?)
If you're asking how to reduce capacity() so that it matches size(), then C++11 added shrink_to_fit() for this purpose, but be aware that it is a non-binding request, so implementations are allowed to ignore it.
name0.shrink_to_fit();
Or there's the trick of creating a temporary string and swapping:
std::string(name0.begin(), name0.end()).swap(name0);
However, neither of these is guaranteed to give you a capacity() that matches size(). From GotW #54:
Some implementations may choose to round up the capacity slightly to their next larger internal "chunk size," with the result that the capacity actually ends up being slightly larger than the size.
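To see how close the swap trick gets on a given implementation, it can be wrapped in a small function. This is a sketch; the name shrink_by_swap is hypothetical, and the resulting capacity is whatever the library picks for a freshly constructed string of that length.

```cpp
#include <string>

// Hypothetical helper: replaces the buffer of s with one sized for a fresh
// copy of its contents. The temporary takes over the old, larger buffer and
// frees it when it is destroyed at the end of the statement.
void shrink_by_swap(std::string& s) {
    std::string(s.begin(), s.end()).swap(s);
}
```

After calling shrink_by_swap(name0), comparing name0.capacity() with name0.size() shows how much rounding your implementation applies.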