C++ string uses maximum buffer allocated? - c++

I declare a variable string s;
and assign s = "abc"; so it now has a buffer of 3 characters.
After
s = "abcd" it has a buffer of 4 characters.
Now, after the third statement
s = "ab" the question is: will it keep the 4-character buffer, or will it reallocate a 2-character buffer?
If it reallocates a 2-character buffer, is there any way to tell it to keep the largest buffer it has allocated?
In other words, does it keep the largest buffer ever allocated?
s = "ab"
s="abc"
s="a"
s="abcd"
s="b"
Now it should keep a buffer of size 4.
Is that possible?

The string will keep its buffer once it is allocated, and only reallocate if it needs an even larger buffer. It will also likely start with an initial buffer size larger than 3 or 4.
You can check the allocated size using the capacity() member function.
After James' comments below I still believe my answer is correct for the examples given in the question.
However, for a reference counted implementation a sequence like this
s = "some rather long string...";
std::string t = "b";
s = t;
would set s.capacity() to be equal to t.capacity() if the implementation decides to share the internal buffer between s and t.

s = "ab" question is will it keep buffer of 4 words or will it
reallocate 2 word buffer ?
It will not reallocate the buffer. I don't know if it's mentioned in the standard, but all the implementations I have ever seen reallocate only when they need to increase the capacity, never to decrease it. Even if you have a string with 4 characters and call .resize(2) or .reserve(2), the capacity will not change. To force the string (or another container) to reallocate memory to fit its size, there's a simple swap trick:
string(s).swap(s);
What happens here? You create a temporary copy of s, whose capacity is typically just large enough for s.size(), then swap it with your original string. The destructor of the temporary then frees the old, larger buffer. The temporary has to be on the left: swap takes a non-const reference, so s.swap(string(s)) is rejected by a conforming compiler.
Again, I am not claiming this is standard, but all the implementations I've seen have this behavior.

You can easily see the behavior of your implementation by calling
std::string::capacity at various times. In general, I'd be surprised
if any implementation ever had a buffer of three characters. (Not
words, but bytes, at least on most modern machines.) In practice,
implementations vary, and also vary depending on how the new length
comes about: with g++, for example, removing characters with
std::string::erase will not reduce the capacity of the string, but
assigning a new, smaller string will. VC++ doesn't reduce the capacity
in either case. (In general, VC++ and g++ have very different
strategies with regards to memory management in strings.)
EDIT:
Given the other responses (which don't even correspond to usual
practice): here's the small test program I used to verify my statements
above (although I really didn't need it for g++—I know the
internals of the implementation quite well):
#include <string>
#include <iostream>
#include <iomanip>
template<typename Traits>
void
test()
{
std::string s;
size_t lastSeen = -1;
std::cout << Traits::name() << ": Ascending:" << std::endl;
while ( s.size() < 150 ) {
if ( s.capacity() != lastSeen ) {
std::cout << " " << std::setw( 3 ) << s.size()
<< ": " << std::setw( 3 ) << s.capacity() << std::endl;
lastSeen = s.capacity();
}
Traits::grow( s );
}
std::cout << Traits::name() << ": Descending: " << std::endl;
while ( s.size() != 0 ) {
Traits::shrink( s );
if ( s.capacity() != lastSeen ) {
std::cout << " " << std::setw( 3 ) << s.size()
<< ": " << std::setw( 3 ) << s.capacity() << std::endl;
lastSeen = s.capacity();
}
}
std::cout << "Final: capacity = " << s.capacity() << std::endl;
}
struct Append
{
static void grow( std::string& s )
{
s += 'x';
}
static void shrink( std::string& s )
{
s.erase( s.end() - 1 );
}
static std::string name()
{
return "Append";
}
};
struct Assign
{
static void grow( std::string& s )
{
s = std::string( s.size() + 1, 'x' );
}
static void shrink( std::string& s )
{
s = std::string( s.size() - 1, 'x' );
}
static std::string name()
{
return "Assign";
}
};
int
main()
{
test<Append>();
test<Assign>();
return 0;
}
Try it. The results are quite instructive.

Related

size and the type of object created by the vector string constructor in C++

int numRows = 5;
string s ="hellohi";
vector<string> rows(min(numRows, int(s.size())));
I think it is using the fill constructor. https://www.cplusplus.com/reference/vector/vector/vector/
but I don't know whether it creates a vector of NULL strings or a vector of empty strings.
And what is the size of NULL?
And what is the size of an empty string? 1 byte (the '\0' char)?
The constructor you're using will create empty strings. For example you can check with:
// check the number of entries in rows, should be 5
std::cout << rows.size() << std::endl;
// check the number of characters in first string, should be 0
std::cout << rows[0].size() << std::endl;
// now the size should be 11, since "hello world" has 11 characters
rows[0] = "hello world";
std::cout << rows[0].size() << std::endl;
I believe the size of a null pointer is implementation-defined; you can find it with:
std::cout << sizeof(nullptr) << std::endl;
I get 8 bytes (64 bits).
Like the size of nullptr, the size of an empty string object is also implementation-defined; you can find it like this:
std::string test_string;
std::cout << sizeof(test_string) << "\n";
std::cout << test_string.size() << "\n"; // should be 0 since the string is empty
test_string = "hello world"; // it doesn't matter how long the string is, it's the same size
std::cout << sizeof(test_string) << "\n";
std::cout << test_string.size() << "\n"; // should be 11 since the string has data now
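I get 32 bytes for sizeof. That value doesn't change because sizeof measures the std::string object itself, not the characters it manages: apart from short strings, which many implementations store inside the object, the object only holds a pointer to the data plus bookkeeping such as the size and capacity, and those members always have a fixed size.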

Is a object of std::string really movable?

As we know, a movable object is one that is not deep-copied when it is assigned to another object of the same type. This can save a lot of time.
But today I found something that looks strange to me. Please look at the following code.
#include <cstdlib>
#include <string>
#include <utility>
#include <iostream>
int main() {
std::string s1 = "s1";
std::string s2 = "s2";
std::cout << " s1[" << ( void* ) &s1[0] << "]:" + s1
<< ", s2[" << ( void* ) &s2[0] << "]:" + s2
<< std::endl;
s1.swap( s2 );
std::cout << " s1[" << ( void* ) &s1[0] << "]:" + s1
<< ", s2[" << ( void* ) &s2[0] << "]:" + s2
<< std::endl;
s2 = std::move(s1);
std::cout << " s1[" << ( void* ) &s1[0] << "]:" + s1
<< ", s2[" << ( void* ) &s2[0] << "]:" + s2
<< std::endl;
return EXIT_SUCCESS; }
After the move, the contents of the strings have changed, but the addresses where the string data is actually stored have not.
If the memory addresses do not change, does that mean a deep copy is in fact performed, rather than just assigning a pointer to the target's member?
thanks!
Leon
a movable object is one would not be copied deeply when it be assigned
to another one of same type
Only if it makes sense. In the following snippet
int i0 = 11;
int i1 = std::move(i0);
there will be no "stealing", simply because there is nothing to steal. So the premise of the question is flawed: a move operation "steals" the content of the moved-from object only when it makes sense to do so.
Also note that in the C++ world, unlike Java and C#, an object is anything that occupies memory: integers, pointers, characters are all objects.
std::string uses an optimization technique called "short string optimization" or SSO. If the string is short enough (and "short enough" is implementation-defined), no buffer is dynamically allocated, so there is nothing to "steal". When such a short string is moved, its characters are simply copied into the moved-to string without touching any dynamically allocated buffer.

C++ Character Array Error Handling

If I declare a string array in c++ such as
char name[10]
how would you error handle if the input is over the character limit?
Edit: My assignment says to use cstring rather than string. Input will be the person's full name.
Here is an example where setName checks the size is OK before assigning the char[10] attribute.
Note that char[10] can only store a 9-character name, because one character is needed for the terminating '\0'.
Maybe that's what you want:
#include <iostream>
#include <cstring>
using namespace std;
#define FIXED_SIZE 10
class Dummy
{
public:
bool setName( const char* newName )
{
if ( strlen( newName ) + 1 > FIXED_SIZE )
return false;
strcpy( name, newName );
return true;
}
private:
char name[FIXED_SIZE];
};
int main()
{
Dummy foo;
if ( foo.setName( "ok" ) )
std::cout << "short works" << std::endl;
if ( foo.setName( "012345678" ) )
std::cout << "9 chars OK, leaves space for \\0" << std::endl;
if ( !foo.setName( "0123456789" ) )
std::cout << "10 chars not OK, needs space for \\0" << std::endl;
if ( !foo.setName( "not ok because too long" ) )
std::cout << "long does not work" << std::endl;
// your code goes here
return 0;
}
I'm piecing together that your instructions say to use <cstring> so you can use strlen to check the length of the string prior to "assigning" it to your name array.
so something like...
const int MAX_NAME_LEN = 10;
char name[MAX_NAME_LEN];
// ...
// ...
if (strlen(input) + 1 > MAX_NAME_LEN) {
// can't save it: too long to store with the terminating null char
}
else {
// good to go
}
First of all, your question is not clear. I assume you are asking how to make sure an array index does not go out of bounds.
For char name[10] the valid indices are 0 through 9. Accessing anything outside that range causes undefined behavior. If the index is only slightly out of range, you will most probably read or overwrite your own program's memory. If it is far out of range, your program will most probably be killed by the operating system.
In other words, undefined behaviour can mean anything from a crash to seemingly correct output.
Since others mentioned how to do this with a predefined input string, here's a solution which reads a c-string from input:
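#include <iostream>
#define BUF_SIZE 10
using namespace std;
int main()
{
    char name[BUF_SIZE];
    // get() stores at most BUF_SIZE-1 characters plus the terminating '\0'
    cin.get(name, BUF_SIZE);
    if (cin) // at least one character was read
        if (cin.get() != '\n') // the line continues past what fits in the buffer
            cerr << "Name may not exceed " << BUF_SIZE-1 << " characters";
}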

How to truncate a string [formating] ? c++

I want to truncate a string in a cout,
string word = "Very long word";
int i = 1;
cout << word << " " << i;
I want to have as an output of the string a maximum of 8 letters
so in my case, I want to have
Very lon 1
instead of :
Very long word 1
I don't want to use setw(8), since unfortunately it will not truncate my word to the size I want. I also don't want the 'word' string to change its value (I just want to show the user part of the word, but keep it whole in my variable).
I know you already have a solution, but I thought this was worth mentioning: Yes, you can simply use string::substr, but it's a common practice to use an ellipsis to indicate that a string has been truncated.
If that's something you wanted to incorporate, you could just make a simple truncate function.
#include <iostream>
#include <string>
std::string truncate(std::string str, size_t width, bool show_ellipsis=true)
{
    if (str.length() > width) {
        if (show_ellipsis)
            return str.substr(0, width) + "...";
        else
            return str.substr(0, width);
    }
    return str;
}
int main()
{
std::string str = "Very long string";
int i = 1;
std::cout << truncate(str, 8) << "\t" << i << std::endl;
std::cout << truncate(str, 8, false) << "\t" << i << std::endl;
return 0;
}
The output would be:
Very lon... 1
Very lon 1
As Chris Olden mentioned above, using string::substr is a way to truncate a string. However, if you need another way to do that you could simply use string::resize and then add the ellipsis if the string has been truncated.
You may wonder what string::resize does. It changes the size of the string (the number of characters in use, not the reserved capacity). If the new size n is smaller, every character beyond the first n is dropped. If it is greater, the string is extended with null characters, or with a fill character you pass as the second argument.
Of course, I don't want to suggest a 'new best way' to do it, it's just another way to truncate a std::string.
If you adapt the Chris Olden truncate function, you get something like this:
#include <iostream>
#include <string>
std::string& truncate(std::string& str, size_t width, bool show_ellipsis=true) {
if (str.length() > width) {
if (show_ellipsis) {
str.resize(width);
return str.append("...");
}
else {
str.resize(width);
return str;
}
}
return str;
}
int main() {
std::string str = "Very long string";
int i = 1;
std::cout << truncate(str, 8) << "\t" << i << std::endl;
std::cout << truncate(str, 8, false) << "\t" << i << std::endl;
return 0;
}
Although this version does basically the same thing, note that it takes and returns a reference, so it modifies the caller's string in place: after the first call in main above, str itself is "Very lon...". The returned reference is also only valid for as long as the caller's string is alive. If you don't want to take that risk, just remove the references and the function becomes:
std::string truncate(std::string str, size_t width, bool show_ellipsis=true) {
if (str.length() > width) {
if (show_ellipsis) {
str.resize(width);
return str + "...";
}
else {
str.resize(width);
return str;
}
}
return str;
}
I know it's a little bit late to post this answer. However it might come in handy for future visitors.

Comparing Character Literal to Std::String in C++

I would like to compare a character literal with the first element of string, to check for comments in a file. Why use a char? I want to make this into a function, which accepts a character var for the comment. I don't want to allow a string because I want to limit it to a single character in length.
With that in mind I assumed the easy way to go would be to address the character and pass it to the std::string's compare function. However this is giving me unintended results.
My code is as follows:
#include <string>
#include <iostream>
int main ( int argc, char *argv[] )
{
std::string my_string = "bob";
char my_char1 = 'a';
char my_char2 = 'b';
std::cout << "STRING : " << my_string.substr(0,1) << std::endl
<< "CHAR : " << my_char1 << std::endl;
if (my_string.substr(0,1).compare(&my_char1)==0)
std::cout << "WOW!" << std::endl;
else
std::cout << "NOPE..." << std::endl;
std::cout << "STRING : " << my_string.substr(0,1) << std::endl
<< "CHAR : " << my_char2 << std::endl;
if (my_string.substr(0,1).compare(&my_char2)==0)
std::cout << "WOW!" << std::endl;
else
std::cout << "NOPE..." << std::endl;
std::cout << "STRING : " << my_string << std::endl
<< "STRING 2 : " << "bob" << std::endl;
if (my_string.compare("bob")==0)
std::cout << "WOW!" << std::endl;
else
std::cout << "NOPE..." << std::endl;
}
Gives me...
STRING : b
CHAR : a
NOPE...
STRING : b
CHAR : b
NOPE...
STRING : bob
STRING 2 : bob
WOW!
Why does the function think the sub-string and character aren't the same. What's the shortest way to properly compare chars and std::string vars?
(a short rant to avoid reclassification of my question.... feel free to skip)
When I say shortest I mean that out of a desire for coding eloquence. Please note, this is NOT a homework question. I am a chemical engineering Ph.D candidate and am coding as part of independent research. One of my last questions was reclassified as "homework" by user msw (who also made a snide remark) when I asked about efficiency, which I considered on the border of abuse. My code may or may not be reused by others, but I'm trying to make it easy to read and maintainable. I also have a bizarre desire to make my code as efficient as possible where possible. Hence the questions on efficiency and eloquence.
Doing this:
if (my_string.substr(0,1).compare(&my_char2)==0)
Won't work because you're "tricking" the string into thinking it's getting a pointer to a null-terminated C-string. This will have weird effects up to and including crashing your program. Instead, just use normal equality to compare the first character of the string with my_char:
if (my_string[0] == my_char)
// do stuff
Why not just use the indexing operator on your string? It will return a char type.
if (my_string[0] == my_char1)
You can use the operator[] of string to compare it to a single char
// string::operator[]
#include <iostream>
#include <string>
using namespace std;
int main ()
{
string str ("Test string");
char c = 't';
for (std::size_t i = 0; i < str.length(); i++)
{
if (c == str[i]) {
std::cout << "Equal at position i = " << i << std::endl;
}
}
return 0;
}
The behaviour of the first two calls to compare is entirely dependent on what random memory contents follows the address of each char. You are calling basic_string::compare(const char*) and the param here is assumed to be a C-String (null-terminated), not a single char. The compare() call will compare your desired char, followed by everything in memory after that char up to the next 0x00 byte, with the std::string in hand.
On the other hand, the << operator does have a proper overload for char, so your output does not reflect what you are actually comparing.
If you declare the two variables as null-terminated arrays instead, e.g. const char my_char1[] = "a";, and pass the array itself (my_char1, not &my_char1), you will get the comparison you intended.
Pretty standard: C-style strings are null-terminated; single characters are not. So by using the standard compare method you're really comparing "b\0" with a 'b' that has no terminator after it.
I used this and got the desired output:
if (my_string.substr(0,1).compare( 0, 1, &my_char2, 1)==0 )
std::cout << "WOW!" << std::endl;
else
std::cout << "NOPE..." << std::endl;
What this says is: start at position 0 of the substring, use a length of 1, and compare it with the 1 character starting at the address of my_char2.