So I am trying to concatenate simple strings, and make a final sentence.
int main()
{
string I ("I");
string Love ("Love");
string STL ("STL,");
string Str ("String.");
string fullSentence = '\0';
// Concatenate
fullSentence = I + " " + Love + " " + STL + " " + Str;
cout << fullSentence;
return 0;
}
Here, I didn't want "fullSentence" to start out with nothing in it, so I assigned null to it, and that gives me an error. There is no clear error message, only the following, which I do not understand at all:
Exception thrown at 0x51C3F6E0 (ucrtbased.dll) in exercise_4.exe: 0xC0000005: Access violation reading location 0x00000000. occurred
As soon as I remove '\0', it works just fine. Why is that?
It appears to be an MSVC compiler bug to me.
The statement:
string fullSentence = '\0';
is not supposed to compile.
Indeed, there is no valid (implicit) constructor from char (i.e. '\0') to std::string.
Note that gcc and clang do not accept this code as valid.
MSVC does.
Why does MSVC accept it?
Looking at the assembly code, MSVC compiles that statement with the following constructor:
std::string::string(char const * const);
MSVC treats the argument '\0' as a null pointer constant, so it is actually converted into a null pointer.
So:
Constructs the string with the contents initialized with a copy of the null-terminated character string pointed to by s. The length of the string is determined by the first null character. The behavior is undefined if [s, s + Traits::length(s)) is not a valid range (for example, if s is a null pointer).
So your code is undefined behavior.
Use "\0" instead of '\0'. In C++, single quotes denote a char and double quotes denote a string literal.
It's a conversion from char to non-scalar type std::string
You could use a debugger to see a call stack of what happened.
Looking into string class here
Below constructor was called in your case:
basic_string( const CharT* s, const Allocator& alloc = Allocator() );
As per the description of the constructor (emphasis mine):
Constructs the string with the contents initialized with a copy of the null-terminated character string pointed to by s. The length of the string is determined by the first null character. The behavior is undefined if [s, s + Traits::length(s)) is not a valid range
In your case s is a null pointer, so the range is invalid.
Related
Will the below string contain the null terminator '\0'?
std::string temp = "hello whats up";
No, but if you say temp.c_str() a null terminator will be included in the return from this method.
It's also worth saying that you can include a null character in a string just like any other character.
string s("hello");
cout << s.size() << ' ';
s[1] = '\0';
cout << s.size() << '\n';
prints
5 5
and not 5 1 as you might expect if null characters had a special meaning for strings.
Not in C++03, and before C++11 it is not even guaranteed that a std::string is contiguous in memory. Only C strings (char arrays which are intended for storing strings) had the null terminator.
In C++11 and later, mystring.c_str() is equivalent to mystring.data() is equivalent to &mystring[0], and mystring[mystring.size()] is guaranteed to be '\0'.
In C++17 and later, mystring.data() also provides an overload that returns a non-const pointer to the string's contents, while mystring.c_str() only provides a const-qualified pointer.
This depends on your definition of 'contain' here. In
std::string temp = "hello whats up";
there are few things to note:
temp.size() will return the number of characters from first h to last p (both inclusive)
But at the same time temp.c_str() or temp.data() will return with a null terminator
Or in other words int(temp[temp.size()]) will be zero
I know I sound similar to some of the answers here, but I want to point out that the size of a std::string in C++ is maintained separately; it is not like in C, where you keep counting until you find the first null terminator.
To add, the story is a little different if your string literal contains an embedded \0. In that case, construction of the std::string from the literal stops at the first null character, as follows:
std::string s1 = "ab\0\0cd"; // s1 contains "ab", using string literal
std::string s2{"ab\0\0cd", 6}; // s2 contains "ab\0\0cd", using a different ctor (pointer + count)
std::string s3 = "ab\0\0cd"s; // s3 contains "ab\0\0cd", using the ""s operator (C++14, needs using namespace std::string_literals)
References:
https://akrzemi1.wordpress.com/2014/03/20/strings-length/
http://en.cppreference.com/w/cpp/string/basic_string/basic_string
Yes, if you call temp.c_str(), it will return a null-terminated C string.
However, before C++11 the actual data stored in the object temp was not required to be null-terminated. That doesn't matter and shouldn't matter to the programmer, because when the programmer wants a const char*, they call c_str() on the object, which is guaranteed to return a null-terminated string.
With C++ strings you don't have to worry about that, and it's possibly dependent on the implementation.
Using temp.c_str() you get a C representation of the string, which will definitely contain the \0 char. Other than that, I don't really see how it would be useful on a C++ string.
std::string internally keeps a count of the number of characters and works using this count. Like others have said, when you need the string for display or for whatever other reason, you can call its c_str() method, which will give you the string with the null terminator at the end.
Why does the following code give an error?
// This a CPP Program
#include <bits/stdc++.h>
using namespace std;
// Driver code
main()
{
string s=NULL;
s.length();
}
I know that a runtime error will occur because I am trying to get the length of the null string but I want to know why it is happening?
You invoke the following overload of the std::string constructor (overload 5):
basic_string( const CharT* s, const Allocator& alloc = Allocator());
And this is the explanation belonging to the constructor (emphasis mine):
Constructs the string with the contents initialized with a copy of the null-terminated character string pointed to by s. The length of the string is determined by the first null character. The behavior is undefined if [s, s + Traits::length(s)) is not a valid range (for example, if s is a null pointer).
Thus, you have undefined behavior at work. Referring back to your question, that rules out any further thoughts on "why it is happening", because UB can result in anything. You could wonder why it's specified as UB in the first place: this is because std::string is designed to work with C-style, zero-terminated strings (char*), and NULL is not one. The empty, zero-terminated C-style string is e.g. "".
Why does the following code give an error?
main must be declared to return int.
Also, to declare an empty string, make it string s; or string s="";
This would compile:
#include <iostream>
#include <string>
int main()
{
std::string s;
std::cout << s.length() << '\n'; // prints 0
}
On a side note: please read "Why should I not #include <bits/stdc++.h>?"
There is no such thing as the null string unless by "null" you mean empty, which you don't.
I'm trying to understand how strings really work in C++ because I just got really confused after coming across an unexpected behavior.
Considering a string, I insert a character (not using append()) using [] operator:
string str;
str[0] = 'a';
Let's print the string:
cout << "str:" << str << endl;
I get NULL as output:
str:
Ok, let's try printing the only character in the string:
cout << "str[0]:" << str[0] << endl;
Output:
str[0]:a
Q1. What happened there? Why was a not printed in the first case?
Now, I do something that should throw a compilation error but it doesn't and my question is again, why.
str = 'ABC';
Q2. How's that not an incorrect semantic i.e. assigning a character (which is not really a character but essentially a string in single quotes) to a string?
Now, worse when I print the string, it always prints last character i.e C (I was expecting first character i.e. A):
cout << "str:" << str << endl;
Output:
str:C
Q3. Why was the last character printed, not first?
Considering a string, I insert a character (not using append()) using [] operator:
string str;
str[0] = 'a';
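You did not insert a character. operator[](size_type pos) returns a reference to the (already existing) character at pos. If pos == size(), then since C++11 it returns a reference to the null terminator, and writing anything other than '\0' through that reference is undefined behaviour (before C++11, the non-const access itself was undefined). Your string is empty, so size() == 0 and therefore str[0] = 'a' has undefined behaviour.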
Q1. What happened there? Why was a not printed in the first case?
The behaviour is undefined.
Now, I do something that should throw a compilation error but it doesn't and my question is again, why.
str = 'ABC';
Q2. How's that not an incorrect semantic i.e. assigning a character ... to a string?
Assigning a character to a string is not incorrect semantic. It sets the content of the string to that single character.
Q2. ... a character (which is not really a character but essentially a string in single quotes) ...
It is a multicharacter literal. The type of a multicharacter literal is int. If the compiler supports multicharacter literals, then the semantic is not incorrect.
There isn't an assignment operator for string that would accept an int. However, int is implicitly convertible to char, so the assignment operator that accepts a char is used after the conversion.
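char cannot necessarily represent all the values that int can, so the conversion may change the value. If char is a signed type and the value does not fit, the result is implementation-defined before C++20 and wraps modulo 2^N (N being the width of char) from C++20 on; it is not undefined behaviour.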
Q3. Why was the last character printed, not first?
The value of a multicharacter literal is implementation-defined. You'll need to consult the manual of your compiler to find out whether multicharacter literals are supported, and what value you should expect. Furthermore, you'll need to consider the fact that the char that the value is converted to probably cannot represent all values of int.
but I didn't get any warnings
Then consider getting a better compiler. This is what GCC warns:
warning: multi-character character constant [-Wmultichar]
str = 'ABC';
warning: overflow in implicit constant conversion [-Woverflow]
str[0] = 'a' should work with string just like it does with char str[] = "" (but it doesn't as we saw). Can you help me understand why [] operator has different behavior in dealing with array of characters than string?
Because that's how the standard has defined the behaviour and requirements of std::string.
char str[] = "";
Creates an array of size 1, consisting of the null terminator. This element of the array is like any other, and you can freely modify it:
str[0] = 'a';
This is well defined and OK. But now str no longer contains a null-terminated string, so trying to use it as such has undefined behaviour:
cout << "str:" << str << endl; // oops, str is not a null terminated string
So, std::string has been designed such that you cannot mess with the final null terminator - as long as you obey the requirements of std::string. Not allowing touching the null terminator also allows the implementation to never allocate a memory buffer for an empty string. Not allocating memory may be faster than allocating memory, so this is a good thing.
You should take a look at http://en.cppreference.com/w/cpp/string/basic_string/operator_at. Namely, the portion about "If pos == size(), the behavior is undefined."
The following line creates an empty string:
string str;
so size() will return 0.
Your statement string str; str[0] = 'a' is undefined behaviour, though the reason for this differs between "before C++11" and "from C++11 on". Note that str is a non-const string. Before C++11, even a (read) access like str[pos] with pos == size() on a non-const string yields undefined behaviour. From C++11 on, a read access is permitted (yielding a reference to the '\0' character). A modification, however, is still undefined behaviour.
So much for what cppreference says about std::basic_string::operator[].
But now let's explain the behaviour of a program similar to yours but with defined behaviour; (I'll use this then as analogy to describe the behaviour of your program):
string str = "bbbb";
const char* cstr = str.data();
printf("address: %p; content:%s\n", static_cast<const void*>(cstr), cstr);
// yields "address: 0x7fff5fbff5d9; content:bbbb"
str[0] = 'a';
const char* cstr2 = &str[0];
printf("address: %p; content:%s\n", static_cast<const void*>(cstr2), cstr2);
// yields "address: 0x7fff5fbff5d9; content:abbb"
cout << "str:" << str << endl;
// yields "str:abbb"
The program is almost self-explanatory, but note that str.data() gives a pointer to the internal data buffer, and str.data() returns the same address as &str[0].
If we now change the same program to your setting with an empty string, not much changes in the behaviour (although this behaviour is undefined, not safe, not guaranteed, and may differ from compiler to compiler):
string str; // is the same as string str = ""
const char* cstr = str.data();
printf("address: %p; content:%s\n", static_cast<const void*>(cstr), cstr);
// yields "address: 0x7fff5fbff5c1; content:"
str[0] = 'a'; // undefined behaviour: size() is 0
const char* cstr2 = &str[0];
printf("address: %p; content:%s\n", static_cast<const void*>(cstr2), cstr2);
// yields "address: 0x7fff5fbff5c1; content:a"
cout << "str:" << str << endl;
// yields "str:"
Note that str.data() returns the same address as &str[0] and that 'a' has actually been written to that address (if we have good luck, we do not access non-allocated memory, as an empty string is not guaranteed to have a buffer ready; maybe we have really good luck). So printing out str.data() actually gives you a (if we have additional luck that the character after 'a' is a string terminating char). Anyway, statement str[0]='a' does not increase string size, which is still 0, such that cout << str gives an empty string.
Hope this helps somehow.
string str;
Makes a string of length 0.
str[0] = 'a';
Sets the first element of the string to 'a'. Note that the length of the string is still 0. Also note there may not be space allocated to hold this 'a' and the program is broken at this point so further analysis is best guesses.
cout << "str:" << str << endl;
Prints the contents of the string. The string is length 0, so nothing prints.
cout << "str[0]:" << str[0] << endl;
reaches into undefined territories and tries to read back the previously stored 'a'. This won't work, and the result is undefined. In this case it gave the appearance of working, possibly the nastiest thing undefined behaviour can do.
str = 'ABC';
is not necessarily an error, as multicharacter literals exist, but it most likely will (though is not required to) result in a warning from the compiler, as it's probably a mistake.
cout << "str:" << str << endl;
Your guess is as good as mine as to what the compiler will do, since str = 'ABC'; was logically incorrect (although syntactically valid). The compiler seems to have truncated ABC to the last character, much like putting 257 into an 8-bit integer may result in preserving only the least significant 8 bits.
Learning C++. I just want to grab the first character in a string, then make a new string based on such character, and then print it out:
#include <iostream>
using namespace std;
int main(int argc, const char * argv[]) {
string name = "Jerry";
char firstCharacter = name.at(0);
string stringOfFirstCharacter = string(&firstCharacter);
cout << stringOfFirstCharacter;
return 0;
}
The output is:
J
Jerry
I don't really know why it is also printing Jerry. Why is that?
Your code has undefined behavior. The signature of the constructor that takes a pointer to char requires that it is a pointer to a null terminated string, which it is not in your case since it is a single character.
My guess is that the implementation you have uses the small object optimization, and that "Jerry" is small enough that it is stored inside the std::string object rather than dynamically allocated. The layout of the two objects in the stack happens to be first firstCharacter, then name. When you call std::string(&firstCharacter) it reads until it hits the first null character (inside the std::string buffer) and stops there.
You are constructing an std::string object from a char* (because you are taking the address of firstCharacter). A pointer to a character is not interpreted as a character itself by the constructor of std::string, but rather as a null-terminated string.
In this case, your program has Undefined Behavior, because the address of firstCharacter is not the address of the first character of a null-terminated string.
What you should be doing is:
string stringOfFirstCharacter(1, firstCharacter);
cout << stringOfFirstCharacter;
If you really want to create a one-character string. However, notice that in order to print the character to the standard output, you could have simply written:
cout << firstCharacter;
Or even:
cout << name.at(0);
With string(&firstCharacter), you are using the std::string constructor of the form
std::string( const char* s, const Allocator& alloc = Allocator() );
That form expects a pointer to a null-terminated array of characters. It is incorrect to pass a pointer to character(s) that are not null-terminated.
With your intention of initializing the string with 1 char, you should use the form:
string( 1, firstCharacter )
The string constructor you're using (the one that takes a char * argument), is intended to convert a C-style string into a C++ string object - not a single character. By passing it a single character you cause undefined behaviour.
In your specific case, there appears to not be a zero byte in memory after firstCharacter, so the constructor runs through and includes all of name along with it!