Difference between string.empty and string[0] == '\0' - c++

Suppose we have a string
std::string str; // some value is assigned
What is the difference between str.empty() and str[0] == '\0'?

C++11 and beyond
string_variable[0] is required to return the null character if the string is empty. That way there is no undefined behavior and the comparison still works if the string is truly empty. However, a string can also start with an embedded null character (for example std::string("\0Hi there", 9)), in which case the comparison is true even though the string is not empty. If you really want to know whether it's empty, use empty().
Pre-C++11
The difference is that if the string is empty then string_variable[0] has undefined behavior: there is no index 0. The exception is a const-qualified string, for which operator[] returns a null character.
string_variable.empty() on the other hand returns true if the string is empty, and false if it is not; the behavior won't be undefined.
Summary
empty() is meant to check whether the string/container is empty or not. It works on all containers that provide it and using empty clearly states your intent - which means a lot to people reading your code (including you).

Since C++11 it is guaranteed that str[str.size()] == '\0'. This means that if a string is empty, then str[0] == '\0'. But a C++ string has an explicit length field, meaning it can contain embedded null characters.
E.g. for std::string str("\0ab", 3), str[0] == '\0' but str.empty() is false.
Besides, str.empty() is more readable than str[0] == '\0'.

Other answers here are 100% correct. I just want to add three more notes:
empty is generic (every STL container implements this function), while operator[] with a size_t index only works with string objects and array-like containers. When dealing with generic STL code, empty is preferred.
Also, empty is pretty much self-explanatory, while == '\0' is not.
When it's 2 AM and you are debugging your code, would you prefer to see if(str.empty()) or if(str[0] == '\0')?
If only functionality mattered, we would all write in vanilla assembly.
There may also be a performance penalty involved. empty is usually implemented by comparing the size member of the string to zero, which is very cheap and easy to inline. Comparing against the first character might be heavier. First, since most implementations use the short string optimization, the program may first have to check whether the string is in "short mode" or "long mode", and branching can hurt performance. If the string is long, dereferencing its buffer may be costly if the string has not been touched for some time, because the dereference itself may cause a cache miss.

empty() is not implemented as looking for a null character at position 0; it is typically equivalent to
bool empty() const
{
return size() == 0 ;
}
This can give a different result from str[0] == '\0' when the string starts with an embedded null character.

Also, be careful which functions you use with such strings in C++11 or later:
#include <iostream>
#include <cstring> // std::strlen
#include <string>  // std::string
int main() {
std::string str("\0ab", 3);
std::cout << "The size of str is " << str.size() << " bytes.\n";
std::cout << "The size of str is " << str.length() << " long.\n";
std::cout << "The size of str is " << std::strlen(str.c_str()) << " long.\n";
return 0;
}
will return
The size of str is 3 bytes.
The size of str is 3 long.
The size of str is 0 long.

You want to know the difference between str.empty() and str[0] == '\0'. Let's follow the example:
#include<iostream>
#include<string>
using namespace std;
int main(){
string str, str2; // both strings are empty
str2 = "values"; //assigning a value to 'str2' string
str2[0] = '\0'; //assigning '\0' to str2[0], to make sure i have '\0' at 0 index
if(str.empty()) cout << "str is empty" << endl;
else cout << "str contains: " << str << endl;
if(str2.empty()) cout << "str2 is empty" << endl;
else cout << "str2 contains: " << str2 << endl;
return 0;
}
Output:
str is empty
str2 contains: alues
str.empty() tells you whether the string is empty, while str[0] == '\0' tells you whether index 0 of your string holds '\0'. Having '\0' at index 0 does not mean that the string is empty: str2 above still has length 6. Even a string of length 1 whose only character is '\0' is not empty. The two tests agree only when the string really is empty (since C++11) or when its first character is something other than '\0'.

A C++ string has the concept of being empty or not. Before C++11, str[0] on an empty non-const string is undefined; since C++11 it returns '\0'. Any index i with i < str.size() is always defined.
str[i] == '\0' is a concept of the C-string style. In a C-string, the last character is '\0' to mark the end of the string.
For a C-string you usually have to 'remember' the length of your string in a separate variable, or recompute it with strlen. In a C++ string you can assign '\0' to any position without changing the length.
Just a code segment to play with:
#include <cstring> // strlen
#include <iostream>
#include <string>
using namespace std;
int main(int argc, char* argv[]) {
char str[5] = "abc";
cout << str << " length: " << strlen(str) << endl;
cout << "char at 4th position: " << str[3] << "|" << endl;
cout << "char at 5th position: " << str[4] << "|" << endl;
str[4]='X'; // this is OK, since Cstring is just an array of char!
cout << "char at 5th position after assignment: " << str[4] << "|" << endl;
string cppstr("abc");
cppstr.resize(3);
cout << "cppstr: " << cppstr << " length: " << cppstr.length() << endl;
cout << "char at 4th position:" << cppstr[3] << endl;
cout << "char at 401th positon:" << cppstr[400] << endl;
// cppstr[3] is index size(): it yields '\0' since C++11.
// cppstr[400] is out of bounds: undefined behavior, which may or may not crash.
cppstr[0] = '\0';
str[0] = '\0';
cout << "After zero the first char. Cstring: " << str << " length: " << strlen(str) << " | C++String: " << cppstr << " length: " << cppstr.length() << endl;
return 0;
}
On my machine the output:
abc length: 3
char at 4th position: |
char at 5th position: |
char at 5th position after assignment: X|
cppstr: abc length: 3
char at 4th position:
char at 401th positon:?
After zero the first char. Cstring: length: 0 | C++String: bc length: 3

Related

Program picking '\0' even when it is not mentioned - Clarification

So, I am given to predict what this program will do:
int main()
{
char d[] = {'h','e','l','l','o'};
const char *c = d;
std::cout << *c << std::endl;
while ( *c ) {
c = c + 1;
std::cout << *c << std::endl;
if ( *c == '\0' )
std::cout << "Yes" << std::endl;
}
return 0;
}
From my understanding the code should never have printed Yes, as there is no \0 in the character array d[], so is the program picking up a garbage value? In short, this while loop should run forever. Is that right?
The proper answer to this question is that the program exhibits undefined behavior, because it goes past the end of the array.
Changing the program to use a string literal for initialization would change the behavior to "always prints Yes":
char d[] = "hello";
"In short, this while loop should run forever."
Once undefined behavior happens, all bets are off. However, commonly the program manages to find a zero byte in memory outside of d[], at which point it prints "Yes", and exits the loop.
Your code is an example where array d is not a string (more accurately, not a nul-terminated string), so using that array as a string is incorrect. All functions that work with char* strings and use \0 as the end-of-string marker will read outside the memory allocated for d, and sometimes a \0 is found there (no one knows beforehand where this \0 will be found). Once again, this is incorrect usage that can lead to errors related to array bounds violations.
Finally, the conditions of the if statement and the while loop are linked: (*c == '\0') is true exactly on the last iteration of while(*c){...}. Since it is very unlikely that the loop runs forever, "Yes" will in practice be printed eventually, although nothing guarantees it.
UPDATE:
Let's consider additionally the following example:
#include <cstring> // strlen
#include <iostream>
using namespace std;
int main()
{
char d1[] = { 'h', 'e', 'l', 'l', 'o' }; // no nul-terminator here
char d2[] = { 'h', 'e', 'l', 'l', 'o', '\0' };
char d3[] = "hello";
cout << "Memory allocated for d1 - " << sizeof(d1) << endl;
cout << "Length of string in d1 - " << strlen(d1) << endl;
cout << "Memory allocated for d2 - " << sizeof(d2) << endl;
cout << "Length of string in d2 - " << strlen(d2) << endl;
cout << "Memory allocated for d3 - " << sizeof(d3) << endl;
cout << "Length of string in d3 - " << strlen(d3) << endl;
return 0;
}
Output will be (for the second line not always exactly, but similar):
Memory allocated for d1 - 5
Length of string in d1 - 19
Memory allocated for d2 - 6
Length of string in d2 - 5
Memory allocated for d3 - 6
Length of string in d3 - 5
Here you can see 3 ways of initializing a char array. d3 is initialized with a string literal, where \0 is added because the value is in "". Array d1 has no nul-terminator, and as a result strlen returns a value greater than sizeof: \0 was found outside array d1.

length() of string is returning 0 for a C++ program

In the following code the length of the reversed string is being returned as zero which should not be the case:
int main() {
string for_reversal;
string reversed;
int i,j,length, r_length;
cout << "Enter the string : \n";
cin >> for_reversal;
cout << "Entered string is : " << for_reversal <<"\n";
cout << "String length is : " << for_reversal.length() << "\n";
length = for_reversal.length();
for (i=0; i<=length; i++)
{
reversed[i] = for_reversal[length - i-1];
cout << for_reversal[length-i] << "\t";
}
reversed[length+1]='\0';
cout << "\n";
r_length = reversed.length();
cout << "Reversed String length is : " << r_length << "\n";
cout << "Reversed String is : " << reversed;
return 0;
}
Not sure what's going wrong here.
A string of length length has valid characters only at indices 0 through length - 1. Your loop also runs with i equal to length, so it reads for_reversal[length - i - 1] at index -1, which is out of bounds and invokes undefined behavior.
Additionally, you cannot assign to positions in a string that are beyond its current size, yet you assign to reversed[i] without first resizing the string appropriately. This is a second source of undefined behavior.
With these two issues in mind, the behavior of your program is really not defined. However, the output does make sense if we ignore that: assigning through operator[] never changes the size the string records, so the length of reversed is still 0.
Change
reversed[i] = for_reversal[length - i-1];
to
reversed+=for_reversal[length - i-1];
The way you're doing it accesses the string out of its bounds, paving the way for all kinds of hell to break loose. The way proposed above appends to the string, allocating space as needed.
You could also reserve space in reversed beforehand (http://www.cplusplus.com/reference/string/string/reserve/) so it can hold the original string without reallocating (reversed.reserve(for_reversal.size())). Note that reserve only changes the capacity, not the size, so it does not make reversed[i] valid; you still have to append.
In any case, please note that you do not need to reserve space for the \0 character or append it yourself: std::string maintains its own terminator.
The problem is here:
for (i = 0; i <= length; i++) {
reversed[i] = for_reversal[length - i-1];
// ...
}
With the statement:
reversed[i]
You're actually trying to access the i-th element of the string, with i=0, 1, ..., length.
But at that moment, the object string reversed is an empty string. Indeed you've created it with:
string reversed;
That is the default constructor, which initializes an empty string.
In other words, reversed[i] accesses an undefined memory location, since reversed is empty and there is no i-th position in the string.
Solution
In order to reverse a string you may find this code useful:
std::string reversed;
for(auto rit = for_reversal.rbegin(); rit != for_reversal.rend(); ++rit) {
reversed.push_back(*rit);
}
This code should be safer than yours, since it uses iterators (in this case reverse iterators).
Note
Moreover there is no need to append the '\0' char at the end, because the class string already handles that.
The program has undefined behavior because the string reversed is empty
string reversed;
and you may not use the subscript operator
reversed[i] = for_reversal[length - i-1];
^^^^^^^^^^^
to assign values to the string.
Also there is no need to append the zero character to objects of type std::string.
Take into account that in the loop
for (i=0; i<=length; i++)
{
reversed[i] = for_reversal[length - i-1];
cout << for_reversal[length-i] << "\t";
}
when i is equal to length you will have
reversed[length] = for_reversal[-1];
^^^^^
The simplest way to create a reversed string is the following
#include <iostream>
#include <string>
int main()
{
std::string for_reversal;
std::cout << "Enter the string: ";
std::cin >> for_reversal;
std::string reversed( for_reversal.rbegin(), for_reversal.rend() );
std::cout << "\nReversed String length is : " << reversed.length() << "\n";
std::cout << "Reversed String is : " << reversed << std::endl;
return 0;
}
The program output is
Enter the string: Hello
Reversed String length is : 5
Reversed String is : olleH
If you want to write the loop yourself, then append to the initially empty string with operator += or the member function push_back instead of assigning through the subscript operator; reserving enough memory in the reversed string first avoids reallocations. The program can look, for example, like this
#include <iostream>
#include <string>
int main()
{
std::string for_reversal;
std::string reversed;
std::cout << "Enter the string: ";
std::cin >> for_reversal;
reversed.reserve( for_reversal.length() );
for ( std::string::size_type i = for_reversal.length(); i != 0; i-- )
{
reversed += for_reversal[i-1];
}
std::cout << "\nReversed String length is : " << reversed.length() << "\n";
std::cout << "Reversed String is : " << reversed << std::endl;
return 0;
}
The program output will be the same as shown above
Enter the string: Hello
Reversed String length is : 5
Reversed String is : olleH
First, initialize reversed as an empty string, i.e. string reversed = ""; (the default constructor already does this).
Then change the for loop:
for(int i=0;i<length;i++)
{
reversed+=for_reversal[length-1-i];
cout << for_reversal[length-1-i] << endl; // same index as the character just appended
}
int r_length=reversed.length();
cout << "reversed string length : " << r_length << endl;
cout << "Reversed string is : " << reversed << endl;
Hope you understand.

Extra characters on cstring when cout

I have a char[4] dataLabel that when I say
wav.read(dataLabel, sizeof(dataLabel));//Read data label
cout << "Data label:" <<dataLabel << "\n";
I get the output Data label:data� but when I loop through each char I get the correct output, which should be "data".
for (int i = 0; i < sizeof(dataLabel); ++i) {
cout << "Data label " << i << " " << dataLabel[i] << "\n";
}
The sizeof returns 4. I'm at a loss for what the issue is.
EDIT: What confuses me more is that essentially the same code from earlier in my program works perfectly.
ifstream wav;
wav.open("../../Desktop/hello.wav", ios::binary);
char riff[4]; //Char to hold RIFF header
if (wav.is_open()) {
wav.read(riff, sizeof(riff));//Read RIFF header
if ((strcmp(riff, "RIFF"))!=0) {
fprintf(stderr, "Not a wav file");
exit(1);
}
else {
cout << "RIFF:" << riff << "\n";
This prints RIFF:RIFF as intended.
You are missing a null terminator on your character array. Try making it 5 characters and making the last character '\0'. This lets the program know that your string is done without needing to know the size.
What is a null-terminated string?
The overload of operator<< for std::ostream for char const* expects a null terminated string. You are giving it an array of 4 characters.
Use the standard library string class instead:
std::string dataLabel;
See the documentation for istream::read; it doesn't append a null terminator, and you're telling it to read exactly 4 characters. As others have indicated, the << operator is looking for a null terminator so it's continuing to read past the end of the array until it finds one.
I concur with the other suggested answer of using std::string instead of char[].
Your char[] array is not null-terminated, but the << operator that accepts char* input requires a null terminator.
char dataLabel[5];
wav.read(dataLabel, 4); //Read data label
dataLabel[4] = 0;
cout << "Data label:" << dataLabel << "\n";
The variable dataLabel is defined like
char dataLabel[4];
so it has only four elements, which were filled with the characters { 'd', 'a', 't', 'a' } by the statement
wav.read(dataLabel, sizeof(dataLabel));
So this character array does not have the terminating zero that operator << requires when its argument is a character array.
Thus in this statement
cout << "Data label:" <<dataLabel << "\n";
the program has undefined behaviour.
Change it to
std::cout << "Data label: ";
std::cout.write( dataLabel, sizeof( dataLabel ) ) << "\n";

how to print a char array using a pointer only? (we don't know the size of the array)

I have a function with three parameters: a pointer to a character array (also known as a C-String), and two pointers to specific characters (we will assume that they point to characters in the C-String).
void stringPointerOperation(char* str, char* firstPtr, char* secondPtr)
{
cout << str << endl;
cout << "First character=" << *firstPtr << endl;
cout << "Second character =" << *secondPtr << endl;
}
Questions:
How do I print out the characters from firstPtr to the end of str?
How do I find out how many characters are between firstPtr and secondPtr?
Answer to question 1:
If your array of chars is properly formed, it is null-terminated (i.e., the last character is \0). Simply print characters until you get there:
while (*firstPtr != '\0') {
cout << *firstPtr; // print one character, staying on the same line
++firstPtr; // advance to the next character
}
cout << endl;
Answer to question 2:
If you are sure they are pointers to the same array of characters, simply subtracting them should work:
int charsBetween = secondPtr - firstPtr;

String size changes after appending a char

I wanted to test what the size of a string is after appending a char; the results I got are in the comments below.
I know that a string ends with the null character, but why are the results like that?
#include <iostream>
#include <string>
using namespace std;
int main(){
string a = "" + 'a'; //3
string b = "" + '1'; //2
string c = "a" + 'a'; //2
string d = "1" + '1'; //3
string e = "\0" + 'a'; //20
string f = "\0" + '1'; //1
string g = "a" + '\0'; //1
string h = "1" + '\0'; //1
string i = "" + '\0'; //0
string j = "" + '\0'; //0
cout << a.size() << endl;
cout << b.size() << endl;
cout << c.size() << endl;
cout << d.size() << endl;
cout << e.size() << endl;
cout << f.size() << endl;
cout << g.size() << endl;
cout << h.size() << endl;
cout << i.size() << endl;
cout << j.size() << endl;
return 0;
}
Your code is not doing what you think.
String literals decay to const char *, and char is an integer type. If you try to add them, the compiler finds that the simplest way to make sense of the expression is to convert the char to an int, so the result is pointer arithmetic on the string literal: e.g. ""+'a' refers to the 97th character in memory after the beginning of the string literal "" (if 'a' is represented by 97 on your platform).
This results in garbage being passed to the string constructor, which will store in the string being constructed whatever it finds at that memory location until it finds a \0 terminator. Hence the "strange" results you get (which aren't reproducible, since the exact memory layout of the string table depends on the compiler).
Of course all this is undefined behavior as far as the standard is concerned (you are accessing char arrays outside their bounds, apart from the cases where you add \0).
To make your code do what you mean, at least one of the operands must be of type string:
string c = string("a") + 'a';
or
string c = "a" + string("a");
so the compiler will see the relevant overloads of operator+ that involve std::string.
Most of your initializers have undefined behaviour. Consider, for example:
string a = "" + 'a';
You are adding a char to a char pointer. This advances the pointer by the numeric value of the char (its ASCII code on most platforms) and uses the resulting out-of-bounds pointer as a C string to initialize a.
To fix, change the above to:
string a = string("") + 'a';