I am new with strings in C++. I am just confused with the working of the code below (used to reverse a string).
std:: string rev;
for(int i= str.size()-1; i>=0; --i)
{
rev.push_back(str[i]);
}
std:: cout<<" Reversed= "<<rev<<endl;
The Problem is that the last character of a string is null character '\0'. So, when the loop runs for the first iteration, it should place a null character at the start of rev and one more thing, here the string rev may not be null terminated because '\0' is not assigned as the last character of string.
But when I execute the Program, it works fine. I know I am wrong while thinking all this. Can anyone explain the working please? I shall be glad and Thankful to you :)
The null terminator is not actually considered part of a std::string. It only comes into play when you call c_str(). So size() and length() do not include a terminator. And in fact you can put null characters into the middle of a std::string and everything will still work, except for c_str() (if your string may contain nulls you should use data() and size()).
The '/0' operator is added automatically, so you are looping only till the character just before '/0'. It's always there basically, so you are not seeing any changes.
Related
std::string resize causes strings that appear to be equal to no longer be equal. It can appear to be misleading when I hover over the variable in my debugger and they appear to hold the same value.
I think it comes down to the fact that I expected the == operator to stop at the first null character but it keeps going till the end of the size. I'm sure this is working as intended but I was stuck on an issue caused by this for a while so I wanted to see why you would keep comparing characters even after the first null character. thanks!
int main(void)
{
std::string test1;
test1.resize(10);
test1[0] = 'a';
std::string test2 = "a";
//they are not equal
bool same = (test1 == test2);
return 0;
}
test1 is the string "a\0\0\0\0\0\0\0\0\0". test2 is the string "a". They are not equal.
std::string can contain null characters. Its length is not the distance to the first null character. It does also guarantee that the memory buffer containing the characters of the string ends with an additional null character 1 beyond its length.
If you don't intend for the string to be longer but just want the memory, use std::string::reserve. Note that you cannot access elements beyond the end with [] legally, but pushing back or whatever won't cause any new memory allocations until you pass the reserve limit.
This is the intended behavior of std::string. Unlike a c-string a std::string can have as many null characters as you want. For instance "this\0 is\0 a\0 legal\0 std::string\0" would be legal to have as the contents for a std::string. You have to build it like
std::string nulls_inside("this\0 is\0 a\0 legal\0 std::string\0", sizeof("this\0 is\0 a\0 legal\0 std::string\0");
but you can also insert null characters into an existing std::string. In your case you're comparing
"a\0\0\0\0\0\0\0\0\0\0"
against
"a\0"
so it fails.
When I run the example code, the wordLength is 7 (hence the output 7). But my char array gets some really weird characters in the end of it.
wordLength = word.length();
cout << wordLength;
char * wordchar = new char[wordLength]; //new char[7]; ??
for (int i = 0; i < word.length(); i++) //0-6 = 7
{
wordchar[i] = 'a';
}
cout << wordchar;
The output: 7 aaaaaaa²²²²¦¦¦¦¦ÂD╩2¦♀
Desired output is: aaaaaaa... What is the garbage behind it?? And how did it end up there?
You should add \0 at the end of wordchar.
char * wordchar = new char[wordLength +1];
//add chars as you have done
wordchar[wordLength] = `\0`
The reason is that C-strings are null terminated.
C strings are terminated with a '\0' character that marks their end (in contrast, C++ std::string just stores the length separately).
In copying the characters to wordchar you didn't terminate the string, thus, when operator<< outputs wordchar, it goes on until it finds the first \0 character that happens to be after the memory location pointed to by wordchar, and in the process it prints all the garbage values that happen to be in memory in between.
To fix the problem, you should:
make the allocated string 1 char longer;
add the \0 character at the end.
Still, in C++ you'll normally just want to use std::string.
Use: -
char * wordchar = new char[wordLength+1]; // 1 extra for null character
before for loop and
wordchar[i] ='\0'
after for loop , C strings are null terminated.
Without this it keeps on printing, till it finds the first null character,printing all the garbage values.
You avoid the trailing zero, that's the cause.
In C and C++ the way the whole eco-system treats string length is that it assumes a trailing zero ('\0' or simply 0 numerically). This is different then for example pascal strings, where the memory representation starts with the number which tells how many of the next characters comprise the particular string.
So if you have a certain string content what you want to store, you have to allocate one additional byte for the trailing zero. If you manipulate memory content, you'll always have to keep in mind the trailing zero and preserve it. Otherwise strstr and other string manipulation functions can mutate memory content when running off the track and keep on working on the following memory section. Without trailing zero strlen will also give a false result, it also counts until it encounters the first zero.
You are not the only one making this mistake, it often gets important roles in security vulnerabilities and their exploits. The exploit takes advantage of the side effect that function go off trail and manipulate other things then what was originally intended. This is a very important and dangerous part of C.
In C++ (as you tagged your question) you better use STL's std::string, and STL methods instead of C style manipulations.
I've always had a question about null-terminated strings in C++/C. For example, if you have a character array like so:
char a[10];
And then you wanted to read in characters like so:
for(int i = 0; i < 10; i++)
{
cin >> a[i];
}
And lets in input the following word: questioner
as the input.
Now my question is what happens to the '\0'? If I were to reverse the string, and make it print out
renoitseuq
Where does the null-terminating character go? I thought that good programming practice was to always leave one extra character for the zero-terminating character. But in this example, everything was printed correctly, so why care about the null-terminating character? Just curious. Thanks for your thoughts!
There are cases where you're given a null-terminator, and cases where you have to ask for one yourself.
const char* x = "bla";
is a null-terminated C-style string. It actually has 4 characters - the 3 + the null terminator.
Your string isn't null-terminated. In fact, treating it as a null-terminated string leads to undefined behavior. If you were to cout << it, you'd be attempting to read beyond the memory you're allowed to access, because the runtime will keep looking for a null-terminator and spit out characters until it reaches one. In your case, you were lucky there was one right at the end, but that's not a guarantee.
char a[10]; is just like any other array - un-initialized values, 10 characters - not 11 just because it's a char array. You wouldn't expect int b[10] to contain 10 values for you to play with and an extra 0 at the end just because, would you?
Well, reading that back, I don't see why you'd expect that from a C-string as well - it's not all intuitive.
You are reading 10 chars, not a string. I assume that you also output 10 chars in reverse, so the 0-char plays no role, coz you dont use the array as string, but as an array of single chars...
char a[10] is ten characters, any of which can be a '\0'.
If you put "questioner" in there none of them are.
To get that you'd need a[11] and fill it with "questioner" and then '\0'.
If you were reversing it, you'd get the position of the first '\0' in a[?], reverse up to that and then add a null terminator.
This is a classic banana skin in C, unfortunately it still manages to get under your foot at the most inopportune of moments, even if you are all too familiar with it.
I can't understand what the '\0' in the two different place mean in the following code:
string x = "hhhdef\n";
cout << x << endl;
x[3]='\0';
cout << x << endl;
cout<<"hhh\0defef\n"<<endl;
Result:
hhhdef
hhhef
hhh
Can anyone give me some pointers?
C++ std::strings are "counted" strings - i.e., their length is stored as an integer, and they can contain any character. When you replace the third character with a \0 nothing special happens - it's printed as if it was any other character (in particular, your console simply ignores it).
In the last line, instead, you are printing a C string, whose end is determined by the first \0 that is found. In such a case, cout goes on printing characters until it finds a \0, which, in your case, is after the third h.
C++ has two string types:
The built-in C-style null-terminated strings which are really just byte arrays and the C++ standard library std::string class which is not null terminated.
Printing a null-terminated string prints everything up until the first null character. Printing a std::string prints the whole string, regardless of null characters in its middle.
\0 is the NULL character, you can find it in your ASCII table, it has the value 0.
It is used to determinate the end of C-style strings.
However, C++ class std::string stores its size as an integer, and thus does not rely on it.
You're representing strings in two different ways here, which is why the behaviour differs.
The second one is easier to explain; it's a C-style raw char array. In a C-style string, '\0' denotes the null terminator; it's used to mark the end of the string. So any functions that process/display strings will stop as soon as they hit it (which is why your last string is truncated).
The first example is creating a fully-formed C++ std::string object. These don't assign any special meaning to '\0' (they don't have null terminators).
The \0 is treated as NULL Character. It is used to mark the end of the string in C.
In C, string is a pointer pointing to array of characters with \0 at the end. So following will be valid representation of strings in C.
char *c =”Hello”; // it is actually Hello\0
char c[] = {‘Y’,’o’,’\0′};
The applications of ‘\0’ lies in determining the end of string .For eg : finding the length of string.
The \0 is basically a null terminator which is used in C to terminate the end of string character , in simple words its value is null in characters basically gives the compiler indication that this is the end of the String Character
Let me give you example -
As we write printf("Hello World"); /* Hello World\0
here we can clearly see \0 is acting as null ,tough printinting the String in comments would give the same output .
I'm doing an assignment for school and I seem to have a problem with other characters being in the array. (Amongst other problems...)
In one function, I declared the array..
const int MAXSIZE = 100;
char inFix[MAXSIZE];
And this code is used to put chars into the array
//loop to store user input until enter is pressed
while((inputChar = static_cast<char>(cin.get()))!= '\n')
{
//if function to decide if the input should be stored or not
if(isOperator(inputChar) || isdigit(static_cast<int>(inputChar)) || inputChar == '(' || inputChar == ')')
{
inFix[a] = inputChar; //stores input
a++;
}
}
At the end of this, I append the null character to the array though I wasn't sure if I should do this:
inFix[MAXSIZE] = '\0';
Or if I should've used strcat.. either way... in my next function, I use strcat to append a parenthesis to the end.
But I've been having problems with the code, so I ran a for loop to print what is inside the infix array at the beginning of my next function, just to see...
And I get this annoying beeping sound, and a string of weird characters like hearts, and music signs... and.. a whole list of odd characters. What could be the problem? Thanks.
EDIT: by the way, I input 9*4, and I run the for loop after i append the parenthesis, so at the beginning of the output, I get:
9*4) and then the string of odd characters...
so I ran a for loop to print what is inside the infix array at the beginning of my next function, just to see...
And I get this annoying beeping sound, and a string of weird characters like hearts, and music signs... and.. a whole list of odd characters. What could be the problem?
The problem is that you're printing out array elements which you never initialized. The answer you have currently accepted advises you to initialize all those elements. Although this will not cause errors it is a mistake to let this answer prevent you from fully understanding the problem you encountered.
Reconsider your code where you inserted a null character at the end of the array:
inFix[MAXSIZE] = '\0';
You apparently know that a null character has something to do with marking the end of the string, but you've mistaken how to do that correctly. Everything from the beginning of the array until a null character will be treated as part of your string. If you copy three characters from the input 9*4 into your array then you should only want those three characters to be seen as part of your string. What you do not want is for everything in the array past those three characters, up to MAXSIZE to also be treated as part of your string. So you need to put the end-of-string marker, the '\0', right after the characters you care about.
(BTW, inFix[MAXSIZE] = '\0'; not only puts the end-of-string marker at the end of the array, it puts it outside the array, which you are not allowed to do. The program will behave unpredictably.)
inFix[0] = '9';
inFix[1] = '*';
inFix[2] = '4';
inFix[3] = '\0'; // <-- this is where you need to put the end-of-string marker, because this is the end of the characters you care about.
Putting the end-of-string marker at the end of the array effectively does this:
inFix[0] = '9';
inFix[1] = '*';
inFix[2] = '4';
inFix[3] = ???
inFix[4] = ???
.
.
.
inFix[98] = ???
inFix[99] = ??? // annoying bell noise? musical note symbol?
inFix[100] = '\0'; // Yes, Please!
The reason initializing the array to all zeros (which can also be done like this char inFix[MAXSIZE] = {};, empty braces instead of a 0) worked for you is because that means that no matter where you stop writing characters you care about, the next character will be a '\0'. That position, right after the characters you care about, is the only place it matters.
Since the loop that's copying the characters knows exactly where it stops it also knows exactly where to insert the end-of-string marker. It would be easy to just insert a single '\0' in the correct place.
Try appending the '\0' to position a in the array instead - that is, exactly after the last char you've read. Otherwise you're putting your characters in the array, then a random sequence of what was in the array before, and then after that is the \0 (or, in this case, it's one after the end of the array, which is even worse).
You say you null terminate string using inFix[MAXSIZE] = '\0' as you currently understand this should converted into inFix[MAXSIZE - 1] = '\0' but in this case you say that you have an string that ended in last possible character!! so how you should add something( for example a parenthesis ) to it?? because it is full and it can't accept any more characters.
So may be this is better to use inFix[a] = '\0' because you know that a point to end of string, and now if a < MAXSIZE then you have enough space to add more items to the string otherwise you may increase MAXSIZE (using strcat for example) or show some error to the user.
You declared the array with:
char inFix[MAXSIZE];
At this point in time, the content of the array is not initialized. Printing out these uninitialized characters will lead to the behavior you see (strange characters, sounds, etc.).
However, the contents of that array are not initialized. If you want to initialize this array, change this line to:
char inFix[MAXSIZE] = {0};
This initialization syntax causes the elements in the array to be initialized. The first elements are initialized to the values provided in the brackets, while any values not in the brackets will be initialized to 0. In the case above, all of the array elements will be initialized to 0 (which happens to correspond to the NULL character \0).
Since all the characters are initialized to NULL in this case, you don't need to append the null character after your input.
That said, a better solution would be not to use a C-Style array at all, but a std::vector<char>, or simply a std::string, as pointed out above.