Is a std::string with a null character possible? - c++

I initialized a C++ string with a string literal and replaced one char with the null character ('\0').
When printed with cout <<, the full string is printed and the null char prints as a blank.
When printed via c_str(), the output stops at the null char, as expected.
I'm a little confused. Does this behavior come from cout or from string?
#include <iostream>
#include <string>
int main(){
std::string a("ab0cd");
a[2] = '\0'; // '\0' is null char
std::cout << a << std::endl; // abcd
std::cout << a.c_str() << std::endl; // ab
}
Test it online.
I'm not sure whether the environment matters, but I'm working with VSCode on Windows 10.

First you can narrow down your program to the following:
#include <iostream>
#include <string>
int main(){
std::string a("ab0cd");
a[2] = '\0'; // replace '0' with '\0' (same result as NULL, just cleaner)
std::cout << a << "->" << a.c_str();
}
This prints
abcd->ab
That's because the length of a std::string is known, so all of its characters are printed and output does not stop at a null character. The null character '\0' (which has the same value as NULL; both are 0, with different types) is not printable, so you see only 4 characters. (This depends on your terminal; some might print a placeholder instead.)
A const char* (usually) represents a null-terminated string. When printing a const char*, its length is not known, so characters are printed until a null character is encountered.

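Contrary to what you seem to think, the end of a C++ string is not determined by a null terminator: a std::string stores its length, so a '\0' inside it is an ordinary character. (Since C++11 the buffer returned by c_str() and data() does have a '\0' after the last character, but that does not limit the contents.)
The difference in behavior comes from the << operator overloads.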
This code:
cout << a.c_str(); // a.c_str() is const char*
As explained here, uses the << overload for const char*: it prints a C-style char array and stops at the first null char (so the char array must be null-terminated).
This code:
cout << a; // a is string
As explained here, uses the << overload that comes with string: it prints a string object, which knows its own length internally and can contain null chars.

The end of a string is not marked by 0 (NULL) as with a plain char*; its size is kept internally in its member data, since it is a user-defined type (an instantiated object) rather than a primitive type. So
#include <iostream>
#include <string>
int main(){
std::string a("abc0d");
a[3] = 0; // 0 is the null char, same as '\0'
a.resize(2);
std::cout << a << std::endl; // ab
std::cout << a.c_str() << std::endl; // ab
}
Sorry for changing your code to make the point clearer; it now prints
ab
ab
Further reading: http://www.cplusplus.com/reference/string/string/find/index.html

Related

Char outputting random characters at the end of the sentence

#include <iostream>
#include <string.h>
using namespace std;
void crypt(char* sMsg)
{
cout << "Original Message: '" << sMsg << "'" << endl;
int length = strlen(sMsg);
char sMsg_Crypt[3][length];
/* sMsg_Cryp[3]
[0] CRYPT LETTERS, ASCII + 3
[1] INVERT CHAR
[2] HALF+ OF SENTENCE, ASCII - 1
*/
for (int i=0; i<length; i++)
{
if (isalpha((int)sMsg[i]))
sMsg_Crypt[0][i] = sMsg[i] + 3; // DO ASCII + 3
else
sMsg_Crypt[0][i] = sMsg[i];
}
cout << "Crypt[0]: '" << sMsg_Crypt[0] << "'" << endl;
}
int main()
{
char sMsg[256];
cin.getline(sMsg,256);
crypt(sMsg);
return 0;
}
Input:
Hello World! Testing the Cryptography...
Output:
Original Message: 'Hello World! Testing the Cryptography...'
Crypt[0]: 'Khoor Zruog! Whvwlqj wkh Fu|swrjudsk|...Çi­o'
Why is this Çi­o coming out?
For starters, variable-length arrays like this
int length = strlen(sMsg);
char sMsg_Crypt[3][length];
are not a standard C++ feature.
You could instead use an array of std::string objects, for example
std::string sMsg_Crypt[3];
Nevertheless, the problem is that the array sMsg_Crypt[0] does not contain a string. That is, you forgot to terminate the characters stored in the array with the zero character '\0'.
You could write after the for loop
sMsg_Crypt[0][length] = '\0';
provided that the array (if the compiler supports VLA) is declared like
char sMsg_Crypt[3][length+1];
Firstly, you can't define a char array like this in standard C++: char sMsg_Crypt[3][length];. That is because length is not a compile-time constant, and the dimensions of an array must be known at compile time. GCC and Clang accept it as a variable-length array extension; MSVC reports an error (IntelliSense flags it too). Since you know the maximum size beforehand (256), you can replace length with 256.
Secondly, you're using C++ and have access to std::string. So instead of a char buffer, use std::string. It would look something like this: std::string sMsg_Crypt[3];
Lastly, for a C-style string to be read correctly, it needs to be null-terminated, meaning the character after the last one must be '\0'. std::string takes care of that for you.

Why does printing the 'address of index n' of c style strings lead to output of substring

I'm rather new to C++ and while working with a pointer to a char array (C style string) I was confused by its behavior with the ostream object.
const char* items {"sox"};
cout << items << endl;
cout << items[0] << endl;
cout << *items << endl;
cout << &items << endl;
cout << &items[1] << endl;
Running this leads to:
sox
s
s
0x7fff2e832870
ox
In contrast to pointers to other data types, printing the variable doesn't output the address but the string as a whole. As I understand it, this is because the << operator is overloaded for char arrays to treat them as strings.
What I don't understand is why cout << &items[1] prints the string from index 1 onward (ox) instead of the address of the char at index 1. Is this also due to the << operator being overloaded, or what is the reason for this behavior?
The type of &items[1] is const char *. Therefore the const char * overload of operator << is used, which prints the string from index 1 onwards.
OTOH, the type of &items is const char **, for which no specific overload exists, so the address of items is printed (via the const void * overload).
Back in the olden days, when C ran the world, there was no std::string, and programmers had to make do with arrays of char to manage text. When C++ brought enlightenment (and std::string), old habits persevered, and arrays of char are still used to manage text. Because of this heritage, you'll find many places where arrays of char act differently from arrays of any other type.
So,
const int integers[] = { 1, 2, 3, 4 };
std::cout << integers << '\n';
prints the address of the first element in the array.
But,
const char text[] = { 'a', 'b', 'c', '\0' };
std::cout << text << '\n';
prints the text in the array text, up to the final 0: abc
Similarly, if you try to print addresses inside the array, you get different behavior:
std::cout << &integers[1] << '\n';
prints the address of the second element in the array, but
std::cout << &text[1] << '\n';
prints the text starting at the second character of the array: bc
And, as you suspected, that's because operator<< has an overload that takes const char* and copies text beginning at the location pointed to by the pointer, and continuing up to the first 0 that it sees. That's how C strings work, and that behavior carries over into C++.
items[1] is the second character of the array and its address, i.e. &items[1], is a pointer to the second character (with index 1) as well. So, with the same rule that you have mentioned for operator <<, the second character of the string till the end is printed.

std::cout << cstring; prints value of cstring elements, not cstring hex address. Why?

I understand that an array of chars is different to a cstring, due to the inclusion of a suffixing \0 sentinel value in a cstring.
However, I also understand that, in the case of a cstring, an array of chars, or any other type of array, the array identifier in the program is a pointer to the array.
So, below is perfectly valid.
char some_c_string[] = "stringy";
char *stringptr;
stringptr = some_c_string; // assign pointer val to other pointer
What I don't understand is why std::cout automatically assumes I want to output the value of each element in either a cstring, or an array of chars, rather than the hex address. For example:
char some_c_string[] = "stringy"; // got a sentinel val
char charArray[5] = {'H','e','l','l','o'}; // no space for sentinel val \0
char *stringptr;
stringptr = some_c_string;
int intArray[3] = {1, 2, 4};
cout << some_c_string << endl << charArray << endl
<< stringptr << endl << intArray << endl;
Will result in the output:
stringy
Hello
stringy
0xsomehexadd
So for the cstring and the char array, std::cout has given me the value of each element, rather than the hex address like with the int array.
I guess this became standard in C++ for convenience. But can someone please expand on 1) when this became standard, and 2) how std::cout differentiates between char arrays/cstrings and other arrays? I guess it uses sizeof() to see that it is an array of single bytes, and that the value of each array element is an ASCII int value, to identify an array of chars/cstring.
Thanks! :D
There is nothing fancy going on. operator<< has a special overload for const char* (which char* arguments also match), so that you can do std::cout << "Hello World";. It's been like that since day 1 of C++.
For pointers to other types, such as int*, the address is displayed (typically in hex) via the const void* overload.
If you want to display the address of a char*, simply cast it to void*, i.e.
std::cout << (void*)"Hello World";

Trying to reverse a string and getting a bus error

I am trying to reverse a string (but that's not the problem that I have). The problem is trying to change the value of the string array given a certain index. However, every time I try to change the value at the index, I get a bus error. Namely, Bus error: 10. I'm not sure what this means. Also, I tried str[0] = "a" but this also gives me a bus error. Any suggestions to fix this?
#include <iostream>
using namespace std;
void reverse(char* str){
str[0] = 'a';
}
int main(){
char* str = "hello";
reverse(str);
}
Allocate your string as an array on the stack and not as a pointer into a possibly read-only segment of your program.
char str[] = "hello";
First of all, this line should at least give you a warning:
char* str = "hello";
You are converting a string literal to a non-const char*, which was deprecated in C++03 and is not allowed since C++11; writing through that pointer is undefined behavior, which is where the bus error comes from.
To fix your code, use char str[] = "hello"; in main().
When you pass this array to reverse(), it decays to char*. Now to the question you asked in a comment on the previous answer:
But when I write cout << str << endl;, why does it print out "hello"? Shouldn't it print only the first character of the string since it points to the first element of the array?
It is because the << operator on std::cout is overloaded. If you give it a char* or const char*, it treats the operand as a pointer to (the first character of) a C-style string, and prints the contents of that string:
const char * str= "hello";
cout << str; // prints "hello"
If you give it a char value, it prints that value as a character:
cout << *str; // prints "h"
cout << str[0]; // prints "h"

how to print char array in c++

How can I print a char array that I initialize and then concatenate onto another char array? Please see the code below.
int main () {
char dest[1020];
char source[7]="baby";
cout <<"source: " <<source <<endl;
cout <<"return value: "<<strcat(dest, source) <<endl;
cout << "pointer pass: "<<dest <<endl;
return 0;
}
this is the output
source: baby
return value: v����baby
pointer pass: v����baby
basically i would like to see the output print
source: baby
return value: baby
pointer pass: baby
You haven't initialized dest
char dest[1020] = ""; //should fix it
You were just lucky that the 6th (random) value in dest happened to be 0. If the first 0 had been at the 1000th character, your return value would be much longer. If there had been no 0 within the 1020 bytes, strcat would have run past the end of the array. (Strictly speaking, reading the uninitialized array is already undefined behavior.)
Strings stored in char arrays must be terminated with 0; otherwise there's no telling where they end. Alternatively, you can make the string empty by explicitly setting its first character to 0:
char dest[1020];
dest[0] = 0;
Or you could initialize your whole array with 0's
char dest[1020] = {};
And since your question is tagged C++, I can't help noting that in C++ we use std::string, which saves you a lot of headaches. Operator + can be used to concatenate two std::strings.
Don't use char[]. If you write:
std::string dest;
std::string source( "baby" );
// ...
dest += source;
you'll have no problems. (In fact, your problem is due to the fact that strcat requires a '\0'-terminated string as its first argument, and you're giving it random data, which is undefined behavior.)
Your dest array isn't initialized, so strcat tries to append source to the end of dest, which is determined by a trailing '\0' character, but it's undefined where an uninitialized array might end (if it does at all).
So you end up printing more or less random characters until a '\0' character happens to occur.
Try this
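#include <cstdio>
#include <cstring>
#include <iostream>
using namespace std;
int main()
{
char dest[1020];
memset(dest, 0, sizeof(dest)); // zero the buffer so dest holds an empty string
char source[7] = "baby";
cout << "Source: " << source << endl;
strcat_s(dest, source); // MSVC array overload; returns an errno_t error code, not the string
cout << "return value: " << dest << endl;
cout << "pointer pass: " << dest << endl;
getchar(); // keep the console window open
return 0;
}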
Tested with VS 2010 Express.
Clear the memory with memset as soon as you declare dest; it's safer. Also, if you are using VC++, use strcat_s() instead of strcat().