String changes its size after appending a char - C++

I wanted to see what happens to a string's size when I append a char, and the outcome is below.
I know that a string ends with the null character, but why are the results like this?
#include <iostream>
#include <string>
using namespace std;
int main(){
string a = "" + 'a'; //3
string b = "" + '1'; //2
string c = "a" + 'a'; //2
string d = "1" + '1'; //3
string e = "\0" + 'a'; //20
string f = "\0" + '1'; //1
string g = "a" + '\0'; //1
string h = "1" + '\0'; //1
string i = "" + '\0'; //0
string j = "" + '\0'; //0
cout << a.size() << endl;
cout << b.size() << endl;
cout << c.size() << endl;
cout << d.size() << endl;
cout << e.size() << endl;
cout << f.size() << endl;
cout << g.size() << endl;
cout << h.size() << endl;
cout << i.size() << endl;
cout << j.size() << endl;
return 0;
}

Your code is not doing what you think.
String literals decay to const char *, and char is an integer type. If you add them, the compiler promotes the char to int and performs pointer arithmetic on the string literal; e.g. ""+'a' points to the 97th character in memory after the beginning of the string literal "" (if 'a' is represented by 97 on your platform).
This results in garbage being passed to the string constructor, which will store in the string being constructed whatever it finds at that memory location until it finds a \0 terminator. Hence the "strange" results you get (which aren't reproducible, since the exact memory layout of the string table depends on the compiler).
Of course all this is undefined behavior as far as the standard is concerned (you are accessing char arrays outside their bounds, apart from the cases where you add \0).
To make your code do what you mean, at least one of the operands must be of type string:
string c = string("a") + 'a';
or
string c = "a" + string("a");
so the compiler will see the relevant overloads of operator+ that involve std::string.

Most of your initializers have undefined behaviour. Consider, for example:
string a = "" + 'a';
You are adding a char to a char pointer. This advances the pointer by the ASCII value of the char, and uses the resulting (undefined) C string to initialize a.
To fix, change the above to:
string a = string("") + 'a';

Related

Why does this C++ code print length 5, and why does the program terminate when I print the string?

char start = 'a';
string out=""+start;
cout<<out.length()<<endl;
First we store a character in the variable start, then we create a string initialized from start and print its length?
string out = "" + start;
This addition doesn't do what you think it does. 'a' is converted to an int according to its ASCII value, the pointer is moved by that many elements, and the string is then constructed from the resulting const char*. That is UB, because the pointer refers to an invalid memory location that you are trying to read.
"" is a const char[1] (holding'\0').
You are trying to add 'a' to it, so it decays into a pointer whose address then gets increased by 97 ((int)'a' == 97). You then initialize the std::string from this out-of-bounds address, which is undefined behavior.
To achieve what you want, you can use operator""s to turn the const char[1] into a std::string. The operator+ overloads for std::string are then used, and the concatenation is done correctly.
using namespace std::string_literals;
char start = 'a';
std::string out = ""s + start;
std::cout << "string: " << out << " - length: " << out.length() << std::endl;
output:
string: a - length: 1
Convert the char to a string and it works:
char start = 'a';
std::string s(1, start);
std::cout << s.length() << std::endl;

Program picking '\0' even when it is not mentioned - Clarification

So, I am given to predict what this program will do:
int main()
{
char d[] = {'h','e','l','l','o'};
const char *c = d;
std::cout << *c << std::endl;
while ( *c ) {
c = c + 1;
std::cout << *c << std::endl;
if ( *c == '\0' )
std::cout << "Yes" << std::endl;
}
return 0;
}
From my understanding, the code should never have printed Yes, as there is no \0 in the character array d[], so is the program picking up a garbage value? In short, this while should run infinite times. Is that right?
The proper answer to this question is that the program exhibits undefined behavior, because it goes past the end of the array.
Changing the program to use a string literal for initialization would change the behavior to "always prints Yes":
char d[] = "hello";
In short, this while should run infinite times.
Once undefined behavior happens, all bets are off. However, commonly the program manages to find a zero byte in memory outside of d[], at which point it prints "Yes", and exits the loop.
Your code is an example where the array d is not a string (more accurately, not a nul-terminated string), so using that array as a string is incorrect. That means all functions that work with char* strings and use \0 as the end-of-string marker go outside the memory allocated for d, and sometimes a \0 can be found out there (no one knows beforehand where this \0 will be found). Once again, this is incorrect usage that can lead to errors related to array bounds violations.
Finally, because the conditions of the if statement and the while loop are linked, in the sense that (*c == '\0') is true in the last iteration of the loop while(*c){...}, and there is a very low probability that while(*c){...} is infinite, "Yes" will be printed eventually.
UPDATE:
Let's consider additionally the following example:
#include <cstring>
#include <iostream>
using namespace std;
int main()
{
char d1[] = { 'h', 'e', 'l', 'l', 'o' }; // no nul-terminator here
char d2[] = { 'h', 'e', 'l', 'l', 'o', '\0' };
char d3[] = "hello";
cout << "Memory allocated for d1 - " << sizeof(d1) << endl;
cout << "Length of string in d1 - " << strlen(d1) << endl;
cout << "Memory allocated for d2 - " << sizeof(d2) << endl;
cout << "Length of string in d2 - " << strlen(d2) << endl;
cout << "Memory allocated for d3 - " << sizeof(d3) << endl;
cout << "Length of string in d3 - " << strlen(d3) << endl;
return 0;
}
The output will be (the second line may differ from run to run, because strlen reads past the end of d1):
Memory allocated for d1 - 5
Length of string in d1 - 19
Memory allocated for d2 - 6
Length of string in d2 - 5
Memory allocated for d3 - 6
Length of string in d3 - 5
Here you can see three ways of initializing a char array. d3 is initialized with a string literal, where \0 is added because the value is in "". Array d1 has no nul-terminator, and as a result strlen returns a value greater than sizeof: \0 was found outside the array d1.

Difference between string.empty and string[0] == '\0'

Suppose we have a string
std::string str; // some value is assigned
What is the difference between str.empty() and str[0] == '\0'?
C++11 and beyond
string_variable[0] is required to return the null character if the string is empty. That way there is no undefined behavior and the comparison still works if the string is truly empty. However you could have a string that starts with a null character ("\0Hi there") which returns true even though it is not empty. If you really want to know if it's empty, use empty().
Pre-C++11
The difference is that if the string is empty then string_variable[0] has undefined behavior; There is no index 0 unless the string is const-qualified. If the string is const qualified then it will return a null character.
string_variable.empty() on the other hand returns true if the string is empty, and false if it is not; the behavior won't be undefined.
Summary
empty() is meant to check whether the string/container is empty or not. It works on all containers that provide it and using empty clearly states your intent - which means a lot to people reading your code (including you).
Since C++11 it is guaranteed that str[str.size()] == '\0'. This means that if a string is empty, then str[0] == '\0'. But a C++ string has an explicit length field, meaning it can contain embedded null characters.
E.g. for std::string str("\0ab", 3), str[0] == '\0' but str.empty() is false.
Besides, str.empty() is more readable than str[0] == '\0'.
Other answers here are 100% correct. I just want to add three more notes:
empty is generic (every STL container implements this function), while operator[] with size_t only works with string objects and array-like containers. When dealing with generic STL code, empty is preferred.
Also, empty is pretty much self-explanatory, while == '\0' is not.
When it's 2AM and you are debugging your code, would you prefer to see if(str.empty()) or if(str[0] == '\0')?
If only functionality mattered, we would all write in vanilla assembly.
There may also be a performance penalty involved. empty is usually implemented by comparing the size member of the string to zero, which is very cheap, easy to inline, etc. Comparing against the first character might be heavier. First of all, since most implementations use the short string optimization, the program may first have to ask whether the string is in "short mode" or "long mode"; branching means worse performance. If the string is long, dereferencing it may be costly if the string was "ignored" for some time, and the dereference itself may cause a cache miss, which is costly.
empty() is not implemented as looking for a null character at position 0; it is simply
bool empty() const
{
return size() == 0 ;
}
which can give a different result from checking for a null character at position 0.
Also, beware of the functions you'll use if you use C++ 11 or later version:
#include <iostream>
#include <string>
#include <cstring>
int main() {
std::string str("\0ab", 3);
std::cout << "The size of str is " << str.size() << " bytes.\n";
std::cout << "The size of str is " << str.length() << " long.\n";
std::cout << "The size of str is " << std::strlen(str.c_str()) << " long.\n";
return 0;
}
will return
The size of str is 3 bytes.
The size of str is 3 long.
The size of str is 0 long.
You want to know the difference between str.empty() and str[0] == '\0'. Let's follow the example:
#include<iostream>
#include<string>
using namespace std;
int main(){
string str, str2; //both string is empty
str2 = "values"; //assigning a value to 'str2' string
str2[0] = '\0'; //assigning '\0' to str2[0], to make sure i have '\0' at 0 index
if(str.empty()) cout << "str is empty" << endl;
else cout << "str contains: " << str << endl;
if(str2.empty()) cout << "str2 is empty" << endl;
else cout << "str2 contains: " << str2 << endl;
return 0;
}
Output:
str is empty
str2 contains: alues
str.empty() tells you whether the string is empty, and str[0] == '\0' tells you whether index 0 of the string holds '\0'. A '\0' at index 0 does not mean that the string is empty. The two checks are only guaranteed to agree when the string's length is 0: then (since C++11) str[0] is '\0' and the string really is empty.
A C++ string has its own notion of being empty. Before C++11, str[0] on an empty non-const string is undefined; since C++11 it returns '\0'. str[0] refers to an actual character of the string only if the string has size >= 1.
str[i] == '\0' is a concept of the C-string style. In the implementation of C-string, the last character of the string is '\0' to mark the end of a C-string.
For C-string you usually have to 'remember' the length of your string with a separate variable. In C++ String you can assign any position with '\0'.
Just a code segment to play with:
#include <cstring>
#include <iostream>
#include <string>
using namespace std;
int main(int argc, char* argv[]) {
char str[5] = "abc";
cout << str << " length: " << strlen(str) << endl;
cout << "char at 4th position: " << str[3] << "|" << endl;
cout << "char at 5th position: " << str[4] << "|" << endl;
str[4]='X'; // this is OK, since Cstring is just an array of char!
cout << "char at 5th position after assignment: " << str[4] << "|" << endl;
string cppstr("abc");
cppstr.resize(3);
cout << "cppstr: " << cppstr << " length: " << cppstr.length() << endl;
cout << "char at 4th position:" << cppstr[3] << endl;
cout << "char at 401th positon:" << cppstr[400] << endl;
// cppstr[3] is index size(), which yields '\0' since C++11.
// cppstr[400] is out of bounds and undefined behaviour: it may
// cause a segmentation fault, but this may not happen every time.
cppstr[0] = '\0';
str[0] = '\0';
cout << "After zero the first char. Cstring: " << str << " length: " << strlen(str) << " | C++String: " << cppstr << " length: " << cppstr.length() << endl;
return 0;
}
On my machine the output:
abc length: 3
char at 4th position: |
char at 5th position: |
char at 5th position after assignment: X|
cppstr: abc length: 3
char at 4th position:
char at 401th positon:?
After zero the first char. Cstring: length: 0 | C++String: bc length: 3

calculate string length excluding the null character without using library functions in c++?

I am a c++ beginner and I am struggling to produce a program for the following assignment:
Create a project called “StringLength”. The main program should include a function (called findStringLength) which will calculate the length of a string (that is, the number of characters in the string, excluding the terminating null character).
The main program should test the operation of the function with the following test strings:
"Short string"
"A longer string used for test purposes"
""
" "
Declare four character arrays and assign these test values. The output of the program should take the form shown below for each test string:
Length of "Short" = 5
You should write the code to calculate the length of the string yourself; do not use any of the library functions to do this.
Note that if a quotation mark " is to be included in the string, it should be preceded by a backslash \ character – to prevent it from being interpreted as the end of the string:
"Quotation char \" in string"
will be displayed as:
Quotation char " in string
My code is as follows:
#include <iostream>
using namespace std;
size_t findStringLength (char*);
int main()
{
char n1[] = "Short string";
char n2[] = "A longer string is used for test purposes";
char n3[] = "";
char n4[] = " ";
int stringlength;
stringlength = findStringLength("Short string/");
cout << "Length of " << n1 << " = " << stringlength << endl;
stringlength = findStringLength("A longer string used for test purposes/");
cout << "\nLength of " << n2 << " = " << stringlength << endl;
stringlength = findStringLength("/");
cout << "\nLength of " << n3 << " = " << stringlength << endl;
stringlength = findStringLength(" /");
cout << "\nLength of " << n4 << " = " << stringlength << endl;
cout << "\n";
}
size_t findStringLength (char string[])
{
int i=0;
while(string[i])i++;
return i;
}
EDIT I now have the code shown above, which gives the correct output to a certain extent. Problem being I receive this
error:H:\StringLength\main.cpp:16: warning: deprecated conversion from string constant to 'char*' [-Wwrite-strings]
stringlength = findStringLength("Short string/");
^
I would recommend a simple while loop instead:
int string_length = 0;
while (input_str[string_length] != '\0') { string_length++; }
return string_length;
It iterates along the string until it hits a null character.
The function can be written in many ways. For example
size_t findStringLength( const char *s )
{
size_t n = 0;
while ( s[n] != '\0' ) ++n;
return n;
}
Take into account that you call the function for character arrays (as it is described in your assignment). So your function declaration
char findStringLength (char);
is wrong because the parameter is not declared as a character array or a pointer to first element of a character array. Also you specified a wrong return type.
These two function declarations are equivalent
size_t findStringLength( const char s[] );
size_t findStringLength( const char *s );
I would like to answer the following doubt of yours
I also receive a repetitive error in conversion from char* to char.
=> that's actually the error you are getting because of the way you are passing your string to the function which is declared as:
char findStringLength (char);
What you are passing is n1, n2, etc., which are arrays of char, not a single char. You should read more about passing arguments to functions in C++. In this case the array decays to a pointer to its first element, and that pointer is passed by value, so it is better to declare your function as:
int findStringLength (char*);
int findStringLength (char string[])
{
int i=0;
while(string[i])i++;
return i;
}

Am I incorrectly using atoi?

I was having some trouble with my parsing function, so I added some cout statements to show me the value of certain variables at runtime, and I believe that atoi is converting characters incorrectly.
Here's a short snippet of my code that's acting strangely:
c = data_file.get();
if (data_index == 50)
cout << "50 digit 0 = '" << c << "' number = " << atoi(&c) << endl;
the output for this statement is:
50 digit 0 = '5' number = 52
I'm calling this code within a loop, and what's strange is that it correctly converts the first 47 characters, then on the 48th character it adds a 0 after the integer, on the 49th character it adds a 1, on the 50th (seen here) it adds a 2, all the way up to the 57th character, where it adds a 9. Then it continues to convert correctly all the way down to the 239th character.
Is this strange or what?
Just to clarify a little more, I'll post the whole function. This function gets passed a pointer to an empty double array (ping_data):
int parse_ping_data(double* ping_data)
{
ifstream data_file(DATA_FILE);
int pulled_digits [4];
int add_data;
int loop_count;
int data_index = 0;
for (char c = data_file.get(); !data_file.eof(); c = data_file.get())
{
if (c == 't' && data_file.get() == 'i' && data_file.get() == 'm' && data_file.get() == 'e' && data_file.get() == '=')
{
loop_count = 0;
c = data_file.get();
if (data_index == 50)
cout << "50 digit 0 = '" << c << "' number = " << atoi(&c) << endl;
pulled_digits[loop_count] = atoi(&c);
while ((c = data_file.get()) != 'm')
{
loop_count++;
if (data_index == 50)
cout << "50 digit " << loop_count << " = '" << c << "' number = " << atoi(&c) << endl;
pulled_digits[loop_count] = atoi(&c);
}
add_data = 0;
for (int i = 0; i <= loop_count; i++)
add_data += pulled_digits[loop_count - i] * (int)pow(10.0,i);
if (data_index == 50)
cout << "50 index = " << add_data << endl;
ping_data[data_index] = add_data;
data_index++;
if (data_index >= MAX_PING_DATA)
{
cout << "Error parsing data. Exceeded maximum allocated memory for ping data." << endl;
return MAX_PING_DATA;
}
}
}
data_file.close();
return data_index;
}
atoi takes a string, i.e. a null terminated array of chars, not a pointer to a single char so this is incorrect and will get you unpredictable results.
char c;
//...
/* ... */ atoi(&c) /* ... */
Also, atoi doesn't provide any way to detect errors, so prefer strtol and similar functions.
E.g.
char *endptr;
char c[2] = {0}; // initialize c to all zeros
c[0] = data_file.get(); // c[1] remains the null terminator
long l = strtol(c, &endptr, 10);
if (endptr == c) {
// an error occurred: no digits were converted
}
atoi expects a null-terminated string as an input. What you are supplying is not a null-terminated string.
Having said that, it is always worth adding that it is very difficult (if at all possible) to use atoi properly. atoi is a function that offers no error control and no overflow control. The only proper way to perform string-representation-to-number conversion in C standard library is functions from strto... group.
Actually, if you need to convert just a single character digit, using atoi or any other string conversion function is overkill. As has already been suggested, all you need is to subtract the value of '0' from your character digit value to get the corresponding numerical value. The language specification guarantees that this is a portable solution.
Never mind, it was simply that I needed to convert the character into a string terminated by \0. I changed it to this code:
char buffer [2];
buffer[1] = '\0';
buffer[0] = data_file.get();
if (data_index == 50)
cout << "50 digit 0 = '" << buffer[0] << "' number = " << atoi(buffer) << endl;
and it worked.