Null terminator \1? - C++

I am doing some coding for a beginner C++ class I am taking. In the class, we have to take code submitted by another student and fix a bug they created. The code is as follows:
#include <iostream>
using namespace std;
int countChars(char *, char); // Function prototype
int main()
{
    const int SIZE = 51;   // Array size
    char userString[SIZE]; // To hold a string
    char letter;           // The character to count

    // Get a string from the user.
    cout << "Enter a string (up to 50 characters): ";
    cin.getline(userString, SIZE);

    // Get a character to count occurrences of within the string.
    cout << "Enter a character and I will tell you how many\n";
    cout << "times it appears in the string: ";
    cin >> letter;

    // Display the number of times the character appears.
    cout << letter << " appears ";
    cout << countChars(userString, letter) << " times.\n";
    return 0;
}

int countChars(char *strPtr, char ch)
{
    int times = 0; // Number of times ch appears in the string

    // Step through the string counting occurrences of ch.
    while (*strPtr != '\0') // ***** There was a one placed inside the null character, however, this is not a syntax error, but rather just incorrect.
    {
        if (*strPtr == ch) // If the current character equals ch...
            times++;       // ... increment the counter
        strPtr++;          // Go to the next char in the string.
    }
    return times;
}
The student changed the function so that the null terminator was written as \10, which caused neither a compile-time nor a runtime error. After playing with it, I found that it could also be \1 and still work. How is this possible? I am a complete noob, so I apologize if this is a stupid question, but I assumed that this was a boolean test and 1 was true and 0 was false. The question is: why will \10 and \1 work as the null terminator? Thank you in advance!

'\0' means "the character having the integer value 0." Similarly, '\10' is an octal escape sequence, meaning "the character having the integer value 8" (octal 10). That's why it's not a compilation error--only a logical error.
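For instance, a small self-contained sketch (assuming a typical ASCII environment) that prints the numeric values of these escapes:
#include <iostream>
int main()
{
    // Octal escapes consist of a backslash followed by one to three octal digits.
    std::cout << (int)'\0' << "\n";  // prints 0: the null character
    std::cout << (int)'\1' << "\n";  // prints 1: the same character as '\001'
    std::cout << (int)'\10' << "\n"; // prints 8: octal 10 is decimal 8 (Backspace)
}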

The "\0" is the only one to be called a NULL terminator. Even though "\1" or "10" or even "\103" work, only "\0" is referred to as the NULL terminator. I'll explain why.
In "\0" the 0 refers to the OCT value in the ascii table (see picture below). In the ascii table, there is no character whose OCT value is 0, therefore, if we try to use 0, it is referred to as a NULL terminator, because it points to the ground and nothing meaningful or useful.
Now, why does "\10" and "\1" work? Because these refer to OCT values 1 and 10, which can be mapped to a character on the ascii table, notably Start of Heading and Baskspace. Similarily, if you pick a OCT value that points to a letter, punctuation mark or a number, like for example, "\102" points to B, it will output B.
If you try an invalid number, like "\8", then it will simply output 8 on the screen, since it does not refer to a valid character on the ascii table.
See this code; it summarizes the different cases:
#include <iostream>
using namespace std;
int main(void)
{
    cout << "\0" << endl;   // This is NULL; prints nothing meaningful
    cout << "\8" << endl;   // Invalid OCT value; compiles with only a warning. Here, 8 will be output
    cout << "\102";         // Valid; points to B. Any valid OCT value will point to a character, except NULL.
}
EDIT: After doing some research, I noticed that octal escape sequences may be written with one to three octal digits. Therefore the NULL terminator can also be written as "\000", using the full three-digit octal form; the compiler can tell which octal value is being referred to even when fewer digits are written. In other words, "\1" is interpreted by the compiler the same as "\001". Just some extra information.
I found this information at: C++ Character Literals

Related

Cin input with a + causes next input to be a string

So I'm pretty new to C++ and I'm doing an assignment for my class. I ran into a problem when trying to check if an input is a string or a double/int, so I made a basic program to test it:
#include <iostream>
#include <string>
using namespace std;
int main(int argc, char *argv[])
{
    string hi;
    double hello;
    cin >> hello;
    if (!cin)
    {
        // Strings go here
        cin.clear();
        cin >> hi;
        cout << hi << endl;
    }
    else
    {
        cout << hello << endl;
    }
    cout << "Done!" << endl;
}
So it basically works when inputting a letter (such as "j" or "a") or a number, but when inputting "+" or "-" it waits for the next input and then forces it through the string block, even if it is a number. However, "*" and "/" are read as strings and don't cause that issue (I'm assuming since they aren't explicitly sign operators).
I assume I am probably missing something. Thank you
Edit: I am testing with single types at a time (such as 123, 1 , d, +) without mixing types, so there won't be any inputs that have a double and a string
As per user4581301's suggestion, I'll put in some example inputs and outputs:
Inputs : Outputs
"Hello" : "Hello"
123 : 123
"/" : "/"
"+" (input after: 2) : "2"
The problem
Your programme does not work exactly as intended because it doesn't take into account characters that are consumed but then lost.
Here are different cases that work as expected:
abc: the first char read is not numeric, so it's not consumed and cin fails fast. The second read reads every char present as a string.
123abc456: the 123 is read. When a is encountered, it is not consumed, since it's not a valid numeric character. Further reading stops, but a numeric value (123) was successfully read.
/123: the first char read is not numeric, so it's not consumed and cin fails. The second read reads every char present as a string.
-123 or +123: the first char is considered a valid numeric char and is read, and then the remainder is read, still as a numeric value. This works as expected if you consider that a double or an int can be signed. Depending on output formatting, the + might not appear in your output.
Here are the cases that do not work: if the first char is + or - but it is not followed by a valid numeric char. In this case the first char is consumed, but the next char makes the numeric input fail. The remaining chars are then read as string (except the first sign that was already consumed and is lost). Examples:
++123
+ 123 (the space ends the extraction of the double, which fails; the remainder is read as a string).
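To see the lost sign character in action, here is a small self-contained sketch; it uses std::istringstream in place of cin so the input is fixed inside the program:
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
int main()
{
    istringstream input("+ 123"); // behaves like cin fed the same characters
    double hello;
    string hi;
    input >> hello; // consumes the '+', then fails at the space
    if (!input)
    {
        input.clear();
        input >> hi;
        cout << hi << endl; // prints "123": the '+' is already gone
    }
}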
The solution
The easiest solution is to read the input as a string, then try to convert the string.
For example:
#include <iostream>
#include <string>
using namespace std;
int main()
{
    size_t processed;
    string hi;
    double hello;
    cin >> hi;
    try {
        hello = stod(hi, &processed);
        cout << "number: " << hello;
        if (processed < hi.size())
            cout << " (followed by something)";
        cout << endl;
    }
    catch (...) // should be more precise in catching, but it's enough for a proof of concept
    {
        cout << "string: " << hi << endl;
    }
}
If you want + and - to be treated as ordinary string characters rather than signs, it'd be easy to check that hi is non-empty and that hi[0] is a digit before trying the conversion.
Alternatively, you could also do a regex check of the string to see if it matches a numeric format.
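For instance, a sketch of the regex idea; the pattern below is an assumption, and should be adjusted to whatever numeric formats you want to accept:
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main()
{
    string hi;
    cin >> hi;
    // optional sign, digits, optional fractional part
    static const regex numeric(R"([+-]?\d+(\.\d+)?)");
    if (regex_match(hi, numeric))
        cout << "number: " << stod(hi) << endl; // safe: the pattern guarantees stod succeeds
    else
        cout << "string: " << hi << endl;
}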

Using toupper on char returns the ascii number of the char, not the character?

#include <cctype>
#include <iostream>
using namespace std;
int main()
{
    char hmm[1000];
    cin.getline(hmm, 1000);
    cout << hmm << endl; // this was to test if I could assign my input to the array properly
    for (int sayac = 0; hmm[sayac] != '#'; sayac++) {
        if (!isdigit(hmm[sayac])) {
            if (islower(hmm[sayac]))
                cout << toupper(hmm[sayac]);
            else if (isupper(hmm[sayac]))
                cout << tolower(hmm[sayac]);
            else
                cout << hmm[sayac];
        }
    }
}
"Write a program that reads keyboard input to the # symbol and that echoes the input
except for digits, converting each uppercase character to lowercase, and vice versa.
(Don’t forget the cctype family.) "
I'm doing this exercise from the Primer book. But when I run it, it prints the ASCII value of the char, not the uppercase/lowercase version of the character. I couldn't figure out the problem. Can someone tell me why, please?
(I may have other problems with the exercise; please don't correct them if I do. I want to fix them on my own, except the problem I explained, but I can't check the other ones while I have this one.)
When writing
std::cout << toupper('a');
the following happen:
int toupper(int ch) is called, and returns an integer whose value is 'A' (0x41).
std::basic_ostream::operator<<(std::cout, 0x41) is called; that is the int overload, since an int was provided.
Overall, it prints "65".
As a solution, you can cast back your upper case to a char:
std::cout << static_cast<char>(toupper('a'));
It's a question of representation. In memory there is no difference between a character and that character's numeric value; it's all in how you choose to display it. For example, the character 'a' is just a constant whose value is the character's numeric value.
The problem you are having is that std::toupper and std::tolower return an int rather than a char. One reason for that is that they handle EOF values, which are not necessarily representable by char. As a consequence, std::cout sees that you are trying to print an int and not a char. The standard behavior for streaming an int is to print the number. The solution is to cast your result to char to force the value to be interpreted as a character. You can use something like std::cout << static_cast<char>(std::toupper(hmm[sayac]));.
Try the following:
#include <cctype>
#include <iostream>
int main()
{
    char hmm[1000];
    std::cin.getline(hmm, 1000);
    std::cout << hmm << std::endl; // this was to test if I could assign my input to the array properly
    for (int sayac = 0; hmm[sayac] != '#'; sayac++) {
        if (!std::isdigit(hmm[sayac])) {
            if (std::islower(hmm[sayac]))
                std::cout << static_cast<char>(std::toupper(hmm[sayac]));
            else if (std::isupper(hmm[sayac]))
                std::cout << static_cast<char>(std::tolower(hmm[sayac]));
            else
                std::cout << hmm[sayac];
        }
    }
}
You should also consider using an std::string instead of an array of char of arbitrary length. Also, take note that you have undefined behavior if the input string does not contain #.
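For reference, a sketch of the same exercise using std::string and a range-based for loop instead of a raw array; stopping at '#' explicitly means a missing '#' is no longer undefined behaviour:
#include <cctype>
#include <iostream>
#include <string>
int main()
{
    std::string line;
    std::getline(std::cin, line);
    for (char c : line) {
        if (c == '#')
            break; // stop at '#', safely, even if it never appears
        if (std::isdigit(static_cast<unsigned char>(c)))
            continue; // skip digits
        if (std::islower(static_cast<unsigned char>(c)))
            std::cout << static_cast<char>(std::toupper(c));
        else if (std::isupper(static_cast<unsigned char>(c)))
            std::cout << static_cast<char>(std::tolower(c));
        else
            std::cout << c;
    }
}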

Storing data in char array causing corruption around variable

I am working on a C++ project and I am having an issue.
Below is my code
tempfingerprint = libssh2_hostkey_hash(session, LIBSSH2_HOSTKEY_TYPE_RSA);
char temp[48];
memset(temp, 0, sizeof(temp));
for (i = 0; i < 16; i++)
{
    //fingerprintstream << (unsigned char)tempfingerprint[i] << ":";
    if (temp[0] == 0)
    {
        sprintf(temp, "%02X:", (unsigned char)tempfingerprint[i]);
    }
    else
    {
        //sprintf(temp, "%s:%02X", temp, (unsigned char)tempfingerprint[i]);
        char characters[3];
        memset(characters, 0, sizeof(characters));
        // If less than 16, then add the colon (:) to the end; otherwise don't bother, as we're at the end of the fingerprint
        sprintf(characters, "%02X:", (unsigned char)tempfingerprint[i]);
        strcat(temp, characters);
    }
}
// Remove the end colon as it's not needed. Index 47 will already be null terminated, so the previous char will contain the last colon
temp[47] = 0;
return string(temp);
When I run my app, I get the following error from visual studio
Run-Time-Check Failure #2 - Stack around the variable 'temp' was corrupted.
I've run the same code on Linux through Valgrind and no errors were shown, so I'm not sure what the problem is on Windows.
Here's an approach based on what Paul McKenzie's talking about (though he might implement it differently), built on what it looks like you were trying to do with the stream:
#include <iostream>
#include <sstream>
#include <iomanip> // output format modifiers
using namespace std;
int main()
{
    stringstream fingerprintstream;
    // set up the stream to print uppercase hex with 0 padding if required
    fingerprintstream << hex << uppercase << setfill('0');
    // print out the first value without a ':'
    fingerprintstream << setw(2) << 0;
    for (int i = 1; i < 16; i++) // starting at 1 because the first has already been handled
    {
        // print out the rest, prepending the ':'
        fingerprintstream << ":" << setw(2) << i;
    }
    // print results
    std::cout << fingerprintstream.str();
    return 0;
}
Output:
00:01:02:03:04:05:06:07:08:09:0A:0B:0C:0D:0E:0F
Just realized what I think OP ran up against with the garbage output. When you output a number, << will use the appropriate conversion to get text, but if you output a character << prints the character. So fingerprintstream << (unsigned char)tempfingerprint[i]; takes the binary value at tempfingerprint[i] and, thanks to the cast, tries to render it as a character. Rather than "97", you will get (assuming ASCII) "a". A large amount of what you try to print will give nonsense characters.
Example: If I change
fingerprintstream << ":" << setw(2) << i;
to
fingerprintstream << ":" << setw(2) << (unsigned char)i;
the output becomes
0?:0?:0?:0?:0?:0?:0?:0?:0?:0?:0 :0
:0?:0?:0
:0?:0?
Note the tab and the line feeds.
I need to know the definition of tempfingerprint to be sure, but you can probably solve the garbage output problem by removing the cast.
Based on new information, tempfingerprint is const char *, so tempfingerprint[i] is a char and will be printed as a character.
We want a number, so we have to force the sucker to be an integer.
static_cast<unsigned int>(tempfingerprint[i]&0xFF)
The &0xFF masks out everything but the last byte, eliminating the sign extension that turns negative char values into huge positive numbers when displayed unsigned.
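A small sketch of why the mask matters, assuming char is signed (as it is with most x86 compilers):
#include <iostream>
int main()
{
    char c = static_cast<char>(0x9A); // a byte with the high bit set; negative if char is signed
    std::cout << std::hex << std::uppercase;
    std::cout << static_cast<unsigned int>(c) << "\n";        // FFFFFF9A on typical platforms: sign extension
    std::cout << static_cast<unsigned int>(c & 0xFF) << "\n"; // 9A: the mask keeps only the low byte
}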
There are, as far as I see, two issues in the code which lead to exceeding array boundaries:
First, with char temp[48] you reserve exactly 48 characters for storing results. However, when strcat(temp, characters) is called for the 16th value, temp comes to hold 16*3 digits/colons plus one terminating '\0' character, i.e. 49 characters (not 48). Note that strcat automatically appends the string-terminating char.
Second, you define char characters[3] such that you reserve space for the two digits and the colon, but not for the terminating '\0' character. Hence, sprintf(characters, "%02X:", ...) will exceed the array bounds of characters, as sprintf also appends the string terminator.
So, if you do not want to rewrite your code in general, changing your definitions to char temp[49] and char characters[4] will solve the problem.
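For illustration, a minimal sketch with just those two sizes corrected, keeping the sprintf/strcat structure; the 16 fingerprint bytes are stubbed out here, since the libssh2 session isn't available in a standalone example:
#include <cstdio>
#include <cstring>
#include <iostream>
#include <string>
int main()
{
    // Stand-in for the 16-byte hash returned by libssh2_hostkey_hash()
    unsigned char tempfingerprint[16];
    for (int i = 0; i < 16; i++)
        tempfingerprint[i] = static_cast<unsigned char>(i);

    char temp[49]; // 16 * 3 characters plus the terminating '\0'
    memset(temp, 0, sizeof(temp));
    for (int i = 0; i < 16; i++)
    {
        char characters[4]; // two hex digits, the colon, and the '\0'
        sprintf(characters, "%02X:", static_cast<unsigned int>(tempfingerprint[i]));
        strcat(temp, characters);
    }
    temp[47] = 0; // drop the trailing colon
    std::cout << std::string(temp) << "\n"; // 00:01:02:...:0F
}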

What is the length of my array?

Hello everyone, I'm having trouble with strlen and arrays; it keeps saying my string length is only one. If anyone could help it would be great. Here's my code:
#include <iostream>
using namespace std;
#include <cstring>
int main()
{
    char word1[20];
    int len = strlen(word1);
    cout << "enter a word!\n";
    cin.get(word1, 20, '\n'); cin.ignore(50, '\n');
    cout << len;
}
Just read the back and forth in the comments, updating my answer to try and give some more intuition behind what's going on.
char word1[20]; Sets aside a place in your computer's memory that can eventually hold up to 20 characters of data. Note that this statement alone does not "clear" the memory of whatever is currently there. As sfjac has pointed out, this means that literally anything could be in that space, and it's highly unlikely to be anything your code could readily understand.
int len = strlen(word1); Creates an integer and sets it equal to the number of characters currently in word1. Because we have not specified any content for word1, you're taking the length of whatever happened to be in that memory space already. You've reserved 20 characters, but strlen has no way of knowing that; in this case, the junk data in there happened to give you a length of 1.
cout << "enter a word!\n"; Prompt the user for a word
cin.get(word1, 20, '\n'); cin.ignore(50,'\n'); Get the word, store it in word1. At this point, word1 is now defined with actual content. However - you've already defined the variable len. The computer does not know to automatically redefine this for you. It follows the steps you provide, in order.
cout << len; Print the value stored in len. Because len was created prior to the user entering their data, len has absolutely nothing to do with what the user entered.
Hope this helps give you some intuition that will help beyond this one question!
@Chris is correct, but perhaps a small explanation is in order. When you declare a character array like char word1[20] on the stack, the array will not be initialized. The strlen function computes the length of the string by counting the number of characters from the address of word1 to the first null byte in memory, which could be pretty much anything.
I highly recommend using std::string for text.
If you must use character arrays:
Define a named identifier for the capacity.
Define the array using the named identifier.
The capacity should account for a terminating nul, '\0', character to mark the end of the maximum text length.
Using the above guidelines you have the simple program:
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <string>
using namespace std;
int main(void)
{
    std::string a_word_string;
    std::string line_of_text_string;
    const unsigned int c_string_capacity = 32U;
    char c_string[c_string_capacity];

    // The std::string functions
    cout << "Enter some text: ";
    getline(cin, line_of_text_string); // read a line of text
    cout << "\nEnter a word: ";
    cin >> a_word_string; // read a single whitespace-delimited word
    cin.ignore(10000, '\n'); // Ignore remaining text in the buffer.

    // The C-style string functions
    cout << "Enter more text: ";
    cin.getline(c_string, c_string_capacity); // reads at most capacity - 1 chars and null-terminates
    c_string[c_string_capacity - 1] = '\0'; // Insurance, force end of string character
    cout << "You entered " << (strlen(c_string)) << " characters.\n";
    return EXIT_SUCCESS;
}
The std::string class is more efficient and can handle dynamic size changes.
The length of the array is the value of c_string_capacity which was used when defining the array.
The length of the text in the array is defined as strlen(c_string), which is the number of characters before the terminating nul is found.
You have to calculate len after reading in word1, otherwise you are left with undefined behaviour.
char word1[20];
cout << "enter a word!\n";
cin.get(word1, 20, '\n'); cin.ignore(50,'\n');
int len = strlen(word1);
cout << len;
It's a good idea to always initialize objects when you declare them, since objects with automatic storage duration are not guaranteed to be initialized.
In C++11 for example, you can do this:
char arr[10]{};  // this will value-initialize every element of the array (to '\0' for char).
char arr[10]{0}; // the same.
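A quick sketch of the difference that initialization makes here:
#include <cstring>
#include <iostream>
int main()
{
    char uninit[20];   // contents indeterminate; calling strlen(uninit) would be undefined behaviour
    char zeroed[20]{}; // every element value-initialized to '\0'
    (void)uninit;      // not used; shown only for contrast
    std::cout << strlen(zeroed) << "\n"; // prints 0: the first byte is already '\0'
}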

What is difference between my atoi() calls?

I have a big number stored in a string and am trying to extract a single digit. But what are the differences between these calls?
#include <cstdlib>
#include <iostream>
#include <string>
int main(){
    std::string bigNumber = "93485720394857230";
    char tmp = bigNumber.at(5);
    int digit = atoi(&tmp);
    int digit2 = atoi(&bigNumber.at(5));
    int digit3 = atoi(&bigNumber.at(12));
    std::cout << "digit: " << digit << std::endl;
    std::cout << "digit2: " << digit2 << std::endl;
    std::cout << "digit3: " << digit3 << std::endl;
}
This will produce the following output.
digit: 7
digit2: 2147483647
digit3: 57230
The first one is the desired result. The second one seems to me to be a random number, which I cannot find in the string. The third one comes from the end of the string, but it is not just the single digit I expected; it is everything from the 12th index to the end of the string. Can somebody explain the different outputs to me?
EDIT: Would this be an acceptable solution?
char tmp[2] = {bigNumber.at(5), '\0'};
int digit = atoi(tmp);
std::cout << "digit: " << digit << std::endl;
It is all more or less explicable.
int main(){
    std::string bigNumber = "93485720394857230";
    char tmp = bigNumber.at(5);
    int digit = atoi(&tmp);
These lines copy the single character '7' into the character variable tmp. atoi happens to convert this correctly here, but atoi expects its parameter to be a valid zero-terminated string, and &tmp is only a pointer to a lone character variable, so the behaviour of this call is undefined: the content of the memory immediately following the character is unknown. To be correct, you would have to create a null-terminated string and pass that in.*
    int digit2 = atoi(&bigNumber.at(5));
This line gets a pointer to the character at position 5 in the string. That is a pointer into the original big-number string above, so the string parameter to atoi looks like "720394857230". atoi clearly overflows trying to turn this into an integer, since no 32-bit integer can hold it.
    int digit3 = atoi(&bigNumber.at(12));
    ...
}
This line gets a pointer into the string at position 12. The parameter to atoi is the string "57230", which is converted into the integer 57230 correctly.
Since you are using C++, there are nicer methods to convert strings of characters into integers. One that I am partial to is the Boost lexical_cast library. You would use it like this:
char tmp = bigNumber.at(5);
// convert the character to a string, then to an integer
int digit = boost::lexical_cast<int>(std::string(1, tmp));
// this copies the single character at position 5 into a string and then attempts conversion;
// if the conversion fails, then a bad_lexical_cast is thrown
int digit2 = boost::lexical_cast<int>(std::string(1, bigNumber.at(5)));
* Strictly, atoi will scan through the numeric characters until a non-numeric one is found. It is clearly undefined when it would find one and what it will do when reading over invalid memory locations.
I know why the 2nd number is displayed.
From the atoi reference.
If the correct value is out of the range of representable values, INT_MAX or INT_MIN is returned.
2147483647 is INT_MAX
bigNumber.at() doesn't return a new string with a single character; it returns a reference to a character in the string, so taking its address gives a pointer into the original string. So the second call is actually:
atoi("720394857230")
which causes the internal algorithm to overflow.
Also, the first call is very dangerous since it depends on the (random) value in memory at (&tmp)+1.
You have to allocate a string with two characters, assign the single character from bigNumber.at() to the first and \0 to the second and then call atoi() with the address of the temporary string.
The argument to atoi should be a zero-terminated string.
The function at gives a reference to a char in the string (so taking its address yields a pointer into the string). The function atoi converts a whole string to an int, not just one char.
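If all you need is a single digit, a couple of standard-only alternatives avoid atoi entirely. A sketch, assuming the string contains only digits:
#include <iostream>
#include <string>
int main()
{
    std::string bigNumber = "93485720394857230";
    int digit = bigNumber.at(5) - '0';            // '7' - '0' == 7; the digits '0'..'9' are contiguous
    int digit3 = std::stoi(bigNumber.substr(12)); // "57230" -> 57230
    std::cout << "digit: " << digit << "\n";
    std::cout << "digit3: " << digit3 << "\n";
}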