Why '1' and (char)1 are not equal when compared in c++? - casting

My main goal is to convert int to char type. I used (char)1 to type cast, but it doesn't seem to work due to the following result:
When I compare '1' and (char)1 in c++ in the following code
if ('1' == (char)1)
{
return 1;
}
However, it seems that the comparison is either invalid due to different variable type or they are actually not the same thing. I always thought converting integer 1 to character is (char)1. Can anyone tell me how I can convert integer 1 to char '1'?

'1' is equal to (char)49 according to http://www.asciitable.com/
(char)1 is equal to SOH (start of heading) which is a non-printable character.

Because the ASCII equivalent of '1' is 49, not 1.

'1' is the character code for the printable digit 1: traditionally its ASCII value, but today the code point in whatever character set is in use.
The old trick is (ch - '0') to get the numeric value.
Depending on the language, use a conversion function for a full string:
C++ - stoi, stol, strtol or stringstream
C - atoi or atol (these work in C++ too)
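The two approaches above (the (ch - '0') trick for one character and a library function for a full string) could look like this in C++. This is only a minimal sketch: the helper names digit_value and parse_int are made up for illustration, and the first one assumes its argument really is a digit.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: numeric value of a single digit character.
// Precondition (assumed): ch is one of '0'..'9'.
int digit_value(char ch) {
    assert(ch >= '0' && ch <= '9');
    return ch - '0';  // e.g. '7' - '0' == 7
}

// Hypothetical helper: parse a whole decimal string with std::stoi.
// std::stoi throws std::invalid_argument if no conversion can be done.
int parse_int(const std::string& text) {
    return std::stoi(text);
}
```

For a single character the subtraction is enough; for anything longer, std::stoi also handles the sign and reports errors through exceptions.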

As ibiza said, char(49) is in fact what '1' is. This is because char values are interpreted as character codes from the ASCII table.

Because when you do (char)X with X a number, you are just converting X into the range of a char, either -128 to 127 or 0 to 255 (like a modulo).
For example, (char)300 gives 44 (because 300 % 256 = 44) and (char)1 gives 1. As said in the other answers, 1 is the ASCII code of SOH (Start of Heading), not of the character '1'.
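To make the points above concrete, here is a small sketch. It assumes 8-bit chars (checked with a static_assert) and uses unsigned char for the narrowing example, because that conversion is defined to wrap modulo 2^8. The helper names narrow and digit_to_char are hypothetical.

```cpp
#include <cassert>
#include <climits>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit chars");

// Converting to unsigned char reduces the value modulo 2^CHAR_BIT.
unsigned char narrow(int value) {
    return static_cast<unsigned char>(value);
}

// Hypothetical helper answering the original question:
// turns the integer 1 into the character '1' by offsetting from '0'.
// Precondition (assumed): 0 <= digit <= 9.
char digit_to_char(int digit) {
    assert(digit >= 0 && digit <= 9);
    return static_cast<char>('0' + digit);
}
```

With these, narrow(300) yields 44 and digit_to_char(1) yields '1', whereas (char)1 stays the control character with code 1.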

Related

Incrementing a uint8_t variable, strange outcome

In a C++ class I've the following code/while loop:
uint8_t len = 0;
while (*s != ',') {
len = (uint8_t)(len + 1u);
++s;
}
return (len);
The outcome should be a value between 0 and max 20.
As I received a strange outcome, I started debugging. When I step through this,
I get the following values for the variable len:
'\01', '\02', '\03', '\04', '\05', '\06', '\a', '\b', '\t'
I don't understand the change from '\06' to '\a'!
Can somebody explain this? I expect that the Len value is simply increased by 1 until character array pointer s hits the ',' char.
The values are correct, but your debugger interprets them as char type, not an integer type.
You can see escape sequences used in C++ here (and the corresponding values in ASCII).
\01 - 1 in octal, 1 in decimal
\02 - 2 in octal, 2 in decimal
...
\06 - 6 in octal, 6 in decimal
\a - equivalent to \07, the ASCII code to use the computer bell
\b - equivalent to \010 (10 octal, 8 decimal), the ASCII code for "backspace" character
\t - equivalent to \011 (11 octal, 9 decimal), the ASCII code for tabulator
etc.
I don't know if you can change the way your debugger interprets the data. Worst case, you can always print the value after casting it to int.
(gdb) p static_cast<int>(len)
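The same effect shows up outside the debugger when a uint8_t is streamed. The sketch below assumes uint8_t is an alias for unsigned char, which is the case on common platforms but is not spelled out by the question; the function names as_char and as_number are invented for the example.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Streams the value unchanged: with uint8_t being unsigned char,
// operator<< writes it as a character, not as a number.
std::string as_char(std::uint8_t value) {
    std::ostringstream out;
    out << value;
    return out.str();
}

// Casting to int first makes the stream write the decimal number.
std::string as_number(std::uint8_t value) {
    std::ostringstream out;
    out << static_cast<int>(value);
    return out.str();
}
```

For the value 7, as_char produces the single bell character '\a', while as_number produces the text "7".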

Why is '1' + '1' = 98 and '1' + 1 = 50? [duplicate]

I'm coming from high-level languages (PHP, JS and the like), so this seems strange to me.
Whether I run it locally or online, I always get this result.
I suppose this result is because '2' is 50 in ASCII and 98 is 'b', but I'm not sure. Also, I don't really understand how the conversion works.
The code is here:
#include <iostream>
#include <string>
int main()
{
std::cout << '1' + 1 << '\n';
std::cout << '1' + '1' << '\n';
}
Type char is an integral type. Each character maps to an integer value. The value depends on the encoding used, which in your case is probably ASCII. So the character '1' probably has an integer value of 49, and the '1' + '1' expression is therefore equivalent to 49 + 49 and results in 98. Adding the integer value 1 to 49 results in 50, which is the same as adding the integer value 1 to (the value represented by) the character '1'.
In a nutshell, values are values, whether represented via character literals or integer literals.
'1' is a char constant with a specific value determined by the encoding used on your system. That encoding might be ASCII, but it might not. When used as an argument to +, it is promoted to an int. So decltype('1') is a char, but decltype('1' + '1') is an int.
On your system, it's clear that '1' has the value 49. That's why '1' + '1' is 98. And therefore '1' + 1 is 50.
Note that in C, '1' is an int type. Arguably that's less confusing than the way C++ has it.
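The promotion described above can be checked at compile time. This is a small sketch; the numeric results in the comments assume an ASCII-compatible encoding, and the function names are made up for the example.

```cpp
#include <type_traits>

// A character literal has type char in C++ ...
static_assert(std::is_same<decltype('1'), char>::value,
              "'1' is a char");
// ... but both operands of + are promoted to int before the addition.
static_assert(std::is_same<decltype('1' + '1'), int>::value,
              "char + char yields int");
static_assert(std::is_same<decltype('1' + 1), int>::value,
              "char + int yields int");

int sum_of_ones() { return '1' + '1'; }   // 49 + 49 with ASCII
int one_plus_one() { return '1' + 1; }    // 49 + 1 with ASCII
```

Because the result is an int, std::cout prints the number rather than a character; casting the result back with static_cast<char> would print 'b' and '2' respectively under ASCII.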

Difference between converting int to char by (char) and by ASCII

I have an example:
int var = 5;
char ch = (char)var;
char ch2 = var+48;
cout << ch << endl;
cout << ch2 << endl;
I had some other code. (char) returned wrong answer, but +48 didn't. When I changed ONLY (char) to +48, then my code got corrected.
What is the difference between converting int to char by using (char) and +48 (ASCII) in C++?
char ch=(char)var; has the same effect as char ch=var; and assigns the numeric value 5 to ch. You're using ASCII (supported by all modern systems) and ASCII character code 5 represents Enquiry 'ENQ' an old terminal control code. Perhaps some old timer has a clue what it did!
char ch2 = var+48; assigns the numeric value 53 to ch2 which happens to represent the ASCII character for the digit '5'. ASCII 48 is zero (0) and the digits all appear in the ASCII table in order after that. So 48+5 lands on 53 (which represents the character '5').
In C++ char is an integer type. The value is interpreted as representing an ASCII character but it should be thought of as holding a number.
Its numeric range is either [-128,127] or [0,255]. That's because C++ requires sizeof(char)==1 and all modern platforms have 8 bit bytes.
NB: C++ doesn't actually mandate ASCII, but again that will be the case on all modern platforms.
PS: I think it's an unfortunate artifact of C (inherited by C++) that sizeof(char)==1 and there isn't a separate fundamental type called byte.
A char is simply the smallest integral type in C++. Output facilities such as cout and printf map char values to the corresponding characters. On Windows computers this mapping is typically ASCII.
Note that the value 5 in ASCII maps to the Enquiry character, which is not printable, while the value 53 maps to the printable character 5.
A generally accepted hack to store a number 0-9 in a char is: const char ch = var + '0'. It's important to note the shortcomings here:
1. A hard-coded 48 only works on ASCII-compatible character mappings, which is why '0' is preferable: the C++ standard guarantees that the characters 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9 are laid out contiguously and in order, whatever their actual codes are
2. If var is outside the 0 - 9 range, var + '0' will map to something other than a digit character
A way to get the most significant digit of a non-negative number that is independent of 1 and 2 is to use:
const auto ch = to_string(var).front()
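A hedged sketch of both techniques follows. The names digit_char and leading_char are hypothetical, and returning '?' for out-of-range input is just one possible policy chosen for the example, not something the answer prescribes.

```cpp
#include <string>

// Hypothetical helper: digit character for 0..9.
// Returns '?' (an arbitrary placeholder) when var is out of range.
char digit_char(int var) {
    if (var < 0 || var > 9) {
        return '?';
    }
    return static_cast<char>(var + '0');
}

// First character of the decimal text of var.
// For a negative var this is the minus sign, not a digit.
char leading_char(int var) {
    return std::to_string(var).front();
}
```

For example, digit_char(5) gives '5', digit_char(12) gives '?', and leading_char(512) gives '5'.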
Generally char represents a number, as int does. Casting an int value to char doesn't provide its ASCII representation.
The ASCII codes for digits range from 48 (== '0') to 57 (== '9'). So to get the printable digit you have to add '0' (or 48).
The difference is that the cast (char) explicitly converts the value to a char, and adding 48 does not.
It's important to note that an int is typically 32 bits and a char is typically 8 bits. This means that the number you can store in a char is from -128 to +127 (or 0 to 255, i.e. 2^8 - 1, if you use unsigned char) and in an int from -2,147,483,648 (-2^31) to 2,147,483,647 (2^31 - 1) (or 0 to 2^32 - 1 for unsigned).
Adding 48 to a value is not changing the type to char.
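The ranges and the type remark above can be checked with std::numeric_limits and decltype. This is a sketch under the usual assumptions of 8-bit chars and two's complement; the constant names are invented for the example.

```cpp
#include <limits>
#include <type_traits>

// Ranges as reported by the implementation.
// The values in the comments assume an 8-bit, two's complement char.
constexpr int signed_char_min = std::numeric_limits<signed char>::min();     // -128
constexpr int signed_char_max = std::numeric_limits<signed char>::max();     //  127
constexpr int unsigned_char_max = std::numeric_limits<unsigned char>::max(); //  255

// int + int stays int; only assigning the result to a char narrows it.
static_assert(std::is_same<decltype(int{} + 48), int>::value,
              "adding 48 does not change the type to char");
static_assert(std::is_same<decltype(static_cast<char>(int{})), char>::value,
              "the cast produces a char");
```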

Casting an int to a char. Not storing the correct value

I'm trying to store a number as a character in a char vector named code
code->at(i) = static_cast<char>(distribution(generator));
However, it is not storing the value the way I think it should.
Shouldn't '\x4' be the ASCII value for 4? If not, how do I achieve that result?
Here's another vector whose values were entered correctly.
The cast only changes the type; it keeps the numeric value rather than converting it to the corresponding digit character. You need:
code->at(i) = distribution(generator) + '0';
No. \xN does not give you the ASCII code for the character N.
\xN is the ASCII character† whose code is N (in hexadecimal form).
So, when you write '\x4', you get the [unprintable] character with the ASCII code 4. Upon conversion to an integer, this value is still 4.
If you wanted the ASCII character that looks like 4, you'd write '\x34', because 0x34 (52 in decimal) is 4's ASCII code. You could also get there using some magic, based on the digits in ASCII being contiguous and starting from '0':
code->at(i) = '0' + distribution(generator);
† Ish.
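As a sketch of the fix applied to a whole vector: the function name digits_to_chars is hypothetical, the input is assumed to contain only values 0 to 9 (as a digit distribution would produce), and the static_asserts hold only on an ASCII-compatible encoding.

```cpp
#include <cassert>
#include <vector>

// These two checks assume an ASCII-compatible execution character set.
static_assert('\x34' == '4', "0x34 is the code of the character 4");
static_assert('\x4' == 4, "'\\x4' is the control character with code 4");

// Hypothetical helper: maps each digit 0..9 to its printable character.
std::vector<char> digits_to_chars(const std::vector<int>& digits) {
    std::vector<char> code;
    code.reserve(digits.size());
    for (int digit : digits) {
        assert(digit >= 0 && digit <= 9);  // assumed precondition
        code.push_back(static_cast<char>('0' + digit));
    }
    return code;
}
```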

Understanding how to create atoi; How are characters compared?

I am trying to improve my understanding of C++, pointer arithmetic especially. I use atoi pretty often, but I have rarely given thought as to how it works. Looking up how it is done, I understand it mostly, but there is one thing that I am confused about.
Here is an example of a solution I have found online:
int atoi( char* pStr )
{
int iRetVal = 0;
if ( pStr )
{
while ( *pStr && *pStr <= '9' && *pStr >= '0' )
{
iRetVal = (iRetVal * 10) + (*pStr - '0');
pStr++;
}
}
return iRetVal;
}
I think the main reason I have had a hard time grasping how atoi has been done in the past is the way characters are compared. The "while" statement is saying while the character exists, and the character is less-than-or-equal-to 9, and it is greater-than-or-equal-to 0 then do stuff. This statement says two things to me:
Characters can be compared to other characters logically (but what is the returned value?).
Before I looked into this I suppose I knew it subconsciously but I never actually thought about it, but a '5' character is "smaller" than a '6' character in the same way that 5 is less than 6, so you can compare the characters as integers, essentially (for this intent).
Somehow while (*sPtr) and *sPtr != 0 are different. This seems obvious to me, but I find that I cannot put it into words, which means I know this is true but I do not understand why.
Edit: I have no idea what the *pStr - '0' part would do.
Any help making sense of these observations would be very... helpful! Thanks!
while the character exists
No, not really. It says "while the character is not 0 (or '\0')". Basically, the ASCII character '\0' indicates the end of a "C" string. Since you don't want to go past the end of a character array (and the exact length is not known), every character is tested for '\0'.
Characters can be compared to other characters logically
That's right. Character is nothing but a number, well, at least in ASCII encoding. In ASCII, for instance, '0' corresponds to a decimal value of 48, '1' is 49, 'Z' is 90 (you can take a look at ASCII Table here). So yeah, you can compare characters just like you compare integers.
Somehow while (*sPtr) and *sPtr != 0 are different.
Not different at all. A decimal 0 is a special ASCII symbol (nul) that is used to indicate the end of "C" string, as I mentioned in the beginning. You cannot see or print (nul), but it's there.
The *pStr - '0' converts the character to its numeric value: '1' - '0' = 1.
The while loop checks if we are not at the end of the string and that we have a valid digit.
A character in C is represented simply as an ASCII value. Since all the digits are consecutive in ASCII (i.e. 0x30 == '0' and 0x39 == '9' with all the other digits in between), you can determine if a character is a digit by simply doing a range check, and you can get the digit's value by subtracting '0'.
Note that posted implementation of atoi is not complete. Real atoi can process negative values.
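The range check described above, and the question of what a character comparison returns, can be sketched as follows. The helper name is_ascii_digit is made up for illustration; the standard library offers std::isdigit for the same purpose.

```cpp
#include <type_traits>

// Comparing two chars compares their integer codes and yields a bool.
static_assert(std::is_same<decltype('5' < '6'), bool>::value,
              "a comparison of chars has type bool");

// Hypothetical helper: true when c is one of '0'..'9'.
// Works because the digit characters are consecutive.
bool is_ascii_digit(char c) {
    return c >= '0' && c <= '9';
}
```

Note that the terminating '\0' fails this check as well, since its code 0 is below '0'.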
Somehow while (*sPtr) and *sPtr != 0 are different.
These two expressions are the same. When used as a condition, *sPtr is considered true when the value stored at address sPtr is not zero, and *sPtr != 0 is true when the value stored at address sPtr is not zero. The difference shows up when they are used somewhere else: the second expression evaluates to true or false, while the first one evaluates to the stored value.
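To illustrate that the two conditions behave identically in a loop, here are two string-length sketches that differ only in how the condition is written. The function names are hypothetical, and both assume a valid null-terminated string.

```cpp
#include <cstddef>

// Condition written as an implicit conversion of the char to bool.
std::size_t length_implicit(const char* s) {
    std::size_t count = 0;
    while (*s) {
        ++count;
        ++s;
    }
    return count;
}

// Condition written as an explicit comparison against zero.
std::size_t length_explicit(const char* s) {
    std::size_t count = 0;
    while (*s != 0) {
        ++count;
        ++s;
    }
    return count;
}
```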
C-style strings are null-terminated.
Therefore:
while ( *pStr && *pStr <= '9' && *pStr >= '0' )
This tests:
*pStr that we have not yet reached the end of the string and is equivalent to writing *pStr != 0 (note without the single quote, ASCII value 0, or NUL).
*pStr >= '0' && *pStr <= '9' (perhaps more logically) that the character at *pStr is in the range '0' (ASCII value 48) to '9' (ASCII value 57), that is a digit.
The representation of '0' in memory is 0x30 and the representation of '9' is 0x39. This is what the computer sees, and when it compares them with relational operators, it uses these values. The nul-termination character is represented as 0x00 (aka zero). The key here is that chars are just like any other int to the machine.
Therefore, the while statement is saying:
While the char we are examining is valid (aka NOT zero and therefore NOT a nul-terminator), and its value (as the machine sees it) is less than or equal to 0x39 and greater than or equal to 0x30, proceed.
The body of the while loop then calculates the appropriate value to add to the accumulator based on the integer's position in the string. It then increments the pointer and goes again. Once it's done, it returns the accumulated value.
This chunk of code uses ASCII values to accumulate the integer value of the string's digit characters.
In regards to your first numbered bullet, the result of any comparison is a boolean. I feel like you were really asking whether the compiler actually understands "characters". To my understanding, this comparison is done using the ASCII values of the characters, i.e. 'a' < 'b' is interpreted as (97 < 98).
(Note that it is also easy to see that ascii values are used when you compare 'a' and 'A', as 'A' is less than 'a')
Concerning your second bullet, the while loop is checking that the current character is not NUL (ASCII value 0). The && operator yields false as soon as a false operand is encountered, so the range comparisons are never done on the NUL character. As for the rest of the while condition, it is doing the ASCII comparison I mentioned for bullet 1. It is just checking whether the given character has an ASCII value that corresponds to a digit, i.e. between '0' and '9' (or in ASCII: between 48 and 57).
LASTLY
the (*ptr-'0') is the most interesting part in my opinion. This statement returns an integer between 0 and 9 inclusive. If you take a look at an ascii chart you will notice the numbers 0 through 9 are beside each other. So imagine '3'-'0' which is 51 - 48 and produces 3! :D So in simpler terms, it is doing ascii subtraction and returning the corresponding integer value. :D
Cheers, and I hope this explains a bit
Let's break it down:
if ( pStr )
If you pass atoi a null pointer, pStr will be 0x00 - and this will be false. Otherwise, we have something to parse.
while ( *pStr && *pStr <= '9' && *pStr >= '0' )
Ok, there's a bunch of things going on here. *pStr means we check if the value pStr is pointing to is 0x00 or not. If you look at an ASCII table, the ASCII for 0x00 is 'null' and in C/C++ the convention is that strings are null terminated (as opposed to Pascal and Java style strings, which tell you their length then have that many characters). So, when *pStr evaluates to false, our string has come to an end and we should stop.
*pStr <= '9' && *pStr >= '0' works because the values for the ASCII characters '0' '1' '2' '3' '4' '5' '6' '7' '8' '9' are all contiguous - '0' is 0x30 and '9' is 0x39, for example. So, if pStr's pointed to value is outside this range, then we're not parsing an integer and we should stop.
iRetVal = (iRetVal * 10) + (*pStr - '0');
Because ASCII numerals are contiguous in the character table, it so happens that if we know we have a numeral, *pStr - '0' evaluates to its numerical value - 0 for '0' (0x30 - 0x30), 1 for '1' (0x31 - 0x30)... 9 for '9'. So we shift the accumulated number up one decimal place (multiply by 10) and add the new digit.
pStr++;
By adding one to the pointer, the pointer points to the next address in memory - the next character in the string we are converting to an integer.
Note that this function will screw up if the string is not null terminated, it has any non numerals (such as '-') or if it is non-ASCII in any way. It's not magic, it just relies on these things being true.
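For comparison, here is a slightly fuller sketch that also handles an optional sign and a few leading whitespace characters. It is not the standard library's atoi: the name simple_atoi is made up, only space, tab and newline are skipped, and overflow is not handled at all.

```cpp
// Hypothetical function, not std::atoi. Assumptions: str is null or
// points to a null-terminated string; the value fits in an int.
int simple_atoi(const char* str) {
    if (!str) {
        return 0;  // null pointer: nothing to parse
    }
    // Skip a few common leading whitespace characters.
    while (*str == ' ' || *str == '\t' || *str == '\n') {
        ++str;
    }
    // Optional sign.
    bool negative = false;
    if (*str == '+' || *str == '-') {
        negative = (*str == '-');
        ++str;
    }
    // Accumulate digits; the loop also stops at '\0', since 0 < '0'.
    int value = 0;
    while (*str >= '0' && *str <= '9') {
        value = value * 10 + (*str - '0');
        ++str;
    }
    return negative ? -value : value;
}
```

Parsing stops at the first character that is not a digit, so "  -42abc" yields -42 and an empty string yields 0.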