I (think I) understand how the maths with different variable types works. For example, if I go over the max limit of an unsigned int variable, it will loop back to 0.
I don't understand the behavior of this code with unsigned char:
#include <iostream>
int main() {
    unsigned char var{ 0 };
    for (int i = 0; i < 501; ++i) {
        var += 1;
        std::cout << var << '\n';
    }
}
This just outputs 1...9, then some symbols and capital letters, and then it just doesn't print anything. It doesn't loop back to the values 1...9 etc.
On the other hand, if I cast to int before printing:
#include <iostream>
int main() {
    unsigned char var{ 0 };
    for (int i = 0; i < 501; ++i) {
        var += 1;
        std::cout << (int)var << '\n';
    }
}
It does print from 1...255 and then loops back from 0...255.
Why is that? It seems that the unsigned char variable does loop (as we can see from the int cast).
Is it safe to do maths with unsigned char variables? What is the behavior that I see here?
Why doesn't it print the expected integer value?
The issue is not with the wrapping of the unsigned char. The issue is with the insertion operator for std::ostream objects and the character types. The non-member operator<< overloads for char, signed char, and unsigned char print the value as a character (its ASCII character on most systems) rather than as a number.
operator<<(std::basic_ostream)
The canonical way to handle outputting 8-bit integer types is the way you're doing it. I personally prefer this instead:
char foo;
std::cout << +foo;
The unary + operator promotes the char type to an integer type, which then causes the integer printing function to be called.
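Note that wrap-around is only guaranteed for unsigned integer types. If you repeat this with signed char (or with char where it is signed), var += 1 is computed in int and then converted back to the narrower type; when the result does not fit, the converted value is implementation-defined before C++20 and wraps modulo 2^N since C++20. Overflow in arithmetic on int itself remains undefined by the standard. SOMETHING will happen, for sure, because we live in reality, but before C++20 that behavior may differ from compiler to compiler.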
Why doesn't it repeat the 0..9 characters
I tested this using g++ to compile, and bash on Ubuntu 20.04. My non-printable characters are handled as explicit symbols in some cases, or nothing printed in other cases. The non-repeating behavior must be due to how your shell handles these non-printable characters. We can't answer that without more information.
Unsigned chars aren't treated as numbers in this case. This data type is literally a byte:
1 byte = 8 bits = 0000 0000 which means 0.
What cout is printing is the character that represents that byte you changed by adding +1 to it.
For example:
0 = 0000 0000
1 = 0000 0001
2 = 0000 0010
.
.
.
9 = 0000 1001
Then, here start other chars that aren't related to numbers.
So, if you cast it to int, it will give you the numeric representations of that byte, giving you a 0-255 output.
Hope this clarifies!
Edit: Made the explanation more clear.
Related
I'd like to work with 12 bits unsigned integer. Since I am working with array, it is of interest for me to have overflowing value, e.g., 0 - 1 = 4095.
I tried the following but I don't obtain the expected behaviour:
struct bit_field
{
    unsigned int x: 12; // 12 bits
};
bit_field ii, jj, kk;
ii.x = 4096;
jj.x = 1;
kk.x = 0;
cout << ii.x;
cout << kk.x - jj.x;
Output:
>> 0 // ov as expected
>> -1 // expected 4095
This is how C/C++ is expected to work; you don't get arbitrarily sized integers. Your storage width declaration within the struct doesn't change that: the arithmetic operators still work on the ordinary integer types, not on a 12-bit type. It's just that you say "when I store this, it's 12 bits".
Because kk.x and jj.x are 12-bit bit-fields, and int can represent every value they can hold, integral promotion converts both to (signed) int before the subtraction. The subtraction is therefore done in int, and 0 - 1 gives -1.
Note that you're writing C++, so you can perfectly well write your own class that implements the mathematical operations you want and has cast operators for integer types.
Need to read the value of character as a number and find corresponding hexadecimal value for that.
#include <iostream>
#include <iomanip>
using namespace std;
int main() {
    char c = 197;
    cout << hex << uppercase << setw(2) << setfill('0') << (short)c << endl;
}
Output:
FFC5
Expected output:
C5
The problem is that when you use char c = 197 you are overflowing the char type, producing a negative number (-59). Starting there it doesn't matter what conversion you make to larger types, it will remain a negative number.
To fully understand why you must know how two's complement works.
Basically, -59 and 197 have the same 8-bit binary representation: 1100 0101; depending on the data type, it is interpreted one way or the other. If that single byte is printed in hexadecimal format, the binary representation (the actual value stored in memory) is the one used, producing C5.
When the char is converted into a short/unsigned short, the value -59 is converted into its short/unsigned short representation, which is 1111 1111 1100 0101 (FFC5) in both cases.
The correct way to do it would be to store the initial value (197) in a variable whose data type is able to represent it (unsigned char, short, unsigned short, ...) from the very beginning.
int main()
{
    unsigned n;
    cin>>n;
    for(int i=(1<<31);i>0;i/=2)
        (i&n)?(cout<<1):(cout<<0);
}
I ran the following code with n=1 but it prints nothing on the console. Changing the type of variable i to unsigned did the trick and printed 00000000000000000000000000000001. Any idea why?
Assuming two's complement, 1 << 31 results in a negative value, so your test for i > 0 fails immediately with the first test. You most likely would have had more luck with i != 0 then.
But be aware that 1 << 31 overflows a 32-bit signed int, which is undefined behaviour in C and in C++11 anyway (C++20 defines the result)! So you should do 1U << 31 instead, too. If you then assign this positive value to a signed int, which is not capable of holding it, the result is implementation-defined before C++20, so you still cannot rely on it. So the correct for loop would look like this:
for(unsigned int i = 1U << 31; i > 0; i /= 2)
Although i /= 2 for unsigned values is equivalent to a bitshift (and is likely to be compiled to one), I would prefer the bitshift operation explicitly here (i >>= 1), as this is what you actually intend.
Given that your platform is a 32-bit one, int i with a value of (1<<31) is a negative number. So, the execution never enters the for-loop because you require i>0.
I have tried these 2 following codes:
int main()
{
    int val=-125;
    char code=val;
    cout<<"\t"<<code<<" "<<(int)code;
    getch();
}
The output I got is a^ -125
The second code is:
int main()
{
    int val=-125;
    unsigned char code=val;
    cout<<"\t"<<code<<" "<<(int)code;
    getch();
}
The output I got is: a^ 131
After trying both codes, is it safe to conclude that a character can have 2 ASCII values, or is my approach to finding the ASCII value(s) flawed?
P.S.-
I was unable to upload the pictures of my output, so I am forced to type the output where the character I got isn't present in the standard keyboard.
In both examples 'code' has the same bitwise value. The first bit is 1, because it was a negative number. Since both 'codes' have the same bit pattern, the output character is the same (converting from number->character treats the number as an unsigned value).
After that you convert your character back to a (signed) integer. This conversion respects the type and the sign of your char.
->unsigned char -> int -> int always positive
->char -> int -> int has the same sign as the char (and because the first bit was 1 it's negative here)
unsigned integers in C++ have modulo 2^n behavior, where n is the number of value bits.
that means if your char has 8 bits, then unsigned char has modulo 256 behavior.
this behavior is as if the values 0 through 255 were placed on a clockface. any operation that produces a result that goes past the 0-255 divide just effectively wraps around. just like arithmetic with hours on a clockface.
which means that assigning the value -125 yields the corresponding value in the range 0 through 255, namely -125 + 256 = 131.
Good day, colleagues!
I need to obtain cyclic series on successive numbers from 0 to 255. Is it legal to use unsigned char overflow like this:
unsigned char test_char = 0;
while (true) {
    std::cout << test_char++ << " ";
}
Or would it be safer to use this code:
int test_int = 0;
while (true) {
    std::cout << test_int++ % 256 << " ";
}
Of course, in real code there will be reasonable condition instead of while (true).
3.9.1/4 "Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer"
"This implies that unsigned arithmetic does not overflow because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting unsigned integer type"
So, yes it is legal. And the second form is preferred, since it's more readable.
Even though sizeof(char) will always be 1, it is not necessary that a char will be exactly 8 bits. (I am guessing unsigned char will be similar).
So of the two, if given a choice, I would prefer the latter as the former might not even be correct.
btw, You probably intended unsigned int instead of int for the latter? Modulus with negative numbers could get tricky (after the int overflows, as Jimmy noted). If I recollect correctly, I believe it is compiler dependent.
unsigned char, like all other unsigned integral types, follows modulo 2^n arithmetic, so basically both your methods are equivalent. Use the first.
There is no such thing as unsigned overflow, per 3.9.1/4 as quoted by Erik. However, as Moron says, it is possible that the modulus of the unsigned char number system is greater than 256.
Note that your expression does not store the result of % 256 back to test_int. The safe way to do this is
test_int = ( test_int + 1 ) % 256;
std::cout << test_int << " ";
The output of the 2 samples is completely different.
The first one will print characters (a b c d e f g h ...).
The second one will print integers (0 1 2 3 4 ... 255 0 ...).
Anyway, it depends on whether you have a check for overflow exceptions (.NET); otherwise, in old C++, the value is always valid and goes from 0 to 255.