At work I found this code in my codebase, where chars are cast twice:
constexpr unsigned int foo(char ch0, char ch1, char ch2, char ch3)
{
return ((unsigned int)(unsigned char)(ch0)
| ((unsigned int)(unsigned char)(ch1) << 8)
| ((unsigned int)(unsigned char)(ch2) << 16)
| ((unsigned int)(unsigned char)(ch3) << 24))
;
}
Wouldn't a single cast to unsigned int be sufficient?
And in that case, would it be better to make it a static_cast<unsigned int>?
Yes, there is a difference. Consider what happens if one of the chars has the value -1. When you do
(unsigned int)(unsigned char)-1
you get 255 for an 8-bit char, since you first do the conversion modulo 2^8. If you instead used
(unsigned int)-1
then you would get 4294967295 for a 32-bit int, since you are now doing the conversion modulo 2^32.
So the first cast guarantees the result is representable in 8 bits (or whatever the actual width of char is), and the second cast then widens it to unsigned int.
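To make the difference concrete, here is a minimal sketch. The helper names via_uchar and direct are hypothetical, and signed char stands in for a plain char on platforms where char is signed. The comments assume an 8-bit char and a 32-bit unsigned int.

```cpp
#include <limits>

// Hypothetical helper: the double cast from the question.
constexpr unsigned int via_uchar(signed char c)
{
    return static_cast<unsigned int>(static_cast<unsigned char>(c));
}

// Hypothetical helper: a single cast straight to unsigned int.
constexpr unsigned int direct(signed char c)
{
    return static_cast<unsigned int>(c);
}

// -1 is first reduced modulo 2^8, so only the low byte survives.
static_assert(via_uchar(-1) == std::numeric_limits<unsigned char>::max(),
              "double cast keeps only the low byte");
// -1 is reduced modulo 2^32 (for a 32-bit unsigned int), so every bit is set.
static_assert(direct(-1) == std::numeric_limits<unsigned int>::max(),
              "single cast sets all bits");
```

For non-negative inputs the two helpers agree, so the extra cast only matters when the char can be negative.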
You can get rid of the casts to unsigned char if you change the function parameters to that type, like this:
constexpr unsigned int foo(unsigned char ch0, unsigned char ch1,
unsigned char ch2, unsigned char ch3)
{
return static_cast<unsigned int>(ch0)
| (static_cast<unsigned int>(ch1) << 8)
| (static_cast<unsigned int>(ch2) << 16)
| (static_cast<unsigned int>(ch3) << 24);
}
Related
My toy program splits an integer into 4 bytes and later combines these bytes to get back the input value, but it produces the wrong result. The program works for positive integers, but I am interested in signed integers. Need help.
Expected Output: -12345
Actual Output: -57
#include <iostream>
int main()
{
int j,i = -12345;
char b[4];
b[0] = (i >> 24) & 0xFF;
b[1] = (i >> 16) & 0xFF;
b[2] = (i >> 8) & 0xFF;
b[3] = (i >> 0) & 0xFF;
j = (int)((b[0] << 24) | (b[1] << 16) | (b[2] << 8) | (b[3] << 0));
std::cout << j;
return 0;
}
There are actually two problems that lead to your "error".
The first is that the result of e.g. b[0] << 24 will be an int. When you cast that to a char (and assuming that char is an 8-bit type) then you cut off the top 24 bits of the value, truncating it.
The second problem is that char could be unsigned (it's implementation-defined if char is signed or unsigned). If char is unsigned then the value -1 (0xffffffff) will become 255 (0x000000ff).
When you then bring all that together it will almost certainly result in wrong values.
In general, whenever you feel the need to do a C-style cast (like in (char)(b[0] << 24)) when programming in C++, you should take that as a sign that you're doing something wrong.
One possible way to solve your problem is to always work with explicit unsigned data types.
First you need to copy the original int value to an unsigned int:
unsigned ui;
memcpy(&ui, &i, sizeof ui);
Then use ui instead of i when doing the "split". And explicitly use unsigned char:
unsigned char b[sizeof(unsigned)] = { 0 };
b[0] = (ui >> 24) & 0xFF;
b[1] = (ui >> 16) & 0xFF;
b[2] = (ui >> 8) & 0xFF;
b[3] = (ui >> 0) & 0xFF;
Then to put it all back, again use an explicit unsigned type, and copy it to the resulting variable:
unsigned uj = (static_cast<unsigned>(b[0]) << 24) | (static_cast<unsigned>(b[1]) << 16) | (static_cast<unsigned>(b[2]) << 8) | static_cast<unsigned>(b[3]);
memcpy(&j, &uj, sizeof j);
I suggest using unsigned data types here to avoid possible problems that can come from sign-extension during conversion.
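The split and join steps above can be sketched as a pair of small functions. This is only an illustration: byte_of, join_bytes and round_trip are hypothetical names, std::uint32_t is used in place of unsigned to pin the width to 32 bits, and the memcpy between int and unsigned from the answer is left out.

```cpp
#include <cstdint>

// Hypothetical helper: extract the byte that starts at bit position `shift`.
constexpr unsigned char byte_of(std::uint32_t v, unsigned shift)
{
    return static_cast<unsigned char>((v >> shift) & 0xFFu);
}

// Hypothetical helper: join four bytes, most significant first.
// Each byte is widened to std::uint32_t before shifting, so no shift
// is ever performed on a signed int.
constexpr std::uint32_t join_bytes(unsigned char b0, unsigned char b1,
                                   unsigned char b2, unsigned char b3)
{
    return (static_cast<std::uint32_t>(b0) << 24)
         | (static_cast<std::uint32_t>(b1) << 16)
         | (static_cast<std::uint32_t>(b2) << 8)
         |  static_cast<std::uint32_t>(b3);
}

// Split a value into bytes and rejoin them in one step.
constexpr std::uint32_t round_trip(std::uint32_t v)
{
    return join_bytes(byte_of(v, 24), byte_of(v, 16), byte_of(v, 8), byte_of(v, 0));
}

// Converting -12345 to an unsigned type is well defined (modulo 2^32).
static_assert(round_trip(static_cast<std::uint32_t>(-12345))
              == static_cast<std::uint32_t>(-12345),
              "round trip preserves the bit pattern");
```

The lowest byte of -12345 is 0xC7, which is -57 when read as a signed 8-bit value; that is where the question's unexpected output comes from.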
Your code works only for positive numbers! i is negative, and after shifting it to the right and masking, b[0] becomes positive, so the final deserialization gives the wrong result.
Try:
#include <iostream>
int main()
{
int j, i = -12345;
const char* bytes = reinterpret_cast<const char*>(&i);
j = *reinterpret_cast<const int*>(bytes);
std::cout << j;
return 0;
}
I am new to C++ programming. I am trying to write code that builds a single integer value from 6 or more individual bytes.
I have implemented the same for 4 bytes, and it works.
My code for 4 bytes:
char *command = "\x42\xa0\x82\xa1\x21\x22";
__int64 value;
value = (__int64)(((unsigned char)command[2] << 24) + ((unsigned char)command[3] << 16) + ((unsigned char)command[4] << 8) + (unsigned char)command[5]);
printf("%x %x %x %x %x",command[2], command[3], command[4], command[5], value);
Using this code, value is 82a12122, but when I try the same for 6 bytes the result is wrong.
Code for 6 Bytes:
char *command = "\x42\xa0\x82\xa1\x21\x22";
__int64 value;
value = (__int64)(((unsigned char)command[0] << 40) + ((unsigned char)command[1] << 32) + ((unsigned char)command[2] << 24) + ((unsigned char)command[3] << 16) + ((unsigned char)command[4] << 8) + (unsigned char)command[5]);
printf("%x %x %x %x %x %x %x", command[0], command[1], command[2], command[3], command[4], command[5], value);
The output value is 82a163c2, which is wrong; I need 42a082a12122.
Can anyone tell me how to get the expected output and what is wrong with the 6-byte code?
Thanks in advance.
Just cast each byte to a sufficiently large unsigned type before shifting. Even after the integral promotions (to int), the type is not large enough to shift by 32 bits or more (in the usual case of a 32-bit int, which seems to apply to you).
See here for demonstration: https://godbolt.org/g/x855XH
unsigned long long large_ok(char x)
{
return ((unsigned long long)x) << 63;
}
unsigned long long large_incorrect(char x)
{
return ((unsigned long long)x) << 64;
}
unsigned long long still_ok(char x)
{
return ((unsigned char)x) << 31;
}
unsigned long long incorrect(char x)
{
return ((unsigned char)x) << 32;
}
In simpler terms:
The shift operators promote their operands to int/unsigned int automatically. This is why your four-byte version works: a 32-bit type is large enough for all your shifts. However, (in your implementation, as in most common ones) it can only hold 32 bits, and the compiler will not automatically choose a 64-bit type if you shift by 32 bits or more (the compiler cannot know that this is what you want).
If you use large enough integral types for the shift operands, the shift will have the larger type as the result and the shifts will do what you expect.
If you turn on warnings, your compiler will probably also complain that you are shifting by at least as many bits as the type has (see demonstration). Such a shift is undefined behavior, so you cannot rely on any particular result.
(The bit counts mentioned are of course implementation defined.)
A final note: Types beginning with double underscores (__) or underscore + capital letter are reserved for the implementation, so using them is not technically "safe". Modern C++ provides types such as uint64_t (from <cstdint>) that have exactly the stated number of bits; use those instead.
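As a minimal sketch of the advice above, each byte can be widened to a 64-bit unsigned type before it is shifted. The function name pack6 is hypothetical, and the byte values are the ones from the question.

```cpp
#include <cstdint>

// Hypothetical helper: combine six bytes, most significant first.
// Every byte is converted to std::uint64_t before the shift, so shifts
// by 32 and 40 bits stay within the width of the type.
constexpr std::uint64_t pack6(unsigned char b0, unsigned char b1, unsigned char b2,
                              unsigned char b3, unsigned char b4, unsigned char b5)
{
    return (static_cast<std::uint64_t>(b0) << 40)
         | (static_cast<std::uint64_t>(b1) << 32)
         | (static_cast<std::uint64_t>(b2) << 24)
         | (static_cast<std::uint64_t>(b3) << 16)
         | (static_cast<std::uint64_t>(b4) << 8)
         |  static_cast<std::uint64_t>(b5);
}

static_assert(pack6(0x42, 0xa0, 0x82, 0xa1, 0x21, 0x22) == 0x42a082a12122ULL,
              "six bytes combine into the expected 48-bit value");
```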
Your shifts overflow the 32-bit type, and you are not printing the integers correctly.
This code is working:
(Take note of the print format and how the shifts are done in uint64_t)
#include <stdio.h>
#include <cstdint>
int main()
{
const unsigned char *command = (const unsigned char *)"\x42\xa0\x82\xa1\x21\x22";
uint64_t value=0;
for (int i=0; i<6; i++)
{
value <<= 8;
value += command[i];
}
// %llx expects unsigned long long, so convert value explicitly.
printf("%x %x %x %x %x %x %llx\n",
command[0], command[1], command[2], command[3], command[4], command[5], (unsigned long long)value);
}
I am programming in C++ and I have run into another "stupid" problem. I have 4 chars like these:
char a = 0x90;
char b = 0x01;
char c = 0x00;
char d = 0x00;
Together they represent the hexadecimal number 0x00000190, which is 400 in decimal.
How do I convert these chars to one int? I know I can do
int number = a;
but this converts only one char to an int. Could anybody help, please?
You may use:
int number = (a & 0xFF)
| ((b & 0xFF) << 8)
| ((c & 0xFF) << 16)
| ((d & 0xFF) << 24);
It would be simpler with unsigned values.
Like this:
int number = ((d & 0xff) << 24) | ((c & 0xff) << 16) | ((b & 0xff) << 8) | (a & 0xff);
The << is the bit shift operator, and the & 0xff is necessary to avoid negative values when promoting char to int in the expression (as Jarod42 rightly pointed out).
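The effect of the mask can be checked with a small sketch. The helper names are hypothetical, signed char stands in for a plain char that is signed, and -112 is the value that 0x90 takes as a signed 8-bit number on a two's complement machine.

```cpp
// Hypothetical helper: the masked expression from the answers above.
constexpr int combine_masked(signed char a, signed char b, signed char c, signed char d)
{
    return (a & 0xFF) | ((b & 0xFF) << 8) | ((c & 0xFF) << 16) | ((d & 0xFF) << 24);
}

// Hypothetical helper: the same idea without the mask, limited to the two
// low bytes so that the example never shifts a negative value.
constexpr int combine_unmasked(signed char a, signed char b)
{
    return a | (b << 8);
}

// With the mask, 0x90 and 0x01 combine into 0x0190 = 400.
static_assert(combine_masked(-112, 0x01, 0x00, 0x00) == 400, "masked bytes give 400");
// Without the mask, a is promoted to the int -112 (0xFFFFFF90), and its
// high 1 bits swallow the contribution of b.
static_assert(combine_unmasked(-112, 0x01) == -112, "unmasked bytes give -112");
```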
This works:
unsigned char ua = a;
unsigned char ub = b;
unsigned char uc = c;
unsigned char ud = d;
unsigned long x = ua + ub * 0x100ul + uc * 0x10000ul + ud * 0x1000000ul;
It is like place-value arithmetic in decimal but you are using base 0x100 instead of base 10.
If you are doing this a lot you could wrap it up in an inline function or a macro.
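One possible shape for such a wrapper is sketched below. The name combine_bytes is hypothetical; it takes plain char parameters and converts each one to unsigned char before applying the place-value arithmetic from the snippet above.

```cpp
// Hypothetical wrapper around the place-value expression.
// a is the least significant byte, d the most significant one.
constexpr unsigned long combine_bytes(char a, char b, char c, char d)
{
    return static_cast<unsigned char>(a)
         + static_cast<unsigned char>(b) * 0x100ul
         + static_cast<unsigned char>(c) * 0x10000ul
         + static_cast<unsigned char>(d) * 0x1000000ul;
}

// 0x90 + 0x01 * 0x100 = 0x190 = 400, whether plain char is signed or not.
static_assert(combine_bytes(static_cast<char>(0x90), 0x01, 0x00, 0x00) == 400ul,
              "bytes 90 01 00 00 give 400");
```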
Note: answers that apply bitwise operations directly to (char)0x90, without masking with 0xFF or converting to unsigned char first, are wrong on systems where plain char is signed. There (char)0x90 is a negative value, so after the integral promotions are applied, there are a whole lot of 1 bits on the left.
I scan through the byte representation of an int variable and get a somewhat unexpected result.
If I do
int a = 127;
cout << (unsigned int) *((char *)&a);
I get 127 as expected. If I do
int a = 256;
cout << (unsigned int) *((char *)&a + 1);
I get 1 as expected. But if I do
int a = 128;
cout << (unsigned int) *((char *)&a);
I get 4294967168, which is, well… quite fancy.
The question is: is there a way to get 128 when looking at the first byte of an int variable whose value is 128?
For the same reason that (unsigned int)(char)128 is 4294967168: char is signed by default on most commonly used systems. 128 cannot fit in a signed 8-bit quantity, so when you cast it to char, you get -128 (0x80 in hex).
Then, when you cast -128 to an unsigned int, you get 2^32 - 128, which is 4294967168.
If you want to get +128, then use an unsigned char instead of char.
char is signed here. In your second example, (char *)&a + 1 points to the second byte of a, so *((char *)&a + 1) reads that byte. On a little-endian machine it holds 0x01 when a is 256, which is still 1 as an unsigned int.
In your third example, *((char *)&a) reads the first byte, 0x80, which as a signed char is -128. Converting -128 to unsigned int sign-extends it to 0b11111111111111111111111110000000, which is 4294967168.
As the comments have pointed out, it looks like what's happening here is that you are running into an oddity of twos complement. In your last cast, since you are not using an unsigned char, the highest-order bit of the byte is being used to indicate positive or negative values. You then only have 7 bits out of the full 8 to represent your value, giving you a range of 0-127 for positive numbers (-128-127 overall).
If you exceed this range, the value wraps and you get -128, which when cast back to an unsigned int results in that abnormally large value. Reading the byte through an unsigned char pointer avoids this:
int a = 128;
cout << (unsigned int) *((unsigned char *)&a);
Also, all of your code depends on running on a little-endian machine.
Here's how you should probably be doing these things:
int a = 127;
cout << (unsigned)(unsigned char)(0xFF & a);
int a = 256;
cout << (unsigned)(unsigned char)(0xFF & (a>>8));
int a = 128;
cout << (unsigned)(unsigned char)(0xFF & a);
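The shift-and-mask approach from these snippets can be collected into one function, sketched here under the hypothetical name byte_at. It works on the value rather than on memory, so the result does not depend on the byte order of the machine. The index n is assumed to be smaller than sizeof(unsigned int).

```cpp
// Hypothetical helper: byte number n of value, counting from the least
// significant byte (n = 0).
constexpr unsigned int byte_at(unsigned int value, unsigned int n)
{
    return (value >> (8u * n)) & 0xFFu;
}

// The three cases from the question.
static_assert(byte_at(127u, 0) == 127u, "low byte of 127");
static_assert(byte_at(256u, 1) == 1u, "second byte of 256");
static_assert(byte_at(128u, 0) == 128u, "low byte of 128 is 128, not a huge value");
```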
I have a problem with bit shifting and unsigned longs. Here's my test code:
char header[4];
header[0] = 0x80;
header[1] = 0x00;
header[2] = 0x00;
header[3] = 0x00;
unsigned long l1 = 0x80000000UL;
unsigned long l2 = ((unsigned long) header[0] << 24) + ((unsigned long) header[1] << 16) + ((unsigned long) header[2] << 8) + (unsigned long) header[3];
cout << l1 << endl;
cout << l2 << endl;
I would expect l2 to also have a value of 2147483648 but instead it prints 18446744071562067968. I assume the bit shifting of the first byte causes problems?
Hopefully somebody can explain why this fails and how I modify the calculation of l2 so that it returns the correct value.
Thanks in advance.
Your value of 0x80 stored in a char is a signed quantity. When you cast it to a wider type, it is sign-extended so that the wider type keeps the same value.
Change the type of char in the first line to unsigned char and you will not get the sign extension happening.
To simplify what is happening in your case, run this:
char c = 0x80;
unsigned long l = c;
cout << l << endl;
You get this output:
18446744073709551488
which is -128 as a 64-bit integer (0x80 is -128 as an 8-bit integer).
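The following sketch reproduces both the failing and the fixed calculation from the question. The function names are hypothetical, signed char stands in for a plain char that is signed, and std::uint64_t stands in for a 64-bit unsigned long.

```cpp
#include <cstdint>

// Hypothetical reproduction of the question's expression with signed bytes.
// Each byte is sign-extended when it is converted to the 64-bit type.
constexpr std::uint64_t header_signed(signed char h0, signed char h1,
                                      signed char h2, signed char h3)
{
    return (static_cast<std::uint64_t>(h0) << 24) + (static_cast<std::uint64_t>(h1) << 16)
         + (static_cast<std::uint64_t>(h2) << 8) + static_cast<std::uint64_t>(h3);
}

// The same expression with unsigned char bytes: no sign extension happens.
constexpr std::uint64_t header_unsigned(unsigned char h0, unsigned char h1,
                                        unsigned char h2, unsigned char h3)
{
    return (static_cast<std::uint64_t>(h0) << 24) + (static_cast<std::uint64_t>(h1) << 16)
         + (static_cast<std::uint64_t>(h2) << 8) + static_cast<std::uint64_t>(h3);
}

// 0x80 stored in a signed 8-bit char is -128, which gives the value from the question.
static_assert(header_signed(-128, 0, 0, 0) == 18446744071562067968ULL,
              "sign extension leaks into the upper 32 bits");
static_assert(header_unsigned(0x80, 0, 0, 0) == 2147483648ULL,
              "unsigned bytes give 0x80000000");
```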
Same result here (Linux/x86-64, GCC 4.4.5). The behavior depends on the size of unsigned long, which is at least 32 bits, but may be larger.
If you want exactly 32 bits, use a uint32_t instead (from the header <stdint.h>, or <cstdint> in C++11 and later; it is not part of C++03 but is widely supported).