Make an integer from 6 bytes or more using C++

I am new to C++ programming. I am trying to implement code that builds a single integer value from 6 or more individual bytes.
I have implemented the same for 4 bytes and it works.
My code for 4 bytes:
char *command = "\x42\xa0\x82\xa1\x21\x22";
__int64 value;
value = (__int64)(((unsigned char)command[2] << 24) + ((unsigned char)command[3] << 16)
      + ((unsigned char)command[4] << 8) + (unsigned char)command[5]);
printf("%x %x %x %x %x",command[2], command[3], command[4], command[5], value);
Using this code, the value of value is 82a12122, but when I try the same for 6 bytes, the result is wrong.
Code for 6 bytes:
char *command = "\x42\xa0\x82\xa1\x21\x22";
__int64 value;
value = (__int64)(((unsigned char)command[0] << 40) + ((unsigned char)command[1] << 32)
      + ((unsigned char)command[2] << 24) + ((unsigned char)command[3] << 16)
      + ((unsigned char)command[4] << 8) + (unsigned char)command[5]);
printf("%x %x %x %x %x %x %x", command[0], command[1], command[2], command[3], command[4], command[5], value);
The output value of value is 82a163c2, which is wrong; I need 42a082a12122.
So can anyone tell me how to get the expected output and what is wrong with the 6-byte code?
Thanks in advance.

Just cast each byte to a sufficiently large unsigned type before shifting. Even after integral promotion (to unsigned int), the type is not large enough to shift by 32 or more bits (in the usual case, which seems to apply to you).
See here for demonstration: https://godbolt.org/g/x855XH
unsigned long long large_ok(char x)
{
    // the operand is widened to 64 bits before the shift, so a count of 63 is in range
    return ((unsigned long long)x) << 63;
}

unsigned long long large_incorrect(char x)
{
    // shift count equals the 64-bit width: out of range, undefined behavior
    return ((unsigned long long)x) << 64;
}

unsigned long long still_ok(char x)
{
    // unsigned char promotes to a 32-bit int, so a count of 31 is still in range
    return ((unsigned char)x) << 31;
}

unsigned long long incorrect(char x)
{
    // the promoted type is only 32 bits wide: a shift by 32 is out of range
    return ((unsigned char)x) << 32;
}
In simpler terms:
The shift operators promote their operands to int/unsigned int automatically. This is why your four-byte version works: unsigned int is large enough for all of your shifts. However (in your implementation, as in most common ones), it can only hold 32 bits, and the compiler will not automatically choose a 64-bit type when you shift by 32 bits or more (it has no way to know you wanted one).
If you use large enough integral types for the shift operands, the shift will have the larger type as the result and the shifts will do what you expect.
If you turn on warnings, your compiler will probably also complain to you that you are shifting by more bits than the type has and thus always getting zero (see demonstration).
(The bit counts mentioned are of course implementation defined.)
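If you want to see the promotion itself, decltype makes it visible (a minimal sketch, C++11, assuming the common case where int is wider than unsigned char):
#include <type_traits>

unsigned char c = 0xFF;
static_assert(std::is_same<decltype(c << 1), int>::value,
              "shifting an unsigned char promotes it to int");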
A final note: names beginning with a double underscore (__) or an underscore plus a capital letter are reserved for the implementation, so using them is not technically "safe". Modern C++ provides you with types such as uint64_t that have the stated number of bits - use those instead.
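For example, a minimal sketch of the 6-byte expression fixed along these lines (cmd mirrors the question's command buffer):
#include <cstdint>

const unsigned char *cmd = (const unsigned char *)"\x42\xa0\x82\xa1\x21\x22";
uint64_t value = ((uint64_t)cmd[0] << 40) | ((uint64_t)cmd[1] << 32)
               | ((uint64_t)cmd[2] << 24) | ((uint64_t)cmd[3] << 16)
               | ((uint64_t)cmd[4] << 8)  |  (uint64_t)cmd[5];
// value == 0x42a082a12122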

Your shifts overflow the promoted 32-bit type, and you are not printing the integers correctly.
This code works
(take note of the print format, and of how the shifts are done on a uint64_t):
#include <cstdio>
#include <cstdint>

int main()
{
    const unsigned char *command = (const unsigned char *)"\x42\xa0\x82\xa1\x21\x22";
    uint64_t value = 0;
    for (int i = 0; i < 6; i++)
    {
        value <<= 8;         // make room for the next byte
        value += command[i]; // append it, most significant byte first
    }
    printf("%x %x %x %x %x %x %llx\n",
           command[0], command[1], command[2], command[3], command[4], command[5],
           value);
}
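A side note on the format string: %llx matches unsigned long long. For a format tied to uint64_t itself, the standard macro PRIx64 from <cinttypes> is the portable spelling:
#include <cinttypes> // PRIx64
printf("%" PRIx64 "\n", value);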

Related

Reversing the byte order of 4 bytes without the zeros in C++ using a macro

I almost have this figured out, but I have a simple question. The code below reverses the byte order of 2 bytes and prints the output below, but there are 12 zeros after "CDAB".
I am uncertain how to change the code so it only reverses 2 bytes without the extra zeros. Macros are beyond me... Does it have something to do with the size of the type of the int?
Current Output:
Your Computer uses Little-Endian
Before: ABCD
After : CDAB000000000000
My current code:
// Writing data to a file.
#include <iostream>
#include <cstdio> // printf
#include <math.h>
using namespace std;

// The macro below is used to reverse byte/endian order.
#define REVERSE_BYTES(...) do for(size_t REVERSE_BYTES=0; REVERSE_BYTES<sizeof(__VA_ARGS__)>>1; ++REVERSE_BYTES)\
    ((unsigned char*)&(__VA_ARGS__))[REVERSE_BYTES] ^= ((unsigned char*)&(__VA_ARGS__))[sizeof(__VA_ARGS__)-1-REVERSE_BYTES],\
    ((unsigned char*)&(__VA_ARGS__))[sizeof(__VA_ARGS__)-1-REVERSE_BYTES] ^= ((unsigned char*)&(__VA_ARGS__))[REVERSE_BYTES],\
    ((unsigned char*)&(__VA_ARGS__))[REVERSE_BYTES] ^= ((unsigned char*)&(__VA_ARGS__))[sizeof(__VA_ARGS__)-1-REVERSE_BYTES];\
while(0)

// Check for endianness
int Endianess(int y)
{
    int num = 1;
    if (static_cast<unsigned char>(num) == 1)
    {
        y = 1; // Little endian
    }
    else
    {
        y = 0; // Big endian
    }
    return y;
}

int Reverse_Endian(unsigned long long Reverse_Byte_Order)
{
    //unsigned long long x = 0xABCDEF0123456789;
    unsigned long long x = 0xABCD;
    printf("\nBefore: %llX\n", x);
    REVERSE_BYTES(x);
    printf("After : %llX\n", x);
    return x;
}

int main()
{
    int x = 0;
    x = Endianess(x);
    if (x == 0)
    {
        cout << "Your Computer uses Big-Endian";
    }
    else
    {
        cout << "Your Computer uses Little-Endian ";
    }
    Reverse_Endian(x);
    return 0;
}
Although you assigned 0xABCD to x, this does not make x a two-byte integer. The size of the integer is determined by its type, unsigned long long (typically 64 bits / 8 bytes).
So although the numerical value of x after your assignment is 0xABCD, it is still represented by 8 bytes of memory (visualized as 0x000000000000ABCD).
printf needs to know the variable's type/size for technical reasons, but it still only prints the numerical value (adding leading zeros only if requested). printf, by itself, does not "hex dump" an arbitrary variable.
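For example, zero-padding is something you request in the format string, not a property of the variable (a small sketch):
unsigned long long x = 0xABCD;
printf("%llX\n", x);    // ABCD (no padding)
printf("%016llX\n", x); // 000000000000ABCD (zero-padded to 16 hex digits)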
Preprocessor macros just perform text substitution. Performing this substitution manually can help in understanding what's going on. Start with the original:
#define REVERSE_BYTES(...) do for(size_t REVERSE_BYTES=0; REVERSE_BYTES<sizeof(__VA_ARGS__)>>1; ++REVERSE_BYTES)\
    ((unsigned char*)&(__VA_ARGS__))[REVERSE_BYTES] ^= ((unsigned char*)&(__VA_ARGS__))[sizeof(__VA_ARGS__)-1-REVERSE_BYTES],\
    ((unsigned char*)&(__VA_ARGS__))[sizeof(__VA_ARGS__)-1-REVERSE_BYTES] ^= ((unsigned char*)&(__VA_ARGS__))[REVERSE_BYTES],\
    ((unsigned char*)&(__VA_ARGS__))[REVERSE_BYTES] ^= ((unsigned char*)&(__VA_ARGS__))[sizeof(__VA_ARGS__)-1-REVERSE_BYTES];\
while(0)
Your REVERSE_BYTES(x); statement is substituted using the macro. The __VA_ARGS__ is replaced by the macro argument:
do for(size_t REVERSE_BYTES=0; REVERSE_BYTES<sizeof(x)>>1; ++REVERSE_BYTES)
    ((unsigned char*)&(x))[REVERSE_BYTES] ^= ((unsigned char*)&(x))[sizeof(x)-1-REVERSE_BYTES],
    ((unsigned char*)&(x))[sizeof(x)-1-REVERSE_BYTES] ^= ((unsigned char*)&(x))[REVERSE_BYTES],
    ((unsigned char*)&(x))[REVERSE_BYTES] ^= ((unsigned char*)&(x))[sizeof(x)-1-REVERSE_BYTES];
while(0);
Because x is an unsigned long long, we'll assume that it's 8 bytes on your machine, so sizeof(x) should equal 8.
do for(size_t REVERSE_BYTES=0; REVERSE_BYTES < 4; ++REVERSE_BYTES)
    ((unsigned char*)&(x))[REVERSE_BYTES] ^= ((unsigned char*)&(x))[7-REVERSE_BYTES],
    ((unsigned char*)&(x))[7-REVERSE_BYTES] ^= ((unsigned char*)&(x))[REVERSE_BYTES],
    ((unsigned char*)&(x))[REVERSE_BYTES] ^= ((unsigned char*)&(x))[7-REVERSE_BYTES];
while(0);
The for statement is wrapped in a do-while(0) loop (a common macro idiom that makes the expansion behave like a single statement) and consists of only one statement. Change these macro idiosyncrasies to make it fit in with the style of your code:
for (size_t i = 0; i < 4; ++i) {
    ((unsigned char*)&(x))[i] ^= ((unsigned char*)&(x))[7-i];
    ((unsigned char*)&(x))[7-i] ^= ((unsigned char*)&(x))[i];
    ((unsigned char*)&(x))[i] ^= ((unsigned char*)&(x))[7-i];
}
This macro is using the XOR swap algorithm to swap the bytes (starting with the first+last and moving towards the middle). We can re-write this using a temporary variable to do the same thing, but improve readability:
for (size_t i = 0; i < 4; ++i) {
    unsigned char tmp = ((unsigned char*)&(x))[7-i];
    ((unsigned char*)&(x))[7-i] = ((unsigned char*)&(x))[i];
    ((unsigned char*)&(x))[i] = tmp;
}
Hopefully, this demonstrates better what the macro is actually doing. As we've seen, the 4 and the 7 control the number of bytes swapped, and both are derived from sizeof(x). So if we want to swap fewer bytes, we just need to make sizeof(x) smaller.
Changing unsigned long long x = 0xABCD to unsigned short x = 0xABCD will make sizeof(x) == 2. This should result in only two bytes being swapped, even if you leave the macro unchanged. You could also use casting or modify the macro to accept a custom size to achieve the same thing.
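For example, a minimal sketch of the unsigned short variant (the macro itself stays unchanged):
unsigned short x = 0xABCD;  // sizeof(x) == 2
printf("Before: %hX\n", x);
REVERSE_BYTES(x);           // the loop runs sizeof(x)>>1 == 1 time: a single swap
printf("After : %hX\n", x); // prints CDAB, with no trailing zeros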

C++ - Getting size in bits of integer

I need to know whether an integer is exactly 32 bits long (8 hexadecimal characters). How could I achieve this in C++? Should I work with the hexadecimal representation or with the unsigned int one?
My code is as follows:
mistream.open("myfile.txt");
if (mistream)
{
    for (int i = 0; i < longArray; i++)
    {
        mistream >> hex >> datos[i];
    }
}
mistream.close();
Here mistream is of type ifstream, and datos is an array of unsigned int.
Thank you
std::numeric_limits<unsigned>::digits
is a static integer constant (or constexpr in C++11) giving the number of bits (since unsigned is stored in base 2, it gives the number of binary digits).
You need to #include <limits> to get this, and you'll notice that it gives the same value as Thomas' answer (while also being generalizable to other primitive types).
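For example, a minimal sketch:
#include <iostream>
#include <limits>

int main()
{
    // number of binary digits in unsigned int (32 on common platforms)
    std::cout << std::numeric_limits<unsigned>::digits << '\n';
}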
For reference (you changed your question after I answered), every integer of a given type (eg, unsigned) in a given program is exactly the same size.
What you're now asking is not the size of the integer in bits, because that never varies, but whether the top bit is set. You can test this trivially with
bool isTopBitSet(uint32_t v) {
    return v & 0x80000000u;
}
(replace the unsigned hex literal with something like T{1} << (std::numeric_limits<T>::digits-1) if you want to generalise to unsigned T other than uint32_t).
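For instance, a sketch of that generalisation, reusing the isTopBitSet name from above:
#include <limits>

template <typename T>
bool isTopBitSet(T v)
{
    // test exactly the highest value bit of the unsigned type T
    return (v & (T{1} << (std::numeric_limits<T>::digits - 1))) != 0;
}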
As already hinted in a comment by @chux, you can use a combination of the sizeof operator and the CHAR_BIT macro constant. The former tells you (at compile time) the size of its argument type, in multiples of sizeof(char), i.e. bytes. The latter is the number of bits in a byte (usually 8).
You can encapsulate this nicely into a function template.
#include <climits>  // CHAR_BIT
#include <cstddef>  // std::size_t
#include <iostream> // std::cout, std::endl

template <typename T>
constexpr std::size_t
bit_size() noexcept
{
    return sizeof(T) * CHAR_BIT;
}

int
main()
{
    std::cout << bit_size<int>() << std::endl;
    std::cout << bit_size<long>() << std::endl;
}
On my implementation, it outputs 32 and 64.
Since the function is constexpr, you can use it in static contexts, such as in static_assert(bit_size<int>() >= 32, "too small");.
Try this:
#include <climits>
unsigned int bits_per_byte = CHAR_BIT;
unsigned int bits_per_integer = CHAR_BIT * sizeof(int);
The identifier CHAR_BIT represents the number of bits in a char.
The sizeof operator returns the number of char-sized locations occupied by the integer.
Multiplying them gives us the number of bits for an integer.
OP said "if it's exactly 32 bits long (8 hexadecimal characters)" and further ".. interested in knowing if the value is between power(2, 31) and power(2, 32) - 1". So it is a little fuzzy about negative 32-bit numbers.
Certainly OP wants to know the result based on the value and not the type.
#include <climits> // INT_MAX

bool integer_is_32_bits_long(int x)
{
    return
        // int is exactly 32 bits: bit 31 is set iff the value is negative
        ((INT_MAX == 0x7FFFFFFF) && (x < 0)) ||
        // int is wider than 32 bits: test the value range directly
        ((INT_MAX > 0x7FFFFFFF) && (x >= 0x80000000) && (x <= 0xFFFFFFFF));
}
Of course if int is 16-bit, then the result is always false.
I want to know if it's exactly 32 bits long (8 hexadecimal characters)
I am interested in knowing if the value is between power(2, 31) and power(2, 32) - 1
So you want to know if the upper bit is set? Then you can simply test if the number is negative:
bool upperBitSet(int x)
{
    return x < 0;
}

For unsigned numbers, you can simply shift left and back right and then check if you lost data:

bool upperBitSet(unsigned x)
{
    return (x << 1 >> 1) != x;
}
The simplest way is probably to check whether the 32nd bit is set:

bool isReally32bitsLong(uint32_t in) {
    return (in >> 31) != 0;
}

bool isExactly32BitsLong(uint64_t in) {
    return ((in >> 31) != 0) && ((in >> 32) == 0);
}

How to shift bits in big endian in C++

Here is code for a little-endian bit shift; I want to convert it to a big-endian bit shift.
Please help me out. This is actually LZW decompression code using a little-endian shift,
but I want big-endian code.
unsigned int input_code(FILE *input)
{
    unsigned int val;
    static int bitcount = 0;
    static unsigned long inbitbuf = 0L;
    while (bitcount <= 24)
    {
        inbitbuf |= (unsigned long)getc(input) << (24 - bitcount);
        bitcount += 8;
    }
    val = inbitbuf >> (32 - BITS);
    inbitbuf <<= BITS;
    bitcount -= BITS;
    return val;
}

void output_code(FILE *output, unsigned int code)
{
    static int output_bit_count = 0;
    static unsigned long output_bit_buffer = 0L;
    output_bit_buffer |= (unsigned long)code << (32 - BITS - output_bit_count);
    output_bit_count += BITS;
    while (output_bit_count >= 8)
    {
        putc(output_bit_buffer >> 24, output);
        output_bit_buffer <<= 8;
        output_bit_count -= 8;
    }
}
You probably want something like:

unsigned char raw[4];
unsigned int val;

if (4 != fread(raw, 1, 4, input)) {
    // error condition, return early or throw or something
}
val = static_cast<unsigned int>(raw[3])
    | static_cast<unsigned int>(raw[2]) << 8
    | static_cast<unsigned int>(raw[1]) << 16
    | static_cast<unsigned int>(raw[0]) << 24;
If you were doing little endian, reverse the indexes and everything else stays the same, as shown below.
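For instance, a sketch of the little-endian version under the same assumptions:
val = static_cast<unsigned int>(raw[0])
    | static_cast<unsigned int>(raw[1]) << 8
    | static_cast<unsigned int>(raw[2]) << 16
    | static_cast<unsigned int>(raw[3]) << 24;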
A good rant on endianness and the code that people seem to write, if you want more.
It's a good idea to mask (perform a bitwise AND against) the bytes one at a time before shifting them. Obviously, if you are shifting a 16-bit integer, the unmasked bits will just be pushed off either end into oblivion. But for integers larger than 16 bits (I actually had to use 24-bit integers once), it's best to mask each byte before shifting and recombining them (with a bitwise OR).
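For example, a sketch of assembling a 24-bit value from a possibly signed char buffer (raw is a hypothetical input array), masking each byte first:
char raw[3] = { '\x12', '\xAB', '\xCD' };
// mask each byte to its low 8 bits so sign extension cannot leak into the result
unsigned int val = ((unsigned int)(raw[0] & 0xFF) << 16)
                 | ((unsigned int)(raw[1] & 0xFF) << 8)
                 |  (unsigned int)(raw[2] & 0xFF);
// val == 0x12ABCD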

Unexpected result in byte representation of a variable

I scan through the byte representation of an int variable and get somewhat unexpected result.
If I do
int a = 127;
cout << (unsigned int) *((char *)&a);
I get 127 as expected. If I do
int a = 256;
cout << (unsigned int) *((char *)&a + 1);
I get 1 as expected. But if I do
int a = 128;
cout << (unsigned int) *((char *)&a);
I get 4294967168 which is, well… quite fancy.
The question is: is there a way to get 128 when looking at the first byte of an int variable whose value is 128?
For the same reason that (unsigned int)(char)128 is 4294967168: char is signed by default on most commonly used systems. 128 cannot fit in a signed 8-bit quantity, so when you cast it to char, you get -128 (0x80 in hex).
Then, when you cast -128 to an unsigned int, you get 2^32 - 128 = 4294967296 - 128 = 4294967168.
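A short demonstration of those two conversions (a sketch, assuming the usual signed 8-bit char and 32-bit unsigned int):
char c = (char)128;               // does not fit in signed char: yields -128 on typical implementations
unsigned int u = (unsigned int)c; // converted modulo 2^32: 4294967296 - 128
cout << u << endl;                // prints 4294967168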
If you want to get +128, then use an unsigned char instead of char.
char is signed here, so in your second example, *((char *)&a + 1) reads the second byte of 256 (0x00000100), which is (char)1; as an unsigned int this is 0b00000000000000000000000000000001, so it prints as 1.
In your third example, *((char *)&a) = (char)128 = -128, which converts to the unsigned int 0b11111111111111111111111110000000, i.e., 2^32 - 128, which is 4294967168.
As the comments have pointed out, it looks like what's happening here is that you are running into an oddity of two's complement. In your last cast, since you are not using an unsigned char, the highest-order bit of the byte is used to indicate the sign. You then only have 7 of the full 8 bits to represent your value, giving a range of 0 to 127 for positive numbers (-128 to 127 overall).
If you exceed this range, the value wraps, and you get -128, which when cast back to an unsigned int results in that abnormally large value.
int a = 128;
cout << (unsigned int) *((unsigned char *)&a);
Also, all of your code depends on running on a little-endian machine.
Here's how you should probably be doing these things:
int a = 127;
cout << (unsigned)(unsigned char)(0xFF & a);
int a = 256;
cout << (unsigned)(unsigned char)(0xFF & (a>>8));
int a = 128;
cout << (unsigned)(unsigned char)(0xFF & a);

Unsigned long and bit shifting

I have a problem with bit shifting and unsigned longs. Here's my test code:
char header[4];
header[0] = 0x80;
header[1] = 0x00;
header[2] = 0x00;
header[3] = 0x00;
unsigned long l1 = 0x80000000UL;
unsigned long l2 = ((unsigned long)header[0] << 24) + ((unsigned long)header[1] << 16)
                 + ((unsigned long)header[2] << 8) + (unsigned long)header[3];
cout << l1 << endl;
cout << l2 << endl;
I would expect l2 to also have the value 2147483648, but instead it prints 18446744071562067968. I assume the bit shifting of the first byte causes problems?
Hopefully somebody can explain why this fails and how I can modify the calculation of l2 so that it returns the correct value.
Thanks in advance.
Your value of 0x80 stored in a char is a signed quantity. When you cast it to a wider type, the value is sign-extended so that it keeps the same value in the larger type.
Change the type of char in the first line to unsigned char and the sign extension will not happen.
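For example, a minimal sketch of that change applied to your code:
unsigned char header[4]; // unsigned: widening now zero-extends instead of sign-extending
header[0] = 0x80;
header[1] = 0x00;
header[2] = 0x00;
header[3] = 0x00;
unsigned long l2 = ((unsigned long)header[0] << 24) + ((unsigned long)header[1] << 16)
                 + ((unsigned long)header[2] << 8) + (unsigned long)header[3];
// l2 == 2147483648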
To simplify what is happening in your case, run this:
char c = 0x80;
unsigned long l = c;
cout << l << endl;
You get this output:
18446744073709551488
which is -128 as a 64-bit integer (0x80 is -128 as an 8-bit integer).
Same result here (Linux/x86-64, GCC 4.4.5). The behavior depends on the size of unsigned long, which is at least 32 bits but may be larger.
If you want exactly 32 bits, use uint32_t instead (from the header <stdint.h>; not in C++03, but in the upcoming standard and widely supported).