How to shift bits in big endian in C++

Here is code that uses a little-endian bit shift; I want to convert it to a big-endian bit shift.
Please help me out. This is actually LZW decompression code that uses a little-endian shift,
but I want big-endian code:
unsigned int input_code(FILE *input)
{
unsigned int val;
static int bitcount=0;
static unsigned long inbitbuf=0L;
while (bitcount <= 24)
{
inbitbuf |=(unsigned long) getc(input) << (24-bitcount);
bitcount += 8;
}
val=inbitbuf >> (32-BITS);
inbitbuf <<= BITS;
bitcount -= BITS;
return(val);
}
void output_code(FILE *output,unsigned int code)
{
static int output_bit_count=0;
static unsigned long output_bit_buffer=0L;
output_bit_buffer |= (unsigned long) code << (32-BITS-output_bit_count);
output_bit_count += BITS;
while (output_bit_count >= 8)
{
putc(output_bit_buffer >> 24,output);
output_bit_buffer <<= 8;
output_bit_count -= 8;
}
}

You probably want something like this:
unsigned char raw[4];
unsigned int val;
if (4 != fread(raw, 1, 4, input)) {
// error condition, return early or throw or something
}
val = static_cast<unsigned int>(raw[3])
| static_cast<unsigned int>(raw[2]) << 8
| static_cast<unsigned int>(raw[1]) << 16
| static_cast<unsigned int>(raw[0]) << 24;
If you were doing little endian, reverse the indexes and everything else stays the same.
A good rant on endianness and the code that people seem to write, if you want more.
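To make the "reverse the indexes" remark concrete, here is a minimal sketch that assembles the same four bytes in both byte orders. The function names load_be32 and load_le32 are made up for illustration and do not come from the question's code.

```cpp
#include <cassert>
#include <cstdint>

// Interpret p[0..3] as a big-endian 32-bit value (p[0] is the most significant byte).
std::uint32_t load_be32(const unsigned char* p) {
    assert(p != nullptr);
    return  static_cast<std::uint32_t>(p[3])
         | (static_cast<std::uint32_t>(p[2]) << 8)
         | (static_cast<std::uint32_t>(p[1]) << 16)
         | (static_cast<std::uint32_t>(p[0]) << 24);
}

// Interpret p[0..3] as a little-endian 32-bit value (p[0] is the least significant byte).
std::uint32_t load_le32(const unsigned char* p) {
    assert(p != nullptr);
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

Neither function depends on the byte order of the machine running it, because the shifts operate on values rather than on memory layout.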

It's a good idea to mask (perform a bitwise AND against) the bytes one at a time before shifting them. Obviously, if you are shifting a 16-bit integer, the unmasked bits will just be pushed off either end into oblivion. But for integers larger than 16 bits (I actually had to use 24-bit integers once), it's best to mask each byte before shifting and recombining (performing a bitwise OR on) them.
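A small sketch of what can go wrong without the mask, assuming the bytes arrive as signed char values (which is what plain char is on many platforms). The function names are hypothetical.

```cpp
#include <cstdint>

// Combine two bytes (high byte first) WITHOUT masking.
// A negative byte is sign-extended when widened, so its upper bits
// leak into the result.
std::uint32_t combine_unmasked(signed char hi, signed char lo) {
    return (static_cast<std::uint32_t>(hi) << 8) | static_cast<std::uint32_t>(lo);
}

// Same combination, but each byte is masked with 0xFF after widening,
// so only its low 8 bits take part in the result.
std::uint32_t combine_masked(signed char hi, signed char lo) {
    return ((static_cast<std::uint32_t>(hi) & 0xFFu) << 8)
         |  (static_cast<std::uint32_t>(lo) & 0xFFu);
}
```

With hi = 0x12 and lo = -128 (bit pattern 0x80), the masked version yields 0x1280, while the unmasked version yields 0xFFFFFF80 because of sign extension. Reading the bytes as unsigned char in the first place avoids the problem as well.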

Related

Make a Integer from 6 bytes or more using C++

I am new to C++ programming. I am trying to write code that builds a single integer value from 6 or more individual bytes.
I have implemented the same for 4 bytes and it works.
My code for 4 bytes:
char *command = "\x42\xa0\x82\xa1\x21\x22";
__int64 value;
value = (__int64)(((unsigned char)command[2] << 24) + ((unsigned char)command[3] << 16) + ((unsigned char)command[4] << 8) + (unsigned char)command[5]);
printf("%x %x %x %x %x",command[2], command[3], command[4], command[5], value);
Using this code, the value of value is 82a12122, but when I try the same for 6 bytes the result is wrong.
Code for 6 bytes:
char *command = "\x42\xa0\x82\xa1\x21\x22";
__int64 value;
value = (__int64)(((unsigned char)command[0] << 40) + ((unsigned char)command[1] << 32) + ((unsigned char)command[2] << 24) + ((unsigned char)command[3] << 16) + ((unsigned char)command[4] << 8) + (unsigned char)command[5]);
printf("%x %x %x %x %x %x %x", command[0], command[1], command[2], command[3], command[4], command[5], value);
The output value of value is 82a163c2 which is wrong, I need 42a082a12122.
So can anyone tell me how to get the expected output and what is wrong with the 6 Byte Code.
Thanks in Advance.
Just cast each byte to a sufficiently large unsigned type before shifting. Even after integral promotion (to int), the type is not large enough to shift by 32 bits or more (in the usual case, which seems to apply to you).
See here for demonstration: https://godbolt.org/g/x855XH
unsigned long long large_ok(char x)
{
return ((unsigned long long)x) << 63;
}
unsigned long long large_incorrect(char x)
{
return ((unsigned long long)x) << 64;
}
unsigned long long still_ok(char x)
{
return ((unsigned char)x) << 31;
}
unsigned long long incorrect(char x)
{
return ((unsigned char)x) << 32;
}
In simpler terms:
The shift operators promote their operands to int/unsigned int automatically. This is why your four-byte version works: unsigned int is large enough for all your shifts. However (in your implementation, as in most common ones), it can only hold 32 bits, and the compiler will not automatically choose a 64-bit type if you shift by 32 bits or more (that would be impossible for the compiler to know).
If you use large enough integral types for the shift operands, the shift will have the larger type as the result and the shifts will do what you expect.
If you turn on warnings, your compiler will probably also warn you that you are shifting by at least as many bits as the type has, which is undefined behavior, so the result cannot be relied upon (see demonstration).
(The bit counts mentioned are of course implementation defined.)
A final note: Types beginning with double underscores (__) or underscore + capital letter are reserved for the implementation - using them is not technically "safe". Modern C++ provides you with types such as uint64_t that should have the stated number of bits - use those instead.
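Applied to the six bytes from the question, the advice above could look like the following sketch. The function name pack6 is invented for this example; the only important part is that every byte is widened to a 64-bit unsigned type before it is shifted.

```cpp
#include <cassert>
#include <cstdint>

// Build a 48-bit value from six bytes, most significant byte first.
// Each byte is cast to std::uint64_t BEFORE the shift, so shifts of
// 32 and 40 bits are well defined.
std::uint64_t pack6(const unsigned char* b) {
    assert(b != nullptr);
    return (static_cast<std::uint64_t>(b[0]) << 40)
         | (static_cast<std::uint64_t>(b[1]) << 32)
         | (static_cast<std::uint64_t>(b[2]) << 24)
         | (static_cast<std::uint64_t>(b[3]) << 16)
         | (static_cast<std::uint64_t>(b[4]) << 8)
         |  static_cast<std::uint64_t>(b[5]);
}
```

For the bytes 42 a0 82 a1 21 22 this produces 0x42a082a12122, the value the question expects.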
Your shift overflows bytes, and you are not printing the integers correctly.
This code is working:
(Take note of the print format and how the shifts are done in uint64_t)
#include <stdio.h>
#include <cstdint>
int main()
{
const unsigned char *command = (const unsigned char *)"\x42\xa0\x82\xa1\x21\x22";
uint64_t value=0;
for (int i=0; i<6; i++)
{
value <<= 8;
value += command[i];
}
printf("%x %x %x %x %x %x %llx\n",
command[0], command[1], command[2], command[3], command[4], command[5], (unsigned long long)value);
}

Bitwise shift of buffer in CUDA [duplicate]

What is the best way to implement a bitwise memmove? The method should take an additional destination and source bit-offset and the count should be in bits too.
I saw that ARM provides a non-standard _membitmove, which does exactly what I need, but I couldn't find its source.
Bind's bitset includes isc_bitstring_copy, but it's not efficient
I'm aware that the C standard library doesn't provide such a method, but I also couldn't find any third-party code providing a similar method.
Assuming "best" means "easiest", you can copy bits one by one. Conceptually, an address of a bit is an object (struct) that has a pointer to a byte in memory and an index of a bit in the byte.
struct pointer_to_bit
{
uint8_t* p;
int b;
};
void membitmovebl(
void *dest,
const void *src,
int dest_offset,
int src_offset,
size_t nbits)
{
// Create pointers to bits
struct pointer_to_bit d = {(uint8_t *)dest, dest_offset};
struct pointer_to_bit s = {(uint8_t *)src, src_offset};
// Bring the bit offsets to range (0...7)
d.p += d.b / 8; // replace division by right-shift if bit offset can be negative
d.b %= 8; // replace "%=8" by "&=7" if bit offset can be negative
s.p += s.b / 8;
s.b %= 8;
// Determine whether it's OK to loop forward
if (d.p < s.p || d.p == s.p && d.b <= s.b)
{
// Copy bits one by one
for (size_t i = 0; i < nbits; i++)
{
// Read 1 bit
int bit = (*s.p >> s.b) & 1;
// Write 1 bit
*d.p &= ~(1 << d.b);
*d.p |= bit << d.b;
// Advance pointers
if (++s.b == 8)
{
s.b = 0;
++s.p;
}
if (++d.b == 8)
{
d.b = 0;
++d.p;
}
}
}
else
{
// Copy stuff backwards - essentially the same code but ++ replaced by --
}
}
If you want to write a version optimized for speed, you will have to do copying by bytes (or, better, words), unroll loops, and handle a number of special cases (memmove does that; you will have to do more because your function is more complicated).
P.S. Oh, seeing that you call isc_bitstring_copy inefficient, you probably want the speed optimization. You can use the following idea:
Start copying bits individually until the destination is byte-aligned (d.b == 0). Then, it is easy to copy 8 bits at once, doing some bit twiddling. Do this until there are fewer than 8 bits left to copy; then continue copying bits one by one.
// Copy 8 bits from s to d and advance pointers
*d.p = *s.p++ >> s.b;
*d.p++ |= *s.p << (8 - s.b);
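The three phases described above can be sketched as follows. This is an untested-in-production illustration under stated assumptions: bit 0 is the least significant bit of byte 0, the copy runs forward only, and the source and destination bit ranges do not overlap. The names copy_bits, get_bit and set_bit are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Read the bit at absolute bit index pos (bit 0 = LSB of buf[0]).
inline int get_bit(const std::uint8_t* buf, std::size_t pos) {
    return (buf[pos / 8] >> (pos % 8)) & 1;
}

// Write the bit at absolute bit index pos.
inline void set_bit(std::uint8_t* buf, std::size_t pos, int bit) {
    const std::uint8_t mask = static_cast<std::uint8_t>(1u << (pos % 8));
    if (bit) buf[pos / 8] |= mask;
    else     buf[pos / 8] &= static_cast<std::uint8_t>(~mask);
}

// Forward copy of nbits bits from (src, src_bit) to (dst, dst_bit).
void copy_bits(std::uint8_t* dst, std::size_t dst_bit,
               const std::uint8_t* src, std::size_t src_bit,
               std::size_t nbits) {
    assert(dst != nullptr && src != nullptr);
    // Phase 1: single bits until the destination is byte-aligned.
    while (nbits > 0 && dst_bit % 8 != 0) {
        set_bit(dst, dst_bit++, get_bit(src, src_bit++));
        --nbits;
    }
    // Phase 2: whole destination bytes, each assembled from up to two source bytes.
    const unsigned shift = static_cast<unsigned>(src_bit % 8);
    while (nbits >= 8) {
        const std::uint8_t* s = src + src_bit / 8;
        unsigned byte = static_cast<unsigned>(s[0]) >> shift;
        if (shift != 0) {
            // The remaining high bits come from the low bits of the next source byte.
            byte |= static_cast<unsigned>(s[1]) << (8 - shift);
        }
        dst[dst_bit / 8] = static_cast<std::uint8_t>(byte);
        src_bit += 8;
        dst_bit += 8;
        nbits -= 8;
    }
    // Phase 3: leftover bits, one by one.
    while (nbits > 0) {
        set_bit(dst, dst_bit++, get_bit(src, src_bit++));
        --nbits;
    }
}
```

Bits of the destination outside the copied range are left untouched. A faster variant would copy words instead of bytes in phase 2, which follows the same pattern with wider shifts.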
P.P.S Oh, and seeing your comment on what you are going to use the code for, you don't really need to implement all the versions (byte/halfword/word, big/little-endian); you only want the easiest one - the one working with words (uint32_t).
Here is a partial implementation (not tested). There are obvious efficiency and usability improvements.
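Copy n bytes from src to dest (not overlapping src), and shift the bits at dest rightwards by bit bits, 0 <= bit <= 7. This assumes that the least significant bits are at the right of the bytes.
#include <string.h>
void memcpy_with_bitshift(unsigned char *dest, const unsigned char *src, size_t n, int bit)
{
size_t i;
memcpy(dest, src, n);
for (i = 0; i < n; i++) {
dest[i] >>= bit; // shift each byte in place
}
// Carry the low bits of each source byte into the top of the next destination byte;
// stop one byte early so that dest[n] is never written.
for (i = 0; i + 1 < n; i++) {
dest[i+1] |= (unsigned char)(src[i] << (8 - bit));
}
}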
Some improvements to be made:
Don't overwrite first bit bits at beginning of dest.
Merge loops
Have a way to copy a number of bits not divisible by 8
Fix for >8 bits in a char

Check value of least significant bit (LSB) and most significant bit (MSB) in C/C++

I need to check the value of the least significant bit (LSB) and most significant bit (MSB) of an integer in C/C++. How would I do this?
//int value;
int LSB = value & 1;
Alternatively (which is not theoretically portable, but practically it is - see Steve's comment)
//int value;
int LSB = value % 2;
Details:
The second formula is simpler. The % operator is the remainder operator. A number's LSB is 1 iff it is an odd number and 0 otherwise. So we check the remainder of dividing with 2. The logic of the first formula is this: number 1 in binary is this:
0000...0001
If you binary-AND this with an arbitrary number, all the bits of the result will be 0 except the last one because 0 AND anything else is 0. The last bit of the result will be 1 iff the last bit of your number was 1 because 1 & 1 == 1 and 1 & 0 == 0
This is a good tutorial for bitwise operations.
HTH.
You can do something like this:
#include <iostream>
int main(int argc, char **argv)
{
int a = 3;
std::cout << (a & 1) << std::endl;
return 0;
}
This way you AND your variable with the LSB, because
3: 011
1: 001
in 3-bit representation. So being AND:
AND
-----
0 0 | 0
0 1 | 0
1 0 | 0
1 1 | 1
You will be able to know if LSB is 1 or not.
edit: find MSB.
First of all, read the Endianness article to agree on what MSB means. In the following lines we assume big-endian notation.
To find the MSB, the following snippet right-shifts the value until the MSB reaches the LSB position, where it can be ANDed with 1.
Consider the following code:
#include <iostream>
#include <limits.h>
int main(int argc, char **argv)
{
unsigned int a = 128; // we want to find MSB of this 32-bit unsigned int
int MSB = 0; // this variable will represent the MSB we're looking for
// sizeof(unsigned int) = 4 (in Bytes)
// 1 Byte = 8 bits
// So 4 Bytes are 4 * 8 = 32 bits
// We have to perform a right shift 32 times to have the
// MSB in the LSB position.
for (int i = sizeof(unsigned int) * 8; i > 0; i--) {
MSB = (a & 1); // in the last iteration this contains the MSB value
a >>= 1; // perform the 1-bit right shift
}
// this prints out '0', because the 32-bit representation of
// unsigned int 128 is:
// 00000000000000000000000010000000
std::cout << "MSB: " << MSB << std::endl;
return 0;
}
If you print MSB outside of the cycle you will get 0.
If you change the value of a:
unsigned int a = UINT_MAX; // found in <limits.h>
MSB will be 1, because its 32-bit representation is:
UINT_MAX: 11111111111111111111111111111111
However, if you do the same thing with a signed integer things will be different.
#include <iostream>
#include <limits.h>
int main(int argc, char **argv)
{
int a = -128; // we want to find MSB of this 32-bit signed int
int MSB = 0; // this variable will represent the MSB we're looking for
// sizeof(int) = 4 (in Bytes)
// 1 Byte = 8 bits
// So 4 Bytes are 4 * 8 = 32 bits
// We have to perform a right shift 32 times to have the
// MSB in the LSB position.
for (int i = sizeof(int) * 8; i > 0; i--) {
MSB = (a & 1); // in the last iteration this contains the MSB value
a >>= 1; // perform the 1-bit right shift
}
// this prints out '1', because the 32-bit two's complement
// representation of int -128 is:
// 11111111111111111111111110000000
std::cout << "MSB: " << MSB << std::endl;
return 0;
}
As I said in the comment below, the MSB of a positive integer is always 0, while the MSB of a negative integer is always 1.
You can check INT_MAX 32-bit representation:
INT_MAX: 01111111111111111111111111111111
Now, why does the loop use sizeof()?
If you simply write the loop as I did in the comment (sorry for the missing = in the comment):
for (; a != 0; a >>= 1)
MSB = a & 1;
you will always get 1, because the loop stops at the highest set bit and never considers the zero-padding bits above it (you specified a != 0 as the exit condition). For example, for 32-bit integers we have:
int 7 : 00000000000000000000000000000111
                                     ^ this will be your fake MSB
                                       without considering the full size
                                       of the variable.
int 16: 00000000000000000000000000010000
                                   ^ fake MSB
int LSB = value & 1;
int MSB = value >> (sizeof(value)*8 - 1) & 1;
Others have already mentioned:
int LSB = value & 1;
for getting the least significant bit. But there is a cheatier way to get the MSB than has been mentioned. If the value is a signed type already, just do:
int MSB = value < 0;
If it's an unsigned quantity, cast it to the signed type of the same size, e.g. if value was declared as unsigned, do:
int MSB = (int)value < 0;
Yes, officially, not portable (converting an out-of-range unsigned value to a signed type is implementation-defined before C++20), whatever. But on every two's complement system and every compiler for them that I'm aware of, it happens to work; after all, the high bit is the sign bit, so if the signed form is negative, then the MSB is 1, if it's non-negative, the MSB is 0. So conveniently, a signed test for negative numbers is equivalent to retrieving the MSB.
LSB is easy. Just x & 1.
MSB is a bit trickier, as bytes may not be 8 bits, sizeof(int) may not be 4, and there might be padding bits to the right.
Also, with a signed integer, do you mean the sign bit or the most significant value bit?
If you mean the sign bit, life is easy. It's just x < 0.
If you mean the most significant value bit, to be completely portable:
int answer = 0;
int rack = 1;
int mask = 1;
while(rack < INT_MAX)
{
rack <<= 1;
mask <<= 1;
rack |= 1;
}
return x & mask;
That's a long-winded way of doing it. In reality
x & (1 << (sizeof(int) * CHAR_BIT - 2));
will be quite portable enough and your ints won't have padding bits.
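For unsigned values, one way to avoid hard-coding 8-bit bytes or a 4-byte int is to ask the standard library how many value bits the type has. The sketch below uses std::numeric_limits<unsigned int>::digits for that; the helper names lsb and msb are invented for illustration.

```cpp
#include <limits>

// Least significant bit of v: 1 if v is odd, 0 otherwise.
inline int lsb(unsigned int v) {
    return static_cast<int>(v & 1u);
}

// Most significant bit of v. digits is the number of value bits in
// unsigned int, so (digits - 1) is the index of the top bit.
inline int msb(unsigned int v) {
    return static_cast<int>((v >> (std::numeric_limits<unsigned int>::digits - 1)) & 1u);
}
```

For a signed value, converting it to unsigned int first and then calling msb reports the sign bit on two's complement systems, which matches the value < 0 test mentioned above.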

Integer Byte Swapping in C++

I'm working on a homework assignment for my C++ class. The question I am working on reads as follows:
Write a function that takes an unsigned short int (2 bytes) and swaps the bytes. For example, if the x = 258 ( 00000001 00000010 ) after the swap, x will be 513 ( 00000010 00000001 ).
Here is my code so far:
#include <iostream>
using namespace std;
unsigned short int ByteSwap(unsigned short int *x);
int main()
{
unsigned short int x = 258;
ByteSwap(&x);
cout << endl << x << endl;
system("pause");
return 0;
}
and
unsigned short int ByteSwap(unsigned short int *x)
{
long s;
long byte1[8], byte2[8];
for (int i = 0; i < 16; i++)
{
s = (*x >> i)%2;
if(i < 8)
{
byte1[i] = s;
cout << byte1[i];
}
if(i == 8)
cout << " ";
if(i >= 8)
{
byte2[i-8] = s;
cout << byte2[i];
}
}
//Here I need to swap the two bytes
return *x;
}
My code has two problems I am hoping you can help me solve.
For some reason both of my bytes are 01000000
I really am not sure how I would swap the bytes. My teacher's notes on bit manipulation are very broken and hard to follow and do not make much sense to me.
Thank you very much in advance. I truly appreciate you helping me.
New in C++23:
The standard library now has a function that provides exactly this facility:
#include <iostream>
#include <bit>
int main() {
unsigned short x = 258;
x = std::byteswap(x);
std::cout << x << std::endl;
}
Original Answer:
I think you're overcomplicating it, if we assume a short consists of 2 bytes (16 bits), all you need
to do is
extract the high byte hibyte = (x & 0xff00) >> 8;
extract the low byte lobyte = (x & 0xff);
combine them in the reverse order x = lobyte << 8 | hibyte;
It looks like you are trying to swap them a single bit at a time. That's a bit... crazy. What you need to do is isolate the 2 bytes and then just do some shifting. Let's break it down:
uint16_t x = 258;
uint16_t hi = (x & 0xff00); // isolate the upper byte with the AND operator
uint16_t lo = (x & 0xff); // isolate the lower byte with the AND operator
Now you just need to recombine them in the opposite order:
uint16_t y = (lo << 8); // shift the lower byte to the high position and assign it to y
y |= (hi >> 8); // OR in the upper half, into the low position
Of course this can be done in fewer steps. For example:
uint16_t y = (lo << 8) | (hi >> 8);
Or to swap without using any temporary variables:
uint16_t y = ((x & 0xff) << 8) | ((x & 0xff00) >> 8);
You're making hard work of that.
You only need to exchange the bytes. So work out how to extract the two byte values, then how to re-assemble them the other way around.
(homework so no full answer given)
EDIT: Not sure why I bothered :) The usefulness of an answer to a homework question is measured by how much the OP (and maybe other readers) learn, which isn't maximized by giving the answer to the homework question directly...
Here is an unrolled example to demonstrate byte by byte:
unsigned int swap_bytes(unsigned int original_value)
{
unsigned int new_value = 0; // Start with a known value.
unsigned int byte; // Temporary variable.
// Copy the lowest order byte from the original to
// the new value:
byte = original_value & 0xFF; // Keep only the lowest byte from original value.
new_value = new_value * 0x100; // Shift one byte left to make room for a new byte.
new_value |= byte; // Put the byte, from original, into new value.
// For the next byte, shift the original value by one byte
// and repeat the process:
original_value = original_value >> 8; // 8 bits per byte.
byte = original_value & 0xFF; // Keep only the lowest byte from original value.
new_value = new_value * 0x100; // Shift one byte left to make room for a new byte.
new_value |= byte; // Put the byte, from original, into new value.
//...
return new_value;
}
Ugly implementation of Jerry's suggestion to treat the short as an array of two bytes:
#include <iostream>
typedef union mini
{
unsigned char b[2];
short s;
} micro;
int main()
{
micro x;
x.s = 258;
unsigned char tmp = x.b[0];
x.b[0] = x.b[1];
x.b[1] = tmp;
std::cout << x.s << std::endl;
}
Using library functions, the following code may be useful (in a non-homework context):
unsigned long swap_bytes_with_value_size(unsigned long value, unsigned int value_size) {
switch (value_size) {
case sizeof(char):
return value;
case sizeof(short):
return _byteswap_ushort(static_cast<unsigned short>(value));
case sizeof(int):
return _byteswap_ulong(value);
case sizeof(long long):
return static_cast<unsigned long>(_byteswap_uint64(value));
default:
printf("Invalid value size");
return 0;
}
}
The byte swapping functions are defined in stdlib.h at least when using the MinGW toolchain.
#include <stdio.h>
int main()
{
unsigned short a = 258;
a = (a>>8)|((a&0xff)<<8);
printf("%d",a);
}
While you can do this with bit manipulation, you can also do without, if you prefer. Either way, you shouldn't need any loops though. To do it without bit manipulation, you'd view the short as an array of two chars, and swap the two chars, in roughly the same way as you would swap two items while (for example) sorting an array.
To do it with bit manipulation, the swapped version is basically the lower byte shifted left 8 bits ORed with the upper half shifted right 8 bits. You'll probably want to treat it as an unsigned type though, to ensure the upper half doesn't get filled with one bits when you do the right shift.
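Both approaches can be sketched side by side. The array version below copies the value into a byte array with std::memcpy rather than going through a union or a pointer cast; that is a choice made for this example, not something the answer prescribes. The function names are hypothetical, and a 2-byte std::uint16_t is assumed.

```cpp
#include <cstdint>
#include <cstring>
#include <utility>

// Without bit manipulation: view the value as two bytes and swap them.
std::uint16_t swap_bytes_array(std::uint16_t x) {
    static_assert(sizeof(x) == 2, "sketch assumes a 2-byte uint16_t");
    unsigned char b[sizeof(x)];
    std::memcpy(b, &x, sizeof(x));   // copy the object representation out
    std::swap(b[0], b[1]);           // swap the two bytes like array elements
    std::memcpy(&x, b, sizeof(x));   // copy it back
    return x;
}

// With bit manipulation: low byte shifted left, ORed with high byte shifted right.
std::uint16_t swap_bytes_shift(std::uint16_t x) {
    return static_cast<std::uint16_t>((x << 8) | (x >> 8));
}
```

Both functions turn 258 (0x0102) into 513 (0x0201). Because the type is unsigned, the right shift fills the vacated upper bits with zeros.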
This should also work for you.
#include <iostream>
int main() {
unsigned int i = 0xCCFF;
std::cout << std::hex << i << std::endl;
i = ( ((i<<8) & 0xFFFF) | ((i >>8) & 0xFFFF)); // swaps the bytes
std::cout << std::hex << i << std::endl;
}
A bit old fashioned, but still a good bit of fun.
XOR swap: ( see How does XOR variable swapping work? )
#include <iostream>
#include <stdint.h>
int main()
{
uint16_t x = 0x1234;
uint8_t *a = reinterpret_cast<uint8_t*>(&x);
std::cout << std::hex << x << std::endl;
// Three separate statements, so the result does not depend on evaluation order
*(a+0) ^= *(a+1);
*(a+1) ^= *(a+0);
*(a+0) ^= *(a+1);
std::cout << std::hex << x << std::endl;
}
This is a problem:
byte2[i-8] = s;
cout << byte2[i];//<--should be i-8 as well
This is causing a buffer overrun.
However, that's not a great way to do it. Look into the bit shift operators << and >>.

Big Endian and Little Endian for Files in C++

I am trying to write some processor independent code to write some files in big endian. I have a sample of code below and I can't understand why it doesn't work. All it is supposed to do is let byte store each byte of data one by one in big endian order. In my actual program I would then write the individual byte out to a file, so I get the same byte order in the file regardless of processor architecture.
#include <iostream>
int main (int argc, char * const argv[]) {
long data = 0x12345678;
long bitmask = (0xFF << (sizeof(long) - 1) * 8);
char byte = 0;
for(long i = 0; i < sizeof(long); i++) {
byte = data & bitmask;
data <<= 8;
}
return 0;
}
For some reason byte always has the value of 0. This confuses me, I am looking at the debugger and see this:
data = 00010010001101000101011001111000
bitmask = 11111111000000000000000000000000
I would think that data & mask would give 00010010, but it just makes byte 00000000 every time! How can this be? I have written some code for the little endian order and this works great, see below:
#include <iostream>
int main (int argc, char * const argv[]) {
long data = 0x12345678;
long bitmask = 0xFF;
char byte = 0;
for(long i = 0; i < sizeof(long); i++) {
byte = data & bitmask;
data >>= 8;
}
return 0;
}
Why does the little endian one work and the big endian not? Thanks for any help :-)
You should use the standard functions ntohl() and kin for this. They operate on explicitly sized variables (i.e. uint16_t and uint32_t) rather than the compiler-specific long, which is necessary for portability.
Some platforms provide 64-bit versions in <endian.h>
In your example, data is 0x12345678.
Your first assignment to byte is therefore:
byte = 0x12000000;
which won't fit in a byte, so it gets truncated to zero.
try:
byte = (data & bitmask) >> ((sizeof(long) - 1) * 8);
You're getting the shifting all wrong.
#include <iostream>
int main (int argc, char * const argv[]) {
long data = 0x12345678;
int shift = (sizeof(long) - 1) * 8;
const unsigned long mask = 0xff;
char byte = 0;
for (long i = 0; i < sizeof(long); i++, shift -= 8) {
byte = (data & (mask << shift)) >> shift;
}
return 0;
}
Now, I wouldn't recommend you do things this way. I would recommend instead writing some nice conversion functions. Many compilers have these as builtins. So you can write your functions to do it the hard way, then switch them to just forward to the compiler builtin when you figure out what it is.
#include <cstdint> // To get uint16_t, uint32_t and so on.
inline void to_bigendian(uint16_t val, char bytes[2])
{
bytes[0] = (val >> 8) & 0xffu;
bytes[1] = val & 0xffu;
}
inline void to_bigendian(uint32_t val, char bytes[4])
{
bytes[0] = (val >> 24) & 0xffu;
bytes[1] = (val >> 16) & 0xffu;
bytes[2] = (val >> 8) & 0xffu;
bytes[3] = val & 0xffu;
}
This code is simpler and easier to understand than your loop. It's also faster. And lastly, it is recognized by some compilers and automatically turned into the single byte swap operation that would be required on most CPUs.
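Since the question is about writing files, here is a hedged sketch that sends the bytes of any unsigned integer to a std::ostream in big-endian order. It is a loop-based generalization, not the compiler-recognized pattern described above, and it assumes 8-bit bytes. The names write_big_endian and to_big_endian_string are invented for this example.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>

// Write value to out, most significant byte first, regardless of host byte order.
template <typename T>
void write_big_endian(std::ostream& out, T value) {
    static_assert(std::is_unsigned<T>::value, "expects an unsigned integer type");
    static_assert(CHAR_BIT == 8, "sketch assumes 8-bit bytes");
    for (std::size_t i = sizeof(T); i-- > 0;) {
        // Shift the wanted byte down to the low position, then keep only that byte.
        out.put(static_cast<char>((value >> (i * CHAR_BIT)) & 0xFFu));
    }
}

// Convenience helper: collect the big-endian bytes into a string.
template <typename T>
std::string to_big_endian_string(T value) {
    std::ostringstream out;
    write_big_endian(out, value);
    return out.str();
}
```

Passing a std::ofstream opened in binary mode to write_big_endian would produce the same byte sequence in a file. Using a fixed-width type such as std::uint32_t instead of long keeps the file format the same across platforms.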
Because you are masking off the top byte from an integer and then not shifting it back down 24 bits.
Change your loop to:
for(long i = 0; i < sizeof(long); i++) {
byte = (data & bitmask) >> 24;
data <<= 8;
}