Below is an example of processing very similar to what I am working with. I understand the concept of endianness and have read through the suggested posts but it doesn't seem to explain what is happening here.
I have an array of unsigned characters that I am packing with data. I was under the impression that memcpy was endianness-agnostic; I would think that the left-most bit would stay the left-most bit. However, when I attempt to print the characters, each word is copied backwards.
Why does this happen?
#include <iostream>
#include <cstring>
#include <array>
const unsigned int MAX_VALUE = 64ul;
typedef unsigned char DDS_Octet[MAX_VALUE];
int main()
{
// create an array and populate it with printable
// characters
DDS_Octet octet;
for(int i = 0; i < MAX_VALUE; ++i)
octet[i] = (i + 33);
// print characters before the memcpy operation
for(int i = 0; i < MAX_VALUE; ++i)
{
if(i && !(i % 4)) std::cout << "\n";
std::cout << octet[i] << "\t";
}
std::cout << "\n\n------------------------------\n";
// This is an equivalent copy operation
// to what is actually being used
std::array<unsigned int, 16> arr;
memcpy(
arr.data(),
octet,
sizeof(octet));
// print the character contents of each
// word left to right (MSB to LSB on little endian)
for(auto i : arr)
std::cout
<< (char)(i >> 24) << "\t"
<< (char)((i >> 16) & 0xFF) << "\t"
<< (char)((i >> 8) & 0xFF) << "\t"
<< (char)(i & 0xFF) << "\n";
** output **
! " # $
% & ' (
) * + ,
- . / 0
1 2 3 4
5 6 7 8
9 : ; <
= > ? @
A B C D
E F G H
I J K L
M N O P
Q R S T
U V W X
Y Z [ \
] ^ _ `
------------------------------
$ # " !
( ' & %
, + * )
0 / . -
4 3 2 1
8 7 6 5
< ; : 9
@ ? > =
D C B A
H G F E
L K J I
P O N M
T S R Q
X W V U
\ [ Z Y
` _ ^ ]
----Update-----
I took a look at the memcpy source code (below), which was far simpler than expected. It actually explains everything. It would seem that it is correct to say that the endianness of the integer is the cause of this, but incorrect to say that memcpy plays no role. What I was overlooking was that the data is copied byte by byte. Given that, it makes sense that the little-endian integer would reverse it.
void *
memcpy (void *dest, const void *src, size_t len)
{
char *d = dest;
const char *s = src;
while (len--)
*d++ = *s++;
return dest;
}
When you memcpy 4 chars into a 4-byte unsigned int they get stored in the same order they were in the original array. That is, the first char in the input array will be stored in the lowest address byte of the unsigned int, the second in the second lowest address byte, and so on.
x86 is little-endian. The lowest address byte of an unsigned int is the least significant byte.
The shift operators are endianness-independent though. They work on the logical representation of an integer, not the physical bytes. That means, for an unsigned int i on a little-endian platform, i & 0xFF gives the lowest address byte and (i >> 24) & 0xFF gives the highest address byte, while on a big-endian platform i & 0xFF gives the highest address byte and (i >> 24) & 0xFF gives the lowest address byte.
Taken together, these three facts explain why your data is reversed. '!' is the first char in your array, so when you memcpy that array into an array of unsigned int, '!' becomes the lowest address byte of the first unsigned int in the destination array. The lowest address byte is the least significant on your little-endian platform, and so that is the byte you retrieve with i & 0xFF.
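If the goal is simply to recover the characters in their original order regardless of the platform's endianness, one option is to read the bytes of each word directly instead of shifting. A minimal sketch (assuming the same arr as in the question; the function name is illustrative):
#include <cstddef>
#include <cstring>
#include <iostream>
// Print each word's bytes in memory order. This is endianness-independent
// because it inspects the storage rather than the integer's value.
void print_in_memory_order(const unsigned int* words, std::size_t count)
{
    for (std::size_t w = 0; w < count; ++w)
    {
        unsigned char bytes[sizeof(unsigned int)];
        std::memcpy(bytes, &words[w], sizeof bytes); // copy the storage
        for (unsigned char b : bytes)
            std::cout << b << "\t"; // prints '!', '"', '#', '$', ... in order
        std::cout << "\n";
    }
}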
Maybe this will make it easier to understand. Let's say we have these data defined:
uint32_t val = 0x01020304;
auto *pi = reinterpret_cast<unsigned char *>( &val );
The following code will produce the same result on big-endian and little-endian platforms:
std::cout << ( (val >> 24) & 0xFF ) << '\t'
<< ( (val >> 16) & 0xFF ) << '\t'
<< ( (val >> 8) & 0xFF ) << '\t'
<< ( (val >> 0) & 0xFF ) << '\n';
but this code will have different output:
std::cout << static_cast<unsigned int>( pi[0] ) << '\t'
<< static_cast<unsigned int>( pi[1] ) << '\t'
<< static_cast<unsigned int>( pi[2] ) << '\t'
<< static_cast<unsigned int>( pi[3] ) << '\n';
It has nothing to do with memcpy(); it is how ints are stored in memory and how the bit-shifting operations work.
On a little-endian machine, the value 0x12345678 is stored as the 4 bytes 0x78 0x56 0x34 0x12. But 0x12345678 >> 24 is still 0x12, because the shift operates on the value, not on the 4 separate bytes.
If you have the 4 bytes 0x78 0x56 0x34 0x12 and interpret them as a 4-byte little-endian integer, you get 0x12345678. If you right-shift by 24 bits, you get the 4th byte: 0x12. If you right-shift by 16 bits and mask with 0xff, you get the 3rd byte: 0x34. And so on, because ((0x12345678 >> 16) & 0xff) == 0x34.
The memcpy has nothing to do with it.
Related
I had already asked a question about how to get 4 int8_t values into a 32-bit int; I was told that I have to cast each int8_t to uint8_t first to pack it with bit shifting into a 32-bit integer.
int8_t offsetX = -10;
int8_t offsetY = 120;
int8_t offsetZ = -60;
using U = std::uint8_t;
int toShader = (U(offsetX) << 24) | (U(offsetY) << 16) | (U(offsetZ) << 8) | (0 << 0);
std::cout << (int)(toShader >> 24) << " "<< (int)(toShader >> 16) << " " << (int)(toShader >> 8) << std::endl;
My Output is
-10 -2440 -624444
It's not what I expected, of course. Does anyone have a solution?
In the shader I want to unpack the values again later, and that is only possible through a 32-bit integer because GLSL does not have any smaller integer types.
int offsetX = data[gl_InstanceID * 3 + 2] >> 24;
int offsetY = data[gl_InstanceID * 3 + 2] >> 16 ;
int offsetZ = data[gl_InstanceID * 3 + 2] >> 8 ;
What is written in the square brackets does not matter; this is about the correct shifting of the bits, or the casting after the bracket.
If any of the offsets is negative, then the shift results in undefined behaviour.
Solution: Convert the offsets to an unsigned type first.
However, this brings another potential problem: if you convert to a large unsigned type, then negative numbers will have bits set in the most significant bytes, and ORing those bits in will always produce 1s regardless of offsetX and offsetY. One solution is to convert to a small unsigned type (std::uint8_t); another is to mask off the unused bytes. The former is probably simpler:
using U = std::uint8_t;
int third = U(offsetX) << 24u
| U(offsetY) << 16u
| U(offsetZ) << 8u
| 0u << 0u;
I think you're forgetting to mask the bits that you care about before shifting them.
Perhaps this is what you're looking for:
int32 offsetX = (data[gl_InstanceID * 3 + 2] & 0xFF000000) >> 24;
int32 offsetY = (data[gl_InstanceID * 3 + 2] & 0x00FF0000) >> 16 ;
int32 offsetZ = (data[gl_InstanceID * 3 + 2] & 0x0000FF00) >> 8 ;
if (offsetX & 0x80) offsetX |= 0xFFFFFF00;
if (offsetY & 0x80) offsetY |= 0xFFFFFF00;
if (offsetZ & 0x80) offsetZ |= 0xFFFFFF00;
Without the bit mask, the X part will end up in offsetY, and the X and Y part in offsetZ.
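For reference, here is a minimal round-trip sketch combining the two answers above: pack the three int8_t offsets through uint8_t, then unpack by masking one byte and casting back to std::int8_t to restore the sign (this relies on the usual two's-complement narrowing, guaranteed since C++20):
#include <cstdint>
#include <iostream>
int main()
{
    std::int8_t offsetX = -10, offsetY = 120, offsetZ = -60;
    // Pack: go through uint8_t so each negative value contributes
    // exactly one byte (no sign extension into the other slots).
    using U = std::uint8_t;
    std::uint32_t packed = (std::uint32_t(U(offsetX)) << 24)
                         | (std::uint32_t(U(offsetY)) << 16)
                         | (std::uint32_t(U(offsetZ)) << 8);
    // Unpack: isolate one byte, then cast back to int8_t.
    auto x = std::int8_t((packed >> 24) & 0xFF);
    auto y = std::int8_t((packed >> 16) & 0xFF);
    auto z = std::int8_t((packed >> 8) & 0xFF);
    std::cout << int(x) << " " << int(y) << " " << int(z) << "\n"; // -10 120 -60
}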
On the CPU side you can use a union to avoid bit shifts, bit masking and branches...
int8_t x,y,z,w; // your 8bit ints
int32_t i; // your 32bit int
union my_union // just helper union for the casting
{
int8_t i8[4];
int32_t i32;
} a;
// 4x8bit -> 32bit
a.i8[0]=x;
a.i8[1]=y;
a.i8[2]=z;
a.i8[3]=w;
i=a.i32;
// 32bit -> 4x8bit
a.i32=i;
x=a.i8[0];
y=a.i8[1];
z=a.i8[2];
w=a.i8[3];
If you do not like unions, the same can be done with pointers...
Beware: on the GLSL side this is not possible (neither unions nor pointers), and you have to use bit shifts and masks as in the other answer...
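One caveat: reading a union member other than the one last written is technically undefined behaviour in C++ (it is well-defined in C, and most compilers support it as an extension). A strictly portable alternative is std::memcpy, which optimizers usually compile to the same code. A sketch:
#include <cstdint>
#include <cstring>
// Pack four 8-bit ints into one 32-bit int without union type punning.
// Like the union, the resulting byte order follows the host endianness.
std::int32_t pack4(std::int8_t x, std::int8_t y, std::int8_t z, std::int8_t w)
{
    std::int8_t bytes[4] = { x, y, z, w };
    std::int32_t i;
    std::memcpy(&i, bytes, sizeof i);
    return i;
}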
So this sensor I have returns a signed value between -500 and 500 by returning two (high and low) signed bytes. How can I use these to figure out what the actual value is? I know I need to do 2's complement, but I'm not sure how. This is what I have now:
real_velocity = temp.values[0];
if(temp.values[1] != -1)
real_velocity += temp.values[1];
//if high byte > 1, negative number - take 2's complement
if(temp.values[1] > 1) {
real_velocity = ~real_velocity;
real_velocity += 1;
}
But it just returns the negative value of what would be a positive. So for instance, -200 returns bytes 255 (high) and 56 (low). Added together these are 311, but when I run the above code it tells me -311. Thank you for any help.
-200 in hex is 0xFF38, so you're getting the two bytes 0xFF and 0x38. Converting these back to decimal you might recognise them:
0xFF = 255
0x38 = 56
Your sensor is not returning 2 signed bytes but simply the high and low bytes of a signed 16-bit number. So your result is
value = (highbyte << 8) + lowbyte
with value being a 16-bit signed variable.
Based on the example you gave, it appears that the value is already 2's complement. You just need to shift the high byte left 8 bits and OR the values together.
real_velocity = (short) (temp.values[0] | (temp.values[1] << 8));
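A quick check with the bytes from the question (255 high, 56 low); the narrowing conversion to short relies on the usual two's-complement wraparound, guaranteed since C++20:
#include <iostream>
int main()
{
    unsigned char low = 56, high = 255;
    short real_velocity = short(low | (high << 8)); // 0xFF38
    std::cout << real_velocity << "\n";             // prints -200
}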
You can shift the bits and mask the values.
int main()
{
char data[2];
data[0] = 0xFF; //high
data[1] = 56; //low
int value = 0;
if (data[0] & 0x80) //sign
value = 0xFFFF8000;
value |= ((data[0] & 0x7F) << 8) | data[1];
std::cout<<std::hex<<value<<std::endl;
std::cout<<std::dec<<value<<std::endl;
std::cin.get();
}
Output:
ffffff38
-200
real_velocity = temp.values[0];     // take the first byte
real_velocity = real_velocity << 8; // shift it into the upper half
real_velocity |= temp.values[1];    // OR in the other byte
// And, assuming 32-bit integers: shift the 16-bit result to the top of
// the int and back down, so the arithmetic right shift replicates the
// sign bit through the upper 16 bits.
real_velocity <<= 16;
real_velocity >>= 16;
For 8-bit bytes, first just convert to unsigned:
typedef unsigned char Byte;
unsigned const u = (Byte( temp.values[1] ) << 8) | Byte( temp.values[0] );
Then if that is greater than the upper range for 16-bit two's complement, subtract 2^16:
int const i = int(u >= (1u << 15)? u - (1u << 16) : u);
You could do tricks at the bit level, but I don't think there's any point in that.
The above assumes that CHAR_BIT == 8, that unsigned is more than 16 bits, and that the machine and the desired result use two's complement.
#include <iostream>
using namespace std;
int main()
{
typedef unsigned char Byte;
struct { char values[2]; } const temp = { 56, char(255) }; // low byte 56, high byte 255
unsigned const u = (Byte( temp.values[1] ) << 8) | Byte( temp.values[0] );
int const i = int(u >= (1u << 15)? u - (1u << 16) : u);
cout << i << endl;
}
I'm working on a homework assignment for my C++ class. The question I am working on reads as follows:
Write a function that takes an unsigned short int (2 bytes) and swaps the bytes. For example, if the x = 258 ( 00000001 00000010 ) after the swap, x will be 513 ( 00000010 00000001 ).
Here is my code so far:
#include <iostream>
#include <cstdlib> // for system()
using namespace std;
unsigned short int ByteSwap(unsigned short int *x);
int main()
{
unsigned short int x = 258;
ByteSwap(&x);
cout << endl << x << endl;
system("pause");
return 0;
}
and
unsigned short int ByteSwap(unsigned short int *x)
{
long s;
long byte1[8], byte2[8];
for (int i = 0; i < 16; i++)
{
s = (*x >> i)%2;
if(i < 8)
{
byte1[i] = s;
cout << byte1[i];
}
if(i == 8)
cout << " ";
if(i >= 8)
{
byte2[i-8] = s;
cout << byte2[i];
}
}
//Here I need to swap the two bytes
return *x;
}
My code has two problems I am hoping you can help me solve.
For some reason both of my bytes are 01000000
I really am not sure how I would swap the bytes. My teachers notes on bit manipulation are very broken and hard to follow and do not make much sense me.
Thank you very much in advance. I truly appreciate you helping me.
New in C++23:
The standard library now has a function that provides exactly this facility:
#include <iostream>
#include <bit>
int main() {
unsigned short x = 258;
x = std::byteswap(x);
std::cout << x << std::endl;
}
Original Answer:
I think you're overcomplicating it. If we assume a short consists of 2 bytes (16 bits), all you need to do is the following (a complete function follows below):
extract the high byte hibyte = (x & 0xff00) >> 8;
extract the low byte lobyte = (x & 0xff);
combine them in the reverse order x = lobyte << 8 | hibyte;
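Put together as a complete function (a minimal sketch; the name byteswap16 is just illustrative):
#include <iostream>
unsigned short byteswap16(unsigned short x)
{
    unsigned short hibyte = (x & 0xff00) >> 8; // extract the high byte
    unsigned short lobyte = (x & 0xff);        // extract the low byte
    return lobyte << 8 | hibyte;               // combine in the reverse order
}
int main()
{
    std::cout << byteswap16(258) << "\n"; // prints 513
}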
It looks like you are trying to swap them a single bit at a time. That's a bit... crazy. What you need to do is isolate the 2 bytes and then just do some shifting. Let's break it down:
uint16_t x = 258;
uint16_t hi = (x & 0xff00); // isolate the upper byte with the AND operator
uint16_t lo = (x & 0xff); // isolate the lower byte with the AND operator
Now you just need to recombine them in the opposite order:
uint16_t y = (lo << 8); // shift the lower byte to the high position and assign it to y
y |= (hi >> 8); // OR in the upper half, into the low position
Of course this can be done in less steps. For example:
uint16_t y = (lo << 8) | (hi >> 8);
Or to swap without using any temporary variables:
uint16_t y = ((x & 0xff) << 8) | ((x & 0xff00) >> 8);
You're making hard work of that.
You only need to exchange the bytes. So work out how to extract the two byte values, then how to re-assemble them the other way around.
(homework so no full answer given)
EDIT: Not sure why I bothered :) Usefulness of an answer to a homework question is measured by how much the OP (and maybe other readers) learn, which isn't maximized by giving the answer to the homework question directly...
Here is an unrolled example that demonstrates the process byte by byte:
unsigned int swap_bytes(unsigned int original_value)
{
unsigned int new_value = 0; // Start with a known value.
unsigned int byte; // Temporary variable.
// Copy the lowest order byte from the original to
// the new value:
byte = original_value & 0xFF; // Keep only the lowest byte from original value.
new_value = new_value * 0x100; // Shift one byte left to make room for a new byte.
new_value |= byte; // Put the byte, from original, into new value.
// For the next byte, shift the original value by one byte
// and repeat the process:
original_value = original_value >> 8; // 8 bits per byte.
byte = original_value & 0xFF; // Keep only the lowest byte from original value.
new_value = new_value * 0x100; // Shift one byte left to make room for a new byte.
new_value |= byte; // Put the byte, from original, into new value.
//...
return new_value;
}
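The same logic rolled up into a loop, as a sketch (one byte per iteration):
unsigned int swap_bytes_loop(unsigned int original_value)
{
    unsigned int new_value = 0;
    for (unsigned int i = 0; i < sizeof(original_value); ++i)
    {
        new_value = (new_value << 8) | (original_value & 0xFF); // take the lowest byte
        original_value >>= 8;                                   // advance to the next byte
    }
    return new_value;
}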
Ugly implementation of Jerry's suggestion to treat the short as an array of two bytes:
#include <iostream>
typedef union mini
{
unsigned char b[2];
short s;
} micro;
int main()
{
micro x;
x.s = 258;
unsigned char tmp = x.b[0];
x.b[0] = x.b[1];
x.b[1] = tmp;
std::cout << x.s << std::endl;
}
Using library functions, the following code may be useful (in a non-homework context):
unsigned long swap_bytes_with_value_size(unsigned long value, unsigned int value_size) {
switch (value_size) {
case sizeof(char):
return value;
case sizeof(short):
return _byteswap_ushort(static_cast<unsigned short>(value));
case sizeof(int):
return _byteswap_ulong(value);
case sizeof(long long):
return static_cast<unsigned long>(_byteswap_uint64(value));
default:
printf("Invalid value size");
return 0;
}
}
The byte swapping functions are defined in stdlib.h at least when using the MinGW toolchain.
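On GCC and Clang the equivalent intrinsics are __builtin_bswap16, __builtin_bswap32 and __builtin_bswap64, so a small portability wrapper might look like this (a sketch; from C++23 on, std::byteswap covers this directly):
#include <cstdint>
#ifdef _MSC_VER
#include <cstdlib> // _byteswap_ushort and friends
#endif
inline std::uint16_t bswap16(std::uint16_t v)
{
#ifdef _MSC_VER
    return _byteswap_ushort(v);
#else
    return __builtin_bswap16(v); // GCC and Clang
#endif
}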
#include <stdio.h>
int main()
{
unsigned short a = 258;
a = (a>>8)|((a&0xff)<<8);
printf("%d",a);
}
While you can do this with bit manipulation, you can also do without, if you prefer. Either way, you shouldn't need any loops though. To do it without bit manipulation, you'd view the short as an array of two chars, and swap the two chars, in roughly the same way as you would swap two items while (for example) sorting an array.
To do it with bit manipulation, the swapped version is basically the lower byte shifted left 8 bits OR'd with the upper byte shifted right 8 bits. You'll probably want to treat it as an unsigned type though, to ensure the upper half doesn't get filled with one bits when you do the right shift.
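A sketch of the no-bit-manipulation version, viewing the short's storage as two chars via memcpy and swapping them like any two array items:
#include <cstring>
#include <utility>
unsigned short swap_via_chars(unsigned short x)
{
    unsigned char b[2];
    std::memcpy(b, &x, sizeof x); // view the short as two bytes
    std::swap(b[0], b[1]);        // swap them
    std::memcpy(&x, b, sizeof x);
    return x;
}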
This should also work for you.
#include <iostream>
int main() {
unsigned int i = 0xCCFF;
std::cout << std::hex << i << std::endl;
i = ( ((i<<8) & 0xFFFF) | ((i >>8) & 0xFFFF)); // swaps the bytes
std::cout << std::hex << i << std::endl;
}
A bit old fashioned, but still a good bit of fun.
XOR swap: ( see How does XOR variable swapping work? )
#include <iostream>
#include <stdint.h>
int main()
{
uint16_t x = 0x1234;
uint8_t *a = reinterpret_cast<uint8_t*>(&x);
std::cout << std::hex << x << std::endl;
*(a+0) ^= *(a+1) ^= *(a+0) ^= *(a+1);
std::cout << std::hex << x << std::endl;
}
This is a problem:
byte2[i-8] = s;
cout << byte2[i];//<--should be i-8 as well
This is causing a buffer overrun.
However, that's not a great way to do it. Look into the bit shift operators << and >>.
I'm programming in C++. I need to convert a 24-bit signed integer (stored in a 3-byte array) to float (normalizing to [-1.0,1.0]).
The platform is MSVC++ on x86 (which means the input is little-endian).
I tried this:
float convert(const unsigned char* src)
{
int i = src[2];
i = (i << 8) | src[1];
i = (i << 8) | src[0];
const float Q = 2.0 / ((1 << 24) - 1.0);
return (i + 0.5) * Q;
}
I'm not entirely sure, but it seems the results I'm getting from this code are incorrect. So, is my code wrong and if so, why?
You are not sign extending the 24 bits into an integer; the upper bits will always be zero. This code will work no matter what your int size is:
if (i & 0x800000)
i |= ~0xffffff;
Edit: Problem 2 is your scaling constant. In simple terms, you want to multiply by the new maximum and divide by the old maximum, assuming that 0 remains at 0.0 after conversion.
const float Q = 1.0 / 0x7fffff;
Finally, why are you adding 0.5 in the final conversion? I could understand if you were trying to round to an integer value, but you're going the other direction.
Edit 2: The source you point to has a very detailed rationale for your choices. Not the way I would have chosen, but perfectly defensible nonetheless. My advice for the multiplier still holds, but the maximum is different because of the 0.5 added factor:
const float Q = 1.0 / (0x7fffff + 0.5);
Because the positive and negative magnitudes are the same after the addition, this should scale both directions correctly.
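Putting the sign extension and the revised constant together, a hedged version of the whole conversion might read (keeping the questioner's +0.5 offset):
float convert(const unsigned char* src)
{
    int i = src[2];
    i = (i << 8) | src[1];
    i = (i << 8) | src[0];
    if (i & 0x800000)      // sign-extend the 24-bit value
        i |= ~0xffffff;
    const float Q = 1.0f / (0x7fffff + 0.5f);
    return (i + 0.5f) * Q; // maps [-2^23, 2^23 - 1] onto [-1.0, 1.0]
}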
Since you are using a char array, it does not necessarily follow that the input is little-endian by virtue of being x86; the char array makes the byte order architecture-independent.
Your code is somewhat over-complicated. A simple solution is to shift the 24-bit data left to scale it to a 32-bit value (so that the machine's natural signed arithmetic will work), and then take a simple ratio of the result to the maximum possible value (which is INT_MAX less 256 because of the vacant lower 8 bits).
#include <limits.h>
float convert(const unsigned char* src)
{
int i = src[2] << 24 | src[1] << 16 | src[0] << 8 ;
return i / (float)(INT_MAX - 256) ;
}
Test code:
unsigned char* makeS24( unsigned int i, unsigned char* s24 )
{
s24[2] = (unsigned char)(i >> 16) ;
s24[1] = (unsigned char)((i >> 8) & 0xff);
s24[0] = (unsigned char)(i & 0xff);
return s24 ;
}
#include <iostream>
int main()
{
unsigned char s24[3] ;
volatile int x = INT_MIN / 2 ;
std::cout << convert( makeS24( 0x800000, s24 )) << std::endl ; // -1.0
std::cout << convert( makeS24( 0x7fffff, s24 )) << std::endl ; // 1.0
std::cout << convert( makeS24( 0, s24 )) << std::endl ; // 0.0
std::cout << convert( makeS24( 0xc00000, s24 )) << std::endl ; // -0.5
std::cout << convert( makeS24( 0x400000, s24 )) << std::endl ; // 0.5
}
Since it's not symmetrical, this is probably the best compromise.
Maps -((2^23)-1) to -1.0 and ((2^23)-1) to 1.0.
(Note: this is the same conversion style used by 24 bit WAV files)
float convert( const unsigned char* src )
{
int i = ( ( src[ 2 ] << 24 ) | ( src[ 1 ] << 16 ) | ( src[ 0 ] << 8 ) ) >> 8;
return ( ( float ) i ) / 8388607.0;
}
The solution that works for me:
/**
* Convert 24 bits that are saved in a char* and represent a float
* in little-endian format to a C float number.
*/
float convert(const unsigned char* src)
{
float num_float;
// concatenate the chars (short integers) and
// save them to a long int
long int num_integer = (
((src[2] & 0xFF) << 16) |
((src[1] & 0xFF) << 8) |
(src[0] & 0xFF)
) & 0xFFFFFFFF;
// copy the bits from the long int variable
// to the float.
memcpy(&num_float, &num_integer, 4);
return num_float;
}
Works for me:
float convert(const char* stream)
{
int fromStream =
(0x00 << 24) +
(stream[2] << 16) +
(stream[1] << 8) +
stream[0];
return (float)fromStream;
}
Looks like you're treating it as a 24-bit unsigned integer. If the most significant bit is 1, you need to make i negative by setting the remaining 8 bits to 1 as well.
I'm not sure if it's good programming practice, but this seems to work (at least with g++ on 32-bit Linux; I haven't tried it on anything else yet) and is certainly more elegant than extracting byte by byte from a char array, especially if it's not really a char array but rather a stream (in my case a file stream) that you read from (if it is a char array, you can use memcpy instead of istream::read).
Just load the 24-bit variable into the less significant 3 bytes of a signed 32-bit (signed long). Then shift the long variable one byte to the left, so that the sign bit appears where it's meant to. Finally, just normalize the 32-bit variable, and you're all set.
union _24bit_LE{
char access;
signed long _long;
}_24bit_LE_buf;
float getnormalized24bitsample(){
std::ifstream::read(&_24bit_LE_buf.access+1, 3);
return (_24bit_LE_buf._long<<8) / (0x7fffffff + .5);
}
(Strangely, it doesn't seem to work when you just read into the 3 more significant bytes right away).
EDIT: it turns out this method seems to have some problems I don't fully understand yet. Better don't use it for the time being.
This one, taken from here, works for me.
typedef union {
struct {
unsigned short lo;
unsigned short hi;
} u16;
unsigned int u32;
signed int i32;
float f;
}Versatype32;
//Bipolar version (-1.0 to ~1.0)
void fInt24_to_float(float* dest, const char* src, size_t length) {
Versatype32 xTemp;
while (length--) {
xTemp.u32 = *(int*)src;
//Shift left by 8 so the 24-bit sign bit lands in bit 31
xTemp.u32 <<= 8; //(If it's a negative, we'll know) (Props to Norman Wong)
//Convert to float
xTemp.f = (float)xTemp.i32;
//Skip divide down if zero
if (xTemp.u32 != 0) {
//Divide by (1<<31 or 2^31)
xTemp.u16.hi -= 0x0F80; //BAM! Bitmagic!
}
*dest = xTemp.f;
//Move to next set
dest++;
src += 3;
} //Are we done yet?
//Yes!
return;
}
I was looking at an example of reading bits from a byte and the implementation looked simple and easy to understand. I was wondering if anyone has a similar example of how to insert bits into a byte or byte array, that is easier to understand and also implement like the example below.
Here is the example I found of reading bits from a byte:
static int GetBits3(byte b, int offset, int count)
{
return (b >> offset) & ((1 << count) - 1);
}
Here is what I'm trying to do. This is my current implementation; I'm just a little confused with the bit masking/shifting, etc., so I'm trying to find out if there is an easier way to do what I'm doing.
BYTE Msg[2];
Msg_Id = 3;
Msg_Event = 1;
Msg_Ready = 2;
Msg[0] = ( ( Msg_Event << 4 ) & 0xF0 ) | ( Msg_Id & 0x0F ) ;
Msg[1] = Msg_Ready & 0x0F; //MsgReady & Unused
If you are using consecutive integer constant values like in the example above, you should use them as shift amounts when putting the bits inside a byte. Otherwise they overlap: in your example, Msg_Id equals Msg_Event | Msg_Ready. These can be used like
Msg[0] = ( 1 << Msg_Event ) | ( 1 << Msg_Id); // sets the 2nd and 4th bits
(Note that bits within a byte are indexed from 0.) The other approach would be using powers of 2 as constant values:
Msg_Id = 4; // equals 1 << 2
Msg_Event = 1; // equals 1 << 0
Msg_Ready = 2; // equals 1 << 1
Note that in your code above, masking with 0x0F or 0xF0 is not really needed: (Msg_Id & 0x0F) == Msg_Id and ((Msg_Event << 4) & 0xF0) == (Msg_Event << 4).
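For a direct counterpart to the GetBits3 example from the question, a generic setter in C++ terms might look like this (a sketch; the name SetBits3 is illustrative):
// Write `count` bits of `value` into `b`, starting at bit `offset`:
// clear the target field with a mask, then OR in the new bits.
static unsigned char SetBits3(unsigned char b, int offset, int count, int value)
{
    unsigned char mask = ((1 << count) - 1) << offset;
    return (b & ~mask) | ((value << offset) & mask);
}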
You could use a bit field. For instance:
struct Msg
{
unsigned MsgEvent : 1; // 1 bit
unsigned MsgReady : 1; // 1 bit
};
You could then use a union to manipulate either the bitfield or the byte, something like this:
struct MsgBitField {
unsigned MsgEvent : 1; // 1 bit
unsigned MsgReady : 1; // 1 bit
};
union ByteAsBitField {
unsigned char Byte;
MsgBitField Message;
};
int main() {
ByteAsBitField MyByte;
MyByte.Byte = 0;
MyByte.Message.MsgEvent = true;
}