I have a UINT8 pointer mArray, which is being assigned data via a *(UINT16 *) cast. For example:
int offset = someValue;
*(UINT16 *)&mArray[offset] = mUINT16;
for(int i = 0; i < mArrayLength; i++)
printf("%02X",*(mArray + i));
output: ... FF AA ...
expected: ... AA FF ...
The value I am expecting to be printed when it reaches offset is to be AA FF, but the value that is printed is FF AA, and for the life of me I can't figure out why.
You are using a little endian machine.
You didn't specify but I'm guessing your mArray is an array of bytes instead of an array of UINT16s.
You're also running on a little-endian machine. On little endian machines the bytes are stored in the opposite order of big-endian machines. Big endians store them pretty much the way humans read them.
You are probably using a computer that uses a "little-endian" representation of numbers in memory (such as Intel x86 architecture). Basically this means that the least significant byte of any value will be stored at the lowest address of the memory location that is used to store the value. See Wikipedia for details.
In your case, the number 0xAAFF consists of the two bytes 0xAA and 0xFF, with 0xFF being the least significant one. Hence, a little-endian machine will store 0xFF at the lowest address and then 0xAA. So if you interpret the memory location to which you have written a UINT16 value as a UINT8, you will get the byte stored at that location, which happens to be 0xFF.
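To see the byte order on your own machine, a minimal C++ sketch along these lines copies a 16-bit value into a byte array and inspects it. The names memory_bytes, is_little_endian and check_layout are made up for this illustration, and std::memcpy stands in for the pointer cast from the question so that the read is well defined.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns the two bytes of `value` in the order they sit in memory
// (lowest address first). Hypothetical helper, not from the question.
std::array<std::uint8_t, 2> memory_bytes(std::uint16_t value)
{
    std::array<std::uint8_t, 2> bytes{};
    std::memcpy(bytes.data(), &value, sizeof value);
    return bytes;
}

// True when the least significant byte is stored at the lowest address.
bool is_little_endian()
{
    return memory_bytes(0x0001)[0] == 0x01;
}

// Usage sketch: 0xAAFF is laid out as FF AA on little-endian hosts
// and as AA FF on big-endian hosts.
void check_layout()
{
    const auto bytes = memory_bytes(0xAAFF);
    if (is_little_endian()) {
        assert(bytes[0] == 0xFF && bytes[1] == 0xAA);
    } else {
        assert(bytes[0] == 0xAA && bytes[1] == 0xFF);
    }
}
```

Calling check_layout() passes on either kind of machine; printing memory_bytes(0xAAFF) on x86 should reproduce the FF AA order from the question.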
If you want to write an array of UINT16 values into an appropriately sized array of UINT8 values such that the output will match your expectations you could do it in the following way:
/* copy inItems UINT16 values from inArray to outArray in
 * MSB first (big-endian) order
 */
void copyBigEndianArray(UINT16 *inArray, size_t inItems, UINT8 *outArray)
{
    for (size_t i = 0; i < inItems; i++)
    {
        // shift one byte right: AAFF -> 00AA
        outArray[2*i] = inArray[i] >> 8;

        // cut off left byte in conversion: AAFF -> FF
        outArray[2*i + 1] = inArray[i];
    }
}
You might also want to check out the hton*/ntoh*-family of functions if they are available on your platform.
It's because your computer's CPU uses a little-endian representation of integers in memory.
So I have a little piece of code that takes 2 uint8_t's and places them next to each other, and then returns a uint16_t. The point is not adding the 2 variables, but putting them next to each other and creating a uint16_t from them.
The way I expect this to work is that when the first uint8_t is 0, and the second uint8_t is 1, I expect the uint16_t to also be one.
However, this is in my code not the case.
This is my code:
uint8_t *bytes = new uint8_t[2];
bytes[0] = 0;
bytes[1] = 1;
uint16_t out = *((uint16_t*)bytes);
It is supposed to make the bytes uint8_t pointer into a uint16_t pointer, and then take the value. I expect that value to be 1 since x86 is little endian. However it returns 256.
Setting the first byte to 1 and the second byte to 0 makes it work as expected. But I am wondering why I need to switch the bytes around in order for it to work.
Can anyone explain that to me?
There is no uint16_t or compatible object at that address, and so the behaviour of *((uint16_t*)bytes) is undefined.
I expect that value to be 1 since x86 is little endian. However it returns 256.
Even if the program was fixed to have well defined behaviour, your expectation is backwards. In little endian, the least significant byte is stored in the lowest address. Thus 2 byte value 1 is stored as 1, 0 and not 0, 1.
Does endianness also affect the order of the bits in the byte or not?
There is no way to access a bit by "address" [1], so there is no concept of endianness. When converting to text, bits are conventionally shown most significant on left and least on right; just like digits of decimal numbers. I don't know if this is true in right to left writing systems.
[1] You can sort of create "virtual addresses" for bits using bitfields. The order of bitfields, i.e. whether the first bitfield is most or least significant, is implementation defined and not necessarily related to byte endianness at all.
Here is a correct way to set two octets as uint16_t. The result will depend on endianness of the system:
// no need to complicate a simple example with dynamic allocation
uint16_t out;
// note that there is an exception in language rules that
// allows accessing any object through narrow (unsigned) char
// or std::byte pointers; thus following is well defined
std::byte* data = reinterpret_cast<std::byte*>(&out);
// std::byte has no implicit conversion from int, so brace-initialize
data[0] = std::byte{1};
data[1] = std::byte{0};
Note that assuming that input is in native endianness is usually not a good choice, especially when compatibility across multiple systems is required, such as when communicating through network, or accessing files that may be shared to other systems.
In these cases, the communication protocol, or the file format typically specify that the data is in specific endianness which may or may not be the same as the native endianness of your target system. De facto standard in network communication is to use big endian. Data in particular endianness can be converted to native endianness using bit shifts, as shown in Frodyne's answer for example.
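As a rough illustration of that last point, here is a small C++ sketch that decodes a 16-bit value from two bytes whose order is fixed by a protocol or file format. The helper names load_be16, load_le16 and check_loads are hypothetical; only shifts are used, so the result does not depend on the endianness of the machine running the code.

```cpp
#include <cassert>
#include <cstdint>

// Decode two bytes stored most significant byte first (big endian).
std::uint16_t load_be16(const std::uint8_t* p)
{
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Decode two bytes stored least significant byte first (little endian).
std::uint16_t load_le16(const std::uint8_t* p)
{
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Usage sketch with the bytes from the question: 0 followed by 1.
void check_loads()
{
    const std::uint8_t wire[2] = {0x00, 0x01};
    assert(load_be16(wire) == 1);    // big endian: high byte comes first
    assert(load_le16(wire) == 256);  // little endian: low byte comes first
}
```

The same two bytes give 1 or 256 depending only on which format you declare them to be in, which is why the format has to be part of the protocol.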
In a little endian system the small bytes are placed first. In other words: The low byte is placed on offset 0, and the high byte on offset 1 (and so on). So this:
uint8_t* bytes = new uint8_t[2];
bytes[0] = 1;
bytes[1] = 0;
uint16_t out = *((uint16_t*)bytes);
Produces the out = 1 result you want.
However, as you can see this is easy to get wrong, so in general I would recommend that instead of trying to place stuff correctly in memory and then cast it around, you do something like this:
uint16_t out = lowByte + (highByte << 8);
That will work on any machine, regardless of endianness.
Edit: Bit shifting explanation added.
x << y means to shift the bits in x y places to the left (>> moves them to the right instead).
If X contains the bit-pattern xxxxxxxx, and Y contains the bit-pattern yyyyyyyy, then (X << 8) produces the pattern: xxxxxxxx00000000, and Y + (X << 8) produces: xxxxxxxxyyyyyyyy.
(And Y + (X<<8) + (Z<<16) produces zzzzzzzzxxxxxxxxyyyyyyyy, etc.)
A single shift to the left is the same as multiplying by 2, so X << 8 is the same as X * 2^8 = X * 256. That means that you can also do: Y + (X*256) + (Z*65536), but I think the shifts are clearer and show the intent better.
Note that again: Endianness does not matter. Shifting 8 bits to the left will always clear the low 8 bits.
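You can read more here: https://en.wikipedia.org/wiki/Bitwise_operation. Note the difference between arithmetic and logical shifts: in C/C++ unsigned values use logical shifts. Right-shifting a negative signed value is implementation-defined in C and in C++ before C++20, though most compilers use an arithmetic shift, which C++20 guarantees.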
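The equivalence between shifting and multiplying can be checked with a short C++ sketch like the one below. The function names combine_shift, combine_multiply and check_combine are invented for the example; z is the most significant byte and y the least significant, matching the Y + (X<<8) + (Z<<16) pattern above.

```cpp
#include <cassert>
#include <cstdint>

// Build zzzzzzzzxxxxxxxxyyyyyyyy with shifts.
std::uint32_t combine_shift(std::uint8_t z, std::uint8_t x, std::uint8_t y)
{
    return (static_cast<std::uint32_t>(z) << 16)
         | (static_cast<std::uint32_t>(x) << 8)
         | y;
}

// The same value built with multiplications by powers of two.
std::uint32_t combine_multiply(std::uint8_t z, std::uint8_t x, std::uint8_t y)
{
    return y + x * 256u + z * 65536u;
}

void check_combine()
{
    assert(combine_shift(0x12, 0x34, 0x56) == 0x123456u);
    assert(combine_shift(0x12, 0x34, 0x56) == combine_multiply(0x12, 0x34, 0x56));
    // Shifting left by 8 always clears the low 8 bits, whatever the
    // byte order in memory happens to be.
    assert(((0xABu << 8) & 0xFFu) == 0u);
}
```

Both versions operate on values, not on memory layout, so they behave identically on little-endian and big-endian machines.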
If p is a pointer to some multi-byte value, then:
"Little-endian" means that the byte at p is the least-significant byte, in other words, it contains bits 0-7 of the value.
"Big-endian" means that the byte at p is the most-significant byte, which for a 16-bit value would be bits 8-15.
Since Intel x86 is little-endian, bytes[0] contains bits 0-7 of the uint16_t value and bytes[1] contains bits 8-15. Since you are trying to set bit 0, you need:
bytes[0] = 1; // Bits 0-7
bytes[1] = 0; // Bits 8-15
Your code works, but you misinterpreted how to read "bytes":
#include <cstdint>
#include <cstddef>
#include <iostream>
int main()
{
    uint8_t *in = new uint8_t[2];
    in[0] = 3;
    in[1] = 1;
    uint16_t out = *((uint16_t*)in);
    std::cout << "out: " << out << "\n in: " << in[1]*256 + in[0] << std::endl;
    delete[] in;
    return 0;
}
By the way, you should take care of alignment when casting this way.
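One way to sidestep both the alignment concern and the aliasing problem is to copy the bytes with std::memcpy instead of casting the pointer. The following is only a sketch under that assumption; read_u16_native and check_unaligned_read are names chosen for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads a uint16_t in native byte order from any address, aligned or not.
std::uint16_t read_u16_native(const std::uint8_t* p)
{
    std::uint16_t value;
    std::memcpy(&value, p, sizeof value);
    return value;
}

void check_unaligned_read()
{
    // The two bytes of interest start at an odd offset on purpose.
    const std::uint8_t buffer[3] = {0xEE, 3, 1};
    const std::uint16_t out = read_u16_native(buffer + 1);

    // Detect the host byte order with the same helper.
    const std::uint8_t probe[2] = {1, 0};
    const bool little = read_u16_native(probe) == 1;

    // Little endian: second byte is the high byte (1*256 + 3).
    // Big endian: first byte is the high byte (3*256 + 1).
    assert(out == (little ? 1 * 256 + 3 : 3 * 256 + 1));
}
```

Compilers typically turn such a fixed-size memcpy into a single load where the target allows it, so this usually costs nothing compared with the cast.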
One way to think about numbers is in terms of MSB and LSB order,
where the MSB is the highest bit and the LSB is the lowest bit.
On little-endian machines, for example:
(u)int32: MSB:Bit 31 ... LSB: Bit 0
(u)int16: MSB:Bit 15 ... LSB: Bit 0
(u)int8 : MSB:Bit 7 ... LSB: Bit 0
With your cast to a 16-bit value, the bytes are arranged like this:
16Bit                  <=  8Bit        8Bit
Bit15          Bit0        Bit7 .. 0   Bit7 .. 0
0000 0001 0000 0000        0000 0001   0000 0000
which is 256 -> the correct value.
I am debugging an issue where the data coming out of a buffer has the wrong value. I am sending the buffer (of type u8) from a kernel driver to the HAL. In the HAL there is a uint16_t buffer which receives the values from this buffer.
//Code to copy BUFFLEN contents from u8 data[BUFFLEN] into uint16_tData ;
uint16_t uint16_tData[BUFFLEN / 2];
float floatdata[BUFFLEN / 2];
I am getting the float values from the uint16_tData buffer using this type cast:
floatdata[index] = *((float *)((void *)&uint16_tData[index1]));
Now my question: how can I interpret the data residing in the floatdata array?
Say I have data floatdata[0] = 53640;
How can I interpret a float value out of the data in floatdata[0]?
Note 4 bytes from u8 --> 2 elements of array uint16_tData --> one element of array floatdata.
I wanted to know with an example that say:
Values transmitted from the driver are:
u8 val1 = 206
u8 val2 = 208
u8 val3 = 120
u8 val4 = 68
In the HAL, how will the conversion take place and what values will I get here?
uint16_tData[0] = ?
uint16_tData[1] = ?
And how will it be the interpretation of the data as a float in floatdata?
floatdata[0] = ?
Why do this at all?
I think you want this:
float *floatdata;
floatdata = (float *)uint16_tData;
// Now use floatdata[index] ...
Casting items one at a time is silly. Casting from uint16_t to float is even sillier; it would be better to keep it as uint8_t (bytes), which is what it seems to come in as, or at least use uint32_t as the intermediate.
Is there anything that prevents you from doing this? Can you think of how you can redesign your code so you can avoid dealing with these issues entirely?
According to my understanding of your question, is this what you want?
float f ;
uint8_t u8arr[4]={206,208,120,68};
uint16_t arr[2];
uint16 are "made of" 2 u8, and single precision float is 32 bit (4 u8 or 2 uint16), but how they are "converted" it depends on the code that handles those data, in particular in the meaning of "sending" (from kernel driver) and how the code that "receives" expect the data. I admit I am ignorant about the topic and I can't check and research it now, but I suspect you have a problem of endianness: your data have a "meaning" altogether, not as single octets, but you put them in memory in the wrong order.
Basically, you have a buffer which contains your octets, say
u8  u8   u8  u8     4 u8 arr element (arr[0], arr[1] ...)
\____/   \____/
uint16   uint16     2 uint16 arr element (arr[0] and arr[1])
in this order of increasing memory address. If you want A B to be read as the "correct" uint16 number, you have to arrange the bytes in memory so that they match the endianness of the processor: e.g. if you want the number A*256 + B on a little-endian CPU, you have to write the two u8 in memory in the opposite order: B first, then A.
I suspect HAL uses uint16 just to store data, not to interpret them as 16bit number, so the endianness here is not a problem. But it is once you want to get the correct "float" from 4 u8 (or 2 uint16) in memory: you have to put the u8 in the correct order.
If you want that your u8s are interpreted as float through a cast, you first have to put them in the opposite order in a little endian machine, e.g.
address of arr8[0] and arr16[0]
u8  u8   u8  u8     4 u8 arr element (arr8[0], arr8[1] ...)
\____/   \____/
uint16   uint16     2 uint16 arr element (arr16[0] and arr16[1])
Then, if you "read" those octets as float on a little-endian machine, you have the right "float".
So, it is up to you to sort correctly your u8, knowing that they will be interpreted as another type with a "width" wider than one octet. Portable code should use proper ways to compile correctly for processors with different endianness.
See also this wikipedia article
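To make the question's example concrete, here is a hedged C++ sketch that works through the four bytes 206, 208, 120, 68. It assumes the bytes arrive low byte first (which is what a plain cast sees on a little-endian machine such as x86) and that float is a 32-bit IEEE-754 type whose byte order matches that of integers. The helper names le16, float_from_le_bytes and check_example are made up for the illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == 4 && std::numeric_limits<float>::is_iec559,
              "this sketch assumes a 32-bit IEEE-754 float");

// Assemble a 16-bit value from two bytes received low byte first.
std::uint16_t le16(std::uint8_t lo, std::uint8_t hi)
{
    return static_cast<std::uint16_t>(lo | (hi << 8));
}

// Assemble the 32-bit pattern from four bytes, then reinterpret it as float.
float float_from_le_bytes(const std::uint8_t b[4])
{
    const std::uint32_t bits =
        static_cast<std::uint32_t>(le16(b[0], b[1])) |
        (static_cast<std::uint32_t>(le16(b[2], b[3])) << 16);
    float f;
    std::memcpy(&f, &bits, sizeof f);  // copy the bit pattern, no conversion
    return f;
}

void check_example()
{
    const std::uint8_t data[4] = {206, 208, 120, 68};  // CE D0 78 44 in hex
    assert(le16(data[0], data[1]) == 0xD0CE);          // 53454
    assert(le16(data[2], data[3]) == 0x4478);          // 17528
    const float f = float_from_le_bytes(data);         // bit pattern 0x4478D0CE
    assert(f > 995.26f && f < 995.27f);
}
```

Under those assumptions uint16_tData[0] would be 53454, uint16_tData[1] would be 17528, and floatdata[0] would be roughly 995.26. If the driver sends the bytes in the other order, the same steps apply with the bytes reversed.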
I am not HAL/DX friendly so it can be a silly note
but are you sure that your floats are 16bit and not 32bit?
if not then you need 2x16bit uint per single 32bit float and the bug is there.
If yes then read further
go here: http://msdn.microsoft.com/en-us/library/windows/desktop/cc308050(v=vs.85).aspx#alpha_16_bit
look for 16 bit rules
look for the bit count and locations (sign, exp, mantissa)
ok now what is your problem:
you have uint buffers but store float inside ?
in this case just do this:
float16 *p=(float16*)uint16_tData;
and use p instead of uint16_tData
you have uint buffers with uint values stored inside ?
this means you need to convert uint to float
use internal = operator if exists
if not write your own conversion
you can ignore sign for uints
set exp to 0 (2^0)
set mantissa to uint value
truncate unfitted bits
just shift right / inc exponent until MSB of uint fits into mantissa
put all together with bit shift/and/or to their places (see that link above)
you have float values and need to store them in uint as uint
so simply do backwards previous point
extract exponent, mantissa, ignore sign or use two's complement
shift mantissa left by exponent (if positive) else right by -exponent
and that is it.
PS. be aware that on some platforms/compilers/data types the bit shift can also insert ones from the carry or from the other side of the number. In that case, AND the result with a bit-mask.
PPS. do not forget to apply exponent bias !!!
Hope it helps a little
I have an array of 256 unsigned integers called frequencies[256] (one integer for each ASCII value). My goal is to read through an input and, for each character, increment the integer in the array that corresponds to it (for example the character 'A' will cause the frequencies[65] integer to increase by one). When the input is over I must output each integer as 4 characters in little-endian form.
So far I have made a loop that goes through the input and increases each corresponding integer in the array. But I am very confused about how to output each integer in little-endian form. I understand that each of the four bytes of each integer should be output as a character (for instance the unsigned integer 1 in little endian is "00000001 00000000 00000000 00000000", which I would want to output as the 4 ASCII characters that correspond to those bytes).
But how do I get at the binary representation of an unsigned integer in my code, and how would I go about chopping it up and rearranging it?
Thanks for the help.
For hardware portability, please use the following solution:
int freqs[256];
for (int i = 0; i < 256; ++i)
printf("%02x %02x %02x %02x\n", (freqs[i] >> 0 ) & 0xFF
, (freqs[i] >> 8 ) & 0xFF
, (freqs[i] >> 16) & 0xFF
, (freqs[i] >> 24) & 0xFF);
You can use memcpy which copies a block of memory.
char tab[4] ;
memcpy(tab, frequencies+i, sizeof(int));
now, tab[0], tab[1], etc. will be your characters.
A program to swap from big to little endian: Little Endian - Big Endian Problem.
To understand if your system is little or big endian: https://stackoverflow.com/a/1024954/2436175.
Transform your chars/integers in a set of printable bits: https://stackoverflow.com/a/7349767/2436175
It's not really clear what you mean by "little endian" here.
Integers don't have endianness per se; endianness only comes
into play when you cut them up into smaller pieces. So which
smaller pieces do you mean: bytes or characters? If characters,
just convert in the normal way, and reverse the generated
string. If bytes (or any other smaller piece), each individual
byte can be represented as a function of the int: i & 0xFF
calculates the low order byte, (i >> 8) & 0xFF the next
lowest, and so forth. (If the bytes aren't 8 bits, then change
the shift value and the mask correspondingly.)
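The byte extraction described above could be wrapped up roughly as follows. This is a sketch, not the asker's code: to_le_bytes and check_le_bytes are hypothetical names, and a 32-bit unsigned value with 8-bit bytes is assumed.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Split a 32-bit value into four bytes, least significant byte first.
std::array<std::uint8_t, 4> to_le_bytes(std::uint32_t value)
{
    return {{
        static_cast<std::uint8_t>(value & 0xFF),
        static_cast<std::uint8_t>((value >> 8) & 0xFF),
        static_cast<std::uint8_t>((value >> 16) & 0xFF),
        static_cast<std::uint8_t>((value >> 24) & 0xFF),
    }};
}

void check_le_bytes()
{
    const auto one = to_le_bytes(1);
    assert(one[0] == 1 && one[1] == 0 && one[2] == 0 && one[3] == 0);

    const auto v = to_le_bytes(0x12345678);
    assert(v[0] == 0x78 && v[1] == 0x56 && v[2] == 0x34 && v[3] == 0x12);
}
```

Each frequency could then be written out with something like fwrite(bytes.data(), 1, bytes.size(), stdout), which emits the raw bytes in little-endian order whatever the host's own byte order is.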
And with regards to your second paragraph: a single byte of an
int doesn't necessarily correspond to a character, regardless
of the encoding. For the four bytes you show, for example, none
of them corresponds to a character in any of the usual
encodings.
With regards to the last paragraph: to get the binary
representation of an unsigned integer, use the same algorithm
that you would use for any representation:
#include <algorithm>
#include <cassert>
#include <string>

std::string
asText( unsigned int value, int base, int minDigits = 1 )
{
    static std::string digits( "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ" );
    assert( base >= 2 && base <= static_cast<int>( digits.size() ) );
    std::string results;
    while ( value != 0 || minDigits > 0 ) {
        results += digits[ value % base ];
        value /= base;
        -- minDigits;
    }
    // results is now little endian. For the normal big-endian
    // order, reverse it.
    std::reverse( results.begin(), results.end() );
    return results;
}
Called with base equal to 2, this will give you your binary
representation.
I have 8 bool variables, and I want to "merge" them into a byte.
Is there an easy/preferred method to do this?
How about the other way around, decoding a byte into 8 separate boolean values?
I come in assuming it's not an unreasonable question, but since I couldn't find relevant documentation via Google, it's probably another one of those "nonono all your intuition is wrong" cases.
The hard way:
unsigned char ToByte(bool b[8])
{
    unsigned char c = 0;
    for (int i=0; i < 8; ++i)
        if (b[i])
            c |= 1 << i;
    return c;
}

void FromByte(unsigned char c, bool b[8])
{
    for (int i=0; i < 8; ++i)
        b[i] = (c & (1<<i)) != 0;
}
Or the cool way:
struct Bits
{
    unsigned b0:1, b1:1, b2:1, b3:1, b4:1, b5:1, b6:1, b7:1;
};
union CBits
{
    Bits bits;
    unsigned char byte;
};
Then you can assign to one member of the union and read from another. But note that the order of the bits in Bits is implementation defined.
Note that reading one union member after writing another is well-defined in ISO C99, and as an extension in several major C++ implementations (including MSVC and GNU-compatible C++ compilers), but is Undefined Behaviour in ISO C++. memcpy or C++20 std::bit_cast are the safe ways to type-pun in portable C++.
(Also, the bit-order of bitfields within a char is implementation defined, as is possible padding between bitfield members.)
You might want to look into std::bitset. It allows you to compactly store booleans as bits, with all of the operators you would expect.
No point fooling around with bit-flipping and whatnot when you can abstract away.
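For instance, a small C++ sketch using std::bitset<8> might look like this. The wrapper names pack_bits, unpack_bits and check_bitset_roundtrip are invented here, and the choice that flags[0] maps to the least significant bit is just one possible convention.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pack eight bools into one byte; flags[0] becomes the least significant bit.
std::uint8_t pack_bits(const bool (&flags)[8])
{
    std::bitset<8> bits;
    for (std::size_t i = 0; i < 8; ++i) {
        bits[i] = flags[i];
    }
    return static_cast<std::uint8_t>(bits.to_ulong());
}

// Unpack a byte back into eight bools, using the same bit order.
void unpack_bits(std::uint8_t byte, bool (&flags)[8])
{
    const std::bitset<8> bits(byte);
    for (std::size_t i = 0; i < 8; ++i) {
        flags[i] = bits[i];
    }
}

void check_bitset_roundtrip()
{
    const bool in[8] = {true, false, true, false, false, false, false, true};
    const std::uint8_t packed = pack_bits(in);
    assert(packed == 0x85);  // bits 0, 2 and 7 are set

    bool out[8] = {};
    unpack_bits(packed, out);
    for (std::size_t i = 0; i < 8; ++i) {
        assert(out[i] == in[i]);
    }
}
```

If the flags never need to exist as separate bool variables, keeping a std::bitset<8> as the storage itself avoids the conversion entirely.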
The cool way (using the multiplication technique)
inline uint8_t pack8bools(bool* a)
{
    uint64_t t;
    memcpy(&t, a, sizeof t);         // strict-aliasing & alignment safe load
    return 0x8040201008040201ULL*t >> 56;
    // bit order: a[0]<<7 | a[1]<<6 | ... | a[7]<<0  on little-endian
    // for a[0] => LSB, use 0x0102040810204080ULL    on little-endian
}

void unpack8bools(uint8_t b, bool* a)
{
    // on little-endian,  a[0] = (b>>7) & 1  like printing order
    auto MAGIC = 0x8040201008040201ULL;  // for opposite order, byte-reverse this
    auto MASK  = 0x8080808080808080ULL;
    uint64_t t = ((MAGIC*b) & MASK) >> 7;
    memcpy(a, &t, sizeof t);    // store 8 bytes without UB
}
Assuming sizeof(bool) == 1
To portably do LSB <-> a[0] (like the pext/pdep version below) instead of using the opposite of host endianness, use htole64(0x0102040810204080ULL) as the magic multiplier in both versions. (htole64 is from BSD / GNU <endian.h>). That arranges the multiplier bytes to match little-endian order for the bool array. htobe64 with the same constant gives the other order, MSB-first like you'd use for printing a number in base 2.
You may want to make sure that the bool array is 8-byte aligned (alignas(8)) for performance, and that the compiler knows this. memcpy is always safe for any alignment, but on ISAs that require alignment, a compiler can only inline memcpy as a single load or store instruction if it knows the pointer is sufficiently aligned. *(uint64_t*)a would promise alignment, but also violate the strict-aliasing rule. Even on ISAs that allow unaligned loads, they can be faster when naturally aligned. But the compiler can still inline memcpy without seeing that guarantee at compile time.
How they work
Suppose we have 8 bools b[0] to b[7] whose least significant bits are named a-h respectively, and we want to pack them into a single byte. If we treat those 8 consecutive bools as one 64-bit word and load it, we get the bits in reversed order on a little-endian machine. Now we do a multiplication (here dots are zero bits):
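  |  b7  ||  b6  ||  b5  ||  b4  ||  b3  ||  b2  ||  b1  ||  b0  |
  .......h.......g.......f.......e.......d.......c.......b.......a
× 1000000001000000001000000001000000001000000001000000001000000001
  ────────────────────────────────────────────────────────────────
  ↑......h.↑.....g..↑....f...↑...e....↑..d.....↑.c......↑b.......a
  ↑.....g..↑....f...↑...e....↑..d.....↑.c......↑b.......a
  ↑....f...↑...e....↑..d.....↑.c......↑b.......a
+ ↑...e....↑..d.....↑.c......↑b.......a
  ↑..d.....↑.c......↑b.......a
  ↑.c......↑b.......a
  ↑b.......a
  a
  ────────────────────────────────────────────────────────────────
= abcdefghxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx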
The arrows are added so it's easier to see the position of the set bits in the magic number. At this point the 8 least significant bits have been put in the top byte; we just need to discard the remaining bits, which the shift right by 56 does.
So the magic number for packing would be 0b1000000001000000001000000001000000001000000001000000001000000001 or 0x8040201008040201. If you're on a big endian machine you'll need to use the magic number 0x0102040810204080 which is calculated in a similar manner
For unpacking we can do a similar multiplication
  |  b7  ||  b6  ||  b5  ||  b4  ||  b3  ||  b2  ||  b1  ||  b0  |
                                                          abcdefgh
× 1000000001000000001000000001000000001000000001000000001000000001
  ────────────────────────────────────────────────────────────────
= h0abcdefgh0abcdefgh0abcdefgh0abcdefgh0abcdefgh0abcdefgh0abcdefgh
& 1000000010000000100000001000000010000000100000001000000010000000
  ────────────────────────────────────────────────────────────────
= h0000000g0000000f0000000e0000000d0000000c0000000b0000000a0000000
After multiplying we have the needed bits at the most significant position of each byte, so we need to mask out the irrelevant bits and shift the remaining ones to the least significant positions. The output is 8 bytes that contain a to h in little-endian order.
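A quick way to convince yourself that the two multiplications are inverses of each other is a round-trip check over all 256 byte values. The C++ sketch below restates the two functions under the hypothetical names pack8_mul and unpack8_mul and assumes, as above, that sizeof(bool) == 1 and that a bool holding 0 or 1 is stored as the byte 0 or 1.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(bool) == 1, "the trick assumes one-byte bools");

// Pack 8 bools into a byte with one multiplication.
std::uint8_t pack8_mul(const bool* a)
{
    std::uint64_t t;
    std::memcpy(&t, a, sizeof t);
    return static_cast<std::uint8_t>((0x8040201008040201ULL * t) >> 56);
}

// Unpack a byte into 8 bools with one multiplication, a mask and a shift.
void unpack8_mul(std::uint8_t b, bool* a)
{
    const std::uint64_t magic = 0x8040201008040201ULL;
    const std::uint64_t mask  = 0x8080808080808080ULL;
    const std::uint64_t t = ((magic * b) & mask) >> 7;
    std::memcpy(a, &t, sizeof t);
}

// Unpacking and repacking should reproduce every possible byte value.
void check_roundtrip()
{
    for (unsigned value = 0; value < 256; ++value) {
        bool flags[8];
        unpack8_mul(static_cast<std::uint8_t>(value), flags);
        assert(pack8_mul(flags) == value);
    }
}
```

The round trip holds with the same constant on either byte order; what changes between little-endian and big-endian hosts is only which array element ends up as the most significant bit.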
The efficient way
On newer x86 CPUs with BMI2 there are PEXT and PDEP instructions for this purpose. The pack8bools function above can be replaced with
_pext_u64(*((uint64_t*)a), 0x0101010101010101ULL);
And the unpack8bools function can be implemented as
_pdep_u64(b, 0x0101010101010101ULL);
(This maps LSB -> LSB, like a 0x0102040810204080ULL multiplier constant, opposite of 0x8040201008040201ULL. x86 is little-endian: a[0] = (b>>0) & 1; after memcpy.)
Unfortunately those instructions are very slow on AMD before Zen 3 so you may need to compare with the multiplication method above to see which is better
The other fast way is SSE2
x86 SIMD has an operation that takes the high bit of every byte (or float or double) in a vector register, and gives it to you as an integer. The instruction for bytes is pmovmskb. This can of course do 16 bytes at a time with the same number of instructions, so it gets better than the multiply trick if you have lots of this to do.
#include <immintrin.h>
inline uint8_t pack8bools_SSE2(const bool* a)
{
    __m128i v = _mm_loadl_epi64( (const __m128i*)a );  // 8-byte load, despite the pointer type.
    // __m128i v = _mm_cvtsi64_si128( uint64 );  // alternative if you already have an 8-byte integer
    v = _mm_slli_epi32(v, 7);   // low bit of each byte becomes the highest
    return _mm_movemask_epi8(v);
}
There isn't a single instruction to unpack until AVX-512, which has mask-to-vector instructions. It is doable with SIMD, but likely not as efficiently as the multiply trick. See Convert 16 bits mask to 16 bytes mask and more generally is there an inverse instruction to the movemask instruction in intel avx2? for unpacking bitmaps to other element sizes.
How to efficiently convert an 8-bit bitmap to array of 0/1 integers with x86 SIMD has some answers specifically for 8-bits -> 8-bytes, but if you can't do 16 bits at a time for that direction, the multiply trick is probably better, and pext certainly is (except on CPUs where it's disastrously slow, like AMD before Zen 3).
#include <stdint.h> // to get the uint8_t type
uint8_t GetByteFromBools(const bool eightBools[8])
{
    uint8_t ret = 0;
    for (int i=0; i<8; i++) if (eightBools[i] == true) ret |= (1<<i);
    return ret;
}

void DecodeByteIntoEightBools(uint8_t theByte, bool eightBools[8])
{
    for (int i=0; i<8; i++) eightBools[i] = ((theByte & (1<<i)) != 0);
}
bool a,b,c,d,e,f,g,h;
//do stuff
char y= a<<7 | b<<6 | c<<5 | d<<4 | e <<3 | f<<2 | g<<1 | h;//merge
although you are probably better off using a bitset
I'd like to note that type punning through unions is UB in C++ (as rodrigo does in his answer). The safest way to do that is memcpy():
struct Bits
{
    unsigned b0:1, b1:1, b2:1, b3:1, b4:1, b5:1, b6:1, b7:1;
};

unsigned char toByte(Bits b){
    unsigned char ret;
    memcpy(&ret, &b, 1);
    return ret;
}
As others have said, the compiler is smart enough to optimize out memcpy().
BTW, this is the way that Boost does type punning.
There is no way to pack 8 bool variables into one byte. There is a way of packing 8 logical true/false states into a single byte using bitmasking.
You would use the bitwise shift operation and casting to achieve it. A function could work like this:
unsigned char toByte(bool *bools)
{
    unsigned char byte = '\0';
    for(int i = 0; i < 8; ++i) byte |= ((unsigned char) bools[i]) << i;
    return byte;
}
Thanks Christian Rau for the corrections!
So if I have a 4 byte number (say hex) and want to store a byte say DD into hex, at the nth byte position without changing the other elements of hex's number, what's the easiest way of going about that? I'm guessing it's some combination of bitwise operations, but I'm still quite new with them, and have found them quite confusing thus far?
byte n = 0xDD;
uint i = 0x12345678;
i = (i & ~0x0000FF00) | ((uint)n << 8);
Edit: Forgot to mention, be careful if you're doing this with signed data types, so that things don't get inadvertently sign-extended.
Mehrdad's answer shows how to do it with bit manipulation. You could also use the old byte array trick (assuming C or some other language that allows this silliness):
byte n = 0xDD;
uint i = 0x12345678;
byte *b = (byte*)&i;
b[1] = n;
Of course, that's processor specific in that big-endian machines have the bytes reversed from little-endian. Also, this technique limits you to working on exact byte boundaries whereas the bit manipulation will let you modify any given 8 bits. That is, you might want to turn 0x12345678 into 0x12345DD8, which the technique I show won't do.
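The mask-and-shift approach generalizes to any byte position, and to any bit offset, with a parameterized shift. Below is a minimal C++ sketch for a 32-bit value; set_bits8, set_byte and check_set_byte are names made up for the illustration, and byte 0 is taken to mean the least significant byte.

```cpp
#include <cassert>
#include <cstdint>

// Replace the 8 bits of `value` that start at bit_offset (0..24).
std::uint32_t set_bits8(std::uint32_t value, unsigned bit_offset, std::uint8_t byte)
{
    assert(bit_offset <= 24);
    const std::uint32_t mask = std::uint32_t{0xFF} << bit_offset;
    // Clear the target bits, then OR in the new byte at the same position.
    return (value & ~mask) | (std::uint32_t{byte} << bit_offset);
}

// Replace byte n of `value`, where n = 0 is the least significant byte.
std::uint32_t set_byte(std::uint32_t value, unsigned n, std::uint8_t byte)
{
    assert(n < 4);
    return set_bits8(value, 8 * n, byte);
}

void check_set_byte()
{
    assert(set_byte(0x12345678u, 1, 0xDD) == 0x1234DD78u);
    assert(set_byte(0x12345678u, 3, 0xDD) == 0xDD345678u);
    // An offset that is not a multiple of 8, as in the 0x12345DD8 example.
    assert(set_bits8(0x12345678u, 4, 0xDD) == 0x12345DD8u);
}
```

Using unsigned types throughout avoids the sign-extension pitfall mentioned above, and because the code works on the value rather than on its bytes in memory, the result is the same on big-endian and little-endian machines.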