How does this Union and Bit field interaction work? - c++

So here is an example:
#include <cstdio>

struct field
{
    unsigned int a : 8;
    unsigned int b : 8;
    unsigned int c : 8;
    unsigned int d : 8;
};
union test
{
    unsigned int raw;
    field bits;
};
int main()
{
    test aUnion;
    aUnion.raw = 0xabcdef;
    printf("a: %x \n", aUnion.bits.a);
    printf("b: %x \n", aUnion.bits.b);
    printf("c: %x \n", aUnion.bits.c);
    printf("d: %x \n", aUnion.bits.d);
    return 0;
}
Now, running this I get:
a: ef
b: cd
c: ab
d: 0
And I guess I just don't really get what's happening here. I set raw to a value, and since this is a union, the bit fields pull from that same storage because they have all been declared smaller than an unsigned int? So the bit field view is based on raw? But how does that map out, and why is d 0 in this instance?
I would appreciate any help here.

Using the hexadecimal representation of an integer is useful because it makes clear what the value of every byte of the integer is. So the assignment
aUnion.raw = 0xabcdef;
means that the value of the least significant byte is 0xef, that the second least significant byte has the value 0xcd, and so on. But you are setting the raw field of the union, which is an unsigned int, so it is 4 bytes long (on your platform). In the representation above the most significant byte is missing, so it can be written as
aUnion.raw = 0x00abcdef;
(it is like making explicit that an integer x = 42 has 0 hundreds, 0 thousands and so on).
Your bit fields map respectively to a = byte[0], b = byte[1], c = byte[2] and d = byte[3] of the integer raw, since in a union all the members share the same memory location. This mapping holds because you are running your code on a little-endian architecture (least significant byte comes first).
So:
a = byte[0] of raw = 0xef
b = byte[1] of raw = 0xcd
c = byte[2] of raw = 0xab
d = byte[3] of raw = 0x00
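For comparison, here is a small sketch (not from the original post) that extracts the same four bytes with shifts and masks instead of the union; this is well defined by the standard, and on your little-endian machine it prints the same values the bit fields show:
#include <cstdio>

int main()
{
    unsigned int raw = 0xabcdef;
    // byte[i] = bits 8*i .. 8*i+7 of the value, counting from the least significant end
    for (int i = 0; i < 4; ++i)
        printf("byte[%d]: %02x \n", i, (raw >> (8 * i)) & 0xffu);
    // prints ef, cd, ab, 00 -- the same values as a, b, c, d above
    return 0;
}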

It's because the value you assigned doesn't set all 32 bits of the unsigned int, so it doesn't fill all of the bit fields. Since 0xabcdef only occupies the lower 24 bits, the bit field d shows the hex value 00. Try, for example,
aUnion.raw = 0xffabcdef;
which will produce
a: ef
b: cd
c: ab
d: ff
Since the d bit field occupies bits 24-31 (on little endian), that field will only show a non-zero value if the value assigned to the unsigned int actually has some of those bits set.
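The mapping works in the other direction as well. A tiny sketch (reusing the test union from the question; note that reading a union member other than the one last written is technically type punning, so this is only guaranteed to behave as shown on typical compilers and on this little-endian layout):
test aUnion;
aUnion.raw = 0;
aUnion.bits.d = 0xff;              // d occupies bits 24-31 in this layout
printf("raw: %x \n", aUnion.raw);  // expected to print ff000000 here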

Related

not able to shift hex data in a unsigned long

I am trying to convert an IEEE 754 floating point representation to its decimal equivalent, so I have some example data [7E FF 01 46 4B CD CC CC CC CC CC 10 40 1B 7E], which is in hex.
char strResponseData[STATUS_BUFFERSIZE]={0};
unsigned long strData = (((strResponseData[12] & 0xFF)<< 512 ) |
                         ((strResponseData[11] & 0xFF) << 256) |
                         ((strResponseData[10] & 0xFF)<< 128 ) |
                         ((strResponseData[9] & 0xFF)<< 64) |
                         ((strResponseData[8] & 0xFF)<< 32 ) |
                         ((strResponseData[7]& 0xFF) << 16) |
                         ((strResponseData[6] & 0xFF )<< 8) |
                         (strResponseData[5] & 0xFF));
value = IEEEHexToDec(strData,1);
Then I am passing this value to this function:
/* note: pow() requires <math.h> */
double IEEEHexToDec(unsigned long number, int isDoublePrecision)
{
    int mantissaShift = isDoublePrecision ? 52 : 23;
    unsigned long exponentMask = isDoublePrecision ? 0x7FF0000000000000 : 0x7f800000;
    int bias = isDoublePrecision ? 1023 : 127;
    int signShift = isDoublePrecision ? 63 : 31;
    int sign = (number >> signShift) & 0x01;
    int exponent = ((number & exponentMask) >> mantissaShift) - bias;
    int power = -1;
    double total = 0.0;
    for ( int i = 0; i < mantissaShift; i++ )
    {
        int calc = (number >> (mantissaShift-i-1)) & 0x01;
        total += calc * pow(2.0, power);
        power--;
    }
    double value = (sign ? -1 : 1) * pow(2.0, exponent) * (total + 1.0);
    return value;
}
But in return I am getting the value 0; also, when I try to print strData it gives me only CCCCCD.
I am using the Eclipse IDE.
Please, I need some suggestions.
((strResponseData[12] & 0xFF)<< 512 )
First, the << operator takes a number of bits to shift; you seem to be confusing it with multiplication by the resulting power of two. While that has the same effect, you need to supply the exponent, not the power of two itself. Given that you have no typical data types of 512-bit width, it's fairly certain that this should actually be
((strResponseData[12] & 0xFF)<< 9 )
Next, the value being shifted needs to be of a type wide enough to hold the result before you do the shift. A char is obviously not sufficient, so you need to explicitly cast the value to a sufficiently wide type before you perform the shift.
Additionally, keep in mind that depending on your platform an unsigned long may be either a 32-bit or a 64-bit type, so if you are doing an operation with a bit shift where the result would not fit in 32 bits, you may want to use an unsigned long long, or better yet make things unambiguous, for example with #include <stdint.h> and types such as uint32_t or uint64_t. Given that your question is tagged "embedded", this is especially important to keep in mind, as you might be targeting a 32 (or even 8) bit processor but sometimes building and testing the algorithm on the development machine instead.
Further, a char can be either a signed or an unsigned type. Before shifting, you should make that explicit. Given that you are combining multiple pieces of something, it is almost certain that at least most of these should be treated as unsigned.
So probably you want something like
((uint32_t)(strResponseData[12] & 0xFF)<< 9 )
Unless you are on an odd platform where char is not 8 bits (for example, some TI DSPs), you probably don't need to pre-mask with 0xff, but it's not hurting anything.
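Putting those points together: if the intent is simply to pack the eight bytes strResponseData[5] through strResponseData[12] into one 64-bit value, least significant byte first (that is my reading of the code, not something stated explicitly), a sketch could look like this (packBytes is just an illustrative name):
#include <stdint.h>

uint64_t packBytes(const unsigned char *buf)
{
    uint64_t result = 0;
    for (int i = 0; i < 8; i++)
    {
        /* cast before shifting so the shift happens in a 64-bit type,
           and shift by a bit count (8 per byte), not by a power of two */
        result |= (uint64_t)(buf[i] & 0xFFu) << (8 * i);
    }
    return result;
}
It would be called as packBytes((const unsigned char *)&strResponseData[5]); whether those are really the right eight bytes of your buffer is something only you can confirm.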
Finally, it is not 100% clear what you are starting with:
I have some example data [7E FF 01 46 4B CD CC CC CC CC CC 10 40 1B 7E], which is in hex.
That statement is ambiguous, as it is not clear whether you mean
[0x7e, 0xff, 0x01, 0x46...]
which would be an array of byte values that debugging code has printed out in hex for human convenience, or whether you actually have something such as
"[7E FF 01 46 .... ]"
which is a string of text containing a human-readable representation of hex digits as printable characters. In the latter case, you'd first have to convert the character representation of the hex digits or octets into numeric values.
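If it turns out you have the text form, one way (of several) to turn the hex octets back into byte values is strtoul; a rough sketch, assuming the brackets have been stripped and the octets are separated by spaces (parseHexOctets is just an illustrative name):
#include <stdlib.h>

/* parse up to maxBytes space-separated hex octets from text into out,
   returning how many were converted */
size_t parseHexOctets(const char *text, unsigned char *out, size_t maxBytes)
{
    size_t count = 0;
    char *end;
    while (count < maxBytes)
    {
        unsigned long v = strtoul(text, &end, 16);
        if (end == text)          /* no more hex digits found */
            break;
        out[count++] = (unsigned char)v;
        text = end;
    }
    return count;
}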

Struct with bit fields reordering?

I wrote a struct with an anonymous struct and union (byte is a typedef for unsigned char):
struct dns_flags
{
    union
    {
        struct
        {
            byte QR : 1;
            byte opCode : 4;
            byte AA : 1;
            byte TC : 1;
            byte RD : 1;
            byte RA : 1;
            byte zero : 3;
            byte rcode : 4;
        };
        uint16_t flagsValue;
    };
};
Which represents DNS protocol flags.
I used #pragma pack(push,1), and while sizeof(dns_flags) == 2, when flagsValue == 0x8180 I get rcode == 8. So I wonder about the layout of the struct in memory: the rcode nibble is the higher one?! That just doesn't make any sense to me... I am working with VS2012.
Guy
The actual order of bitfields isn't specified in the standard. It is entirely up to the compiler to do what it likes (as long as it does it the same way each time). I think most compilers follow the byte-order of the machine itself (so the first field is the lowest bit in a little endian machine, and the highest bit in a big endian machine), but this is not guaranteed, just "convention".
Do not use bit fields in a union to address the bits of an unsigned integer (the layout is compiler- and machine-dependent). A portable solution:
struct dns_flags {
    uint16_t flagsValue;
    uint16_t qr() const { return flagsValue & 1; }
    ...
};
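The remaining accessors follow the same pattern. Purely as an illustration (and assuming the bit numbering described in the next answer, i.e. QR in bit 0 up to rcode in bits 12-15, which is what the original bit-field struct happened to produce with VS2012 on little-endian x86, not DNS wire order), the elided members might look like:
uint16_t opCode() const { return (flagsValue >> 1) & 0xF; }
uint16_t aa()     const { return (flagsValue >> 5) & 1; }
uint16_t tc()     const { return (flagsValue >> 6) & 1; }
uint16_t rd()     const { return (flagsValue >> 7) & 1; }
uint16_t ra()     const { return (flagsValue >> 8) & 1; }
uint16_t rcode()  const { return (flagsValue >> 12) & 0xF; }
With flagsValue = 0x8180 these return ra() == 1 and rcode() == 8, matching the observation in the question.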
I don't know what the standard mandates (you can refer to Mats Peterson's answer for that), but this is how memory is laid out for this struct.
It is because your machine is little endian, and the value 0x8180 gets stored as the byte sequence 0x80 0x81, i.e. the lower byte 0x80 is at the lower address and the higher byte 0x81 is at the higher address.
(LB)(0x80) bits (7->0)  = 0x80
(HB)(0x81) bits (15->8) = 0x81
This forms your lower byte (0x80):
byte QR : 1;      <------- lower address  --> (bit 0)
byte opCode : 4;                          --> (bits 1,2,3,4)
byte AA : 1;                              --> (bit 5)
byte TC : 1;                              --> (bit 6)
byte RD : 1;                              --> (bit 7)
And this is the higher byte (0x81):
byte RA : 1;                              --> (bit 8)
byte zero : 3;                            --> (bits 9,10,11)
byte rcode : 4;   <------- higher address --> (bits 12,13,14,15)
If you check the value of RA it should be 1, because it sits at bit position 8 and bit 8 of 0x8180 is 1.
And if the value written is 0x8280 instead, then RA will have the value 0 and zero will have the value 1.
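A quick way to see this layout in action (just a sketch; the result is implementation-defined, but it matches the description above on VS2012 or GCC targeting little-endian x86, and it assumes the dns_flags definition and the byte typedef from the question are in scope):
#include <cstdio>

int main()
{
    dns_flags f;
    f.flagsValue = 0x8180;
    // with the layout described above: QR=0, opCode=0, RA=1, rcode=8
    printf("QR=%d opCode=%d RA=%d rcode=%d\n", f.QR, f.opCode, f.RA, f.rcode);
    return 0;
}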

memcpy from Byte * to unsigned int Is Reversing Byte Order

I have a CFBitVector that looks like '1000000000000000' (a 1 followed by fifteen 0s).
I pass the byte array to CFBitVectorGetBits, which fills it with the values from this CFBitVector. After this call, the two-element bytes array looks like:
bytes[0] == '0x80'
bytes[1] == '0x00'
This is exactly what I would expect. However, after copying the contents of the bytes array to the unsigned int bytesValue, the value is 128 when it should be 32768. The decimal value 128 corresponds to the hex value 0x0080. Essentially, it seems that the byte order is reversed by the memcpy. What is going on here? Is this just an issue with endianness?
Thanks
CFMutableBitVectorRef bitVector = CFBitVectorCreateMutable(kCFAllocatorDefault, 16);
CFBitVectorSetCount(bitVector, 16);
CFBitVectorSetBitAtIndex(bitVector, 0, 1);
CFRange range = CFRangeMake(0, 16);
Byte bytes[2] = {0,0};
unsigned int bytesValue = 0;
CFBitVectorGetBits(bitVector, range, bytes);
memcpy(&bytesValue, bytes, sizeof(bytes));
return bytesValue;
What is going on here? Is this just an issue with endianness?
Yes.
Your computer is little endian. On a little-endian machine, the 16-bit value 32768 (0x8000) would be represented in memory as:
00 80
You have:
80 00
which is the opposite, representing 128 (0x0080), as you're seeing.
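If the intent is to treat the buffer as a big-endian (most significant byte first) 16-bit value no matter what the host's byte order is, one option (a sketch, not the only way) is to assemble it explicitly instead of using memcpy:
Byte bytes[2] = {0x80, 0x00};   /* as filled in by CFBitVectorGetBits */
unsigned int bytesValue = ((unsigned int)bytes[0] << 8) | bytes[1];
/* bytesValue is now 32768 (0x8000) regardless of host endianness */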

convert 4 bytes to 3 bytes in C++

I have a requirement where 3 bytes (24 bits) need to be populated in a binary protocol. The original value is stored in an int (32 bits). One way to achieve this would be as follows:
Technique 1:
long x = 24;
long y = htonl(x);
long z = y>>8;
memcpy(dest, &z, 3);
Please let me know if the above is the correct way to do it.
The other way, which I don't understand, was implemented as below:
Technique 2:
typedef struct {
    char data1;
    char data2[3];
} some_data;

typedef union {
    long original_data;
    some_data data;
} combined_data;

long x = 24;
combined_data somedata;
somedata.original_data = htonl(x);
memcpy(dest, &somedata.data.data2, 3);
What I don't understand is how the 3 bytes ended up in combined_data.data.data2, as opposed to the first byte going into combined_data.data.data1 and the next 2 bytes going into combined_data.data.data2?
This is an x86_64 platform running 2.6.x Linux and gcc.
PARTIALLY SOLVED:
The x86_64 platform is little endian, so the least significant byte is stored at the lowest address (in the diagrams below, the rightmost byte, Byte1, is at the lowest address). A variable of type long with value 24 will have the following memory representation
|--Byte4--|--Byte3--|--Byte2--|--Byte1--|
     0         0         0       0x18
With htonl() performed on the above value, the memory becomes
|--Byte4--|--Byte3--|--Byte2--|--Byte1--|
   0x18        0         0        0
In the struct some_data:
data1    = Byte1
data2[0] = Byte2
data2[1] = Byte3
data2[2] = Byte4
But my question still holds: why not simply right shift by 8 as shown in Technique 1?
A byte takes 8 bits :-)
int x = 24;
int y = x<<8;
Shifting by 0 changes nothing; shifting by 1 multiplies by 2, by 2 multiplies by 4, by 8 multiplies by 256.
If we are on a big-endian machine, the 4 bytes are put in memory as 2143, and such algorithms won't work for numbers greater than 2^15. In any case, on a big-endian machine you would have to define what "putting an integer in 3 bytes" means.
Hmm. I think the second proposed algorithm will be OK, but change the order of bytes:
you have them as 2143, and you need 321, I think. But better check it.
Edit: I checked on Wikipedia; x86 is little endian, they say, so the algorithms are OK.
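For what it's worth, a third option (a sketch of an alternative, not taken from the question; put24be is just an illustrative name) is to write the three bytes with explicit shifts, which sidesteps the endianness question entirely because it never reinterprets the memory of a wider integer:
#include <stdint.h>

/* write the low 24 bits of value into dest, most significant byte first */
void put24be(unsigned char *dest, uint32_t value)
{
    dest[0] = (value >> 16) & 0xFF;
    dest[1] = (value >> 8) & 0xFF;
    dest[2] = value & 0xFF;
}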

UINT16 value appears to be "backwards" when printing

I have a UINT8 pointer mArray, which is being assigned information via a *(UINT16 *) cast. E.g.:
int offset = someValue;
UINT16 mUINT16 = 0xAAFF;
*(UINT16 *)&mArray[offset] = mUINT16;
for(int i = 0; i < mArrayLength; i++)
{
    printf("%02X", *(mArray + i));
}
output: ... FF AA ...
expected: ... AA FF ...
The value I am expecting to be printed when it reaches offset is AA FF, but the value that is printed is FF AA, and for the life of me I can't figure out why.
You are using a little endian machine.
You didn't specify, but I'm guessing your mArray is an array of bytes rather than an array of UINT16s.
You're also running on a little-endian machine. On little-endian machines the bytes are stored in the opposite order from big-endian machines; big-endian machines store them pretty much the way humans read them.
You are probably using a computer that uses a "little-endian" representation of numbers in memory (such as the Intel x86 architecture). Basically this means that the least significant byte of any value will be stored at the lowest address of the memory location that is used to store the value. See Wikipedia for details.
In your case, the number 0xAAFF consists of the two bytes 0xAA and 0xFF, with 0xFF being the least significant one. Hence, a little-endian machine will store 0xFF at the lowest address and then 0xAA. So if you interpret the memory location to which you have written a UINT16 value as UINT8s, you will first get the byte at the lowest address, which happens to be 0xFF.
If you want to write an array of UINT16 values into an appropriately sized array of UINT8 values such that the output will match your expectations you could do it in the following way:
/* copy inItems UINT16 values from inArray to outArray in
 * MSB first (big-endian) order
 */
void copyBigEndianArray(UINT16 *inArray, size_t inItems, UINT8 *outArray)
{
    for (size_t i = 0; i < inItems; i++)
    {
        // shift one byte right: AAFF -> 00AA
        outArray[2*i] = inArray[i] >> 8;
        // cut off left byte in conversion: AAFF -> FF
        outArray[2*i + 1] = inArray[i];
    }
}
You might also want to check out the hton*/ntoh*-family of functions if they are available on your platform.
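For example (a brief sketch; htons comes from the sockets headers, so availability depends on your platform), writing a single UINT16 in network byte order could look like:
#include <arpa/inet.h>   /* htons; on Windows use <winsock2.h> instead */
#include <string.h>

UINT16 bigEndianValue = htons(mUINT16);        /* 0xAAFF -> bytes AA FF in memory */
memcpy(&mArray[offset], &bigEndianValue, 2);   /* now prints ... AA FF ... */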
It's because your computer's CPU is using a little-endian representation of integers in memory.