Bit field manipulation - C++

In the following code
#include <iostream>
using namespace std;

struct field
{
    unsigned first : 5;
    unsigned second : 9;
};

int main()
{
    union
    {
        field word;
        int i;
    };
    i = 0;
    cout << "First is : " << word.first << " Second is : " << word.second << " I is " << i << "\n";
    word.first = 2;
    cout << "First is : " << word.first << " Second is : " << word.second << " I is " << i << "\n";
    return 0;
}
When I assign word.first = 2, as expected it updates 5 bits of the word and gives the desired output. It is the output of 'i' that is a bit confusing. With word.first = 2, i comes out as 2, but when I do word.second = 2, the output for i is 64. Since they share the same memory block, shouldn't the output (for i) in the latter case also be 2?

This particular result is platform-specific; you should read up on endianness.
But to answer your question, no, word.first and word.second don't share memory; they occupy separate bits. Evidently, the underlying representation on your platform is thus:
bit 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |     |          second          |    first     |
   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
   |<--------------------- i --------------------->|
So setting word.second = 2 sets bit #6 of i, and 2^6 = 64.
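A quick way to check that arithmetic (a minimal sketch):
#include <cassert>

int main() {
    unsigned i = 2u << 5;   // what word.second = 2 stores: 2 shifted past the 5 bits of first
    assert(i == 64);        // bit #6 set, 2^6 = 64
    return 0;
}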

While this depends on both your platform and your specific compiler, this is what happens in your case:
The union overlays both the int and the struct onto the same memory. Let us assume, for now, that your int has a size of 32 bits. Again, this depends on multiple factors. Your memory layout will look something like this:
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
                  SSSSSSSSSFFFFF
Where I stands for the integer, S for the second field and F for the first field of your struct. Note that I have represented the most significant bit on the left.
When you initialize the integer to zero, all the bits are set as zero, so first and second are also zero.
When you set word.first to two, the memory layout becomes:
00000000000000000000000000000010
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
                  SSSSSSSSSFFFFF
Which leads to a value of 2 for the integer. However, by setting the value of word.second to 2, the memory layout becomes:
00000000000000000000000001000000
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
                  SSSSSSSSSFFFFF
Which gives you a value of 64 for the integer.
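Put differently, with this layout the integer can always be reconstructed from the two fields using shifts; a minimal sketch (assuming the 32-bit layout described above, with first in bits 0-4 and second in bits 5-13):
#include <cassert>

int main() {
    unsigned first = 2, second = 2;      // as if both fields had been assigned
    unsigned i = (second << 5) | first;  // second starts at bit 5
    assert(i == 66);                     // 2*32 + 2
    return 0;
}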

Related

type conversion from int to class behaving weirdly

So, I am trying to convert a uint16_t (a 16-bit int) to a class, in order to get at the class member variables. But it is not working as expected.
#include <cstdint>
#include <iostream>
using namespace std;

class test {
public:
    uint8_t m_pcp : 3; // Defining max size as 3 bytes
    bool m_dei : 1;
    uint16_t m_vid : 12; // Defining max size as 12 bytes
public:
    test(uint16_t vid, uint8_t pcp=0, bool dei=0) {
        m_vid = vid;
        m_pcp = pcp;
        m_dei = dei;
    }
};
int main() {
    uint16_t tci = 65535;
    test t = (test)tci;
    cout << "pcp: " << t.m_pcp << " dei: " << t.m_dei << " vid " << t.m_vid << "\n";
    return 0;
}
Expected output:
pcp:1 dei: 1 vid 4095
The actual output:
pcp: dei: 0 vid 4095
Also,
cout << sizeof(t)
returns 2. Shouldn't it be 4?
Am I doing something wrong?
test t = (test)tci;
This line does not perform the cast you expect (that would be a reinterpret_cast, but it would not compile). It simply calls your constructor with the default values for pcp and dei. So m_vid is assigned 65535 truncated to 12 bits, and m_pcp and m_dei are assigned 0. Try removing the constructor to see that the cast no longer compiles.
The only way I know to do what you want is to write a correct constructor, like so:
test(uint16_t i) {
    m_vid = i & 0x0fff; // low 12 bits
    i >>= 12;
    m_dei = i & 0x1;    // next bit
    i >>= 1;
    m_pcp = i & 0x7;    // top 3 bits
}
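A quick usage sketch of that constructor (a minimal demo, assuming the test class from the question; note that streaming a uint8_t bit-field prints it as a raw character, which is why your pcp output looked blank, so promote it with unary + first):
#include <cstdint>
#include <iostream>
using namespace std;

class test {
public:
    uint8_t m_pcp : 3;
    bool m_dei : 1;
    uint16_t m_vid : 12;
    test(uint16_t i) {
        m_vid = i & 0x0fff;
        i >>= 12;
        m_dei = i & 0x1;
        i >>= 1;
        m_pcp = i & 0x7;
    }
};

int main() {
    test t(65535);
    // unary + promotes m_pcp to int, so it prints as a number instead of an unprintable character
    cout << "pcp: " << +t.m_pcp << " dei: " << t.m_dei << " vid: " << t.m_vid << "\n";
    cout << "sizeof(test): " << sizeof(test) << "\n"; // 2: the bit fields pack into 16 bits
    return 0;
}
// prints: pcp: 7 dei: 1 vid: 4095
//         sizeof(test): 2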
Also I'm not sure why you would expect m_pcp to be 1, since the 3 highest bits of 65535 make 7.
Also, cout<<sizeof(t) returns 2. shouldn't it be 4?
No, 3+1+12=16 bits make 2 bytes.
Your bit fields have 16 bits in total, so 2 bytes is the correct size (the compiler will pack adjacent bit fields together, but be wary, since this may vary across compilers). Your constructor, given a single uint16_t value, assigns just the low 12 bits of that value to m_vid and 0 to the other members. The low 12 bits of 65535 are 4095, so the output is correct, as you note. (Also, the comments in your struct that say "bytes" should read "bits".) Your expectation for the other members is off: the constructor clearly provides a 0 value for them when not specified.

packing an array of 3 values in buffer

I have the following problem I am unable to solve gracefully.
I have a data type that can take 3 possible values (0,1,2).
I have an array of 20 element of this data type.
As I want to encode the information in the least amount of memory, I did the following:
consider that each element can take up to 4 values (2 bits)
each char holds 8 bits, so I can put 4 times an element
5 char holds 40 bits, so I can store 20 elements.
I have done this and it works fine.
However I'm interested evaluating the space gained by using the fact that my element can only take 3 values and not 4.
Every possible combination gives us 3 to the 20th power, which is 3,486,784,401. However, 256 to the 4th power gives us 4,294,967,296, which is greater. This means I could encode my data in 4 chars.
Is there a generic method to implement the 2nd idea? The 1st idea is simple to implement with bit masks / bit shifts. However, since 3 values don't fit in a whole number of bits, I have no idea how to encode / decode these values into an array of 4 chars.
Do you have any idea or reference on how it's done? I think there must be a general method. If anything, I'm interested in the feasibility of this.
edit: this could be simplified to: how to store 5 values from 0 to 2 in only 1 byte (as 256 >= 3^5 = 243)
You should be able to do what you said using 4 bytes. Assume that you store the 20 values in a single uint32_t called value (it needs to be unsigned: 3^20 exceeds the range of a signed 32-bit int); here is how you would extract any particular element:
element[0] = value % 3;
element[1] = (value / 3) % 3;
element[2] = (value / 9) % 3;
...
element[19] = (value / 1162261467) % 3; // 1162261467 = 3 ^ 19
Or as a loop:
for (i = 0; i < 20; i++) {
    element[i] = value % 3;
    value /= 3;
}
To build value from element, you would just do the reverse, something like this:
value = 0;
for (i = 19; i >= 0; i--)
    value = value * 3 + element[i];
There is a generic way to figure out how many bits you need:
If your data type has N different values, then you need log(N) / log(2) bits to store one value. For instance in your example, log(3) / log(2) equals 1.585 bits.
Of course in reality you will have to pack a fixed number of values into an integer number of bits, so you have to multiply this 1.585 by that number and round up. For instance if you pack 5 of them:
1.585 × 5 = 7.925, meaning that 5 of your values just fit in one 8-bit char.
The way to unpack the values has been shown in JS1's answer. The generic formula for unpacking is element[i] = (value / (N ^ i) ) mod N
Final note, this is only meaningful if you really need to optimize memory usage. For comparison, here are some popular ways people pack these value types. Most of the time the extra space taken up is not a problem.
an array of bool: uses 8 bits to store one bool. And a lot of people really dislike the behavior of std::vector<bool>.
enum Bla { BLA_A, BLA_B, BLA_C}; an array or vector of Bla probably uses 32 bits per element (sizeof(Bla) == sizeof(int)).
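As a concrete version of the edit at the end of the question (5 values from 0 to 2 per byte, since 3^5 = 243 <= 256), here is a minimal sketch; pack5 and unpack5 are hypothetical helper names:
#include <cassert>
#include <cstdint>

// Pack five base-3 digits (each 0-2) into a single byte.
uint8_t pack5(const uint8_t vals[5]) {
    uint8_t b = 0;
    for (int i = 4; i >= 0; i--)
        b = b * 3 + vals[i];   // Horner's scheme in base 3
    return b;
}

// Unpack by peeling off one base-3 digit at a time.
void unpack5(uint8_t b, uint8_t out[5]) {
    for (int i = 0; i < 5; i++) {
        out[i] = b % 3;
        b /= 3;
    }
}

int main() {
    uint8_t in[5] = {2, 0, 1, 2, 1}, out[5];
    unpack5(pack5(in), out);
    for (int i = 0; i < 5; i++)
        assert(in[i] == out[i]);
    return 0;
}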

what value will be printed out if it is out of range in C++

I am a rookie in C++ and I have a question here.
I use an int to compute the 100th power of 2 by repeated doubling. I know that the outcome will be out of the range of an int variable; I am just curious, since the result given by the program is 0. How did 0 come out?
Thanks in advance!
My code is as followed:
#include <iostream>
using namespace std;

int main()
{
    int a = 1;
    unsigned int b = 1;
    for (int i = 1; i <= 100; i++)
    {
        a = 2 * a;
        b = 2 * b;
    }
    cout << "the 100th power of 2 is (using a signed int): " << a << endl;
    cout << "the 100th power of 2 is (using an unsigned int): " << b << endl;
    // The fix: pause so the console window stays open
    cout << "Enter a Char to Exit." << endl;
    char theFix;
    cin >> theFix;
    return 0;
}
Multiplying an unsigned integer or a positive signed integer by 2 is like shifting left by 1, while a 0 bit will be shifted in from the right. After 32 iterations (assuming 32 bit integers), the entire value will be all 0 bits. After that, shifting 0 left will not change the outcome anymore.
Since you're new to C++, you might not know how the computer stores information. Ultimately, integers are stored as binary numbers (a bunch of 1's and 0's), typically 32 bits for an int.
a = a * 2; // multiplication
a <<= 1;   // left shift
These two statements are equivalent (for non-negative values) due to the nature of binary numbers.
For instance, 0....000010 in binary notation == 2 in decimal notation.
So,
2 * 2 = 4 = 0....000100
4 * 2 = 8 = 0....001000
8 * 2 = 16 = 0....010000
and so on...
Since the bit count is capped at 32 for your int, the single 1 bit eventually reaches the top position: 2^31 == 1000....000. When you multiply by 2 again, that bit is shifted out and you end up with 000...000000 = 0.
All further doublings of 0 are still zero, so that's where your final result came from.
EDIT: Would just like to point out that this is one of the few situations where this exact result occurs. If you were to try the number 3, for example, you would see ordinary integer-overflow behavior: the value wraps around but never sticks at 0, since repeatedly multiplying by an odd number never produces a multiple of 2^32.
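If you want to watch this happen, here is a minimal sketch using the unsigned counter from the question (unsigned overflow wraps modulo 2^32 by definition, whereas signed overflow is technically undefined behaviour):
#include <iostream>

int main() {
    unsigned int b = 1;
    for (int i = 1; i <= 100; i++) {
        b *= 2;   // wraps modulo 2^32
        if (b == 0) {
            std::cout << "b became 0 at iteration " << i << "\n"; // 32 with a 32-bit int
            break;
        }
    }
    return 0;
}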

c++; Is bitset the solution for me?

I am writing a program and using memcpy to copy some bytes of data, using the following code:
#define ETH_ALEN 6
unsigned char sourceMAC[6];
unsigned char destMAC[6];
char* txBuffer;
....
memcpy((void*)txBuffer, (void*)destMAC, ETH_ALEN);
memcpy((void*)(txBuffer + ETH_ALEN), (void*)sourceMAC, ETH_ALEN);
Now I want to copy some data onto the end of this buffer (txBuffer) that is not a whole number of bytes (it doesn't finish on a byte boundary), so memcpy() can't be used (I don't believe?).
I want to add 16 more bits' worth of data, which is a round 2 bytes. First I need to put a value into the next 3 bits of txBuffer, which I have stored in an int, followed by a fourth bit which is always 0. Next I need to copy in another 12-bit value, again stored in an int.
So the first decimal value, stored in an int, is between 0 and 7 inclusive; the second number I mentioned goes into the final 12 bits, so its stored value is within the range of 2^12. Should I, for example, 'bit-copy' the last three bits of the int into memory, or merge all these values together somehow?
Is there a way I can combine these three values into 2 bytes to copy with memcpy, or should I use something like bitset to copy them in, a bit at a time?
How should I solve this issue?
Thank you.
Assuming int is 4 bytes on your platform:
int composed = 0;
int three_bits = something;
int twelve_bits = something_else;

// three_bits goes in bits 0-2, bit 3 is left as 0, twelve_bits goes in bits 4-15
composed = (three_bits & 0x07) | ((twelve_bits & 0xFFF) << 4);
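Only the low 16 bits of composed are used, so you can then append those two bytes to the buffer yourself. A sketch (assuming <cstdint> is available and that the packed bytes should follow the two MAC addresses; the high byte is written first so the result does not depend on host endianness):
uint16_t packed = (uint16_t)composed;
// append after the two 6-byte MAC addresses already copied in
txBuffer[2 * ETH_ALEN]     = (char)(packed >> 8);   // high byte
txBuffer[2 * ETH_ALEN + 1] = (char)(packed & 0xFF); // low byte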

c/c++ how to convert short to char

I am using MS C++. I am using a struct like:
struct header {
    unsigned port : 16;
    unsigned destport : 16;
    unsigned not_used : 7;
    unsigned packet_length : 9;
};

struct header HR;
The value of this header I need to put into a separate char array.
I did memcpy(&REQUEST[0], &HR, sizeof(HR));
but the value of packet_length is not appearing properly.
For example, if I assign HR.packet_length = 31;
I get -128 (at the fifth byte) and 15 (at the sixth byte).
I'd appreciate help with this, or a more elegant way to do it.
Thanks
Sounds like the expected behaviour with your struct as you defined packet_length to be 9 bits long. So the lowest bit of its value is already within the fifth byte of the memory. Thus the value -128 you see there (as the highest bit of 1 in a signed char is interpreted as a negative value), and the value 15 is what is left in the 6th byte.
The memory bits look like this (in reverse order, i.e. higher to lower bits):
       byte 6      |      byte 5
  0 0 0 0 1 1 1 1  |  1 0 0 0 0 0 0 0
  <---- packet_length ---->|<- not_used ->
(the 9 bits of packet_length are all of byte 6 plus the top bit of byte 5; the low 7 bits of byte 5 are not_used)
Note also that this approach may not be portable, as the byte order inside multibyte variables is platform dependent (see endianness).
Update: I am not an expert in cross-platform development, nor did you give many details about the layout of your request etc. Anyway, in this situation I would try to set the fields of the request individually instead of memcpying the struct into it. That way I can at least control the exact value of each individual field.
#include <stdio.h>

struct header {
    unsigned port : 16;
    unsigned destport : 16;
    unsigned not_used : 7;
    unsigned packet_length : 9;
};

int main() {
    struct header HR = {.packet_length = 31};
    printf("%u\n", HR.packet_length);
}
$ gcc new.c && ./a.out
31
Update:
I know that I can print that value directly by using the attribute in the struct. But I need to send this struct over the network, and there I am using Java.
In that case, use an array of chars (16+16+7+9 = 48 bits, i.e. 6 bytes) and parse it on the other side in Java.
The size of the array will be less than that of the struct, and more packing may be possible in a single MTU.
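A sketch of that approach: a hypothetical pack_header helper that serializes the struct field by field in network (big-endian) byte order, so the Java side can parse it without caring about this machine's endianness or struct layout:
#include <stdint.h>

void pack_header(const struct header *h, unsigned char out[6]) {
    out[0] = (unsigned char)(h->port >> 8);      /* 16-bit port, high byte first */
    out[1] = (unsigned char)(h->port & 0xFF);
    out[2] = (unsigned char)(h->destport >> 8);  /* 16-bit destport */
    out[3] = (unsigned char)(h->destport & 0xFF);
    /* combine the 7 unused bits and the 9-bit packet_length into the last 16 bits */
    uint16_t tail = (uint16_t)((h->not_used << 9) | (h->packet_length & 0x1FF));
    out[4] = (unsigned char)(tail >> 8);
    out[5] = (unsigned char)(tail & 0xFF);
}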