What does "straddle" in a Bitwise Context mean? - bit-manipulation

I am reading about bitfields from:
http://en.cppreference.com/w/cpp/language/bit_field.
The article mentions "straddl(e)"ing bits.
An example context includes...
"Adjacent bit field members may be packed to share and straddle the
individual bytes."
What does this word mean in a bitwise field?

I used that word when writing the page to refer to the situation, as correctly spotted in comments, shown in the 2nd example on that page:
#include <iostream>
struct S {
// will usually occupy 2 bytes:
// 3 bits: value of b1
// 2 bits: unused
// 6 bits: value of b2
// 2 bits: value of b3
// 3 bits: unused
unsigned char b1 : 3, : 2, b2 : 6, b3 : 2;
};
int main()
{
std::cout << sizeof(S) << '\n'; // usually prints 2
}
Here (assuming sizeof(S) is 2) the field b2 is 6 bits long: its first 3 bits are in the first byte and its remaining 3 bits are in the second byte. It straddles two bytes. (The next example on that page shows how to force all 6 bits into one byte.)

Related

Memory alignment: why is the total size of a structure a multiple of the structure alignment and not of the processor word size?

From a previous post, I understood why the alignment of a structure must equal the size of its largest member.
Now I would like to know why, once this alignment is chosen, padding must be added so that the total structure size is a multiple of the structure alignment and not of the processor word size.
Here is an example:
#include <stdio.h>
// Alignment requirements
// (typical 32 bit machine)
// char 1 byte
// short int 2 bytes
// int 4 bytes
// double 8 bytes
typedef struct structc_tag
{
char c;
double d;
int s;
} structc_t;
int main()
{
printf("sizeof(structc_t) = %zu\n", sizeof(structc_t));
return 0;
}
We could think that the size of structc_t is equal to 20 bytes with:
char c;
char Padding1[7]
double d;
int s;
because we have taken a structure alignment equal to 8 (double d).
But actually, the total size is equal to 24 because 20 is not a multiple of 8, so padding is added after "int s" (char Padding2[4]).
If I take an array of structures, the first element of each structure is at a good address for a 32-bit processor (0, 20, 40) because 20, 40, ... are multiples of 4 (the processor word size).
So why do I have to pad to a multiple of 8 (the structure alignment), i.e. 24 bytes in this example, instead of having 20 bytes, which gives good addresses for the 32-bit processor (word size = 4 bytes): 0, 20, 40, ...?
Thanks for your help
Consider structc_t array[n]. If the size of the structure were 20 (not a multiple of 8), the second element would start at an address that is only 4-byte aligned.
To clarify: let array[0] be at address 0. If the size of the structure is 20, array[1] starts at address 20, and its d lands at address 28, which is not a properly aligned address for a double.

How to get the least significant 3 bits of a char in C++?

The following text is what I'm stuck with on a piece of documentation.
The least significant 3 bits of the first char of the array indicates whether
it is A or B. If the 3 bits are 0x2, then the array is in a A
format. If the 3 bits are 0x3, then the array is in a B format.
This is the first time I have had to deal with least significant bits. After searching on StackOverflow, this is what I did:
int lsb = first & 3;
if (lsb == 0x02)
{
// A
}
else if (lsb == 0x03)
{
// B
}
Is this correct? I want to ensure this is the right way (and avoid blowing my foot off later) before I move on.
The least significant 3 bits of x are taken using x&7 unlike the first & 3 you use. In fact first & 3 will take the least significant 2 bits of first.
You should convert the numbers to binary to understand why this is so: 3 in binary is 11, while 7 is 111.
Normally, the 3 least significant bits should be yourchar&0x07 instead.
7 because 7 is 1+2+4 or binary 111, corresponding to the 3 LSB.
The variable you need will have every bit zero and three LSBs 1, which is 0111 in short.
0111 is 0x7, use variable & 0x7 to mask your variable.
Google bit masking for more information about it.
d3 = b11 = b01 | b10
So no, right now you're comparing only the 2 LSBs. b111 would be d7.
If you want to write down the number of bits to take, you'd have to write it as
unsigned int ls3b = ~(UINT_MAX << 3);
what this does is, it takes the all 1 bit array, shifts it by 3 bits to the left (leaving the 3 LSBs 0) and then inverts it.

Bit field manipulation

In the following code
#include <iostream>
using namespace std;
struct field
{
unsigned first : 5;
unsigned second : 9;
};
int main()
{
union
{
field word;
int i;
};
i = 0;
cout<<"First is : "<<word.first<<" Second is : "<<word.second<<" I is "<<i<<"\n";
word.first = 2;
cout<<"First is : "<<word.first<<" Second is : "<<word.second<<" I is "<<i<<"\n";
return 0;
}
When I set word.first = 2, it updates the 5 bits of first as expected and gives the desired output. It is the output of i that is confusing. With word.first = 2, i prints as 2, but when I do word.second = 2, i prints as 64. Since they share the same memory block, shouldn't the output for i in the latter case be 2?
This particular result is platform-specific; you should read up on endianness.
But to answer your question, no, word.first and word.second don't share memory; they occupy separate bits. Evidently, the underlying representation on your platform is thus:
bit  15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |     |          second          |    first     |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |<--------------------- i --------------------->|
So setting word.second = 2 sets bit #6 of i, and 2^6 = 64.
While this depends on both your platform and your specific compiler, this is what happens in your case:
The union overlays both the int and the struct onto the same memory. Let us assume, for now, that your int has a size of 32 bits. Again, this depends on multiple factors. Your memory layout will look something like this:
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
SSSSSSSSSFFFFF
Where I stands for the integer, S for the second field and F for the first field of your struct. Note that I have represented the most significant bit on the left.
When you initialize the integer to zero, all the bits are set as zero, so first and second are also zero.
When you set word.first to two, the memory layout becomes:
00000000000000000000000000000010
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
                  SSSSSSSSSFFFFF
Which leads to a value of 2 for the integer. However, by setting the value of word.second to 2, the memory layout becomes:
00000000000000000000000001000000
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
                  SSSSSSSSSFFFFF
Which gives you a value of 64 for the integer.

C++ Explanation of the code

I have array from serial read, named sensor_buffer. It contains 21 bytes.
gyro_out_X=((sensor_buffer[1]<<8)+sensor_buffer[2]);
gyro_out_Y=((sensor_buffer[3]<<8)+sensor_buffer[4]);
gyro_out_Z=((sensor_buffer[5]<<8)+sensor_buffer[6]);
acc_out_X=((sensor_buffer[7]<<8)+sensor_buffer[8]);
acc_out_Y=((sensor_buffer[9]<<8)+sensor_buffer[10]);
acc_out_Z=((sensor_buffer[11]<<8)+sensor_buffer[12]);
HMC_xo=((sensor_buffer[13]<<8)+sensor_buffer[14]);
HMC_yo=((sensor_buffer[15]<<8)+sensor_buffer[16]);
HMC_zo=((sensor_buffer[17]<<8)+sensor_buffer[18]);
adc_pressure=(((long)sensor_buffer[19]<<16)+(sensor_buffer[20]<<8)+sensor_buffer[21]);
What does this line do:
variable = (array_var<<8) + next_array_var
What effect does it have on the 8 bits?
<<8 ?
UPDATE:
Any example in another language (java, processing)?
Example for processing: (why use H like header?).
/*
* ReceiveBinaryData_P
*
* portIndex must be set to the port connected to the Arduino
*/
import processing.serial.*;
Serial myPort; // Create object from Serial class
short portIndex = 1; // select the com port, 0 is the first port
char HEADER = 'H';
int value1, value2; // Data received from the serial port
void setup()
{
size(600, 600);
// Open whatever serial port is connected to Arduino.
String portName = Serial.list()[portIndex];
println(Serial.list());
println(" Connecting to -> " + Serial.list()[portIndex]);
myPort = new Serial(this, portName, 9600);
}
void draw()
{
// read the header and two binary *(16 bit) integers:
if ( myPort.available() >= 5) // If at least 5 bytes are available,
{
if( myPort.read() == HEADER) // is this the header
{
value1 = myPort.read(); // read the least significant byte
value1 = myPort.read() * 256 + value1; // add the most significant byte
value2 = myPort.read(); // read the least significant byte
value2 = myPort.read() * 256 + value2; // add the most significant byte
println("Message received: " + value1 + "," + value2);
}
}
background(255); // Set background to white
fill(0); // set fill to black
// draw rectangle with coordinates based on the integers received from Arduino
rect(0, 0, value1,value2);
}
Your code has the same pattern:
value = (partial_value << 8) | (other_partial_value)
Your array stores data in 8-bit bytes, but the values are 16 bits wide. Each of your data points is two bytes, with the most significant byte stored first in your array. This pattern simply builds the full 16-bit value by shifting the most significant byte 8 bits to the left, then OR'ing the least significant byte into the lower 8 bits.
It's a shift operator. It shifts the bits in your variable to the left by 8. Shifting by 1 bit to the left is equivalent to multiplying by two (shifting to the right divides by 2). So essentially <<8 is equivalent to multiplying by 2^8 = 256.
See here for a list of C++ operators and what they do:
http://en.wikipedia.org/wiki/C%2B%2B_operators
<< is the left bit-shift operator, the result is the bits from the first operand moved to the left, with 0 bits filling in from the right.
A simple example in pseudocode:
x = 10000101;
x = x << 3;
now x is "00101000"
Study the Bitwise operation article on wikipedia for an introduction.
This is just a bit shift operator. It takes the value and shifts the bits 8 places to the left, which is equivalent to multiplying the value by 2^8. The code looks like it's reading 2 bytes of the array and creating a 16-bit integer from each pair.
It seems that sensor_buffer is an array of chars.
In order to get your value, e.g. gyro_out_X you have to combine sensor_buffer[1] and sensor_buffer[2],
where
sensor_buffer[1] holds the most significant byte and
sensor_buffer[2] holds the least significant byte
in that case
int gyro_out_X=((sensor_buffer[1]<<8)+sensor_buffer[2]);
combines the two bytes:
if sensor_buffer[1] is 0xFF
and sensor_buffer[2] is 0x10
then gyro_out_X is 0xFF10
It shifts the bits 8 places to the left, eg:
0000000001000100 << 8 = 0100010000000000
0000000001000100 << 1 =
0000000010001000 << 1 =
0000000100010000 << 1 =
0000001000100000 << 1 =
0000010001000000 << 1 =
0000100010000000 << 1 =
0001000100000000 << 1 =
0010001000000000 << 1 =
0100010000000000

Size of structure with a char, a double, an int and a t [duplicate]

When I run only the code fragment
int *t;
std::cout << sizeof(char) << std::endl;
std::cout << sizeof(double) << std::endl;
std::cout << sizeof(int) << std::endl;
std::cout << sizeof(t) << std::endl;
it gives me a result like this:
1
8
4
4
Total: 17.
But when I test sizeof struct which contains these data types it gives me 24, and I am confused. What are the additional 7 bytes?
This is the code
#include <iostream>
#include <stdio.h>
struct struct_type{
int i;
char ch;
int *p;
double d;
} s;
int main(){
int *t;
//std::cout << sizeof(char) <<std::endl;
//std::cout << sizeof(double) <<std::endl;
//std::cout << sizeof(int) <<std::endl;
//std::cout << sizeof(t) <<std::endl;
printf("s_type is %zu bytes long",sizeof(struct struct_type));
return 0;
}
:EDIT
I have updated my code like this
#include <iostream>
#include <stdio.h>
struct struct_type{
double d_attribute;
int i__attribute__(int(packed));
int * p__attribute_(int(packed));;
char ch;
} s;
int main(){
int *t;
//std::cout<<sizeof(char)<<std::endl;
//std::cout<<sizeof(double)<<std::endl;
//std::cout<<sizeof(int)<<std::endl;
//std::cout<<sizeof(t)<<std::endl;
printf("s_type is %zu bytes long",sizeof(s));
return 0;
}
and now it shows me 16 bytes. Is it good, or have I lost some important bytes?
There are some unused bytes between some members to keep the alignment correct. For example, a pointer by default resides on a 4-byte boundary for efficiency, i.e. its address must be a multiple of 4. If the struct contains only a char and a pointer
struct {
char a;
void* b;
};
then b cannot use the address #1; it must be placed at #4.
  0   1   2   3   4   5   6   7
+---+- - - - - -+---------------+
| a | (unused)  |       b       |
+---+- - - - - -+---------------+
In your case, the extra 7 bytes come from 3 bytes due to the alignment of int*, and 4 bytes due to the alignment of double.
  0   1   2   3   4   5   6   7   8   9   a   b   c   d   e   f
+---------------+---+- - - - - -+---------------+- - - - - - - -+
|       i       |ch |           |       p       |               |
+---------------+---+- - - - - -+---------------+- - - - - - - -+
 10  11  12  13  14  15  16  17
+-------------------------------+
|               d               |
+-------------------------------+
... it gives me 24, and I am confused. What are the additional 7 bytes?
These are padding bytes inserted by the compiler. Data structure padding is implementation dependent.
From Wikipedia, Data structure alignment:
Data alignment means putting the data at a memory offset equal to some multiple of the word size, which increases the system's performance due to the way the CPU handles memory. To align the data, it may be necessary to insert some meaningless bytes between the end of the last data structure and the start of the next, which is data structure padding.
To expand slightly on KennyDM's excellent answer (Kenny - please do steal this to supplement your answer if you want), this is probably what your memory structure looks like once the compiler has aligned all of the variables:
  0    1    2    3    4    5    6    7
+-------------------+----+--------------+
|         i         | ch |   (unused)   |
+-------------------+----+--------------+
  8    9    10   11   12   13   14   15
+-------------------+-------------------+
|         p         |     (unused)      |
+-------------------+-------------------+
  16   17   18   19   20   21   22   23
+---------------------------------------+
|                   d                   |
+---------------------------------------+
So, because of the 3-byte gap between "ch" and "p", and the 4-byte gap between "p" and "d", you get 7 bytes of padding in your structure, hence the size of 24 bytes. Since your environment's double has 8-byte alignment (i.e. it must reside in its own block of 8 bytes, as you can see above), the entire struct will also be 8-byte aligned overall, so even reordering the variables will not reduce the size below 24 bytes.
It's 24 bytes due to padding.
Most compilers align each member to a multiple of its size, inserting padding where needed.
So, a 4-byte int is placed at a multiple of 4 bytes.
An 8-byte double is placed at a multiple of 8 bytes.
For your structure, this means:
struct struct_type{
int i; // offset 0
char ch; // offset 4
char padding1[3]; // offsets 5-7
int *p; // offset 8
char padding2[4]; // offsets 12-15
double d; // offset 16
}s;
You can try to optimize your struct like this:
struct struct_type{
double d;
int i;
int *p;
char ch;
}s;
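The members now occupy 17 contiguous bytes, but sizeof(s) is still rounded up to a multiple of the struct's alignment: 24 where double is 8-byte aligned, or 20 where double is only 4-byte aligned (as with GCC on 32-bit x86 Linux).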
The compiler is allowed to align the members of the structure to addresses that allow faster access, e.g. 32-bit boundaries. The standard only requires that the members of the object are stored in the order they are declared. So always use sizeof and offsetof when you need an exact position in memory.
The additional size comes from data alignment, i.e. the members are aligned to multiples of 4 or 8 bytes.
Your compiler probably aligns int and pointers to multiples of 4 bytes and the double to multiples of 8 bytes.
If you move the double to a different position within the struct, you might be able to reduce the size of the struct from 24 to 20 bytes. But it depends on the compiler.
Also, sometimes you need the struct to keep exactly the order you specified, with no padding. In that case, if you are using gcc, you can use __attribute__((packed)).
See also this for further info.
$9.2/12 states - "Nonstatic data members of a (non-union) class declared without an intervening access-specifier are allocated so that later members have higher addresses within a class object. The order of allocation of nonstatic data members separated by an access-specifier is unspecified (11.1). Implementation alignment requirements
might cause two adjacent members not to be allocated immediately after each other; so might
requirements for space for managing virtual functions (10.3) and virtual base classes (10.1)."
So just like sizeof(double) and sizeof(int), the offsets at which structure members are placed are not fixed by the standard, except that members declared later are at higher addresses.
See comp.lang.c FAQ list · Question 2.12:
Why is my compiler leaving holes in structures, wasting space and preventing ``binary'' I/O to external data files? Can I turn this off, or otherwise control the alignment of structure fields?