I was browsing through one of our organisational data documents and came across the following piece of code.
#include <iostream>

struct A {
    unsigned short int i : 1;
    unsigned short int j : 1;
    unsigned short int k : 14;
};

int main() {
    A aa;
    int n = sizeof(aa);
    std::cout << n;
}
Initially I thought the size would be 6 bytes, since the size of an unsigned short int is 2 bytes, but the output of the above code was 2 (on Visual Studio 2008).
Is it possible that the i:1, j:1 and k:14 make these bit-fields or something? It's just a guess and I am not sure about it. Can somebody please help me with this?
Yes, these are indeed bit-fields.
I'm not entirely sure about C++, but per the C99 standard, chapter 6.7.2.1 (10):
An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next bits or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.
That makes your structure size (1 bit + 1 bit + 14 bits) = 16 bits = 2 bytes.
Note: No structure padding is considered here.
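To see the packing behaviour concretely, here is a small sketch of my own (not from the original post); the exact sizes are implementation-defined, but on typical compilers the three fields fill one 16-bit unit, while adding one more bit forces a second unit:

#include <iostream>

struct Packed {
    unsigned short i : 1;
    unsigned short j : 1;
    unsigned short k : 14;   // 1 + 1 + 14 = 16 bits, fills one unsigned short unit
};

struct Overflow {
    unsigned short i : 1;
    unsigned short j : 1;
    unsigned short k : 14;
    unsigned short l : 1;    // does not fit; typically a second unit is allocated
};

int main() {
    std::cout << sizeof(Packed) << ' ' << sizeof(Overflow) << '\n'; // commonly prints "2 4"
}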
Edit:
As per the C++14 standard, chapter §9.6 [class.bit]:
A member-declarator of the form
identifier_opt attribute-specifier-seq_opt : constant-expression
specifies a bit-field; its length is set off from the bit-field name by a colon. [...] Allocation of bit-fields within a class object is implementation-defined. Alignment of bit-fields is implementation-defined. Bit-fields are packed into some addressable allocation unit.
Related
See the C version of this question here.
I have two questions concerning bit fields when there are padding bits.
Say I have a struct defined as
struct T {
    unsigned int x : 1;
    unsigned int y : 1;
};
Struct T only has two bits actually used.
Question 1: are these two bits always the least significant bits of the underlying unsigned int? Or it is platform dependent?
Question 2: Are those unused 30 bits always initialized to 0? What does the C++ standard say about it?
Question 1: are these two bits always the least significant bits of the underlying unsigned int? Or it is platform dependent?
Very platform dependent. The standard even has a note just to clarify how much:
[class.bit]
1 ... Allocation of bit-fields within a class object is implementation-defined. Alignment of bit-fields is implementation-defined. Bit-fields are packed into some addressable allocation unit. [ Note: Bit-fields straddle allocation units on some machines and not on others. Bit-fields are assigned right-to-left on some machines, left-to-right on others. — end note ]
You can't assume much of anything about the object layout of a bit field.
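For example, the only portable way to find out where those bits actually land is to inspect the object representation byte by byte; a small sketch of my own (the printed bytes will differ between platforms):

#include <cstdio>
#include <cstring>

struct T {
    unsigned int x : 1;
    unsigned int y : 1;
};

int main() {
    T t{};
    t.x = 1;                            // set only x

    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &t, sizeof t);   // copy out the object representation
    for (unsigned char b : bytes)
        std::printf("%02x ", b);        // which byte/bit holds x is implementation-defined
    std::printf("\n");
}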
Question 2: Are those unused 30 bits always initialized to 0? What does the C++ standard say about it?
Your example has a simple aggregate, so we can enumerate the possible initializations. Specifying no initializer...
T t;
... will default initialize it, leaving the members with indeterminate value. On the other hand, if you specify empty braces...
T t{};
... the object will be aggregate initialized, and so the bit fields will be initialized with {} themselves, and set to zero. But that applies only to the members of the aggregate, which are the bit fields. It's not specified what value, if any, the padding bits take. So we cannot assume they will be initialized to zero.
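To make the two cases concrete, a short sketch of my own using the struct T from the question:

#include <cstdio>

struct T {
    unsigned int x : 1;
    unsigned int y : 1;
};

int main() {
    T a;      // default-initialized: a.x and a.y hold indeterminate values
    T b{};    // aggregate-initialized: b.x and b.y are zero

    std::printf("%u %u\n", (unsigned)b.x, (unsigned)b.y);  // prints "0 0"
    // Reading a.x or a.y here would use indeterminate values, so we don't.
    // Even for b, the padding bits are not guaranteed to be zero.
    (void)a;
}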
Q1: Usually from low to high (i.e. x is 1 << 0, y is 1 << 1, etc.).
Q2: The value of the unused bits is undefined. On some compilers/platforms, stack-initialised variables might be set to zero first (might!), but don't count on it. Heap-allocated variables could be anything, so it's best to assume the bits are garbage. Using a slightly non-standard anonymous struct buried in a union, you could do something like this to ensure the value of the bits:
union T {
    unsigned intval;
    struct {
        unsigned x : 1;
        unsigned y : 1;
    };
};

T foo;
foo.intval = 0;
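If you want to stay fully standard (the anonymous struct is a common extension, and in C++ reading x and y after writing intval is type punning), here is a sketch of my own that achieves the same zeroing with std::memset over a plain struct:

#include <cstdio>
#include <cstring>

struct Bits {
    unsigned x : 1;
    unsigned y : 1;
};

int main() {
    Bits b;
    std::memset(&b, 0, sizeof b);       // zeroes value bits and padding bits alike
    std::printf("%u %u\n", (unsigned)b.x, (unsigned)b.y);  // prints "0 0"
}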
What does the following C++ code mean?
unsigned char a : 1;
unsigned char b : 7;
I guess it creates two chars a and b, both of them one byte long, but I have no idea what the ": 1" and ": 7" parts do.
The 1 and the 7 are bit sizes to limit the range of the values. They're typically found in structures and unions. For example, on some systems (depends on char width and packing rules, etc), the code:
typedef struct {
    unsigned char a : 1;
    unsigned char b : 7;
} tOneAndSevenBits;
creates an 8-bit value, one bit for a and 7 bits for b.
Typically used in C to access "compressed" values such as a 4-bit nybble which might be contained in the top half of an 8-bit char:
typedef struct {
    unsigned char leftFour  : 4;
    unsigned char rightFour : 4;
} tTwoNybbles;
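A possible usage sketch of my own: assigned values are reduced to each field's 4-bit range, and the whole struct typically occupies a single byte, though which nybble ends up in the high half of that byte is implementation-defined:

#include <cstdio>

typedef struct {
    unsigned char leftFour  : 4;
    unsigned char rightFour : 4;
} tTwoNybbles;

int main() {
    tTwoNybbles n;
    n.leftFour  = 0xA;                  // only the low 4 bits of the value are kept
    n.rightFour = 0x5;
    std::printf("%zu %x %x\n", sizeof n, (unsigned)n.leftFour, (unsigned)n.rightFour);
    // typically prints "1 a 5"
}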
For the language lawyers amongst us, section 9.6 of the C++11 standard explains this in detail, slightly paraphrased:
Bit-fields [class.bit]
A member-declarator of the form
identifier_opt attribute-specifier_opt : constant-expression
specifies a bit-field; its length is set off from the bit-field name by a colon. The optional attribute-specifier appertains to the entity being declared. The bit-field attribute is not part of the type of the class member.
The constant-expression shall be an integral constant expression with a value greater than or equal to zero. The value of the integral constant expression may be larger than the number of bits in the object representation of the bit-field’s type; in such cases the extra bits are used as padding bits and do not participate in the value representation of the bit-field.
Allocation of bit-fields within a class object is implementation-defined. Alignment of bit-fields is implementation-defined. Bit-fields are packed into some addressable allocation unit.
Note: bit-fields straddle allocation units on some machines and not on others. Bit-fields are assigned right-to-left on some machines, left-to-right on others. - end note
I believe those would be bitfields.
Strictly speaking, a bit-field must be an int, unsigned int, or _Bool, although most compilers will accept any integral type.
Ref C11 6.7.2.1:
A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation-defined type.
Your compiler will probably allocate 1 byte of storage, but it is free to grab more.
Ref C11 6.7.2.1:
An implementation may allocate any addressable storage unit large enough to hold a bit-field.
The savings come when you have multiple bit-fields declared one after another. In this case, the allocated storage will be packed if possible.
Ref C11 6.7.2.1:
If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined.
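A rough illustration of that packing, as a sketch of my own (the exact sizes are implementation-defined):

#include <cstdio>

struct PackedFlags {                    // eight 1-bit fields packed into one unit
    unsigned int a : 1, b : 1, c : 1, d : 1,
                 e : 1, f : 1, g : 1, h : 1;
};

struct PlainFlags {                     // eight separate members, one byte each
    unsigned char a, b, c, d, e, f, g, h;
};

int main() {
    // Commonly prints "4 8": the bit-fields share a single unsigned int
    // allocation unit, while the plain members each occupy at least a byte.
    std::printf("%zu %zu\n", sizeof(PackedFlags), sizeof(PlainFlags));
}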
The smallest unit of storage is a byte; see these quotes from the standard:
The fundamental storage unit in the C++ memory model is the byte.
But then a memory location is defined to possibly be adjacent bit-fields:
A memory location is either an object of scalar type or a maximal sequence of adjacent bit-fields all having nonzero width.
I would like to understand this definition:
If the smallest storage unit is a byte, why don't we define a memory location as a sequence of bytes?
How do C-style bitfields fit with the first sentence at all?
What is the point of maximal sequence; what is maximal here?
If we have bit-fields in the definition of memory, why do we need anything else? E.g. both a float and an int are made up of bits, so the 'either an object of scalar type' part seems redundant.
Let's analyze the terms:
For reference:
http://en.cppreference.com/w/cpp/language/bit_field
http://en.cppreference.com/w/cpp/language/memory_model
Byte
As you said, smallest unit of (usually) 8 bits in memory, explicitly addressable using a memory address.
Bit-Field
A sequence of bits with an explicitly given bit count.
Memory location
Either the address of an object of a byte-multiple (scalar) type, or the beginning of a contiguous sequence of bit-fields of non-zero size.
Your questions
Let's take the cppreference example with some more comments and answer your questions one by one:
struct S {
    char a;         // memory location #1: 8-bit character, scalar type, no bit-field sequence
    int  b : 5;     // memory location #2: starts a new bit-field sequence, integer type, 5 bits long
    int  c : 11,    // memory location #2 (continued): integer type, 11 bits long
         : 0,       // (continued, but ending!) IMPORTANT: zero-size bit-field, the sequence ends here
         d : 8;     // memory location #3: integer type, 8 bits, starts a new bit-field sequence, thus a new memory location
    struct {
        int ee : 8; // memory location #4
    } e;
} obj;              // the object 'obj' consists of 4 separate memory locations
If the smallest storage unit is a byte, why don't we define a memory location as a sequence of bytes?
Because we may want fine-grained, bit-level control over the memory consumed by a given type, e.g. a 7-bit integer or a 4-bit char.
Making the byte the only allowed unit would deny us that freedom.
How do C-style bitfields fit with the first sentence at all?
Actually, the bit-field feature originates in C.
The important thing here is that even if you define a struct with bit-fields consuming, for example, only 11 bits, the first bit will be byte-aligned in memory, i.e. it will have a location aligned to 8-bit steps, and the type will in the end consume at least (!) 16 bits to hold the bit-fields.
The exact way the data is stored is, at least in C, not standardized as far as I know.
What is the point of maximal sequence; what is maximal here?
The point of a maximal sequence is to allow efficient memory alignment of the individual fields, compiler optimization, and so on. Maximal in this case means the longest run of adjacent bit-fields of size >= 1, i.e. not interrupted by any other scalar type or by a zero-width bit-field ':0'.
If we have bit-fields in the definition of memory, why do we need anything else? E.g. both a float and an int are made up of bits, so the 'either an object of scalar type' part seems redundant.
Both are indeed made up of bits, BUT: not specifying the bit size of the type makes the compiler assume the default size, e.g. 32 bits for int. If you don't need that much resolution of the integer value, but for example only 24 bits, you write unsigned int v : 24.
Of course, the non-bitfield way to write things can also be expressed with bit-fields, e.g.:

struct X {
    int a;
    int b : 32;  // on a platform with 32-bit int, effectively equivalent to a
};
BUT (something I don't know off the top of my head, any captain here?)
If the system-defined default width of type T is n bits and you write something like:

T value : m  // with m > n

I'm not certain of the resulting behaviour, though the C++ wording quoted further up in this thread suggests the extra bits simply become padding bits and do not participate in the value representation.
You can infer some of the reasons by looking at the statement that follows: " Two or more threads of execution can access separate memory locations without interfering with each other."
That is, two threads cannot independently access separate bytes of the same scalar object, and two threads accessing adjacent bit-fields in the same sequence may also interfere with each other.
The maximal sequence matters because the standard doesn't specify exactly how a sequence of bit-fields is mapped to bytes, nor which of those bytes can then be accessed independently; implementations may vary in this respect. However, the maximal sequence of bit-fields is the longest sequence that any implementation may allocate as a whole. In particular, a maximal sequence ends with a bit-field of width 0; the next bit-field starts a new sequence.
And while integers and floats are made up of bits, "bitfield" in C and C++ refers specifically to 'object members of integral type, whose width in bits is explicitly specified.' Not everything made of bits is a bitfield.
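To make that tangible, a sketch of my own (not from the answers above): x and y belong to the same bit-field sequence and therefore share one memory location, while z starts a new sequence after the zero-width bit-field and is a separate memory location that another thread may access concurrently:

#include <thread>

struct S {
    unsigned int x : 4;
    unsigned int y : 4;   // same memory location as x
    unsigned int   : 0;   // zero-width bit-field ends the sequence
    unsigned int z : 4;   // new sequence, new memory location
};

int main() {
    S s{};
    std::thread t1([&s] { s.x = 1; });  // touches the location holding x and y
    std::thread t2([&s] { s.z = 2; });  // OK: a distinct memory location
    // A thread writing s.y here would race with t1: same memory location.
    t1.join();
    t2.join();
}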
I'm reading a bit about alignment in C++, and I am not sure why the alignment of a class that contains solely a char array member is not the sizeof of the array, but turns out to be always 1. For example
#include <iostream>

struct Foo { char m_[16]; }; // shouldn't this have a 16-byte alignment?!

int main()
{
    std::cout << sizeof(Foo) << " " << alignof(Foo);
}
In the code above, sizeof(Foo) is clearly 16, yet its alignment is 1, as the output of the code shows.
Why is the alignof(Foo) 1 in this case?
Note that if I replace char m_[16]; with a fundamental type like int m_;, then alignof(Foo) becomes what I would've expected, i.e. sizeof(int) (on my machine this is 4).
The same happens if I simply declare an array char arr[16];: alignof(arr) will be 1.
Note: data alignment has been explained in detail in this article. If you want to know what the term means in general and why it is an important issue, read the article.
Alignment is defined in C++ as an implementation-defined integer value representing the number of bytes between successive addresses at which a given object can be allocated ([6.11.1] Alignment).
Moreover alignments must be non-negative integral powers of 2 [6.11.4] Alignment.
When we calculate the alignment of a struct we have to take into account yet another rule [6.11.5] Alignment:
Alignments have an order from weaker to stronger or stricter alignments. Stricter alignments have larger alignment values. An address that satisfies an alignment requirement also satisfies any weaker valid alignment requirement.
It's not directly stated but these rules imply that struct alignment has to be at least as strict as the alignment of its most strictly aligned member. It could be bigger but it doesn't have to be and usually isn't.
So when the alignment of the struct from the OP's example is decided, it must be no less than the alignment of its only member's type, char[16]. Then, by 8.3.6 [expr.alignof]:
When alignof is applied to a reference type, the result is the alignment of the referenced type. When alignof is applied to an array type, the result is the alignment of the element type.
alignof(char[16]) equals alignof(char) which will usually be 1 because of [6.11.6] Alignment:
(...) narrow character types shall have the weakest alignment requirement.
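A quick check of both rules, as a sketch of my own (these assertions should hold on conforming implementations):

int main() {
    static_assert(alignof(char[16]) == alignof(char),
                  "alignof of an array equals alignof of its element type");
    static_assert(alignof(char) == 1,
                  "narrow character types have the weakest alignment, which is 1");
}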
In this example:
struct Foo
{
    char   c[16];
    double d;
};
double has a stricter alignment than char, so alignof(Foo) equals alignof(double).
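And a quick check of that conclusion, a sketch of my own (the concrete numbers assume a typical platform where alignof(double) is 8):

#include <iostream>

struct Foo {
    char   c[16];
    double d;
};

int main() {
    // Typically prints "8 8 24": the struct adopts double's alignment,
    // and 16 + 8 bytes need no extra padding since 24 is a multiple of 8.
    std::cout << alignof(Foo) << ' ' << alignof(double) << ' ' << sizeof(Foo) << '\n';
}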