What is the " : " (two dots) operator in c++ [duplicate] - c++

What does the following C++ code mean?
unsigned char a : 1;
unsigned char b : 7;
I guess it creates two char a and b, and both of them should be one byte long, but I have no idea what the ": 1" and ": 7" part does.

The 1 and the 7 are bit sizes to limit the range of the values. They're typically found in structures and unions. For example, on some systems (depends on char width and packing rules, etc), the code:
typedef struct {
unsigned char a : 1;
unsigned char b : 7;
} tOneAndSevenBits;
creates an 8-bit value, one bit for a and 7 bits for b.
Typically used in C to access "compressed" values such as a 4-bit nybble which might be contained in the top half of an 8-bit char:
typedef struct {
unsigned char leftFour : 4;
unsigned char rightFour : 4;
} tTwoNybbles;
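As a rough illustration of how the width limits the range of values, the sketch below stores arbitrary numbers into two 4-bit fields. The names TwoNybbles and make_nybbles are made up for this example. Because the fields are unsigned, an out-of-range value is reduced modulo 16 when it is stored.

```cpp
#include <cassert>

// Same idea as tTwoNybbles above; the name TwoNybbles is illustrative only.
struct TwoNybbles {
    unsigned char leftFour : 4;
    unsigned char rightFour : 4;
};

// Stores two values into the 4-bit fields. The fields are unsigned, so each
// value is reduced modulo 2^4 = 16 when it is assigned.
TwoNybbles make_nybbles(unsigned left, unsigned right) {
    TwoNybbles n;
    n.leftFour = left;
    n.rightFour = right;
    return n;
}
```

For instance, make_nybbles(17, 31) should yield leftFour == 1 and rightFour == 15, since only the low four bits of each argument survive.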
For the language lawyers amongst us, the 9.6 section of the C++11 standard explains this in detail, slightly paraphrased:
Bit-fields [class.bit]
A member-declarator of the form
identifier(opt) attribute-specifier(opt) : constant-expression
specifies a bit-field; its length is set off from the bit-field name by a colon. The optional attribute-specifier appertains to the entity being declared. The bit-field attribute is not part of the type of the class member.
The constant-expression shall be an integral constant expression with a value greater than or equal to zero. The value of the integral constant expression may be larger than the number of bits in the object representation of the bit-field’s type; in such cases the extra bits are used as padding bits and do not participate in the value representation of the bit-field.
Allocation of bit-fields within a class object is implementation-defined. Alignment of bit-fields is implementation-defined. Bit-fields are packed into some addressable allocation unit.
Note: bit-fields straddle allocation units on some machines and not on others. Bit-fields are assigned right-to-left on some machines, left-to-right on others. - end note

I believe those would be bitfields.

Strictly speaking, in C a bitfield must be an int, unsigned int, or _Bool, although most compilers will accept any integral type.
Ref C11 6.7.2.1:
A bit-field shall have a type that is a qualified or unqualified
version of _Bool, signed int, unsigned int, or some other
implementation-defined type.
Your compiler will probably allocate 1 byte of storage, but it is free to grab more.
Ref C11 6.7.2.1:
An implementation may allocate any addressable storage unit large
enough to hold a bit-field.
The savings come when you have multiple bitfields declared one after another. In this case, the storage allocated will be packed if possible.
Ref C11 6.7.2.1:
If enough space remains, a bit-field that
immediately follows another bit-field in a structure shall be packed
into adjacent bits of the same unit. If insufficient space remains,
whether a bit-field that does not fit is put into the next unit or
overlaps adjacent units is implementation-defined.
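To make the packing concrete, here is a small sketch comparing three one-bit flags declared as consecutive bit-fields with the same flags declared as ordinary members. The struct and function names are hypothetical, and the exact sizes are implementation-defined; on many compilers with a 4-byte unsigned int the packed version takes 4 bytes and the plain one 12.

```cpp
#include <cassert>
#include <cstddef>

// Three one-bit flags declared back to back; the compiler may pack them
// into a single storage unit.
struct PackedFlags {
    unsigned int ready : 1;
    unsigned int error : 1;
    unsigned int busy  : 1;
};

// The same three flags as ordinary members, one full unsigned int each.
struct PlainFlags {
    unsigned int ready;
    unsigned int error;
    unsigned int busy;
};

// Bytes saved by the bit-field version. The result is implementation-defined;
// this assumes the packed struct is not larger than the plain one.
std::size_t bytes_saved() {
    return sizeof(PlainFlags) - sizeof(PackedFlags);
}
```

Printing sizeof(PackedFlags), sizeof(PlainFlags) and bytes_saved() on your own compiler is the only reliable way to see what it actually does.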

Related

Size of a structure having unsigned short ints

I was surfing in one of our organisational data documents and I came across the following piece of code.
#include <iostream>
struct A {
unsigned short int i:1;
unsigned short int j:1;
unsigned short int k:14;
};
int main(){
A aa;
int n = sizeof(aa);
std::cout << n;
}
Initially I thought the size would be 6 bytes, since an unsigned short int is 2 bytes, but the output of the above code was 2 (on Visual Studio 2008).
Is it possible that i:1, j:1 and k:14 make these bit-fields or something similar? It's just a guess and I am not very sure about it. Can somebody please help me with this?
Yes, these are indeed bit-fields.
I'm not entirely sure about C++, but in the C99 standard, per section 6.7.2.1 (10):
An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.
That makes your structure size (1 bit + 1 bit + 14 bits) = 16 bits = 2 bytes.
Note: No structure padding is considered here.
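The arithmetic above can be checked with a short sketch. It repeats the struct from the question and adds a helper, max_for_width (a name invented here), that gives the largest value an unsigned bit-field of a given width can hold. The static_assert only states a lower bound; the actual sizeof is up to the implementation.

```cpp
#include <cassert>
#include <climits>

struct A {
    unsigned short int i : 1;
    unsigned short int j : 1;
    unsigned short int k : 14;
};

// The widths add up to 16 bits, so the struct needs at least that much
// storage. Whether it needs more is implementation-defined.
static_assert(sizeof(A) * CHAR_BIT >= 1 + 1 + 14, "A must hold at least 16 bits");

// Largest value an unsigned bit-field of `width` bits can hold: 2^width - 1.
// Assumes width is smaller than the number of bits in unsigned int.
constexpr unsigned max_for_width(unsigned width) {
    return (1u << width) - 1u;
}
```

So i and j can each hold 0 or 1, and k can hold values up to max_for_width(14), which is 16383.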
Edit:
As per C++14 standard, chapter §9.7,
A member-declarator of the form
identifier(opt) attribute-specifier-seq(opt) : constant-expression
specifies a bit-field; its length is set off from the bit-field name by a colon. [...] Allocation of bit-fields within a class object is
implementation-defined. Alignment of bit-fields is implementation-defined. Bit-fields are packed into some addressable allocation unit.

When is uint8_t ≠ unsigned char?

According to C and C++, CHAR_BIT >= 8.
But whenever CHAR_BIT > 8, uint8_t can't even be represented as 8 bits.
It must be larger, because CHAR_BIT is the minimum number of bits for any data type on the system.
On what kind of a system can uint8_t be legally defined to be a type other than unsigned char?
(If the answer is different for C and C++ then I'd like to know both.)
If it exists, uint8_t must always have the same width as unsigned char. However, it need not be the same type; it may be a distinct extended integer type. It also need not have the same representation as unsigned char; for instance, the bits could be interpreted in the opposite order. This is a silly example, but it makes more sense for int8_t, where signed char might be ones' complement or sign-magnitude while int8_t is required to be two's complement.
One further "advantage" of using a non-char extended integer type for uint8_t even on "normal" systems is C's aliasing rules. Character types are allowed to alias anything, which prevents the compiler from heavily optimizing functions that use both character pointers and pointers to other types, unless the restrict keyword has been applied well. However, even if uint8_t has the exact same size and representation as unsigned char, if the implementation made it a distinct, non-character type, the aliasing rules would not apply to it, and the compiler could assume that objects of types uint8_t and int, for example, can never alias.
On what kind of a system can uint8_t be legally defined to be a type other than unsigned char?
In summary, uint8_t can only be legally defined on systems where CHAR_BIT is 8. It's an addressable unit with exactly 8 value bits and no padding bits.
In detail, CHAR_BIT defines the width of the smallest addressable units, and uint8_t can't have padding bits; it can only exist when the smallest addressable unit is exactly 8 bits wide. Providing CHAR_BIT is 8, uint8_t can be defined by a type definition for any 8-bit unsigned integer type that has no padding bits.
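A minimal sketch of that claim, assuming the implementation provides std::uint8_t at all (if it does not, the snippet simply fails to compile). The function name uint8_value_bits is made up for illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <limits>

// These checks can only be reached when uint8_t exists, and in that case
// the reasoning above says they must all hold.
static_assert(CHAR_BIT == 8, "uint8_t exists, so a byte must be exactly 8 bits");
static_assert(sizeof(std::uint8_t) == 1, "uint8_t occupies exactly one byte");
static_assert(std::numeric_limits<std::uint8_t>::digits == 8,
              "uint8_t has 8 value bits and no padding bits");

// Number of value bits in uint8_t.
constexpr int uint8_value_bits() {
    return std::numeric_limits<std::uint8_t>::digits;
}
```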
Here's what the C11 standard draft (n1570.pdf) says:
5.2.4.2.1 Sizes of integer types
1 The values given below shall be replaced by constant expressions suitable for use in #if
preprocessing directives. ... Their implementation-defined values shall be equal or
greater in magnitude (absolute value) to those shown, with the same sign.
-- number of bits for smallest object that is not a bit-field (byte)
CHAR_BIT 8
Thus the smallest objects must contain exactly CHAR_BIT bits.
6.5.3.4 The sizeof and _Alignof operators
...
4 When sizeof is applied to an operand that has type char, unsigned
char, or signed char, (or a qualified version thereof) the result is
1. ...
Thus, those are (some of) the smallest addressable units. Obviously int8_t and uint8_t may also be considered smallest addressable units, providing they exist.
7.20.1.1 Exact-width integer types
1 The typedef name intN_t designates a signed integer type with width
N, no padding bits, and a two’s complement representation. Thus,
int8_t denotes such a signed integer type with a width of exactly 8
bits.
2 The typedef name uintN_t designates an unsigned integer type with
width N and no padding bits. Thus, uint24_t denotes such an unsigned
integer type with a width of exactly 24 bits.
3 These types are optional. However, if an implementation provides
integer types with widths of 8, 16, 32, or 64 bits, no padding bits,
and (for the signed types) that have a two’s complement
representation, it shall define the corresponding typedef names.
The emphasis on "These types are optional" is mine. I hope this was helpful :)
A possibility that no one has so far mentioned: if CHAR_BIT==8 and unqualified char is unsigned, which it is in some ABIs, then uint8_t could be a typedef for char instead of unsigned char. This matters at least insofar as it affects overload choice (and its evil twin, name mangling), i.e. if you were to have both foo(char) and foo(unsigned char) in scope, calling foo with an argument of type uint8_t would prefer foo(char) on such a system.
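If you want to know which of these cases applies to your toolchain, a sketch along the following lines reports it using type traits rather than overloads, so it compiles whichever type uint8_t happens to be. The function name uint8_identity is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Reports which type std::uint8_t names on this implementation.
const char* uint8_identity() {
    if (std::is_same<std::uint8_t, unsigned char>::value) {
        return "unsigned char";
    }
    if (std::is_same<std::uint8_t, char>::value) {
        return "char";
    }
    // Neither character type: an extended (or otherwise distinct) integer type.
    return "a distinct integer type";
}
```

Most mainstream implementations are expected to report "unsigned char", but that is an observation about common practice, not something the standards require.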

How to merge two signed bit variables into one signed bit variable?

Suppose the following c++ code:
#include <iostream>
using namespace std;
typedef struct
{
int a: 5;
int b: 4;
int c: 1;
int d: 22;
} example;
int main()
{
example blah;
blah.a = -5; // 11011
blah.b = -3; // 1101
int result = blah.a << 4 | blah.b;
cout << "Result = " << result << endl; // equals 445 , but I am interested in this having a value of -67
return 0;
}
I am interested in having the variable result be of type int where the 9th bit is the most significant bit. I would like this to be the case so that result = -67 instead of 445. How is this done? Thanks.
See Sign Extending an int in C for a closely related question (but not a duplicate).
You need to be aware that almost everything about bit fields is 'implementation defined'. In particular, it is not clear that you can assign negative numbers to a 'plain int' bit-field; you have to know whether your implementation uses 'plain int is signed' or 'plain int is unsigned'. Which is the 9th bit gets tricky too; are you counting from 0 or 1, and which end of the set of bit-fields is at bit 0 and which at bit 31 (counting least significant bit (LSB) as bit 0 and most significant bit (MSB) as bit 31 of a 32-bit quantity). Indeed, the size of your structure need not be 32 bits; the compiler might have different rules for the layout.
With all those caveats out of the way, you have a 9-bit value formed from (blah.a << 4) | blah.b, and you want that sign-extended as if it was a 9-bit 2's complement number being promoted to (32-bit) int.
The function in the cross-referenced answer could do the job:
#include <assert.h>
#include <limits.h>
extern int getFieldSignExtended(int value, int hi, int lo);
enum { INT_BITS = CHAR_BIT * sizeof(int) };
int getFieldSignExtended(int value, int hi, int lo)
{
assert(lo >= 0);
assert(hi > lo);
assert(hi < INT_BITS - 1);
int bits = (value >> lo) & ((1 << (hi - lo + 1)) - 1);
if (bits & (1 << (hi - lo)))
return(bits | (~0 << (hi - lo)));
else
return(bits);
}
Invoke it as:
int result = getFieldSignExtended((blah.a << 4) | blah.b, 8, 0);
If you want to hard-wire the numbers, you can write:
int x = (blah.a << 4) | blah.b;
int result = (x & (1 << 8)) ? (x | (~0 << 8)) : x;
Note I'm assuming the 9th bit is bit 8 of a value with bits 0..8 in it. Adjust if you have some other interpretation in mind.
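An alternative sketch, not taken from the cross-referenced answer, masks each field explicitly before joining them, so the result does not depend on whether the implementation treats plain int bit-fields as signed or unsigned. It then sign-extends with the flip-and-subtract trick, which avoids left-shifting a negative value. The names join_5_4 and sign_extend are invented here, and a 32-bit int is assumed.

```cpp
#include <cassert>

// Sign-extends the low `width` bits of `value`, treating bit (width - 1)
// as the sign bit. Assumes a 32-bit int and 1 <= width <= 31.
int sign_extend(unsigned value, int width) {
    assert(width >= 1 && width <= 31);
    const unsigned mask = (1u << width) - 1u;  // keeps the low `width` bits
    const unsigned sign = 1u << (width - 1);   // the field's sign bit
    const unsigned low = value & mask;
    // Flipping the sign bit and then subtracting it maps the upper half of
    // the range, [2^(width-1), 2^width), onto the negative numbers.
    return static_cast<int>(low ^ sign) - static_cast<int>(sign);
}

// Joins a 5-bit field and a 4-bit field into 9 bits, masking each one first
// so that sign bits from integer promotion cannot leak into the result.
unsigned join_5_4(int a, int b) {
    return ((static_cast<unsigned>(a) & 0x1Fu) << 4) |
           (static_cast<unsigned>(b) & 0x0Fu);
}
```

With a = -5 and b = -3, join_5_4 gives 445 (binary 110111101) and sign_extend(445, 9) gives -67.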
Working code
Compiled with g++ (GCC) 4.1.2 20080704 (Red Hat 4.1.2-44) from a RHEL 5 x86/64 machine.
#include <iostream>
using namespace std;
typedef struct
{
int a: 5;
int b: 4;
int c: 1;
int d: 22;
} example;
int main()
{
example blah;
blah.a = -5; // 11011
blah.b = -3; // 1101
int result = blah.a << 4 | blah.b;
cout << "Result = " << result << endl;
int x = (blah.a << 4) | blah.b;
cout << "x = " << x << endl;
int result2 = (x & (1 << 8)) ? (x | (~0 << 8)) : x;
cout << "Result2 = " << result2 << endl;
return 0;
}
Sample output:
Result = 445
x = 445
Result2 = -67
ISO/IEC 14882:2011 — C++ Standard
§7.1.6.2 Simple type specifiers
¶3 ... [ Note: It is implementation-defined whether objects of char type and certain bit-fields (9.6) are
represented as signed or unsigned quantities. The signed specifier forces char objects and bit-fields to be
signed; it is redundant in other contexts. —end note ]
§9.6 Bit-fields [class.bit]
¶1 A member-declarator of the form
identifier(opt) attribute-specifier-seq(opt) : constant-expression
specifies a bit-field; its length is set off from the bit-field name by a colon. The optional attribute-specifier-seq
appertains to the entity being declared. The bit-field attribute is not part of the type of the class
member. The constant-expression shall be an integral constant expression with a value greater than or equal
to zero. The value of the integral constant expression may be larger than the number of bits in the object
representation (3.9) of the bit-field’s type; in such cases the extra bits are used as padding bits and do not
participate in the value representation (3.9) of the bit-field. Allocation of bit-fields within a class object is
implementation-defined. Alignment of bit-fields is implementation-defined. Bit-fields are packed into some
addressable allocation unit. [ Note: Bit-fields straddle allocation units on some machines and not on others.
Bit-fields are assigned right-to-left on some machines, left-to-right on others. —end note ]
¶2 A declaration for a bit-field that omits the identifier declares an unnamed bit-field. Unnamed bit-fields
are not members and cannot be initialized. [ Note: An unnamed bit-field is useful for padding to conform
to externally-imposed layouts. —end note ] As a special case, an unnamed bit-field with a width of zero
specifies alignment of the next bit-field at an allocation unit boundary. Only when declaring an unnamed
bit-field may the value of the constant-expression be equal to zero.
¶3 A bit-field shall not be a static member. A bit-field shall have integral or enumeration type (3.9.1). It is
implementation-defined whether a plain (neither explicitly signed nor unsigned) char, short, int, long,
or long long bit-field is signed or unsigned. A bool value can successfully be stored in a bit-field of any
nonzero size. The address-of operator & shall not be applied to a bit-field, so there are no pointers to bit-fields.
A non-const reference shall not be bound to a bit-field (8.5.3). [ Note: If the initializer for a reference
of type const T& is an lvalue that refers to a bit-field, the reference is bound to a temporary initialized to
hold the value of the bit-field; the reference is not bound to the bit-field directly. See 8.5.3. —end note ]
¶4 If the value true or false is stored into a bit-field of type bool of any size (including a one bit bit-field),
the original bool value and the value of the bit-field shall compare equal. If the value of an enumerator is
stored into a bit-field of the same enumeration type and the number of bits in the bit-field is large enough
to hold all the values of that enumeration type (7.2), the original enumerator value and the value of the
bit-field shall compare equal. [ Example:
enum BOOL { FALSE=0, TRUE=1 };
struct A {
BOOL b:1;
};
A a;
void f() {
a.b = TRUE;
if (a.b == TRUE) // yields true
{ /* ... */ }
}
—end example ]
ISO/IEC 9899:2011 — C2011 Standard
The C standard has essentially the same effect, but the information is presented somewhat differently.
6.7.2.1 Structure and union specifiers
¶4 The expression that specifies the width of a bit-field shall be an integer constant
expression with a nonnegative value that does not exceed the width of an object of the
type that would be specified were the colon and expression omitted.122) If the value is
zero, the declaration shall have no declarator.
¶5 A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed
int, unsigned int, or some other implementation-defined type. It is
implementation-defined whether atomic types are permitted.
¶9 ... In addition, a member may be declared to consist of a
specified number of bits (including a sign bit, if any). Such a member is called a
bit-field;124) its width is preceded by a colon.
¶10 A bit-field is interpreted as having a signed or unsigned integer type consisting of the
specified number of bits.125) If the value 0 or 1 is stored into a nonzero-width bit-field of
type _Bool, the value of the bit-field shall compare equal to the value stored; a _Bool
bit-field has the semantics of a _Bool.
¶11 An implementation may allocate any addressable storage unit large enough to hold a bit-field.
If enough space remains, a bit-field that immediately follows another bit-field in a
structure shall be packed into adjacent bits of the same unit. If insufficient space remains,
whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is
implementation-defined. The order of allocation of bit-fields within a unit (high-order to
low-order or low-order to high-order) is implementation-defined. The alignment of the
addressable storage unit is unspecified.
¶12 A bit-field declaration with no declarator, but only a colon and a width, indicates an
unnamed bit-field.126) As a special case, a bit-field structure member with a width of 0
indicates that no further bit-field is to be packed into the unit in which the previous bit-field,
if any, was placed.
122) While the number of bits in a _Bool object is at least CHAR_BIT, the width (number of sign and
value bits) of a _Bool may be just 1 bit.
124) The unary & (address-of) operator cannot be applied to a bit-field object; thus, there are no pointers to
or arrays of bit-field objects.
125) As specified in 6.7.2 above, if the actual type specifier used is int or a typedef-name defined as int,
then it is implementation-defined whether the bit-field is signed or unsigned.
126) An unnamed bit-field structure member is useful for padding to conform to externally imposed
layouts.
Annex J of the standard defines Portability Issues, and §J.3 defines Implementation-defined Behaviour. In part, it says:
J.3.9 Structures, unions, enumerations, and bit-fields
¶1 — Whether a ‘‘plain’’ int bit-field is treated as a signed int bit-field or as an
unsigned int bit-field (6.7.2, 6.7.2.1).
— Allowable bit-field types other than _Bool, signed int, and unsigned int
(6.7.2.1).
— Whether atomic types are permitted for bit-fields (6.7.2.1).
— Whether a bit-field can straddle a storage-unit boundary (6.7.2.1).
— The order of allocation of bit-fields within a unit (6.7.2.1).
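The unnamed and zero-width bit-fields described in the quoted paragraphs can be sketched as follows. The struct names are hypothetical, and how much (if any) extra storage the second struct uses is implementation-defined.

```cpp
#include <cassert>

// Two 4-bit fields that may share one storage unit.
struct Adjacent {
    unsigned int low  : 4;
    unsigned int high : 4;
};

// The unnamed 4-bit field reserves padding bits that cannot be accessed.
// The unnamed zero-width field asks that `high` start in a new unit.
struct Separated {
    unsigned int low  : 4;
    unsigned int      : 4;  // padding, not a member
    unsigned int      : 0;  // no further packing into the current unit
    unsigned int high : 4;
};
```

On many compilers with a 4-byte unsigned int, sizeof(Adjacent) is 4 and sizeof(Separated) is 8, but both numbers should be checked rather than assumed.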

Extracting two signed integers from one given integer?

I have the following structure:
struct
{
int a:4;
int b:7;
int c:21;
} example;
I would like to combine a and b to form an integer d in C++. For instance, I would like the bit values of a to be on the left of the bit values of b in order to form integer d. How is this implemented in c++?
Example:
a= 1001
b = 1010101
I would like int d = 10011010101 xxxxxxxxxxxxxxxxxxxxx
where x can be 21 bits that belonged to d previously. I would like the values of a and b to be put in bit positions 0-3 and 4-10 respectively since a occupies the first 4 bits and b occupies the next 7 bits in the struct "example".
The part that I am confused about is that variable a and variable b both have a "sign" bit at the most significant bit. Does this affect the outcome? Are all bits in variable a and variable b used in the end result for integer d? Will integer d look like a concatenation of variable a's bits and variable b's bits?
Thanks
Note that whether an int bit-field is signed or unsigned is implementation-defined. The C++ standard says this, and the C standard achieves the same net result with different wording:
ISO/IEC 14882:2011 — C++
§7.1.6.2 Simple type specifiers
¶3 ... [ Note: It is implementation-defined whether objects of char type and certain bit-fields (9.6) are
represented as signed or unsigned quantities. The signed specifier forces char objects and bit-fields to be
signed; it is redundant in other contexts. —end note ]
§9.6 Bit-fields
¶3 ... A bit-field shall have integral or enumeration type (3.9.1). It is
implementation-defined whether a plain (neither explicitly signed nor unsigned) char, short, int, long,
or long long bit-field is signed or unsigned.
ISO/IEC 9899:2011 — C
§6.7.2.1 Structure and union specifiers
¶10 A bit-field is interpreted as having a signed or unsigned integer type consisting of the specified number of bits.125)
125) As specified in 6.7.2 above, if the actual type specifier used is int or a typedef-name defined as int, then it is implementation-defined whether the bit-field is signed or unsigned.
§6.7.2 Type specifiers
¶5 ... for bit-fields, it is implementation-defined whether the specifier int designates the same type as signed int or the same type as unsigned int.
The context of §6.7.2 shows that int can be combined with short, long etc and the rule will apply; C++ specifies that a bit more clearly. The signedness of plain char is implementation-defined already, of course.
Unsigned bit-fields
If the type of the bit-fields is unsigned, then the expression is fairly straightforward:
int d = (example.a << 7) | example.b;
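As a worked instance of the unsigned case, the sketch below uses the bit patterns from the question (a = 1001, b = 1010101). The struct name Fields and the function combine are made up for illustration, and the fields are declared unsigned int so that their signedness is not in doubt.

```cpp
#include <cassert>

struct Fields {
    unsigned int a : 4;
    unsigned int b : 7;
    unsigned int c : 21;
};

// Places the 4 bits of `a` directly above the 7 bits of `b`,
// giving an 11-bit result.
unsigned combine(const Fields& f) {
    return (static_cast<unsigned>(f.a) << 7) | f.b;
}
```

With a = 9 (binary 1001) and b = 85 (binary 1010101), combine returns 1237, which is binary 10011010101.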
Signed bit-fields
If the values are signed, then you have a major interpretation exercise to undertake, deciding what the value should be if example.a is negative and example.b is positive, or vice versa. To some extent, the problem arises even if the values are both negative or both positive.
Suppose example.a = 7; and example.b = 12; — what should be the value of d? Probably the same expression applies, but you could argue that it would be better to shift by 1 fewer places:
assert(example.a >= 0 && example.b >= 0);
int d = (example.a << 6) | example.b; // Alternative interpretation
The other cases are left for you to decide; it depends on the interpretation you want to place on the values. For example:
int d = ((example.a & 0x0F) << 7) | (example.b & 0x7F);
This forces the signed values to be treated as unsigned. It probably isn't what you're after.
Modified question
example.a = 1001 // binary
example.b = 1010101 // binary
d = 10011010101 xxxxxxxxxxxxxxxxxxxxx
where x can be 21 bits that belonged to d previously.
For this to work, then you need:
d = (d & 0x001FFFFF) | ((((example.a & 0x0F) << 7) | (example.b & 0x7F)) << 21);
You probably can use fewer parentheses; I'm not sure I'd risk doing so.
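The same expression can be wrapped in a function that works on std::uint32_t, which sidesteps shifting bits into the sign bit of an int. This is only a sketch; the name splice_top11 is invented here, and it assumes the 11 combined bits belong in the top of a 32-bit word as in the expression above.

```cpp
#include <cassert>
#include <cstdint>

// Keeps the low 21 bits of `d` and replaces the top 11 bits with the
// 4 bits of `a` followed by the 7 bits of `b`. Masking first means the
// result is the same whether `a` and `b` came from signed or unsigned fields.
std::uint32_t splice_top11(std::uint32_t d, int a, int b) {
    const std::uint32_t low21 = d & 0x001FFFFFu;
    const std::uint32_t top11 =
        ((static_cast<std::uint32_t>(a) & 0x0Fu) << 7) |
        (static_cast<std::uint32_t>(b) & 0x7Fu);
    return low21 | (top11 << 21);
}
```

For example, splice_top11(0x12345, 9, 85) returns 0x9AA12345. Passing -7 and -43, which are the signed readings of the same bit patterns, gives the same result.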
Union
However, with this revised specification, you might well be tempted to look at a union such as:
union u
{
struct
{
int a:4;
int b:7;
int c:21;
} y;
int x;
} example;
However, the layout of the bits in the bit-fields w.r.t the bits in the int x; is not specified (they could be most significant bits first or least significant bits first), and there are always mutterings about 'if you access a value in a union that wasn't the last one assigned to, you invoke undefined behaviour'. Thus you have multiple platform-defined aspects of the bit field to deal with. In fact, this sort of conundrum generally means that bit-fields are closely tied to one specific type of machine (CPU) and compiler and operating system. They are very, very non-portable at the level of detail you're after.