Extracting two signed integers from one given integer? - c++

I have the following structure:
struct
{
int a:4;
int b:7;
int c:21;
} example;
I would like to combine a and b to form an integer d in C++. For instance, I would like the bit values of a to be on the left of the bit values of b in order to form integer d. How is this implemented in c++?
Example:
a= 1001
b = 1010101
I would like int d = 10011010101 xxxxxxxxxxxxxxxxxxxxx
where x can be 21 bits that belonged to d previously. I would like the values of a and b to be put in bit positions 0-3 and 4-10 respectively since a occupies the first 4 bits and b occupies the next 7 bits in the struct "example".
The part that I am confused about is that variable a and variable b both have a "sign" bit at the most significant bit. Does this affect the outcome? Are all bits in variable a and variable b used in the end result for integer d? Will integer d look like a concatenation of variable a's bits and variable b's bits?
Thanks

Note that whether an int bit-field is signed or unsigned is implementation-defined. The C++ standard says this, and the C standard achieves the same net result with different wording:
ISO/IEC 14882:2011 — C++
§7.1.6.2 Simple type specifiers
¶3 ... [ Note: It is implementation-defined whether objects of char type and certain bit-fields (9.6) are
represented as signed or unsigned quantities. The signed specifier forces char objects and bit-fields to be
signed; it is redundant in other contexts. —end note ]
§9.6 Bit-fields
¶3 ... A bit-field shall have integral or enumeration type (3.9.1). It is
implementation-defined whether a plain (neither explicitly signed nor unsigned) char, short, int, long,
or long long bit-field is signed or unsigned.
ISO/IEC 9899:2011 — C
§6.7.2.1 Structure and union specifiers
¶10 A bit-field is interpreted as having a signed or unsigned integer type consisting of the specified number of bits.125)
125) As specified in 6.7.2 above, if the actual type specifier used is int or a typedef-name defined as int, then it is implementation-defined whether the bit-field is signed or unsigned.
§6.7.2 Type specifiers
¶5 ... for bit-fields, it is implementation-defined whether the specifier int designates the same type as signed int or the same type as unsigned int.
The context of §6.7.2 shows that int can be combined with short, long etc and the rule will apply; C++ specifies that a bit more clearly. The signedness of plain char is implementation-defined already, of course.
Unsigned bit-fields
If the type of the bit-fields are unsigned, then the expression is fairly straight-forward:
int d = (example.a << 7) | example.b;
Signed bit-fields
If the values are signed, then you have a major interpretation exercise to undertake, deciding what the value should be if example.a is negative and example.b is positive, or vice versa. To some extent, the problem arises even if the values are both negative or both positive.
Suppose example.a = 7; and example.b = 12; — what should be the value of d? Probably the same expression applies, but you could argue that it would be better to shift by 1 fewer places:
assert(example.a >= 0 && example.b >= 0);
int d = (example.a << 6) | example.b; // Alternative interpretation
The other cases are left for you to decide; it depends on the interpretation you want to place on the values. For example:
int d = ((example.a & 0x0F) << 7) | (example.b & 0x7F);
This forces the signed values to be treated as unsigned. It probably isn't what you're after.
Modified question
example.a = 1001 // binary
example.b = 1010101 // binary
d = 10011010101 xxxxxxxxxxxxxxxxxxxxx
where x can be 21 bits that belonged to d previously.
For this to work, then you need:
d = (d & 0x001FFFFF) | ((((example.a & 0x0F) << 7) | (example.b & 0x7F)) << 21);
You probably can use fewer parentheses; I'm not sure I'd risk doing so.
Union
However, with this revised specification, you might well be tempted to look at a union such as:
union u
{
struct
{
int a:4;
int b:7;
int c:21;
} y;
int x;
} example;
However, the layout of the bits in the bit-fields w.r.t the bits in the int x; is not specified (they could be most significant bits first or least significant bits first), and there are always mutterings about 'if you access a value in a union that wasn't the last one assigned to you invoke undefined behaviour'. Thus you have multiple platform-defined aspects of the bit field to deal with. In fact, this sort of conundrum generally means that bit-fields are closely tied to one specific type of machine (CPU) and compiler and operating system. They are very, very non-portable at the level of detail you're after.

Related

What is the " : " (two dots) operator in c++ [duplicate]

What does the following C++ code mean?
unsigned char a : 1;
unsigned char b : 7;
I guess it creates two char a and b, and both of them should be one byte long, but I have no idea what the ": 1" and ": 7" part does.
The 1 and the 7 are bit sizes to limit the range of the values. They're typically found in structures and unions. For example, on some systems (depends on char width and packing rules, etc), the code:
typedef struct {
unsigned char a : 1;
unsigned char b : 7;
} tOneAndSevenBits;
creates an 8-bit value, one bit for a and 7 bits for b.
Typically used in C to access "compressed" values such as a 4-bit nybble which might be contained in the top half of an 8-bit char:
typedef struct {
unsigned char leftFour : 4;
unsigned char rightFour : 4;
} tTwoNybbles;
For the language lawyers amongst us, the 9.6 section of the C++11 standard explains this in detail, slightly paraphrased:
Bit-fields [class.bit]
A member-declarator of the form
identifieropt attribute-specifieropt : constant-expression
specifies a bit-field; its length is set off from the bit-field name by a colon. The optional attribute-specifier appertains to the entity being declared. The bit-field attribute is not part of the type of the class member.
The constant-expression shall be an integral constant expression with a value greater than or equal to zero. The value of the integral constant expression may be larger than the number of bits in the object representation of the bit-field’s type; in such cases the extra bits are used as padding bits and do not participate in the value representation of the bit-field.
Allocation of bit-fields within a class object is implementation-defined. Alignment of bit-fields is implementation-defined. Bit-fields are packed into some addressable allocation unit.
Note: bit-fields straddle allocation units on some machines and not on others. Bit-fields are assigned right-to-left on some machines, left-to-right on others. - end note
I believe those would be bitfields.
Strictly speaking, a bitfield must be a int, unsigned int, or _Bool. Although most compilers will take any integral type.
Ref C11 6.7.2.1:
A bit-field shall have a type that is a qualified or unqualified
version of _Bool, signed int, unsigned int, or some other
implementation-defined type.
Your compiler will probably allocate 1 byte of storage, but it is free to grab more.
Ref C11 6.7.2.1:
An implementation may allocate any addressable storage unit large
enough to hold a bit- field.
The savings comes when you have multiple bitfields that are declared one after another. In this case, the storage allocated will be packed if possible.
Ref C11 6.7.2.1:
If enough space remains, a bit-field that
immediately follows another bit-field in a structure shall be packed
into adjacent bits of the same unit. If insufficient space remains,
whether a bit-field that does not fit is put into the next unit or
overlaps adjacent units is implementation-defined.

Is masking before unsigned left shift in C/C++ too paranoid?

This question is motivated by me implementing cryptographic algorithms (e.g. SHA-1) in C/C++, writing portable platform-agnostic code, and thoroughly avoiding undefined behavior.
Suppose that a standardized crypto algorithm asks you to implement this:
b = (a << 31) & 0xFFFFFFFF
where a and b are unsigned 32-bit integers. Notice that in the result, we discard any bits above the least significant 32 bits.
As a first naive approximation, we might assume that int is 32 bits wide on most platforms, so we would write:
unsigned int a = (...);
unsigned int b = a << 31;
We know this code won't work everywhere because int is 16 bits wide on some systems, 64 bits on others, and possibly even 36 bits. But using stdint.h, we can improve this code with the uint32_t type:
uint32_t a = (...);
uint32_t b = a << 31;
So we are done, right? That's what I thought for years. ... Not quite. Suppose that on a certain platform, we have:
// stdint.h
typedef unsigned short uint32_t;
The rule for performing arithmetic operations in C/C++ is that if the type (such as short) is narrower than int, then it gets widened to int if all values can fit, or unsigned int otherwise.
Let's say that the compiler defines short as 32 bits (signed) and int as 48 bits (signed). Then these lines of code:
uint32_t a = (...);
uint32_t b = a << 31;
will effectively mean:
unsigned short a = (...);
unsigned short b = (unsigned short)((int)a << 31);
Note that a is promoted to int because all of ushort (i.e. uint32) fits into int (i.e. int48).
But now we have a problem: shifting non-zero bits left into the sign bit of a signed integer type is undefined behavior. This problem happened because our uint32 was promoted to int48 - instead of being promoted to uint48 (where left-shifting would be okay).
Here are my questions:
Is my reasoning correct, and is this a legitimate problem in theory?
Is this problem safe to ignore because on every platform the next integer type is double the width?
Is a good idea to correctly defend against this pathological situation by pre-masking the input like this?: b = (a & 1) << 31;. (This will necessarily be correct on every platform. But this could make a speed-critical crypto algorithm slower than necessary.)
Clarifications/edits:
I'll accept answers for C or C++ or both. I want to know the answer for at least one of the languages.
The pre-masking logic may hurt bit rotation. For example, GCC will compile b = (a << 31) | (a >> 1); to a 32-bit bit-rotation instruction in assembly language. But if we pre-mask the left shift, it is possible that the new logic is not translated into bit rotation, which means now 4 operations are performed instead of 1.
Speaking to the C side of the problem,
Is my reasoning correct, and is this a legitimate problem in theory?
It is a problem that I had not considered before, but I agree with your analysis. C defines the behavior of the << operator in terms of the type of the promoted left operand, and it it conceivable that the integer promotions result in that being (signed) int when the original type of that operand is uint32_t. I don't expect to see that in practice on any modern machine, but I'm all for programming to the actual standard as opposed to my personal expectations.
Is this problem safe to ignore because on every platform the next integer type is double the width?
C does not require such a relationship between integer types, though it is ubiquitous in practice. If you are determined to rely only on the standard, however -- that is, if you are taking pains to write strictly conforming code -- then you cannot rely on such a relationship.
Is a good idea to correctly defend against this pathological situation by pre-masking the input like this?: b = (a & 1) << 31;.
(This will necessarily be correct on every platform. But this could
make a speed-critical crypto algorithm slower than necessary.)
The type unsigned long is guaranteed to have at least 32 value bits, and it is not subject to promotion to any other type under the integer promotions. On many common platforms it has exactly the same representation as uint32_t, and may even be the same type. Thus, I would be inclined to write the expression like this:
uint32_t a = (...);
uint32_t b = (unsigned long) a << 31;
Or if you need a only as an intermediate value in the computation of b, then declare it as an unsigned long to begin with.
Q1: Masking before the shift does prevent undefined behavior that OP has concern.
Q2: "... because on every platform the next integer type is double the width?" --> no. The "next" integer type could be less than 2x or even the same size.
The following is well defined for all compliant C compilers that have uint32_t.
uint32_t a;
uint32_t b = (a & 1) << 31;
Q3: uint32_t a; uint32_t b = (a & 1) << 31; is not expected to incur code that performs a mask - it is not needed in the executable - just in the source. If a mask does occur, get a better compiler should speed be an issue.
As suggested, better to emphasize the unsigned-ness with these shifts.
uint32_t b = (a & 1U) << 31;
#John Bollinger good answer well details how to handle OP's specific problem.
The general problem is how to form a number that is of at least n bits, a certain sign-ness and not subject to surprising integer promotions - the core of OP's dilemma. The below fulfills this by invoking an unsigned operation that does not change the value - effective a no-op other than type concerns. The product will be at least the width of unsigned or uint32_t. Casting, in general, may narrow the type. Casting needs to be avoided unless narrowing is certain to not occur. An optimization compiler will not create unnecessary code.
uint32_t a;
uint32_t b = (a + 0u) << 31;
uint32_t b = (a*1u) << 31;
Taking a clue from this question about possible UB in uint32 * uint32 arithmetic, the following simple approach should work in C and C++:
uint32_t a = (...);
uint32_t b = (uint32_t)((a + 0u) << 31);
The integer constant 0u has type unsigned int. This promotes the addition a + 0u to uint32_t or unsigned int, whichever is wider. Because the type has rank int or higher, no more promotion occurs, and the shift can be applied with the left operand being uint32_t or unsigned int.
The final cast back to uint32_t will just suppress potential warnings about a narrowing conversion (say if int is 64 bits).
A decent C compiler should be able to see that adding zero is a no-op, which is less onerous than seeing that a pre-mask has no effect after an unsigned shift.
To avoid unwanted promotion, you may use the greater type with some typedef, as
using my_uint_at_least32 = std::conditional_t<(sizeof(std::uint32_t) < sizeof(unsigned)),
unsigned,
std::uint32_t>;
For this segment of code:
uint32_t a = (...);
uint32_t b = a << 31;
To promote a to a unsigned type instead of signed type, use:
uint32_t b = a << 31u;
When both sides of << operator is an unsigned type, then this line in 6.3.1.8 (C standard draft n1570) applies:
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
The problem you are describing is caused you use 31 which is signed int type so another line in 6.3.1.8
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
forces a to promoted to a signed type
Update:
This answer is not correct because 6.3.1.1(2) (emphasis mine):
...
If an int can represent all values of the original type (as restricted
by the width, for a bit-field), the value is converted to an int;
otherwise, it is converted to an unsigned int. These are called the
integer promotions.58) All other types are unchanged by the integer
promotions.
and footnote 58 (emphasis mine):
58) The integer promotions are applied only: as part of the usual arithmetic conversions, to certain argument expressions, to the operands of the unary +, -, and ~ operators, and to both operands of the shift operators, as specified by their respective subclauses.
Since only integer promotion is happening and not common arithmetic conversion, using 31u does not guarantee a to be converted to unsigned int as stated above.

Should use of bit-fields of type int be discouraged? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Improve this question
From the Draft C++ Standard (N3337):
9.6 Bit-fields
4 If the value true or false is stored into a bit-field of type bool of any size (including a one bit bit-field), the original bool value and the value of the bit-field shall compare equal. If the value of an enumerator is stored into a bit-field of the same enumeration type and the number of bits in the bit-field is large enough to hold all the values of that enumeration type (7.2), the original enumerator value and the value of the bit-field shall compare equal.
The standard is non-committal about any such behavior for bit-fields of other types. To understand how g++ (4.7.3) deals with other types of bit-fields, I used the following test program:
#include <iostream>
enum TestEnum
{
V1 = 0,
V2
};
struct Foo
{
bool d1:1;
TestEnum d2:1;
int d3:1;
unsigned int d4:1;
};
int main()
{
Foo foo;
foo.d1 = true;
foo.d2 = V2;
foo.d3 = 1;
foo.d4 = 1;
std::cout << std::boolalpha;
std::cout << "d1: " << foo.d1 << std::endl;
std::cout << "d2: " << foo.d2 << std::endl;
std::cout << "d3: " << foo.d3 << std::endl;
std::cout << "d4: " << foo.d4 << std::endl;
std::cout << std::endl;
std::cout << (foo.d1 == true) << std::endl;
std::cout << (foo.d2 == V2) << std::endl;
std::cout << (foo.d3 == 1) << std::endl;
std::cout << (foo.d4 == 1) << std::endl;
return 0;
}
The output:
d1: true
d2: 1
d3: -1
d4: 1
true
true
false
true
I was surprised by the lines of the output corresponding to Foo::d3. The output is the same at ideone.com.
Since the standard is non-committal about the comparision of bit-fields of type int, g++ does not seem to be in violation of the standard. That brings me to my questions.
Is use of bit-fields of type int a bad idea? Should it be discouraged?
Yes, bit fields of type int are a bad idea, because their signedness is implementation-defined. Use signed int or unsigned int instead.
For non-bitfield declarations, the type name int is exactly equivalent to signed int (or int signed, or signed). The same pattern is followed for short, long, and long long: the unadorned type name is the signed version, and you have to add the unsigned keyword to name the corresponding unsigned type.
Bit fields, for historical reasons, are a special case. A bit-field defined with the type int is equivalent either to the same declaration with signed int, or to the same declaration with unsigned int. The choice is implementation-defined (i.e., it's up to the compiler, not to the programmer). A bit field is the only context in which int and signed int aren't (necessarily) synonymous. The same applies to char, short, long, and long long.
Quoting the C++11 standard, section 9.6 [class.bit]:
It is implementation-defined whether a plain (neither explicitly
signed nor unsigned) char, short, int, long, or long long bit-field is
signed or unsigned.
(I'm not entirely sure of the rationale for this. Very old versions of C didn't have the unsigned keyword, and unsigned bit fields are usually more useful than signed bit fields. It may be that some early C compilers implemented bit fields before the unsigned keyword was introduced. Making bit fields unsigned by default, even when declared as int, may have been just a matter of convenience. There's no real reason to keep the rule other than to avoid breaking old code.)
Most bit fields are intended to be unsigned, which of course means that they should be defined that way.
If you want a signed bit field (say, a 4-bit field that can represent values from -8 to +7, or from -7 to +7 on a non-two's-complement system), then you should explicitly define it as signed int. If you define it as int, then some compilers will treat it as unsigned int.
If you don't care whether your bit field is signed or unsigned, then you can define it as int -- but if you're defining a bit field, then you almost certainly do care whether it's signed or unsigned.
You can absolutely use unsigned bit-fields of any size no greater than the size of an unsigned int. While signed bit-fields are legal (at least if the width is greater than one), I personally prefer not to use them. If, however, you do want to use a signed bit-field, you should explicitly mark it as signed because it is implementation-dependent as to whether an unqualified int bit-field is signed or unsigned. (This is similar to char, but without the complicating feature of explicitly unqualified char* literals.)
So to that extent, I agree that int bit-fields should be discouraged. [Note 1] While I don't know of any implementation in which an int bitfield is implicitly unsigned, it is certainly allowed by the standard, and consequently there is lots of opportunity for implementation-specific unanticipated behaviour if you are not explicit about signs.
The standards specify that a signed integer representation consists of optional padding bits, exactly one sign bit, and value bits. While the standard does not guarantee that there is at least one value bit, -- and as the example in the OP shows, gcc does not insist that there be -- I think it is a plausible interpretation of the standard, since it explicitly allows there to be no padding bits, and does not have any such wording corresponding to value bits.
In any case, there are only three possible signed representations allowed:
2's complement, in which a single-bit field consisting of a 1 should be interpreted as -1
1's complement and sign-magnitude. In both of these case, a single-bit field consisting of a 1 is allowed to be a trap representation, so the only number which can be represented in a 1-bit signed bit-field is 0.
Since portable code cannot assume that a 1-bit signed bit-field can represent any non-zero value, it seems reasonable to insist that a signed bit-field have at least 2 bits, regardless of whether you interpret the standard(s) to actually require that or not.
Notes:
Indeed, if it were not for the fact that string literals are explicitly unqualified, I would prefer to always specify unsigned char. But there's no way to roll-back history on that point.)
int is signed, and in C++ Two's complement can be used, so in first int's byte sign may be stored. When there are 2 bits for an signed int, it can be equal to 1, see it working.
This is perfectly logical. int is a signed integer type, and if the underlying architecture uses two's complement to represent signed integers (as all modern architectures do), then the high-order bit is the sign bit. So a 1-bit signed integer bitfield can take the values 0 or -1. And a 3-bit signed integer bitfield, for instance, can take values between -4 and 3 inclusive.
There is no reason for a blanket ban on signed integer bitfields, as long as you understand two's complement representation.

what does ":1" after a member variable mean? [duplicate]

What does the following C++ code mean?
unsigned char a : 1;
unsigned char b : 7;
I guess it creates two char a and b, and both of them should be one byte long, but I have no idea what the ": 1" and ": 7" part does.
The 1 and the 7 are bit sizes to limit the range of the values. They're typically found in structures and unions. For example, on some systems (depends on char width and packing rules, etc), the code:
typedef struct {
unsigned char a : 1;
unsigned char b : 7;
} tOneAndSevenBits;
creates an 8-bit value, one bit for a and 7 bits for b.
Typically used in C to access "compressed" values such as a 4-bit nybble which might be contained in the top half of an 8-bit char:
typedef struct {
unsigned char leftFour : 4;
unsigned char rightFour : 4;
} tTwoNybbles;
For the language lawyers amongst us, the 9.6 section of the C++11 standard explains this in detail, slightly paraphrased:
Bit-fields [class.bit]
A member-declarator of the form
identifieropt attribute-specifieropt : constant-expression
specifies a bit-field; its length is set off from the bit-field name by a colon. The optional attribute-specifier appertains to the entity being declared. The bit-field attribute is not part of the type of the class member.
The constant-expression shall be an integral constant expression with a value greater than or equal to zero. The value of the integral constant expression may be larger than the number of bits in the object representation of the bit-field’s type; in such cases the extra bits are used as padding bits and do not participate in the value representation of the bit-field.
Allocation of bit-fields within a class object is implementation-defined. Alignment of bit-fields is implementation-defined. Bit-fields are packed into some addressable allocation unit.
Note: bit-fields straddle allocation units on some machines and not on others. Bit-fields are assigned right-to-left on some machines, left-to-right on others. - end note
I believe those would be bitfields.
Strictly speaking, a bitfield must be a int, unsigned int, or _Bool. Although most compilers will take any integral type.
Ref C11 6.7.2.1:
A bit-field shall have a type that is a qualified or unqualified
version of _Bool, signed int, unsigned int, or some other
implementation-defined type.
Your compiler will probably allocate 1 byte of storage, but it is free to grab more.
Ref C11 6.7.2.1:
An implementation may allocate any addressable storage unit large
enough to hold a bit- field.
The savings comes when you have multiple bitfields that are declared one after another. In this case, the storage allocated will be packed if possible.
Ref C11 6.7.2.1:
If enough space remains, a bit-field that
immediately follows another bit-field in a structure shall be packed
into adjacent bits of the same unit. If insufficient space remains,
whether a bit-field that does not fit is put into the next unit or
overlaps adjacent units is implementation-defined.

How to merge two signed bit variables into one signed bit variable?

Suppose the following c++ code:
#include <iostream>
using namespace std;
typedef struct
{
int a: 5;
int b: 4;
int c: 1;
int d: 22;
} example;
int main()
{
example blah;
blah.a = -5; // 11011
blah.b = -3; // 1101
int result = blah.a << 4 | blah.b;
cout << "Result = " << result << endl; // equals 445 , but I am interested in this having a value of -67
return 0;
}
I am interested in having the variable result be of type int where the 9th bit is the most significant bit. I would like this to be the case so that result = -67 instead of 445. How is this done? Thanks.
See Sign Extending an int in C for a closely related question (but not a duplicate).
You need to be aware that almost everything about bit fields is 'implementation defined'. In particular, it is not clear that you can assign negative numbers to a 'plain int' bit-field; you have to know whether your implementation uses 'plain int is signed' or 'plain int is unsigned'. Which is the 9th bit gets tricky too; are you counting from 0 or 1, and which end of the set of bit-fields is at bit 0 and which at bit 31 (counting least significant bit (LSB) as bit 0 and most significant bit (MSB) as bit 31 of a 32-bit quantity). Indeed, the size of your structure need not be 32 bits; the compiler might have different rules for the layout.
With all those caveats out of the way, you have a 9-bit value formed from (blah.a << 4) | blah.b, and you want that sign-extended as if it was a 9-bit 2's complement number being promoted to (32-bit) int.
The function in the cross-referenced answer could do the job:
#include <assert.h>
#include <limits.h>
extern int getFieldSignExtended(int value, int hi, int lo);
enum { INT_BITS = CHAR_BIT * sizeof(int) };
int getFieldSignExtended(int value, int hi, int lo)
{
assert(lo >= 0);
assert(hi > lo);
assert(hi < INT_BITS - 1);
int bits = (value >> lo) & ((1 << (hi - lo + 1)) - 1);
if (bits & (1 << (hi - lo)))
return(bits | (~0 << (hi - lo)));
else
return(bits);
}
Invoke it as:
int result = getFieldSignExtended((blah.a << 4) | blah.b), 8, 0);
If you want to hard-wire the numbers, you can write:
int x = (blah.a << 4) | blah.b;
int result = (x & (1 << 8)) ? (x | (~0 << 8)) : x;
Note I'm assuming the 9th bit is bit 8 of a value with bits 0..8 in it. Adjust if you have some other interpretation in mind.
Working code
Compiled with g++ (GCC) 4.1.2 20080704 (Red Hat 4.1.2-44) from a RHEL 5 x86/64 machine.
#include <iostream>
using namespace std;
typedef struct
{
int a: 5;
int b: 4;
int c: 1;
int d: 22;
} example;
int main()
{
example blah;
blah.a = -5; // 11011
blah.b = -3; // 1101
int result = blah.a << 4 | blah.b;
cout << "Result = " << result << endl;
int x = (blah.a << 4) | blah.b;
cout << "x = " << x << endl;
int result2 = (x & (1 << 8)) ? (x | (~0 << 8)) : x;
cout << "Result2 = " << result2 << endl;
return 0;
}
Sample output:
Result = 445
x = 445
Result2 = -67
ISO/IEC 14882:2011 — C++ Standard
§7.1.6.2 Simple type specifiers
¶3 ... [ Note: It is implementation-defined whether objects of char type and certain bit-fields (9.6) are
represented as signed or unsigned quantities. The signed specifier forces char objects and bit-fields to be
signed; it is redundant in other contexts. —end note ]
§9.6 Bit-fields [class.bit]
¶1 A member-declarator of the form
identifier<sub>opt</sub> attribute-specifier-seq<sub>opt</sub>: constant-expression
specifies a bit-field; its length is set off from the bit-field name by a colon. The optional attribute-specifier-seq
appertains to the entity being declared. The bit-field attribute is not part of the type of the class
member. The constant-expression shall be an integral constant expression with a value greater than or equal
to zero. The value of the integral constant expression may be larger than the number of bits in the object
representation (3.9) of the bit-field’s type; in such cases the extra bits are used as padding bits and do not
participate in the value representation (3.9) of the bit-field. Allocation of bit-fields within a class object is
implementation-defined. Alignment of bit-fields is implementation-defined. Bit-fields are packed into some
addressable allocation unit. [ Note: Bit-fields straddle allocation units on some machines and not on others.
Bit-fields are assigned right-to-left on some machines, left-to-right on others. —end note ]
¶2 A declaration for a bit-field that omits the identifier declares an unnamed bit-field. Unnamed bit-fields
are not members and cannot be initialized. [ Note: An unnamed bit-field is useful for padding to conform
to externally-imposed layouts. —end note ] As a special case, an unnamed bit-field with a width of zero
specifies alignment of the next bit-field at an allocation unit boundary. Only when declaring an unnamed
bit-field may the value of the constant-expression be equal to zero.
¶3 A bit-field shall not be a static member. A bit-field shall have integral or enumeration type (3.9.1). It is
implementation-defined whether a plain (neither explicitly signed nor unsigned) char, short, int, long,
or long long bit-field is signed or unsigned. A bool value can successfully be stored in a bit-field of any
nonzero size. The address-of operator & shall not be applied to a bit-field, so there are no pointers to bitfields.
A non-const reference shall not be bound to a bit-field (8.5.3). [ Note: If the initializer for a reference
of type const T& is an lvalue that refers to a bit-field, the reference is bound to a temporary initialized to
hold the value of the bit-field; the reference is not bound to the bit-field directly. See 8.5.3. —end note ]
¶4 If the value true or false is stored into a bit-field of type bool of any size (including a one bit bit-field),
the original bool value and the value of the bit-field shall compare equal. If the value of an enumerator is
stored into a bit-field of the same enumeration type and the number of bits in the bit-field is large enough
to hold all the values of that enumeration type (7.2), the original enumerator value and the value of the
bit-field shall compare equal. [ Example:
enum BOOL { FALSE=0, TRUE=1 };
struct A {
BOOL b:1;
};
A a;
void f() {
a.b = TRUE;
if (a.b == TRUE) // yields true
{ /* ... */ }
}
—end example ]
ISO/IEC 9899:2011 — C2011 Standard
The C standard has essentially the same effect, but the information is presented somewhat differently.
6.7.2.1 Structure and union specifiers
¶4 The expression that specifies the width of a bit-field shall be an integer constant
expression with a nonnegative value that does not exceed the width of an object of the
type that would be specified were the colon and expression omitted.122) If the value is
zero, the declaration shall have no declarator.
¶5 A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed
int, unsigned int, or some other implementation-defined type. It is
implementation-defined whether atomic types are permitted.
¶9 ... In addition, a member may be declared to consist of a
specified number of bits (including a sign bit, if any). Such a member is called a
bit-field;124) its width is preceded by a colon.
¶10 A bit-field is interpreted as having a signed or unsigned integer type consisting of the
specified number of bits.125) If the value 0 or 1 is stored into a nonzero-width bit-field of
type _Bool, the value of the bit-field shall compare equal to the value stored; a _Bool
bit-field has the semantics of a _Bool.
¶11 An implementation may allocate any addressable storage unit large enough to hold a bitfield.
If enough space remains, a bit-field that immediately follows another bit-field in a
structure shall be packed into adjacent bits of the same unit. If insufficient space remains,
whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is
implementation-defined. The order of allocation of bit-fields within a unit (high-order to
low-order or low-order to high-order) is implementation-defined. The alignment of the
addressable storage unit is unspecified.
¶12 A bit-field declaration with no declarator, but only a colon and a width, indicates an
unnamed bit-field.126) As a special case, a bit-field structure member with a width of 0
indicates that no further bit-field is to be packed into the unit in which the previous bitfield,
if any, was placed.
122) While the number of bits in a _Bool object is at least CHAR_BIT, the width (number of sign and
value bits) of a _Bool may be just 1 bit.
124) The unary & (address-of) operator cannot be applied to a bit-field object; thus, there are no pointers to
or arrays of bit-field objects.
125) As specified in 6.7.2 above, if the actual type specifier used is int or a typedef-name defined as int,
then it is implementation-defined whether the bit-field is signed or unsigned.
126) An unnamed bit-field structure member is useful for padding to conform to externally imposed
layouts.
Annex J of the standard defines Portability Issues, and §J.3 defines Implementation-defined Behaviour. In part, it says:
J.3.9 Structures, unions, enumerations, and bit-fields
¶1 — Whether a ‘‘plain’’ int bit-field is treated as a signed int bit-field or as an
unsigned int bit-field (6.7.2, 6.7.2.1).
— Allowable bit-field types other than _Bool, signed int, and unsigned int
(6.7.2.1).
— Whether atomic types are permitted for bit-fields (6.7.2.1).
— Whether a bit-field can straddle a storage-unit boundary (6.7.2.1).
— The order of allocation of bit-fields within a unit (6.7.2.1).