Signed bit field in C++14 - c++

I believe that until C++14 a bit field of a struct declared as int was still interpreted as either signed or unsigned, the interpretation being implementation defined. Reference: http://en.cppreference.com/w/cpp/language/bit_field.
Is this still the case in C++14? I.e., is the code below guaranteed to work as intended?
#include <iostream>

struct X
{
    int f : 3;
};

int main()
{
    X x;
    x.f = -2; // is this going to be indeed signed? It seems so.
    std::cout << x.f << std::endl; // displays -2
}

According to C++11 standard §9.6/p3 Bit-fields [class.bit] (Emphasis Mine):
A bit-field shall not be a static member. A bit-field shall have
integral or enumeration type (3.9.1). It is implementation-defined
whether a plain (neither explicitly signed nor unsigned) char, short,
int, long, or long long bit-field is signed or unsigned. A bool value
can successfully be stored in a bit-field of any nonzero size. The
address-of operator & shall not be applied to a bit-field, so there
are no pointers to bitfields. A non-const reference shall not be bound
to a bit-field (8.5.3). [ Note: If the initializer for a reference of
type const T& is an lvalue that refers to a bit-field, the reference
is bound to a temporary initialized to hold the value of the
bit-field; the reference is not bound to the bit-field directly. See
8.5.3. —end note ]
So you're correct for the first part. Indeed, until C++14 a bit-field of a struct declared as plain int was interpreted as either signed or unsigned, the choice being implementation-defined.
As already mentioned in the comments by T.C., defect reports DR 739 and DR 675 were filed about this issue, resulting in the following resolutions in the C++14 standard:
The wording "It is implementation-defined whether a plain (neither explicitly signed nor unsigned) char, short, int, long, or long long bit-field is signed or unsigned.", was removed, and the C++14 wording now is:
A bit-field shall not be a static member. A bit-field shall have
integral or enumeration type (3.9.1). A bool value can successfully be
stored in a bit-field of any nonzero size. The address-of operator &
shall not be applied to a bit-field, so there are no pointers to
bit-fields. A non-const reference shall not be bound to a bit-field
(8.5.3). [ Note: If the initializer for a reference of type const T&
is an lvalue that refers to a bit-field, the reference is bound to a
temporary initialized to hold the value of the bit-field; the
reference is not bound to the bit-field directly. See 8.5.3. —end note
]
Also in §C.1.8 Clause 9: classes [diff.class] the following section was added:
9.6
Change: Bit-fields of type plain int are signed.
Rationale: Leaving the choice of signedness to implementations could lead to inconsistent definitions of
template specializations. For consistency, the implementation freedom was eliminated for non-dependent
types, too.
Effect on original feature: The choice is implementation-defined in C, but not so in C++.
Difficulty of converting: Syntactic transformation.
How widely used: Seldom.
Consequently, in C++14 bit-fields of type plain int are signed and the code posted is guaranteed to work as intended.

Related

Is casting to const implied when casting to a narrower const location?

I am trying to word this as best as possible, but an example is a good way to demonstrate my question. Consider the following scenario where variable long a goes into a narrower array element - essentially const int b[0]:
long a = 584;
const int b[4] = {(const int) a, 0, 0, 0};
Is the following snippet equivalent, given that the const isn't explicitly written in the cast:
long a = 584;
const int b[4] = {(int) a, 0, 0, 0};
Both compile, but does the standard define this scenario and outcomes?
Casting to const int produces a value of type int. There are no cv-qualified prvalues of non-class type. See [expr.cast]/1:
The result of the expression (T) cast-expression is of type T. The result is an lvalue if T is an lvalue reference
type or an rvalue reference to function type and an xvalue if T is an rvalue reference to object type; otherwise
the result is a prvalue. [ Note: if T is a non-class type that is cv-qualified, the cv-qualifiers are ignored when
determining the type of the resulting prvalue; see 3.10. — end note ]
and [basic.lval]/4:
Class prvalues can have cv-qualified types; non-class prvalues always have cv-unqualified types. Unless
otherwise indicated (5.2.2), prvalues shall always have complete types or the void type; in addition to these
types, glvalues can also have incomplete types.
So even though you write a cast to const int, the resulting value will have type int.
However, a language lawyer might ask whether the (int) cast and the (const int) cast are guaranteed to produce the same value. Obviously in your case 584 fits into int so the value is guaranteed to be 584. In the general case where the long value might not fit into an int, the last bullet point of [dcl.init]/16 guarantees that the result of casting to const int will still be the same as casting to int:
... Otherwise, the initial value of the object being initialized is the (possibly converted) value of the initializer expression. Standard conversions (Clause 4) will be used, if necessary, to convert the initializer expression to the cv-unqualified version of the destination type;
(All wording is from the C++14 standard; emphasis is mine.)
No, const is not implicitly added by the compiler, because it doesn't change anything. Both of your snippets are equivalent.
I don't think the standard defines this scenario, because it's a bit contrived.
Your question is equivalent to asking whether it matters if a is const here (in the example below). The answer is no, it doesn't, because you are copying a. It doesn't matter that you can't write to a, because you are only doing a read, not a write.
/*const*/ int a = 10;
const int b = a;

Creating an invalid reference via reinterpret cast

I am trying to determine whether the following code invokes undefined behavior:
#include <iostream>

class A;

void f(A& f)
{
    char* x = reinterpret_cast<char*>(&f);
    for (int i = 0; i < 5; ++i)
        std::cout << x[i];
}

int main(int argc, char** argv)
{
    A* a = reinterpret_cast<A*>(new char[5]);
    f(*a);
}
My understanding is that reinterpret_casts to and from char* are compliant because the standard permits aliasing with char and unsigned char pointers (emphasis mine):
If a program attempts to access the stored value of an object through an lvalue of other than one of the following types, the behavior is undefined:
- the dynamic type of the object,
- a cv-qualified version of the dynamic type of the object,
- a type that is the signed or unsigned type corresponding to the dynamic type of the object,
- a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
- an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),
- a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
- a char or unsigned char type.
However, I am not sure whether f(*a) invokes undefined behavior by creating an A& reference from the invalid pointer. The deciding factor seems to be what the "attempts to access" wording means in the context of the C++ standard.
My intuition is that this does not constitute an access, since an access would require A to be defined (it is declared, but not defined, in this example). Unfortunately, I cannot find a concrete definition of "access" in the C++ standard:
Does f(*a) invoke undefined behavior? What constitutes "access" in the C++ standard?
I understand that, regardless of the answer, it is likely a bad idea to rely on this behavior in production code. I am asking this question primarily out of a desire to improve my understanding of the language.
[Edit] SergeyA cited this section of the standard. I've included it here for easy reference (emphasis mine):
5.3.1/1 [expr.unary.op]
The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points. If the type of the expression is “pointer to T,” the type of the result is “T.” [Note: indirection through a pointer to an incomplete type (other than cv void) is valid. The lvalue thus obtained can be used in limited ways (to initialize a reference, for example); this lvalue must not be converted to a prvalue, see 4.1. — end note ]
Tracing the reference to 4.1, we find:
4.1/1 [conv.lval]
A glvalue (3.10) of a non-function, non-array type T can be converted to a prvalue. If T is an incomplete type, a program that necessitates this conversion is ill-formed. If T is a non-class type, the type of the prvalue is the cv-unqualified version of T. Otherwise, the type of the prvalue is T.
When an lvalue-to-rvalue conversion is applied to an expression e, and either:
- e is not potentially evaluated, or
- the evaluation of e results in the evaluation of a member ex of the set of potential results of e, and ex names a variable x that is not odr-used by ex (3.2)
the value contained in the referenced object is not accessed.
I think our answer lies in whether *a satisfies the second bullet point. I am having trouble parsing that condition, so I am not sure.
char* x = reinterpret_cast<char*>(&f); is valid. Or, more specifically, access through x is allowed - the cast itself is always valid.
A* a = reinterpret_cast<A*>(new char[5]) is not valid - or, to be precise, access through a will trigger undefined behaviour.
The reason for this is that while it's OK to access an object through a char*, it's not OK to access an array of chars through an unrelated object type. The standard allows the first, but not the second.
Or, in layman's terms, you can alias a type* through a char*, but you can't alias a char* through a type*.
EDIT
I just noticed I didn't answer the direct question ("What constitutes "access" in the C++ standard?"). Apparently, the standard does not define access (at least, I was not able to find a formal definition), but dereferencing a pointer is commonly understood to qualify as access.

Should bit-fields less than int in size be the subject of integral promotion?

Let's say I have following struct:
struct A
{
    unsigned int a : 1;
    unsigned int b : 1;
};
What interests me is the type of the expression a + b. Technically, bit-fields have a "type" whose size is less than int, so integral promotion should arguably happen, and the result would then be int, as it is in gcc and clang.
But since it's impossible to extract the exact type of the bit-field itself, and it will always be deduced as its "big" type (i.e. unsigned int in this case), is it correct that integral promotion should happen? We can't actually talk about exact types and their sizes for bit-fields, except that they are deduced as unsigned int, in which case integral promotion shouldn't happen.
(Once again, my question stems from the fact that MSVC happens to think that unsigned int is the type of such an expression.)
If we go to the draft C++ standard N4140, section 5, it says:
Many binary operators that expect operands of arithmetic or
enumeration type cause conversions and yield result types in a similar
way. The purpose is to yield a common type, which is also the type of
the result. This pattern is called the usual arithmetic conversions,
which are defined as follows
and the following bullet applies:
Otherwise, the integral promotions (4.5) shall be performed on both operands.61 Then the following rules shall be applied to the promoted
operands:
and section 4.5 which says (emphasis mine):
A prvalue for an integral bit-field (9.6) can be converted to a
prvalue of type int if int can represent all the values of the
bit-field; otherwise, it can be converted to unsigned int if unsigned
int can represent all the values of the bit-field. If the bit-field is
larger yet, no integral promotion applies to it. If the bit-field has
an enumerated type, it is treated as any other value of that type for
promotion purposes.
So gcc and clang are correct, a and b should be promoted to int.

g++ compiles array with size given at runtime by const value (not constexpr)

Can someone clarify why is this legal C++ code? (Yes, I'm asking why my code works ;) )
#include <iostream>
#include <vector>

int main()
{
    const std::size_t N = 10;
    int a[N]{}; // value-initialize it to get rid of annoying un-initialized warnings in the following line
    std::cout << a[5] << std::endl; // got a zero
}
The size of the array is declared as const (NOT constexpr), still the program compiles with no warnings (-Wall, -Wextra, -Wpedantic) in both g++ and clang++. I thought that the C++ standard explicitly specified that the size of the array should be a compile-time constant. It is absolutely not the case here.
N4140 §5.19 [expr.const]/p2, bullet 2.7.1, and p3:
2 A conditional-expression e is a core constant expression unless the
evaluation of e, following the rules of the abstract machine (1.9),
would evaluate one of the following expressions:
[...]
- an lvalue-to-rvalue conversion (4.1) unless it is applied to
  - a non-volatile glvalue of integral or enumeration type that refers to a non-volatile const object with a preceding initialization, initialized with a constant expression [ Note: a string literal (2.14.5) corresponds to an array of such objects. —end note ],
  - a non-volatile glvalue that refers to a non-volatile object defined with constexpr, or that refers to a non-mutable sub-object of such an object, or
  - a non-volatile glvalue of literal type that refers to a non-volatile object whose lifetime began within the evaluation of e;
[...]
3 An integral constant expression is an expression of integral or unscoped enumeration type, implicitly converted to a prvalue, where the converted expression is a core constant expression. [ Note: Such expressions may be used as array bounds (8.3.4, 5.3.4), as bit-field lengths (9.6), as enumerator initializers if the underlying type is not fixed (7.2), and as alignments (7.6.2). —end note ]
In your code, N is a "non-volatile glvalue of integral or enumeration type", it refers to a "non-volatile const object with a preceding initialization", so applying the lvalue-to-rvalue conversion to it does not prevent the expression from being a core constant expression despite the absence of constexpr.
Where did you get that strange idea that N is "absolutely NOT a compile-time constant", as you state in the code comments?
Since the beginning of times, a const integral object declared with an integral constant expression initializer by itself forms an integral constant expression. I.e. is a compile-time constant in C++.
This applies equally to namespace declarations, local declarations and static class member declarations.
(It would not be a compile-time constant in C. But it has always been a compile-time constant in C++.)
Well, N is constant during compilation, so it is equivalent to
int a[10]{};
A const int initialized with a literal is considered a constant expression.
From N1905 5.19
An integral constant-expression can involve only literals of arithmetic types, enumerators, non-volatile const variables or static data members of integral or enumeration types initialized with constant expressions
Note the "non-volatile const variables ... initialized with constant expressions" part: N fits that description exactly, which is why g++ accepts your original code.

How to set the alignment in a platform independent way?

In the latest draft of the C++11 standard, chapter 3.11 talks about alignment.
Later, chapter 7.6.2 defines how to declare aligned structures (or variables?).
If I define a structure like this:
struct alignas(16) A
{
    int n;
    unsigned char data[1020];
};
does it mean that all instances of class A are going to be aligned to 16 bytes?
Or, do I have to do it like in the next code?
struct A
{
    char data[300];
};

alignas(16) A a;
If both examples are wrong, how to do it properly?
PS: I am not looking for a compiler-dependent solution.
Alignment is first and foremost a property of types.
It can be overridden for a type with alignas; alignas can also be used to assign a new alignment value to a specific object.
So, both examples are valid, and will have the semantics that you've presumed.
[n3290: 3.11/1]: Object types have alignment requirements (3.9.1,
3.9.2) which place restrictions on the addresses at which an object of
that type may be allocated. An alignment is an implementation-defined
integer value representing the number of bytes between successive
addresses at which a given object can be allocated. An object type
imposes an alignment requirement on every object of that type;
stricter alignment can be requested using the alignment specifier
(7.6.2).
[n3290: 7.6.2/1]: An alignment-specifier may be applied to a
variable or to a class data member, but it shall not be applied to a
bit-field, a function parameter, the formal parameter of a catch
clause (15.3), or a variable declared with the register storage class
specifier. An alignment-specifier may also be applied to the
declaration of a class or enumeration type. An alignment-specifier
with an ellipsis is a pack expansion (14.5.3).
[n3290: 7.6.2/2]: When the alignment-specifier is of the form alignas( assignment-expression ):
- the assignment-expression shall be an integral constant expression
- if the constant expression evaluates to a fundamental alignment, the alignment requirement of the declared entity shall be the specified fundamental alignment
- if the constant expression evaluates to an extended alignment and the implementation supports that alignment in the context of the declaration, the alignment of the declared entity shall be that alignment
- if the constant expression evaluates to an extended alignment and the implementation does not support that alignment in the context of the declaration, the program is ill-formed
- if the constant expression evaluates to zero, the alignment specifier shall have no effect
- otherwise, the program is ill-formed.