C++ casting bool to int - standard - c++

I am interested in whether the standard says anything about the possible values of the bool type after casting it to an integer type.
For example, the following code:
#include <iostream>
using namespace std;

int main() {
    bool someValue = false;
    *((int*)(&someValue)) = 50;
    cout << someValue << endl;
    return 0;
}
prints 1 even though it's forced to store the value 50. Does the standard specify anything about this? Or is the compiler generating some method for type bool such as:
operator int() {
    return myValue != 0 ? 1 : 0;
}
Also, why is a cast like the following:
reinterpret_cast<int>(someValue) = 50;
forbidden with the error
error: invalid cast from type 'bool' to type 'int'
(For all of the above I used the GCC 5.1 compiler.)

The way you are using it exhibits UB, because you write outside of the bool variable's boundaries AND you break the strict aliasing rule.
However, if you have a bool and want to use it as an int (this usually happens when you want to index into an array based on some condition), the standard mandates that a true bool converts to 1 and a false bool converts to 0, no matter what (UB obviously excluded).
For example, this is guaranteed to output 52 as long as should_add == true.
#include <iostream>

int main() {
    int arr[] = {0, 10};
    bool should_add = 123;
    int result = 42 + arr[should_add];
    std::cout << result << '\n';
}

This line *((int*)(&someValue)) = 50; is at the very least non-standard. The implementation could use a smaller size for bool (say 1 or 2 bytes) than for int (say 4 bytes). In that case, you would write past the variable, possibly clobbering another variable.
And anyway, as was said in the comments, thanks to the strict aliasing rule almost any access through a cast pointer can be treated as Undefined Behaviour by a compiler. The only almost-legal one (with respect to the strict aliasing rule) would be:
*((char *) &someValue) = 50;
on a little endian system, and
*(((char *) &someValue) + sizeof(bool) - 1) = 50;
on a big-endian one (byte access is still not forbidden).
Anyway, as the representation of bool is not specified by the standard, directly writing something into a bool can lead to true or false depending on the implementation. For example, one implementation could consider only the lowest-order bit (true if val & 1 is 1, else false), while another could consider all bits (true for any non-zero value, false only for 0). The only thing the standard says is that converting 0 yields false and converting any non-zero value yields true.
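For illustration (this sketch is mine, not part of the original answer), you can observe which representation your implementation actually uses by copying the bytes of a bool into unsigned char, which is always allowed for reading:
#include <cstring>
#include <iostream>

int main() {
    bool b = true;
    unsigned char bytes[sizeof(bool)];
    std::memcpy(bytes, &b, sizeof b);                 // reading the object representation is fine
    for (unsigned char byte : bytes)
        std::cout << static_cast<int>(byte) << ' ';   // GCC, Clang and MSVC typically print 1
    std::cout << '\n';
}
On common ABIs sizeof(bool) is 1 and true is stored as the byte value 1, but nothing in the standard requires that.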
But what is mandated by the standard is the conversion from bool to int:
4.5 Integral promotions [conv.prom]
...A prvalue of type bool can be converted to a prvalue of type int, with false becoming zero and true
becoming one.
So this fully explains why displaying a bool can only give 0 or 1. That said, since the previous operation invoked UB, anything could have happened here, including this output.
You invoked Undefined Behaviour - shame on you

Related

Is it UB to write values other than 0 or 1 in a bool? If yes, how do they compare? [duplicate]

This question already has answers here:
Setting extra bits in a bool makes it true and false at the same time
(2 answers)
Closed 3 years ago.
Consider the program below.
All comparisons are true with a recent gcc but only the value 1 compares equal with the Visual Studio commandline compiler v. 19.16.27031.1 for x86.
I believe that it's generally OK to write into PODs through char pointers; but is there wording in the standard about writing funny values into bool variables? If it is allowed, is there wording about the behavior in comparisons?
#include <iostream>
using namespace std;

void f()
{
    if(sizeof(bool) != 1)
    {
        cout << "sizeof(bool) != 1\n";
        return;
    }
    bool b;
    *(char *)&b = 1;
    if(b == true) { cout << (int) *(char *)&b << " is true\n"; }
    *(char *)&b = 2;
    if(b == true) { cout << (int) *(char *)&b << " is true\n"; }
    *(char *)&b = 3;
    if(b == true) { cout << (int) *(char *)&b << " is true\n"; }
}

int main()
{
    f();
}
P.S. gcc 8.3 uses a test instruction to effectively check for non-zero while gcc 9.1 explicitly compares with 1, making only that comparison true. Perhaps this godbolt link works.
No. This is not OK.
Writing arbitrary data into a bool is UB (see What is the strict aliasing rule?), and this is similar to Does the C++ standard allow for an uninitialized bool to crash a program?
*(char *)&b = 2;
This type-punning hack invokes UB. Depending on your compiler's implementation of bool and the optimizations it is allowed to perform, you could have demons flying out of your nose.
Consider:
bool b;
b = char{2}; // 1
(char&)b = 2; // 2
*(char*)&b = 2; // 3
Here, lines 2 and 3 have the same meaning, but 1 has a different meaning. In line 1, since the value being assigned to the bool object is nonzero, the result is guaranteed to be true. However, in lines 2 and 3, the object representation of the bool object is being written to directly.
It is indeed legal to write to an object of any non-const type through an lvalue of type char, but:
In C++17, the standard does not specify the representation of bool objects. The bool type may have padding bits, and may even be larger than char. Thus, any attempt to write directly to a bool value in this way may yield an invalid (or "trap") object representation, which means that subsequently reading that value will yield undefined behaviour. Implementations may (but are not required by the standard to) define the representation of bool objects.
In C++20, my understanding is that thanks to P1236R1, there are no longer any trap representations, but the representation of bool is still not completely specified. The bool object may still be larger than char, so if you write to only the first byte of it, it can still contain an indeterminate value, yielding UB when accessed. If bool is 1 byte (which is likely), then the result is unspecified: it must yield some valid value of the underlying type (which will most likely be char or its signed or unsigned cousin), but the mapping of such values to true and false remains unspecified.
Writing any integer values into a bool through a pointer to a type other than bool is undefined behavior, because those may not match the compiler's representation of the type. And yes, writing something other than 0 or 1 will absolutely break things: compilers often rely on the exact internal representation of boolean true.
But bool b = 3 is fine, and just sets b to true (the rule for converting from integer types to bool is, any nonzero value becomes true and zero becomes false).
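If all you want is to turn an arbitrary byte into a bool, a minimal sketch (mine, not from the answer above) is to convert the value instead of overwriting the object representation:
#include <iostream>

// Hypothetical helper: converting the value is always well-defined
// (zero becomes false, anything else becomes true).
bool from_byte(unsigned char c) {
    return c != 0;
}

int main() {
    std::cout << std::boolalpha
              << from_byte(0) << ' '     // false
              << from_byte(2) << ' '     // true
              << from_byte(255) << '\n'; // true
}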
It's OK to assign values other than true and false to a variable of type bool.
The RHS is converted to a bool by using the standard conversion sequence to true/false before the value is assigned.
However, what you are trying to do is not OK.
*(char *)&b = 2; // Not OK
*(char *)&b = 3; // Not OK
Even assigning 1 and 0 by using that mechanism is not OK.
*(char *)&b = 1; // Not OK
*(char *)&b = 0; // Not OK
The following statements are OK.
b = 2; // OK
b = 3; // OK
Update, in response to OP's comment.
From the standard, [basic.fundamental]/6:
Values of type bool are either true or false.
The standard does not mandate that true be represented as 1 and/or false be represented as 0. An implementation can choose a representation that best suits their needs.
The standard goes on to say this about the values of bool types:
Using a bool value in ways described by this International Standard as “undefined,” such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false.
Storing the value char(1) or char(0) in its memory location indirectly does not guarantee that those values will be properly converted to true/false. Since those values may not represent either true or false in a given implementation, accessing them would lead to undefined behavior.
In general, it's perfectly fine to assign values other than 0 or 1 to a bool:
7.3.14 Boolean conversions
[conv.bool]
1 A prvalue of arithmetic, unscoped enumeration, pointer, or pointer-to-member type can be converted to a prvalue of type bool. A zero value, null pointer value, or null member pointer value is converted to false; any other value is converted to true.
But your casting is another question entirely.
Be careful about assuming it's OK to write to a type through a pointer to something else. You can get very surprising results, and the optimizer is allowed to assume certain such things are not done. I don't know all the rules for it, but the optimizer doesn't always track writes through pointers to different types (it is allowed to do all sorts of things in the presence of undefined behavior!). Beware of code like this:
bool f()
{
    bool a = true;
    bool b = true;
    *reinterpret_cast<char*>(&a) = 1;
    *reinterpret_cast<char*>(&b) = 2;
    return a == b;
}
Live: https://godbolt.org/z/hJnuSi
With optimizations:
g++: -> true (but the value is actually 2)
clang: -> false
int main() {
    std::cout << f() << "\n"; // g++ prints 2!!!
}
Though f() returns a bool, g++ actually prints out 2 in main here. Probably not expected.
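For contrast, here is a sketch (mine, not from the answer) of the well-defined way to get the same effect: convert the values instead of rewriting the storage, and every compiler agrees on the result.
#include <iostream>

bool g() {
    bool a = static_cast<bool>(1);  // ordinary conversion: non-zero -> true
    bool b = static_cast<bool>(2);  // ordinary conversion: non-zero -> true
    return a == b;                  // guaranteed to be true
}

int main() {
    std::cout << std::boolalpha << g() << "\n";  // prints "true" everywhere
}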

Casting int to bool in C/C++

I know that in C and C++, when casting bools to ints, (int)true == 1 and (int)false == 0. I'm wondering about casting in the reverse direction...
In the code below, all of the following assertions held true for me in .c files compiled with Visual Studio 2013 and Keil µVision 5. Notice (bool)2 == true.
What do the C and C++ standards say about casting non-zero, non-one integers to bools? Is this behavior specified? Please include citations.
#include <stdbool.h>
#include <assert.h>

void TestBoolCast(void)
{
    int i0 = 0, i1 = 1, i2 = 2;

    assert((bool)i0 == false);
    assert((bool)i1 == true);
    assert((bool)i2 == true);

    assert(!!i0 == false);
    assert(!!i1 == true);
    assert(!!i2 == true);
}
Not a duplicate of Can I assume (bool)true == (int)1 for any C++ compiler?:
Casting in the reverse direction (int --> bool).
No discussion there of non-zero, non-one values.
0 values of basic types(1)(2) map to false.
Other values map to true.
This convention was established in original C, via its flow control statements; C didn't have a boolean type at the time.
It's a common error to assume that as function return values, false indicates failure. But in particular from main it's false that indicates success. I've seen this done wrong many times, including in the Windows starter code for the D language (when you have folks like Walter Bright and Andrei Alexandrescu getting it wrong, then it's just dang easy to get wrong), hence this heads-up: beware, beware.
There's no need to cast to bool for built-in types because that conversion is implicit. However, Visual C++ (Microsoft's C++ compiler) has a tendency to issue a performance warning (!) for this, a pure silly-warning. A cast doesn't suffice to shut it up, but a conversion via double negation, i.e. return !!x, works nicely. One can read !! as a “convert to bool” operator, much as --> can be read as “goes to”. For those who are deeply into readability of operator notation. ;-)
1) C++14 §4.12/1 “A zero value, null pointer value, or null member pointer value is converted to false; any other value is converted to true. For direct-initialization (8.5), a prvalue of type std::nullptr_t can be converted to a prvalue of type bool; the resulting value is false.”
2) C99 and C11 §6.3.1.2/1 “When any scalar value is converted to _Bool, the result is 0 if the value compares equal to 0; otherwise, the result is 1.”
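As a small illustration of the !! idiom mentioned above (this helper is mine, not from the answer):
#include <iostream>

// !x yields a bool; !!x restores the original truth value, so the
// conversion is explicit and MSVC's C4800 warning stays quiet.
bool to_bool(int x) {
    return !!x;
}

int main() {
    std::cout << std::boolalpha
              << to_bool(0) << ' '     // false
              << to_bool(42) << '\n';  // true
}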
The following cites the C11 standard (final draft).
6.3.1.2: When any scalar value is converted to _Bool, the result is 0 if the value compares equal to 0; otherwise, the result is 1.
bool (mapped by stdbool.h to the internal name _Bool for C) itself is an unsigned integer type:
... The type _Bool and the unsigned integer types that correspond to the standard signed integer types are the standard unsigned integer types.
According to 6.2.5p2:
An object declared as type _Bool is large enough to store the values 0 and 1.
AFAIK these definitions are semantically identical to C++ - with the minor difference of the built-in(!) names. bool for C++ and _Bool for C.
Note that C does not use the term rvalues as C++ does. However, in C pointers are scalars, so assigning a pointer to a _Bool behaves as in C++.
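A short sketch (mine, not from the answer) of the pointer-to-bool conversion just described, written in C++:
#include <iostream>

int main() {
    int x = 7;
    int* p = &x;
    int* q = nullptr;
    bool b1 = p;   // non-null pointer value converts to true
    bool b2 = q;   // null pointer value converts to false
    std::cout << std::boolalpha << b1 << ' ' << b2 << '\n';  // true false
}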
There is an old-school 'Marxist' way to cast int -> bool without C4800 warnings from Microsoft's cl compiler: use the negation of the negation.
int i = 0;
bool bi = !!i;
int j = 1;
bool bj = !!j;

Is it safe to compare boolean variable with 1 and 0 in C, C++? [duplicate]

This question already has answers here:
Can I assume (bool)true == (int)1 for any C++ compiler?
(5 answers)
Closed 8 years ago.
Consider the code
bool f() { return 42; }
if (f() == 1)
printf("hello");
Do the C (C99+ with stdbool.h) and C++ standards guarantee that "hello" will be printed? And is
bool a = x;
always equivalent to
bool a = x ? 1 : 0;
Yes. You are missing a step, though. "0" is false and every other int is true, but f() always returns true ("1"). It doesn't return 42; the conversion happens in "return 42;".
In C, the macro bool (we are speaking about the macro defined in stdbool.h) expands to _Bool, which has only two values, 0 and 1.
In C++, the value of f() in the expression f() == 1 is implicitly converted to the int 1 according to the integral promotion.
So in my opinion this code
bool f() { return 42; }
if (f() == 1)
printf("hello");
is safe.
In C++, bool is a built-in type. Conversions from any type to bool always yield false (0) or true (1).
Prior to the 1999 ISO C standard, C did not have a built-in Boolean type. It was (and still is) common for programmers to define their own Boolean types, for example:
typedef int BOOL;
#define FALSE 0
#define TRUE 1
or
typedef enum { false, true } bool;
Any such type is at least 1 byte in size, and can store values other than 0 or 1, so equality comparisons to 0 or 1 are potentially unsafe.
C99 added a built-in type _Bool, with conversion semantics similar to those for bool in C++; it can also be referred to as bool if you have #include <stdbool.h>.
In either C or C++, code whose behavior is undefined can potentially store a value other than 0 or 1 in a bool object. For example, this:
bool b;
*(char*)&b = 2;
will (probably) store the value 2 in b, but a C++ compiler may assume that its value is either 0 or 1; a comparison like b == 0 or b == true may either succeed or fail.
My advice:
Don't write code that stores strange values in bool objects.
Don't compare bool values for equality or inequality to 0, 1, false, or true.
In your example:
bool f() { return 42; }
Assuming this is either C++ or C with <stdbool.h>, this function will return true or, equivalently, 1, since the conversion of 42 to bool yields 1.
if (f() == 1)
printf("hello");
Since you haven't constructed any strange bool values, this is well behaved and will print "hello".
But there's no point in making the comparison explicitly. f() is already of type bool, so it's already usable as a condition. You can (and probably should) just write:
if (f())
printf("hello");
Writing f() == 1 is no more helpful than writing (f() == 1) == 1.
In a real program, presumably you'll have given your function a meaningful name that makes it clear that its value represents a condition:
if (greeting_required())
printf("hello");
The only real trick that I know of that is useful in pre-C99 environments is the double negation
int a = 42;
if ( (!!a) != 0 ) printf("Hello\n");
this will print Hello because the result of the !! operation is a boolean that is true when the value is non-zero and false otherwise. But this costs you two negations to get the boolean you want; in modern standards this is redundant, because you will get the same result without the !!, and that result is guaranteed by the language.

Engineered bool compares equal to both true and false, why?

The example below compiles, but the output is rather strange:
#include <iostream>
#include <cstring>

struct A
{
    int a;
    char b;
    bool c;
};

int main()
{
    A v;
    std::memset( &v, 0xff, sizeof(v) );
    std::cout << std::boolalpha << ( true == v.c ) << std::endl;
    std::cout << std::boolalpha << ( false == v.c ) << std::endl;
}
the output is:
true
true
Can someone explains why?
If it matters, I am using g++ 4.3.0
Found this in the C++ standard, section 3.9.1 "Fundamental types" (note the magic footnote 42):
6. Values of type bool are either true or false. 42)
42) Using a bool value in ways described by this International Standard as ‘‘undefined,’’ such as by examining the value of an uninitialized automatic variable, might cause it to behave as if it is neither true nor false.
This is not perfectly clear for me, but seems to answer the question.
The result of overwriting memory location used by v is undefined behaviour.
Everything may happen, according to the standard (including your computer flying off and eating your breakfast).
A boolean value whose memory is set to a value that is not one or zero has undefined behaviour.
I think I found the answer. 3.9.1/6 says:
Values of type bool are either true or false.42) [Note: there are no signed, unsigned, short, or long bool types or values. ] As described below, bool values behave as integral types. Values of type bool participate in integral promotions (4.5).
Where note 42 says:
42) Using a bool value in ways described by this International Standard as ‘‘undefined,’’ such as by examining the value of an uninitialized automatic variable, might cause it to behave as if it is neither true nor false.
I can't seem to find anything in the standard that indicates why this would happen (quite possibly my fault here); this does include the reference provided by 7vies, which is not in itself very helpful. It is definitely undefined behavior, but I can't explain the specific behavior that is observed by the OP.
As a practical matter, I'm very surprised that the output is
true
true
Using VS2010, the output is the much more easy to explain:
false
false
In this latter case, what happens is:
comparisons to boolean true are implemented by the compiler as tests for equality to 0x01, and since 0xff != 0x01 the result is false.
same goes for comparisons to boolean false, only the value compared with is now 0x00.
I can't think of any implementation detail that would cause false to compare equal to the value 0xff when interpreted as bool. Anyone have any ideas about that?
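For comparison, a sketch (mine, not from the answers) that keeps the bool member in a valid state by initializing and assigning values instead of filling raw memory:
#include <iostream>

struct A
{
    int a;
    char b;
    bool c;
};

int main()
{
    A v{};        // value-initialization zeroes every member, so v.c is a genuine false
    v.c = true;   // assign through the bool itself, never through its raw bytes
    std::cout << std::boolalpha << ( true == v.c ) << std::endl;   // true
    std::cout << std::boolalpha << ( false == v.c ) << std::endl;  // false
}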

MSVC++: Strangeness with unsigned ints and overflow

I've got the following code:
#include <iostream>
using namespace std;
int main(int argc, char *argv[])
{
string a = "a";
for(unsigned int i=a.length()-1; i+1 >= 1; --i)
{
if(i >= a.length())
{
cerr << (signed int)i << "?" << endl;
return 0;
}
}
}
If I compile in MSVC with full optimizations, the output I get is "-1?". If I compile in Debug mode (no optimizations), I get no output (expected.)
I thought the standard guaranteed that unsigned integers overflowed in a predictable way, so that when i = (unsigned int)(-1), i+1 = 0, and the loop condition i + 1 >= 1 fails. Instead, the test is somehow passing. Is this a compiler bug, or am I doing something undefined somewhere?
I remember having this problem in 2001. I'm amazed it's still there. Yes, this is a compiler bug.
The optimiser is seeing
i + 1 >= 1;
Theoretically, we can optimise this by putting all of the constants on the same side:
i >= (1-1);
Because i is unsigned, it will always be greater than or equal to zero.
See this newsgroup discussion here.
ISO14882:2003, section 5, paragraph 5:
If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined, unless such an expression is a constant expression (5.19), in which case the program is ill-formed.
(Emphasis mine.) So, yes, the behavior is undefined. The standard makes no guarantees of behavior in the case of integer over/underflow.
Edit: The standard seems slightly conflicted on the matter elsewhere.
Section 3.9.1.4 says:
Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer.
But section 4.7.2 and .3 says:
2) If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). ]
3) If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.
(Emphasis mine.)
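A small sketch (mine) of the modulo-2^n behaviour that 3.9.1.4 describes, which is why the loop condition in the question is expected to fail once i wraps around:
#include <iostream>
#include <limits>

int main()
{
    unsigned int i = std::numeric_limits<unsigned int>::max();  // same value as (unsigned int)(-1)
    std::cout << i + 1 << '\n';          // 0: unsigned arithmetic wraps modulo 2^n
    std::cout << (i + 1 >= 1) << '\n';   // 0: so the loop condition should be false here
}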
I'm not certain, but I think you are probably running foul of a bug.
I suspect the trouble is in how the compiler is treating the for control. I could imagine the optimizer doing:
for(unsigned int i=a.length()-1; i+1 >= 1; --i) // As written
for (unsigned int i = a.length()-1; i >= 0; --i) // Noting 1 appears twice
for (unsigned int i = a.length()-1; ; --i) // Because i >= 0 at all times
Whether that is what is happening is another matter, but it might be enough to confuse the optimizer.
You would probably be better off using a more standard loop formulation:
for (unsigned i = a.length(); i-- > 0; )
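A compilable sketch of that formulation (mine, not from the answer):
#include <iostream>
#include <string>

int main()
{
    std::string a = "abc";
    // i takes the values a.length()-1, ..., 1, 0; the loop ends cleanly
    // because the test i-- > 0 fails once i has reached 0.
    for (unsigned int i = a.length(); i-- > 0; )
        std::cout << i << ' ' << a[i] << '\n';
}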
Yup, I just tested this on Visual Studio 2005, it definitely behaves differently in Debug and Release. I wonder if 2008 fixes it.
Interestingly, it complained about your implicit cast from size_t (.length's result) to unsigned int, but had no problem generating bad code.