The example below compiles, but the output is rather strange:
#include <iostream>
#include <cstring>
struct A
{
int a;
char b;
bool c;
};
int main()
{
A v;
std::memset( &v, 0xff, sizeof(v) );
std::cout << std::boolalpha << ( true == v.c ) << std::endl;
std::cout << std::boolalpha << ( false == v.c ) << std::endl;
}
The output is:
true
true
Can someone explain why?
If it matters, I am using g++ 4.3.0.
Found this in the C++ standard, section 3.9.1 "Fundamental types" (note the magic footnote 42):
6. Values of type bool are either true or false. 42)
42) Using a bool value in ways described by this International Standard as ‘‘undefined,’’ such as by examining the value of an uninitialized automatic variable, might cause it to behave as if it is neither true nor false.
This is not perfectly clear to me, but it seems to answer the question.
The result of overwriting memory location used by v is undefined behaviour.
Anything may happen, according to the standard (including your computer flying off and eating your breakfast).
A boolean value whose memory is set to a value that is not one or zero has undefined behaviour.
I think I found the answer. 3.9.1-6 says:
Values of type bool are either true or false.42) [Note: there are no signed, unsigned, short, or long bool types or values. ] As described below, bool values behave as integral types. Values of type bool participate in integral promotions (4.5).
Where note 42 says:
42) Using a bool value in ways described by this International Standard as ‘‘undefined,’’ such as by examining the value of an uninitialized automatic variable, might cause it to behave as if it is neither true nor false.
I can't seem to find anything in the standard that indicates why this would happen (most possibly my fault here) -- this does include the reference provided by 7vies, which is not in itself very helpful. It is definitely undefined behavior, but I can't explain the specific behavior that is observed by the OP.
As a practical matter, I'm very surprised that the output is
true
true
Using VS2010, the output is much easier to explain:
false
false
In this latter case, what happens is:
Comparisons to boolean true are implemented by the compiler as tests for equality to 0x01, and since 0xff != 0x01 the result is false.
The same goes for comparisons to boolean false, only the value compared with is now 0x00.
I can't think of any implementation detail that would cause false to compare equal to the value 0xff when interpreted as bool. Does anyone have any ideas about that?
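One way to reason about both outputs is to model the comparison on the raw byte instead of on a bool, which keeps the experiment well defined. The sketch below is only a model: the function names are made up for illustration, strategy A mirrors the VS2010 description above, and strategy B is a guess at how a compiler that assumes the byte is always 0 or 1 could end up printing true twice. It has not been checked against the code g++ 4.3.0 actually generates.

```cpp
#include <cstdint>

// Strategy A: compare the stored byte against the canonical encodings.
constexpr bool equals_true_by_compare(std::uint8_t raw)  { return raw == 0x01; }
constexpr bool equals_false_by_compare(std::uint8_t raw) { return raw == 0x00; }

// Strategy B (hypothetical): trust that the byte is 0 or 1.
// "x == true" is then the byte itself, "x == false" is the byte with its
// lowest bit flipped, and either result counts as true when it is non-zero.
constexpr bool equals_true_by_trust(std::uint8_t raw)  { return raw != 0; }
constexpr bool equals_false_by_trust(std::uint8_t raw) { return (raw ^ 0x01) != 0; }

// Both strategies agree on the two valid encodings.
static_assert(equals_true_by_compare(1) && equals_true_by_trust(1), "1 is true");
static_assert(equals_false_by_compare(0) && equals_false_by_trust(0), "0 is false");
```

For the byte 0xff, strategy A answers false to both questions and strategy B answers true to both, which matches the two outputs reported above.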
Consider the program below.
All comparisons are true with a recent gcc but only the value 1 compares equal with the Visual Studio commandline compiler v. 19.16.27031.1 for x86.
I believe that it's generally OK to write into PODs through char pointers; but is there wording in the standard about writing funny values into bool variables? If it is allowed, is there wording about the behavior in comparisons?
#include <iostream>
using namespace std;
void f()
{
if(sizeof(bool) != 1)
{
cout << "sizeof(bool) != 1\n";
return;
}
bool b;
*(char *)&b = 1;
if(b == true) { cout << (int) *(char *)&b << " is true\n"; }
*(char *)&b = 2;
if(b == true) { cout << (int) *(char *)&b << " is true\n"; }
*(char *)&b = 3;
if(b == true) { cout << (int) *(char *)&b << " is true\n"; }
}
int main()
{
f();
}
P.S. gcc 8.3 uses a test instruction to effectively check for non-zero while gcc 9.1 explicitly compares with 1, making only that comparison true. Perhaps this godbolt link works.
No. This is not OK.
Writing arbitrary data into a bool is UB (see "What is the strict aliasing rule?") and is similar to "Does the C++ standard allow for an uninitialized bool to crash a program?"
*(char *)&b = 2;
This type-punning hack invokes UB. Depending on how your compiler implements bool and which optimizations it is allowed to perform, you could have demons flying out of your nose.
Consider:
bool b;
b = char{2}; // 1
(char&)b = 2; // 2
*(char*)&b = 2; // 3
Here, lines 2 and 3 have the same meaning, but 1 has a different meaning. In line 1, since the value being assigned to the bool object is nonzero, the result is guaranteed to be true. However, in lines 2 and 3, the object representation of the bool object is being written to directly.
It is indeed legal to write to an object of any non-const type through an lvalue of type char, but:
In C++17, the standard does not specify the representation of bool objects. The bool type may have padding bits, and may even be larger than char. Thus, any attempt to write directly to a bool value in this way may yield an invalid (or "trap") object representation, which means that subsequently reading that value will yield undefined behaviour. Implementations may (but are not required by the standard to) define the representation of bool objects.
In C++20, my understanding is that thanks to P1236R1, there are no longer any trap representations, but the representation of bool is still not completely specified. The bool object may still be larger than char, so if you write to only the first byte of it, it can still contain an indeterminate value, yielding UB when accessed. If bool is 1 byte (which is likely), then the result is unspecified: it must yield some valid value of the underlying type (which will most likely be char or its signed or unsigned cousin) but the mapping of such values to true and false remains unspecified.
Writing any integer values into a bool through a pointer to a type other than bool is undefined behavior, because those may not match the compiler's representation of the type. And yes, writing something other than 0 or 1 will absolutely break things: compilers often rely on the exact internal representation of boolean true.
But bool b = 3 is fine, and just sets b to true (the rule for converting from integer types to bool is, any nonzero value becomes true and zero becomes false).
It's OK to assign values other than true and false to a variable of type bool.
The RHS is converted to a bool by using the standard conversion sequence to true/false before the value is assigned.
However, what you are trying to do is not OK.
*(char *)&b = 2; // Not OK
*(char *)&b = 3; // Not OK
Even assigning 1 and 0 by using that mechanism is not OK.
*(char *)&b = 1; // Not OK
*(char *)&b = 0; // Not OK
The following statements are OK.
b = 2; // OK
b = 3; // OK
Update, in response to OP's comment.
From the standard/basic.types#basic.fundamental-6:
Values of type bool are either true or false.
The standard does not mandate that true be represented as 1 and/or false be represented as 0. An implementation can choose a representation that best suits their needs.
The standard goes on to say this about value of bool types:
Using a bool value in ways described by this International Standard as “undefined,” such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false.
Storing the value char(1) or char(0) in its memory location indirectly does not guarantee that the values will be properly converted to true/false. Since those values may not represent either true or false in an implementation, accessing them would lead to undefined behavior.
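If the byte comes from outside the program (a file, a network packet, a memset buffer), a well-defined alternative is to keep it as an unsigned char and convert it by value, so the compiler stores whatever encoding it uses for true and false. A minimal sketch, where bool_from_byte and count_set are illustrative names that do not appear in the question:

```cpp
#include <cstddef>

// Convert an untrusted byte to bool by value; any non-zero byte becomes true.
inline bool bool_from_byte(unsigned char raw) {
    return raw != 0;
}

// Count how many bytes of a buffer convert to true, without ever
// reinterpreting the buffer's storage as bool objects.
inline std::size_t count_set(const unsigned char* bytes, std::size_t n) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += bool_from_byte(bytes[i]);  // bool promotes to 0 or 1
    }
    return total;
}
```

Because the bool is produced by a comparison rather than by overwriting its storage, the bytes 2 and 3 from the question simply become true.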
In general, it's perfectly fine to assign values other than 0 or 1 to a bool:
7.3.14 Boolean conversions
[conv.bool]
1 A prvalue of arithmetic, unscoped enumeration, pointer, or pointer-to-member type can be converted to a prvalue of type bool. A zero value, null pointer value, or null member pointer value is converted to false; any other value is converted to true.
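The conversion rule quoted above can be checked at compile time. The helper names below (from_int, from_double, from_pointer) are invented for this sketch; the only language feature they rely on is the boolean conversion itself.

```cpp
// Each helper relies only on the boolean conversion quoted above:
// zero and null convert to false, every other value converts to true.
constexpr bool from_int(int v)       { return static_cast<bool>(v); }
constexpr bool from_double(double v) { return static_cast<bool>(v); }
inline bool from_pointer(const void* p) { return static_cast<bool>(p); }

// Converting the result back to int can only give 0 or 1.
static_assert(from_int(3), "non-zero int converts to true");
static_assert(static_cast<int>(from_int(3)) == 1, "true converts back to 1");
static_assert(!from_int(0), "zero converts to false");
static_assert(from_double(0.5), "non-zero double converts to true");
```

None of these helpers touches the object representation of a bool, which is why they stay well defined for any input value.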
But your casting is another question entirely.
Be careful thinking it's ok to write to types through pointers to something else. You can get very surprising results, and the optimizer is allowed to assume certain such things are not done. I don't know all the rules for it, but the optimizer doesn't always follow writes through pointers to different types (it is allowed to do all sorts of things in the presence of undefined behavior!) But beware, code like this:
bool f()
{
bool a = true;
bool b = true;
*reinterpret_cast<char*>(&a) = 1;
*reinterpret_cast<char*>(&b) = 2;
return a == b;
}
Live: https://godbolt.org/z/hJnuSi
With optimizations:
g++: -> true (but the value is actually 2)
clang: -> false
int main() {
std::cout << f() << "\n"; // g++ prints 2!!!
}
Though f() returns a bool, g++ actually prints out 2 in main here. Probably not expected.
I am interested in whether the standard says anything about the possible values of a bool after casting it to an integer type.
For example, the following code:
#include <iostream>
using namespace std;
int main() {
bool someValue=false;
*((int*)(&someValue)) = 50;
cout << someValue << endl;
return 0;
}
prints 1 even though it's forced to store the value 50. Does the standard specify anything about this? Or is the compiler generating some method for type bool, such as:
operator int(){
return myValue !=0 ? 1 : 0;
}
Also, why is a cast like the following:
reinterpret_cast<int>(someValue) = 50;
forbidden with error
error: invalid cast from type 'bool' to type 'int'
(For all of the above I used the GCC 5.1 compiler.)
The way you are using it exhibits UB, because you write outside of the bool variable's boundaries AND you break the strict aliasing rule.
However, if you have a bool and want to use it as an int (this usually happens when you want to index into an array based on some condition), the standard mandates that a true bool converts to 1 and a false bool converts to 0, no matter what (UB obviously excluded).
For example, this is guaranteed to output 52 as long as should_add == true.
#include <iostream>
int main(){
int arr[] = {0, 10};
bool should_add = 123;
int result = 42 + arr[should_add];
std::cout << result << '\n';
}
This line *((int*)(&someValue)) = 50; is at least non-standard. The implementation could use a smaller size for bool (say 1 or 2 bytes) than for int (say 4 bytes). In that case, you would write past the variable, possibly overwriting another variable.
And anyway, as you were told in the comments, thanks to the strict aliasing rule almost any access through a cast pointer can be treated as undefined behaviour by a compiler. The only almost legal one (as far as the strict aliasing rule is concerned) would be:
*((char *) &someValue) = 50;
on a little endian system, and
*(((char *) &someValue) + sizeof(bool) - 1) = 50;
on a big endian one (byte access has still not been forbidden).
Anyway, as the representation of bool is not specified by the standard, directly writing something into a bool can lead to true or false depending on the implementation. For example, one implementation could consider only the lowest-order bit (true if val&1 is 1, else false), while another could consider all bits (true for any non-zero value, false only for 0). The only thing the standard says is that converting a 0 yields false and converting a non-zero value yields true.
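The two readings mentioned above can be written down as functions on the raw byte. This is a sketch of hypothetical implementations, not a description of any particular compiler; low_bit_rule and any_bit_rule are names invented here.

```cpp
// Two hypothetical ways an implementation could interpret the stored byte.
constexpr bool low_bit_rule(unsigned char raw) { return (raw & 1) != 0; }
constexpr bool any_bit_rule(unsigned char raw) { return raw != 0; }

// The rules agree on the encodings 0 and 1 ...
static_assert(low_bit_rule(1) && any_bit_rule(1), "1 reads as true");
static_assert(!low_bit_rule(0) && !any_bit_rule(0), "0 reads as false");
```

For the value 50 from the question the rules disagree: 50 is even, so low_bit_rule(50) is false while any_bit_rule(50) is true.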
But what is mandated by the standard is the conversion from bool to int:
4.5 Integral promotions [conv.prom]
...A prvalue of type bool can be converted to a prvalue of type int, with false becoming zero and true becoming one.
So this fully explains why displaying a bool can only give 0 or 1, even though, since the previous operation invoked UB, anything could have happened here, including this output.
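The promotion rule is easy to exercise on its own, without any pointer casts. In the sketch below, as_int and pick are illustrative names; pick uses a valid bool as an array index, the same idea as indexing an array based on a condition.

```cpp
// Integral promotion: false becomes 0 and true becomes 1.
constexpr int as_int(bool b) { return b; }

// Branch-free selection that indexes a two-element table with a bool.
inline int pick(bool cond, int if_false, int if_true) {
    const int table[2] = {if_false, if_true};
    return table[cond];
}

static_assert(as_int(true) == 1 && as_int(false) == 0, "bool promotes to 0 or 1");
```

This only holds for bool objects that were produced by ordinary conversions; a bool whose storage was overwritten through a cast pointer is outside these guarantees.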
You invoked Undefined Behaviour - shame on you
I have the following program:
#include <iostream>
#include <cmath>
int main() {
double a = 1;
double b = nan("");
std::cout << (a > b) << std::endl;
std::cout << (b > a) << std::endl;
return 0;
}
Output:
0
0
In general, from the meaning of NaN (not a number), it is clear that any operation with NaN is essentially pointless. From the IEEE-754 material that I found on the internet, I learned that if at least one of the operands in the FPU is NaN, the result is also NaN, but I found nothing about comparisons between a normal value and NaN, as in the example above.
What does the standard say about it?
The C++ standard does not say how operations on NaNs behave. It is left unspecified. So, as far as C++ is concerned, any result is possible and allowed.
ANSI/IEEE Std 754–1985 says:
5.7. Comparison
... Every NaN shall compare unordered with everything, including itself. ...
What unordered means exactly is shown in the Table 4 in the same section. But in short, this means that the comparison shall return false, if any of the operands is NaN, except != shall return true.
The 0 you're seeing means false in this case as that's what the stream shows for false by default. If you want to see it as true or false use std::boolalpha:
std::cout << std::boolalpha << (a > b) << std::endl;
When comparing floating point values where one of the values is NaN, x<y, x>y, x<=y, x>=y, and x==y all evaluate to false, whereas x!=y is always true. Andrew Koenig has a good article on this on the Dr Dobbs website.
When you think about it the result cannot be nan since comparison operators need to return a boolean which can only have 2 states.
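The six results can be collected in one place. This sketch assumes an IEEE 754 floating point implementation (std::numeric_limits<double>::is_iec559 is true) and no aggressive options such as fast-math; Comparisons, compare_all and unordered are names invented for the example.

```cpp
#include <cmath>
#include <limits>

struct Comparisons {
    bool lt, gt, le, ge, eq, ne;
};

// Evaluate all six relational operators for a pair of doubles.
inline Comparisons compare_all(double x, double y) {
    return Comparisons{x < y, x > y, x <= y, x >= y, x == y, x != y};
}

// True when at least one operand is NaN, without relying on operator results.
inline bool unordered(double x, double y) {
    return std::isunordered(x, y);
}
```

Calling compare_all(1.0, std::numeric_limits<double>::quiet_NaN()) under those assumptions leaves every member false except ne.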
0 here means false.
NaN is not equal or comparable to any value, so the result of the operation is false (0).
Well, on top of user2079303's pretty good answer, there are two kinds of NaN: quiet NaN and signaling NaN. You can check std::numeric_limits<T>::has_signaling_NaN to see whether signaling NaN is available on your platform. If it is true and the value contains std::numeric_limits<T>::signaling_NaN(), then
When a signaling NaN is used as an argument to an arithmetic expression, the appropriate floating-point exception may be raised and the NaN is "quieted", that is, the expression returns a quiet NaN.
To really get an FP exception you might want to set the FPU control word (for the x87 unit) or the MXCSR register (for the SSE2+ unit). This is true for the x86/x64 platform; check your platform docs for similar functionality.
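Short of changing control registers, the standard <cfenv> status flags offer a portable way to see whether an operation flagged "invalid". The sketch below is tentative: it assumes the platform defines the optional FE_INVALID macro, uses volatile to discourage constant folding, and whether the flag is actually set depends on the compiler, its options and the hardware. The function names are made up for illustration.

```cpp
#include <cfenv>
#include <cmath>
#include <limits>

// Add 1.0 to a signaling NaN (or to a quiet NaN when the platform has no
// signaling NaN). volatile keeps the compiler from folding the addition.
inline double add_one_to_snan() {
    typedef std::numeric_limits<double> lim;
    volatile double input =
        lim::has_signaling_NaN ? lim::signaling_NaN() : lim::quiet_NaN();
    volatile double result = input + 1.0;
    return result;
}

// Report whether that addition set the FE_INVALID status flag.
// This only inspects the flag; it does not enable trapping.
inline bool snan_add_sets_invalid_flag() {
    std::feclearexcept(FE_ALL_EXCEPT);
    add_one_to_snan();
    return std::fetestexcept(FE_INVALID) != 0;
}
```

The sum is a NaN either way; only the status flag distinguishes a signaling input from a quiet one, and only on platforms that report it.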
While doing a conversion test I encountered some strange behavior in C++.
Context
The online C++ reference indicates that the return value of std::numeric_limits<double>::max() (defined in <limits>) should be DBL_MAX (defined in float.h). In my test, when I print these values out, both are indeed exactly the same. However, when I cast them from double to int, strange things came out.
'Same' input, different results?
int32_t t1 = (int) std::numeric_limits<double>::max(); sets t1 to INT_MIN, but int32_t t2 = (int) DBL_MAX; sets t2 to INT_MAX. The same is true when the cast is done using static_cast<int>.
'Same' input, same results in similar situation
However, if I define a function
int32_t doubleToInt(double dvalue) {
return (int) dvalue;
}
both doubleToInt(std::numeric_limits<double>::max()) and doubleToInt(DBL_MAX) return INT_MIN.
To help make sense of things, I implemented a similar program in Java. There, all casts returned the value of INT_MAX, regardless of being in a function or not.
Can someone point out the reason why in C++ the result is INT_MIN in some cases, and INT_MAX in the others? What should the expected behaviour be like when casting DBL_MAX to int in C++?
Sample Code for C++
#include <iostream>
#include <limits>
#include <cstdint>
#include <float.h>
#include <stdlib.h>
#include <stdio.h>
using namespace std;
template <typename T, typename D> D cast(T a, D b) { return (D) a;}
int main()
{
int32_t t1 = 9;
std::cout << std::numeric_limits<double>::max() << std::endl;
std::cout << DBL_MAX << std::endl;
std::cout << (int32_t) std::numeric_limits<double>::max() << std::endl;
std::cout << (int32_t) DBL_MAX << std::endl;
std::cout << cast(std::numeric_limits<double>::max(), t1) << std::endl;
std::cout << cast(DBL_MAX, t1) << std::endl;
return 0;
}
For completeness: I am using cygwin gcc and java 8.
Attempting to convert a floating point number greater than INT_MAX to an int is undefined behaviour:
A prvalue of a floating point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type. (§4.9 [conv.fpint], para. 1)
So a compiler can produce any value (or even do something else, like throw an exception) for the conversion. Different compilers can do different things. The same compiler can do different things at different times.
There is no real point attempting to understand why a particular instance of undefined behaviour shows the result it shows (unless you are trying to reverse engineer the compiler, and even then UB is not usually particularly interesting). Rather, you need to concentrate on avoiding undefined behaviour.
For example, since any out-of-range cast of a floating value to an integer is undefined, you need to ensure that such casts do not involve out-of-range values. Unlike some other languages [note 1], the C++ standard does not provide an easily-recognizable result which can be tested for, so you need to test before doing the cast.
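A range test before the cast could look like the sketch below. It assumes a 32-bit target type and IEEE 754 doubles, so that both bounds are exactly representable; try_to_int32 and saturating_to_int32 are hypothetical helper names, and the saturating variant merely imitates the clamping that Java is described as performing.

```cpp
#include <cstdint>
#include <limits>

// Store the truncated value in out and return true only when it fits in
// std::int32_t. NaN fails both comparisons, so it is rejected as well.
inline bool try_to_int32(double d, std::int32_t& out) {
    // The bounds are exclusive because truncation discards the fraction:
    // anything strictly between them truncates into the int32 range.
    if (!(d > -2147483649.0 && d < 2147483648.0)) {
        return false;
    }
    out = static_cast<std::int32_t>(d);
    return true;
}

// Clamping alternative: out-of-range values saturate and NaN maps to 0.
inline std::int32_t saturating_to_int32(double d) {
    if (d != d) return 0;  // only NaN compares unequal to itself
    if (d >= 2147483648.0) return std::numeric_limits<std::int32_t>::max();
    if (d <= -2147483649.0) return std::numeric_limits<std::int32_t>::min();
    return static_cast<std::int32_t>(d);
}
```

With either helper the cast itself is only ever executed on values whose truncation fits in the destination type, so the undefined case is never reached.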
Note that DBL_MAX is a macro, whose substitution is a string representing an approximation of the largest representable floating point number. std::numeric_limits<double>::max(), on the other hand, is the precise largest representable floating point number.
The difference should not normally be noticeable, but (as indicated by a note in the standard in §5.20 [expr.const], para 6):
Since this International Standard imposes no restrictions on the accuracy of floating-point operations, it is unspecified whether the evaluation of a floating-point expression during translation yields the same result as the evaluation of the same expression (or the same operations on the same values) during program execution.
Although std::numeric_limits<double>::max() is declared a constexpr, the cast to int is not a constant expression (as per §5.20/p2.5) precisely because its behaviour is undefined.
Notes
In Java, for example, the conversion is well defined. See the Java Language Specification for details.
This is my code:
#include <cstring>
#include <iostream>
int main() {
bool a;
memset(&a, 0x03, sizeof(bool));
if (a) {
std::cout << "a is true!" << std::endl;
}
if (!a) {
std::cout << "!a is true!" << std::endl;
}
}
It outputs:
a is true!
!a is true!
It seems that the ! operator on bool only inverts the last bit, but every value that does not equal 0 is treated as true. This leads to the shown behavior, which is logically wrong. Is that a fault in the implementation, or does the specification allow this? Note that the memset can be omitted, and the behavior would probably be the same because a contains memory garbage.
I'm on gcc 4.4.5, other compilers might do it differently.
The standard (3.9.1/6 Fundamental types) says:
Values of type bool are either true or false.
....
Using a bool value in ways described by this International Standard as “undefined,” such as by examining the value of an
uninitialized automatic object, might cause it to behave as if it is neither true nor false.
Your program's use of memset leads to undefined behaviour. The consequence of which might be that the value is neither true nor false.
It's not "logically wrong", it's undefined behaviour. bool is only supposed to contain one of two values, true or false. Assigning a value to it will cause a conversion to one of these values. Breaking type-safety by writing an arbitrary byte value on top of its memory (or, as you mention, leaving it uninitialised) will not, so you might well end up with a value that's neither true nor false.
Internally it is likely using a bitwise not (~ operator) to invert it, which would work when the bool was either zero or all ones:
a = 00000000 (false)
!a = 11111111 (true)
However if you set it to three:
a = 00000011 (true)
!a = 11111100 (also true)
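Both guesses about how the negation is implemented can be modelled on a plain unsigned char, which avoids the undefined behaviour of the original program. This is a sketch of two hypotheses, not of what gcc 4.4.5 actually emits; the function names are invented here.

```cpp
// Hypothesis from the question: "!" flips only the lowest bit.
constexpr unsigned char not_by_xor(unsigned char raw) {
    return static_cast<unsigned char>(raw ^ 0x01);
}

// Hypothesis from the answer above: "!" is a bitwise complement of the byte.
constexpr unsigned char not_by_complement(unsigned char raw) {
    return static_cast<unsigned char>(~raw);
}

// "if (x)" modelled as a test for a non-zero byte.
constexpr bool truthy(unsigned char raw) { return raw != 0; }
```

For the byte 3, both models give a non-zero result (2 and 0xfc respectively), so both reproduce the output where a and !a are true at once. Note that the complement model only inverts a valid true correctly if true is stored as all ones, as the answer assumes; for a byte of 1 it yields 0xfe, which is still non-zero.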