Say I have a class C that I want to be able to implicitly cast to bool to use in if statements.
class C {
public:
...
operator bool() { return data ? true : false; }
private:
void * data;
};
and
C c;
...
if (c) ...
But the cast operator contains a conditional, which is technically overhead (even if relatively insignificant). If data were public I could write if (c.data) instead, which is entirely possible and involves no conditional. I doubt the compiler would perform any implicit conversion involving a conditional in that latter scenario, since it will likely generate a "jump if zero" or "jump if not zero", which doesn't really need any Boolean value - something the CPU most likely has no notion of anyway.
My question is whether the typecast operator overload will indeed be less efficient than directly using the data member.
Note that I did establish that if the typecast directly returns data it also works, probably via the same kind of implicit conversion (hypothetical, and not really happening in practice) that would be used in the case of if (c.data).
Edit: Just to clarify, the point is actually somewhat hypothetical. The dilemma is that Boolean is itself an abstract construct (one that didn't initially exist in C/C++); in reality it is just integers. As I mentioned, the typecast can directly return data or use != instead, but neither is very readable, and even that is not the real issue. I don't really know how to word it better: the class C has a void * that is effectively an integer, the CPU has conditional jumps that operate on integers, and the issue is that abiding by the hypothetical Boolean construct sitting in the middle mandates the extra conditional. Dunno if that "clarification" made things any clearer though...
My question is whether the typecast operator overload will indeed be less efficient than directly using the data member.
Only examining your compiler's output - with the specific optimisation flags you'd like to use - can tell you for sure, and even then the answer might change after some seemingly irrelevant edit like adding an extra variable somewhere in the calling context, or with the next compiler release, and so on.
More generally, C++ wouldn't be renowned for speed if the optimisers didn't tend to handle this kind of situation perfectly, so your odds are very good.
Further, write working code then profile it and you'll learn a lot more about what performance problems are actually significant.
It depends on how smart your compiler's optimizer is. I think it should be smart enough to remove the useless ? true : false operation, because the typecast operator should be inlined.
Or you could just write this and not worry about it:
operator bool() { return data; }
Since there's a built-in implicit typecast from void* to bool, data gets typecast on the way out the function.
I don't remember if the conditional in if expects bool or void*; at one point, before C++ added bool, it was the latter. (operator! in the iostream classes returned void* back then.)
On modern compilers these two functions produce the same machine code:
bool toBool1(void* ptr) {
return ptr ? true : false;
}
bool toBool2(void* ptr) {
return ptr;
}
Demo
So it really doesn't matter.
Related
I want to overload the == operator for a simple struct
struct MyStruct {
public:
int a;
float b;
bool operator==( ) { }
};
All the examples I'm seeing seem to pass the value by reference using a &.
But I really want to pass these structs by value.
Is there anything wrong with me writing this as
bool operator== (MyStruct another) { return ( (a==another.a) && (b==another.b) ); }
It should really not matter, except that you pay the penalty of a copy when you pass by value. That matters if the struct is really heavy; in the simple example you quote, there may not be a big difference.
That being said, passing by const reference makes more sense, since it expresses the intent of the overloaded operator== clearly. const makes sure the overloaded function doesn't accidentally modify the object, and passing by reference saves you from making a copy. For operator==, there is no need to pass a copy just for comparison purposes.
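For concreteness, a minimal sketch of the const-reference version (restating the struct from the question):
struct MyStruct {
    int a;
    float b;

    // Compare by const reference: no copy is made, and const documents
    // that comparison modifies neither operand.
    bool operator==(const MyStruct& other) const {
        return (a == other.a) && (b == other.b);
    }
};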
If you are concerned about consistency, it's better to switch the other pass by value instances to pass by const ref.
While being consistent is a laudable goal, one shouldn't overdo it. A program containing only 'A' characters would be very consistent, but hardly useful. The argument-passing mechanism is not something you choose out of consistency; it is a technical decision based on certain technical considerations.
For example, in your case, passing by value could potentially lead to better performance, since the struct is small enough that under the AMD64 ABI (the one used on any 64-bit Intel/AMD chip) it will be passed in registers, saving the time normally associated with dereferencing.
On the other hand, in your case it is reasonable to assume that the function will be inlined, in which case the passing scheme will not matter at all (since nothing is actually passed). This is borne out by the codegen here (no call to operator== exists in the generated assembly): https://gcc.godbolt.org/z/G7oEgE
I just came onto a project with a pretty huge code base.
I'm mostly dealing with C++ and a lot of the code they write uses double negation for their boolean logic.
if (!!variable && (!!api.lookup("some-string"))) {
do_some_stuff();
}
I know these guys are intelligent programmers, it's obvious they aren't doing this by accident.
I'm no seasoned C++ expert; my only guess at why they are doing this is that they want to be absolutely sure that the value being evaluated is the actual boolean representation. So they negate it, then negate that again to get it back to its actual boolean value.
Is this correct, or am I missing something?
It's a trick to convert to bool.
It's actually a very useful idiom in some contexts. Take these macros (example from the Linux kernel). For GCC, they're implemented as follows:
#define likely(cond) (__builtin_expect(!!(cond), 1))
#define unlikely(cond) (__builtin_expect(!!(cond), 0))
Why do they have to do this? GCC's __builtin_expect treats its parameters as long and not bool, so there needs to be some form of conversion. Since they don't know what cond is when they're writing those macros, it is most general to simply use the !! idiom.
They could probably do the same thing by comparing against 0, but in my opinion, it's actually more straightforward to do the double-negation, since that's the closest to a cast-to-bool that C has.
This code can be used in C++ as well... it's a lowest-common-denominator thing. If possible, do what works in both C and C++.
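For illustration, a hypothetical usage sketch (GCC/Clang only, since __builtin_expect is a GCC extension; the function and its error handling are made up):
#include <cstdio>

#define likely(cond)   (__builtin_expect(!!(cond), 1))
#define unlikely(cond) (__builtin_expect(!!(cond), 0))

// The error path is expected to be rare, so we hint the compiler
// to lay out the common path as the fall-through branch.
int process(const char* buf) {
    if (unlikely(buf == nullptr)) {
        std::fprintf(stderr, "null buffer\n");
        return -1;
    }
    // ... common case ...
    return 0;
}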
The coders think that it will convert the operand to bool, but because the operands of && are already implicitly converted to bool, it's utterly redundant.
Yes it is correct and no you are not missing something. !! is a conversion to bool. See this question for more discussion.
It's a technique to avoid writing (variable != 0) - i.e. to convert from whatever type it is to a bool.
IMO, code like this has no place in systems that need to be maintained, because it is not immediately readable (hence this question in the first place).
Code must be legible; otherwise you leave a time-debt legacy for the future, as it takes time to understand something that is needlessly convoluted.
It side-steps a compiler warning. Try this:
int _tmain(int argc, _TCHAR* argv[])
{
int foo = 5;
bool bar = foo;
bool baz = !!foo;
return 0;
}
The 'bar' line generates a "forcing value to bool 'true' or 'false' (performance warning)" on MSVC++, but the 'baz' line sneaks through fine.
Legacy C developers had no Boolean type, so they often #define TRUE 1 and #define FALSE 0 and then used arbitrary numeric data types for Boolean comparisons. Now that we have bool, many compilers will emit warnings when certain types of assignments and comparisons are made using a mixture of numeric types and Boolean types. These two usages will eventually collide when working with legacy code.
To work around this problem, some developers use the following Boolean identity: !num_value returns bool true if num_value == 0; false otherwise. !!num_value returns bool false if num_value == 0; true otherwise. The single negation is sufficient to convert num_value to bool; however, the double negation is necessary to restore the original sense of the Boolean expression.
This pattern is known as an idiom, i.e., something commonly used by people familiar with the language. Therefore, I don't see it as an anti-pattern; if anything, I'd sooner call static_cast<bool>(num_value) one. The cast might very well give the correct results, but some compilers then emit a performance warning, so you still have to address that.
The other way to address this is to say, (num_value != FALSE). I'm okay with that too, but all in all, !!num_value is far less verbose, may be clearer, and is not confusing the second time you see it.
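A minimal illustration of the identity described above:
#include <cassert>

int main() {
    int num_value = 42;
    assert(!num_value == false);   // single negation: true only when the value is 0
    assert(!!num_value == true);   // double negation: original truth sense, now a bool
    num_value = 0;
    assert(!!num_value == false);
}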
Is operator! overloaded?
If not, they're probably doing this to convert the variable to a bool without producing a warning. This is definitely not a standard way of doing things.
!! was used to cope with original C++ which did not have a boolean type (as neither did C).
Example Problem:
Inside if(condition), the condition needs to evaluate to some type like double, int, void*, etc., but not bool as it does not exist yet.
Say a class int256 (a 256-bit integer) existed and all integer conversions/casts were overloaded.
int256 x = foo();
if (x) ...
To test whether x was "true" (non-zero), if (x) would convert x to some integer and then assess whether that integer was non-zero. A typical overload of (int)x would return only the least-significant bits of x, so if (x) would then be testing only those bits.
But C++ has the ! operator. An overloaded !x would typically examine all the bits of x, so if (!!x) is used to get back to the non-inverted logic.
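A sketch of that situation; int256 and its internals are hypothetical:
struct int256 {
    unsigned part[8];  // eight 32-bit limbs, part[0] holds the lowest bits

    // The conversion a pre-bool if (x) would pick: only the low limb
    // is consulted, so a value with only high bits set tests as "false".
    operator int() const { return static_cast<int>(part[0]); }

    // operator! examines every limb, so !!x tests the whole value.
    bool operator!() const {
        for (unsigned limb : part)
            if (limb) return false;
        return true;
    }
};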
Ref: Did older versions of C++ use the `int` operator of a class when evaluating the condition in an `if()` statement?
As Marcin mentioned, it might well matter if operator overloading is in play. Otherwise, in C/C++ it doesn't matter except if you're doing one of the following things:
direct comparison to true (or in C something like a TRUE macro), which is almost always a bad idea. For example:
if (api.lookup("some-string") == true) {...}
you simply want something converted to a strict 0/1 value. In C++ an assignment to a bool will do this implicitly (for those things that are implicitly convertible to bool). In C or if you're dealing with a non-bool variable, this is an idiom that I've seen, but I prefer the (some_variable != 0) variety myself.
I think in the context of a larger boolean expression it simply clutters things up.
If variable is of object type, it might have a ! operator defined but no cast to bool (or, worse, an implicit cast to int with different semantics). Calling the ! operator twice results in a conversion to bool that works even in the strange cases.
This may be an example of the double-bang trick (see The Safe Bool Idiom for more details). Here I summarize the first page of the article.
In C++ there are a number of ways to provide Boolean tests for classes.
An obvious way is the operator bool conversion operator.
// operator bool version
class Testable {
bool ok_;
public:
explicit Testable(bool b = true) : ok_(b) {}
operator bool() const { // use bool conversion operator
return ok_;
}
};
We can test the class like this:
Testable test;
if (test) {
std::cout << "Yes, test is working!\n";
}
else {
std::cout << "No, test is not working!\n";
}
However, operator bool is considered unsafe because it allows nonsensical operations such as test << 1; or int i = test.
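Using the Testable class above, both accidents compile without complaint:
void accidents(Testable test) {
    int i = test;       // compiles: the bool silently converts to int
    (void)i;
    (void)(test << 1);  // compiles: the bool promotes to int and is shifted
}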
Using operator! is safer because we avoid implicit conversion or overloading issues.
The implementation is trivial,
bool operator!() const { // use operator!
return !ok_;
}
The two idiomatic ways to test a Testable object are
Testable test;
if (!!test) {
std::cout << "Yes, test is working!\n";
}
if (!test) {
std::cout << "No, test is not working!\n";
}
The first version if (!!test) is what some people call the double-bang trick.
It's correct but, in C, pointless here -- 'if' and '&&' would treat the expression the same way without the '!!'.
The reason to do this in C++, I suppose, is that '&&' could be overloaded. But then, so could '!', so it doesn't really guarantee you get a bool without looking at the code for the types of variable and api.call. Maybe someone with more C++ experience could explain; perhaps it's meant as a defense-in-depth sort of measure, not a guarantee.
Maybe the programmers were thinking something like this...
!!myAnswer is boolean. In context, it should become boolean, but I just love to bang bang things to make sure, because once upon a time there was a mysterious bug that bit me, and bang bang, I killed it.
So suppose I do something like
class Double {
double m_double;
public:
Double() { }
Double(double d) : m_double(d) { }
operator double() const { return m_double; }
operator double&() { return m_double; }
};
Maybe I later want to extend this to make NaN a bit more friendly (by adding a bool say), etc.
My question is, do you think off the top of your head that this Double (and possible extensions on it) would be "measurably" slower than using the built-in double directly?
If you have some experience in working with large data sets, with vectors, copying/moving around vectors of such data, etc. - I hope you can give me some concrete insights/pointers/tips regarding this topic based on your experience.
Modern compilers do excellent optimizations. A single scalar wrapped in a class usually performs as well as a plain one. If you add extra checking to the member operators, then it will cost just what you added. If you need to apply additional constraints to a double, then wrapping it is a good idea.
Example: std::array<T,N> is in essence just an array. I have failed to find a test that demonstrates any overhead compared to a raw array. The added constraints and container-like functionality are what make it valuable.
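For instance, these two functions typically compile to identical code at -O2 (a sketch to try, not a benchmark):
#include <array>
#include <cstddef>

double sum_raw(const double (&a)[8]) {
    double s = 0.0;
    for (std::size_t i = 0; i < 8; ++i) s += a[i];
    return s;
}

double sum_arr(const std::array<double, 8>& a) {
    double s = 0.0;
    for (std::size_t i = 0; i < 8; ++i) s += a[i];
    return s;
}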
Avoid operators that convert silently to scalars, like operator double(). Compilers apply implicit conversions to typos with amazing ease, and the occasional result is that defective code compiles; it then takes a while to work out why it behaves the way it does. Make the conversion explicit, for example double raw() const. The resulting code is easier to understand and runs just as fast.
Example: std::array<T,N> does not convert to a raw pointer to its first element the way an ordinary array does. The user has to write &a[0] to get that pointer. That makes the code a lot safer and easier to understand, and it works just as quickly (the cost of the operation is 0 in optimized code).
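Applied to the Double class from the question, that advice might look like this (raw() is just an illustrative name):
class Double {
    double m_double;
public:
    Double() : m_double(0.0) {}
    Double(double d) : m_double(d) {}

    // Explicit, named access instead of a silent operator double():
    double raw() const { return m_double; }
    double& raw() { return m_double; }
};
// Usage: d.raw() * 2.0 -- the conversion is visible at the call site.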
I'm thinking of replacing all the instances of safe bool idiom by explicit operator bool in code which already uses C++11 features (so the fact that older compilers don't recognized explicit conversion operators will not matter), so I'd like to know if it can cause some subtle problems.
Thus, what are all the possible incompatibilities (even the most minute ones) that can be caused by switching from old and dull safe bool idiom to new and shiny explicit operator bool?
EDIT: I know that switching is a good idea anyway, for the latter is a language feature, well-understood by the compiler, so it'll work no worse than what's in fact just a hack. I simply want to know the possible differences.
Probably the biggest difference, assuming your code is free of bugs (I know, not a safe assumption), will be that in some cases, you may want an implicit conversion to exactly bool. An explicit conversion function will not match.
struct S1
{
operator S1*() { return 0; } /* I know, not the best possible type */
} s1;
struct S2
{
explicit operator bool() { return false; }
} s2;
void f()
{
bool b1 = s1; /* okay */
bool b2 = s2; /* not okay */
}
If you've used the safe-bool conversion incorrectly in your code, only then will explicit operator bool be incompatible, since it won't let you do things incorrectly that easily. Otherwise, it should be just fine. In fact, even if there is a problem, you should still switch to explicit operator bool, because switching will surface those problematic uses of the safe-bool conversion.
According to this article, some compilers emit inefficient instructions for safe-bool implementations that use a member function pointer:
When people started using this idiom, it was discovered that there was an efficiency penalty on some compilers — the member function pointer caused a compiler headache resulting in slower execution when the address was fetched. Although the difference is marginal, the current practice is typically to use a member data pointer instead of a member function pointer.
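For reference, the member-data-pointer flavor the article recommends looks roughly like this (a pre-C++11 sketch):
class Testable {
    bool ok_;
    typedef bool Testable::*safe_bool;  // pointer to member data, not member function
public:
    explicit Testable(bool b = true) : ok_(b) {}

    operator safe_bool() const {
        return ok_ ? &Testable::ok_ : 0;  // a non-null pointer-to-member reads as "true"
    }
};
// if (Testable()) { ... } works, but int i = Testable(); does not compile.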