I'm thinking of replacing all the instances of safe bool idiom by explicit operator bool in code which already uses C++11 features (so the fact that older compilers don't recognized explicit conversion operators will not matter), so I'd like to know if it can cause some subtle problems.
Thus, what are all the possible incompatibilities (even the most minute ones) that can be caused by switching from old and dull safe bool idiom to new and shiny explicit operator bool?
EDIT: I know that switching is a good idea anyway, for the latter is a language feature, well-understood by the compiler, so it'll work no worse than what's in fact just a hack. I simply want to know the possible differences.
Probably the biggest difference, assuming your code is free of bugs (I know, not a safe assumption), will be that in some cases, you may want an implicit conversion to exactly bool. An explicit conversion function will not match.
struct S1
{
operator S1*() { return 0; } /* I know, not the best possible type */
} s1;
struct S2
{
explicit operator bool() { return false; }
} s2;
void f()
{
bool b1 = s1; /* okay */
bool b2 = s2; /* not okay */
}
If you've used safe-bool conversion incorrectly in your code, only then explicit operator bool be incompatible, as it wouldn't allow you to do things incorrectly that easily. Otherwise, it should be just fine without any problem. In fact, even if there is problem, you should still switch to explicit operator bool, because if you do so, then you could identify the problem in the usage of the safe-bool conversion.
According to this article, some compilers emit inefficient instructions for safe-bool implementation using member function pointer,
When people started using this idiom, it was discovered that there was an efficiency penalty on some compilers — the member function pointer caused a compiler headache resulting in slower execution when the address was fetched. Although the difference is marginal, the current practice is typically to use a member data pointer instead of a member function pointer.
Related
While I was on a short break, my workplace switched to using a static code analyzer.
They ran it over the project I am working on and one particular problem flagged by the analyzer goes like this (simplified example):
struct calcSomething
{
int result;
calcSomething() : result(0) {}
void operator()(const int v) { /*does something*/ }
operator int() const { return result; }
};
void foo()
{
std::vector<int> myvector(10);
// exercise for reader: stick some values in `myvector`
int result = std::for_each(myvector.begin(), myvector.end(), calcSomething());
}
The analyzer flags the following issues:
warning: CodeChecker: 'operator int' must be marked explicit to avoid unintentional implicit conversions [google-explicit-constructor]
operator int() const { return result; }
The suggested fix to the functor reads:
struct calcSomething
{
...
explicit operator int() const { return result; }
};
But if I fix my functor as suggested, the static analyzer quickly flags the following issue:
warning: CodeChecker: no viable conversion from '(anonymous namespace)::calcSomething' to 'int' [clang-diagnostic-error]
I now need to add the explicit cast:
void foo()
{
...
int total = static_cast<int>(std::for_each(myvector.begin(), myvector.end(), calcSomething()));
}
The above example is a mere simplification of the real problem which would otherwise just add filler and no substance.
I have seen plenty of examples of functors like the one I describe here in text books and programming reference web pages.
I have never considered these unsafe. I have never seen anyone flag these as unsafe.
So does the code analyzer have a point?
Or is it a little overzealous to make my functor's conversion operator explicit and as a result make me add the static cast?
Purely from an aesthetic point, I feel that a simple problem with an elegant solution now accrues a lot of ugly syntactic padding.
But perhaps that is the price we pay for writing safe(r) code.
side note: TIL that explicit applies not only to ctors
Edit
It seems some people are unable to read beyond the example code I provided (pretty textbook stuff) and still suggest other algorithms/idioms, completely failing to see that the actual question is about conversion operators on functors whose sole purpose is to calculate and return an algorithm's result.
If the question was about how to improve on an adding algorithm, then the title would have said so.
So I decided to hide any implementation details in this edit to make it easier for these people.
Sorry that some of the comments below now no longer make any sense, but the record got stuck so I had to move the needle a bit in order to move things forward (hopefully).
I would do away with any conversion operators altogether. What's wrong with:
int result = std::for_each(...).get();
where get() does the same as your current operator int.
You know that the result of for_each is not an integer, it's your function object. Why, why would you want to avoid making the conversion from a function to a value explicit? It is, by all means, a questionable idea. Sure, you can still do it, but you want clean warning-free code right? Well, clean, warning free code, in my book, should not auto-convert functions to integers. I agree static_cast is almost equally ugly, that's why I am suggesting a named function
I noticed today that boost::optional::is_initialized() is marked as deprecated in the Boost 1.64.0 reference. My projects are liberally sprinkled with is_initialized() to check if the boost::optional contains a value.
I don't see any other way to properly test if a boost::optional is initialized, am I missing something?
The boost::optional has a explicit operator bool(), meaning that I can do if(foo){...} if foo is a boost::optional. However, this would give wrong results if foo is a boost::optional<bool> or some other boost::optional<T> where T is convertible to bool.
What does Boost expect users to do?
However, this would give wrong results if foo is a boost::optional or some other boost::optional where T is convertible to bool.
No, because there is no implicit conversion to the underlying type. The "truthiness"¹ of an optional always refers to its initialized state.
The only time you may have gotten the impression that implicit conversions happen is in relational operators. However, that's not doing implicit conversion to the underlying type, instead does lifting of the operators, explicitly.
¹ by which I mean contextual (explicit) boolean conversion
Update
Indeed for boost::optional<bool> there's the caveat in pre-c++11 mode:
Second, although optional<> provides a contextual conversion to bool in C++11, this falls back to an implicit conversion on older compilers
In that case it is clearly better to explicitly compare to boost::none.
As of this writing, Boost 1.72 supports a "has_value" method, which is not deprecated.
Under the hood, it just calls "is_initialized" though.
See the code:
bool has_value() const BOOST_NOEXCEPT { return this->is_initialized() ; }
That aside, another convenient trick I've seen is the !! idiom.
Eg:
boost::optional<Foo> x = ...
MY_ASSERT(!!x, "x must be set");
It's essentially the same as writing (bool)x or the even more prohibitively verbose static_cast<bool>(x).
Aside: it is a bit odd that is_initialized was deprecated, then an exactly equivalent function with a different name got added later. I suspect it was for compatibility with C++17's std::optional.
For future references, as stated on boost documentation, you can compare like this from now on:
boost::optional<int> oN = boost::none;
boost::optional<int> o0 = 0;
boost::optional<int> o1 = 1;
assert(oN != o0);
assert(o1 != oN);
assert(o0 != o1);
assert(oN == oN);
assert(o0 == o0);
You could even do:
if(oN != 2){}
or simply to check if the value is set:
if(oN){}
I just came onto a project with a pretty huge code base.
I'm mostly dealing with C++ and a lot of the code they write uses double negation for their boolean logic.
if (!!variable && (!!api.lookup("some-string"))) {
do_some_stuff();
}
I know these guys are intelligent programmers, it's obvious they aren't doing this by accident.
I'm no seasoned C++ expert, my only guess at why they are doing this is that they want to make absolutely positive that the value being evaluated is the actual boolean representation. So they negate it, then negate that again to get it back to its actual boolean value.
Is this correct, or am I missing something?
It's a trick to convert to bool.
It's actually a very useful idiom in some contexts. Take these macros (example from the Linux kernel). For GCC, they're implemented as follows:
#define likely(cond) (__builtin_expect(!!(cond), 1))
#define unlikely(cond) (__builtin_expect(!!(cond), 0))
Why do they have to do this? GCC's __builtin_expect treats its parameters as long and not bool, so there needs to be some form of conversion. Since they don't know what cond is when they're writing those macros, it is most general to simply use the !! idiom.
They could probably do the same thing by comparing against 0, but in my opinion, it's actually more straightforward to do the double-negation, since that's the closest to a cast-to-bool that C has.
This code can be used in C++ as well... it's a lowest-common-denominator thing. If possible, do what works in both C and C++.
The coders think that it will convert the operand to bool, but because the operands of && are already implicitly converted to bool, it's utterly redundant.
Yes it is correct and no you are not missing something. !! is a conversion to bool. See this question for more discussion.
It's a technique to avoid writing (variable != 0) - i.e. to convert from whatever type it is to a bool.
IMO Code like this has no place in systems that need to be maintained - because it is not immediately readable code (hence the question in the first place).
Code must be legible - otherwise you leave a time debt legacy for the future - as it takes time to understand something that is needlessly convoluted.
It side-steps a compiler warning. Try this:
int _tmain(int argc, _TCHAR* argv[])
{
int foo = 5;
bool bar = foo;
bool baz = !!foo;
return 0;
}
The 'bar' line generates a "forcing value to bool 'true' or 'false' (performance warning)" on MSVC++, but the 'baz' line sneaks through fine.
Legacy C developers had no Boolean type, so they often #define TRUE 1 and #define FALSE 0 and then used arbitrary numeric data types for Boolean comparisons. Now that we have bool, many compilers will emit warnings when certain types of assignments and comparisons are made using a mixture of numeric types and Boolean types. These two usages will eventually collide when working with legacy code.
To work around this problem, some developers use the following Boolean identity: !num_value returns bool true if num_value == 0; false otherwise. !!num_value returns bool false if num_value == 0; true otherwise. The single negation is sufficient to convert num_value to bool; however, the double negation is necessary to restore the original sense of the Boolean expression.
This pattern is known as an idiom, i.e., something commonly used by people familiar with the language. Therefore, I don't see it as an anti-pattern, as much as I would static_cast<bool>(num_value). The cast might very well give the correct results, but some compilers then emit a performance warning, so you still have to address that.
The other way to address this is to say, (num_value != FALSE). I'm okay with that too, but all in all, !!num_value is far less verbose, may be clearer, and is not confusing the second time you see it.
Is operator! overloaded?
If not, they're probably doing this to convert the variable to a bool without producing a warning. This is definitely not a standard way of doing things.
!! was used to cope with original C++ which did not have a boolean type (as neither did C).
Example Problem:
Inside if(condition), the condition needs to evaluate to some type like double, int, void*, etc., but not bool as it does not exist yet.
Say a class existed int256 (a 256 bit integer) and all integer conversions/casts were overloaded.
int256 x = foo();
if (x) ...
To test if x was "true" or non-zero, if (x) would convert x to some integer and then assess if that int was non-zero. A typical overload of (int) x would return only the LSbits of x. if (x) was then only testing the LSbits of x.
But C++ has the ! operator. An overloaded !x would typically evaluate all the bits of x. So to get back to the non-inverted logic if (!!x) is used.
Ref Did older versions of C++ use the `int` operator of a class when evaluating the condition in an `if()` statement?
As Marcin mentioned, it might well matter if operator overloading is in play. Otherwise, in C/C++ it doesn't matter except if you're doing one of the following things:
direct comparison to true (or in C something like a TRUE macro), which is almost always a bad idea. For example:
if (api.lookup("some-string") == true) {...}
you simply want something converted to a strict 0/1 value. In C++ an assignment to a bool will do this implicitly (for those things that are implicitly convertible to bool). In C or if you're dealing with a non-bool variable, this is an idiom that I've seen, but I prefer the (some_variable != 0) variety myself.
I think in the context of a larger boolean expression it simply clutters things up.
If variable is of object type, it might have a ! operator defined but no cast to bool (or worse an implicit cast to int with different semantics. Calling the ! operator twice results in a convert to bool that works even in strange cases.
This may be an example of the double-bang trick (see The Safe Bool Idiom for more details). Here I summarize the first page of the article.
In C++ there are a number of ways to provide Boolean tests for classes.
An obvious way is the operator bool conversion operator.
// operator bool version
class Testable {
bool ok_;
public:
explicit Testable(bool b = true) : ok_(b) {}
operator bool() const { // use bool conversion operator
return ok_;
}
};
We can test the class as thus:
Testable test;
if (test) {
std::cout << "Yes, test is working!\n";
}
else {
std::cout << "No, test is not working!\n";
}
However, operator bool is considered unsafe because it allows nonsensical operations such as test << 1; or int i = test.
Using operator! is safer because we avoid implicit conversion or overloading issues.
The implementation is trivial,
bool operator!() const { // use operator!
return !ok_;
}
The two idiomatic ways to test Testable object are
Testable test;
if (!!test) {
std::cout << "Yes, test is working!\n";
}
if (!test) {
std::cout << "No, test is not working!\n";
}
The first version if (!!test) is what some people call the double-bang trick.
It's correct but, in C, pointless here -- 'if' and '&&' would treat the expression the same way without the '!!'.
The reason to do this in C++, I suppose, is that '&&' could be overloaded. But then, so could '!', so it doesn't really guarantee you get a bool, without looking at the code for the types of variable and api.call. Maybe someone with more C++ experience could explain; perhaps it's meant as a defense-in-depth sort of measure, not a guarantee.
Maybe the programmers were thinking something like this...
!!myAnswer is boolean. In context, it should become boolean, but I just love to bang bang things to make sure, because once upon a time there was a mysterious bug that bit me, and bang bang, I killed it.
There is a conversion function operator void*() const in C++ stream classes. so that all stream objects can be implicitly converted to void*. During the interaction with programmers on SO they suggest me to don't use void* unless you've a good reason to use it. void* is a technique of removing type safety & error checking. So, due to the existance of this conversion function following program is perfectly valid. This is a flaw in the C++ standard library.
#include <iostream>
int main()
{
delete std::cout;
delete std::cin;
}
See live demo here.
The above program is valid in C++03 but fails in compilation in C++11 & later compilers, because this conversion function is removed. But question is why it was part of C++ standard library if it is dangerous? What was the purpose of allowing conversion of stream objects to void*? What is the use of it?
A feature of std::stringstream is that it is intended that if the stream is used as a bool, it gets converted to true if the stream is still valid and false if it's not. For instance this lets you use a simple syntax if you implement your own version of lexical cast.
(For reference, boost contains a template function called lexical_cast which does something similar to the following simple template function.)
template <typename T, typename U>
T lexical_cast(const U & u) {
T t;
std::stringstream ss;
if (ss << u && ss >> t) {
return t;
} else {
throw bad_lexical_cast();
}
}
If for instance, you work on a project where exceptions are disallowed, you might want to roll the following version of this instead.
template <typename T, typename U>
boost::optional<T> lexical_cast(const U & u) {
T t;
std::stringstream ss;
if (ss << u && ss >> t) {
return t;
} else {
return boost::none;
}
}
(There are various ways the above could be improved, this code is just to give an example.)
The operator void * is being used in the above as follows:
ss << u returns a reference to ss.
The ss is implicitly converted to void * which is nullptr if the stream operation failed, and non-null if the stream is still good.
The && operator aborts quickly, depending on the truth value of that pointer.
The second part ss >> t runs, and returns and is also converted to void *.
If both operations succeeded then we can return the streamed result t. If either failed then we signal an error.
The bool conversion feature is basically just syntactic sugar that allows this (and many other things) to be written concisely.
The drawback of actually introducing an implicit bool conversion is that then, std::stringstream becomes implicitly convertible to int and many other types also, because bool is implicitly convertible to int. This ultimately causes syntactic nightmares elsewhere.
For instance, if std::stringstream had an implicit operator bool conversion, suppose you have this simple code:
std::stringstream ss;
int x = 5;
ss << x;
Now in overload resolution, you have two potential overloads to consider (!), the normal one operator<<(std::stringstream &, int), and the one in which ss is converted to bool, then promoted to int, and the bit shift operator may apply operator<<(int, int), all because of the implicit conversion to bool...
The workaround is to use an implicit conversion to void * instead, which can be used contextually as a bool, but isn't actually implicitly convertible to bool or int.
In C++11 we don't need this workaround any more, and there's no reason that anyone would have explicitly used the void * conversion, so it was just removed.
(I'm searching for a reference, but I believe that the value of this function was only specified up to when it should be nullptr vs. when it should not be nullptr, and that it was implementation defined what nonzero pointer values it might yield. So any code that relied on the void * version and cannot be trivially refactored to use the bool version would have been incorrect anyways.)
The void * workaround is still not without problems. In boost a more sophisticated "safe bool" idiom was developed for pre-C++11 code, which iiuc is based on something like T* where T is either a "type-used once", like a struct which is defined in the class implementing the idiom, or using a pointer-to-member function for that class and using return values which are either a particular private member function of that class, or nullptr. The "safe bool" idiom is used for instance in all of the boost smart pointers.
Regardless this whole mess was cleaned up in C++11 by introducing explicit operator bool.
More info here: Is the safe-bool idiom obsolete in C++11?
This is a flaw in the C++ standard library.
This was a flaw in older versions (1998 and 2003) of the C++ standard library. This flaw no longer exists in 2011 and later versions of the standard. A new feature, the ability to mark conversion operators as explicit, was added to the language to make a conversion operator to bool safe.
The developers of the original version of the C++ standard explicitly chose to use a conversion to void* rather than a conversion to bool because a non-explicit conversion to bool was quite unsafe in a number of ways. In comparison, while operator void*() was rather obviously kludgy, it worked, at least so long as you didn't cast that pointer to something else or try to delete it. (An alternative, operator!, was arguably safer than either of those conversion operators, but that would have required the non-intuitive and abstruse while (!!stream) {...}. )
The concept of the "safe bool" idiom was developed after the original 1998/1999 version of the standard was released. Whether it was developed prior to 2003 is a bit irrelevant; the 2003 version of the standard was intended to be a bug fix to that original standard. That operator void*() let delete std::cin compile wasn't deemed a bug so much as a "don't do that then" kind of problem.
The development of the "safe bool" idiom showed that alternatives did exist that made operator bool() safe, but if you look at any of the implementations, they are all massively convoluted and massively kludgy. The C++11 solution was amazingly simple: Allow conversion operators to be qualified with the explicit keyword. The C++11 solution removed the void* conversion operator and added an explicit bool conversion operator. This made the "safe bool" idiom obsolete, at least so long as you are using a compiler that is C++11 (or later) compliant.
Say I have a class C that I want to be able to implicitly cast to bool to use in if statements.
class C {
public:
...
operator bool() { return data ? true : false; }
private:
void * data;
};
and
C c;
...
if (c) ...
But the cast operator has a conditional which is technically overhead (even if relatively insignificant). If data was public I could do if (c.data) instead which is entirely possible and does not involve any conditionals. I doubt that the compiler will do any implicit conversion involving a conditional in the latter scenario, since it will likely generate a "jump if zero" or "jump if not zero" which doesn't really need any Boolean value, which the CPU will most likely have no notion of anyway.
My question is whether the typecast operator overload will indeed be less efficient than directly using the data member.
Note that I did establish that if the typecast directly returns data it also works, probably using the same type of implicit (hypothetical and not really happening in practice) conversion that would be used in the case of if (c.data).
Edit: Just to clarify, the point of the matter is actually a bit hypothetical. The dilemma is that Boolean is itself a hypothetical construct (which didn't initially exist in C/C++), in reality it is just integers. As I mentioned, the typecast can directly return data or use != instead, but it is really not very readable, but even that is not the issue. I don't really know how to word it to make sense of it better, the C class has a void * that is an integer, the CPU has conditional jumps which use integers, the issue is that abiding to the hypothetical Boolean construct that sits in the middle mandates the extra conditional. Dunno if that "clarification" made things any more clear though...
My question is whether the typecast operator overload will indeed be less efficient than directly using the data member.
Only examining your compiler output - with the specific optimisation flags you'd like to use - can tell you for sure, and then it might change after some seemingly irrelevant change like adding an extra variable somewhere in the calling context, or perhaps with the next compiler release etc....
More generally, C++ wouldn't be renowned for speed if the optimisers didn't tend to handle this kind of situation perfectly, so your odds are very good.
Further, write working code then profile it and you'll learn a lot more about what performance problems are actually significant.
It depends on how smart your compiler's optimizer is. I think they should be smart enough to remove the useless ? true: false operation, because the typecast operation should be inlined.
Or you could just write this and not worry about it:
operator bool() { return data; }
Since there's a built-in implicit typecast from void* to bool, data gets typecast on the way out the function.
I don't remember if the conditional in if expects bool or void*; at one point, before C++ added bool, it was the latter. (operator! in the iostream classes returned void* back then.)
On modern compilers these two functions produce the same machine code:
bool toBool1(void* ptr) {
return ptr ? true : false;
}
bool toBool2(void* ptr) {
return ptr;
}
Demo
So it really doesn't matter.