Enumeration relying on integer boolean conversion - c++

In my compiler project, I have an enumeration that goes like
enum Result {
No,
Maybe,
Yes
};
I have put No explicitly at the first position, so that I can rely on its boolean evaluation to false. If my compiler is not sure about something, and has to wait for facts until runtime, its analysis functions will return Maybe. It is used like this:
if(!typesEqual(t1, t2)) {
diagnose(types_unequal) << t1 << t2;
}
I wonder whether you or your company considers it bad style not to compare to No explicitly:
if(typesEqual(t1, t2) == No) { /* ... */ }
Comparing explicitly seems wordy to me, but relying on the implicit boolean conversion somehow makes me feel guilty. Have you had that feeling before, and how have you dealt with it?

I'd feel guilty about it as well, because from reading the code above what would you expect the boolean typesEqual() expression to return for a Maybe? Would it return true? Maybe! Would it return false? Maybe! We don't know - that's the entire point of the enum. That's why it makes sense to explicitly compare to No, even though it's more verbose.

I don't usually use an explicit comparison with zero for integral or pointer types since the conversion-to-boolean is well-defined and obvious.
However, I always use an explicit comparison for enumerations, just in case someone changes the definition of the enumeration. You can never be sure that someone won't change the enumeration later, after all.
Relying on the underlying numeric value of an enumerator just seems like a bad idea.
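One way to turn the explicit comparison from a convention into something the compiler enforces is a scoped enumeration, which has no implicit conversion to bool. This is only a minimal sketch, assuming C++11 or later; typesEqualSketch, definitelyUnequal and the integer "type ids" are hypothetical stand-ins, not the asker's real analysis functions.

```cpp
// Scoped enumerations do not convert implicitly to bool or int, so
// `if (!typesEqualSketch(a, b))` would not compile; callers must compare.
enum class Result { No, Maybe, Yes };

// Hypothetical stand-in for typesEqual(): a negative id means
// "not known until runtime".
constexpr Result typesEqualSketch(int lhsId, int rhsId) {
    return (lhsId < 0 || rhsId < 0) ? Result::Maybe
         : (lhsId == rhsId)         ? Result::Yes
                                    : Result::No;
}

// The intent is spelled out and no longer depends on enumerator order.
constexpr bool definitelyUnequal(int lhsId, int rhsId) {
    return typesEqualSketch(lhsId, rhsId) == Result::No;
}
```

With this shape, reordering the enumerators later cannot silently change what a condition means.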

This seems similar to Boost.Tribool. Tribool supports conversions to bool for use in conditional statements, which would seem to suggest that implicit conversion to bool in a case such as this is not bad, at least according to the Boost organization (whom I have found to be fairly reasonable).

What I don't like is that it is asymmetric.
E.g. what you really wanted somewhere was
if(typesEqual(t1, t2) == Yes) {
do_something();
}
but by accident you wrote
if(typesEqual(t1, t2)) {
do_something();
}
It seems somehow strange/ugly that you can use the boolean trick for No but not for Yes.
I think I would solve it by renaming the function to tryCompareTypes(t1, t2), and change your enum to
enum Result {
Maybe,
No,
Yes
};
So tryCompareTypes() returns 0 if it "failed" to definitively decide whether the types are equal, otherwise it returns either No or Yes, both of which are nonzero and hence indicate "success".
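A rough sketch of how the reordered enum might be consumed, assuming C++14 for the constexpr function body. tryCompareTypesSketch, describe and the integer "type ids" are made up for illustration; only the enumerator order comes from the suggestion above.

```cpp
// Reordered as suggested: Maybe is 0, so a zero result means "undecided".
enum Result { Maybe, No, Yes };

// Hypothetical stand-in for tryCompareTypes(); negative ids are "unknown".
constexpr Result tryCompareTypesSketch(int lhsId, int rhsId) {
    return (lhsId < 0 || rhsId < 0) ? Maybe
         : (lhsId == rhsId)         ? Yes
                                    : No;
}

// The boolean test now only asks "was it decided?"; the actual answer
// still has to be read with an explicit comparison.
constexpr const char* describe(int lhsId, int rhsId) {
    const Result r = tryCompareTypesSketch(lhsId, rhsId);
    if (!r) return "undecided";              // Maybe == 0
    return r == Yes ? "same" : "different";  // No and Yes are both nonzero
}
```

The boolean trick then treats No and Yes alike, which removes the asymmetry at the cost of always needing a second comparison.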

Related

How should I check whether an underlying-type value is an enumerated value?

Let I be some integral type. Now suppose I have an enum class my_enum_class : I, with values which may not be consecutive. And now I get some I value. How do I check whether it's a value enumerated in my_enum_class?
An answer to a similar question (for the C language) makes the assumption that values are contiguous, and that one can add a "dummy" upper-bound value, and check the range between 0 and that value; that's not relevant in my case. Is there another way to do it?
There is currently no way to do this.
There are reflection proposals that may make it into C++20 and/or C++23 that let you iterate (at compile, and hence run, time) over the enumerated values in an enum. Using that, the check would be relatively easy.
Sometimes people do manual enum reflection, often using macros.
There is no built-in way to do this. All Is are "valid" values of my_enum_class, so you can't do anything with the underlying type. As for validating Is against the list of enumerators, without reflection there is simply no way to do it.
Depending on the context, I tend to either build a static std::unordered_set (and do lookups into that), or have a function listing all my enumerators in a switch (and returning false iff the input matches none of them), or just not bother, instead documenting somewhere that passing an unenumerated my_enum_class value to my functions shall be deemed impish trickery and have unspecified behaviour.
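The switch variant could look roughly like this. It is a sketch assuming C++14 (for the switch inside a constexpr function); the enumerators red, green and blue and the name is_enumerated are invented for the example.

```cpp
#include <cstdint>

// Hypothetical sparse enumeration over a fixed underlying type.
enum class my_enum_class : std::int32_t { red = 3, green = 7, blue = 42 };

// Hand-maintained list: returns true only for values that name an enumerator.
constexpr bool is_enumerated(std::int32_t raw) {
    // The cast is well-defined because the underlying type is fixed.
    switch (static_cast<my_enum_class>(raw)) {
        case my_enum_class::red:
        case my_enum_class::green:
        case my_enum_class::blue:
            return true;
    }
    return false;  // no enumerator matched
}
```

Leaving out a default label is deliberate: compilers that warn about unhandled enumerators in a switch (for example GCC and Clang with -Wswitch) should then flag this function when someone adds an enumerator and forgets to list it.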
Ultimately this all stems from the fact that enums are supposed to list "common conveniently named values" within a wider range of totally valid states, rather than a type comprised only of a fully constrained set of constants. We pretty much all abuse enums.
Though the standard doesn't yet allow you to do introspection, there is a small workaround you could use, which can possibly be improved with ADL. Courtesy of this older answer.
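#include <type_traits>
namespace sparse {
// Base case: no enumerators left to compare against.
template<typename E>
constexpr bool in_(std::underlying_type_t<E>) { return false; }
// Recursive case: compare against the first enumerator, then the rest.
template<typename E, E value, E... values>
constexpr bool in_(std::underlying_type_t<E> e) {
return static_cast<E>(e) == value || in_<E, values...>(e);
}
// Primary variable template; by default no value counts as enumerated.
template<typename E>
constexpr bool (*in)(std::underlying_type_t<E>) = in_<E>;
}
To be used like this:
enum class my_enum: int { a=3, b=4 };
namespace sparse {
// Specialize for my_enum by listing its enumerators once.
template<>
constexpr bool (*in<my_enum>)(int) = in_<my_enum, my_enum::a, my_enum::b>;
}
static_assert(sparse::in<my_enum>(3));
static_assert(sparse::in<my_enum>(4));
static_assert(!sparse::in<my_enum>(5));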

What is the purpose of using !! for Pointers in C Conditions? [duplicate]

I just came onto a project with a pretty huge code base.
I'm mostly dealing with C++ and a lot of the code they write uses double negation for their boolean logic.
if (!!variable && (!!api.lookup("some-string"))) {
do_some_stuff();
}
I know these guys are intelligent programmers, it's obvious they aren't doing this by accident.
I'm no seasoned C++ expert, my only guess at why they are doing this is that they want to make absolutely positive that the value being evaluated is the actual boolean representation. So they negate it, then negate that again to get it back to its actual boolean value.
Is this correct, or am I missing something?
It's a trick to convert to bool.
It's actually a very useful idiom in some contexts. Take these macros (example from the Linux kernel). For GCC, they're implemented as follows:
#define likely(cond) (__builtin_expect(!!(cond), 1))
#define unlikely(cond) (__builtin_expect(!!(cond), 0))
Why do they have to do this? GCC's __builtin_expect treats its parameters as long and not bool, so there needs to be some form of conversion. Since they don't know what cond is when they're writing those macros, it is most general to simply use the !! idiom.
They could probably do the same thing by comparing against 0, but in my opinion, it's actually more straightforward to do the double-negation, since that's the closest to a cast-to-bool that C has.
This code can be used in C++ as well... it's a lowest-common-denominator thing. If possible, do what works in both C and C++.
The coders think that it will convert the operand to bool, but because the operands of && are already implicitly converted to bool, it's utterly redundant.
Yes it is correct and no you are not missing something. !! is a conversion to bool. See this question for more discussion.
It's a technique to avoid writing (variable != 0) - i.e. to convert from whatever type it is to a bool.
IMO Code like this has no place in systems that need to be maintained - because it is not immediately readable code (hence the question in the first place).
Code must be legible - otherwise you leave a time debt legacy for the future - as it takes time to understand something that is needlessly convoluted.
It side-steps a compiler warning. Try this:
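#include <tchar.h> // MSVC-specific: declares _tmain and _TCHAR
int _tmain(int argc, _TCHAR* argv[])
{
int foo = 5;
bool bar = foo; // implicit int-to-bool conversion: triggers the warning
bool baz = !!foo; // operator! already yields bool: no warning
return 0;
}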
The 'bar' line generates a "forcing value to bool 'true' or 'false' (performance warning)" on MSVC++, but the 'baz' line sneaks through fine.
Legacy C developers had no Boolean type, so they often #define TRUE 1 and #define FALSE 0 and then used arbitrary numeric data types for Boolean comparisons. Now that we have bool, many compilers will emit warnings when certain types of assignments and comparisons are made using a mixture of numeric types and Boolean types. These two usages will eventually collide when working with legacy code.
To work around this problem, some developers use the following Boolean identity: !num_value returns bool true if num_value == 0; false otherwise. !!num_value returns bool false if num_value == 0; true otherwise. The single negation is sufficient to convert num_value to bool; however, the double negation is necessary to restore the original sense of the Boolean expression.
This pattern is known as an idiom, i.e., something commonly used by people familiar with the language. Therefore, I don't see it as an anti-pattern, as much as I would static_cast<bool>(num_value). The cast might very well give the correct results, but some compilers then emit a performance warning, so you still have to address that.
The other way to address this is to say, (num_value != FALSE). I'm okay with that too, but all in all, !!num_value is far less verbose, may be clearer, and is not confusing the second time you see it.
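To make the identity concrete, here is a small sketch comparing the three spellings mentioned above. The function names are made up for the example; all three functions are meant to be interchangeable for built-in integer types.

```cpp
// !x is true exactly when x compares equal to zero; !!x flips it back,
// leaving a bool that is true exactly when x is nonzero.
constexpr bool toBoolBang(int num_value)    { return !!num_value; }
constexpr bool toBoolCompare(int num_value) { return num_value != 0; }
constexpr bool toBoolCast(int num_value)    { return static_cast<bool>(num_value); }

// Converting the bool back to an integer gives a strict 0 or 1,
// whatever the magnitude or sign of the input was.
constexpr int asZeroOrOne(long wide) { return !!wide; }
```

For plain integers the choice between these is a matter of taste and of which warnings your compiler emits; the results are the same.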
Is operator! overloaded?
If not, they're probably doing this to convert the variable to a bool without producing a warning. This is definitely not a standard way of doing things.
!! was used to cope with original C++ which did not have a boolean type (as neither did C).
Example Problem:
Inside if(condition), the condition needs to evaluate to some type like double, int, void*, etc., but not bool as it does not exist yet.
Say a class existed int256 (a 256 bit integer) and all integer conversions/casts were overloaded.
int256 x = foo();
if (x) ...
To test if x was "true" or non-zero, if (x) would convert x to some integer and then assess if that int was non-zero. A typical overload of (int) x would return only the LSbits of x. if (x) was then only testing the LSbits of x.
But C++ has the ! operator. An overloaded !x would typically evaluate all the bits of x. So to get back to the non-inverted logic if (!!x) is used.
Ref Did older versions of C++ use the `int` operator of a class when evaluating the condition in an `if()` statement?
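The difference between the two tests can still be reproduced today with a class that converts to an integer type and also overloads operator!. This is only an illustrative sketch: Wide is a hypothetical two-word stand-in for the int256 class, and plainTest and bangBangTest are names invented here.

```cpp
// Hypothetical two-word integer, loosely modelled on the int256 example.
struct Wide {
    unsigned hi;
    unsigned lo;

    // Conversion that keeps only the low word, as described above.
    constexpr operator unsigned() const { return lo; }

    // Logical negation that inspects every word.
    constexpr bool operator!() const { return hi == 0 && lo == 0; }
};

// Plain test: goes through operator unsigned, so only `lo` is examined.
constexpr bool plainTest(Wide x) { return x ? true : false; }

// Double negation: the inner ! uses the overloaded operator!, so the
// whole value is examined; the outer ! is the built-in one on bool.
constexpr bool bangBangTest(Wide x) { return !!x; }
```

For a value such as Wide{1, 0} the plain test reports false while the double negation reports true, which is exactly the discrepancy the answer describes.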
As Marcin mentioned, it might well matter if operator overloading is in play. Otherwise, in C/C++ it doesn't matter except if you're doing one of the following things:
direct comparison to true (or in C something like a TRUE macro), which is almost always a bad idea. For example:
if (api.lookup("some-string") == true) {...}
you simply want something converted to a strict 0/1 value. In C++ an assignment to a bool will do this implicitly (for those things that are implicitly convertible to bool). In C or if you're dealing with a non-bool variable, this is an idiom that I've seen, but I prefer the (some_variable != 0) variety myself.
I think in the context of a larger boolean expression it simply clutters things up.
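The comparison-to-true pitfall is easy to demonstrate. In this sketch lookupSketch is a hypothetical function that signals success with some nonzero value other than 1, which is the case where the explicit comparison goes wrong.

```cpp
// Hypothetical lookup that reports success with a nonzero count, not 1.
constexpr int lookupSketch() { return 2; }

// `2 == true` promotes true to the int 1, so the comparison is false
// even though the value would pass an `if` test.
constexpr bool comparedToTrue = (lookupSketch() == true);

// Normalising to bool first gives the intended answer.
constexpr bool normalisedBang    = (!!lookupSketch() == true);
constexpr bool normalisedCompare = (lookupSketch() != 0);
```

Some compilers warn about the first comparison, which is a good hint that it should not be written that way in the first place.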
If variable is of object type, it might have a ! operator defined but no cast to bool (or worse, an implicit cast to int with different semantics). Calling the ! operator twice results in a conversion to bool that works even in strange cases.
This may be an example of the double-bang trick (see The Safe Bool Idiom for more details). Here I summarize the first page of the article.
In C++ there are a number of ways to provide Boolean tests for classes.
An obvious way is the operator bool conversion operator.
// operator bool version
class Testable {
bool ok_;
public:
explicit Testable(bool b = true) : ok_(b) {}
operator bool() const { // use bool conversion operator
return ok_;
}
};
We can test the class thus:
Testable test;
if (test) {
std::cout << "Yes, test is working!\n";
}
else {
std::cout << "No, test is not working!\n";
}
However, operator bool is considered unsafe because it allows nonsensical operations such as test << 1; or int i = test.
Using operator! is safer because we avoid implicit conversion or overloading issues.
The implementation is trivial,
bool operator!() const { // use operator!
return !ok_;
}
The two idiomatic ways to test a Testable object are
Testable test;
if (!!test) {
std::cout << "Yes, test is working!\n";
}
if (!test) {
std::cout << "No, test is not working!\n";
}
The first version if (!!test) is what some people call the double-bang trick.
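Since C++11 the same protection is available without the double bang, by marking the conversion operator explicit. The following is a minimal sketch rather than part of the article being summarized; TestableExplicit and isWorking are hypothetical names.

```cpp
#include <type_traits>

// Hypothetical variant of Testable using an explicit conversion operator.
class TestableExplicit {
    bool ok_;
public:
    constexpr explicit TestableExplicit(bool b = true) : ok_(b) {}
    // Only contextual conversions (if, while, !, &&, ||, ?:) may use this.
    constexpr explicit operator bool() const { return ok_; }
};

// `int i = test;` no longer compiles: there is no implicit conversion to int.
static_assert(!std::is_convertible<TestableExplicit, int>::value,
              "explicit operator bool must not allow implicit int conversion");

// Hypothetical helper showing a plain contextual test, no !! needed.
constexpr bool isWorking(const TestableExplicit& t) { return t ? true : false; }
```

With this, if (test) works as before, while test << 1 and int i = test are rejected by the compiler.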
It's correct but, in C, pointless here -- 'if' and '&&' would treat the expression the same way without the '!!'.
The reason to do this in C++, I suppose, is that '&&' could be overloaded. But then, so could '!', so it doesn't really guarantee you get a bool, without looking at the code for the types of variable and api.lookup. Maybe someone with more C++ experience could explain; perhaps it's meant as a defense-in-depth sort of measure, not a guarantee.
Maybe the programmers were thinking something like this...
!!myAnswer is boolean. In context, it should become boolean, but I just love to bang bang things to make sure, because once upon a time there was a mysterious bug that bit me, and bang bang, I killed it.

Overloading typecast to bool and efficiency

Say I have a class C that I want to be able to implicitly cast to bool to use in if statements.
class C {
public:
...
operator bool() { return data ? true : false; }
private:
void * data;
};
and
C c;
...
if (c) ...
But the cast operator has a conditional which is technically overhead (even if relatively insignificant). If data was public I could do if (c.data) instead which is entirely possible and does not involve any conditionals. I doubt that the compiler will do any implicit conversion involving a conditional in the latter scenario, since it will likely generate a "jump if zero" or "jump if not zero" which doesn't really need any Boolean value, which the CPU will most likely have no notion of anyway.
My question is whether the typecast operator overload will indeed be less efficient than directly using the data member.
Note that I did establish that if the typecast directly returns data it also works, probably using the same type of implicit (hypothetical and not really happening in practice) conversion that would be used in the case of if (c.data).
Edit: Just to clarify, the point of the matter is actually a bit hypothetical. The dilemma is that Boolean is itself a hypothetical construct (which didn't initially exist in C/C++), in reality it is just integers. As I mentioned, the typecast can directly return data or use != instead, but it is really not very readable, but even that is not the issue. I don't really know how to word it to make sense of it better, the C class has a void * that is an integer, the CPU has conditional jumps which use integers, the issue is that abiding to the hypothetical Boolean construct that sits in the middle mandates the extra conditional. Dunno if that "clarification" made things any more clear though...
My question is whether the typecast operator overload will indeed be less efficient than directly using the data member.
Only examining your compiler output - with the specific optimisation flags you'd like to use - can tell you for sure, and then it might change after some seemingly irrelevant change like adding an extra variable somewhere in the calling context, or perhaps with the next compiler release etc....
More generally, C++ wouldn't be renowned for speed if the optimisers didn't tend to handle this kind of situation perfectly, so your odds are very good.
Further, write working code then profile it and you'll learn a lot more about what performance problems are actually significant.
It depends on how smart your compiler's optimizer is. I think they should be smart enough to remove the useless ? true: false operation, because the typecast operation should be inlined.
Or you could just write this and not worry about it:
operator bool() { return data; }
Since there's a built-in implicit typecast from void* to bool, data gets typecast on the way out the function.
I don't remember if the conditional in if expects bool or void*; at one point, before C++ added bool, it was the latter. (The iostream classes exposed an operator void* conversion for this purpose back then, alongside operator!.)
On modern compilers these two functions produce the same machine code:
bool toBool1(void* ptr) {
return ptr ? true : false;
}
bool toBool2(void* ptr) {
return ptr;
}
So it really doesn't matter.

C++ logical operators return value

Here is some code I'm writing in C++. There's a call to an addAVP() function
dMessage.addAVP(AVP_DESTINATION_HOST, peer->getDestinationHost() || peer->getHost());
which has two versions: one overloaded in the second parameter to addAVP(int, char*) and another to addAVP(int, int). I find the C++ compiler I use calls the addAVP(int, int) version which is not what I wanted since getDestinationHost() and getHost() both return char*.
Nonetheless the || operator is defined to return bool so I can see where my error is. Bool somehow counts as an integer and this compiles cleanly and calls the second addAVP().
Lately I'm using a lot of dynamically typed languages, e.g. Lisp, where the above code is correct and can be written without worries. Clearly the above code in C++ is a big error, but I still have some questions:
1) Should I be using this kind of shortcut, i.e. using the ||-operator's return value, at all in C++? Is this compiler dependent?
2) Imagine that I really, really had to write the nice a || b syntax, could this be done cleanly in C++? By writing an operator redefinition? Without losing performance?
As a followup to my original request, or my own answer to 2 :-) I was thinking along the lines of using a class to encapsulate the (evil?) rawpointer:
class char_ptr_w {
const char* wrapped_;
public:
char_ptr_w(const char* wrapped) : wrapped_(wrapped) {}
char_ptr_w(char_ptr_w const& orig) { wrapped_=orig.wrapped(); }
~char_ptr_w() {}
inline const char* wrapped() const { return wrapped_; }
};
inline char_ptr_w operator||(char_ptr_w &lhs, char_ptr_w& rhs) {
if (lhs.wrapped() != NULL)
return char_ptr_w(lhs.wrapped());
else
return char_ptr_w(rhs.wrapped());
};
Then I could use:
char_ptr_w a(getDestinationHost());
char_ptr_w b(getHost());
addAVP(AVP_DESTINATION_HOST, a || b);
In which this addAVP would be overloaded for char_ptr_w. According to my tests, this generates at most the same assembly code as ternary a?b:c solution, particularly because of the NRVO optimization in the operator, which does not, in most compilers, call the copy-constructor (although you have to include it).
Naturally, in this particular example I agree that the ternary solution is the best. I also agree that operator redefinition is something to be taken with care, and not always beneficial. But is there anything conceptually wrong, in a C++ sense, with the above solution?
It is legal in C++ to overload the logical operators, but only if at least one of the arguments is of a class or enumeration type, and anyway it's a very bad idea. Overloaded logical operators do not short circuit, so this may cause apparently valid code elsewhere in your program to crash.
return p && p->q; // this can't possibly dereference a null pointer... can it?
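The loss of short-circuiting can be observed directly by counting evaluations. This sketch assumes C++14 (it modifies a local variable during constant evaluation); Flag, probe and the two counting functions are hypothetical names introduced for the demonstration.

```cpp
// Hypothetical wrapper type with an overloaded operator||.
struct Flag { bool value; };

constexpr Flag operator||(Flag lhs, Flag rhs) {
    return Flag{lhs.value || rhs.value};
}

// Records each evaluation so the two forms can be compared.
constexpr Flag probe(int& evaluations, bool v) {
    ++evaluations;
    return Flag{v};
}

// Overloaded ||: an ordinary function call, so both operands are evaluated.
constexpr int evaluationsWithOverload() {
    int n = 0;
    Flag r = probe(n, true) || probe(n, true);
    (void)r;
    return n;
}

// Built-in || on bool: the right operand is skipped once the left is true.
constexpr int evaluationsWithBuiltin() {
    int n = 0;
    bool r = probe(n, true).value || probe(n, true).value;
    (void)r;
    return n;
}
```

The overloaded form evaluates two operands where the built-in form evaluates one, so a right-hand side that dereferences a pointer would run even when the left-hand side was supposed to guard it.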
As you discovered, bool is an integral type that converts implicitly to int. The compiler is picking the correct function for the argument types you passed. If you want to keep similar syntax, you might try
char*gdh=0;
dMessage.addAVP(AVP_DESTINATION_HOST,
(gdh=peer->getDestinationHost()) ? gdh : peer->getHost());
I would strongly recommend against redefining the operator. From a maintenance perspective, this is very likely to confuse later developers.
Why are you using an "or" operator on two char pointers?
I am assuming that peer->getDestinationHost() or peer->getHost() can return a NULL, and you are trying to use the one that returns a valid string, right?
In that case you need to do this separately:
char *host = peer->getDestinationHost();
if(host == NULL)
host = peer->getHost();
dMessage.addAVP(AVP_DESTINATION_HOST, host);
It makes no sense to pass a boolean to a function that expects a char *.
In C++ || returns a bool, not one of its operands. It is usually a bad idea to fight the language.
1) Should I be using this kind of shortcut, i.e. using the ||-operator's return value, at all in C++. Is this compiler dependent?
It's not compiler dependent, but it doesn't do the same as what the || operator does in languages such as JavaScript, or what or does in Common Lisp. It coerces its first operand to a boolean value, and if that operand is true, returns true. If the first operand is false, the second is evaluated and coerced to a boolean value, and this boolean value is returned.
So what it is doing is the same as ( peer->getDestinationHost() != 0 ) || ( peer->getHost() != 0 ). This behaviour is not compiler dependent.
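The effect on overload resolution can be reproduced in isolation. In this sketch whichOverload is a hypothetical pair of overloads mirroring addAVP(int, int) and addAVP(int, char*), and the two pointer constants are placeholder values.

```cpp
// Hypothetical overload pair mirroring addAVP(int, int) and addAVP(int, char*).
constexpr int whichOverload(int)         { return 1; }  // the int version
constexpr int whichOverload(const char*) { return 2; }  // the pointer version

constexpr const char* destinationHost = nullptr;   // placeholder: "not set"
constexpr const char* host            = "example"; // placeholder value

// `a || b` yields bool; bool promotes to int and does not convert to a pointer.
constexpr int chosenForOr = whichOverload(destinationHost || host);

// The conditional operator keeps the pointer type, so the other overload wins.
constexpr int chosenForTernary =
    whichOverload(destinationHost ? destinationHost : host);
```

Here chosenForOr ends up as 1 and chosenForTernary as 2, matching what the question observed.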
2) Imagine that I really, really had to write the nice a || b syntax, could this be done cleanly in C++? By writing an operator redefinition? Without losing performance?
Since you are using pointers to chars, you can't overload the operator (overloading requires at least one parameter of class or enumeration type, and you've got two pointers). The equivalent statement in C++ would be to store the first value in a temporary variable and then use the ?: ternary operator, or you can write it inline with the cost of evaluating the first expression twice.
You could instead do something like:
dMessage.addAVP(AVP_DESTINATION_HOST, (peer->getDestinationHost())? peer->getDestinationHost() : peer->getHost());
This is not as neat as || but near to it.
Well, you're right about what the problem is with your code: a || b will return a bool, which is converted to int (0 for false, 1 for true).
As for your questions:
I'm not sure whether the return value is actually defined in the standard or not, but I wouldn't use the return value of || in any context other than a bool (since it's just not going to be clear).
I would use the ? operator instead. The syntax is: (Expression) ? (execute if true) : (execute if false). So in your case, I'd write: (peer->getDestinationHost() != NULL) ? peer->getDestinationHost() : peer->getHost(). Of course, this will call getDestinationHost() twice, which might not be desirable. If it's not, you're going to have to save the return value of getDestinationHost(), in which case I'd just forget about making it short and neat, and just use a plain old "if" outside of the function call. That's the best way to keep it working, efficient, and most importantly, readable.
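If the one-line call site matters, a small helper keeps it short without calling getDestinationHost() twice and without overloading any operator. This is a sketch only; firstNonNull is a hypothetical name, not something from the asker's code base.

```cpp
// Hypothetical helper: returns the first pointer unless it is null.
// Both arguments are evaluated before the call, so unlike a real `||`
// the fallback expression always runs.
template <typename T>
constexpr T* firstNonNull(T* preferred, T* fallback) {
    return preferred ? preferred : fallback;
}
```

The call would then read something like dMessage.addAVP(AVP_DESTINATION_HOST, firstNonNull(peer->getDestinationHost(), peer->getHost())); and, because the result is still a char*, the char* overload is selected.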
