I'm learning about operator overloading in C++, and I see that == and != are simply some special functions which can be customized for user-defined types. My concern is, though, why are there two separate definitions needed? I thought that if a == b is true, then a != b is automatically false, and vice versa, and there is no other possibility, because, by definition, a != b is !(a == b). And I couldn't imagine any situation in which this wasn't true. But perhaps my imagination is limited or I am ignorant of something?
I know that I can define one in terms of the other, but this is not what I'm asking about. I'm also not asking about the distinction between comparing objects by value or by identity. Or whether two objects could be equal and non-equal at the same time (this is definitely not an option! these things are mutually exclusive). What I'm asking about is this:
Is there any situation possible in which asking questions about two objects being equal does make sense, but asking about them not being equal doesn't make sense? (either from the user's perspective, or the implementer's perspective)
If there is no such possibility, then why on Earth does C++ have these two operators being defined as two distinct functions?
You would not want the language to automatically rewrite a != b as !(a == b) when a == b returns something other than a bool. And there are a few reasons why you might make it return such a type.
You may have expression builder objects, where a == b doesn't and isn't intended to perform any comparison, but simply builds some expression node representing a == b.
You may have lazy evaluation, where a == b doesn't and isn't intended to perform any comparison directly, but instead returns some kind of lazy<bool> that can be converted to bool implicitly or explicitly at some later time to actually perform the comparison. Possibly combined with the expression builder objects to allow complete expression optimisation before evaluation.
You may have some custom optional<T> template class, where given optional variables t and u, you want to allow t == u, but make it return optional<bool>.
There's probably more that I didn't think of. And even though in these examples the operations a == b and a != b both make sense, a != b still isn't the same thing as !(a == b), so separate definitions are needed.
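To make the lazy-evaluation case concrete, here is a minimal sketch of my own (the Lazy and Cell names are invented for illustration): comparing two objects builds a deferred computation instead of a bool, so a != b has to be defined on its own rather than as !(a == b).

#include <functional>
#include <iostream>
#include <string>

// Deferred comparison: nothing is evaluated until the result is needed.
struct Lazy {
    std::function<bool()> run;
    explicit operator bool() const { return run(); }   // evaluate on demand
};

struct Cell {
    std::string value;
};

Lazy operator==(const Cell& a, const Cell& b)
{
    return Lazy{[&a, &b] { return a.value == b.value; }};
}

Lazy operator!=(const Cell& a, const Cell& b)
{
    return Lazy{[&a, &b] { return a.value != b.value; }};   // not just !(a == b)
}

int main()
{
    Cell a{"x"}, b{"y"};
    Lazy same = (a == b);   // no comparison has happened yet
    Lazy diff = (a != b);
    std::cout << bool(same) << ' ' << bool(diff) << '\n';   // prints: 0 1
}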
If there is no such possibility, then why on Earth does C++ have these two operators being defined as two distinct functions?
Because you can overload them, and by overloading them you can give them a totally different meaning from their original one.
Take, for example, operator <<, originally the bitwise left shift operator, now commonly overloaded as an insertion operator, like in std::cout << something; totally different meaning from the original one.
So, if you accept that the meaning of an operator changes when you overload it, then there is no reason to prevent the user from giving operator == a meaning that is not exactly the negation of operator !=, though this might be confusing.
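To make that concrete, here is a tiny sketch of my own (the Log type is invented for illustration) that gives << a meaning unrelated to shifting bits:

#include <iostream>
#include <string>
#include <vector>

struct Log {
    std::vector<std::string> lines;
};

// "Insertion", not a shift: append a message and return the log for chaining.
Log& operator<<(Log& log, const std::string& msg)
{
    log.lines.push_back(msg);
    return log;
}

int main()
{
    Log log;
    log << "starting" << "done";            // chains like std::cout
    std::cout << log.lines.size() << '\n';  // prints: 2
}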
My concern is, though, why are there two separate definitions needed?
You don't have to define both.
If they are mutually exclusive, you can still be concise by only defining == and < alongside std::rel_ops.
From cppreference:
#include <iostream>
#include <utility>
struct Foo {
    int n;
};

bool operator==(const Foo& lhs, const Foo& rhs)
{
    return lhs.n == rhs.n;
}

bool operator<(const Foo& lhs, const Foo& rhs)
{
    return lhs.n < rhs.n;
}

int main()
{
    Foo f1 = {1};
    Foo f2 = {2};

    using namespace std::rel_ops;

    // all work as you would expect
    std::cout << "not equal: : " << (f1 != f2) << '\n';
    std::cout << "greater: : " << (f1 > f2) << '\n';
    std::cout << "less equal: : " << (f1 <= f2) << '\n';
    std::cout << "greater equal: : " << (f1 >= f2) << '\n';
}
Is there any situation possible in which asking questions about two
objects being equal does make sense, but asking about them not being
equal doesn't make sense?
We often associate these operators with equality.
Although that is how they behave on fundamental types, there is no obligation that this be their behaviour on custom data types.
You don't even have to return a bool if you don't want to.
I've seen people overload operators in bizarre ways, only to find that it makes sense for their domain specific application. Even if the interface appears to show that they are mutually exclusive, the author may want to add specific internal logic.
(either from the user's perspective, or the implementer's perspective)
I know you want a specific example,
so here is one from the Catch testing framework that I thought was practical:
template<typename RhsT>
ResultBuilder& operator == ( RhsT const& rhs ) {
    return captureExpression<Internal::IsEqualTo>( rhs );
}

template<typename RhsT>
ResultBuilder& operator != ( RhsT const& rhs ) {
    return captureExpression<Internal::IsNotEqualTo>( rhs );
}
These operators are doing different things, and it would not make sense to define one method as a !(not) of the other. The reason this is done, is so that the framework can print out the comparison made. In order to do that, it needs to capture the context of what overloaded operator was used.
There are some very well-established conventions in which (a == b) and (a != b) are not necessarily opposites; in particular, both can be false at the same time. In SQL, any comparison to NULL yields NULL, not true or false.
It's probably not a good idea to create new examples of this if at all possible, because it's so unintuitive, but if you're trying to model an existing convention, it's nice to have the option to make your operators behave "correctly" for that context.
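As a rough sketch of my own (the Tri and SqlInt names are invented; this is not from any SQL library), modelling that convention makes it obvious that the two operators cannot be each other's negation:

#include <iostream>
#include <optional>

enum class Tri { False, True, Unknown };

struct SqlInt {
    std::optional<int> v;   // empty means NULL
};

Tri operator==(const SqlInt& a, const SqlInt& b)
{
    if (!a.v || !b.v) return Tri::Unknown;   // any comparison to NULL is unknown
    return *a.v == *b.v ? Tri::True : Tri::False;
}

Tri operator!=(const SqlInt& a, const SqlInt& b)
{
    if (!a.v || !b.v) return Tri::Unknown;   // also unknown, not the opposite
    return *a.v != *b.v ? Tri::True : Tri::False;
}

int main()
{
    SqlInt a{1}, b{2}, null{};
    std::cout << ((a == b) == Tri::False) << '\n';        // prints: 1
    std::cout << ((a == null) == Tri::Unknown) << '\n';   // prints: 1
    std::cout << ((a != null) == Tri::Unknown) << '\n';   // prints: 1
}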
I will only answer the second part of your question, namely:
If there is no such possibility, then why on Earth does C++ have these two operators being defined as two distinct functions?
One reason why it makes sense to allow the developer to overload both is performance. You might allow optimizations by implementing both == and !=: x != y might then be cheaper than !(x == y). Some compilers may be able to optimize this for you, but perhaps not, especially if you have complex objects with a lot of branching involved.
Even in Haskell, where developers take laws and mathematical concepts very seriously, one is still allowed to overload both == and /=, as you can see here (http://hackage.haskell.org/package/base-4.9.0.0/docs/Prelude.html#v:-61--61-):
$ ghci
GHCi, version 7.10.2: http://www.haskell.org/ghc/ :? for help
λ> :i Eq
class Eq a where
(==) :: a -> a -> Bool
(/=) :: a -> a -> Bool
-- Defined in `GHC.Classes'
This would probably be considered micro-optimization, but it might be warranted for some cases.
Is there any situation possible in which asking questions about two
objects being equal does make sense, but asking about them not being
equal doesn't make sense? (either from the user's perspective, or the
implementer's perspective)
That's an opinion. Maybe it doesn't. But the language designers, not being omniscient, decided not to restrict people who might come up with situations in which it might make sense (at least to them).
In response to the edit:
That is, if it is possible for some type to have the operator == but not the !=, or vice versa, and when does it make sense to do so.
In general, no, it doesn't make sense. Equality and relational operators generally come in sets: if there is equality, then inequality as well; less than goes with greater than, and so on with <= etc. A similar approach applies to the arithmetic operators; they also generally come in natural logical sets.
This is evidenced in the std::rel_ops namespace. If you implement the equality and less than operators, using that namespace gives you the others, implemented in terms of your original implemented operators.
That all said, are there conditions or situations where the one would not immediately mean the other, or could not be implemented in terms of the others? Yes there are, arguably few, but they are there; again, as evidenced by rel_ops being a namespace of its own. For that reason, allowing them to be implemented independently lets you leverage the language to get the semantics you require or need in a way that is still natural and intuitive for the user or client of the code.
The lazy evaluation already mentioned is an excellent example of this. Another good example is giving them semantics that don't mean equality or in-equality at all. A similar example to this is the bit shift operators << and >> being used for stream insertion and extraction. Although it may be frowned upon in general circles, in some domain specific areas it may make sense.
The == and != operators don't have to imply equality at all, in the same way that the << and >> stream operators don't imply bit-shifting. If you treat the symbols as if they mean some other concept, they don't have to be mutually exclusive.
In terms of equality, it could make sense if your use-case warrants treating objects as non-comparable, so that every comparison should return false (or a non-comparable result type, if your operators return non-bool). I can't think of a specific situation where this would be warranted, but I could see it being reasonable enough.
With great power comes great responsibility, or at least really good style guides.
== and != can be overloaded to do whatever the heck you want. It's both a blessing and a curse. There's no guarantee that != means !(a==b).
enum BoolPlus {
    kFalse = 0,
    kTrue = 1,
    kFileNotFound = -1
};

BoolPlus operator==(const File& lhs, const File& rhs);
BoolPlus operator!=(const File& lhs, const File& rhs);
I can't justify this operator overloading, but in the example above it is impossible to define operator!= as the "opposite" of operator==.
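A possible way to flesh that out (File and its members are invented here purely for illustration): when either file is missing, both operators return kFileNotFound, so neither result can be derived by negating the other.

#include <iostream>

enum BoolPlus { kFalse = 0, kTrue = 1, kFileNotFound = -1 };

struct File {
    bool exists;
    int checksum;
};

BoolPlus operator==(const File& lhs, const File& rhs)
{
    if (!lhs.exists || !rhs.exists) return kFileNotFound;
    return lhs.checksum == rhs.checksum ? kTrue : kFalse;
}

BoolPlus operator!=(const File& lhs, const File& rhs)
{
    if (!lhs.exists || !rhs.exists) return kFileNotFound;   // same third state
    return lhs.checksum != rhs.checksum ? kTrue : kFalse;
}

int main()
{
    File a{true, 1}, missing{false, 0};
    std::cout << (a == missing) << ' ' << (a != missing) << '\n';   // prints: -1 -1
}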
In the end, what you get from those operators is simply whatever value the expression a == b or a != b returns, typically a Boolean (true or false). Each expression returns whatever its own implementation computes after the comparison; nothing forces the two results to be mutually exclusive.
[..] why are there two separate definitions needed?
One thing to consider is that there might be the possibility of implementing one of these operators more efficiently than just using the negation of the other.
(My example here was rubbish, but the point still stands: think of Bloom filters, for example. They allow fast testing of whether something is not in a set, but testing whether it is in the set may take a lot more time.)
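For reference, here is a rough sketch of my own of that Bloom-filter asymmetry (about set membership, not about == and != themselves; the BloomSet type and its two hash functions are invented for illustration): a negative answer comes straight from a small bit array, while a positive answer needs a slow confirmation step.

#include <bitset>
#include <cstddef>
#include <functional>
#include <iostream>
#include <set>
#include <string>

struct BloomSet {
    std::bitset<1024> bits;
    std::set<std::string> slow;   // stand-in for expensive exact storage

    static std::size_t h1(const std::string& s) { return std::hash<std::string>{}(s) % 1024; }
    static std::size_t h2(const std::string& s) { return (std::hash<std::string>{}(s) / 1024) % 1024; }

    void insert(const std::string& s)
    {
        bits.set(h1(s));
        bits.set(h2(s));
        slow.insert(s);
    }

    bool contains(const std::string& s) const
    {
        if (!bits.test(h1(s)) || !bits.test(h2(s)))
            return false;              // fast path: definitely not present
        return slow.count(s) != 0;     // slow path: confirm a "maybe"
    }
};

int main()
{
    BloomSet set;
    set.insert("apple");
    std::cout << set.contains("apple") << ' ' << set.contains("pear") << '\n';   // prints: 1 0
}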
[..] by definition, a != b is !(a == b).
And it's your responsibility as programmer to make that hold. Probably a good thing to write a test for.
By customizing the behavior of the operators, you can make them do what you want.
For instance, you may have a class whose objects can be compared just by checking one specific property. Knowing that, you can write code that checks only the minimum necessary, instead of checking every single bit of every single property in the whole object.
Imagine a case where you can figure out that something is different just as fast as, if not faster than, you can find out that something is the same. Granted, once you figure out whether something is the same or different, you can know the opposite simply by flipping a bit. However, flipping that bit is an extra operation. In some cases, when code gets re-executed a lot, saving one operation (multiplied by many times) can give an overall speed increase. (For instance, if you save one operation per pixel of a megapixel screen, then you've just saved a million operations. Multiplied by 60 screens per second, you save even more operations.)
hvd's answer provides some additional examples.
Yes, because one means "equivalent" and the other means "non-equivalent", and these terms are mutually exclusive. Any other meaning for these operators is confusing and should be avoided by all means.
Maybe an incomparable rule, where a != b is false and a == b is also false, like a stateless bit:
if( !(a == b || a != b) ){
// Stateless
}
Related
I just came onto a project with a pretty huge code base.
I'm mostly dealing with C++ and a lot of the code they write uses double negation for their boolean logic.
if (!!variable && (!!api.lookup("some-string"))) {
do_some_stuff();
}
I know these guys are intelligent programmers, it's obvious they aren't doing this by accident.
I'm no seasoned C++ expert; my only guess at why they are doing this is that they want to make absolutely positive that the value being evaluated is the actual boolean representation. So they negate it, then negate that again to get it back to its actual boolean value.
Is this correct, or am I missing something?
It's a trick to convert to bool.
It's actually a very useful idiom in some contexts. Take these macros (example from the Linux kernel). For GCC, they're implemented as follows:
#define likely(cond) (__builtin_expect(!!(cond), 1))
#define unlikely(cond) (__builtin_expect(!!(cond), 0))
Why do they have to do this? GCC's __builtin_expect treats its parameters as long and not bool, so there needs to be some form of conversion. Since they don't know what cond is when they're writing those macros, it is most general to simply use the !! idiom.
They could probably do the same thing by comparing against 0, but in my opinion, it's actually more straightforward to do the double-negation, since that's the closest to a cast-to-bool that C has.
This code can be used in C++ as well... it's a lowest-common-denominator thing. If possible, do what works in both C and C++.
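Here is a small demo of my own (not kernel code) of what the !! in those macros accomplishes: it collapses any scalar value to exactly 0 or 1, which is what the long parameter of __builtin_expect is compared against.

#include <stdio.h>

int main(void)
{
    int n = 42;
    const char *p = "hello";
    double d = 0.0;

    /* !! collapses any non-zero scalar to 1 and zero/null to 0 */
    printf("%d %d %d\n", !!n, !!p, !!d);   /* prints: 1 1 0 */
    return 0;
}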
The coders think that it will convert the operand to bool, but because the operands of && are already implicitly converted to bool, it's utterly redundant.
Yes it is correct and no you are not missing something. !! is a conversion to bool. See this question for more discussion.
It's a technique to avoid writing (variable != 0) - i.e. to convert from whatever type it is to a bool.
IMO, code like this has no place in systems that need to be maintained, because it is not immediately readable (hence the question in the first place).
Code must be legible - otherwise you leave a time debt legacy for the future - as it takes time to understand something that is needlessly convoluted.
It side-steps a compiler warning. Try this:
int _tmain(int argc, _TCHAR* argv[])
{
    int foo = 5;
    bool bar = foo;
    bool baz = !!foo;
    return 0;
}
The 'bar' line generates a "forcing value to bool 'true' or 'false' (performance warning)" on MSVC++, but the 'baz' line sneaks through fine.
Legacy C developers had no Boolean type, so they often #define TRUE 1 and #define FALSE 0 and then used arbitrary numeric data types for Boolean comparisons. Now that we have bool, many compilers will emit warnings when certain types of assignments and comparisons are made using a mixture of numeric types and Boolean types. These two usages will eventually collide when working with legacy code.
To work around this problem, some developers use the following Boolean identity: !num_value returns bool true if num_value == 0; false otherwise. !!num_value returns bool false if num_value == 0; true otherwise. The single negation is sufficient to convert num_value to bool; however, the double negation is necessary to restore the original sense of the Boolean expression.
This pattern is known as an idiom, i.e., something commonly used by people familiar with the language. Therefore, I don't see it as an anti-pattern, as much as I would static_cast<bool>(num_value). The cast might very well give the correct results, but some compilers then emit a performance warning, so you still have to address that.
The other way to address this is to say, (num_value != FALSE). I'm okay with that too, but all in all, !!num_value is far less verbose, may be clearer, and is not confusing the second time you see it.
Is operator! overloaded?
If not, they're probably doing this to convert the variable to a bool without producing a warning. This is definitely not a standard way of doing things.
!! was used to cope with original C++ which did not have a boolean type (as neither did C).
Example Problem:
Inside if(condition), the condition needs to evaluate to some type like double, int, void*, etc., but not bool as it does not exist yet.
Say a class int256 (a 256-bit integer) existed and all integer conversions/casts were overloaded.
int256 x = foo();
if (x) ...
To test if x was "true" or non-zero, if (x) would convert x to some integer and then assess if that int was non-zero. A typical overload of (int) x would return only the LSbits of x. if (x) was then only testing the LSbits of x.
But C++ has the ! operator. An overloaded !x would typically evaluate all the bits of x. So to get back to the non-inverted logic if (!!x) is used.
Ref Did older versions of C++ use the `int` operator of a class when evaluating the condition in an `if()` statement?
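Here is a minimal sketch of my own of that historical problem (the Int128ish type is invented; real code would be a full big-integer class): the conversion to int keeps only the low bits, so if (x) can be wrong, while operator! inspects the whole value, so if (!!x) is reliable.

#include <cstdint>
#include <iostream>

struct Int128ish {
    std::uint64_t hi, lo;

    operator int() const { return static_cast<int>(lo); }   // truncating, as in old code
    bool operator!() const { return hi == 0 && lo == 0; }    // checks every bit
};

int main()
{
    Int128ish x{1, 0};                         // non-zero, but the low bits are all 0
    if (x)   std::cout << "if(x) is true\n";   // not printed: (int)x is 0
    if (!!x) std::cout << "if(!!x) is true\n"; // printed: !x is false, so !!x is true
}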
As Marcin mentioned, it might well matter if operator overloading is in play. Otherwise, in C/C++ it doesn't matter except if you're doing one of the following things:
direct comparison to true (or in C something like a TRUE macro), which is almost always a bad idea. For example:
if (api.lookup("some-string") == true) {...}
you simply want something converted to a strict 0/1 value. In C++ an assignment to a bool will do this implicitly (for those things that are implicitly convertible to bool). In C or if you're dealing with a non-bool variable, this is an idiom that I've seen, but I prefer the (some_variable != 0) variety myself.
I think in the context of a larger boolean expression it simply clutters things up.
If variable is of object type, it might have a ! operator defined but no cast to bool (or worse, an implicit cast to int with different semantics). Calling the ! operator twice results in a conversion to bool that works even in strange cases.
This may be an example of the double-bang trick (see The Safe Bool Idiom for more details). Here I summarize the first page of the article.
In C++ there are a number of ways to provide Boolean tests for classes.
An obvious way is the operator bool conversion operator.
// operator bool version
class Testable {
    bool ok_;
public:
    explicit Testable(bool b = true) : ok_(b) {}
    operator bool() const { // use bool conversion operator
        return ok_;
    }
};
We can test the class thus:
Testable test;
if (test) {
std::cout << "Yes, test is working!\n";
}
else {
std::cout << "No, test is not working!\n";
}
However, operator bool is considered unsafe because it allows nonsensical operations such as test << 1; or int i = test.
Using operator! is safer because we avoid implicit conversion or overloading issues.
The implementation is trivial,
bool operator!() const { // use operator!
return !ok_;
}
The two idiomatic ways to test a Testable object are
Testable test;
if (!!test) {
std::cout << "Yes, test is working!\n";
}
if (!test) {
std::cout << "No, test is not working!\n";
}
The first version if (!!test) is what some people call the double-bang trick.
It's correct but, in C, pointless here -- 'if' and '&&' would treat the expression the same way without the '!!'.
The reason to do this in C++, I suppose, is that '&&' could be overloaded. But then, so could '!', so it doesn't really guarantee you get a bool, without looking at the code for the types of variable and api.call. Maybe someone with more C++ experience could explain; perhaps it's meant as a defense-in-depth sort of measure, not a guarantee.
Maybe the programmers were thinking something like this...
!!myAnswer is boolean. In context, it should become boolean, but I just love to bang bang things to make sure, because once upon a time there was a mysterious bug that bit me, and bang bang, I killed it.
As pointed out by this article, it is impossible to overload the comparison operator (==) such that both sides could take primitive types.
"No, the C++ language requires that your operator overloads take at least one operand of a "class type" or enumeration type. The C++ language will not let you define an operator all of whose operands / parameters are of primitive types." (parashift)
I was wondering:
If I really, really needed to compare two primitives in a non-standard way using the ==, is there a way to implicitly cast them to some other class?
For example, the following code will work for const char* comparison, but it requires an explicit cast. I would prefer to avoid explicit casts if possible.
// With an explicit cast
if(string("a")=="A") // True
// Without the cast
if("a"=="A") // False
// An example overloaded function:
bool operator == (string a, const char* b)
{
// Compares ignoring case
}
Casting can be pretty clunky in some situations, especially if you need to do several casts inside a long expression. So that's why I was looking for a way to automatically cast the first input (or both) to a string type.
Edit 1:
Another way to do this is to write an isEqual(const char* a, const char* b) function, but I want to avoid this because it will result in a mess of parentheses if I were to use it inside of a large if-statement. Here's an oversimplified example that still shows what I mean:
if (str1 == str2 || str1 == str3 || str2==str4)
As opposed to:
if (isEqual(str1,str2) || isEqual(str1,str3) || isEqual(str2,str4))
Edit 2:
I know there exist many ways to achieve the desired functionality without overloading the ==. But I looking specifically for a way to make the == work because I then could apply the knowledge to other operators as well.
This question is in fact closely related to the Wacky Math Calculator question I asked a few weeks ago, and being able to overload the == will help make the code look considerably nicer (visually, but perhaps not in a "clean code" way).
And that's why I wanted to ask this question here on SO, in case someone had a cool C++ trick up their sleeve that I didn't know about. But if the answer is no, then that's fine too.
You could certainly write one or more free functions to do your comparison. It doesn't have to be an operator overload.
for example:
bool IsEqual(const char* a, const string& b)
{
// code
}
bool IsEqual(const string& a, const char* b)
{
// code
}
and so on.
If the types of the operands are given, and you cannot add an overload for equality, because
It is simply not overloadable, because all are primitive types
There is already an overload, which does not do what you want
You do not want to risk violating the ODR because someone else could be pulling the same kind of stunts you do (in which case a TU-local overload might work, i.e. a free file-local function; be aware of adverse effects on templates).
there are just two options:
Use a wrapper and overload all the operators to your heart's content (see the sketch after this list).
Use a function having the desired behavior explicitly and just forget about the syntax-sugar.
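A minimal sketch of the wrapper option (the CiStr type and the case-insensitive comparison are my own illustration, not a known library): wrapping one side in a class type makes the overload legal, and the comparison can then ignore case.

#include <cctype>
#include <iostream>

struct CiStr {
    const char* s;
};

// Case-insensitive comparison against a plain C string.
bool operator==(CiStr a, const char* b)
{
    for (; *a.s && *b; ++a.s, ++b)
        if (std::tolower((unsigned char)*a.s) != std::tolower((unsigned char)*b))
            return false;
    return *a.s == *b;   // equal only if both strings ended together
}

int main()
{
    std::cout << (CiStr{"a"} == "A") << '\n';    // prints: 1
    std::cout << (CiStr{"ab"} == "A") << '\n';   // prints: 0
}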
BTW: standard-library containers and algorithms can often be customized in three ways:
Using a type having overloaded operators
Specializing the standard traits class that is used
Providing a different traits-class.
No, there's no way to implicitly cast the primitive types the way you want.
The only way to achieve the desired functionality is either by explicitly casting the inputs into another class as stated in the question, or by creating a free function as shown by #Logicat.
I'm browsing through some code and I found a few ternary operators in it. This code is a library that we use, and it's supposed to be quite fast.
I'm wondering whether we're saving anything except space there.
What's your experience?
Performance
The ternary operator shouldn't differ in performance from a well-written equivalent if/else statement... they may well resolve to the same representation in the Abstract Syntax Tree, undergo the same optimisations, etc.
Things you can only do with ? :
If you're initialising a constant or reference, or working out which value to use inside a member initialisation list, then if/else statements can't be used but ? : can be:
const int x = f() ? 10 : 2;
X::X(int n) : n_(n > 0 ? 2 * n : 0) { }
Factoring for concise code
Key reasons to use ? : include localisation, and avoiding redundantly repeating other parts of the same statements/function-calls, for example:
if (condition)
    return x;
else
    return y;
...is only preferable to...
return condition ? x : y;
...on readability grounds if dealing with very inexperienced programmers, or some of the terms are complicated enough that the ? : structure gets lost in the noise. In more complex cases like:
fn(condition1 ? t1 : f1, condition2 ? t2 : f2, condition3 ? t3 : f3);
An equivalent if/else:
if (condition1)
    if (condition2)
        if (condition3)
            fn(t1, t2, t3);
        else
            fn(t1, t2, f3);
    else if (condition3)
        fn(t1, f2, t3);
    else
        fn(t1, f2, f3);
else
    if (condition2)
        ...etc...
That's a lot of extra function calls that the compiler may or may not optimise away.
Further, ? : allows you to select an object, then use a member thereof:
(f() ? x : y).fn((g() ? c : d).field_name);
The equivalent if/else would be:
if (f())
    if (g())
        x.fn(c.field_name);
    else
        x.fn(d.field_name);
else
    if (g())
        y.fn(c.field_name);
    else
        y.fn(d.field_name);
Can't named temporaries improve the if/else monstrosity above?
If the expressions t1, f1, t2 etc. are too verbose to type repeatedly, creating named temporaries may help, but then:
To get performance matching ? : you may need to use std::move, except when the same temporary is passed to two && parameters in the function called: then you must avoid it. That's more complex and error-prone.
c ? x : y evaluates c and then either x or y, but not both, which makes it safe to, say, test that a pointer isn't nullptr before using it, while providing some fallback value/behaviour. The code only gets the side effects of whichever of x and y is actually selected. With named temporaries, you may need if / else around, or ? : inside, their initialisation to prevent unwanted code executing, or code executing more often than desired.
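A tiny illustration of that guarantee (the function and names are invented): only the selected branch is evaluated, so the fallback is safe when the pointer is null.

#include <string>

std::string display_name(const std::string* p)
{
    return p ? *p : std::string("unknown");   // *p is never evaluated when p is null
}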
Functional difference: unifying result type
Consider:
void is(int) { std::cout << "int\n"; }
void is(double) { std::cout << "double\n"; }

void f(bool expr)
{
    is(expr ? 1 : 2.0);

    if (expr)
        is(1);
    else
        is(2.0);
}
In the conditional operator version above, 1 undergoes a Standard Conversion to double so that its type matches that of 2.0, meaning the is(double) overload is called even for the true/1 situation. The if/else statement doesn't trigger this conversion: the true/1 branch calls is(int).
You can't use expressions with an overall type of void in a conditional operator either, whereas they're valid in statements under an if/else.
Emphasis: value-selection before/after action needing values
There's a different emphasis:
An if/else statement emphasises the branching first and what's to be done is secondary, while a ternary operator emphasises what's to be done over the selection of the values to do it with.
In different situations, either may better reflect the programmer's "natural" perspective on the code and make it easier to understand, verify and maintain. You may find yourself selecting one over the other based on the order in which you consider these factors when writing the code - if you've launched into "doing something" then find you might use one of a couple (or few) values to do it with, ? : is the least disruptive way to express that and continue your coding "flow".
The only potential benefit to ternary operators over plain if statements in my view is their ability to be used for initializations, which is particularly useful for const:
E.g.
const int foo = (a > b ? b : a - 10);
Doing this with an if/else block is impossible without using a function call as well. If you happen to have lots of cases of const things like this, you might find there's a small gain from initializing a const properly over assignment with if/else. Measure it! It probably won't even be measurable, though. The reason I tend to do this is because by marking it const the compiler knows when I do something later that could/would accidentally change something I thought was fixed.
Effectively what I'm saying is that ternary operator is important for const-correctness, and const correctness is a great habit to be in:
This saves a lot of your time by letting the compiler help you spot mistakes you make
This can potentially let the compiler apply other optimizations
Well...
I did a few tests with GCC and this function call:
add(argc, (argc > 1)?(argv[1][0] > 5)?50:10:1, (argc > 2)?(argv[2][0] > 5)?50:10:1, (argc > 3)?(argv[3][0] > 5)?50:10:1);
The resulting assembler code with gcc -O3 had 35 instructions.
The equivalent code with if/else + intermediate variables had 36. With nested if/else using the fact that 3 > 2 > 1, I got 44. I did not even try to expand this into separate function calls.
Now I did not do any performance analysis, nor did I do a quality check of the resulting assembler code, but at something simple like this with no loops e.t.c. I believe shorter is better.
It appears that there is some value to ternary operators after all :-)
That is only if code speed is absolutely crucial, of course. If/else statements are much easier to read when nested than something like (c1)?(c2)?(c3)?(c4)?1:2:3:4:5. And having huge expressions as function arguments is not fun.
Also keep in mind that nested ternary expressions make refactoring the code - or debugging by placing a bunch of handy printfs() at a condition - a lot harder.
If you're worried about it from a performance perspective then I'd be very surprised if there was any difference between the two.
From a look 'n feel perspective it's mainly down to personal preference. If the condition is short and the true/false parts are short then a ternary operator is fine, but anything longer tends to be better in an if/else statement (in my opinion).
You assume that there must be a distinction between the two when, in fact, there are a number of languages which forgo the "if-else" statement in favor of an "if-else" expression (in this case, they may not even have the ternary operator, which is no longer needed).
Imagine:
x = if (t) a else b
Anyway, the ternary operator is an expression in some languages (C, C#, C++, Java, etc.) which do not have "if-else" expressions, and thus it serves a distinct role there.
Here is some code I'm writing in C++. There's a call to an addAVP() function
dMessage.addAVP(AVP_DESTINATION_HOST, peer->getDestinationHost() || peer->getHost());
which has two versions: one overloaded in the second parameter to addAVP(int, char*) and another to addAVP(int, int). I find the C++ compiler I use calls the addAVP(int, int) version which is not what I wanted since getDestinationHost() and getHost() both return char*.
Nonetheless the || operator is defined to return bool so I can see where my error is. Bool somehow counts as an integer and this compiles cleanly and calls the second addAVP().
Lately I'm using a lot of dynamically typed languages, e.g. Lisp, where the above code is correct and can be written without worries. Clearly the above code in C++ is a big error, but I still have some questions:
Should I be using this kind of shortcut, i.e. using the ||-operator's return value, at all in C++? Is this compiler dependent?
Imagine that I really, really had to write the nice a || b syntax, could this be done cleanly in C++? By writing an operator redefinition? Without losing performance?
As a followup to my original request, or my own answer to 2 :-) I was thinking along the lines of using a class to encapsulate the (evil?) raw pointer:
class char_ptr_w {
    const char* wrapped_;
public:
    char_ptr_w(const char* wrapped) : wrapped_(wrapped) {}
    char_ptr_w(char_ptr_w const& orig) { wrapped_ = orig.wrapped(); }
    ~char_ptr_w() {}
    inline const char* wrapped() const { return wrapped_; }
};

inline char_ptr_w operator||(char_ptr_w& lhs, char_ptr_w& rhs) {
    if (lhs.wrapped() != NULL)
        return char_ptr_w(lhs.wrapped());
    else
        return char_ptr_w(rhs.wrapped());
}
Then I could use:
char_ptr_w a(getDestinationHost());
char_ptr_w b(getHost());
addAVP(AVP_DESTINATION_HOST, a || b);
In which this addAVP would be overloaded for char_ptr_w. According to my tests, this generates at most the same assembly code as the ternary a ? b : c solution, particularly because of the NRVO optimization in the operator, which in most compilers avoids calling the copy constructor (although you still have to provide one).
Naturally, in this particular example I agree that the ternary solution is the best. I also agree that operator redefinition is something to be taken with care, and not always beneficial. But is there anything conceptually wrong, in a C++ sense, with the above solution?
It is legal in C++ to overload the logic operators, but only if one or both of the arguments are of a class type, and anyway it's a very bad idea. Overloaded logic operators do not short circuit, so this may cause apparently valid code elsewhere in your program to crash.
return p && p->q; // this can't possibly dereference a null pointer... can it?
As you discovered, a bool converts to an int. The compiler is picking the correct function for that signature. If you want to keep similar syntax, you might try
char* gdh = 0;
dMessage.addAVP(AVP_DESTINATION_HOST,
                (gdh = peer->getDestinationHost()) ? gdh : peer->getHost());
I would strongly recommend against redefining the operator. From a maintenance perspective, this is very likely to confuse later developers.
Why are you using an "or" operator on two char pointers?
I am assuming that peer->getDestinationHost() or peer->getHost() can return a NULL, and you are trying to use the one that returns a valid string, right?
In that case you need to do this separately:
char *host = peer->getDestinationHost();
if (host == NULL)
    host = peer->getHost();
dMessage.addAVP(AVP_DESTINATION_HOST, host);
It makes no sense to pass a boolean to a function that expects a char *.
In C++ || returns a bool, not one of its operands. It is usually a bad idea to fight the language.
1) Should I be using this kind of shortcut, i.e. using the ||-operator's return value, at all in C++. Is this compiler dependent?
It's not compiler dependent, but it doesn't do the same as what the || operator does in languages such as JavaScript or Common Lisp. It coerces its first operand to a boolean value, and if that operand is true, returns true. If the first operand is false, the second is evaluated and coerced to a boolean value, and that boolean value is returned.
So what it is doing is the same as ( peer->getDestinationHost() != 0 ) || ( peer->getHost() != 0 ). This behaviour is not compiler dependent.
2) Imagine that I really, really had to write the nice a || b syntax, could this be done cleanly in C++? By writing an operator redefinition? Without losing performance?
Since you are using pointers to chars, you can't overload the operator (overloading requires at least one formal parameter of a class type, and you've got two pointers). The equivalent statement in C++ would be to store the first value in a temporary variable and then use the ?: ternary operator, or you can write it inline at the cost of evaluating the first expression twice.
You could instead do something like:
dMessage.addAVP(AVP_DESTINATION_HOST, (peer->getDestinationHost())? peer->getDestinationHost() : peer->getHost());
This is not as neat as || but near to it.
Well, you're right about what the problem is with your code: a || b will return a bool, which is converted to int (0 for false, 1 for true).
As for your questions:
I'm not sure whether the return value is actually defined in the standard or not, but I wouldn't use the return value of || in any context other than a bool (since it's just not going to be clear).
I would use the ? operator instead. The syntax is: (expression) ? (execute if true) : (execute if false). So in your case, I'd write: (peer->getDestinationHost() != NULL) ? peer->getDestinationHost() : peer->getHost(). Of course, this will call getDestinationHost() twice, which might not be desirable. If it's not, you're going to have to save the return value of getDestinationHost(), in which case I'd just forget about making it short and neat, and use a plain old "if" outside of the function call. That's the best way to keep it working, efficient, and most importantly, readable.