Better alternatives to assert(false) in C/C++

Currently, I write
assert(false);
at places that my code is never supposed to reach. One example, in a very C-ish style, is:
int findzero( int length, int * array ) {
for( int i = 0; i < length; i++ )
if( array[i] == 0 )
return i;
assert(false);
}
My compiler recognizes that the program finishes once assert(false) has been reached. However, whenever I compile with -DNDEBUG for performance reasons, the last assertion vanishes and the compiler warns that the execution finishes the function without a return statement.
What are better alternatives of finishing off a program if a supposedly unreachable part of the code has been reached? The solution should
be recognized by the compiler and not produce warnings (like the ones above or others)
perhaps even allow for a custom error message.
I am explicitly interested in solutions no matter whether it's modern C++ or like 90s C.

Replacing your assert(false) is exactly what "unreachable" built-ins are for.
They are a semantic equivalent to your use of assert(false). In fact, VS's is spelt very similarly.
GCC/Clang/Intel:
__builtin_unreachable()
MSVC:
__assume(false)
These have effect regardless of NDEBUG (unlike assert) or optimisation levels.
Your compiler, particularly with the above built-ins but also possibly with your assert(false), nods its head in understanding that you're promising that part of the function will never be reached. It can use this to perform some optimisations on certain code paths, and it will silence warnings about missing returns because you've already promised that it was deliberate.
The trade-off is that the statement itself has undefined behaviour (much like going forth and flowing off the end of the function was already). In some situations, you may instead wish to consider throwing an exception (or returning some "error code" value instead), or calling std::abort() (in C++) if you want to just terminate the program.
There's a proposal (P0627R0) to add this to C++ as a standard attribute.
From the GCC docs on Builtins:
If control flow reaches the point of the __builtin_unreachable, the program is undefined. It is useful in situations where the compiler cannot deduce the unreachability of the code. [..]

I like to use
assert(!"This should never happen.");
...which can also be used with a condition, as in
assert(!vector.empty() || !"Cannot take element from empty container." );
What's nice about this is that the string shows up in the error message in case an assertion does not hold.
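A closely related spelling that is also common is cond && "message". A string literal is a non-null pointer, so it never changes the truth value of the condition. Here is a minimal sketch of that variant; take_last is a made-up helper, not something from the question:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: removes and returns the last element.
int take_last(std::vector<int>& v) {
    // The literal converts to true, so the assertion fails exactly when
    // v is empty, and the text shows up in the failure message.
    assert(!v.empty() && "Cannot take element from empty container.");
    int last = v.back();
    v.pop_back();
    return last;
}
```

Like any assert, this check disappears when NDEBUG is defined.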

As a fully portable solution, consider this:
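#include <cstdlib>      // std::abort
#include <iostream>     // std::cerr
#include <string_view>  // std::string_view (C++17)
[[ noreturn ]] void unreachable(std::string_view msg = "<No Message>") {
    std::cerr << "Unreachable code reached. Message: " << msg << std::endl;
    std::abort();
}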
The message part is, of course, optional.

Looks like std::unreachable() made it to C++23:
https://en.cppreference.com/w/cpp/utility/unreachable
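If you can't rely on C++23 yet, a small wrapper can pick std::unreachable() when the library advertises it and fall back to the compiler built-ins otherwise. This is only a sketch; unreachable_compat, Color and name are names I made up for illustration:

```cpp
#include <cstdlib>
#include <utility>

// Hypothetical wrapper: uses the C++23 facility if present, otherwise a
// compiler built-in, otherwise a plain abort.
[[noreturn]] inline void unreachable_compat() {
#if defined(__cpp_lib_unreachable)
    std::unreachable();
#elif defined(__GNUC__) || defined(__clang__)
    __builtin_unreachable();
#elif defined(_MSC_VER)
    __assume(false);
#else
    std::abort();
#endif
}

enum class Color { Red, Green, Blue };

const char* name(Color c) {
    switch (c) {
        case Color::Red:   return "red";
        case Color::Green: return "green";
        case Color::Blue:  return "blue";
    }
    unreachable_compat(); // every enumerator is handled above
}
```

As with the built-ins, actually reaching the call is undefined behaviour in every branch except the std::abort() fallback.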

I use a custom assert that turns into __builtin_unreachable() or *(char*)0=0 when NDEBUG is on (I also use an enum variable instead of a macro so that I can easily set NDEBUG per scope).
In pseudocode, it's something like:
#define my_assert(X) do{ \
if(!(X)){ \
if (my_ndebug) MY_UNREACHABLE(); \
else my_assert_fail(__FILE__,__LINE__,#X); \
} \
}while(0)
The __builtin_unreachable() should eliminate the warning and help with optimization at the same time, but in debug mode, it's better to have an assert or an abort(); there so you get a reliable panic. (__builtin_unreachable() just gives you undefined behavior when reached).
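To make the pseudocode concrete, here is one way it could be filled in. Treat it as a sketch under assumptions: the definitions of MY_UNREACHABLE, my_ndebug and my_assert_fail below are my guesses at reasonable implementations, not the exact ones I use.

```cpp
#include <cstdio>
#include <cstdlib>

// Compiler-specific "never reached" hint; falls back to abort().
#if defined(__GNUC__) || defined(__clang__)
#  define MY_UNREACHABLE() __builtin_unreachable()
#elif defined(_MSC_VER)
#  define MY_UNREACHABLE() __assume(false)
#else
#  define MY_UNREACHABLE() std::abort()
#endif

// An enum constant instead of a macro, so a scope can shadow it.
enum { my_ndebug = 0 };

[[noreturn]] inline void my_assert_fail(const char* file, int line,
                                        const char* expr) {
    std::fprintf(stderr, "%s:%d: assertion failed: %s\n", file, line, expr);
    std::abort();
}

#define my_assert(X) do { \
        if (!(X)) { \
            if (my_ndebug) MY_UNREACHABLE(); \
            else my_assert_fail(__FILE__, __LINE__, #X); \
        } \
    } while (0)

int findzero(int length, const int* array) {
    for (int i = 0; i < length; i++)
        if (array[i] == 0)
            return i;
    my_assert(false); // debug: reliable abort; release: optimizer hint
}
```

Declaring a local enum { my_ndebug = 1 }; inside a function or block switches that scope to the "release" behaviour.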

I recommend the C++ Core Guidelines' Expects and Ensures. They can be configured to abort (default), throw, or do nothing on violation.
To suppress compiler warnings on unreachable branches you can also use GSL_ASSUME.
#include <gsl/gsl>
int findzero( int length, int * array ) {
Expects(length >= 0);
Expects(array != nullptr);
for( int i = 0; i < length; i++ )
if( array[i] == 0 )
return i;
Expects(false);
// or
// GSL_ASSUME(false);
}

assert is meant for scenarios that are ACTUALLY supposed to be impossible to happen during execution. It is useful in debugging to point out "Hey, turns out what you thought to be impossible is, in fact, not impossible." It looks like what you should be doing in the given example is expressing the function's failure, perhaps by returning -1 as that would not be a valid index. In some instances, it might be useful to set errno to clarify the exact nature of an error. Then, with this information, the calling function can decide how to handle such error.
Depending on how critical this error is to the rest of the application, you might try to recover from it, or you might just log the error and call exit to put it out of its misery.
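A minimal sketch of the "return -1" idea applied to the function from the question (the const qualifier is my addition):

```cpp
// Returns the index of the first zero, or -1 if there is none.
int findzero(int length, const int* array) {
    for (int i = 0; i < length; i++)
        if (array[i] == 0)
            return i;
    return -1; // not found: the caller decides how to react
}
```

Every path now ends in a return, so the missing-return warning goes away with or without NDEBUG.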

I believe the reason you are getting the warnings is that assertions are generally meant for debugging your own code. In release builds, exceptions should be used instead, with an exit via std::abort() to indicate abnormal program termination.
If you still want to use asserts, there is an answer by PSkocik about defining a custom one, as well as a link where someone proposes custom asserts and shows how to enable them in CMake.

One rule that is sometimes found in style-guides is
"Never return from the middle of a function"
All functions should have a single return, at the end of the function.
Following this rule, your code would look like:
int findzero( int length, int * array ) {
int i;
for( i = 0; i < length; i++ )
{
if( array[i] == 0 )
break; // Break the loop now that i is the correct value
}
assert(i < length); // Assert that a valid index was found.
return i; // Return the value found, or "length" if not found!
}

Related

c++ code behaves abnormally when the return value is missing

GCC 9.2.1 gives a warning that there is "no return statement in function returning non-void", however, the code does compile (I compiled with flags -O3 and -finline-functions).
I expected the program to have no output, as the condition for the while loop should evaluate to false. However, I got the following output from the program (printed in the while loop):
"it != mMap.end(): 0"
The output is particularly odd, given that the printed value (i.e., 0 or "false") is also the condition for the while loop.
After the print, the program segfaults because the iterator becomes invalid (via it++ in the while loop that should never have executed).
I guess this can all be chalked up to the missing return value. But, I find it surprising that the code behaves so pathologically simply because a return value isn't supplied. I'd appreciate any insight into understanding what's happening at a deeper level.
#include <iostream>
#include <tr1/unordered_map>
struct Test
{
int Dummy (void) const
{
std::tr1::unordered_map<int, int>::const_iterator it = mMap.begin();
while (it != mMap.end())
{
std::cout << "it != mMap.end(): " << (it != mMap.end()) << std::endl;
it++;
}
}
std::tr1::unordered_map<int, int> mMap;
};
int main (void)
{
Test test;
test.Dummy();
return 0;
}
Returning from a non-void function without supplying a return value is undefined behavior, and therefore the compiler can do whatever it wants.
What the compiler has done in this case in particular, seems to be that it has spotted that skipping the loop would trigger UB and therefore assumes the loop is entered at least once. Therefore it assumes that it can safely enter the loop without checking the condition as the compiler trusts that the author of the program did not invoke any undefined behavior.
This is of course speculation from my part.
You should turn on warnings using -Wextra and -Wall in gcc which will e.g. enable the warning Control reaches the end of a non-void function so you will hopefully not forget about fixing stuff like that. To enforce it, enable the compiler flag -Werror.
Not returning a value invokes undefined behavior so the compiler is allowed to either return a garbage value or crash your program (this is what clang typically does) if the function returns. This is bad and you should never keep broken code like that. For additional insight you can use a disassembler and see what your compiler did at the end of the function.
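For comparison, here is a sketch of the same structure with the return supplied. I swapped in std::unordered_map for the tr1 version and made Dummy return the number of entries visited; both choices are mine, since the question never says what Dummy was supposed to return.

```cpp
#include <unordered_map>

struct Test {
    // Counts the entries; every path through the function ends in a
    // return, so the compiler has no undefined behaviour to exploit.
    int Dummy() const {
        int visited = 0;
        for (auto it = mMap.begin(); it != mMap.end(); ++it)
            ++visited;
        return visited;
    }
    std::unordered_map<int, int> mMap;
};
```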

C/C++: goto into the for loop

I have a bit unusual situation - I want to use goto statement to jump into the loop, not to jump out from it.
There are strong reasons to do so - this code must be part of some function which makes some calculations after the first call, returns with request for new data and needs one more call to continue. Function pointers (obvious solution) can't be used because we need interoperability with code which does not support function pointers.
I want to know whether code below is safe, i.e. it will be correctly compiled by all standard-compliant C/C++ compilers (we need both C and C++).
function foo(int not_a_first_call, int *data_to_request, ...other parameters... )
{
if( not_a_first_call )
goto request_handler;
for(i=0; i<n; i++)
{
*data_to_request = i;
return;
request_handler:
...process data...
}
}
I've studied standards, but there isn't much information about such use case. I also wonder whether replacing for by equivalent while will be beneficial from the portability point of view.
Thanks in advance.
UPD: Thanks to all who've commented!
to all commenters :) yes, I understand that I can't jump over initializers of local variables and that I have to save/restore i on each call.
about strong reasons :) This code must implement a reverse communication interface. Reverse communication is a coding pattern which tries to avoid using function pointers. Sometimes it has to be used because of legacy code which expects that you will use it.
Unfortunately, r-comm-interface can't be implemented in a nice way. You can't use function pointers and you can't easily split work into several functions.
Seems perfectly legal.
From a draft of the C99 standard http://std.dkuug.dk/JTC1/SC22/WG14/www/docs/n843.htm in the section on the goto statement:
[#3] EXAMPLE 1 It is sometimes convenient to jump into the
middle of a complicated set of statements. The following
outline presents one possible approach to a problem based on
these three assumptions:
1. The general initialization code accesses objects only
visible to the current function.
2. The general initialization code is too large to
warrant duplication.
3. The code to determine the next operation is at the
head of the loop. (To allow it to be reached by
continue statements, for example.)
/* ... */
goto first_time;
for (;;) {
// determine next operation
/* ... */
if (need to reinitialize) {
// reinitialize-only code
/* ... */
first_time:
// general initialization code
/* ... */
continue;
}
// handle other operations
/* ... */
}
Next, we look at the for loop statement:
[#1] Except for the behavior of a continue statement in the
loop body, the statement
for ( clause-1 ; expr-2 ; expr-3 ) statement
and the sequence of statements
{
clause-1 ;
while ( expr-2 ) {
statement
expr-3 ;
}
}
Putting the two together with your problem tells you that you are jumping past
i=0;
into the middle of a while loop. You will execute
...process data...
and then
i++;
before flow of control jumps to the test in the while/for loop
i<n;
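A small runnable sketch of that control flow, written so it is also valid C++: the variables are declared before the goto, so no initialization is skipped. The function resume_sum and its parameters are invented for the demonstration.

```cpp
// Adds 100 + i for each full iteration. When resume is true, the first
// pass enters at the label, so it skips both the "i = 0" clause and the
// "+100" half of the body.
int resume_sum(bool resume, int start, int n) {
    int i = start;
    int total = 0;
    if (resume)
        goto body_tail;
    for (i = 0; i < n; i++) {
        total += 100;
    body_tail:
        total += i;
    }
    return total;
}
```

After the jump, execution continues with i++ and then the i < n test, exactly as the while rewrite predicts.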
Yes, that's legal.
What you're doing is nowhere near as ugly as e.g. Duff's Device, which also is standard-compliant.
As @Alexandre says, don't use goto to skip over variable declarations with non-trivial constructors.
I'm sure you're not expecting local variables to be preserved across calls, since automatic variable lifetime is so fundamental. If you need some state to be preserved, functors (function objects) would be a good choice (in C++). C++0x lambda syntax makes them even easier to build. In C you'll have no choice but to store state into some state block passed in by pointer by the caller.
First, I need to say that you should reconsider doing this some other way. I've rarely seen anyone use goto these days except for error management.
But if you really want to stick with it, there are a few things you'll need to keep in mind:
Jumping from outside the loop to the middle won't make your code loop. (check the comments below for more info)
Be careful and don't use variables that are set before the label, for instance, referring to *data_to_request. This includes i, which is set in the for statement and is not initialized when you jump to the label.
Personally, I think in this case I would rather duplicate the code for ...process data... than use goto. And if you pay close attention, you'll notice the return statement inside your for loop, meaning that the code after the label will never get executed unless there's a goto in the code to jump to it.
function foo(int not_a_first_call, int *data_to_request, ...other parameters... )
{
int i = 0;
if( not_a_first_call )
{
...process data...
*data_to_request = i;
return;
}
for (i=0; i<n; i++)
{
*data_to_request = i;
return;
}
}
No, you can't do this. I don't know what this will do exactly, but I do know that as soon as you return, your call stack is unwound and the variable i doesn't exist anymore.
I suggest refactoring. It looks like you're pretty much trying to build an iterator function similar to yield return in C#. Perhaps you could actually write a C++ iterator to do this?
It seems to me that you didn't declare i. Whether what you are doing is legal depends entirely on where i is declared; see below for the initialization.
In C you may declare it before the loop or as loop variable. But if it is declared as loop variable its value will not be initialized when you use it, so this is undefined behavior. And if you declare it before the for the assignment of 0 to it will not be performed.
In C++ you can't jump across the constructor of the variable, so you must declare it before the goto.
In both languages you have a more important problem, this is if the value of i is well defined, and if it is initialized if that value makes sense.
Really if there is any way to avoid this, don't do it. Or if this is really, really, performance critical check the assembler if it really does what you want.
If I understand correctly, you're trying to do something on the order of:
The first time foo is called, it needs to request some data from somewhere else, so it sets up that request and immediately returns;
On each subsequent call to foo, it processes the data from the previous request and sets up a new request;
This continues until foo has processed all the data.
I don't understand why you need the for loop at all in this case; you're only iterating through the loop once per call (if I understand the use case here). Unless i has been declared static, you lose its value each time through.
Why not define a type to maintain all the state (such as the current value of i) between function calls, and then define an interface around it to set/query whatever parameters you need:
typedef ... FooState;
void foo(FooState *state, ...)
{
if (FirstCall(state))
{
SetRequest(state, 1);
}
else if (!Done(state))
{
// process data;
SetRequest(state, GetRequest(state) + 1);
}
}
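Here is a runnable sketch of that idea. The fields of FooState, the bool return value and the "sum the data" stand-in for processing are all assumptions of mine; the point is only that the loop counter lives in the state object rather than in a local variable.

```cpp
struct FooState {
    int next = 0;         // index of the next item to request
    int count = 0;        // total number of items to request
    int sum = 0;          // stand-in for whatever "process data" does
    bool started = false; // false until the first call has happened
};

// Returns true if a new request was written to *request,
// false once all data has been processed.
bool foo(FooState& s, int* request, int data) {
    if (s.started) {
        s.sum += data;    // process the answer to the previous request
        ++s.next;
    }
    s.started = true;
    if (s.next >= s.count)
        return false;
    *request = s.next;
    return true;
}
```

The caller keeps calling foo with the answer to the last request until it returns false. No goto is needed, and the same shape works in C with a plain struct passed by pointer.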
The initialisation part of the for loop will not occur, which makes it somewhat redundant. You need to initialise i before the goto.
int i = 0 ;
if( not_a_first_call )
goto request_handler;
for( ; i<n; i++)
{
*data_to_request = i;
return;
request_handler:
...process data...
}
However, this is really not a good idea!
The code is flawed in any case: the return statement circumvents the loop. As it stands it is equivalent to:
int i = 0 ;
if( not_a_first_call )
{
    // ...process data...
    i++ ;
}
if( i < n )
{
*data_to_request = i;
}
In the end, if you think you need to do this then your design is flawed, and from the fragment posted your logic also.

Using "assert" with pointers in C++

When do we need to use "assert" for pointers in C++, and when they are used, how are they most commonly implemented?
Generally you would use an assert to check a condition that, if false, would indicate a bug in your application. So if a NULL pointer shouldn't ever be encountered at some point in the application, unless there's a bug, then assert it. If it might be encountered due to some invalid input then you need to do proper error handling.
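A short sketch of that split, with made-up function names: the internal helper asserts because a null pointer there can only mean a bug in the calling code, while the public entry point reports bad input through its return value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Internal helper: callers inside the module must pass a valid pointer,
// so a null here indicates a bug and is asserted.
static std::size_t internal_length(const char* s) {
    assert(s != nullptr && "internal_length: s must not be null");
    return std::strlen(s);
}

// Public entry point: null can arrive from outside, so it is handled
// as an ordinary error instead of being asserted.
bool try_length(const char* input, std::size_t* out) {
    if (input == nullptr || out == nullptr)
        return false;
    *out = internal_length(input);
    return true;
}
```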
You don't need to use assert on pointers at all. The idea is to ensure you don't crash when dereferencing your pointers when they're null.
You can do this with assert but it's not a very professional way to handle errors like this since it invariably terminates the program - not a good idea if the user hasn't, for example, saved their last three hours worth of data entry.
What you should do with pointers is to check them for null-ness and fail gracefully. In other words, have your function return an error of some sort or do nothing (not everyone will agree with this approach but it's perfectly acceptable if it's documented).
The assert stuff is meant, in my opinion, for catching problems during development which is why you'll find assert does nothing in release builds under some compilers. It is not a substitute for defensive programming.
As to how to do it:
#include <assert.h>
#include <iostream>
void doSomethingWithPointer (int *p) {
    assert (p != 0);
    std::cout << *p << std::endl;
}
but this would be better done as:
void doSomethingWithPointer (int *p) {
    if (p != 0)
        std::cout << *p << std::endl;
}
In other words, even if your "contract" (API) states that you're not allowed to receive null pointers, you should still handle them gracefully. An old quote: be conservative in what you give, liberal in what you accept (paraphrased).
ASSERT statements are great as "enforced documentation" - that is, they tell the reader something about the code ("This should never happen") and then enforces it by letting you know if they don't hold true.
If it's something that could happen (invalid input, memory not able to be allocated), that's not a time to use ASSERT. Asserts are only for things that can not possibly happen if everyone is obeying pre-conditions and such.
You can do it thusly:
ASSERT(pMyPointer);
From experience, if you assert on null conditions that should never happen under normal conditions, your program is in a really bad state. Recovering from such a null condition will more likely than not mask the original problem.
Unless you code with exception guarantees in mind, I say let it crash; then you know you have a problem.
I would use an ASSERT where a null pointer wouldn't immediately cause a crash but might lead to something going wrong later that's hard to spot.
eg:
ASSERT(p);
strcpy(p, "hello");
Is a little unnecessary, it simply replaces a fatal exception with a fatal assert!
But in more complex code, particularly with things like smart pointers, it might be useful to check that the pointer is what you think it is.
Remember that ASSERTs only run in debug builds; they disappear in release builds.
C also has an assert macro.
In debug mode, if the condition x in assert(x) is false, an alert pops up (or a message is printed) and the program aborts.
But remember that it only works in debug mode.
In release mode (with NDEBUG defined), all asserts are skipped.
Assertions are used to define how the program should function. That being said, the most common use of Assert()s when dealing with pointers is to check either that they are valid (non-NULL and pointing to valid memory) or, if they point to an object/class instance, that its internal state is valid.
Assertions are not for replacing or acting as error condition code, but instead to enforce rules that you are placing on the functioning of your code, such as what conditions should be at given points in time.
For example,
int f(int x, int * pY)
{
    // These are entrance conditions expected in the function. It would be
    // a BUG if any of them failed.
    Assert(x >= 0);
    Assert(pY != nullptr);
    Assert(*pY >= 0, "*pY should never be less than zero");
    // ...Do a bunch of computations with x and pY and return the result as z...
    int z = x * 2 / (x + 1) + pow(*pY, x); // Maybe z should be always positive
    // after these calculations:
    Assert(x >= 0, "x should never be negative after calculations");
    // Maybe *pY should always be non-zero after calculations
    Assert(*pY != 0, "y should never be zero after computation");
    Assert(z > 0);
    return z;
}
Many users of Asserts choose to apply Assertions to internal state validation once they become familiar with them. We call these Invariants() which are methods on a class that assert many things about the internals of the object that should always hold true.
For example:
class A
{
public:
A(wchar_t * wszName)
{
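// Keep the cached length in sync with the string so Invariant() holds.
_cch = static_cast<int>(wcslen(wszName));
pwszName = wszName;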
}
// Invariant method to be called at times to verify that the
// internal state is consistent. This means here that the
// internal variable tracking the length of the string is
// matching the actual length of the string.
void Invariant()
{
Assert(pwszName != nullptr);
Assert(_cch == wcslen(pwszName));
}
void ProcessABunchOfThings()
{
...
}
protected:
int _cch;
wchar_t * pwszName;
};
// Call to validate internal state of object is consistent/ok
A a(L"Test Object");
a.Invariant();
a.ProcessABunchOfThings();
a.Invariant();
The important thing to remember is that asserts make a bug show its effect as close as possible to the place in the code where it occurred, which makes debugging easier. I have used Asserts extensively in my own code and while at Microsoft, and I swear by them: they have saved me a great deal of debugging time and have even alerted me to defects I would not otherwise have known about.

Declaring and initializing a variable in a Conditional or Control statement in C++

In Stroustrup's The C++ Programming Language: Special Edition (3rd Ed), Stroustrup writes that the declaration and initialization of variables in the conditionals of control statements is not only allowed, but encouraged. He writes that he encourages it because it reduces the scope of the variables to only the scope that they are required for. So something like this...
if ((int i = read(socket)) < 0) {
// handle error
}
else if (i > 0) {
// handle input
}
else {
return true;
}
...is good programming style and practice. The variable i only exists for the block of if statements for which it is needed and then goes out of scope.
However, this feature of the programming language doesn't seem to be supported by g++ (version 4.3.3 Ubuntu specific compile), which is surprising to me. Perhaps I'm just calling g++ with a flag that turns it off (the flags I've called are -g and -Wall). My version of g++ returns the following compile error when compiling with those flags:
socket.cpp:130: error: expected primary-expression before ‘int’
socket.cpp:130: error: expected `)' before ‘int’
On further research I discovered that I didn't seem to be the only one with a compiler that doesn't support this. And there seemed to be some confusion in this question as to exactly what syntax was supposedly standard in the language and what compilers compile with it.
So the question is, what compilers support this feature and what flags need to be set for it to compile? Is it an issue of being in certain standards and not in others?
Also, just out of curiosity, do people generally agree with Stroustrup that this is good style? Or is this a situation where the creator of a language gets an idea in his head which is not necessarily supported by the language's community?
It is allowed to declare a variable in the control part of a nested block, but in the case of if and while, the variable must be initialized to a numeric or boolean value that will be interpreted as the condition. It cannot be included in a more complex expression!
In the particular case you show, it doesn't seem you can find a way to comply unfortunately.
I personally think it's good practice to keep the local variables as close as possible to their actual lifetime in the code, even if that sounds shocking when you switch from C to C++ or from Pascal to C++ - we were used to see all the variables at one place. With some habit, you find it more readable, and you don't have to look elsewhere to find the declaration. Moreover, you know that it is not used before that point.
Edit:
That being said, I don't find it good practice to mix too much into a single statement, and I think that's a shared opinion. If you assign a value to a variable and then use it in another expression, the code will be more readable and less confusing if you separate the two parts.
So rather than using this:
int i;
if((i = read(socket)) < 0) {
// handle error
}
else if(i > 0) {
// handle input
}
else {
return true;
}
I would prefer that:
int i = read(socket);
if(i < 0) {
// handle error
}
else if(i > 0) {
// handle input
}
else {
return true;
}
I consider it a good style when used with possibly NULL pointer:
if(CObj* p = GetOptionalValue()) {
//Do something with p
}
This way, wherever p is in scope, it is a valid pointer. There is no danger of accessing a null pointer.
On the other hand at least in VC++ it is the only use supported (i.e. checking whether assignment is true)
I use const as much as possible in these situations. Instead of your example, I would do:
const int readResult = read(socket);
if(readResult < 0) {
// handle error
}
else if(readResult > 0)
{
// handle input
}
else {
return true;
}
So although the scope isn't contained, it doesn't really matter, since the variable can't be altered.
This is fixed in C++17:
if (int i = read(socket); i < 0)
where if can have an initializer statement.
see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0305r0.html
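A compilable sketch of the C++17 form, testing the most common case first. fake_read is a stand-in I invented so the example runs without a real socket:

```cpp
#include <string>

// Hypothetical stand-in for read(socket): returns a canned result.
int fake_read(int canned_result) { return canned_result; }

std::string handle(int canned_result) {
    // n is scoped to the whole if / else-if / else chain and nothing more.
    if (int n = fake_read(canned_result); n > 0) {
        return "data";
    } else if (n < 0) {
        return "error";
    } else {
        return "closed";
    }
}
```

This needs a compiler in C++17 mode or later.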
I've run into a similar problem:
The problem seems to be the parentheses around the int declaration. It should work if you can express the assignment and test without them, i.e.
if (int i = read(socket)) {
should work, but that means the test is != 0, which is not what you want.
While you can use a declaration as a boolean expression, you cannot place a declaration in the middle of an expression. I cannot help thinking that you are mis-reading what Bjarne is saying.
The technique is useful and desirable mostly for control variables of for loops, but in this instance I believe is ill-advised and does not serve clarity. And of course it does not work! ;)
if( <type> <identifier> = <initialiser> ) // valid, but not that useful IMO
if( (<type> <identifier> = <initialiser>) <operator> <operand> ) // not valid
for( <type> <identifier> = <initialiser>;
<expression>;
<expression> ) // valid and desirable
In your example you have called a function with side effects in a conditional, which IMO is a bad idea regardless of what you might think about declaring the variable there.
Adding to what RedGlyph and Ferruccio said.
Maybe we can do the following to still declare the variable within a conditional statement and so limit its scope:
if(int x = read(socket)) //x != 0
{
if(x < 0) //handle error
{}
else //do work
{}
}
else //x == 0
{
return true;
}
To complement the other folks' good answers, you can always limit the scope of the variable by braces:
{
const int readResult = read(socket);
if(readResult < 0) {
// handle error
}
else if(readResult > 0)
{
// handle input
}
else {
return true;
}
}
While not directly related to the question, all of the examples put the error handling first. Since there are 3 cases (>0 -> data, ==0 -> connection closed and <0 -> error), that means that the most common case of getting new data requires two tests. Checking for >0 first would cut the expected number of tests by almost half. Unfortunately the "if(int x = read(socket))" approach given by White_Pawn still requires 2 tests for the case of data, but the C++17 proposal could be used to test for >0 first.

Inadvertent use of = instead of ==

It seems that
if (x=y) { .... }
instead of
if (x==y) { ... }
is a root of many evils.
Why don't all compilers mark it as error instead of a configurable warning?
I'm interested in finding out cases where the construct if (x=y) is useful.
One useful construct is for example:
char *pBuffer;
if (pBuffer = malloc(100))
{
// Continue to work here
}
As mentioned before, and downvoted several times now, I might add that this is not especially good style, but I have seen it often enough to say it's useful. I've also seen this with new, but that one pains me more.
Another example, and less controversial, might be:
while (pointer = getNextElement(context))
{
// Go for it. Use the pointer to the new segment of data.
}
which implies that the function getNextElement() returns NULL when there is no next element so that the loop is exited.
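A self-contained sketch of that loop over a simple linked list. Node, Context and this particular getNextElement are my own inventions. The extra parentheses and the explicit comparison keep compilers from warning about a suspected = versus == typo.

```cpp
struct Node {
    int value;
    Node* next;
};

struct Context {
    Node* cursor; // next node to hand out
};

// Hypothetical implementation: returns the current node and advances,
// or returns nullptr when the list is exhausted.
Node* getNextElement(Context& ctx) {
    Node* current = ctx.cursor;
    if (current != nullptr)
        ctx.cursor = current->next;
    return current;
}

int sum_all(Context ctx) {
    int total = 0;
    Node* pointer;
    while ((pointer = getNextElement(ctx)) != nullptr) {
        total += pointer->value;
    }
    return total;
}
```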
Most of the time, compilers try very hard to remain backward compatible.
Changing their behavior in this matter to throw errors will break existing legitimate code, and even starting to throw warnings about it will cause problems with automatic systems that keep track of code by automatically compiling it and checking for errors and warnings.
This is an evil we're pretty much stuck with at the moment, but there are ways to circumvent and reduce the dangers of it.
Example:
void *ptr = calloc(1, sizeof(array));
if (NULL = ptr) {
// Some error
}
This causes a compilation error.
Simple answer: An assignment operation, such as x=y, has a value, which is the same as the newly assigned value in x. You can use this directly in a comparison, so instead of
x = y; if (x) ...
you can write
if (x = y) ...
It is less code to write (and read), which is sometimes a good thing, but nowadays most people agree that it should be written in some other way to increase readability. For example, like this:
if ((x = y) != 0) ...
Here is a realistic example. Assume you want to allocate some memory with malloc, and see if it worked. It can be written step by step like this:
p = malloc(4711); if (p != NULL) printf("Ok!");
The comparison to NULL is redundant, so you can rewrite it like this:
p = malloc(4711); if (p) printf("Ok!");
But since the assignment operation has a value, which can be used, you could put the entire assignment in the if condition:
if (p = malloc(4711)) printf("Ok!");
This does the same thing, but it is more concise.
Because it's not illegal (in C or C++ anyway) and sometimes useful...
if ( (x = read(blah)) > 0)
{
// now you know how many bits/bytes/whatever were read
// and can use that info. Esp. if you know, say 30 bytes
// are coming but only got 10
}
Most compilers kick up a real stink if you don't put parenthesis around the assignment anyway, which I like.
About the valid uses of if(i = 0)
The problem is that you're taking the problem upside down. The "if" notation is not about comparing two values like in some other languages.
The C/C++ "if" instruction waits for any expression that will evaluate to either a boolean, or a null/non-null value. This expression can include two values comparison, and/or can be much more complex.
For example, you can have:
if(i >> 3)
{
    // For non-negative i, a non-zero result means i >= 8.
    std::cout << "i is 8 or greater" << std::endl;
}
Which proves that, in C/C++, the if expression is not limited to == and =. Anything will do, as long as it can be evaluated as true or false (C++), or zero non-zero (C/C++).
Another C++ valid use:
if(MyObject * pObject = dynamic_cast<MyObject *>(pInterface))
{
pObject->doSomething();
}
And these are simple uses of the if expression (note that this can be used, too, in the for loop declaration line). More complex uses do exist.
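Here is a complete version of the dynamic_cast pattern that compiles on its own. The Shape, Circle and Square types are invented for the example:

```cpp
struct Shape {
    virtual ~Shape() = default; // polymorphic base, required by dynamic_cast
};
struct Circle : Shape {
    double radius = 1.0;
};
struct Square : Shape {
    double side = 2.0;
};

double radius_or_zero(Shape* s) {
    // c exists only inside the if, and only when the cast succeeded.
    if (Circle* c = dynamic_cast<Circle*>(s)) {
        return c->radius;
    }
    return 0.0;
}
```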
About advanced uses of if(i = 0) in C++ (Quoted from myself)
After discovering a duplicate of this question at In which case is if(a=b) a good idea?, I decided to complete this answer with an additional bonus, that is, variable injection into a scope, which is possible in C++, because if will evaluate its expression, including a variable declaration, instead of limiting itself to compare two operands like it is done in other languages:
So, quoting from myself:
Another use would be to use what is called C++ variable injection. In Java, there is this cool keyword:
synchronized(p)
{
// Now, the Java code is synchronized using p as a mutex
}
In C++, you can do it, too. I don't have the exact code in mind (nor the exact Dr. Dobb's Journal's article where I discovered it), but this simple define should be enough for demonstration purposes:
// Assumes auto_lock is convertible to bool and always evaluates to true
#define synchronized(lock) \
if (auto_lock synchronized_guard = auto_lock(lock))
synchronized(p)
{
// Now, the C++ code is synchronized using p as a mutex
}
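For a version that compiles as written, here is a rough sketch built on std::mutex rather than the article's auto_lock. The names scoped_lock_true, SYNCHRONIZED, counter_mutex and increment are all made up for this example; this is not the Dr. Dobb's implementation.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical guard: locks on construction, unlocks on destruction, and
// converts to true so that it can be declared inside an if condition.
class scoped_lock_true {
public:
    explicit scoped_lock_true(std::mutex& m) : lock_(m) {}
    explicit operator bool() const { return true; }
private:
    std::unique_lock<std::mutex> lock_;
};

// The guard lives exactly as long as the statement that follows the macro.
#define SYNCHRONIZED(m) if (scoped_lock_true sync_guard_{m})

std::mutex counter_mutex;
int counter = 0;

int increment() {
    SYNCHRONIZED(counter_mutex) {
        ++counter;          // runs with counter_mutex held
        return counter;     // the guard unlocks when the scope is left
    }
    return -1;              // not reached: the guard always converts to true
}
```

Because the macro expands to a bare if, an else written after the block would attach to it, which is one reason the more robust implementations are more involved.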
In the same way, by mixing injection with an if and a for declaration, you can declare a primitive foreach macro (if you want an industrial-strength foreach, use Boost's).
See the following articles for a less naive, more complete and more robust implementation:
FOR_EACH and LOCK
Exception Safety Analysis
Concurrent Access Control & C++
How many errors of this kind really happen?
Rarely. In fact, I have yet to remember one, and I have been a professional for the past 8 years.
I guess it happened, but then, in 8 years, I did produce a sizeable quantity of bugs. It's just that this kind of bug did not happen often enough for me to remember it in frustration.
In C, you'll have more bugs because of buffer overruns, like:
void doSomething(char * p)
{
strcpy(p, "Hello, World! How are you \?\n");
}
void doSomethingElse()
{
char buffer[16];
doSomething(buffer);
}
In fact, Microsoft was burned so hard because of that they added a warning in Visual C++ 2005 deprecating strcpy!
How can you avoid most errors?
The very first "protection" against this error is to "turn around" the expression. As you can't assign a value to a constant, this won't compile:
if(0 = p) // ERROR: It should have been if(0 == p). IT WON'T COMPILE!
But I find this quite a poor solution, because it tries to hide behind a style what should be a general programming practice, that is: Any variable that is not supposed to change should be constant.
For example, instead of:
void doSomething(char * p)
{
if(p == NULL) // POSSIBLE TYPO ERROR
return;
size_t length = strlen(p);
if(length == 0) // POSSIBLE TYPO ERROR
printf("the string is empty\n");
else
printf("\"%s\" length is %zu\n", p, length); // %zu is the size_t format
}
Trying to "const" as many variables as possible will make you avoid most typo errors, including those not inside "if" expressions:
void doSomething(const char * const p) // CONST ADDED HERE
{
if(p == NULL) // NO TYPO POSSIBLE
return;
const size_t length = strlen(p); // CONST ADDED HERE
if(length == 0) // NO TYPO POSSIBLE
printf("the string is empty\n");
else
printf("\"%s\" length is %zu\n", p, length); // %zu is the size_t format
}
Of course, it is not always possible (as some variables do need to change), but I found that most of the variables I use are constants (I initialize them once, and then only read them).
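The effect of const can be checked at compile time. The following is a sketch using std::is_assignable from <type_traits>; is_empty is a made-up helper, not code from the answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Assigning to a non-const size_t is allowed, so `if (length = 0)` compiles.
static_assert(std::is_assignable<std::size_t&, int>::value,
              "non-const variable: a stray '=' compiles");
// Assigning to a const size_t is not, so the same typo is rejected.
static_assert(!std::is_assignable<const std::size_t&, int>::value,
              "const variable: a stray '=' is a compile error");

bool is_empty(const char* const p) {
    assert(p != nullptr);
    const std::size_t length = std::strlen(p);
    return length == 0;   // writing `length = 0` here would not compile
}
```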
Conclusion
Usually, I see code using the if(0 == p) notation, but without the const-notation.
To me, it's like having a trash can for recyclables, and another for non-recyclable, and then in the end, throw them together in the same container.
So, do not parrot an easy style habit hoping it will make your code a lot better. It won't. Use the language constructs as much as possible, which means, in this case, using both the if(0 == p) notation when available and the const keyword as much as possible.
The 'if(0 = x)' idiom is next to useless because it doesn't help when both sides are variables ('if(x = y)') and most (all?) of the time you should be using constant variables rather than magic numbers.
Two other reasons I never use this idiom: IMHO it makes code less readable, and to be honest I find the single '=' to be the root of very little evil. If you test your code thoroughly (which we all do, obviously) this sort of syntax error turns up very quickly.
Standard C idiom for iterating:
list_elem* curr;
while ( (curr = next_item(list)) != NULL ) {
/* ... */
}
Many compilers will detect this and warn you, but only if you set the warning level high enough.
For example:
~> gcc -c -Wall foo.c
foo.c: In function ‘foo’:
foo.c:5: warning: suggest parentheses around assignment used as truth value
Is this really such a common error? I learned about it when I learned C myself, and as a teacher I have occasionally warned my students and told them that it is a common error, but I have rarely seen it in real code, even from beginners. Certainly not more often than other operator mistakes, such as for example writing "&&" instead of "||".
So the reason that compilers don't mark it as an error (except for it being perfectly valid code) is perhaps that it isn't the root of very many evils.
The assignment as conditional is legal C and C++, and any compiler that doesn't permit it isn't a real C or C++ compiler. I would hope that any modern language not designed to be explicitly compatible with C (as C++ was) would consider it an error.
There are cases where this allows concise expressions, such as the idiomatic while (*dest++ = *src++); to copy a string in C, but overall it's not very useful, and I consider it a mistake in language design. It is, in my experience, easy to make this mistake, and hard to spot when the compiler doesn't issue a warning.
I think the C and C++ language designers noticed there is no real use in forbidding it because
Compilers can warn about it if they want anyway
Disallowing it would add special cases to the language, and would remove a possible feature.
There isn't complexity involved in allowing it. C++ just says that an expression implicitly convertible to bool is required. In C, there are useful cases detailed by other answers. In C++, they go one step further and allowed this one in addition:
if(type * t = get_pointer()) {
// ....
}
Which actually limits the scope of t to only the if and its bodies.
It depends on the language. Java flags it as an error because only Boolean expressions can be used inside the if parentheses (unless both variables are Boolean, in which case the assignment is itself a Boolean expression).
In C, it is a quite common idiom for testing pointers returned by malloc or if after a fork we are in the parent or child process:
if ( x = (X*) malloc( sizeof(X) ) ) {
// 'malloc' worked, pointer != 0
}
if ( pid = fork() ) {
// Parent process as pid != 0 (or fork failed and pid == -1)
}
C/C++ compilers will warn with a high enough warning level if you ask for it, but it cannot be considered an error as the language allows it. Unless, then again, you ask the compiler to treat warnings as errors.
Whenever comparing with constants, some authors suggest using the test constant == variable so that the compiler will detect if the user forgets the second equality sign.
if ( 0 == variable ) {
// The compiler will complain if you mistakenly
// write =, as you cannot assign to a constant
}
Anyway, you should try to compile with the highest possible warning settings.
Try viewing
if( life_is_good() )
enjoy_yourself();
as
if( tmp = life_is_good() )
enjoy_yourself();
Part of it has to do with personal style and habits. I am agnostic to reading either if (kConst == x) or if (x == kConst). I don't use the constant on the left because historically I don't make that error and I write code as I would say it or would like to read it. I see this as a personal decision as part of a personal responsibility to being a self-aware, improving engineer. For example, I started analyzing the types of bugs that I was creating and started to re-engineer my habits so as not to make them - similar to constant on the left, just with other things.
That said, compiler warnings, historically, are pretty crappy and even though this problem has been well known for years, I didn't see it in a production compiler until the late 80's. I also found that working on projects that were portable helped clean up my C a great deal, as different compilers have different tastes (i.e., warnings) and subtly different semantics.
I, personally, consider this the most useful example.
Say that you have a function read() that returns the number of bytes read, and you need to use this in a loop. It's a lot simpler to use
while((count = read(foo)) > 0) {
//Do stuff
}
than to try and get the assignment out of the loop head, which would result in things like
while(1) {
count = read(foo);
if(!(count > 0))
break;
//...
}
or
count = read(foo);
while(count > 0) {
//...
count = read(foo);
}
The first construct feels awkward, and the second repeats code in an unpleasant way.
Unless, of course, I've missed some brilliant idiom for this...
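The assignment-in-loop-head form can be sketched as follows. Source, read_chunk and drain are hypothetical stand-ins for a real read() call; they only exist so that the loop can run on an in-memory buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a readable stream.
struct Source {
    const char* data;
    std::size_t size;
    std::size_t pos;
};

// Copies up to cap bytes into out and returns how many were copied
// (0 once the source is exhausted).
std::size_t read_chunk(Source& src, char* out, std::size_t cap) {
    assert(src.pos <= src.size);
    const std::size_t left = src.size - src.pos;
    const std::size_t n = left < cap ? left : cap;
    std::memcpy(out, src.data + src.pos, n);
    src.pos += n;
    return n;
}

// Consumes the whole source using the assign-and-test loop head.
std::size_t drain(Source& src) {
    char buf[4];
    std::size_t count;
    std::size_t total = 0;
    while ((count = read_chunk(src, buf, sizeof buf)) > 0) {
        total += count;   // "do stuff" with the count bytes now in buf
    }
    return total;
}
```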
There are a lot of great uses of the assignment operator in a conditional statement, and it'd be a royal pain in the ass to see warnings about each one all the time. What would be nice would be a function in your IDE that let you highlight all the places where assignment has been used instead of an equality check - or - after you write something like this:
if (x = y) {
then that line blinks a couple of times. Enough to let you know that you've done something not exactly standard, but not so much that it's annoying.
if ((k==1) || (k==2)) is a conditional
if ((k=1) || (k=2) ) is BOTH a conditional AND an assignment statement
Here's the explanation
C evaluates the operands of || from left to right and stops as soon as the result is known (short-circuit evaluation).
First, it sets k to 1, and succeeds. The value of the assignment is 1, which is non-zero.
Result: k = 1 and Boolean = 'true'
Next: because the left operand is already true, (k=2) is never evaluated.
Result: k still = 1, and Boolean = 'true'
Finally, it then resolves the conditional: If (true)
Result: k = 1 and the program takes the first branch.
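Since || stops evaluating as soon as its left operand is non-zero, the final value of k can be checked directly. A minimal sketch (the function name k_after_condition is made up for illustration):

```cpp
#include <cassert>

// Returns the value k holds after evaluating (k = 1) || (k = 2).
int k_after_condition() {
    int k = 0;
    bool taken = false;
    if ((k = 1) || (k = 2)) {   // (k = 2) is skipped: the left side is already true
        taken = true;
    }
    assert(taken);              // the first branch is always taken
    return k;
}
```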
In nearly 30 years of programming I have not seen a valid reason for using this construct, though if one exists it probably has to do with a need to deliberately obfuscate your code.
When one of our new people has a problem, this is one of the things I look for, right along with not sticking a terminator on a string, and copying a debug statement from one place to another and not changing the '%i' to '%s' to match the new field they are dumping.
This is fairly common in our shop because we constantly switch between C and Oracle PL/SQL; if( k = 1) is the correct syntax in PL/SQL.
It is very common with "low level" loop constructs in C/C++, such as with copies:
void my_strcpy(char *dst, const char *src)
{
while((*dst++ = *src++) != '\0') { // Note the use of extra parentheses, and the explicit compare.
/* DO NOTHING */
}
}
Of course, assignments are very common with for loops:
int i;
for(i = 0; i < 42; ++i) {
printf("%d\n", i);
}
I do believe it is easier to read assignments when they are outside of if statements:
char *newstring = malloc((strlen(src) + 1) * sizeof(char)); // + 1 for the terminating '\0'
if(newstring == NULL) {
fprintf(stderr, "Out of memory, d00d! Bailing!\n");
exit(2);
}
// Versus:
if((newstring = malloc((strlen(src) + 1) * sizeof(char))) == NULL) // ew...
Make sure the assignment is obvious, though (as with the first two examples). Don't hide it.
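Keeping the assignment on its own line also leaves room to get the size right. Here is a minimal sketch, assuming a hypothetical helper named duplicate; note the extra byte for the terminating '\0'.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns a malloc'ed copy of src, or nullptr on failure.
// The caller releases the result with std::free.
char* duplicate(const char* src) {
    assert(src != nullptr);
    const std::size_t size = std::strlen(src) + 1;   // + 1 for the '\0'
    char* copy = static_cast<char*>(std::malloc(size));
    if (copy == nullptr) {                           // test on its own line
        return nullptr;
    }
    std::memcpy(copy, src, size);                    // copies the '\0' too
    return copy;
}
```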
As for accidental uses ... that doesn't happen to me much. A common safeguard is to put your variables (lvalues) on the right-hand side of the comparison, but that doesn't work well with things like:
if(*src == *dst)
because both operands to == are lvalues!
As for compilers ... who can blame 'em? Writing compilers is difficult, and you should be writing perfect programs for the compiler anyway (remember GIGO?). Some compilers (the most well-known for sure) provide built-in lint-style checking, but that certainly isn't required. Some browsers don't validate every byte of HTML and JavaScript they're thrown, so why would compilers?
There are several tactics to help spot this .. one is ugly, the other is typically a macro. It really depends on how you read your spoken language (left to right, right to left).
For instance:
if ((fp = fopen("foo.txt", "r")) == NULL)
Vs:
if (NULL == (fp = fopen(...)))
Sometimes it can be easier to read/write (first) what you're testing for, which makes it easier to spot an assignment vs a test. Then bring in most comp.lang.c folks that hate this style with a passion.
So, we bring in assert():
#include <assert.h>
...
fp = fopen("foo.txt", "r");
assert(fp != NULL);
...
When you're in the midst, or at the end, of a convoluted set of conditionals, assert() is your friend. In this case, if fp == NULL, abort() is called and the file and line of the offending code are reported.
So if you oops:
if (i = foo)
instead of
if (i == foo)
followed by
assert (i > foo + 1)
... you'll quickly spot such mistakes.
Hope this helps :)
In short, reversing arguments sometimes helps when debugging .. assert() is your lifelong friend and can be turned off with compiler flags (such as -DNDEBUG) in production releases.
As pointed out in other answers, there are cases where using assignment within a condition offers a brief-but-readable piece of code that does what you want. Also, a lot of up-to-date compilers will warn you if they see an assignment where they expect a condition. (If you're a fan of the zero-warnings approach to development, you'll have seen these.)
One habit I've developed that keeps me from getting bitten by this (at least in C-ish languages) is that if one of the two values I'm comparing is a constant (or otherwise not a legal lvalue), I put it on the left-hand side of the comparator: if (5 == x) { whatever(); } Then, if I should accidentally type if (5 = x), the code won't compile.
You asked why it was useful, but keep questioning examples people are providing. It's useful because it's concise.
Yes, all the examples which use it can be rewritten - as longer pieces of code.
I have only had this typo once in my 15 years of development. I would not say it is on the top of my list of things to look out for. I also avoid that construct anyway.
Note also that some compilers (the one I use) issue a warning on that code. Warnings can be treated as errors for any compiler worth its salt. They can also be ignored.
Placing the constant on the left side of a comparison is defensive programming. Sure you would never make the silly mistake of forgetting that extra '=', but who knows about the other guy.
The D programming language does flag this as an error. To avoid the problem with wanting to use the value later, it allows declarations sort of like C++ allows with for loops.
if(int i = some_fn())
{
another_fn(i);
}
The compiler won't flag it as an error because it is valid C/C++. But what you can do (at least with Visual C++) is turn up the warning level so that it flags it as a warning and then tell the compiler to treat warnings as errors. This is a good practice anyway so that developers don't ignore warnings.
If you had actually meant = instead of == then you need to be more explicit about it. E.g.,
if ((x = y) != 0)
Theoretically, you're supposed to be able to do this:
if ((x = y))
to override the warning, but that doesn't seem to always work.
In practice I don't do it, but a good tip is to do:
if ( true == $x )
In the case that you leave out an equals sign, assigning $x to true will obviously produce an error.
RegEx sample
RegEx r;
if((r = new RegEx("\w*")).IsMatch()) {
// ... do something here
}
else if((r = new RegEx("\d*")).IsMatch()) {
// ... do something here
}
Assign a value test
int i = 0;
if((i = 1) == 1) {
// i was assigned the int value 1, so it is equal to 1
}
else {
// ?
}
That's why it's better to write:
0 == CurrentItem
Instead of:
CurrentItem == 0
so that the compiler rejects the code if you type = instead of ==.