What's the point of if(true)? c++ - c++

I've seen if(true) used a bunch of times.
int a = 10;
if(true){
int b = 20;
}
int c = 15;
I don't understand what the point of putting if(true) there. Does it always evaluate to true, meaning it always executes? It's not part of a function. It's just there. Does it have to do with memory allocation?

This is equivalent to :
{
int b = 20;
}
maybe someone was using if (false) then switched to if (true). if(false) makes actually sense because you are removing some code - it should not get into compiled exe, but it gets compiled by compiler - and checked for any errors.

If one is fiddling with the code, it's very easy to turn
if (true) {
// block of code
}
into
if (false) {
// block of code
}
so this is a useful construct if you often need to turn a block of code on/off. It could also be a placeholder for future changes, where the boolean value is replaced with a (template) parameter or global constant or somesuch. (or a holdover from a former change that did the reverse)

Related

Debug complex return statement

I am using totalview as a linux C++ debugger. Functions in our code often look something like this
double foo() {
int a = 2;
int b = 3;
return bar(a,b);
}
where some preliminary work is done and than a more or less complex function bar is called a the return statement.
This is hard to debug with totalview, since the interesting return value can not be observed easily. Totalview can not evaluate the expressing bar(a,b). I can rewrite the code as
double foo() {
int a = 2;
int b = 3;
const auto retVal = bar(a,b);
return retVal;
}
Now, I can place a breakpoint at the return value and observe the in- and output of my function bar.
How can I do this without introducing a new (useless) variable?
Let the compiler optimise out the "useless" variable, by a process called named return value optimisation, and leave it in. (Personally though I would help the compiler as much as possible by using the return type of the function explicitly rather than auto; so there's no potential type conversion at the return stage.). For what it's worth, I do this all the time, even with heavy objects such as std::vector. You can always check the generated assembler if you suspect the compiler is making superfluous copies.
Then you can set a breakpoint at the appropriate place as you know.
In some debuggers you can inspect the function return value directly by peeking at a register, but this is by no means universal.
Reference: http://en.cppreference.com/w/cpp/language/copy_elision

Is it problematic to declare variables in the main loop of a OpenGL program? [duplicate]

Question #1: Is declaring a variable inside a loop a good practice or bad practice?
I've read the other threads about whether or not there is a performance issue (most said no), and that you should always declare variables as close to where they are going to be used. What I'm wondering is whether or not this should be avoided or if it's actually preferred.
Example:
for(int counter = 0; counter <= 10; counter++)
{
string someString = "testing";
cout << someString;
}
Question #2: Do most compilers realize that the variable has already been declared and just skip that portion, or does it actually create a spot for it in memory each time?
This is excellent practice.
By creating variables inside loops, you ensure their scope is restricted to inside the loop. It cannot be referenced nor called outside of the loop.
This way:
If the name of the variable is a bit "generic" (like "i"), there is no risk to mix it with another variable of same name somewhere later in your code (can also be mitigated using the -Wshadow warning instruction on GCC)
The compiler knows that the variable scope is limited to inside the loop, and therefore will issue a proper error message if the variable is by mistake referenced elsewhere.
Last but not least, some dedicated optimization can be performed more efficiently by the compiler (most importantly register allocation), since it knows that the variable cannot be used outside of the loop. For example, no need to store the result for later re-use.
In short, you are right to do it.
Note however that the variable is not supposed to retain its value between each loop. In such case, you may need to initialize it every time. You can also create a larger block, encompassing the loop, whose sole purpose is to declare variables which must retain their value from one loop to another. This typically includes the loop counter itself.
{
int i, retainValue;
for (i=0; i<N; i++)
{
int tmpValue;
/* tmpValue is uninitialized */
/* retainValue still has its previous value from previous loop */
/* Do some stuff here */
}
/* Here, retainValue is still valid; tmpValue no longer */
}
For question #2:
The variable is allocated once, when the function is called. In fact, from an allocation perspective, it is (nearly) the same as declaring the variable at the beginning of the function. The only difference is the scope: the variable cannot be used outside of the loop. It may even be possible that the variable is not allocated, just re-using some free slot (from other variable whose scope has ended).
With restricted and more precise scope come more accurate optimizations. But more importantly, it makes your code safer, with less states (i.e. variables) to worry about when reading other parts of the code.
This is true even outside of an if(){...} block. Typically, instead of :
int result;
(...)
result = f1();
if (result) then { (...) }
(...)
result = f2();
if (result) then { (...) }
it's safer to write :
(...)
{
int const result = f1();
if (result) then { (...) }
}
(...)
{
int const result = f2();
if (result) then { (...) }
}
The difference may seem minor, especially on such a small example.
But on a larger code base, it will help : now there is no risk to transport some result value from f1() to f2() block. Each result is strictly limited to its own scope, making its role more accurate. From a reviewer perspective, it's much nicer, since he has less long range state variables to worry about and track.
Even the compiler will help better : assuming that, in the future, after some erroneous change of code, result is not properly initialized with f2(). The second version will simply refuse to work, stating a clear error message at compile time (way better than run time). The first version will not spot anything, the result of f1() will simply be tested a second time, being confused for the result of f2().
Complementary information
The open-source tool CppCheck (a static analysis tool for C/C++ code) provides some excellent hints regarding optimal scope of variables.
In response to comment on allocation:
The above rule is true in C, but might not be for some C++ classes.
For standard types and structures, the size of variable is known at compilation time. There is no such thing as "construction" in C, so the space for the variable will simply be allocated into the stack (without any initialization), when the function is called. That's why there is a "zero" cost when declaring the variable inside a loop.
However, for C++ classes, there is this constructor thing which I know much less about. I guess allocation is probably not going to be the issue, since the compiler shall be clever enough to reuse the same space, but the initialization is likely to take place at each loop iteration.
Generally, it's a very good practice to keep it very close.
In some cases, there will be a consideration such as performance which justifies pulling the variable out of the loop.
In your example, the program creates and destroys the string each time. Some libraries use a small string optimization (SSO), so the dynamic allocation could be avoided in some cases.
Suppose you wanted to avoid those redundant creations/allocations, you would write it as:
for (int counter = 0; counter <= 10; counter++) {
// compiler can pull this out
const char testing[] = "testing";
cout << testing;
}
or you can pull the constant out:
const std::string testing = "testing";
for (int counter = 0; counter <= 10; counter++) {
cout << testing;
}
Do most compilers realize that the variable has already been declared and just skip that portion, or does it actually create a spot for it in memory each time?
It can reuse the space the variable consumes, and it can pull invariants out of your loop. In the case of the const char array (above) - that array could be pulled out. However, the constructor and destructor must be executed at each iteration in the case of an object (such as std::string). In the case of the std::string, that 'space' includes a pointer which contains the dynamic allocation representing the characters. So this:
for (int counter = 0; counter <= 10; counter++) {
string testing = "testing";
cout << testing;
}
would require redundant copying in each case, and dynamic allocation and free if the variable sits above the threshold for SSO character count (and SSO is implemented by your std library).
Doing this:
string testing;
for (int counter = 0; counter <= 10; counter++) {
testing = "testing";
cout << testing;
}
would still require a physical copy of the characters at each iteration, but the form could result in one dynamic allocation because you assign the string and the implementation should see there is no need to resize the string's backing allocation. Of course, you wouldn't do that in this example (because multiple superior alternatives have already been demonstrated), but you might consider it when the string or vector's content varies.
So what do you do with all those options (and more)? Keep it very close as a default -- until you understand the costs well and know when you should deviate.
I didn't post to answer JeremyRR's questions (as they have already been answered); instead, I posted merely to give a suggestion.
To JeremyRR, you could do this:
{
string someString = "testing";
for(int counter = 0; counter <= 10; counter++)
{
cout << someString;
}
// The variable is in scope.
}
// The variable is no longer in scope.
I don't know if you realize (I didn't when I first started programming), that brackets (as long they are in pairs) can be placed anywhere within the code, not just after "if", "for", "while", etc.
My code compiled in Microsoft Visual C++ 2010 Express, so I know it works; also, I have tried to to use the variable outside of the brackets that it was defined in and I received an error, so I know that the variable was "destroyed".
I don't know if it is bad practice to use this method, as a lot of unlabeled brackets could quickly make the code unreadable, but maybe some comments could clear things up.
For C++ it depends on what you are doing.
OK, it is stupid code but imagine
class myTimeEatingClass
{
public:
//constructor
myTimeEatingClass()
{
sleep(2000);
ms_usedTime+=2;
}
~myTimeEatingClass()
{
sleep(3000);
ms_usedTime+=3;
}
const unsigned int getTime() const
{
return ms_usedTime;
}
static unsigned int ms_usedTime;
};
myTimeEatingClass::ms_CreationTime=0;
myFunc()
{
for (int counter = 0; counter <= 10; counter++) {
myTimeEatingClass timeEater();
//do something
}
cout << "Creating class took " << timeEater.getTime() << "seconds at all" << endl;
}
myOtherFunc()
{
myTimeEatingClass timeEater();
for (int counter = 0; counter <= 10; counter++) {
//do something
}
cout << "Creating class took " << timeEater.getTime() << "seconds at all" << endl;
}
You will wait 55 seconds until you get the output of myFunc.
Just because each loop constructor and destructor together need 5 seconds to finish.
You will need 5 seconds until you get the output of myOtherFunc.
Of course, this is a crazy example.
But it illustrates that it might become a performance issue when each loop the same construction is done when the constructor and / or destructor needs some time.
Since your second question is more concrete, I'm going to address it first, and then take up your first question with the context given by the second. I wanted to give a more evidence-based answer than what's here already.
Question #2: Do most compilers realize that the variable has already
been declared and just skip that portion, or does it actually create a
spot for it in memory each time?
You can answer this question for yourself by stopping your compiler before the assembler is run and looking at the asm. (Use the -S flag if your compiler has a gcc-style interface, and -masm=intel if you want the syntax style I'm using here.)
In any case, with modern compilers (gcc 10.2, clang 11.0) for x86-64, they only reload the variable on each loop pass if you disable optimizations. Consider the following C++ program—for intuitive mapping to asm, I'm keeping things mostly C-style and using an integer instead of a string, although the same principles apply in the string case:
#include <iostream>
static constexpr std::size_t LEN = 10;
void fill_arr(int a[LEN])
{
/* *** */
for (std::size_t i = 0; i < LEN; ++i) {
const int t = 8;
a[i] = t;
}
/* *** */
}
int main(void)
{
int a[LEN];
fill_arr(a);
for (std::size_t i = 0; i < LEN; ++i) {
std::cout << a[i] << " ";
}
std::cout << "\n";
return 0;
}
We can compare this to a version with the following difference:
/* *** */
const int t = 8;
for (std::size_t i = 0; i < LEN; ++i) {
a[i] = t;
}
/* *** */
With optimization disabled, gcc 10.2 puts 8 on the stack on every pass of the loop for the declaration-in-loop version:
mov QWORD PTR -8[rbp], 0
.L3:
cmp QWORD PTR -8[rbp], 9
ja .L4
mov DWORD PTR -12[rbp], 8 ;✷
whereas it only does it once for the out-of-loop version:
mov DWORD PTR -12[rbp], 8 ;✷
mov QWORD PTR -8[rbp], 0
.L3:
cmp QWORD PTR -8[rbp], 9
ja .L4
Does this make a performance impact? I didn't see an appreciable difference in runtime between them with my CPU (Intel i7-7700K) until I pushed the number of iterations into the billions, and even then the average difference was less than 0.01s. It's only a single extra operation in the loop, after all. (For a string, the difference in in-loop operations is obviously a bit greater, but not dramatically so.)
What's more, the question is largely academic, because with an optimization level of -O1 or higher gcc outputs identical asm for both source files, as does clang. So, at least for simple cases like this, it's unlikely to make any performance impact either way. Of course, in a real-world program, you should always profile rather than make assumptions.
Question #1: Is declaring a variable inside a loop a good practice or
bad practice?
As with practically every question like this, it depends. If the declaration is inside a very tight loop and you're compiling without optimizations, say for debugging purposes, it's theoretically possible that moving it outside the loop would improve performance enough to be handy during your debugging efforts. If so, it might be sensible, at least while you're debugging. And although I don't think it's likely to make any difference in an optimized build, if you do observe one, you/your pair/your team can make a judgement call as to whether it's worth it.
At the same time, you have to consider not only how the compiler reads your code, but also how it comes off to humans, yourself included. I think you'll agree that a variable declared in the smallest scope possible is easier to keep track of. If it's outside the loop, it implies that it's needed outside the loop, which is confusing if that's not actually the case. In a big codebase, little confusions like this add up over time and become fatiguing after hours of work, and can lead to silly bugs. That can be much more costly than what you reap from a slight performance improvement, depending on the use case.
Once upon a time (pre C++98); the following would break:
{
for (int i=0; i<.; ++i) {std::string foo;}
for (int i=0; i<.; ++i) {std::string foo;}
}
with the warning that i was already declared (foo was fine as that's scoped within the {}). This is likely the WHY people would first argue it's bad. It stopped being true a long time ago though.
If you STILL have to support such an old compiler (some people are on Borland) then the answer is yes, a case could be made to put the i out the loop, because not doing so makes it makes it "harder" for people to put multiple loops in with the same variable, though honestly the compiler will still fail, which is all you want if there's going to be a problem.
If you no longer have to support such an old compiler, variables should be kept to the smallest scope you can get them so that you not only minimise the memory usage; but also make understanding the project easier. It's a bit like asking why don't you have all your variables global. Same argument applies, but the scopes just change a bit.
It's a very good practice, as all above answer provide very good theoretical aspect of the question let me give a glimpse of code, i was trying to solve DFS over GEEKSFORGEEKS, i encounter the optimization problem......
If you try to solve the code declaring the integer outside the loop will give you Optimization Error..
stack<int> st;
st.push(s);
cout<<s<<" ";
vis[s]=1;
int flag=0;
int top=0;
while(!st.empty()){
top = st.top();
for(int i=0;i<g[top].size();i++){
if(vis[g[top][i]] != 1){
st.push(g[top][i]);
cout<<g[top][i]<<" ";
vis[g[top][i]]=1;
flag=1;
break;
}
}
if(!flag){
st.pop();
}
}
Now put integers inside the loop this will give you correct answer...
stack<int> st;
st.push(s);
cout<<s<<" ";
vis[s]=1;
// int flag=0;
// int top=0;
while(!st.empty()){
int top = st.top();
int flag = 0;
for(int i=0;i<g[top].size();i++){
if(vis[g[top][i]] != 1){
st.push(g[top][i]);
cout<<g[top][i]<<" ";
vis[g[top][i]]=1;
flag=1;
break;
}
}
if(!flag){
st.pop();
}
}
this completely reflect what sir #justin was saying in 2nd comment....
try this here
https://practice.geeksforgeeks.org/problems/depth-first-traversal-for-a-graph/1. just give it a shot.... you will get it.Hope this help.
Chapter 4.8 Block Structure in K&R's The C Programming Language 2.Ed.:
An automatic variable declared and initialized in a
block is initialized each time the block is entered.
I might have missed seeing the relevant description in the book like:
An automatic variable declared and initialized in a
block is allocated only one time before the block is entered.
But a simple test can prove the assumption held:
#include <stdio.h>
int main(int argc, char *argv[]) {
for (int i = 0; i < 2; i++) {
for (int j = 0; j < 2; j++) {
int k;
printf("%p\n", &k);
}
}
return 0;
}
The two snippets below generate the same assembly.
// snippet 1
void test() {
int var;
while(1) var = 4;
}
// snippet 2
void test() {
while(1) int var = 4;
}
output:
test():
push rbp
mov rbp, rsp
.L2:
mov DWORD PTR [rbp-4], 4
jmp .L2
Link: https://godbolt.org/z/36hsM6Pen
So, until profiling opposes or computation extensive constructor is involved, keeping declation close to its usage should be the default approach.

Optimize non-cost variable access

There's an interesting optimization problem I'm facing.
In a large code base, consisting of a large number of classes, in many places the value of a non-constant global (=file scope) variable is very often used/examined and the unnecessary memory accesses of this variable are to be avoided.
This variable is initialized once, but because of the complexity of its initialization and the need to call a number of functions, it cannot be initialized like this, before execution of main():
unsigned size = 1000;
int main()
{
// some code
}
or
unsigned size = CalculateSize();
int main()
{
// some code
}
Instead it has to be initialized like this:
unsigned size;
int main()
{
// some code
size = CalculateSize();
// lots of code (statically/dynamically created class objects, whatnot)
// that makes use of "size"
return 0;
}
Just because size isn't a constant and it is global (=file scope) and the code is large and complex, the compiler is unable to infer that size never changes after size = CalculateSize();. The compiler generates code that fetches and refetches the value of size from the variable and can't "cache" it in a register or in a local (on-stack) variable that's likely to be in the CPU's d-cache together with other frequently accessed local variables.
So, if I have something like the following (a made-up example for illustrative purposes):
size = CalculateSize();
if (size > 200) blah1();
blah2();
if (size > 200) blah3();
The compiler thinks that blah1() and blah2() may change size and it generates a memory read from size in if (size > 200) blah3();.
I'd like to avoid that extra read whenever and wherever possible.
Obviously, hacks like this:
const unsigned size = 0;
int main()
{
// some code
*(unsigned*)&size = CalculateSize();
// lots more code
}
won't do as they invoke undefined behavior.
The question is how to inform the compiler that it can "cache" the value of size once size = CalculateSize(); has been performed and do it without invoking undefined behavior, unspecified behavior and, hopefully, implementation-specific behavior.
This is needed for C++03 and g++ (4.x.x). C++11 may or may not be an option, I'm not sure, I'm trying to avoid using advanced/modern C++ features to stay within the coding guidelines and predefined toolset.
So far I've only come up with a hack to create a constant copy of size within every class that's using it and use the copy, something like this (decltype makes it C++11, but we can do without decltype):
#include <iostream>
using namespace std;
volatile unsigned initValue = 255;
unsigned size;
#define CACHE_VAL(name) \
const struct CachedVal ## name \
{ \
CachedVal ## name() { this->val = ::name; } \
decltype(::name) val; \
} _CachedVal ## name;
#define CACHED(name) \
_CachedVal ## name . val
class C
{
public:
C() { cout << CACHED(size) << endl; }
CACHE_VAL(size);
};
int main()
{
size = initValue;
C c;
return 0;
}
The above may only help up to a point. Are there better and more suggestive-to-the-compiler alternatives that are legal C++? Hoping for a minimally intrusive (source-code-wise) solution.
UPDATE: To make it a bit more clear, this is in a performance-sensitive application. It's not that I'm trying to get rid of unnecessary reads of that particular variable out of whim. I'm trying to let/make the compiler produce more optimal code. Any solution that involves reading/writing another variable as often as size and any additional code in the solution (especially with branching and conditional branching) executed as often as size is referred to is also going to affect the performance. I don't want to win in one place only to lose the same or even more in another place.
Here's a related non-solution, causing UB (at least in C).
There's the register keyword in C++ which tells the compiler you plan on using a variable a lot. Don't know about the compiler you're using, but most of the modern compilers do that for the users, adding a variable into the registry if needed. You can also declare the variable as constant and initialize it using const_cast.
what of:
const unsigned getSize( void )
{
static const unsigned size = calculateSize();
return size;
}
This will delay the initialization of size until the first call to getSize(), but still keep it const.
GCC 4.8.2
#include <iostream>
unsigned calculate() {
std::cout<<"calculate()\n";
return 42;
}
const unsigned mySize() {
std::cout<<"mySize()\n";
static const unsigned someSize = calculate();
return someSize;
}
int main() {
std::cout<<"main()\n";
mySize();
}
prints:
main()
mySize()
calculate()
on GCC 4.8.0
Checking for whether it has been initialized already or not will be almost fully mitigated by the branch predictor. You will end up having one false and a quadrillion trues afterwards.
Yes, you will still have to access that state after the pipeline has been basically built, potentially wreaking havoc in the caches, but you can't be sure unless you profile.
Also, compiler can likely do some extra magic for you (and it is what you're looking for), so I suggest you first compile and profile with this approach before discarding it entirely.

Optimizer bug or programming error?

First of all: I know that most optimization bugs are due to programming errors or relying on facts which may change depending on optimization settings (floating point values, multithreading issues, ...).
However I experienced a very hard to find bug and am somewhat unsure if there is any way to prevent these kind of errors from happening without turning the optimization off. Am I missing something? Could this really be an optimizer bug? Here's a simplified example:
struct Data {
int a;
int b;
double c;
};
struct Test {
void optimizeMe();
Data m_data;
};
void Test::optimizeMe() {
Data * pData; // Note that this pointer is not initialized!
bool first = true;
for (int i = 0; i < 3; ++i) {
if (first) {
first = false;
pData = &m_data;
pData->a = i * 10;
pData->b = i * pData->a;
pData->c = pData->b / 2;
} else {
pData->a = ++i;
} // end if
} // end for
};
int main(int argc, char *argv[]) {
Test test;
test.optimizeMe();
return 0;
}
The real program of course has a lot more to do than this. But it all boils down to the fact that instead of accessing m_data directly, a (previously unitialized) pointer is being used. As soon as I add enough statements to the if (first)-part, the optimizer seems to change the code to something along these lines:
if (first) {
first = false;
// pData-assignment has been removed!
m_data.a = i * 10;
m_data.b = i * m_data.a;
m_data.c = m_data.b / m_data.a;
} else {
pData->a = ++i; // This will crash - pData is not set yet.
} // end if
As you can see, it replaces the unnecessary pointer dereference with a direct write to the member struct. However it does not do this in the else-branch. It also removes the pData-assignment. Since the pointer is now still unitialized, the program will crash in the else-branch.
Of course there are various things which could be improved here, so you might blame it on the programmer:
Forget about the pointer and do what the optimizer does - use m_data directly.
Initialize pData to nullptr - that way the optimizer knows that the else-branch will fail if the pointer is never assigned. At least it seems to solve the problem in my test-environment.
Move the pointer assignment in front of the loop (effectively initializing pData with &m_data, which then could also be a reference instead of a pointer (for good measure). This makes sense because pData is needed in all cases so there is no reason to do this inside the loop.
The code is obviously smelly, to say the least, and I'm not trying to "blame" the optimizer for doing this. But I'm asking: What am I doing wrong? The program might be ugly, but it's valid code...
I should add that I'm using VS2012 with C++/CLI and v110_xp-Toolset. Optimization is set to /O2. Please also note that if you really want to reproduce the problem (that's not really the point of this question though) you need to play around with the complexity of the program. This is a very simplified example and the optimizer sometimes doesn't remove the pointer assignment. Hiding &m_data behind a function seems to "help".
EDIT:
Q: How do I know that the compiler is optimizing it to something like the example provided?
A: I'm not very good at reading assembler, I have looked at it however and have made 3 observations which make me believe that it's behaving this way:
As soon as optimization kicks in (adding more assignments usually does the trick) the pointer assignment has no associated assembler statement. It also hasn't been moved up to the declaration, so it's really left uninitialized it seems (at least to me).
In cases where the program crashes, the debugger skips the assignment statement. In cases where the program runs without problems, the debugger stops there.
If I watch the content of pData and the content of m_data while debugging, it clearly shows that all assignments in the if-branch have an effect on m_data and m_data receives the correct values. The pointer itself it still pointing to the same uninitialized value it had from the beginning. Therefore I have to assume that it is in fact not using the pointer to make the assignments at all.
Q: Does it have to do anything with i (Loop unrolling)?
A: No, the actual program actually uses do { ... } while() to loop over a SQL SELECT-resultset so the iteration count is completely runtime-specific and cannot be predetermined by the compiler.
It sure looks like an bug to me. It's fine for the optimizer to eliminate the unnecessary redirection, but it should not eliminate the assignment to pData.
Of course, you can work around the problem by assigning to pData before the loop (at least in this simple example). I gather that the problem in your actual code isn't as easily resolved.
I also vote for an optimizer bug if it is really reproducible in this example. To overrule the optimizer you could try to declare pData as volatile.

In which case is if(a=b) a good idea? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Inadvertent use of = instead of ==
C++ compilers let you know via warnings that you wrote,
if( a = b ) { //...
And that it might be a mistake that you certainly wanted to write:
if( a == b ) { //...
But is there a case where the warning should be ignored, because it's a good way to use this "feature"?
I don't see any code clarity reason possible, so is there a case where it’s useful?
Two possible reasons:
Assign & Check
The = operator (when not overriden) normally returns the value that it assigned. This is to allow statements such as a=b=c=3. In the context of your question, it also allows you to do something like this:
bool global;//a global variable
//a function
int foo(bool x){
//assign the value of x to global
//if x is equal to true, return 4
if (global=x)
return 4;
//otherwise return 3
return 3;
}
...which is equivalent to but shorter than:
bool global;//a global variable
//a function
int foo(bool x){
//assign the value of x to global
global=x;
//if x is equal to true, return 4
if (global==true)
return 4;
//otherwise return 3
return 3;
}
Also, it should be noted (as stated by Billy ONeal in a comment below) that this can also work when the left-hand argument of the = operator is actually a class with a conversion operator specified for a type which can be coerced (implicitly converted) to a bool. In other words, (a=b) will evaulate to true or false if a is of a type which can be coerced to a boolean value.
So the following is a similar situation to the above, except the left-hand argument to = is an object and not a bool:
#include <iostream>
using namespace std;
class Foo {
public:
operator bool (){ return true; }
Foo(){}
};
int main(){
Foo a;
Foo b;
if (a=b)
cout<<"true";
else
cout<<"false";
}
//output: true
Note: At the time of this writing, the code formatting above is bugged. My code (check the source) actually features proper indenting, shift operators and line spacing. The <'s are supposed to be <'s, and there aren't supposed to be enourmous gaps between each line.
Overridden = operator
Since C++ allows the overriding of operators, sometimes = will be overriden to do something other than what it does with primitive types. In these cases, the performing the = operation on an object could return a boolean (if that's how the = operator was overridden for that object type).
So the following code would perform the = operation on a with b as an argument. Then it would conditionally execute some code depending on the return value of that operation:
if (a=b){
//execute some code
}
Here, a would have to be an object and b would be of the correct type as defined by the overriding of the = operator for objects of a's type. To learn more about operator overriding, see this wikipedia article which includes C++ examples: Wikipedia article on operator overriding
while ( (line = readNextLine()) != EOF) {
processLine();
}
You could use to test if a function returned any error:
if (error_no = some_function(...)) {
// Handle error
}
Assuming that some_function returns the error code in case of an error. Or zero otherwise.
This is a consequence of basic feature of the C language:
The value of an assignment operation is the assigned value itself.
The fact that you can use that "return value" as the condition of an if() statement is incidental.
By the way, this is the same trick that allows this crazy conciseness:
void strcpy(char *s, char *t)
{
while( *s++ = *t++ );
}
Of course, the while exits when the nullchar in t is reached, but at the same time it is copied to the destination s string.
Whether it is a good idea, usually not, as it reduce code readability and is prone to errors.
Although the construct is perfectly legal syntax and your intent may truly be as shown below, don't leave the "!= 0" part out.
if( (a = b) != 0 ) {
...
}
The person looking at the code 6 months, 1 year, 5 years from now, at first glance, is simply going to believe the code contains a "classic bug" written by a junior programmer and will try to "fix" it. The construct above clearly indicates your intent and will be optimized out by the compiler. This would be especially embarrassing if you are that person.
Your other option is to heavily load it with comments. But the above is self-documenting code, which is better.
Lastly, my preference is to do this:
a = b;
if( a != 0 ) {
...
}
This is about a clear as the code can get. If there is a performance hit, it is virtually zero.
A common example where it is useful might be:
do {
...
} while (current = current->next);
I know that with this syntax you can avoid putting an extra line in your code, but I think it takes away some readability from the code.
This syntax is very useful for things like the one suggested by Steven Schlansker, but using it directly as a condition isn't a good idea.
This isn't actually a deliberate feature of C, but a consequence of two other features:
Assignment returns the assigned value
This is useful for performing multiple assignments, like a = b = 0, or loops like while ((n = getchar()) != EOF).
Numbers and pointers have truth values
C originally didn't have a bool type until the 1999 standard, so it used int to represent Boolean values. Backwards compatibility requires C and C++ to allow non-bool expressions in if, while, and for.
So, if a = b has a value and if is lenient about what values it accepts, then if (a = b) works. But I'd recommend using if ((a = b) != 0) instead to discourage anyone from "fixing" it.
You should explicitly write the checking statement in a better coding manner, avoiding the assign & check approach. Example:
if ((fp = fopen("filename.txt", "wt")) != NULL) {
// Do something with fp
}
void some( int b ) {
int a = 0;
if( a = b ) {
// or do something with a
// knowing that is not 0
}
// b remains the same
}
But is there a case where the warning
should be ignored because it's a good
way to use this "feature"? I don't see
any code clarity reason possible so is
there a case where its useful?
The warning can be suppressed by placing an extra parentheses around the assignment. That sort of clarifies the programmer's intent. Common cases I've seen that would match the (a = b) case directly would be something like:
if ( (a = expression_with_zero_for_failure) )
{
// do something with 'a' to avoid having to reevaluate
// 'expression_with_zero_for_failure' (might be a function call, e.g.)
}
else if ( (a = expression2_with_zero_for_failure) )
{
// do something with 'a' to avoid having to reevaluate
// 'expression2_with_zero_for_failure'
}
// etc.
As to whether writing this kind of code is useful enough to justify the common mistakes that beginners (and sometimes even professionals in their worst moments) encounter when using C++, it's difficult to say. It's a legacy inherited from C and Stroustrup and others contributing to the design of C++ might have gone a completely different, safer route had they not tried to make C++ backwards compatible with C as much as possible.
Personally I think it's not worth it. I work in a team and I've encountered this bug several times before. I would have been in favor of disallowing it (requiring parentheses or some other explicit syntax at least or else it's considered a build error) in exchange for lifting the burden of ever encountering these bugs.
while( (l = getline()) != EOF){
printf("%s\n", l);
}
This is of course the simplest example, and there are lots of times when this is useful. The primary thing to remember is that (a = true) returns true, just as (a = false) returns false.
Preamble
Note that this answer is about C++ (I started writing this answer before the tag "C" was added).
Still, after reading Jens Gustedt's comment, I realized it was not the first time I wrote this kind of answer. Truth is, this question is a duplicate of another, to which I gave the following answer:
Inadvertent use of = instead of ==
So, I'll shamelessly quote myself here to add an important information: if is not about comparison. It's about evaluation.
This difference is very important, because it means anything can be inside the parentheses of a if as long as it can be evaluated to a Boolean. And this is a good thing.
Now, limiting the language by forbidding =, where all other operators are authorized, is a dangerous exception for the language, an exception whose use would be far from certain, and whose drawbacks would be numerous indeed.
For those who are uneasy with the = typo, then there are solutions (see Alternatives below...).
About the valid uses of if(i = 0) [Quoted from myself]
The problem is that you're taking the problem upside down. The "if" notation is not about comparing two values like in some other languages.
The C/C++ if instruction waits for any expression that will evaluate to either a Boolean, or a null/non-null value. This expression can include two values comparison, and/or can be much more complex.
For example, you can have:
if(i >> 3)
{
std::cout << "i is less than 8" << std::endl
}
Which proves that, in C/C++, the if expression is not limited to == and =. Anything will do, as long as it can be evaluated as true or false (C++), or zero non-zero (C/C++).
About valid uses
Back to the non-quoted answer.
The following notation:
if(MyObject * p = findMyObject())
{
// uses p
}
enables the user to declare and then use p inside the if. It is a syntactic sugar... But an interesting one. For example, imagine the case of an XML DOM-like object whose type is unknown well until runtime, and you need to use RTTI:
void foo(Node * p_p)
{
if(BodyNode * p = dynamic_cast<BodyNode *>(p_p))
{
// this is a <body> node
}
else if(SpanNode * p = dynamic_cast<SpanNode *>(p_p))
{
// this is a <span> node
}
else if(DivNode * p = dynamic_cast<DivNode *>(p_p))
{
// this is a <div> node
}
// etc.
}
RTTI should not be abused, of course, but this is but one example of this syntactic sugar.
Another use would be to use what is called C++ variable injection. In Java, there is this cool keyword:
synchronized(p)
{
// Now, the Java code is synchronized using p as a mutex
}
In C++, you can do it, too. I don't have the exact code in mind (nor the exact Dr. Dobb's Journal's article where I discovered it), but this simple define should be enough for demonstration purposes:
#define synchronized(lock) \
if (auto_lock lock_##__LINE__(lock))
synchronized(p)
{
// Now, the C++ code is synchronized using p as a mutex
}
(Note that this macro is quite primitive, and should not be used as is in production code. The real macro uses a if and a for. See sources below for a more correct implementation).
This is the same way, mixing injection with if and for declaration, you can declare a primitive foreach macro (if you want an industrial-strength foreach, use Boost's).
About your typo problem
Your problem is a typo, and there are multiple ways to limit its frequency in your code. The most important one is to make sure the left-hand-side operand is constant.
For example, this code won't compile for multiple reasons:
if( NULL = b ) // won't compile because it is illegal
// to assign a value to r-values.
Or even better:
const T a ;
// etc.
if( a = b ) // Won't compile because it is illegal
// to modify a constant object
This is why in my code, const is one of the most used keyword you'll find. Unless I really want to modify a variable, it is declared const and thus, the compiler protects me from most errors, including the typo error that motivated you to write this question.
But is there a case where the warning should be ignored because it's a good way to use this "feature"? I don't see any code clarity reason possible so is there a case where its useful?
Conclusion
As shown in the examples above, there are multiple valid uses for the feature you used in your question.
My own code is a magnitude cleaner and clearer since I use the code injection enabled by this feature:
void foo()
{
// some code
LOCK(mutex)
{
// some code protected by a mutex
}
FOREACH(char c, MyVectorOfChar)
{
// using 'c'
}
}
... which makes the rare times I was confronted to this typo a negligible price to pay (and I can't remember the last time I wrote this type without being caught by the compiler).
Interesting sources
I finally found the articles I've had read on variable injection. Here we go!!!
FOR_EACH and LOCK (2003-11-01)
Exception Safety Analysis (2003-12-01)
Concurrent Access Control & C++ (2004-01-01)
Alternatives
If one fears being victim of the =/== typo, then perhaps using a macro could help:
#define EQUALS ==
#define ARE_EQUALS(lhs,rhs) (lhs == rhs)
int main(int argc, char* argv[])
{
int a = 25 ;
double b = 25 ;
if(a EQUALS b)
std::cout << "equals" << std::endl ;
else
std::cout << "NOT equals" << std::endl ;
if(ARE_EQUALS(a, b))
std::cout << "equals" << std::endl ;
else
std::cout << "NOT equals" << std::endl ;
return 0 ;
}
This way, one can protect oneself from the typo error, without needing a language limitation (that would cripple language), for a bug that happens rarely (i.e., almost never, as far as I remember it in my code).
There's an aspect of this that hasn't been mentioned: C doesn't prevent you from doing anything it doesn't have to. It doesn't prevent you from doing it because C's job is to give you enough rope to hang yourself by. To not think that it's smarter than you. And it's good at it.
Never!
The exceptions cited don't generate the compiler warning. In cases where the compiler generates the warning, it is never a good idea.
RegEx sample
RegEx r;
if(((r = new RegEx("\w*)).IsMatch()) {
// ... do something here
}
else if((r = new RegEx("\d*")).IsMatch()) {
// ... do something here
}
Assign a value test
int i = 0;
if((i = 1) == 1) {
// 1 is equal to i that was assigned to a int value 1
}
else {
// ?
}
My favourite is:
if (CComQIPtr<DerivedClassA> a = BaseClassPtr)
{
...
}
else if (CComQIPtr<DerivedClassB> b = BaseClassPtr)
{
...
}