I was shown a sample program to demonstrate recursion which looks like it should not work but does. The logic is pretty clear but why does it work even when the recursed function call is not returned? It seems like the return command breaks out of the stack even if it isn't requested to. Is this a language standard or a gcc thing? I saw it with C and C++ compiled with gcc on Windows and Linux.
#include <iostream>
#include <cstdlib>
using namespace std;
int isprime(int num, int i)
{
if (i == 1) {
return 1;
}
else {
if (num % i == 0)
return 0;
else
isprime(num, i-1); // should be returned
}
}
int main(int argc, char** argv)
{
int input = atoi(argv[1]);
cout << input << "\t" << isprime(input, input/2) << "\n";
}
Things like that only work if, by accident, the return value happens to be in the register where the caller expects it. This only works if your compiler actually realizes this as a recursive function. Technically, it is undefined behavior to use the return value of a function that doesn't provide one.
Edit: On modern architectures, a function's return value is passed in a specific hardware register whenever the type allows it. When you call your function recursively, at the bottom of the recursion that hardware register is in all cases set to the expected value. If, by chance, that register is never changed while popping back up from the recursion, you end up with the correct value.
None of this would work if the return value were placed somewhere on the stack of the (recursive) callers.
In any case, all of that should be caught by any modern compiler and give you a warning. If it doesn't, you don't have a good compiler, or you are using too lenient command line options.
New year's eve special: In the real world, code like this (with the return) wouldn't even be realized as a recursive function. With not too much effort you will find an iterative variant of that function, and any modern decent compiler should be able to find it as well if you ask for maximal optimization.
A lot here depends on what you mean by "it works".
To try and answer the main point of your question: a function will return when the end of the function is reached, whether or not a return statement is met.
I would expect to see compiler warnings telling you that not all control paths return a value, in C++ at any rate. That results in undefined behaviour; see this question:
not returning a value from a non-void returning function
I would say that this example "works" as after a prime is found and isPrime has returned, then the next function up the stack is also free to return. Nothing depends on the return value of isPrime either, so the program will run back up the stack and output something.
...but as behaviour is undefined, the value that actually gets output is likely to be junk. If you are seeing 0 & 1 consistent with primes as input, then wow.
If you think this is working, I would look at testing more broadly with different values.
Also, have you been building with any "debug" settings? If so, try this again with debug settings off, as these sometimes do extra work to keep things like uninitialised memory clean.
I can explain exactly what happens:
The function is called, and it recurses into itself until it reaches a return, either at the modulo test (return 0) or at the end of the recursion (return 1). At this point the function returns to its caller, which is isprime itself. But there is no more code in that function to execute, so it immediately returns without any further action, and whatever the inner call left in the return-value register is still there.
However, you could easily break this by, for example, adding printf("Done for %d, %d\n", num, i); after the call to isprime() [it doesn't have to be in the if-statement]. Adding a C++ object that is created and destroyed on entry/exit of the function is another example.
You're just being lucky that it works. And it's very fragile and easy to break - compile it with a different compiler (or with different optimization settings, or a new version of the compiler, or a million other things), and it may well break.
Aren't you forgetting a return statement? For normal recursion you need to put a return before isprime(num,i-1); as well.
I guess this should even give a compiler warning if you compile with strict rules, because the function must always return an int, and now it does not on every path (unless your compiler happens to fix this up).
I'm struggling with a non-sensical if statement...
Consider this code in a C++ file
if (coreAudioDevice) {
delete coreAudioDevice;
coreAudioDevice = nullptr;
}
coreAudioDevice = AudioDevice::GetDevice(defaultOutputDeviceID, false, coreAudioDevice, true);
if (coreAudioDevice)
{
coreAudioDevice->setDefaultDevice(true);
// we use the quick mode which skips initialisation; cache the device name (in AudioDevice)
// using an immediate, blocking look-up.
char devName[256];
coreAudioDevice->GetName(devName, 256);
AUDINFO ("Using default output device %p #%d=\"%s\".\n",
defaultOutputDeviceID, coreAudioDevice, coreAudioDevice->GetName());
}
else
AUDERR ("Failed to obtain a handle on the default device (%p)\n", coreAudioDevice);
calling a function in an ObjC++ file:
AudioDevice *AudioDevice::GetDevice(AudioObjectID devId, bool forInput, AudioDevice *dev, bool quick)
{
if (dev) {
if (dev->ID() != devId) {
delete dev;
} else {
return nullptr;
}
}
dev = new AudioDevice(devId, quick, forInput);
return dev;
}
Which leads to the following terminal output:
ERROR coreaudio.cc:232 [init]: Failed to obtain a handle on the default device (0x7f81a1f1f1b0)
Evidently the if shouldn't fail: for the else branch to run, coreAudioDevice must supposedly be NULL, yet the else branch then prints a non-null value for this very variable.
I tried different compiler options and a different compiler (clang 4.0.1 vs. 5.0.1), apparently there is really something fishy in my code. Any thoughts?
Reaching the end of the function without returning a value is undefined behavior in C++.
See http://en.cppreference.com/w/cpp/language/ub and What are all the common undefined behaviours that a C++ programmer should know about?.
So the call setDefaultDevice() can legally result in anything. The compiler is free to compile the program into an executable that can do anything, when the program's control flow leads to undefined behavior (i.e. the call to setDefaultDevice()).
In this case, entering the if block with coreAudioDevice non-zero leads to UB. So the optimizing compiler foresees this and chooses to then make it go into the else branch instead. Like this it can remove the first branch and the if entirely, to produce more optimized code.
See https://blogs.msdn.microsoft.com/oldnewthing/20140627-00/?p=633
Without optimizations the program should normally run as expected.
Well, at least I found a reason, but no understanding (yet).
I had defined this method, without noticing the compiler warning (amidst a bunch of deprecation warnings printed multiple times because of concurrent compilation...):
bool setDefaultDevice(bool isDefault)
{
mDefaultDevice = isDefault;
}
Indeed, no return value.
Notice that I call this method inside the skipped if block - so theoretically I never got the chance to do that. BTW, it's what led me to discover this strange issue.
The issue goes away when I remove the call or when I make the method void as intended.
I think this also explains the very strange way of crashing I've seen: somehow the optimiser gets completely confused because of this. I'm tempted to call this a compiler bug; I don't use the return value from the method, so flow shouldn't be affected at all IMHO.
Ah, right. Should I read that as "free to build an exec that can do anything EXCEPT the sensical thing"? If so, that former boss of mine had a point banning C++ as an anomaly (the exact word was French, "saleté")...
Anyway, I can understand why the behaviour would be undefined when you don't know a function doesn't actually put a return value on the stack. You'd be popping bytes off the stack after the return, messing things up. (Read heap for stack if necessary =] ). I'm guessing that's what would happen when you run this kind of code without optimisation, in good ole C or with the buggy method defined out of line (so the optimiser cannot know that it's buggy).
But once you know that a function doesn't actually return a value and you see that the value wouldn't be used anyway, you should be able to emit code that doesn't pop the corresponding number of bytes. IOW, code that behaves as expected. With a big fat compile-time warning. Presuming the standard allows this that'd be the sensical thing to do, rather than optimise the entire tainted block away because that'd be faster. If that's indeed the reasoning followed in clang it doesn't actually inspire confidence...
Does the standard say this cannot be an error by default? I'd certainly prefer that over the current behaviour!
Should the super simple code below produce an error? Should it produce anything?
int main()
{
return -1;
}
I am just starting to work through the 'C++ Primer' and supposedly from Exercise 1.2:
A return value of -1 is often treated as an indicator that the program failed. Recompile and rerun your program to see how your system treats a failure indicator from main.
It seems like my system treats a failure indicator as business as usual and it is driving me crazy!!!
I can return any number, positive or negative and there won't be an error. Only when I type in a string like 'return cisstupid;' I would (obviously) get an error. An hour of googling has not yielded anything and I have tried running the code with -Wpedantic, -Wall, -Werror (i.e. g++ -Wpedantic -o test test.cpp). The system is Windows 8.1 with mingw installed.
The return keyword is used by the called function to return a value to its caller. This value can be almost anything; how it is interpreted depends entirely on how the calling function handles it.
int main()
{
return -1;
}
Here you are simply returning a negative number using return, whose interpretation is handled by the caller (here, the operating system).
This is a convention, not a protocol or rule, to treat this return value as an operational error for the program. Depending on the calling system, the return value can affect the outcome; as CiaPan stated, there is a big difference between "often" and "always".
In short, the interpretation of return -1 depends on the calling system.
Programs can complete with an error. I'm not sure what kind of shell you are using, but with a bash shell, you can type echo $? to get the return code of the last executed program.
To understand why this is, consider this: Most of the time, you do not want the entire system to come to a halt. You do, however, want to know that something went wrong. So it is common to call something, get the return value, then react appropriately, rather than let the entire thing blow up.
If you're on Windows, yes, returning anything is business as usual. I'm not sure about other systems, but I've never heard of a different return value from main being used to indicate to the user that something bad has happened.
The actual use for return values is just as you know them, to get information back. The return value of main is used the same way. If you're in A.exe and call B.exe, then you can use main's return value to indicate whether or not B.exe ran successfully. However, where you're at now, you'll never do that, hence you just always return 0.
I was looking over some example functions and methods (I'm currently in a C++ class), and I noticed that there were a few functions that, rather than being void, they were something like
int myFunction() {
// ...;
return 0;
}
Where the ellipsis is obviously some other statement. Why are they returning zero? What's the point of returning a specific value every time you run a function?
I understand that main() has to be int (at least according to the standards) because it is related (or is?) the exit code and thus works with the operating system. However, I can't think of a reason a non-main function would do this.
Is there any particular reason why someone might want to do this, as opposed to simply making a void function?
If that's really what they're doing, returning 0 regardless of what the function does, then it's entirely pointless and they shouldn't be doing it.
In the C world, an int return type is a convention so that you can return your own "error code", but not only is this not idiomatic C++ but if, again, your programmer is always returning 0, then it's entirely silly.
Specifically:
I understand that main() has to be int (at least according to the standards) because it is related (or is?) the exit code and thus works with the operating system. However, I can't think of a reason a non-main function would do this.
I agree.
There's a common convention of int functions returning 0 for success and some non-zero error code for failure.
An int function that always returns 0 might as well be a void function if viewed in isolation. But depending on the context, there might be good reasons to make it compatible with other functions that return meaningful results. It could mean that the function's return type won't have to be changed if it's later modified so it detects errors -- or it might be necessary for its declaration to be compatible with other int-returning functions, if it's used as a callback or template argument.
I suggest examining other similar functions in the library or program.
It's a convention, particularly among C programmers, to return 0 if the function did not experience any errors and return a nonzero value if there was an error.
This has carried over into C++, and although it's less common and less of a convention due to exception handling and other more object-oriented-friendly ways of handling errors, it does come up often enough.
One more issue that was not touched by other answers. Within the ellipses may be another return statement:
int myFunction() {
// ...;
if (error)
return code;
// ...;
return 0;
}
in which case myFunction is not always returning 0, but rather only when no error has occurred. Such return statements are often preferred over more structured but more verbose if/else code blocks, and may often be disguised within long, sloppy code.
Most of the time function like this should be returning void.
Another possibility is that this function is one of a series of closely related functions that have the same signature. The int return value may signal the status, say returning 0 for success, and a few of these functions always succeed. Changing the signature may break the consistency, or would make the function unusable as a function object since the signature no longer matches.
Is there any particular reason why someone might want to do this, as opposed to simply making a void function?
Why does your mother cut the ends off the roast before putting it in the oven? Answer: Because that's what her grandmother did. However, her grandmother did that for a simple reason: Her roast pan wasn't big enough to hold a full-sized roast.
I work with a simulation tool that in its earliest incarnations required that all functions callable by the simulation engine must return a success status: 0=success, non-zero=failure. Functions that could never fail were coded to always return zero. The simulation engine has been able to accommodate functions that return void for a long, long time. That returning an integer success code was the required behavior in some previous millennium hasn't stopped cargo cult programmers from carrying this habit of writing functions that always return zero forward to the current day.
In certain programming languages you find procedures and functions. In C, C++ and similar languages you don't. Rather you only have functions.
In practice, a procedure is a part of a program that performs a certain task. A function on the other hand is like a procedure but the function can return an answer back.
Since C++ has only functions, how would you create a procedure? That's when you would either create a void function or return any value you like to show that the task is complete. It doesn't have to be 0. You can even return a character if you like to.
Take, for example, the cout statement. It just outputs something, and the caller normally ignores whatever it gives back (strictly speaking, operator<< returns the stream itself). This works like a procedure.
Now consider a math function like tan(x). It is meant to use x and return an answer back to the program that called it. In this case, you cannot return just anything. You must return the value of the TAN operation.
So if you need to write your own functions, you must return a value based on what you're doing. If there's nothing to return, you may just write a void function or return a dummy value like 0 or anything else.
In practice though, it's common to find functions returning 0 to indicate that 'all went off well' but this is not necessarily a rule.
here's an example of a function I would write, which returns a value:
float Area ( int radius)
{
float Answer = 3.14159 * radius * radius;
return Answer;
}
This takes the radius as a parameter and returns the calculated answer (area). In this case you cannot just say return 0.
I hope this is clear.
#include <setjmp.h>
#include <vector>
int main(int argc, char**) {
std::vector<int> foo(argc);
jmp_buf env;
if (setjmp(env)) return 1;
}
Compiling the above code with GCC 4.4.1, g++ test.cc -Wextra -O1, gives this confusing warning:
/usr/include/c++/4.4/bits/stl_vector.h: In function ‘int main(int, char**)’:
/usr/include/c++/4.4/bits/stl_vector.h:1035: warning: variable ‘__first’ might be clobbered by ‘longjmp’ or ‘vfork’
Line 1035 of stl_vector.h is in a helper function used by the vector(n, value) constructor that I invoke while constructing foo. The warning disappears if the compiler can figure out the argument value (e.g. it is a numeric literal), so I use argc in this test case because the compiler cannot determine the value of that.
I guess the warning might be because of compiler optimizing the vector construction so that it actually happens after the setjmp landing point (which seems to be the case here when the constructor argument depends on a parameter of the function).
How can I avoid the problem, preferably without having to break the setjmp part to another function?
Not using setjmp is not an option because I am stuck with a bunch of C libraries that require using it for error handling.
The rule is that any non-volatile, non-static local variable in the stack frame calling setjmp might be clobbered by a call to longjmp. The easiest way to deal with it is to ensure that the frame in which you call setjmp doesn't contain any such variables you care about. This can usually be done by putting the setjmp into a function by itself and passing in references to things that have been declared in another function that doesn't call setjmp:
#include <setjmp.h>
#include <vector>
int wrap_libcall(std::vector<int> &foo)
{
jmp_buf env;
// no other local vars
if (setjmp(env)) return 1;
// do stuff with your library that might call longjmp
return 0;
}
int main(int argc, char**) {
std::vector<int> foo(argc);
return wrap_libcall(foo);
}
Note also that in this context, clobbering really just means resetting to the value it had when setjmp was called. So if longjmp can never be called after a modification of a local, you're ok too.
Edit
The exact quote from the C99 spec on setjmp is:
All accessible objects have values, and all other components of the abstract machine
have state, as of the time the longjmp function was called, except that the values of
objects of automatic storage duration that are local to the function containing the
invocation of the corresponding setjmp macro that do not have volatile-qualified type
and have been changed between the setjmp invocation and longjmp call are
indeterminate.
This is not a warning that you should ignore, longjmp() and C++ objects don't get along with each other. The problem is that the compiler automatically emits a destructor call for your foo object. A longjmp() can bypass the destructor call.
C++ exceptions unwind stack frames too, but they guarantee that destructors of local objects will be called. There is no such guarantee from longjmp(). Finding out if longjmp() is going to bite you requires carefully analyzing the local variables in each function which might be terminated early due to the longjmp(). That's not easy.
As evidenced by the line number 1035 in the error message, your code snippet has considerably simplified the actual problem code. You went too far. There is no clue as to how you are using 'first'. The problem is that the compiler can't figure that out even in the real code. It is afraid that the value of 'first' after a non-zero return from 'setjmp' may not be what you think it is. This is because you changed its value both before and after the first call (zero return) to 'setjmp'. If the variable was stored in a register, the value will probably be different than if it were stored in memory. So the compiler is being conservative by giving you the warning.
To take a blind leap and answer the question, you may be able to get rid of the warning message by qualifying the declaration of 'first' with 'volatile'. You could also try making 'first' global. Perhaps by dropping the optimization level (-O flag), you could cause the compiler to keep variables in memory. These are quick fixes, and may actually hide a bug.
You should really take a look at your code, and how you are using 'first'. I'll take another wild guess, and say you may be able to eliminate that variable. Could that name, 'first', mean you are using it to indicate a first call (zero return) to 'setjmp'? If so, get rid of it - redesign your logic.
If the real code just exits on a non-zero return from 'setjmp' (as in the snippet), then the value of 'first' doesn't matter in that logic path. Don't use it on both sides of the 'setjmp'.
The quick answer: drop the -O1 flag or regress the compiler to an earlier version. Either one made the warning disappear on my system. I had to build and use gcc4.4 to get the warning in the first place. (darn that's a huge system)
No? I thought not.
I really don't understand everything C++ does with its objects, and exactly how they are deallocated. Yet OP's comment that the problem didn't occur if a constant value were used in place of 'argc' for the vector size gives me an opportunity to stick my neck out. I'll hazard a guess that C++ uses the '__first' pointer on deallocation only when the initial allocation is not a constant. At the higher level of optimization, the compiler uses the registers more and there is a conflict between the pre- and post-setjmp allocations ... I don't know, it makes no sense.
The general meaning of this warning is "Are you sure you know what you are doing?" The compiler doesn't know if you know what the value of '__first' will be when you do the longjmp and get a non-zero return from 'setjmp'. The question is whether its value after the (non-zero) return is the value that was put into the save buffer, or the value that you created after the save. In this case, it's confusing because you didn't know you were using '__first', and because in such a simple program, there is no (explicit) change to '__first'.
The compiler can't analyze logic flow in a complex program, so it apparently doesn't even try for any program. It allows for the possibility that you did change the value. So it just gives you a friendly 'heads-up'. The compiler is second guessing you, trying to be helpful.
If you are stubborn with your choice of compiler and optimization, there is a programming fix. Save the environment before the allocation of the vector. Move the 'setjmp' up to the top of the program. Depending on the vector use and the error logic in the real program, this may require other changes.
edit 1/21 -------
my justification (using g++-mp-4.4 -Wextra -O1 main.cpp):
#include <setjmp.h>
#include <vector>
#include <iostream>
int main(int argc, char**) {
jmp_buf env;
int id = -1, idd = -2;
if ((id=setjmp(env)))
idd = 1;
else
idd = 0;
std::cout<<"Start with "<< id << " " << idd <<std::endl;
std::vector<int> foo(argc );
if(id != 4)
longjmp(env, id+1);
std::cout<<"End with "<< id << " " << idd <<std::endl;
}
No warnings; a.out produced:
Start with 0 0
Start with 1 1
Start with 2 1
Start with 3 1
Start with 4 1
End with 4 1