function attribute returns_twice - c++

I was just looking up function attributes for gcc
(http://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Function-Attributes.html)
and came across the returns_twice attribute.
I am absolutely clueless as to when a function can return twice. I quickly looked up the vfork() and setjmp() functions mentioned there, but I still have no idea what an applicable scenario looks like. Has anyone used it, or can anyone explain a bit?

The setjmp function is analogous to creating a label (in the goto sense), as such you will first return from setjmp when you set the label, and then each time that you actually jump to it.
If it seems weird, rest assured, you should not be using setjmp in your daily programming. Or actually... you should probably not be using it at all. It is a very low-level facility that breaks the expected execution flow (much like goto) and, especially in C++, most of the invariants you could expect.

When you call setjmp, it establishes that as a return point, then execution continues at the code immediately following the setjmp call.
At some point later in the code, calling longjmp (with the jump buffer initialized by the previous call to setjmp) makes execution resume from that same point again (i.e., the code immediately following the call to setjmp).
Therefore, the original call returns normally, then at arbitrary later times, execution returns (or at least may return) to the same point again.
The attribute simply warns the compiler of that fact.
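To make the "returns twice" behaviour concrete, here is a minimal sketch. The names jump_point and count_setjmp_returns are made up for illustration, and the longjmp is placed in the same function only to keep the example short.

```cpp
#include <csetjmp>

// Illustrative names, not part of any library.
static std::jmp_buf jump_point;

// Counts how many times the single setjmp call below returns.
int count_setjmp_returns() {
    // volatile: the variable is modified between setjmp and longjmp,
    // so its value must survive the jump.
    volatile int returns = 0;

    if (setjmp(jump_point) == 0) {
        // First return: the direct call to setjmp yields 0.
        returns = returns + 1;
        std::longjmp(jump_point, 1);   // jump back to the setjmp call site
    } else {
        // Second return: the same setjmp call "returns" again, this time non-zero.
        returns = returns + 1;
    }
    return returns;
}
```

Calling count_setjmp_returns() should yield 2: one setjmp call site, two returns. That is the situation the returns_twice attribute describes to the compiler.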

Related

Is this a valid use of the noreturn attribute?

While working on a thread (fiber) scheduling class, I found myself writing a function that never returns:
// New thread, called on an empty stack
// (implementation details, exception handling etc omitted)
[[noreturn]] void scheduler::thread() noexcept
{
    current_task->state = running;
    current_task->run();
    current_task->state = finished;
    while (true) yield();
    // can't return, since the stack contains no return address.
}
This function is never directly called (by thread();). It is "called" only by a jmp from assembly code, right after switching to a new context, so there is no way for it to "return" anywhere. The call to yield(), at the end, checks for state == finished and removes this thread from the thread queue.
Would this be a valid use of the [[noreturn]] attribute? And if so, would it help in any way?
edit: Not a duplicate. I understand what the attribute is normally used for. My question is, would it do anything in this specific case?
I'd say that it is valid but pointless.
It's valid because the function does not return. The contract cannot be broken.
It's pointless because the function is never called from C++ code. So no caller can make use of the fact that the function does not return because there is no caller. And at the point of definition of the function, the compiler should not require your assistance to determine that code following the while statement is dead, including a function postlude if any.
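For contrast, here is a sketch of a case where a C++ caller does benefit from the attribute. The helper names fail and checked_divide are hypothetical and have nothing to do with the scheduler code in the question.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: never returns normally, it always throws.
[[noreturn]] void fail(const std::string& message) {
    throw std::runtime_error(message);
}

// Because fail() is declared [[noreturn]], the compiler knows control cannot
// fall off the end of this function, so no dummy return statement is needed
// after the call.
int checked_divide(int numerator, int denominator) {
    if (denominator != 0) {
        return numerator / denominator;
    }
    fail("division by zero");
}
```

Here the caller is ordinary C++ code, so the compiler can actually use the information. With a function that is only ever entered by a jmp from assembly, there is no such caller.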
Well, the jmp entering the function is a bit weird, but the answer to your question is "most likely no".
Why "most likely"? Because I don't believe you understand the idea of noreturn, or else you are stating your use case incorrectly.
First of all, the function is never entered (or so you state), which means that by default it is not a no-return function (as dead code, it could be removed by the compiler).
But let's consider that you are actually calling the function without realizing it (via that "JMP").
The idea of a no-return function is that it never reaches the end of its scope (or at least not in the normal way). This means that either the whole program is terminated within the function or an exception is thrown (so the function won't pop the stack in the normal way). std::terminate is a good example of such a function. If you call it unconditionally within your function, then your function becomes no-return as well.
In your case, you are checking if the thread is finished.
If you are in a scenario where the thread kills itself, this is the function that checks for thread completion, you call it from the thread itself (which I highly doubt, since the thread would be blocked by the while loop and never finish), and you force the thread to exit abruptly (how to do that is OS-specific), then yes, the function is indeed no-return, because execution on the stack won't be finished.
Needless to say, if you're in the above scenario you have HUGE problems with your program.
Most likely you are calling this function from another thread, or you are exiting the thread normally, cases in which the function won't be a no-return.

Why return 0 statement doesn't pass the value zero?

I have just started learning C++ and after getting accustomed to data types, I am learning about functions and variables in programs. However, I couldn't really follow what action is the return statement intended for.
My college course's slides say that the return statement returns control back to the OS if the caller is not a function. But I don't understand what returning program control back to the caller precisely means. Does it mean that the controller in the CPU goes on to execute the following instruction? They also say it passes the value 0. What does this mean? I don't see any such return value when I execute a program.
Please explain what it means to return program control to the caller, and also why the return value is not passed when the return statement is executed.
The statements in a function are executed one by one in precisely the order they are stated. When a statement is a function call, control is given to the called function and the statements in that function are processed until the end of the function; then control goes back to the original function, i.e. the caller.
A return statement can be used to immediately jump back to the caller without processing any further statements in a function. Depending on whether the function is supposed to return a value you can (and have to) specify a return value.
The sequence of jumps from one function to the other and from a statement to the next is referred to as the control flow. The machine can only process at most one statement at a time and the function with the current statement is said to have control. Of course this is simplified, but I think this simplification is appropriate with regard to your question.
The return value from main isn't really a return value. It's called an "exit status", and it goes to the program that ran your program. The purpose of this is, if someone is using your program along with several others, you can tell them that something went wrong and they should stop what they're doing.
Regarding the return value from normal functions, they will probably cover that later in the course you're taking. Unfortunately a good explanation of these is beyond the scope of this answer.
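As a small illustration of both kinds of return, here is a sketch. The function run_program is a hypothetical stand-in for main(), used so that the result can be inspected from code.

```cpp
#include <cstdlib>

// A normal function: return hands control, and a value, back to the caller.
int square(int x) {
    return x * x;              // the caller receives x * x
}

// Stand-in for main() (hypothetical name).
int run_program() {
    int result = square(4);    // execution continues here once square() returns
    if (result != 16) {
        return EXIT_FAILURE;   // non-zero exit status: something went wrong
    }
    return EXIT_SUCCESS;       // 0: everything went fine
}
```

If main() itself returned this value, nothing would be printed. The value goes to whatever launched the program; in a POSIX shell, for example, it can be inspected with echo $? right after the program finishes.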

Possibilities to quit a function

I was wondering about a general topic in C/C++.
Let's say we're executing a function A() which calls a function B(). Can we be sure that the call of B() in A() will always return "after" the call itself?
More generally, what are the possible ways to quit a function?
The C keywords are (Wikipedia) : auto, break, case, char, const (C89), continue, default, do, double, else, enum (C89), extern, float, for, goto, if, inline (C99), int, long, register, restrict (C99), return, short, signed (C89), sizeof, static, struct, switch, typedef, union, unsigned, void (C89), volatile (C89), while, _Bool (C99), _Complex (C99), _Imaginary (C99).
As far as I know, the interesting ones for this topic are:
break/continue: Used in loops or switch statements (as GCC told me when I tried), they can't exit a function.
goto: The scope of labels is restricted to the enclosing function, so a goto can't exit a function.
return: Can exit a function but always returns to the instruction after the call. We're safe with this one.
The exit()/abort() functions, which end the application. We won't return to the calling point, but... we won't return at all.
I think that covers the C language. Do you think there is another way to quit a function without returning to the calling point?
In C++, exceptions will obviously not return to the calling point. They will either go to a catch block or propagate to the calling function, looking for a catch block.
As far as I know, that would be the only case.
Thanks for helping me =)
In standard C (using setjmp/longjmp) and C++ (using exceptions), it is possible to effectively return to marked points closer to the root of the CFG. In effect, a function may never return, but if it does return, it will be to the point following the call.
However, the low-level nature of the setjmp mechanism actually makes it possible to implement coroutines (albeit in a non-portable way). POSIX attempted to improve on this situation by mandating makecontext and friends, which allow for explicit stack swapping, but those functions were deprecated in POSIX.1-2001 and removed from POSIX.1-2008, citing portability issues, with the suggestion that threads be used instead. Nonetheless, there are a number of coroutine libraries in use which use these features to allow C programmers to enjoy the flexibility of coroutines.
In coroutine control flow, while the execution path following a (co)call might be sinuous, it is still the case that a function call either never returns or eventually returns to the immediately following point. However, the low-level nature of the C library facilities makes it possible to implement more complex control flows, in which a given (co)call might, for example, return several times. (I've never seen this particular anomaly implemented in production code, but I can't claim to have seen even a tiny percentage of all production code in the world :) ).
A gcc extension to C allows for the use of "label values", which are pointers to labels in the code. These are real values (of type void *) so they can be passed as arguments to functions. (The gcc manual warns against doing this.) With some reverse-engineering, it would probably be possible to write a function which takes one or more label arguments and uses one of them as a return point. That would clearly be a misuse of the feature, would likely be neither portable nor future-proof, and would almost certainly break any coding standards in existence.
The interesting thing about the C library facilities, as opposed to C++ exceptions which are actually part of the core language, is that they really are functions; in C, as with many programming languages, functions can be called indirectly through function pointers so that it might not be readily computable through static analysis which function is being called at a given call site. So, at least in theory, I'd say all bets are off. But in practice it's probably a safe assumption that a function call will either eventually return to the immediately following point or return to somewhere down the call stack, possibly the operating system environment.
Look up setjmp() and longjmp(). setjmp() records a certain amount of local state at the point it's called. longjmp() will return, potentially across several levels of function, to the point where you called setjmp().
You can use it for a primitive form of exception handling in C. It's very, very rarely used.
You can exit a function with
return
silently at the end of its body (without explicit return in a void function())
an exception
a longjmp()
inline assembly jumping to some address
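I remember creating this function when I needed a way to quit a big program without exit():
#include <signal.h>     // kill(), SIGQUIT
#include <sys/types.h>  // pid_t
#include <unistd.h>     // getpid()

// Sends SIGQUIT to the current process.
// Returns -1 if kill() fails, 0 otherwise.
int my_exit()
{
    pid_t pid = getpid();
    if (kill(pid, SIGQUIT) == -1)
        return -1;
    return 0;
}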
'Let's say we're executing a function A() which calls a function B(), can we be sure that the call of B() in A() will always return "after" the call itself'. No, because:
'B' may raise an exception that is not caught in 'A'.
'B' may contain an infinite loop.
'B' may make a blocking OS call that never returns.
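The first case can be sketched as follows. A and B are the names from the question; the flag reached_after_call and the wrapper call_a_and_report are made up for the demonstration.

```cpp
#include <stdexcept>

// Illustrative flag: set only if control comes back to A() after calling B().
static bool reached_after_call = false;

void B() {
    throw std::runtime_error("B failed");   // B never returns normally
}

void A() {
    B();
    reached_after_call = true;   // skipped: the exception propagates out of A
}

// Calls A(), catches the exception further up, and reports whether the
// statement following the call to B() was ever executed.
bool call_a_and_report() {
    reached_after_call = false;
    try {
        A();
    } catch (const std::runtime_error&) {
        // control resumes here, not after the call to B() inside A()
    }
    return reached_after_call;
}
```

call_a_and_report() should return false, showing that the call to B() in A() never came back to the point after the call.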

Exit the entire recursion stack

I'm calling a function fooA from main() that calls another function fooB that is recursive.
When I wish to return, I keep using exit(1) to halt execution. What is the right way to exit when the recursion tree is deep?
Returning through the recursion stack may not help, because returning usually clears a partial solution I have built, and I don't want to do that. I want to execute more code from main().
I read that exceptions can be used; it would be nice to get a code snippet.
The goto statement won't work to hop from one function back to another; Nikos C. is correct that it wouldn't account for releasing the stack frames of each of the calls you've made, so when you got to the function you goto'ed to, the stack pointer would be pointing to the stack frame of the function you were just in... no, that just won't work. Similarly, you can't simply call (either directly, or indirectly via a function pointer) the function you want to end up in when your algorithm is done. You'd never get back to the context you were in prior to diving into your recursive algorithm. You could conceivably architect a system this way, but in essence each time you did this you'd "leak" what was currently on the stack (not quite the same as leaking heap memory, but a similar effect). And if you were deep into a highly recursive algorithm, that could be a lot of "leaked" stack space.
No, you need to somehow return back to the calling context. There are only three ways to do so in C++:
1. Exit each function in turn by returning from it to its caller, backing up through the call chain in an orderly fashion.
2. Throw an exception and catch it at the point right after you launched into your recursive algorithm (which automatically destroys any objects created by each function on the stack in an orderly fashion).
3. Use setjmp() & longjmp() to do something similar to throwing & catching an exception, but "throwing" a longjmp() will not destroy objects on the stack; if any such objects own heap allocations, those allocations will be leaked.
To do option 1, you have to write your recursive function such that once a solution is reached, it returns some sort of indication that it's complete to its caller (which may be the same function), and its caller sees that fact & relays that fact on to its caller by returning to it (which may be the same function), so on and so on, until finally all stack frames of the recursive algorithm are released and you return to whatever function called the first function in the recursive algorithm.
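A minimal sketch of option 1, assuming a simple binary tree. The Node type and the find_path function are hypothetical, not taken from the question. The boolean return value is the "I'm done" signal that each level relays to its caller, and the partial solution (the path) is only undone on branches that failed.

```cpp
#include <vector>

// Hypothetical tree node for the illustration.
struct Node {
    int value;
    Node* left;
    Node* right;
};

// Returns true as soon as target is found. On success, path holds the values
// from the root down to the target.
bool find_path(const Node* node, int target, std::vector<int>& path) {
    if (node == nullptr) {
        return false;
    }
    path.push_back(node->value);
    if (node->value == target) {
        return true;                       // solution reached: tell the caller
    }
    if (find_path(node->left, target, path) ||
        find_path(node->right, target, path)) {
        return true;                       // relay "done" upward, keep the path
    }
    path.pop_back();                       // dead end: undo this step only
    return false;
}
```

Because every level returns immediately once a child reports success, the stack unwinds in an orderly way without discarding the result.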
To do option 2, you wrap the call to your recursive algorithm in a try{...} and immediately after it you catch(){...} the expected thrown object (which could conceivably be the result of the computation, or just some object that lets the caller know "hey, I'm done, you know where to find the result"). Example:
try
{
    callMyRecursiveFunction(someArg);
}
catch( whateverTypeYouWantToThrow& result )
{
    // ...do whatever you want to do with the result,
    // including copy it to somewhere else...
}
...and in your recursive function, when you finish the results, you simply:
throw(whateverTypeYouWantToThrow(anyArgsItsConstructorNeeds));
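Since the question asked for a snippet, here is one concrete sketch of option 2. The SearchDone type and the functions search_from and find_index are made-up names; the recursion simply walks a vector so the example stays short.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical "result" type that is thrown once the recursion is finished.
struct SearchDone {
    std::size_t index;
};

// Recurses one element at a time; throws as soon as target is found.
void search_from(const std::vector<int>& values, std::size_t index, int target) {
    if (index == values.size()) {
        return;                            // not found: unwind normally
    }
    if (values[index] == target) {
        throw SearchDone{index};           // leave the whole recursion at once
    }
    search_from(values, index + 1, target);
}

// Returns the index of target, or -1 if it is not present.
long find_index(const std::vector<int>& values, int target) {
    try {
        search_from(values, 0, target);
    } catch (const SearchDone& done) {
        return static_cast<long>(done.index);
    }
    return -1;
}
```

The throw unwinds every recursive frame in one step, running destructors along the way, and execution continues in find_index, which is free to do more work afterwards.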
To do option 3...
#include <setjmp.h>
static jmp_buf jmp; // could be allocated other ways; the longjmp() user just needs to have access to it.
.
.
.
if (!setjmp(jmp)) // setjmp() returns zero 1st time, or whatever int value you send back to it with longjmp()
{
    callMyRecursiveFunction(someArg);
}
...and in your recursive function, when you finish the results, you simply:
longjmp(jmp, 1); // this passes 1 back to the setjmp(). If your result is an int, you
// could pass that back to setjmp(), but you can't pass zero back.
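Put together, option 3 might look like the sketch below. The names recursion_exit, found_depth, descend and run_descend are invented for the example, and it deliberately uses only plain ints so that no destructors are skipped by the jump.

```cpp
#include <csetjmp>

// Illustrative globals: the jump buffer and a place to leave the result.
static std::jmp_buf recursion_exit;
static int found_depth = -1;

// Recurses until target_depth is reached, then jumps straight back to the
// setjmp() in run_descend() without returning through the frames in between.
void descend(int depth, int target_depth) {
    if (depth == target_depth) {
        found_depth = depth;               // store the result before jumping
        std::longjmp(recursion_exit, 1);   // the value must be non-zero
    }
    descend(depth + 1, target_depth);
}

int run_descend(int target_depth) {
    found_depth = -1;
    if (setjmp(recursion_exit) == 0) {
        descend(0, target_depth);          // first pass: start the recursion
    }
    // Reached via longjmp(): setjmp() returned 1 and the if-body was skipped.
    return found_depth;
}
```

run_descend(5) should return 5. The result is passed through a static variable rather than a local one, because only state that outlives the abandoned frames is safe to read after the jump.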
The bad thing about using setjmp()/longjmp() is that if there are any stack-allocated objects still "alive" on the stack when you call longjmp(), execution will jump back to the setjmp() point, skipping the destructors for those objects. If your algorithm uses only POD types, that's not an issue. In practice it is often also harmless if the non-POD types your algorithm uses do NOT own any heap allocations (e.g. from malloc() or new), but strictly speaking the C++ standard leaves the behavior undefined whenever a longjmp() bypasses an object with a non-trivial destructor. If your algorithm uses non-POD types that contain heap allocations, then you're only safe with options 1 & 2 above.
But if your algorithm meets the criteria of being OK with setjmp()/longjmp(), and if your algorithm is buried under a ton of recursive calls at the point it finishes, setjmp()/longjmp() may be the fastest way back to the initial calling context. If that won't work, option 1 is probably your best bet in terms of speed. Option 2 may seem convenient (and would possibly eliminate a condition check at the start of each recursion call), but the overhead associated with the system automatically unwinding the callstack is somewhat significant.
It's typically said you should reserve exceptions for "exceptional events" (events expected to be very rare), and the overhead associated with unwinding the callstack is why. Older compilers used something akin to setjmp()/longjmp() to implement exceptions (setjmp() at the location of the try & catch, and longjmp() at the location of a throw), but there was of course extra overhead associated with determining what objects on the stack need to be destroyed, even if there are no such objects. Plus, every time you'd run across a try, it would have to save the context just in case there was a throw, and if exceptions are truly exceptional events, the time spent saving that context was simply wasted.
Newer compilers are now more likely to use what are known as "Zero Cost Exceptions" (a.k.a. Table Based Exceptions), which sounds like it would solve all the world's problems, but it doesn't. It makes normal runtime faster because there is no longer a need to save the context every time you run across a try, but in the event that a throw executes, there is now even more overhead associated with decoding information stored in massive tables that the runtime has to process in order to figure out how to unwind the stack based on the location where the throw was encountered and the content of the runtime stack. So exceptions aren't free, even though they're very convenient.
You'll find a lot of stuff on the internet where people make claims about how unreasonably expensive they are and how much they slow down your code, and you'll also find lots of stuff by people refuting those claims, with both sides presenting hard data to bolster their claims. What you should take away from the arguments is that using exceptions is great if you expect them to rarely occur, because they result in cleaner interfaces & logic that's free of a ton of condition checking for "badness" every time you make a function call. But you shouldn't use exceptions as a means of normal communication between a caller and its callees, because that mode of communication is significantly more expensive than simply using return values.
This happened to me while finding the path from the root to a node of a binary tree. I was using a stack to store the nodes in preorder, and the recursion wouldn't stop until the last node returned NULL. I used a global integer variable, i=1, and when I reached the node I was looking for, I set that variable to 0 and used while(i==0) return stack; to allow the program to go back up the call stack without popping my nodes off.

Iterating without incurring the cost of IF statements

My question is based on curiosity and not whether there is another approach to the problem or not. It is a strange/interesting question, so please read it with an open mind.
Let's assume there is a game loop that is being called every frame. The game loop in turn calls several functions through a myriad of if statements. For example, if the user has set GUI to false, then don't refresh the GUI; otherwise call RefreshGui(). There are many other if statements in the loop, and they call their respective functions if they are true. Some are if/if-else.../else chains, which are more costly in the worst case. Even the functions that are called when an if statement is true contain such logic: if the user wants raypicking on all objects, call FunctionA(); if the user wants raypicking on lights, call FunctionB(); ...; else call all functions. Hopefully you get the idea.
My point is, that is a lot of redundant if statements. So I decided to use function pointers instead. Now my assumption is that a function pointer is always going to be faster than an if statement. It is a replacement for if/else. So if the user wants to switch between two different camera modes, he/she presses the C key to toggle between them. The callback function for the keyboard changes the function pointer to the correct UpdateCamera function (in this case, the function pointer can point to either UpdateCameraFps() or UpdateCameraArcBall() )... you get the gist of it.
Now to the question itself. What if I have several update functions all with the same signature (let's say void (*Update)(float time) ), so that a function pointer can potentially point to any one of them. Then, I have a vector which is used to store the pointers. Then in my main update loop, I go through the vector and call each update function. I can remove/add and even change the order of the updates, without changing the underlying code. In the best case, I might only be calling one update function or in the worst case all of them, all with a very clean while loop and no nasty (potentially nested) if statements. I have implemented this part and it works great. I am aware, that, with each iteration of the while loop responsible for iterating through the vector, I am checking whether the itrBegin == itrEnd. More specifically while (itrBegin != itrEnd). Is there any way to avoid the call to the if statements? Can I use branch prediction to my advantage (or am I taking advantage of it already without knowing)?
Again, please take the question as-is, i.e. I am not looking for a different approach (although you are more than welcome to give one).
EDIT: A few replies state that this is an unneeded premature optimization and I should not be focusing on it and that the if-statement(s) cost is minuscule compared to the work done in all the separate update functions. Very true, and I completely agree, but that was not the point of the question and I apologize if I did not make the question clearer. I did learn quite a few new things with all the replies though!
"there is a game loop that is being called every frame"
That's a backwards way of describing it. A game loop doesn't run during a frame, a frame is handled in the body of the game loop.
"my assumption is that a function pointer is always going to be faster than an if statement"
Have you tested that? It's not likely to be true, especially if you're changing the pointer frequently (which really messes with the CPU's branch prediction).
"Can I use branch prediction to my advantage (or am I taking advantage of it already without knowing)?"
This is just wishful thinking. By having one indirect call inside your loop calling a bunch of different functions you are definitely working against the CPU branch prediction logic.
"More specifically while (itrBegin != itrEnd). Is there any way to avoid the call to the if statements?"
One thing you could do in order to avoid conditionals as you iterate the chain of functions is to use a linked list. Then each function can call the next one unconditionally, and you simply install your termination logic as the last function in the chain (longjmp or something). Or you could hopefully just never terminate, include glSwapBuffers (or the equivalent for your graphics API) in the list and just link it back to the beginning.
First, profile your code. Then optimize the parts that need it.
"if" statements are the least of your concerns. Typically, with optimization, you focus on loops, I/O operations, API calls (e.g. SQL), containers/algorithms that are inefficient and used frequently.
Using function pointers to try to optimize is typically the worst thing you can do. You kill any chance at code readability and work against the CPU and compiler. I recommend using polymorphism or just use the "if" statements.
To me, this is asking for an event-driven approach. Rather than checking every time if you need to do something, monitor for the incoming request to do something.
I don't know if you consider it a deviation from your approach, but it would reduce the number of if...then statements to 1.
while( active )
{
    // check message queue
    if( messages )
    {
        // act on each message and update flags accordingly
    }
    // draw based on flags (whether or not they changed is irrelevant)
}
EDIT: Also I agree with the poster who stated that the loop should not be based on frames; the frames should be based on the loop.
If the conditions checked by your ifs are not changing during the loop, you could check them all once, and set a function pointer to the function you'd like to call in that case. Then in the loop call the function the function pointer points to.
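A rough sketch of that last suggestion follows. The function names UpdateCameraFps and UpdateCameraArcBall come from the question; the counters, select_camera_update and run_frames are assumptions added so the effect can be observed.

```cpp
// Signature from the question: void (*Update)(float time).
using UpdateFn = void (*)(float);

// Illustrative counters so the effect of each update can be checked.
static float fps_total = 0.0f;
static float arcball_total = 0.0f;

void UpdateCameraFps(float time)     { fps_total += time; }
void UpdateCameraArcBall(float time) { arcball_total += time; }

// The condition is evaluated once, outside the loop...
UpdateFn select_camera_update(bool use_fps) {
    return use_fps ? UpdateCameraFps : UpdateCameraArcBall;
}

// ...and the loop body only makes an indirect call, with no per-frame if.
void run_frames(UpdateFn update, int frames, float dt) {
    for (int i = 0; i < frames; ++i) {
        update(dt);
    }
}
```

Whether this is actually faster than a well-predicted if is something only a profiler can tell; the sketch just shows the shape of the technique.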