I'm designing a preloader-based lock tracing utility that attaches to Pthreads, and I've run into a weird issue. The program works by providing wrappers that replace relevant Pthreads functions at runtime; these do some logging, and then pass the args to the real Pthreads function to do the work. They do not modify the arguments passed to them, obviously. However, when testing, I discovered that the condition variable pointer passed to my pthread_cond_wait() wrapper does not match the one that gets passed to the underlying Pthreads function, which promptly crashes with "futex facility returned an unexpected error code," which, from what I've gathered, usually indicates an invalid sync object passed in. Relevant stack trace from GDB:
#8 __pthread_cond_wait (cond=0x7f1b14000d12, mutex=0x55a2b961eec0) at pthread_cond_wait.c:638
#9 0x00007f1b1a47b6ae in pthread_cond_wait (cond=0x55a2b961f290, lk=0x55a2b961eec0)
at pthread_trace.cpp:56
I'm pretty mystified. Here's the code for my pthread_cond_wait() wrapper:
int pthread_cond_wait(pthread_cond_t* cond, pthread_mutex_t* lk) {
// log arrival at wait
the_tracer.add_event(lktrace::event::COND_WAIT, (size_t) cond);
// run pthreads function
GET_REAL_FN(pthread_cond_wait, int, pthread_cond_t*, pthread_mutex_t*);
int e = REAL_FN(cond, lk);
if (e == 0) the_tracer.add_event(lktrace::event::COND_LEAVE, (size_t) cond);
else {
the_tracer.add_event(lktrace::event::COND_ERR, (size_t) cond);
}
return e;
}
// GET_REAL_FN is defined as:
#define GET_REAL_FN(name, rtn, params...) \
typedef rtn (*real_fn_t)(params); \
static const real_fn_t REAL_FN = (real_fn_t) dlsym(RTLD_NEXT, #name); \
assert(REAL_FN != NULL) // semicolon absence intentional
And here's the code for __pthread_cond_wait in glibc 2.31 (this is the function that gets called if you call pthread_cond_wait normally, it has a different name because of versioning stuff. The stack trace above confirms that this is the function that REAL_FN points to):
int
__pthread_cond_wait (pthread_cond_t *cond, pthread_mutex_t *mutex)
{
/* clockid is unused when abstime is NULL. */
return __pthread_cond_wait_common (cond, mutex, 0, NULL);
}
As you can see, neither of these functions modifies cond, yet it is not the same in the two frames. Examining the two different pointers in a core dump shows that they point to different contents, as well. I can also see in the core dump that cond does not appear to change in my wrapper function (i.e. it's still equal to 0x5... in frame 9 at the crash point, which is the call to REAL_FN). I can't really tell which pointer is correct by looking at their contents, but I'd assume it's the one passed in to my wrapper from the target application. Both pointers point to valid segments for program data (marked ALLOC, LOAD, HAS_CONTENTS).
My tool is definitely causing the error somehow, the target application runs fine if it is not attached. What am I missing?
UPDATE: Actually, this doesn't appear to be what's causing the error, because calls to my pthread_cond_wait() wrapper succeed many times before the error occurs, and exhibit similar behavior (pointer value changing between frames without explanation) each time. I'm leaving the question open, though, because I still don't understand what's going on here and I'd like to learn.
UPDATE 2: As requested, here's the code for tracer.add_event():
// add an event to the calling thread's history
// hist_entry ctor gets timestamp & stack trace
void tracer::add_event(event e, size_t obj_addr) {
size_t tid = get_tid();
hist_map::iterator hist = histories.contains(tid);
assert(hist != histories.end());
hist_entry ev (e, obj_addr);
hist->second.push_back(ev);
}
// hist_entry ctor:
hist_entry::hist_entry(event e, size_t obj_addr) :
ts(chrono::steady_clock::now()), ev(e), addr(obj_addr) {
// these are set in the tracer ctor
assert(start_addr && end_addr);
void* buf[TRACE_DEPTH];
int v = backtrace(buf, TRACE_DEPTH);
int a = 0;
// find first frame outside of our own code
while (a < v && start_addr < (size_t) buf[a] &&
end_addr > (size_t) buf[a]) ++a;
// skip requested amount of frames
a += TRACE_SKIP;
if (a >= v) a = v-1;
caller = buf[a];
}
histories is a lock-free concurrent hashmap from libcds (mapping tid->per-thread vectors of hist_entry), and its iterators are guaranteed to be thread-safe as well. GNU docs say backtrace() is thread-safe, and there's no data races mentioned in the CPP docs for steady_clock::now(). get_tid() just calls pthread_self() using the same method as the wrapper functions, and casts its result to size_t.
Hah, figured it out! The issue is that Glibc exposes multiple versions of pthread_cond_wait(), for backwards compatibility. The version I reproduce in my question is the current version, the one we want to call. The version that dlsym() was finding is the backwards-compatible version:
int
__pthread_cond_wait_2_0 (pthread_cond_2_0_t *cond, pthread_mutex_t *mutex)
{
if (cond->cond == NULL)
{
pthread_cond_t *newcond;
newcond = (pthread_cond_t *) calloc (sizeof (pthread_cond_t), 1);
if (newcond == NULL)
return ENOMEM;
if (atomic_compare_and_exchange_bool_acq (&cond->cond, newcond, NULL))
/* Somebody else just initialized the condvar. */
free (newcond);
}
return __pthread_cond_wait (cond->cond, mutex);
}
As you can see, this version tail-calls the current one, which is probably why this took so long to detect: GDB is normally pretty good at detecting frames elided by tail calls, but I'm guessing it didn't detect this one because the functions have the "same" name (and the error doesn't affect the mutex functions because they don't expose multiple versions). This blog post goes into much more detail, coincidentally specifically about pthread_cond_wait(). I stepped through this function many times while debugging and sort of tuned it out, because every call into glibc is wrapped in multiple layers of indirection; I only realized what was going on when I set a breakpoint on the pthread_cond_wait symbol, instead of a line number, and it stopped at this function.
Anyway, this explains the changing pointer phenomenon: what happens is that the old, incorrect function gets called, reinterprets the pthread_cond_t object as a struct containing a pointer to a pthread_cond_t object, allocates a new pthread_cond_t for that pointer, and then passes the newly allocated one to the new, correct function. The frame of the old function gets elided by the tail-call, and to a GDB backtrace after leaving the old function it looks like the correct function gets called directly from my wrapper, with a mysteriously changed argument.
The fix for this was simple: GNU provides the libdl extension dlvsym(), which is like dlsym() but also takes a version string. Looking for pthread_cond_wait with version string "GLIBC_2.3.2" solves the problem. Note that these versions do not usually correspond to the current version (i.e. pthread_create()/exit() have version string "GLIBC_2.2.5"), so they need to be looked up on a per-function basis. The correct string can be determined either by looking at the compat_symbol() or versioned_symbol() macros that are somewhere near the function definition in the glibc source, or by using readelf to see the names of the symbols in the compiled library (mine has "pthread_cond_wait##GLIBC_2.3.2" and "pthread_cond_wait##GLIBC_2.2.5").
In a blog post from not too long ago, Scott Vokes describes a technical problem associated to lua's implementation of coroutines using the C functions setjmp and longjmp:
The main limitation of Lua coroutines is that, since they are implemented with setjmp(3) and longjmp(3), you cannot use them to call from Lua into C code that calls back into Lua that calls back into C, because the nested longjmp will clobber the C function’s stack frames. (This is detected at runtime, rather than failing silently.)
I haven’t found this to be a problem in practice, and I’m not aware of any way to fix it without damaging Lua’s portability, one of my favorite things about Lua — it will run on literally anything with an ANSI C compiler and a modest amount of space. Using Lua means I can travel light. :)
I have used coroutines a fair amount and I thought I understood broadly what was going on and what setjmp and longjmp do, however I read this at some point and realized that I didn't really understand it. To try to figure it out, I tried to make a program that I thought should cause a problem based on the description, and instead it seems to work fine.
However there are a few other places that I've seen people seem to allege that there are problems:
http://coco.luajit.org/
http://lua-users.org/lists/lua-l/2005-03/msg00179.html
The question is:
Under what circumstances do lua coroutines fail to work because of C function stack frames getting clobbered?
What exactly is the result? Does "detected at runtime" mean, lua panic? Or something else?
Does this still affect the most recent versions of lua (5.3) or is this actually a 5.1 issue or something?
Here was the code which I produced. In my test, it is linked with lua 5.3.1, compiled as C code, and the test itself is compiled itself as C++ code at C++11 standard.
extern "C" {
#include <lauxlib.h>
#include <lua.h>
}
#include <cassert>
#include <iostream>
#define CODE(C) \
case C: { \
std::cout << "When returning to " << where << " got code '" #C "'" << std::endl; \
break; \
}
void handle_resume_code(int code, const char * where) {
switch (code) {
CODE(LUA_OK)
CODE(LUA_YIELD)
CODE(LUA_ERRRUN)
CODE(LUA_ERRMEM)
CODE(LUA_ERRERR)
default:
std::cout << "An unknown error code in " << where << std::endl;
}
}
int trivial(lua_State *, int, lua_KContext) {
std::cout << "Called continuation function" << std::endl;
return 0;
}
int f(lua_State * L) {
std::cout << "Called function 'f'" << std::endl;
return 0;
}
int g(lua_State * L) {
std::cout << "Called function 'g'" << std::endl;
lua_State * T = lua_newthread(L);
lua_getglobal(T, "f");
handle_resume_code(lua_resume(T, L, 0), __func__);
return lua_yieldk(L, 0, 0, trivial);
}
int h(lua_State * L) {
std::cout << "Called function 'h'" << std::endl;
lua_State * T = lua_newthread(L);
lua_getglobal(T, "g");
handle_resume_code(lua_resume(T, L, 0), __func__);
return lua_yieldk(L, 0, 0, trivial);
}
int main () {
std::cout << "Starting:" << std::endl;
lua_State * L = luaL_newstate();
// init
{
lua_pushcfunction(L, f);
lua_setglobal(L, "f");
lua_pushcfunction(L, g);
lua_setglobal(L, "g");
lua_pushcfunction(L, h);
lua_setglobal(L, "h");
}
assert(lua_gettop(L) == 0);
// Some action
{
lua_State * T = lua_newthread(L);
lua_getglobal(T, "h");
handle_resume_code(lua_resume(T, nullptr, 0), __func__);
}
lua_close(L);
std::cout << "Bye! :-)" << std::endl;
}
The output I get is:
Starting:
Called function 'h'
Called function 'g'
Called function 'f'
When returning to g got code 'LUA_OK'
When returning to h got code 'LUA_YIELD'
When returning to main got code 'LUA_YIELD'
Bye! :-)
Much thanks to # Nicol Bolas for the very detailed answer!
After reading his answer, reading the official docs, reading some emails and playing around with it some more, I want to refine the question / ask a specific follow-up question, however you want to look at it.
I think this term 'clobbering' is not good for describing this issue and this was part of what confused me -- nothing is being "clobbered" in the sense of being written to twice and the first value being lost, the issue is solely, as #Nicol Bolas points out, that longjmp tosses part of the C stack, and if you are hoping to restore the stack later, too bad.
The issue is actually described very nicely in section 4.7 of lua 5.2 manual, in a link provided by #Nicol Bolas.
Curiously, there is no equivalent section in the lua 5.1 documentation. However, lua 5.2 has this to say about lua_yieldk:
Yields a coroutine.
This function should only be called as the return expression of a C function, as follows:
return lua_yieldk (L, n, i, k);
Lua 5.1 manual says something similar, about lua_yield instead:
Yields a coroutine.
This function should only be called as the return expression of a C function, as follows:
return lua_yieldk (L, n, i, k);
Some natural questions then:
Why does it matter if I use return here or not? If lua_yieldk will call longjmp then the lua_yieldk will never return anyways, so it shouldn't matter if I return then? So that cannot be what is happening, right?
Supposing instead that lua_yieldk just makes a note within the lua state that the current C api call has stated that it wants to yield, and then when it finally does return, lua will figure out what happens next. Then this solves the problem of saving C stack frames, no? Since after we return to lua normally, those stack frames have expired anyways -- so the complications described in #Nicol Bolas picture are skirted around? And second of all, in 5.2 at least the semantics are never that we should restore C stack frames, it seems -- lua_yieldk resumes to a continuation function, not to the lua_yieldk caller, and lua_yield apparently resumes to the caller of the current api call, not to the lua_yield caller itself.
And, the most important question:
If I consistently use lua_yieldk in the form return lua_yieldk(...) specified in the docs, returning from a lua_CFunction that was passed to lua, is it still possible to trigger the attempt to yield across a C-call boundary error?
Finally, (but this is less important), I would like to see a concrete example of what it looks like when a naive programmer "isn't careful" and triggers the attempt to yield across a C-call boundary error. I get the idea that there could be problem associated to setjmp and longjmp tossing stack frames that we later need, but I want to see some real lua / lua c api code that I can point to and say "for instance, don't do that", and this is surprisingly elusive.
I found this email where someone reported this error with some lua 5.1 code, and I attempted to reproduce it in lua 5.3. However what I found was that, this looks like just poor error reporting from the lua implementation -- the actual bug is being caused because the user is not setting up their coroutine properly. The proper way to load the coroutine is, create the thread, push a function onto the thread stack, and then call lua_resume on the thread state. Instead the user was using dofile on the thread stack, which executes the function there after loading it, rather than resuming it. So it is effectively yield outside of a coroutine iiuc, and when I patch this, his code works fine, using both lua_yield and lua_yieldk in lua 5.3.
Here is the listing I produced:
#include <cassert>
#include <cstdio>
extern "C" {
#include "lua.h"
#include "lauxlib.h"
}
//#define USE_YIELDK
bool running = true;
int lua_print(lua_State * L) {
if (lua_gettop(L)) {
printf("lua: %s\n", lua_tostring(L, -1));
}
return 0;
}
int lua_finish(lua_State *L) {
running = false;
printf("%s called\n", __func__);
return 0;
}
int trivial(lua_State *, int, lua_KContext) {
printf("%s called\n", __func__);
return 0;
}
int lua_sleep(lua_State *L) {
printf("%s called\n", __func__);
#ifdef USE_YIELDK
printf("Calling lua_yieldk\n");
return lua_yieldk(L, 0, 0, trivial);
#else
printf("Calling lua_yield\n");
return lua_yield(L, 0);
#endif
}
const char * loop_lua =
"print(\"loop.lua\")\n"
"\n"
"local i = 0\n"
"while true do\n"
" print(\"lua_loop iteration\")\n"
" sleep()\n"
"\n"
" i = i + 1\n"
" if i == 4 then\n"
" break\n"
" end\n"
"end\n"
"\n"
"finish()\n";
int main() {
lua_State * L = luaL_newstate();
lua_pushcfunction(L, lua_print);
lua_setglobal(L, "print");
lua_pushcfunction(L, lua_sleep);
lua_setglobal(L, "sleep");
lua_pushcfunction(L, lua_finish);
lua_setglobal(L, "finish");
lua_State* cL = lua_newthread(L);
assert(LUA_OK == luaL_loadstring(cL, loop_lua));
/*{
int result = lua_pcall(cL, 0, 0, 0);
if (result != LUA_OK) {
printf("%s error: %s\n", result == LUA_ERRRUN ? "Runtime" : "Unknown", lua_tostring(cL, -1));
return 1;
}
}*/
// ^ This pcall (predictably) causes an error -- if we try to execute the
// script, it is going to call things that attempt to yield, but we did not
// start the script with lua_resume, we started it with pcall, so it's not
// okay to yield.
// The reported error is "attempt to yield across a C-call boundary", but what
// is really happening is just "yield from outside a coroutine" I suppose...
while (running) {
int status;
printf("Waking up coroutine\n");
status = lua_resume(cL, L, 0);
if (status == LUA_YIELD) {
printf("coroutine yielding\n");
} else {
running = false; // you can't try to resume if it didn't yield
if (status == LUA_ERRRUN) {
printf("Runtime error: %s\n", lua_isstring(cL, -1) ? lua_tostring(cL, -1) : "(unknown)" );
lua_pop(cL, -1);
break;
} else if (status == LUA_OK) {
printf("coroutine finished\n");
} else {
printf("Unknown error\n");
}
}
}
lua_close(L);
printf("Bye! :-)\n");
return 0;
}
Here is the output when USE_YIELDK is commented out:
Waking up coroutine
lua: loop.lua
lua: lua_loop iteration
lua_sleep called
Calling lua_yield
coroutine yielding
Waking up coroutine
lua: lua_loop iteration
lua_sleep called
Calling lua_yield
coroutine yielding
Waking up coroutine
lua: lua_loop iteration
lua_sleep called
Calling lua_yield
coroutine yielding
Waking up coroutine
lua: lua_loop iteration
lua_sleep called
Calling lua_yield
coroutine yielding
Waking up coroutine
lua_finish called
coroutine finished
Bye! :-)
Here is the output when USE_YIELDK is defined:
Waking up coroutine
lua: loop.lua
lua: lua_loop iteration
lua_sleep called
Calling lua_yieldk
coroutine yielding
Waking up coroutine
trivial called
lua: lua_loop iteration
lua_sleep called
Calling lua_yieldk
coroutine yielding
Waking up coroutine
trivial called
lua: lua_loop iteration
lua_sleep called
Calling lua_yieldk
coroutine yielding
Waking up coroutine
trivial called
lua: lua_loop iteration
lua_sleep called
Calling lua_yieldk
coroutine yielding
Waking up coroutine
trivial called
lua_finish called
coroutine finished
Bye! :-)
Think about what happens when a coroutine does a yield. It stops executing, and processing returns to whomever it was that called resume on that coroutine, correct?
Well, let's say you have this code:
function top()
coroutine.yield()
end
function middle()
top()
end
function bottom()
middle()
end
local co = coroutine.create(bottom);
coroutine.resume(co);
At the moment of the call to yield, the Lua stack looks like this:
-- top
-- middle
-- bottom
-- yield point
When you call yield, the Lua call stack that is part of the coroutine is preserved. When you do resume, the preserved call stack is executed again, starting where it left off before.
OK, now let's say that middle was in fact not a Lua function. Instead, it was a C function, and that C function calls the Lua function top. So conceptually, your stack looks like this:
-- Lua - top
-- C - middle
-- Lua - bottom
-- Lua - yield point
Now, please note what I said before: this is what your stack looks like conceptually.
Because your actual call stack looks nothing like this.
In reality, there are really two stacks. There is Lua's internal stack, defined by a lua_State. And there's C's stack. Lua's internal stack, at the time when yield is about to be called, looks something like this:
-- top
-- Some C stuff
-- bottom
-- yield point
So what does the stack look like to C? Well, it looks like this:
-- arbitrary Lua interpreter stuff
-- middle
-- arbitrary Lua interpreter stuff
-- setjmp
And that right there is the problem. See, when Lua does a yield, it's going to call longjmp. That function is based on the behavior of the C stack. Namely, it's going to return to where setjmp was.
The Lua stack will be preserved because the Lua stack is separate from the C stack. But the C stack? Everything between the longjmp and setjmp?. Gone. Kaput. Lost forever.
Now you may go, "wait, doesn't the Lua stack know that it went into C and back into Lua"? A bit. But the Lua stack is incapable of doing something that C is incapable of. And C is simply not capable of preserving a stack (well, not without special libraries). So while the Lua stack is vaguely aware that some kind of C process happened in the middle of its stack, it has no way to reconstitute what was there.
So what happens if you resume this yielded coroutine?
Nasal demons. And nobody likes those. Fortunately, Lua 5.1 and above (at least) will error whenever you attempt to yield across C.
Note that Lua 5.2+ does have ways of fixing this. But it's not automatic; it requires explicit coding on your part.
When Lua code that is in a coroutine calls your C code, and your C code calls Lua code that may yield, you can use lua_callk or lua_pcallk to call the possibly-yielding Lua functions. These calling functions take an extra parameter: a "continuation" function.
If the Lua code you call does yield, then the lua_*callk function won't ever actually return (since your C stack will have been destroyed). Instead, it will call the continuation function you provided in your lua_*callk function. As you can guess by the name, the continuation function's job is to continue where your previous function left off.
Now, Lua does preserve the stack for your continuation function, so it gets the stack in the same state that your original C function was in. Well, except that the function+arguments that you called (with lua_*callk) are removed, and the return values from that function are pushed onto your stack. Outside of that, the stack is all the same.
There is also lua_yieldk. This allows your C function to yield back to Lua, such that when the coroutine is resumed, it calls the provided continuation function.
Note that Coco gives Lua 5.1 the ability to resolve this problem. It is capable (though OS/assembly/etc magic) of preserving the C stack during a yield operation. LuaJIT versions before 2.0 also provided this feature.
C++ note
You marked your question with the C++ tag, so I'll assume that's involved here.
Among the many differences between C and C++ is the fact that C++ is far more dependent on the nature of its callstack than Lua. In C, if you discard a stack, you might lose resources that weren't cleaned up. C++ however is required to call destructors of functions declared on the stack at some point. The standard does not allow you to just throw them away.
So continuations only work in C++ if there is nothing on the stack which needs to have a destructor call. Or more specifically, only types that are trivially destructible can be sitting on the stack if you call any of the continuation function Lua APIs.
Of course, Coco handles C++ just fine, since it's actually preserving the C++ stack.
Posting this as an answer which complements #Nicol Bolas' answer, and so that
I can have space to write down what it took for me to understand the original
question, and the answers to the secondary questions / a code listing.
If you read Nicol Bolas' answer but still have questions like I did, here are
some additional hints:
The three layers on the call stack, Lua, C, Lua, are essential to the problem.
If you only have two layers, Lua and C, you don't get the problem.
In imagining how the coroutine call is supposed to work -- the lua stack looks
a certain way, the C stack looks a certain way, the call yields (longjmp) and
later is resumed... the problem does not happen immediately when it is
resumed.
The problem happens when the resumed function later tries to return, to your
C function.
Because, for the coroutine semantics to work out, it is supposed to return
into a C function call, but the stack frames for that are gone, and cannot be
restored.
The workaround for this lack of ability to restore those stack frames is to
use lua_callk, lua_pcallk, which allow you to provide a substitute
function which can be called in place of that C function whose frames were
wiped out.
The issue about return lua_yieldk(...) appears to have nothing to do with
any of this. From skimming the implementation of lua_yieldk it appears that
it does indeed always longjmp, and it may only return in some obscure case
involving lua debugging hooks (?).
Lua internally (at current version) keeps track of when yield should not be
allowed, by keeping a counter variable nny (number non-yieldable) associated
to the lua state, and when you call lua_call or lua_pcall from a C api
function (a lua_CFunction which you earlier pushed to lua), nny is
incremented, and is only decremented when that call or pcall returns. When
nny is nonzero, it is not safe to yield, and you get this yield across
C-api boundary error if you try to yield anyways.
Here is a simple listing that produces the problem and reports the errors,
if you are like me and like to have a concrete code examples. It demonstrates
some of the difference in using lua_call, lua_pcall, and lua_pcallk
within a function called by a coroutine.
extern "C" {
#include <lauxlib.h>
#include <lua.h>
}
#include <cassert>
#include <iostream>
//#define USE_PCALL
//#define USE_PCALLK
#define CODE(C) \
case C: { \
std::cout << "When returning to " << where << " got code '" #C "'" << std::endl; \
break; \
}
#define ERRCODE(C) \
case C: { \
std::cout << "When returning to " << where << " got code '" #C "': " << lua_tostring(L, -1) << std::endl; \
break; \
}
int report_resume_code(int code, const char * where, lua_State * L) {
switch (code) {
CODE(LUA_OK)
CODE(LUA_YIELD)
ERRCODE(LUA_ERRRUN)
ERRCODE(LUA_ERRMEM)
ERRCODE(LUA_ERRERR)
default:
std::cout << "An unknown error code in " << where << ": " << lua_tostring(L, -1) << std::endl;
}
return code;
}
int report_pcall_code(int code, const char * where, lua_State * L) {
switch(code) {
CODE(LUA_OK)
ERRCODE(LUA_ERRRUN)
ERRCODE(LUA_ERRMEM)
ERRCODE(LUA_ERRERR)
default:
std::cout << "An unknown error code in " << where << ": " << lua_tostring(L, -1) << std::endl;
}
return code;
}
int trivial(lua_State *, int, lua_KContext) {
std::cout << "Called continuation function" << std::endl;
return 0;
}
int f(lua_State * L) {
std::cout << "Called function 'f', yielding" << std::endl;
return lua_yield(L, 0);
}
int g(lua_State * L) {
std::cout << "Called function 'g'" << std::endl;
lua_getglobal(L, "f");
#ifdef USE_PCALL
std::cout << "pcall..." << std::endl;
report_pcall_code(lua_pcall(L, 0, 0, 0), __func__, L);
// ^ yield across pcall!
// If we yield, there is no way ever to return normally from this pcall,
// so it is an error.
#elif defined(USE_PCALLK)
std::cout << "pcallk..." << std::endl;
report_pcall_code(lua_pcallk(L, 0, 0, 0, 0, trivial), __func__, L);
#else
std::cout << "call..." << std::endl;
lua_call(L, 0, 0);
// ^ yield across call!
// This results in an error being reported in lua_resume, rather than at
// the pcall
#endif
return 0;
}
int main () {
std::cout << "Starting:" << std::endl;
lua_State * L = luaL_newstate();
// init
{
lua_pushcfunction(L, f);
lua_setglobal(L, "f");
lua_pushcfunction(L, g);
lua_setglobal(L, "g");
}
assert(lua_gettop(L) == 0);
// Some action
{
lua_State * T = lua_newthread(L);
lua_getglobal(T, "g");
while (LUA_YIELD == report_resume_code(lua_resume(T, L, 0), __func__, T)) {}
}
lua_close(L);
std::cout << "Bye! :-)" << std::endl;
}
Example output:
call
Starting:
Called function 'g'
call...
Called function 'f', yielding
When returning to main got code 'LUA_ERRRUN': attempt to yield across a C-call boundary
Bye! :-)
pcall
Starting:
Called function 'g'
pcall...
Called function 'f', yielding
When returning to g got code 'LUA_ERRRUN': attempt to yield across a C-call boundary
When returning to main got code 'LUA_OK'
Bye! :-)
pcallk
Starting:
Called function 'g'
pcallk...
Called function 'f', yielding
When returning to main got code 'LUA_YIELD'
Called continuation function
When returning to main got code 'LUA_OK'
Bye! :-)
The following code just ends up printing "5"
#include <iostream>
#include <setjmp.h>
static jmp_buf buf;
float funcB()
{
setjmp(buf);
return 1.6f;
}
int funcA()
{
longjmp(buf,5);
std::cout<<"b";
return 2;
}
int main()
{
funcB();
std::cout<<funcA();
}
But this doesn't make any sense, as setjmp is returning 5, not either function...
Don't worry, I'm not using this code anywhere, I'm just curious!
What you are trying to do is specifically designated as undefined behavior in the documentation:
The longjmp() function restores the environment saved by the most recent invocation of setjmp() in the same thread, with the corresponding jmp_buf argument. If there is no such invocation, or if the function containing the invocation of setjmp() has terminated execution in the interim, the behaviour is undefined.
Since the function that called setjmp (i.e. funcB) has exited before you call longjmp in funcA, the behavior is undefined (it crashes on ideone).
You cannot use longjmp to return to a function you have exited. In other words longjmp won't restore the stack for you. See here.
What you need is a language like scheme, where doing this kind of thing would be perfectly normal.
It appears whatever compiler you are using is using a strict interpretation of what setjmp and longjmp do:
This macro may return more than once: A first time, on its direct
invocation; In this case it always returns zero. When longjmp is
called with the information set to env, the macro returns again; this
time it returns the value passed to longjmp as second argument if this
is different from zero, or 1 if it is zero.
From here
As it is UB, it can do this, order a pizza, end the world ... anything would be valid.
So I have a library (not written by me) which unfortunately uses abort() to deal with certain errors. At the application level, these errors are recoverable so I would like to handle them instead of the user seeing a crash. So I end up writing code like this:
static jmp_buf abort_buffer;
static void abort_handler(int) {
longjmp(abort_buffer, 1); // perhaps siglongjmp if available..
}
int function(int x, int y) {
struct sigaction new_sa;
struct sigaction old_sa;
sigemptyset(&new_sa.sa_mask);
new_sa.sa_handler = abort_handler;
sigaction(SIGABRT, &new_sa, &old_sa);
if(setjmp(abort_buffer)) {
sigaction(SIGABRT, &old_sa, 0);
return -1
}
// attempt to do some work here
int result = f(x, y); // may call abort!
sigaction(SIGABRT, &old_sa, 0);
return result;
}
Not very elegant code. Since this pattern ends up having to be repeated in a few spots of the code, I would like to simplify it a little and possibly wrap it in a reusable object. My first attempt involves using RAII to handle the setup/teardown of the signal handler (needs to be done because each function needs different error handling). So I came up with this:
template <int N>
struct signal_guard {
signal_guard(void (*f)(int)) {
sigemptyset(&new_sa.sa_mask);
new_sa.sa_handler = f;
sigaction(N, &new_sa, &old_sa);
}
~signal_guard() {
sigaction(N, &old_sa, 0);
}
private:
struct sigaction new_sa;
struct sigaction old_sa;
};
static jmp_buf abort_buffer;
static void abort_handler(int) {
longjmp(abort_buffer, 1);
}
int function(int x, int y) {
signal_guard<SIGABRT> sig_guard(abort_handler);
if(setjmp(abort_buffer)) {
return -1;
}
return f(x, y);
}
Certainly the body of function is much simpler and more clear this way, but this morning a thought occurred to me. Is this guaranteed to work? Here's my thoughts:
No variables are volatile or change between calls to setjmp/longjmp.
I am longjmping to a location in the same stack frame as the setjmp and returning normally, so I am allowing the code to execute the cleanup code that the compiler emitted at the exit points of the function.
It appears to work as expected.
But I still get the feeling that this is likely undefined behavior. What do you guys think?
I assume that f is in a third party library/app, because otherwise you could just fix it to not call abort. Given that, and that RAII may or may not reliably produce the right results on all platforms/compilers, you have a few options.
Create a tiny shared object that defines abort and LD_PRELOAD it. Then you control what happens on abort, and NOT in a signal handler.
Run f within a subprocess. Then you just check the return code and if it failed try again with updated inputs.
Instead of using the RAII, just call your original function from multiple call points and let it manually do the setup/teardown explicitly. It still eliminates the copy-paste in that case.
I actually like your solution, and have coded something similar in test harnesses to check that a target function assert()s as expected.
I can't see any reason for this code to invoke undefined behaviour. The C Standard seems to bless it: handlers resulting from an abort() are exempted from the restriction on calling library functions from a handler. (Caveat: this is 7.14.1.1(5) of C99 - sadly, I don't have a copy of C90, the version referenced by the C++ Standard).
C++03 adds a further restriction: If any automatic objects would be destroyed by a thrown exception transferring control to another (destination) point in the program, then a call to longjmp(jbuf, val) at the throw point that transfers control to the same (destination) point has undefined behavior. I'm supposing that your statement that 'No variables are volatile or change between calls to setjmp/longjmp' includes instantiating any automatic C++ objects. (I guess this is some legacy C library?).
Nor is POSIX async signal safety (or lack thereof) an issue - abort() generates its SIGABRT synchronously with program execution.
The biggest concern would be corrupting the global state of the 3rd party code: it's unlikely that the author will take pains to get the state consistent before an abort(). But, if you're correct that no variables change, then this isn't a problem.
If someone with a better understanding of the standardese can prove me wrong, I'd appreciate the enlightenment.
Is there a line of code that will terminate the program?
Something like python's sys.exit()?
While you can call exit() (and may need to do so if your application encounters some fatal error), the cleanest way to exit a program is to return from main():
int main()
{
// do whatever your program does
} // function returns and exits program
When you call exit(), objects with automatic storage duration (local variables) are not destroyed before the program terminates, so you don't get proper cleanup. Those objects might need to clean up any resources they own, persist any pending state changes, terminate any running threads, or perform other actions in order for the program to terminate cleanly.
#include <cstdlib>
...
exit( exit_code );
There are several ways to cause your program to terminate. Which one is appropriate depends on why you want your program to terminate. The vast majority of the time it should be by executing a return statement in your main function. As in the following.
int main()
{
f();
return 0;
}
As others have identified this allows all your stack variables to be properly destructed so as to clean up properly. This is very important.
If you have detected an error somewhere deep in your code and you need to exit out you should throw an exception to return to the main function. As in the following.
struct stop_now_t { };
void f()
{
// ...
if (some_condition())
throw stop_now_t();
// ...
}
int main()
{
try {
f();
} catch (stop_now_t& stop) {
return 1;
}
return 0;
}
This causes the stack to be unwound an all your stack variables to be destructed. Still very important. Note that it is appropriate to indicate failure with a non-zero return value.
If in the unlikely case that your program detects a condition that indicates it is no longer safe to execute any more statements then you should use std::abort(). This will bring your program to a sudden stop with no further processing. std::exit() is similar but may call atexit handlers which could be bad if your program is sufficiently borked.
Yes! exit(). It's in <cstdlib>.
Allowing the execution flow to leave main by returning a value or allowing execution to reach the end of the function is the way a program should terminate except under unrecoverable circumstances. Returning a value is optional in C++, but I typically prefer to return EXIT_SUCCESS found in cstdlib (a platform-specific value that indicates the program executed successfully).
#include <cstdlib>
int main(int argc, char *argv[]) {
...
return EXIT_SUCCESS;
}
If, however, your program reaches an unrecoverable state, it should throw an exception. It's important to realise the implications of doing so, however. There are no widely-accepted best practices for deciding what should or should not be an exception, but there are some general rules you need to be aware of.
For example, throwing an exception from a destructor is nearly always a terrible idea because the object being destroyed might have been destroyed because an exception had already been thrown. If a second exception is thrown, terminate is called and your program will halt without any further clean-up having been performed. You can use uncaught_exception to determine if it's safe, but it's generally better practice to never allow exceptions to leave a destructor.
While it's generally always possible for functions you call but didn't write to throw exceptions (for example, new will throw std::bad_alloc if it can't allocate enough memory), it's often difficult for beginner programmers to keep track of or even know about all of the special rules surrounding exceptions in C++. For this reason, I recommend only using them in situations where there's no sensible way for your program to continue execution.
#include <stdexcept>
#include <cstdlib>
#include <iostream>
int foo(int i) {
if (i != 5) {
throw std::runtime_error("foo: i is not 5!");
}
return i * 2;
}
int main(int argc, char *argv[]) {
try {
foo(3);
}
catch (const std::exception &e) {
std::cout << e.what() << std::endl;
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
exit is a hold-over from C and may result in objects with automatic storage to not be cleaned up properly. abort and terminate effectively causes the program to commit suicide and definitely won't clean up resources.
Whatever you do, don't use exceptions, exit, or abort/terminate as a crutch to get around writing a properly structured program. Save them for exceptional situations.
if you are in the main you can do:
return 0;
or
exit(exit_code);
The exit code depends of the semantic of your code. 1 is error 0 e a normal exit.
In some other function of your program:
exit(exit_code)
will exit the program.
This SO post provides an answer as well as explanation why not to use exit(). Worth a read.
In short, you should return 0 in main(), as it will run all of the destructors and do object cleanup. Throwing would also work if you are exiting from an error.
In main(), there is also:
return 0;
#include <cstdlib>
...
/*wherever you want it to end, e.g. in an if-statement:*/
if (T == 0)
{
exit(0);
}
throw back to main which should return EXIT_FAILURE,
or std::terminate() if corrupted.
(from Martin York's comment)
else if(Decision >= 3)
{
exit(0);
}
exit(0); // at the end of main function before closing curly braces
simple enough..
exit ( 0 );
}//end of function
Make sure there is a space on both sides of the 0. Without spaces, the program will not stop.