I wrote a shared library in C++ on Linux, which contains a function f(). This library is used by multiple programs that call the function.
Now I would like to do some debugging of that function while calling it from program A. While I am debugging, calls from any other program should fail. Since I do not have complete control over when the other programs run, I would like to add an exception that stops every program except program A when it calls function f during debug sessions.
How could I solve that?
The only way I can think of is by checking the information of the currently running process. You can get the pid by calling the getpid() function. All the information on all processes on a Linux system can be found under /proc/<pid>. When calling function f, you can check this information and decide whether or not to throw an exception.
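For example, a minimal sketch (assuming the program you debug is named progA; that name and the exact check are made up for illustration, and note that /proc/self/comm truncates names to 15 characters):

#include <fstream>
#include <stdexcept>
#include <string>

void f()
{
    // Read the name of the process that loaded this library;
    // /proc/self/comm is shorthand for /proc/<getpid()>/comm.
    std::ifstream comm("/proc/self/comm");
    std::string name;
    std::getline(comm, name);

    if (name != "progA")  // "progA" is a placeholder for your program A
        throw std::runtime_error("f() is under debug; only progA may call it");

    // ... normal work of f() ...
}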
There isn't a check that can't be gotten around, though. If another process has attached to yours with ptrace, it can make anything at all happen in your process, so any check you perform to try to make the function not work can be disabled.
The scenario is this:
Assume that when a program is executed it links at runtime to some DLL files. The (master) program/process may or may not make multi-threaded calls to the functions in those DLLs.
Is there a way that the DLL, besides parameter passing of course, can tell whether the master process that calls its functions at runtime is doing so in a single-threaded or multi-threaded manner (for instance, via OpenMP)?
You can check and compare the current thread ID to detect calls from different threads. You could also implement a DllMain() function, which gets called for every started and terminated thread. I'm pretty sure you can also retrieve a handle to the current process and enumerate the threads running in it. Only the first approach will actually tell you whether your code is run from different threads, though; I think that e.g. WinSock will create a thread for you even if your program is otherwise single-threaded.
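A minimal sketch of the first two approaches combined (SomeExport and g_firstThread are made-up names; this only detects that a second thread has actually called into the DLL):

#include <windows.h>

static DWORD g_firstThread = 0;

BOOL WINAPI DllMain(HINSTANCE, DWORD reason, LPVOID)
{
    if (reason == DLL_PROCESS_ATTACH)
        g_firstThread = GetCurrentThreadId(); // thread that loaded the DLL
    else if (reason == DLL_THREAD_ATTACH)
    {
        // A new thread has started in the host process.
    }
    return TRUE;
}

extern "C" __declspec(dllexport) void SomeExport()
{
    if (GetCurrentThreadId() != g_firstThread)
    {
        // Called from a different thread than the one that loaded us,
        // so the host is using the DLL from multiple threads.
    }
}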
BTW: Consider adding win32api tag and removing C++ tag.
I want to kill a child process if it makes system calls other than read and write (and even filter those two as well, but that's a different story), but there are some system calls made by default.
I have compiled an empty test child program (it exits instantly), and I also have a parent process which forks, enables ptracing, and executes the child program. The parent process uses PTRACE_SYSCALL and checks orig_eax every time. My test program reports that the child was stopped 49 times (which, I assume, means 48/2 + 1 = 25 system calls, since each call stops the child twice, once on entry and once on exit, except the final exit, which never returns).
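For reference, the parent's tracing loop looks roughly like this (a simplified sketch: x86-specific via ORIG_EAX, and ./empty_child is a placeholder path):

#include <sys/ptrace.h>
#include <sys/reg.h>   // ORIG_EAX (32-bit x86)
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>

int main()
{
    pid_t child = fork();
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execl("./empty_child", "empty_child", (char *)NULL); // placeholder
        _exit(1);
    }

    int status, stops = 0;
    waitpid(child, &status, 0);                    // first stop, after exec
    while (!WIFEXITED(status)) {
        long nr = ptrace(PTRACE_PEEKUSER, child, 4 * ORIG_EAX, NULL);
        printf("stop %d: syscall %ld\n", ++stops, nr);
        ptrace(PTRACE_SYSCALL, child, NULL, NULL); // run to next entry/exit
        waitpid(child, &status, 0);
    }
    return 0;
}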
I want to know whether this system-call sequence (initialization) is always the same, and/or whether it's possible to know when to start and when to stop kill-on-syscall in my parent.
I had a similar problem once (see my question on the topic). When a program starts, it executes a lot of system calls while initializing the application (such as loading shared libraries) before calling main(). What I did was simply allow somewhat more system calls and use another means of security (such as chroot) to prevent the application from accessing undesired files.
A better option would be to somehow find the entry point of the program's main() function (see this tutorial on writing debugging code) and disable system calls after that point. I don't know if that's possible in the general case, but it's where I would start searching.
After finding the entry point, there is another way of restricting the program from making certain system calls. Instead of using PTRACE_SYSCALL to check each system call the program makes, inject a prctl(PR_SET_SECCOMP, ...) call into the program (using ptrace()) and then just leave the program running.
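For reference, here is what the call itself does when made by the child directly (a minimal sketch rather than the ptrace-injection version):

#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

int main()
{
    // From here on, only read(2), write(2), exit(2) and sigreturn(2)
    // are allowed; any other system call kills the process with SIGKILL.
    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0)
        return 1;

    write(1, "still alive\n", 12);  // allowed
    syscall(SYS_exit, 0);           // raw exit(2); glibc's _exit() uses
                                    // exit_group(2), which strict mode forbids
}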
I'm currently wrapping a Lua class in C++ and it's going pretty well so far. But I'm wondering if there is some way to break a Lua script from running (possibly in the middle of the script) from another thread. So if I run my Lua script on thread 1, can I break it from thread 2? Would lua_close(...) do that?
Thanks.
If this is an expected occurrence and most of the Lua script's time is spent inside Lua functions (i.e., not lengthy, blocking C calls), you could install a debug hook that checks for your "break flag" every N instructions and aborts the script. See the "debug library" section of Programming in Lua.
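A minimal sketch of that approach, assuming Lua 5.1 (g_break, break_hook, and run_script are made-up names; only the flag needs to be shared between the threads):

#include <atomic>

extern "C" {
#include <lua.h>
#include <lauxlib.h>
}

static std::atomic<bool> g_break(false);  // set from thread 2 to request the abort

static void break_hook(lua_State *L, lua_Debug *)
{
    if (g_break.load())
        luaL_error(L, "script aborted");  // longjmps out of the running script
}

void run_script(lua_State *L, const char *path)
{
    // Run break_hook every 1000 VM instructions.
    lua_sethook(L, break_hook, LUA_MASKCOUNT, 1000);
    if (luaL_dofile(L, path) != 0) {
        // An aborted script (or any other error) lands here.
    }
}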
Then there must be some way to force the script to halt?
No, there isn't.
Lua provides exactly one thread safety guarantee: two separate lua_States (separate being defined as different objects returned by lua_open) can be called freely from two different CPU threads. Thread A can call lua_State X, and thread B can call lua_State Y.
That is the only guarantee Lua gives you. Lua is inherently single-threaded; it only provides this guarantee because all of the global information for Lua is contained in a self-contained object: lua_State.
The rules of Lua require that only one thread talk to a lua_State at any one time. Therefore, it is impossible for Lua code to be executing and then be halted or paused by external C++ code. That would require breaking the rule: some other thread would have to call functions on this thread's lua_State.
If you want to abort a script in-progress, you will have to use debug hooks. This will dramatically slow down the performance of your script (if that matters to you). You will also need thread synchronization primitives, so that thread B can set a flag that thread A will read and halt on.
As for how you "abort" the script, that... is impossible in the most general case. And here's why.
The lua_State represents a Lua thread of execution. It has a stack, a pointer to instructions to interpret, and some code it's currently executing. "Aborting" is not a concept that Lua has. What Lua does have is the concept of erroring. Calling lua_error from within C code being called by Lua will cause it to longjmp out of that function. This will also unwind the Lua thread, and an error will be returned to the most recent function that handles errors.
Note: if you use lua_error in C++ code, you need to either have compiled Lua as C++ (thus causing errors to work via exception handling rather than setjmp/longjmp) or you need to make sure that every object with a destructor is off the stack when you call it. Calling longjmp in C++ code is not advisable without great care.
So if you issue lua_pcall to call some script, and in your debug hook, you issue lua_error, then you will get an error back in your lua_pcall statement. Lua's stack will be unwound, objects destroyed, etc. That's good.
The reason why this does not work in the general case (besides the fact that I have no idea how Lua would react to a debug hook calling lua_error) is quite simple: Lua scripts can execute pcall too. The error will go to the nearest pcall statement, whether it's in Lua or in C.
This Lua script will confound this method:
local function dostuff()
    while true do end
end

while true do
    pcall(dostuff)
end
So issuing lua_error does not guarantee that you'll "abort the script." Think of it like throwing exceptions; anyone could catch them, so you can't ensure that they are only caught in one place.
In general, you should never want to abort a Lua script's execution in progress. This is not something you should want to do.
When I exit my C++ program it crashes with errors like:
EAccessViolation with message 'Access violation at address 0...
and
Abnormal Program Termination
It is probably caused by some destructor, because it happens only when the application exits. I use a few external libraries and cannot find the code that causes it. Is there a function that forces immediate program exit (something like kill on Linux) so that memory would have to be freed by the operating system? I could call this function in the app's exit event.
I know that it would be a terrible solution because it'd just hide the problem.
I'm just asking out of sheer curiosity, so please don't give me -1 :)
I tried exit(0) from stdlib but it didn't help.
EDIT:
Thanks for your numerous replies:)
I use C++Builder 6 (I know it's outdated, but for some reasons I had to use it). My app uses a neural network library (FANN). Using the debugger I found that the program crashes in:
~neural_net()
{
    destroy();
}
destroy() calls another function, fann_safe_free(ptr), multiple times; it is defined as:
#define fann_safe_free(x) {if(x) { free(x); x = NULL; }}
The library works great; the problem only appears when it does its cleanup. That's why I asked about such a brutal solution. My app is multi-threaded, but the other threads operate on different data.
I will analyze my code for the n-th time (the bug must be somewhere). Thanks for all your tips :)
You should fix the problem.
First step: find and check all functions you registered with atexit() (not many, I hope).
Second step: find all global variables and check their destructors.
Third step: find all static function variables and check their destructors.
But otherwise you can call abort(). Note: abort() is for abnormal program termination.
The difference (note that letting the application return from main() is equivalent to calling exit()):
exit()
- Calls the functions registered with the atexit(3) function, in the reverse order of their registration. This includes the destruction of all global (static storage duration) variables.
- Flushes all open output streams.
- Closes all open streams.
- Unlinks all files created with the tmpfile(3) function.
abort()
- Flushes all open output streams.
- Closes all open streams.
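A tiny demonstration of the difference (goodbye is a made-up handler name; run it with and without a command-line argument):

#include <cstdio>
#include <cstdlib>

void goodbye()
{
    std::puts("atexit handler ran");
}

int main(int argc, char **)
{
    std::atexit(goodbye);
    if (argc > 1)
        std::abort();  // skips the atexit() handler and raises SIGABRT
    std::exit(0);      // runs goodbye(), then flushes and closes streams
}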
It's a terrible solution for more than one reason. It will hide the problem (maybe), but it could also corrupt data, depending on the nature of your application.
Why don't you use a debugger and try to find out what is causing the error?
If your application is multi-threaded, you should make sure that all threads are properly shut down before exiting the application. This is a fairly common cause of that type of error on exit, when a background thread is attempting to use memory/objects that have already been destructed.
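A minimal sketch of that shutdown order (modern C++ for brevity; C++Builder 6 predates std::thread, so treat it as the pattern rather than drop-in code):

#include <atomic>
#include <thread>

std::atomic<bool> g_quit(false);

void worker()
{
    while (!g_quit.load()) {
        // ... background work ...
    }
}

int main()
{
    std::thread t(worker);
    // ... the application runs ...
    g_quit.store(true);  // 1. tell the worker to stop
    t.join();            // 2. wait for it BEFORE destructors start running
    return 0;
}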
Edit:
Based on your updated question, I have the following suggestions:
Try to find out more specifically what is causing the crash in the destructor.
The first thing I would do is make sure that it's not trying to destruct a NULL object. When you get your crash in ~neural_net in your debugger, check your "this" pointer to make sure it's not NULL. If it is, then check your call-stack and see where it's being destructed, and do a check to make sure it's not NULL before calling delete.
If it's not NULL, then I would unroll that macro in destroy, so you can see if it's crashing on the call to free.
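That is, inside destroy(), temporarily replace each expansion with explicit statements so the debugger can show exactly which call fails (weights is a hypothetical member):

// instead of: fann_safe_free(weights);
if (weights)
{
    free(weights);   // break here: is the crash inside free() itself?
    weights = NULL;  // or is the pointer already dangling/garbage?
}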
You could try calling abort(); (declared in <stdlib.h> and in <process.h>)
The version in VisualC++, however, will print a warning message as it exits: "This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information."
On Linux/UNIX you can use _exit:
#include <unistd.h>
void _exit(int status);
The function _exit() is like exit(), but does not call any functions registered with atexit() or on_exit(). Whether it flushes standard I/O buffers and removes temporary files created with tmpfile(3) is implementation dependent. On the other hand, _exit() does close open file descriptors, and this may cause an unknown delay, waiting for pending output to finish. If the delay is undesired, it may be useful to call functions like tcflush() before calling _exit(). Whether any pending I/O is cancelled, and which pending I/O may be cancelled upon _exit(), is implementation-dependent.
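In your case it would go at the very end of the app's exit event, e.g. (on_app_exit is a hypothetical handler):

#include <unistd.h>

void on_app_exit()  // hypothetical exit-event handler
{
    // Skip atexit() handlers and static destructors entirely;
    // the kernel reclaims memory and closes descriptors itself.
    _exit(0);
}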
Have you tried the gruesome step-by-step? If your project/solution is simply too large to do so, maybe you could try segmenting it, assuming you use a modular build, and test each component individually. Without any code or visible destructors, abstract advice is all I can give you, I'm afraid. But nonetheless I hope trying to narrow down the debugging field will help in some way.
Good luck with getting an answer :)
That immediate program exit (and yes, that's a terrible solution) is abort()
That happens most likely because a NULL pointer is being accessed. Depending on your OS, try getting a stack trace and identify the culprit; don't just exit.
If you use Linux, Valgrind should solve your problem.
If it is Windows, try one of these: MemoryValidator, BoundsChecker, or other tools like them.
Simply closing your application is not the best way to deal with bugs...
I have a program that depends on an external shared library, but after a function inside the library gets executed I lose the ability to use breakpoints.
I can break and step as normal up until I execute this function, but after it has finished it never breaks again. It won't even break on main if I try to use start for a second run of the program. It's not an inlined-function problem, because I've broken on these functions before, and when I comment out this particular function everything starts to work again.
Has anyone ever encountered anything like this before? What can I do?
Using gdb 7.1 with gcc 3.2.3
Edit:
After some hints from users I figured out that the process is forking inside the library call. I'm not sure what it's doing (and I really don't care). Can I somehow compensate for this? I've been experimenting with follow-fork-mode set to child, but I'm really confused about what happens once it forks, and I can't seem to figure out how to continue execution or do anything of use.
Edit:
Further investigation: as near as I can tell, gdb is losing all its symbol information somewhere. After the second run, all symbols resolve to their #plt address and not to the actual address they resolved to on the first run. It's as if the second loading of the process loses all the info it gained the first time around and refuses to reload it. I'm so confused!!
Edit:
So I traced the problem down to the vfork of a popen call. Apparently gdb doesn't play nicely with popen? As soon as I detach from the popen'd, vforked process, I lose all my symbols. I've read some reports about this online as well. Is there any hope?
It's the combination of vfork(2) and exec(2) that's messing things up. Quoting from gdb manual (debugging forks):
On some systems, when a child process is spawned by vfork, you cannot debug the child or parent until an exec call completes.
...
By default, after an exec call executes, gdb discards the symbols of the previous executable image. You can change this behaviour with the set follow-exec-mode command.
Keep follow-fork-mode set to parent and set follow-exec-mode to same.
Alternatively, if your installation supports multi-process debugging (it should, since your gdb version is 7.1), try using info inferiors to find your original process and switching to it with inferior <num>.
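Putting it together, a session would look roughly like this (all of these commands exist in gdb 7.x):

(gdb) set follow-fork-mode parent
(gdb) set follow-exec-mode same
(gdb) run
...
(gdb) info inferiors
(gdb) inferior 1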