1) I would like to use the profiling functions in the Python C API to catch the Python interpreter when it returns from specific functions.
2) I would like to pause the Python interpreter, hand execution back to the function in my C++ program that called the interpreter, and finally return execution to the interpreter, starting it on the line of code after where it stopped. I would like to maintain both globals and locals between the times when execution belongs to Python.
Part 1 I've finished. Part 2 is my question. I don't know what to save so I can return to execution, or how to return to execution given that saved data.
From what I could gather from the Python API docs, I will have to save some part of the executing frame, but I haven't found anything concrete. Some additional questions...
What, exactly, does a PyFrameObject contain? The Python API docs, surprisingly, never explain that.
If I understand your problem, you have a C++ program that calls into Python. When Python finishes executing a function, you want to pause the interpreter and pick up where the C++ code left off. Some time later your C++ program needs to call back into Python and have the interpreter pick up where it left off.
I don't think you can do this very easily with one thread. Before you pause the interpreter the stack looks like this:
[ top of stack ]
[ some interpreter frames ]
[ some c++ frames ]
To pause the interpreter, you need to save off the interpreter frames and jump back to the top-most C++ frame. Then, to unpause, you need to restore the interpreter frames and jump up the stack to where you left off. Jumping is doable (see http://en.wikipedia.org/wiki/Setjmp.h), but saving and restoring the stack is harder; I don't know of an API to do this.
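To illustrate just the jumping half, here is a tiny setjmp/longjmp demo (it only jumps back to an earlier frame; it does not save or restore the abandoned frames, which is the hard part):

#include <csetjmp>
#include <cstdio>

static std::jmp_buf top;   // records the top-most frame

static void deep_call() {
    std::puts("deep in the call stack, jumping back");
    std::longjmp(top, 1);  // unwinds straight to the setjmp below
}

int main() {
    if (setjmp(top) == 0) {
        deep_call();       // never returns normally
    } else {
        std::puts("back in the top-most frame");
    }
}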
However, you could do this with two threads. The thread created at the start of your C++ program (call it thread 1) runs the C++ code, and it creates thread 2 to run the Python interpreter.
Initially (when we're running C++ code), thread 1 is executing and thread 2 is blocked (say, on a condition variable; see https://computing.llnl.gov/tutorials/pthreads/). When you run or unpause the interpreter, thread 1 signals the condition variable and then waits on it. This wakes up thread 2 (which runs the interpreter) and causes thread 1 to block. When the interpreter needs to pause, thread 2 signals the condition variable and waits on it (so thread 2 blocks and thread 1 wakes up). You can bounce back and forth between the threads to your heart's content. Hope this helps.
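Here's a minimal sketch of that handshake using std::thread and std::condition_variable (the function names and the fixed step counts are just illustrative; in the real program thread 2 would be running the interpreter rather than printing):

#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable cv;
bool python_turn = false;   // whose turn it is to run

// Called by thread 1 to run/unpause the "interpreter" and block until it pauses.
void run_interpreter_until_pause() {
    std::unique_lock<std::mutex> lk(m);
    python_turn = true;
    cv.notify_all();
    cv.wait(lk, [] { return !python_turn; });
}

// Thread 2: stands in for the Python interpreter.
void interpreter_thread() {
    std::unique_lock<std::mutex> lk(m);
    for (int step = 0; step < 3; ++step) {
        cv.wait(lk, [] { return python_turn; });   // blocked while C++ runs
        std::cout << "interpreter runs, step " << step << "\n";
        python_turn = false;                       // "pause": hand control back
        cv.notify_all();
    }
}

int main() {
    std::thread t(interpreter_thread);             // thread 2
    for (int i = 0; i < 3; ++i) {
        run_interpreter_until_pause();             // thread 1 blocks meanwhile
        std::cout << "C++ code resumes, round " << i << "\n";
    }
    t.join();
}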
I am trying to run multiple commands on Ubuntu from C++ code at the same time.
I used the system() call to run them, but the problem with system() is that it invokes only one command at a time; the rest wait until it returns.
Below is my sample code, which may help show what I am trying to do.
The main thing is that I want to run all these commands at the same time, not one by one. Please help me.
Thanks in advance.
#include <cstdlib>
#include <string>

int main()
{
    std::string command[3];
    command[0] = "ls -l";
    command[1] = "ls";
    command[2] = "cat main.cpp";
    for (int i = 0; i < 3; i++) {
        system(command[i].c_str());  // blocks until this command finishes
    }
}
You should read Advanced Linux Programming (a bit old, but freely available). You probably want (in the traditional way, like most shells do):
perhaps catch SIGCHLD (set the signal handler before fork, see signal(7) & signal-safety(7)...)
call fork(2) to create a new process. Be sure to check all three cases (failure with a negative returned pid_t, child with a 0 pid_t, parent with a positive pid_t). If you want to communicate with that process, use pipe(2) (read about pipe(7)...) before the fork.
in the child process, close some useless file descriptors, then run some exec function (or the underlying execve(2)) to run the needed program (e.g. /bin/ls)
call (in the parent, perhaps after having got a SIGCHLD) wait(2) or waitpid(2) or related functions; a sketch of the whole pattern follows below.
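For illustration, a rough sketch of that fork/exec/wait pattern applied to the question's three commands, with most error handling trimmed:

#include <cstdio>
#include <sys/wait.h>
#include <unistd.h>

int main() {
    char* const commands[][3] = {
        {(char*)"ls",  (char*)"-l",       nullptr},
        {(char*)"ls",  nullptr,           nullptr},
        {(char*)"cat", (char*)"main.cpp", nullptr},
    };
    for (auto& argv : commands) {
        pid_t pid = fork();
        if (pid < 0) { perror("fork"); return 1; }
        if (pid == 0) {               // child: becomes the command
            execvp(argv[0], argv);
            perror("execvp");         // only reached if exec failed
            _exit(127);
        }
        // parent: don't wait here, so all three commands run concurrently
    }
    while (wait(nullptr) > 0) {}      // reap every child as it finishes
    return 0;
}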
This is very usual. Several chapters of Advanced Linux Programming explain it better.
There is no need to use threads in your case.
However, notice that the roles of ls and cat could be accomplished with various system calls (listed in syscalls(2)...), notably read(2) & stat(2). You might not even need to run other processes. See also opendir(3) & readdir(3).
Perhaps (notably if you communicate with several processes through several pipe(7)-s) you might want to have some event loop using poll(2) (or the older select(2)). Some libraries provide an event loop (notably all GUI widget libraries).
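For example, a tiny hypothetical event loop built on poll(2); here fd1 and fd2 stand for the read ends of two pipes connected to child processes:

#include <cstdio>
#include <poll.h>
#include <unistd.h>

// Watches two pipe read ends and echoes whatever the children write.
void event_loop(int fd1, int fd2) {
    struct pollfd fds[2] = {{fd1, POLLIN, 0}, {fd2, POLLIN, 0}};
    char buf[4096];
    for (;;) {  // a real loop would also handle POLLHUP and exit eventually
        if (poll(fds, 2, -1) < 0) continue;        // retry on EINTR
        for (auto& p : fds) {
            if (p.revents & POLLIN) {
                ssize_t n = read(p.fd, buf, sizeof buf);
                if (n > 0) fwrite(buf, 1, n, stdout);
            }
        }
    }
}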
You have a few options (as always):
Use threads (the C++ standard library implementation is good) to spawn multiple threads, each of which performs a system call and then terminates. join on the thread list to wait for them all to terminate (see the sketch after this list).
Use the *NIX fork call to spawn a new process, then within each child process use exec to execute the desired command (see here for an example of "getting the right string to the right child"). The parent process can use waitpid to determine when all children have finished running, in order to move on with the program.
Append "&" to each of your commands, which'll tell the shell to run each one in the background (specifically, system will start the process in the background then return, without waiting for the result). Not tried this, don't know if it'll work. You can't then wait for the call to terminate though (thanks PSkocik).
Just pointing out: if you run those 3 specific commands at the same time, you're unlikely to be able to read the output, as they'll all print text to the terminal at once.
If you do require reading the output from within the program (though not mentioned in your question), this is relevant (although it doesn't use system).
I am using Python 2.7 and python-firebase 1.2.
If we comment out the firebase import, the output is printed only once; otherwise it is printed multiple times.
from firebase import firebase
print "result"
output:
result
result
result
result
That firebase module was written by careless programmers: it performs tasks that you don't explicitly ask for. For that reason alone I would advise anybody to steer clear of it, because you cannot know what other booby traps their code might hold. Sure, they probably think this behavior is convenient, but convenience must never come from breaking the expectations of programmers (which is the one rule absolutely every module writer has to follow), and if it were convenient this question wouldn't exist. They do say the module relies heavily on multiprocessing, but they don't mention that you won't have a say in it:
The interface heavily depends on the standard multiprocessing library when concurrency comes in. While creating an asynchronous call, an on-demand process pool is created and the async method is executed by one of the idle processes inside the pool. The pool remains alive until the main process dies. So every time you trigger an async call, you always use the same pool. When the method returns, the pool process ships the return value back to the main process within the callback function provided.
So, all that being said... this happens because the main __init__.py of that module imports its async.py module, which in turn creates a multiprocessing.Pool (stored in its _process_pool) with 5 fixed slots. Given nothing to work with, you get 5 additional processes running your main script; hence it prints out result 6 times (once in the main process and once in each of the 5 spawned sub-processes).
Bottom line: do not use this module. There are other alternatives, but if you absolutely have to, guard your code with a main-process check:
if __name__ == "__main__":
print("result")
It will still spawn the 5 subprocesses and wait for all of them to finish (which is rather quick), but at least they won't execute your guarded code.
So here is the situation: we have a C++ datafeed client program which we run ~30 instances of with different parameters, and there are 3 scripts written to run/stop them: start.sh, stop.sh, and restart.sh (which runs stop.sh and then start.sh).
When there is a high volume of data, the client "falls behind" real time. We test this by comparing the system time to the most recent data entry times listed. If any of the clients falls behind by more than 10 minutes or so, I want to call the restart script so all the binaries start fresh and our data is as close to real time as possible.
Normally I call a script using system("script.sh"); however, the restart script looks up and kills the processes using kill, BUT calling system() also makes the current program ignore SIGQUIT and SIGINT until system() returns.
On top of this, if there are two concurrent executions with the same arguments, they conflict and the program hangs (this stems from establishing database connections), so I cannot start the new instance until the old one is killed, and I cannot kill the current one while it is ignoring SIGQUIT.
Is there any way around this? The current state of the binary and missing some data do not matter at all once it has reached the threshold. I also cannot just have the program restart itself: if one of the instances falls behind, we want to restart all 30 instances (so gaps in the data are at uniform times). Is there a clean way to call a script from within C++ which hands over control and allows the script to restart the program from scratch?
FYI we are running on CentOS 6.3
Use exec() instead of system(). It will replace your process with the new one. Note there is a significant difference in how exec() is called and how it behaves: system() passes its string argument to the system shell to run, whereas exec() actually executes an executable file, and you need to supply the arguments to the process one at a time instead of letting the shell parse them apart for you.
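For example, a minimal sketch; the script path here is made up, substitute your own:

#include <cstdio>
#include <unistd.h>

int main() {
    // Replaces this process image entirely; nothing after a successful
    // execl() ever runs, so the old binary is gone and the script takes over.
    execl("/path/to/restart.sh", "restart.sh", (char*)nullptr);
    perror("execl");   // reached only if the exec itself failed
    return 1;
}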
Here's my two cents.
Temporary solution: Use SIGKILL.
Long-term solution: optimize your code or the general logic of your service tree, using other system calls like exec, or by rewriting it to use threads.
If you want better answers, maybe you should post some code and/or make the issue more concrete.
I'm currently in the process of building a small shell within C++.
A user may enter a job at the prompt such as exe1 && exe2 &. Similar to the BASH shell, I will only execute exe2 if exe1 exits successfully. In addition, the entire job must be performed in the background (as specified by the trailing & operator).
Right now, I have a jobManager which handles execution of jobs and a job structure which contains the job's executable and their individual arguments / conditions. A job is started by calling fork() and then calling execvp() with the proper arguments. When a job ends, I have a signal handler for SIGCHLD, in which I perform wait() to determine which process has just ended. When exe1 ends, I observe its exit code and make a determination as to whether I should proceed to launch exe2.
My concern is how to launch exe2. I am concerned that if I call my jobManager's start function from the context of my SIGCHLD handler, I could end up with too many nested SIGCHLD handler invocations on the stack (if there were 10 conditional executions, for instance). In addition, it just doesn't seem like a good idea to start the next execution from the signal handler, even if it happens indirectly. (I tried doing something similar 1.5 years ago when I was just learning about signal handling -- I seem to recall it failing on me.)
All of the above needs to be able to occur in the background and I want to avoid having the jobManager sitting in a busy wait just waiting for exe1 to return. I would also prefer to not have a separate thread sitting around just waiting to start the execution of another process. However, instructing my jobManager to begin execution of the next process from the SIGCHLD handler seems like poor code.
Any feedback is appreciated.
I see two ways:
1) Replace your signal handler with a loop that calls sigwait (see man 3 sigwait), and handle each child's exit, including the decision to launch the next job, inside that loop.
2) Before starting, create a pipe, and in the main loop of your program use select on the pipe's read end to wait for events. In the signal handler, write a byte to the pipe, and handle the situation in the main loop.
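Here's a sketch of the second approach (the classic self-pipe trick); the body of the waitpid loop is a placeholder for the real jobManager logic:

#include <cstdio>
#include <signal.h>
#include <sys/select.h>
#include <sys/wait.h>
#include <unistd.h>

static int pipe_fds[2];  // [0] = read end, [1] = write end

// Async-signal-safe handler: just poke the pipe to wake the main loop.
static void on_sigchld(int) {
    char byte = 0;
    ssize_t ignored = write(pipe_fds[1], &byte, 1);
    (void)ignored;
}

int main() {
    pipe(pipe_fds);
    struct sigaction sa = {};
    sa.sa_handler = on_sigchld;
    sigaction(SIGCHLD, &sa, nullptr);

    for (;;) {  // main loop
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(pipe_fds[0], &rfds);
        if (select(pipe_fds[0] + 1, &rfds, nullptr, nullptr, nullptr) < 0)
            continue;  // interrupted by a signal; retry
        char byte;
        read(pipe_fds[0], &byte, 1);  // drain the wakeup
        int status;
        pid_t pid;
        while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
            // placeholder: inspect status and decide whether to launch exe2
            printf("child %d exited with status %d\n", pid, status);
        }
    }
}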
Hmmm that's a good one.
What about forking twice, once per process? The first child runs exe1 immediately, and the second one stops itself. In the parent's SIGCHLD handler, send a SIGCONT to the second child if appropriate, which then goes off and runs the job. Naturally, you SIGKILL the second one if the first one shouldn't run, which should be safe because it won't really have set anything up yet.
How does that sound? You'll have a process sitting around doing nothing, but it shouldn't be for very long.
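A rough sketch of that idea, waiting synchronously in the parent for brevity (exe1/exe2 are the names from the question; error handling omitted):

#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>

int main() {
    pid_t first = fork();
    if (first == 0) {                   // child 1: run exe1 right away
        execlp("exe1", "exe1", (char*)nullptr);
        _exit(127);                     // only reached if exec failed
    }
    pid_t second = fork();
    if (second == 0) {                  // child 2: park itself before doing anything
        raise(SIGSTOP);
        execlp("exe2", "exe2", (char*)nullptr);
        _exit(127);
    }
    int st;
    waitpid(second, &st, WUNTRACED);    // make sure child 2 has actually stopped
    waitpid(first, &st, 0);             // synchronous here; the answer uses SIGCHLD
    if (WIFEXITED(st) && WEXITSTATUS(st) == 0)
        kill(second, SIGCONT);          // exe1 succeeded: let exe2 run
    else
        kill(second, SIGKILL);          // exe1 failed: never start exe2
    waitpid(second, nullptr, 0);        // reap child 2 either way
}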
I'm a little new to GDB. I'm hoping someone can help me with something that should be quite simple, I've used Google/docs but I'm just missing something.
What is the 'normal' way folks debug threaded apps with GDB? I'm using pthreads. I want to watch only one thread; the two options I see are:
a) tell the debugger somehow to attach to a particular thread, such that stepping won't jump between threads on each context switch
b) tell the debugger to suspend/freeze any 'uninteresting' threads
I'd prefer to go route b). Reading the GDB help, I don't see a command for this. Tips?
See documentation for set scheduler-locking on.
Beware: if you suspend other threads, and if one of them holds a lock, and if your interesting thread needs that lock at some point while stepping, you'll deadlock.
What is the 'normal' way folks debug threaded apps
You can never debug thread correctness; you can only design it in. In my experience, most debugging of threaded apps consists of putting in assertions and examining the state of the world when one of the assertions is violated.
First, you need to enable debugger behavior that is comfortable for multi-threaded work, with the following commands. No idea why it's disabled by default.
set target-async 1
set non-stop on
I personally put those commands into my .gdbinit file. They make every command apply only to the currently focused thread. Note: the thread might be running, so you have to pause it first.
To see the focused thread, execute the thread command.
To switch to another thread, append its number, e.g. thread 2.
To see all threads with their numbers, issue info threads.
To apply a command to a particular thread, issue something like thread apply threadnum command. E.g. thread apply 4 bt applies the backtrace command to thread number 4, and thread apply all continue continues all paused threads.
There is a small problem, though: many commands need the thread to be paused. I know a few ways of doing that:
The interrupt command: it interrupts thread execution; it accepts the number of a thread to pause, and without an argument it breaks the focused one.
Setting a breakpoint somewhere. Note that you may tie a breakpoint to a particular thread so that other threads ignore it, like break linenum thread threadnum. E.g. break 25 thread 4.
You may also find it very useful that you can set a list of commands to be executed when a breakpoint is hit, through the commands command; e.g. you may quickly print interesting values, then continue execution.