I have a systemd service which runs and does its thing. Periodically I need it to upgrade itself, which requires a shutdown and a restart of the service. For question purposes the upgrade script can be as simple as:
echo "Stopping service..."
systemctl stop myservice
echo "Doing some stuff..."
sleep 10s
echo "Starting service..."
systemctl start myservice
I want to call this within the service itself, preferably using boost::process:
boost::process::child instexe{
boost::process::search_path("bash"),
std::vector<std::string>{"installerscript.sh"},
boost::process::start_dir("/installer/folder"),
boost::process::std_out > "/some/log/file.txt"
};
instexe.detach();
The problem is that as soon as the script calls systemctl stop myservice, the installer script is killed.
Is there a way I can do what I want to do with boost::process? Or how can I do it?
If the upgrades happen at predefined intervals, you could schedule them with crontab.
https://opensource.com/article/17/11/how-use-cron-linux
00 09-17 * * 1-5 /usr/local/bin/installerScript.sh
The above crontab entry runs the upgrade every hour from 9 am to 5 pm, Monday to Friday. Many other schedules can be configured in the same way.
Is there a way I can do what I want to do with boost::process? Or how can I do it?
If you have the child process killing the parent, there's always going to be a race condition by definition.
The quick hack is to put a sleep statement at the start of the installer script, but the correct solution is to explicitly synchronize with the child:
have the installer script detect whether it's running interactively (ie, being run manually from a terminal instead of by your service)
if it is non-interactive (your use case), have it wait for some input in stdin
connect the stdin pipe when you create the child
detach the child and then write something to tell the child it's safe
Other synchronization mechanisms are available: you could use a lockfile or a signal. You just need to make sure the child doesn't do anything until after the parent has detached it.
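The stdin handshake described above can be sketched roughly as follows. This uses plain POSIX fork/pipe/exec rather than boost::process, and the names spawn_gated_child and release_child, as well as the "go" line, are made up for illustration. It assumes the installer script blocks on a read from stdin before doing any work.

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: starts `args` with its stdin connected to a pipe and
// returns the write end of that pipe, or -1 on failure. The child is
// expected to block reading stdin until the parent writes a line.
int spawn_gated_child(const std::vector<std::string>& args, pid_t& child_pid) {
    if (args.empty()) return -1;

    // Build argv before forking so the child does not allocate memory.
    std::vector<char*> argv;
    for (const std::string& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    int fds[2];
    if (pipe(fds) != 0) return -1;

    child_pid = fork();
    if (child_pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child_pid == 0) {
        close(fds[1]);                 // child only reads
        dup2(fds[0], STDIN_FILENO);    // the pipe becomes the child's stdin
        close(fds[0]);
        execvp(argv[0], argv.data());
        _exit(127);                    // only reached if exec failed
    }
    close(fds[0]);                     // parent only writes
    return fds[1];
}

// Tells the child it may proceed, then closes the parent's end of the pipe.
bool release_child(int gate_fd) {
    const char go[] = "go\n";
    const ssize_t expected = static_cast<ssize_t>(sizeof(go) - 1);
    const bool ok = write(gate_fd, go, sizeof(go) - 1) == expected;
    close(gate_fd);
    return ok;
}
```

The parent would call spawn_gated_child, finish whatever it must do before the child may act, and only then call release_child. On the script side, a line such as `read line` at the top is enough to wait for the signal.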
It turns out (from this question, which leads to the excellent-but-unfindable systemd.kill manpage) that systemd has four different ways of stopping a unit, controlled by the KillMode variable in your unit configuration:
control-group will send SIGTERM (by default, overridable with KillSignal) to every process in the unit's cgroup. That means both parent and child.
mixed will send SIGTERM (or KillSignal) to your main process, followed by SIGKILL to every process remaining in the unit's cgroup, which includes the child.
process will kill only the main process and leave the child alone
none is not recommended, it will just run your ExecStop procedure
You can probably just set KillMode=process, but note that if SendSIGKILL or SendSIGHUP are true, those signals may still be delivered to your child after TimeoutStopSec.
It seems like it might be simpler to restart your service and have a launch script that can update it at startup, or to perform the update in your ExecStop procedure, than to persuade systemd to leave the child alone until the update is complete, without the risk of a hung child updater hanging around forever.
Either way, your remaining problems are exclusively with systemd rather than with boost.Process.
Related
I have a Linux daemon that I have written in C++ that should restart itself when given a "restart" command from a user over the network through its console. Is this possible? I use an /etc/init.d script. How can I program it to restart itself? Should I launch a new process with a very long delay (one minute) that then fires the shell script again? The problem is that the daemon may take a very long time to close down, and it could take even more than a minute in a worst-case scenario.
There are basically three ways for an application to restart itself:
When the application is told to restart, it does proper clean-up, releases all resources it has allocated, and then re-initializes like it was started from scratch.
Fork a new process, where the new child process execs itself and the parent process exits normally.
The daemon is actually just a wrapper application, much like an init-script. It forks a new process which runs the actual application, while the parent process just waits for it to exit. If the child process (and the real application) returns with a special exit code, it means that it should be restarted, so the wrapper forks and execs all over again.
Note that points 2 and 3 are basically the same.
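As a rough sketch of the wrapper approach in point 3: the exit code 42 used as the "please restart me" signal and the names run_once and supervise are assumptions made for this example, not an established convention.

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical convention: the application exits with kRestartCode to ask
// the wrapper to start it again.
constexpr int kRestartCode = 42;

// Runs `args` once and returns its exit code, or -1 if it could not be
// started, was killed by a signal, or waiting for it failed.
int run_once(const std::vector<std::string>& args) {
    if (args.empty()) return -1;
    std::vector<char*> argv;
    for (const std::string& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        execvp(argv[0], argv.data());
        _exit(127);  // only reached if exec failed
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

// Keeps relaunching the application while it asks for a restart, giving up
// after max_launches. Returns the last exit code; `launches` reports how
// many times the application was started.
int supervise(const std::vector<std::string>& args, int max_launches, int& launches) {
    int code = -1;
    launches = 0;
    while (launches < max_launches) {
        code = run_once(args);
        ++launches;
        if (code != kRestartCode) break;
    }
    return code;
}
```

The launch limit is there so that an application which always asks for a restart cannot spin forever; a real wrapper might prefer a delay between launches instead.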
Break the restart down into two steps, stop and start. If your program takes time to stop, that should be handled in the stop function. I can't comment on specifics since I don't know your use case, but I'd imagine that monitoring the process to check whether it has terminated would be a graceful way to stop.
Do whatever shut-down/clean-up you need to do, then call this:
execv( argv[0], argv );
Just like fork() and exec(), but skipping the fork. The exec will replace the current process with a new copy of itself. cf. http://linux.die.net/man/3/exec
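A slightly fuller sketch of the same idea follows. The helper names reexec and restart_self are invented here, and using /proc/self/exe is a Linux-specific assumption that avoids depending on argv[0] being a usable path.

```cpp
#include <cassert>
#include <cerrno>
#include <unistd.h>

// Replaces the current process image with `path`, passing `argv` through.
// execv() only returns on failure, so reaching the return statement means
// the exec did not happen and errno describes why.
int reexec(const char* path, char* const argv[]) {
    execv(path, argv);
    return -1;
}

// Linux-specific assumption: /proc/self/exe refers to the running binary.
int restart_self(char* const argv[]) {
    return reexec("/proc/self/exe", argv);
}
```

In main, after the shutdown and clean-up work is done, this would be called as restart_self(argv). File descriptors that are not marked close-on-exec stay open in the new image, so sockets and log files may need closing first.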
Your init script should just kill your daemon and start it again. Don't try to restart your daemon FROM your daemon.
So here is the situation, we have a C++ datafeed client program which we run ~30 instances of with different parameters, and there are 3 scripts written to run/stop them: start.sh stop.sh and restart.sh (which runs stop.sh and then start.sh).
When there is a high volume of data the client "falls behind" real time. We test this by comparing the system time to the most recent data entry times listed. If any of the clients falls behind more than 10 minutes or so, I want to call the restart script to start all the binaries fresh so our data is as close to real time as possible.
Normally I call a script using system("script.sh"). However, the restart script looks up and kills the process using kill, but calling system() also makes the current program ignore SIGQUIT and SIGINT until system() returns.
On top of this if there are two concurrent executions with the same arguments they will conflict and the program will hang (this stems from establishing database connections), so I can not start the new instance until the old one is killed and I can not kill the current one if it ignores SIGQUIT.
Is there any way around this? The current state of the binary and missing some data does not matter at all if it has reached the threshold, I also can not just have the program restart itself, since if one of the instances falls behind, we want to restart all 30 of the instances (so gaps in the data are at uniform times). Is there a clean way to call a script from within C++ which hands over control and allows the script to restart the program from scratch?
FYI we are running on CentOS 6.3
Use exec() instead of system(). It will replace your process with the new one. Note there is a significant difference in how exec() is called and how it behaves: system() passes its string argument to the system shell to run. exec() actually executes an executable file, and you need to supply the arguments to the process one at a time, instead of letting the shell parse them apart for you.
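To illustrate the "one argument at a time" point, here is a minimal sketch. The helper names make_argv and exec_script are made up, and it assumes the script is executable and starts with a #! line, since execv() does not fall back to a shell.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>
#include <unistd.h>

// Builds the null-terminated pointer array that the exec family expects.
// The pointers refer into `args`, which must outlive the returned vector.
std::vector<char*> make_argv(const std::vector<std::string>& args) {
    std::vector<char*> argv;
    argv.reserve(args.size() + 1);
    for (const std::string& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);
    return argv;
}

// Hands control to the script. No shell parses the arguments, and on
// success this call never returns because the process image is replaced.
int exec_script(const std::vector<std::string>& args) {
    if (args.empty()) return -1;
    std::vector<char*> argv = make_argv(args);
    execv(argv[0], argv.data());
    return -1;  // only reached if execv failed
}
```

Each element of the vector reaches the script as exactly one argument, so an entry such as "two words" is not split on the space the way it would be in a system() command string.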
Here's my two cents.
Temporary solution: Use SIGKILL.
Long-term solution: Optimize your code or the general logic of your service tree, using other system calls like exec or by rewriting it to use threads.
If you want better answers, maybe you should post some code and/or make the question more specific.
In a Linux/C++ library I'm launching a process via the system() call,
system("nohup processName > /dev/null&");
This seems to work fine with a simple test application that exits on its own, but if I use this from inside of a Nodejs/V8 extension which gets a kill signal, the child process gets killed. I did find that running,
system("sudo nohup processName > /dev/null&");
With the sudoers file set up to not require a password, this manages to keep running even when the parent process (node) exits. Is there some way to entirely detach the child process so that signals sent to the parent, and the parent exiting, have no effect on the child anymore? Preferably within the system() call and not something that requires getting the process ID and doing something with it.
The procedure to detach from the parent process is simple: Run the command under setsid (so it starts in a new session), redirecting standard input, output and error to /dev/null (or somewhere else, as appropriate), in background of a subshell. Because system() starts a new shell, it is equivalent to such a subshell, so
system("setsid COMMAND </dev/null >/dev/null 2>/dev/null &");
does exactly what is needed. In a shell script, the equivalent is
( setsid COMMAND </dev/null >/dev/null 2>/dev/null & )
(Shell scripts need a subshell, because otherwise the COMMAND would be under job control for the current shell. That is not important when using system(), because it starts a new shell just for the command anyway; the shell will exit when the command exits.)
The redirections are necessary to make sure the COMMAND has no open descriptors to the current terminal. (When the terminal closes, a HUP signal is sent to such processes.) This means standard input, standard output, and standard error all must be redirected. The above redirections work in both Bash and POSIX shells, but might not work in ancient versions of /bin/sh. In particular, they should work in all Linux distros.
setsid starts a new session, with the COMMAND becoming the leader of its own process group. Signals can be directed to either a single process, or to all processes in a process group. Termination signals are usually sent to entire process groups (since an application may technically consist of multiple related processes). Starting a new session makes sure COMMAND does not get killed if the process group the parent process belongs to is killed by a process-group-wide signal.
My guess is that the whole process group is being killed. You could try setpgid in the child to start a new process group. The first step should be to get rid of system and use fork and execve or posix_spawn.
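Without system(), the same detachment can be sketched with fork, setsid and exec. This is only an illustration: the name spawn_detached is invented, it uses setsid (as in the shell answer) rather than setpgid, and it adds a second fork so the detached process is not left as a child of the caller.

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Starts `args` in its own session with stdin, stdout and stderr on
// /dev/null. Returns true if the intermediate child was created and exited
// cleanly; a failed exec in the detached process is not reported back.
bool spawn_detached(const std::vector<std::string>& args) {
    if (args.empty()) return false;
    std::vector<char*> argv;
    for (const std::string& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid < 0) return false;
    if (pid == 0) {
        // Intermediate child: a new session also means a new process group,
        // so group-wide signals aimed at the caller no longer reach us.
        if (setsid() < 0) _exit(1);
        pid_t grandchild = fork();
        if (grandchild < 0) _exit(1);
        if (grandchild > 0) _exit(0);  // the grandchild gets re-parented

        // Grandchild: drop any descriptors that point at the terminal.
        int devnull = open("/dev/null", O_RDWR);
        if (devnull < 0) _exit(127);
        dup2(devnull, STDIN_FILENO);
        dup2(devnull, STDOUT_FILENO);
        dup2(devnull, STDERR_FILENO);
        if (devnull > STDERR_FILENO) close(devnull);
        execvp(argv[0], argv.data());
        _exit(127);                    // only reached if exec failed
    }
    // Reap the intermediate child so it does not linger as a zombie.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return false;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

A call such as spawn_detached({"processName"}) would then take the place of the nohup command line in the question.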
I've created a linux c++ service (It's basically an app, but it handles requests over TCP/IP quite frequently).
I was wondering if there is any easy way to have it "auto restart" if something goes wrong (like it crashes) or if the server restarts?
I wasn't sure how or even if I should set it up as a service or set up an rc.d script, I'm not 100% familiar w/how to do this on linux (my server is running ubuntu if it matters).
Any advice would be greatly appreciated!
~ Josh
A lot of answers here suggest having a 'parent app' that does it, but you end up with the same problem with the parent app - it's turtles all the way down.
In many Unix-type systems (especially historically), the init process is the first process that executes, and it will execute (and automatically restart) processes as defined in /etc/inittab.
So instead of writing your own watchdog or process to auto-restart - you can use this one that does the job for you automatically, and since it's the init process, if it dies, the system has a lot more to worry about than your service.
@doron suggests another good approach, if your service should spawn a new process for every incoming connection, and only does work when it has an incoming connection.
Finally, these days the init process (and /etc/inittab) has been replaced on Ubuntu type systems with upstart - http://upstart.ubuntu.com/ - a more flexible system for the same thing.
In my product, I've created a watchdog process which forks and execs the service in a separate process, and waits for its termination. If, for some reason, the service terminates, the watchdog forks again and starts it again.
As noted in the comments, you should check why it crashed. For a start, you could read the program's exit status.
Here is a simple program to get you started:
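#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

void create_process(void);

int main()
{
    create_process();
    return 0;
}

void create_process(void)
{
    int exit_code;
    pid_t pid = fork();
    if(pid < 0)
    {
        perror("fork");
        return;
    }
    if(pid == 0)
    {
        /* Child: replace this process with the service */
        execl("./your_service", "./your_service", (char *)NULL);
        _exit(127); /* Only reached if exec failed */
    }
    else
    {
        /* Parent: wait for the service to terminate */
        wait(&exit_code);
        if(WIFEXITED(exit_code))
        {
            /* Program terminated with exit */
            /* If you want, you could decode exit code here using
               WEXITSTATUS and you can start program again.
            */
            return;
        }
        else
        {
            /* Program didn't terminate with exit, restart */
            create_process();
        }
    }
}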
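To follow up on checking why the service stopped, the wait status can be decoded along these lines. This is a small sketch; describe_status and run_and_describe are names made up for the example, and the second one exists only to try the decoder out.

```cpp
#include <cassert>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Turns a wait() status into a short description for the watchdog's log.
std::string describe_status(int status) {
    if (WIFEXITED(status))
        return "exited with code " + std::to_string(WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return "killed by signal " + std::to_string(WTERMSIG(status));
    return "unexpected status";
}

// Hypothetical helper: runs a shell command and describes how it ended.
std::string run_and_describe(const char* shell_command) {
    pid_t pid = fork();
    if (pid < 0) return "fork failed";
    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", shell_command, static_cast<char*>(nullptr));
        _exit(127);  // only reached if exec failed
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return "wait failed";
    return describe_status(status);
}
```

A watchdog could log this string before deciding whether to restart, which makes a crash (a signal such as SIGSEGV) easy to tell apart from a deliberate exit.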
In order to start the service on system startup, simply edit the /etc/rc.local script and append the command that runs your watchdog process.
Create a control app which starts and restarts it if necessary.
Do this in your app - fork a child, run the program there, catch stop/crash and fork new child if necessary. Some working code can be found here: monitoring the main app in c .
You might want to take a look at using the Inet Daemon. The inet daemon starts a new process every time a new request comes in. So if there is a crash in your server, it just gets restarted when the next request comes in.
The simplest way to write a parent app which auto-restarts a child process, on *NIX, is just using a shell script:
#!/bin/sh
while true;
do
run_my_program;
done
You can optionally have this redirect output, run itself in the background, etc.
This doesn't address starting the process in the first place, but it's less work (for exactly the same result) than writing a parent process in C++.
I need to execute some commands via "/bin/sh" from a daemon. Sometimes these commands take too long to execute, and I need to somehow interrupt them. The daemon is written in C++, and the commands are executed with std::system(). I need the stack cleaned up so that destructors are called when the thread dies. (Catching the event in a C++ exception handler would be perfect.)
The threads are created using boost::thread. Unfortunately, neither boost::thread::interrupt() nor pthread_cancel() is useful in this case.
I can imagine several ways to do this, from writing my own version of system(), to finding the child's process-id and signal() it. But there must be a simpler way?
Any command executed using the system command is executed in a new process. Unfortunately, system halts the execution of the current process until the new process completes. If the subprocess hangs, the calling process hangs as well.
The way to get round this is to use fork to create a new process and call one of the exec calls to execute the desired command. Your main process can then wait on the child process's process ID (pid). The timeout can be achieved by generating a SIGALRM using the alarm call before the wait call.
If the subprocess times out, you can kill it using the kill command. Try first with SIGTERM; if that fails, you can try again with SIGKILL, which will certainly kill the child process.
Some more information on fork and exec can be found here
I did not try boost::process, as it is not part of boost. I did however try ACE_Process, which showed some strange behavior (the time-outs sometimes worked and sometimes did not work). So I wrote a simple std::system replacement, that polls for the status of the running process (effectively removing the problems with process-wide signals and alarms on a multi threading process). I also use boost::this_thread::sleep(), so that boost::thread::interrupt() should work as an alternative or in addition to the time-out.
Stackoverflow.com does not work very well with my Firefox under Debian (in fact, I could not reply at all, I had to start Windows in a VM) or Opera (in my VM), so I'm unable to post the code in a readable manner. My prototype (before I moved it to the actual application) is available here: http://www.jgaa.com/files/ExternProcess.cpp
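For readers who want a starting point, a polling replacement for std::system() with a time-out might look roughly like this. It is not the prototype linked above: the RunResult struct, the function names and the 10 ms polling interval are all assumptions made for the sketch, and it uses std::this_thread::sleep_for where the author used boost::this_thread::sleep().

```cpp
#include <cassert>
#include <chrono>
#include <signal.h>
#include <string>
#include <thread>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct RunResult {
    bool started;    // false if the child could not be forked
    bool timed_out;  // true if the deadline passed and the child was signalled
    int status;      // raw status from waitpid(), meaningful when started
};

using Clock = std::chrono::steady_clock;

// Polls waitpid(WNOHANG) until the child is reaped or `deadline` passes.
static bool wait_until(pid_t pid, Clock::time_point deadline, int& status) {
    for (;;) {
        if (waitpid(pid, &status, WNOHANG) == pid) return true;
        if (Clock::now() >= deadline) return false;
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

// Runs `args`, sends SIGTERM after `timeout`, and SIGKILL if the child is
// still alive once the additional `grace` period has passed.
RunResult run_with_timeout(const std::vector<std::string>& args,
                           std::chrono::milliseconds timeout,
                           std::chrono::milliseconds grace) {
    RunResult result{false, false, 0};
    if (args.empty()) return result;
    std::vector<char*> argv;
    for (const std::string& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid < 0) return result;
    if (pid == 0) {
        execvp(argv[0], argv.data());
        _exit(127);  // only reached if exec failed
    }
    result.started = true;
    if (wait_until(pid, Clock::now() + timeout, result.status)) return result;

    result.timed_out = true;
    kill(pid, SIGTERM);               // ask politely first
    if (wait_until(pid, Clock::now() + grace, result.status)) return result;
    kill(pid, SIGKILL);               // cannot be caught or ignored
    waitpid(pid, &result.status, 0);  // reap so no zombie is left behind
    return result;
}
```

Because the loop only sleeps and polls, it involves no process-wide signal handlers or alarms, which is the property the answer above was after in a multi-threaded daemon.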
You can try to look at Boost.Process:
Where is Boost.Process?
I have been waiting for a long time for such a class.
If you are willing to use Qt, a nice portable solution is QProcess:
http://doc.trolltech.com/4.1/qprocess.html
Of course, you can also make your own system-specific solution like Let_Me_Be suggests.
Anyway you'd probably have to get rid of the system() function call and replace it by a more powerful alternative.