I have a small problem: I wrote a program that acts as a server, running an infinite loop that waits for client requests.
But I would also like this program to return its pid.
So I think I should use multithreading.
Here's my main:
int main(int argc, char **argv) {
int pid = (int) getpid();
int port = 5555;
ServerSoap *servsoap;
servsoap = new ServerSoap(port, false);
servsoap->StartServer(); //Here starts the infinite loop
return pid; //so it never executes this
}
If this were a bash script, I would add & to run it in the background.
Should I use pthread? And how would I do that, please?
Thanks.
When a program returns (exits), all running threads terminate, so you can't have a background thread continue to run.
In addition, the int return value of main is (usually) truncated to an 8-bit value (0 to 255), so you don't have enough space to return a full pid.
It'd be better just to print the pid to stdout using printf.
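To see the truncation for yourself, here is a minimal sketch, assuming a POSIX system. The helper name observed_exit_status() is made up for this illustration: it forks a child that exits with the given code and reports what the parent gets back from waitpid().

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks a child that exits with `code` and returns the
// exit status the parent observes. Returns -1 if fork() or waitpid() fails
// or the child did not exit normally.
int observed_exit_status(int code)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);              // child: terminate immediately with the given code

    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);   // only the low 8 bits of `code` survive
}
```

For example, observed_exit_status(1234) gives 210 (1234 & 0xFF), which is why a pid generally cannot be passed back this way.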
If you put the infinite loop in a separate thread, and then return from main it will kill the whole process including your new thread. One solution, keeping to threads, is to make a detached thread. A better solution is probably to create a new process:
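#include <cstdio>
#include <unistd.h>

int main()
{
    pid_t pid = fork();
    if (pid == -1)
    {
        perror("fork");
        return 1;
    }
    if (pid == 0)
    {
        // Child: runs the server loop in the background.
        ServerSoap serversoap(5555, false);
        serversoap.StartServer();
        return 0;
    }
    // Parent: report the child's pid on stdout, then exit.
    printf("%d\n", (int) pid);
    return 0;
}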
Edit: Also note the limit to the return value from main as noted in the answer from ecatmur.
I have a feeling that you're trying to implement a daemon.
To add to @ecatmur's answer: if no error has happened, the program should always return 0 on termination.
PID is usually saved in some file, often times in /var/run/ directory. Some programs use /tmp/ directory.
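A minimal sketch of writing such a PID file. The function name write_pid_file() and whatever path you pass in are assumptions for illustration; a real daemon would pick its own location and usually needs the right permissions for /var/run/.

```cpp
#include <sys/types.h>
#include <unistd.h>
#include <fstream>
#include <string>

// Hypothetical helper: writes the calling process's PID to `path`,
// replacing any previous content. Returns false if the file cannot be written.
bool write_pid_file(const std::string& path)
{
    std::ofstream out(path, std::ios::trunc);
    if (!out)
        return false;
    out << getpid() << '\n';
    return static_cast<bool>(out.flush());
}
```

Other tools can then read the PID back from that file (for example to send the server a signal) instead of relying on the exit status.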
Your main is attempting to do what your server should do. You're confusing a couple patterns here.
Pattern #1: Daemon
Think of the main as the program that, when on, accepts client requests and performs operations with them. The main has to wait for requests if this is the structure of the program. When a request is received, only then do you perform the requested operation. The main serves only to turn on or off this service. Normally this type of behavior is handled by default with threads. The listener activates a thread calling specific methods with information regarding the request, for instance. Unless you require threads for the work you need done, you shouldn't require threads for this.
Pattern #2: Tool
Alternatively, you could simply call this program as a tool. You'd still need a web service, but this program could be separate from that. Apart from what your tool should do, you shouldn't require threads for this.
In either case, I don't think what you're looking for is to implement threading. You're simply activating a server which does nothing. You should probably look into adding request handlers instead.
Related
As I describe in the title, I would like a thread containing an if statement that is checked every minute and, if it is true, restarts the whole program. Any suggestions?
void* checkThread(void* arg)
{
if(statement)
//restart procedure
sleep(60);
}
int main()
{
pthread_create(&thread1, NULL, checkThread, main_object);
pthread_create();
pthread_create();
}
If you are going for the nuke-it-from-orbit approach (i.e. you don't want to trust your code to do a controlled shutdown reliably), then having the kill-and-auto-relaunch mechanism inside the same process space as the other code is not a very robust approach. For example, if one of the other threads were to crash, it would take your auto-restart-thread down with it.
A more fail-safe approach would be to have your auto-restart-thread launch all of the other code in a sub-process (via fork(); calling exec() is allowable but not necessary in this case). After 60 seconds, the parent process can kill the child process it created (by calling kill() on the process ID that fork() returned) and then launch a new one.
The advantage of doing it this way is that the separating of memory spaces protects your relauncher-code from any bugs in the rest of the code, and the killing of the child process means that the OS will handle all the cleanup of memory and other resources for you, so there is less of a worry about things like memory or file-handle leaks.
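A rough sketch of that relauncher, assuming POSIX. The function name supervise() and its parameters are invented for this example, and it kills the child unconditionally after a fixed lifetime rather than checking any condition first.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical relauncher: runs `work` in a child process, kills the child
// after `lifetime_seconds`, reaps it, and starts over, `rounds` times in total.
// Returns how many children were launched and reaped.
int supervise(void (*work)(), unsigned lifetime_seconds, int rounds)
{
    int completed = 0;
    for (int i = 0; i < rounds; ++i) {
        pid_t pid = fork();
        if (pid == -1)
            break;                      // could not launch; give up
        if (pid == 0) {
            work();                     // child: run the real program logic
            _exit(0);
        }
        sleep(lifetime_seconds);        // parent: let the child run for a while
        kill(pid, SIGKILL);             // nuke it; harmless if it already exited
        if (waitpid(pid, nullptr, 0) == pid)
            ++completed;                // reaped, the OS has cleaned up after it
    }
    return completed;
}
```

In a real program you would loop forever and pass 60 as the lifetime; the rounds parameter is only there so the sketch terminates.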
If you want a "nice" way to do it, you set a flag, and then politely wait for the threads to finish, before relaunching everything.
#include <atomic>
#include <pthread.h>
#include <unistd.h>

// Shared flag; atomic so that every thread sees the update without a data race.
std::atomic<bool> kill_and_restart_everything{false};

void main_thread() {
    do {
        kill_and_restart_everything = false;
        // create your threads.
        pthread_create(&thread1, NULL, checkThread, main_object);
        pthread_create(&thread2, ...);
        pthread_create(&thread3, ...);
        // wait for your threads.
        pthread_join(thread1, nullptr);
        pthread_join(thread2, nullptr);
        pthread_join(thread3, nullptr);
    } while (kill_and_restart_everything);
}
void* checkThread(void* arg) {
    while (! kill_and_restart_everything) {
        if(statement)
            kill_and_restart_everything = true;
        else
            sleep(60);
    }
    return nullptr;
}
void* workerThread(void* arg) {
// do stuff. periodically check
if (kill_and_restart_everything) {
// terminate this thread early.
// do it cleanly too, release any resources, etc (RAII is your friend here).
return nullptr;
}
// do other stuff, remember to have that check happen fairly regularly.
}
This way, whenever if(statement) is true, it will set a boolean that can be used to tell each thread to shut down. Then the program waits for each thread to finish, and then starts it all over again.
Downsides: If you're using any global state, that data will not be cleaned up and can cause problems for you. If a thread doesn't check your signal, you could be waiting a looooong time.
If you want to kill everything (nuke it from orbit) and restart, you could simply wrap this program in a shell script (which can then detect whatever condition you want, kill -9 the program, and relaunch it).
Use the exec system call to restart the process from the start of the program.
You can do it in two parts:
Part 1: one thread that checks the statement and sets a boolean to true when you need to restart the program.
This is the "checker" thread.
Part 2: one thread that does the actual work:
it will "relaunch" the program as long as needed.
This "relaunch" consists of a big loop.
In the loop, it:
creates a thread that actually executes your program (the task you want to be executed)
ends this task when the boolean is set to true
creates another thread to replace the one that was terminated
The main of your program consists of launching the "checker" and the "relauncher".
Tell me if you have any questions or remarks; I can add detail or some code.
I am creating a pipe using popen() and the process is invoking a third party tool which in some rare cases I need to terminate.
::popen(thirdPartyCommand.c_str(), "w");
If I just throw an exception and unwind the stack, the unwinding attempts to call pclose() on the third party process whose results I no longer need. However, pclose() never returns, as it blocks with the following stack trace on CentOS 4:
#0 0xffffe410 in __kernel_vsyscall ()
#1 0x00807dc3 in __waitpid_nocancel () from /lib/libc.so.6
#2 0x007d0abe in _IO_proc_close@@GLIBC_2.1 () from /lib/libc.so.6
#3 0x007daf38 in _IO_new_file_close_it () from /lib/libc.so.6
#4 0x007cec6e in fclose@@GLIBC_2.1 () from /lib/libc.so.6
#5 0x007d6cfd in pclose@@GLIBC_2.1 () from /lib/libc.so.6
Is there any way to make sure pclose() will succeed before I call it, so that my process does not hang waiting on it? As things stand it never returns, because I've stopped supplying input to the popen()ed process and wish to throw away its work.
Should I write an end of file somehow to the popen()ed file descriptor before trying to close it?
Note that the third party software is forking itself. At the point where pclose() has hung, there are four processes, one of which is defunct:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
abc 6870 0.0 0.0 8696 972 ? S 04:39 0:00 sh -c /usr/local/bin/third_party /home/arg1 /home/arg2 2>&1
abc 6871 0.0 0.0 10172 4296 ? S 04:39 0:00 /usr/local/bin/third_party /home/arg1 /home/arg2
abc 6874 99.8 0.0 10180 1604 ? R 04:39 141:44 /usr/local/bin/third_party /home/arg1 /home/arg2
abc 6875 0.0 0.0 0 0 ? Z 04:39 0:00 [third_party] <defunct>
I see two solutions here:
The neat one: you fork(), pipe() and execve() (or anything in the exec family of course...) "manually", then it is going to be up to you to decide if you want to let your children become zombies or not. (i.e. to wait() for them or not)
The ugly one: if you're sure only one such child process is running at any given time, you could use sysctl() to check whether any process with this name is running before you call pclose()... yuk.
I strongly advise the neat way here, or you could just ask whoever is responsible to fix that infinite loop in your third party tool, haha.
Good luck!
EDIT:
For your first question: I don't know. Some research on how to find processes by name using sysctl() should tell you what you need to know; I have never pushed it that far myself.
For your second and third question: popen() is basically a wrapper to fork() + pipe() + dup2() + execl().
fork() duplicates the process, execl() replaces the duplicated process' image with a new one, pipe() handles inter process communication and dup2() is used to redirect the output... And then pclose() will wait() for the duplicated process to die, which is why we're here.
If you want to know more, you should check this answer where I've recently explained how to perform a simple fork with standard IPC. In this case, it's just a bit more complicated as you have to use dup2() to redirect the standard output to your pipe.
You should also take a look at the popen()/pclose() source code, as it is of course open source.
Finally, here's a brief example, I cannot make it clearer than that:
int pipefd[2];
char buf;
pipe(pipefd);
if (fork() == 0) // I'm the child
{
    close(pipefd[0]); // I'm not going to read from this pipe
    dup2(pipefd[1], 1); // redirect standard output to the pipe
    close(pipefd[1]); // it has been duplicated, close it as we don't need it anymore
    execve()/execl()/execsomething()... // execute the program you want
}
else // I'm the parent
{
    close(pipefd[1]); // I'm not going to write to this pipe
    while (read(pipefd[0], &buf, 1) > 0) // read until EOF
        write(1, &buf, 1);
    close(pipefd[0]); // cleaning: close the read end (the write end is already closed)
}
And as always, remember to read the man pages and to check all your return values.
Again, good luck!
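For the write direction used in the question (popen(..., "w")), here is a hedged sketch of the manual approach that also keeps the child's PID, so the parent can terminate it instead of hanging in a wait. The names Child, spawn_writer() and close_child() are invented for this example. Putting the child into its own process group, so that processes it forks can be signalled together, is an extra assumption on my part and not something popen() does.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct Child {
    pid_t pid;      // PID of the "sh -c" process; also its process group ID
    int write_fd;   // write end of the pipe connected to the child's stdin
};

// Roughly popen(command, "w"), but the caller gets the PID as well.
// On failure, the returned pid is -1.
Child spawn_writer(const char* command)
{
    Child child = {-1, -1};
    int fds[2];
    if (pipe(fds) == -1)
        return child;
    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return child;
    }
    if (pid == 0) {
        setpgid(0, 0);                  // own process group for the child and its descendants
        dup2(fds[0], STDIN_FILENO);     // the child reads its stdin from the pipe
        close(fds[0]);
        close(fds[1]);
        execl("/bin/sh", "sh", "-c", command, (char*)nullptr);
        _exit(127);                     // only reached if exec failed
    }
    setpgid(pid, pid);                  // set it in the parent too, to avoid a race
    close(fds[0]);
    child.pid = pid;
    child.write_fd = fds[1];
    return child;
}

// Closes the pipe and reaps the child. With force == true, the whole process
// group gets SIGTERM first, so the wait cannot hang on a stuck child that
// honours SIGTERM. Returns the wait status, or -1 on error.
int close_child(Child& child, bool force)
{
    close(child.write_fd);              // the child sees EOF on its stdin
    if (force)
        kill(-child.pid, SIGTERM);      // negative PID: signal the whole group
    int status = -1;
    if (waitpid(child.pid, &status, 0) != child.pid)
        return -1;
    return status;
}
```

If the tool ignores SIGTERM you would have to escalate to SIGKILL, and grandchildren that move themselves to another process group are not reached by this.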
Another solution is to kill all your children. If you know that the only child processes you have are processes that get started when you do popen(), then it's easy enough. Otherwise you may need some more work or use the fork() + execve() combo, in which case you will know the first child's PID.
Whenever you run a child process, its PPID (parent process ID) is your own PID. It is easy enough to read the list of currently running processes and gather those that have their PPID = getpid(). Repeat the loop looking for processes that have their PPID equal to one of your children's PID. In the end you build a whole tree of child processes.
Since your child processes may end up creating other child processes, to make it safe, you will want to block those processes by sending a SIGSTOP. That way they will stop creating new children. As far as I know, you can't prevent the SIGSTOP from doing its deed.
The process is therefore:
void kill_all_children()
{
    std::vector<pid_t> me_and_children;
    me_and_children.push_back(getpid());
    bool found_child = false;
    do
    {
        found_child = false;
        std::vector<process> processes(get_processes());
        for(auto p : processes)
        {
            // skip processes we have already collected, or we would loop forever
            if(std::find(me_and_children.begin(),
                         me_and_children.end(),
                         p.pid()) != me_and_children.end())
            {
                continue;
            }
            // i.e. if this process is the child of any one of those processes
            if(std::find(me_and_children.begin(),
                         me_and_children.end(),
                         p.ppid()) != me_and_children.end())
            {
                kill(p.pid(), SIGSTOP);
                me_and_children.push_back(p.pid());
                found_child = true;
            }
        }
    }
    while(found_child);
    for(auto c : me_and_children)
    {
        // ignore ourselves
        if(c == getpid())
        {
            continue;
        }
        kill(c, SIGTERM);
        kill(c, SIGCONT); // make sure it continues now
    }
}
This is probably not the best way to close your pipe, though, since you probably need to give the command time to handle your data. So what you want is to execute that code only after a timeout. Your regular code could look something like this:
void handle_alarm(int) // signal handlers receive the signal number
{
    kill_all_children();
}
void send_data(...)
{
    signal(SIGALRM, handle_alarm);
    FILE* f = popen("command", "w");
    // do some work...
    alarm(60); // give it a minute
    pclose(f);
    alarm(0); // remove alarm
}
About the alarm(60): its location is up to you. It could also be placed before the popen() if you're afraid that the popen() or the work after it could also hang (I've had problems where the pipe fills up and I never even reach the pclose(), because the child process loops forever).
Note that the alarm() may not be the best idea in the world. You may prefer using a thread with a sleep made of a poll() or select() on an fd which you can wake up as required. That way the thread would call the kill_all_children() function after the sleep, but you can send it a message to wake it up early and let it know that the pclose() happened as expected.
Note: I left the implementation of get_processes() out of this answer. You can read that information from /proc or with the libprocps library. I have such an implementation in my snapwebsites project. It's called process_list. You could just reuse that class.
I'm using popen() to invoke a child process which doesn't need any stdin or stdout, it just runs for a short time to do its work, then it stops all by itself. Arguably, invoking this type of child process should rather be done with system() ? Anyway, pclose() is used afterwards to verify that the child process exited cleanly.
Under certain conditions, this child process keeps on running indefinitely. pclose() blocks forever, so then my parent process is also stuck. CPU usage runs to 100%, other executables get starved, and my whole embedded system crumbles. I came here looking for solutions.
Solution 1 by @cmc : decomposing popen() into fork(), pipe(), dup2() and execl().
It might just be a matter of personal taste, but I'm reluctant to rewrite perfectly fine system calls myself. I would just end up introducing new bugs.
Solution 2 by @cmc : verifying that the child process actually exists with sysctl(), to make sure that pclose() will return successfully. I find that this somehow sidesteps the problem from the OP @WilliamKF - there is definitely a child process, it just has become unresponsive. Forgoing the pclose() call won't solve that. [As an aside, in the 7 years since @cmc wrote this answer, sysctl() seems to have become deprecated.]
Solution 3 by @Alexis Wilke : killing the child process. I like this approach best. It basically automates what I did when I stepped in manually to resuscitate my dying embedded system. The problem with my stubborn adherence to popen(), is that I get no PID from the child process. I have been trying in vain with
waitid(P_PGID, getpgrp(), &child_info, WNOHANG);
but all I get on my Debian Linux 4.19 system is EINVAL.
So here's what I cobbled together. I'm searching for the child process by name; I can afford to take a few shortcuts, as I'm sure there will only be one process with this name. Ironically, the command-line utility ps is invoked by yet another popen(). This won't win any elegance prizes, but at least my embedded system stays afloat now.
FILE* child = popen("child", "r");
if (child)
{
int nr_loops;
int child_pid;
for (nr_loops=10; nr_loops; nr_loops--)
{
FILE* ps = popen("ps | grep child | grep -v grep | grep -v \"sh -c \" | sed \'s/^ *//\' | sed \'s/ .*$//\'", "r");
child_pid = 0;
int found = fscanf(ps, "%d", &child_pid);
pclose(ps);
if (found != 1)
// The child process is no longer running, no risk of blocking pclose()
break;
syslog(LOG_WARNING, "child running PID %d", child_pid);
usleep(1000000); // 1 second
}
if (!nr_loops)
{
// Time to kill this runaway child
syslog(LOG_ERR, "killing PID %d", child_pid);
kill(child_pid, SIGTERM);
}
pclose(child); // Even after it had to be killed
} /* if (child) */
I learned the hard way that I have to pair every popen() with a pclose(), otherwise I pile up zombie processes. I find it remarkable that this is needed after a direct kill; I figure that's because, according to the manpage, popen() actually launches sh -c with the child process in it, and it's this surrounding sh that becomes a zombie.
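A side note on the EINVAL from waitid() mentioned above: on Linux, the options argument must include at least one of WEXITED, WSTOPPED or WCONTINUED, and WNOHANG alone is rejected. Below is a hedged sketch of a non-blocking check along those lines; the function name some_child_has_exited() is made up. It uses P_ALL rather than P_PGID, and WNOWAIT so that the child stays reapable by a later pclose().

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

// Hypothetical helper: returns true if at least one child of the calling
// process has terminated and is waiting to be reaped. Never blocks (WNOHANG)
// and does not reap the child (WNOWAIT).
bool some_child_has_exited()
{
    siginfo_t info{};
    info.si_pid = 0;   // stays 0 if WNOHANG finds nothing to report
    if (waitid(P_ALL, 0, &info, WEXITED | WNOHANG | WNOWAIT) == -1)
        return false;  // e.g. ECHILD: there are no children at all
    return info.si_pid != 0;
}
```

This only helps if the popen()ed shell is the process's only child, and it still does not give you the PID of a child that is stuck, so it complements the ps-based search rather than replacing it.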
Every connection requires its own thread, and for now we allow only a certain number of connections per period. So every time a user connects, we increment a counter if we're still within a certain period of the last time we set the check time.
1. Get current_time = time(0).
2. If current_time is OUTSIDE a certain period from check_time, set counter = 0 and check_time = current_time.
3. (Otherwise, just leave it the way it is.)
4. If counter < LIMIT, do counter++ and return TRUE.
5. Otherwise return FALSE.
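The steps above can be sketched roughly as follows. The class name ConnectionLimiter is invented, and the current time is passed in as a parameter (instead of calling time(0) inside) purely to keep the sketch easy to exercise.

```cpp
#include <ctime>

// Hypothetical fixed-window limiter: at most `limit` connections per
// `period_seconds`, counted from the moment the window was last reset.
class ConnectionLimiter {
public:
    ConnectionLimiter(int limit, std::time_t period_seconds)
        : limit_(limit), period_(period_seconds), check_time_(0), counter_(0) {}

    // Returns true if a connection arriving at time `now` may proceed.
    bool allow(std::time_t now)
    {
        if (now - check_time_ >= period_) {   // outside the period: start a new window
            counter_ = 0;
            check_time_ = now;
        }
        if (counter_ < limit_) {              // still room in this window
            ++counter_;
            return true;
        }
        return false;                         // window is full
    }

private:
    int limit_;
    std::time_t period_;
    std::time_t check_time_;
    int counter_;
};
```

With one thread per connection, calls to allow() would need to be protected by a mutex; that is left out here.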
But this is independent of actually how many threads we have running in the server, so I'm thinking of a way to allow connections depending on this number.
The problem is that we're actually using a third-party API for this, and we don't know exactly how long a connection will last. First I thought of creating a child thread that runs ps and passes the result to the parent thread, but that seems like it would take more time, since I'd have to parse the output to get the total number of threads, etc. I'm not sure if I'm making any sense. I'm using C++, by the way. Do you have any suggestions as to how I could implement the new checking method? It would be very much appreciated.
The directory /proc/[pid]/task (since Linux 2.6.0-test6) contains one subdirectory for every thread belonging to process [pid]. Look at man proc for documentation. Assuming you know the pid of your server process, you could just count those subdirectories.
You could use boost::filesystem to do that from c++, as described here:
How do I count the number of files in a directory using boost::filesystem?
I assumed you are using Linux.
Okay, if you know the TID of the thread in use by the connection then you can wait on that object in a separate thread which can then decrement the counter.
At least I know that you can do it with MSVC...
bool createConnection()
{
    if( ConnectionMonitor::connectionsMaxed() )
    {
        LOG( "Connection Request failed due to over-subscription" );
        return false;
    }
    ConnectionThread& connectionThread = ThreadFactory::createNewConnectionThread();
    connectionThread.startConnection();
    ThreadMonitorThread& monitor = ThreadFactory::createThreadMonitor(connectionThread);
    monitor.monitor();
    return true;
}
and in ThreadMonitorThread
ThreadMonitorThread( const Thread& thread )
{
    this->thread = thread;
}
void monitor()
{
    // WaitForSingleObject() takes a HANDLE and a timeout; getHandle() is assumed
    // here to return the thread's HANDLE rather than its numeric ID.
    WaitForSingleObject( thread.getHandle(), INFINITE );
    ConnectionMonitor::decrementThreadCounter();
}
Of course ThreadMonitorThread will require some special privileges to call the decrement and the ThreadFactory will probably need the same to increment it.
You also need to worry about properly coding this up... who owns the objects and what about exceptions and errors etc...
Right now I have a C++ client application that uses mysql.h to connect to a MySQL database, and I have to perform some logic in case there is a disconnect. I'm wondering if this is the best way to reconnect to a MySQL database when my client gets disconnected.
bool MYSQL::Reconnect(const char *host, const char *user, const char *passwd, const char *db)
{
bool out = false;
pid_t command_pid = fork();
if (command_pid == 0)
{
while(1)
{
sleep(1);
if (mysql_real_connect(&m_mysql, host, user, passwd, db, 0, NULL, 0) == NULL )
{
fprintf(stderr, "Failed to connect to database: Error: %s\n",
mysql_error(&m_mysql));
}
else
{
m_connected = true;
out = true;
break;
}
}
exit(0);
}
if (command_pid < 0)
fprintf(stderr, "Could not fork process[reconnect]: %s\n", mysql_error(&m_mysql));
return out;
}
Right now I take in all my parameters and perform a fork. The child process attempts to reconnect every second, with a sleep() statement between attempts. Is this a good way to do this? Thanks.
Sorry, but your code doesn't do what you think it does, Kaiser Wilhelm.
In essence, you're trying to treat a fork like a thread, which it is not.
When you fork a child, the parent process is completely cloned, including file and socket descriptors, which is how your program is connected to the MySQL database server. That is, both the parent and the child end up with their own copy of the same connection to the database server when you fork. I assume the parent only calls this Reconnect() method when it sees the connection drop, and stops using its copy of the now-defunct MySQL connection object, m_mysql. If so, the parent's copy of the connection is just as useless as the child's when you start the reconnect operation.
The thing is, the reverse is not also true: once the child manages to reconnect to the database server, the parent's connection object remains defunct. Nothing the child does propagates back up to the parent. After the fork, the two processes are completely independent, except insofar as they might try to access some I/O resource they initially shared. For example, if you called this Reconnect() while the connection was up and continued using the connection in the parent, the child's attempts to talk to the DB server on the same connection would confuse either mysqld or libmysqlclient, likely causing data corruption or a crash.
As hinted above, one solution to this is to use threads instead of forking. Beware, however, of the many problems with using threads with the MySQL C API.
Given a choice, I'd rather use asynchronous I/O to do the background connection attempt within the application's main thread, but the MySQL C API doesn't allow that.
It seems you're trying to avoid blocking your main application thread while attempting the DB server reconnection. It may be that you can get away with doing it synchronously anyway by setting the connect timeout to 1 second, which is fine when the MySQL server is on the same machine or same LAN as the client. If you could tolerate your main thread blocking for up to a second for connection attempts to fail — worst case happening when the server is on a separate machine and it's physically disconnected or firewalled — this would probably be a cleaner solution than threads. The connection attempt can fail much quicker if the server machine is still running and the port isn't firewalled, such as when it is rebooting and the TCP/IP stack is [still] up.
As far as I can tell, this doesn't do what you intended.
Logical issues
Reconnect doesn't "perform some logic in case there is a disconnect" at all.
It attempts to connect over and over again until it succeeds, then stops. That's it. The state of the connection is never checked again. If the connection drops, this code knows nothing about it.
Technical issues
Also pay close attention to the technical issues that Warren raises.
Sure, it's perfectly OK. You might want to think about replacing the while ( 1 ) loop with something like
while ( NULL == mysql_real_connect( ... )) {
sleep( 1 );
...
}
which is the kind of idiom that one learns by practice, but your code works just fine as far as I can see. Don't forget to put a counter inside the while loop.
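A small sketch of such a loop with a counter, kept independent of MySQL so it stands on its own. The function name retry() and its parameters are made up; the connection attempt is passed in as a callable that returns true on success.

```cpp
#include <unistd.h>
#include <functional>

// Hypothetical helper: calls `attempt` until it succeeds or `max_attempts`
// is reached, sleeping `delay_seconds` between failed attempts.
// Returns the number of attempts used on success, or 0 if all of them failed.
int retry(const std::function<bool()>& attempt, int max_attempts, unsigned delay_seconds)
{
    for (int tries = 1; tries <= max_attempts; ++tries) {
        if (attempt())
            return tries;           // connected
        if (tries < max_attempts)
            sleep(delay_seconds);   // wait before the next attempt
    }
    return 0;                       // gave up
}
```

In the question's code, the callable would wrap the mysql_real_connect(...) call and return whether its result was non-NULL.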
Is there a way for a forked child to examine another forked child so that, if the other forked child takes more time than usual to perform its chores, the first child may perform predefined steps?
If so, sample code would be greatly appreciated.
Yes. Simply fork the process to be watched, from the process to watch it.
if (fork() == 0) {
// we are the watcher
pid_t watchee_pid = fork();
if (watchee_pid != 0) {
// wait and/or handle timeout
int status;
waitpid(watchee_pid, &status, WNOHANG);
} else {
// we're being watched. do stuff
}
} else {
// original process
}
To emphasise: There are 3 processes. The original, the watcher process (that handles timeout etc.) and the actual watched process.
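The "wait and/or handle timeout" part could be fleshed out along these lines. This is a sketch under my own assumptions: the name run_with_timeout(), the 10 ms polling step and the choice of SIGKILL as the "predefined step" are all illustrative.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical watcher: runs `task` in a child process and polls it for up to
// `timeout_ms` milliseconds. Returns true if the child finished on its own;
// otherwise kills it, reaps it, and returns false.
bool run_with_timeout(void (*task)(), int timeout_ms)
{
    pid_t pid = fork();
    if (pid == -1)
        return false;
    if (pid == 0) {
        task();                               // the watched process does its chores
        _exit(0);
    }

    const int step_ms = 10;
    int status = 0;
    for (int waited = 0; waited < timeout_ms; waited += step_ms) {
        if (waitpid(pid, &status, WNOHANG) == pid)
            return true;                      // finished in time
        usleep(step_ms * 1000);
    }
    if (waitpid(pid, &status, WNOHANG) == pid)
        return true;                          // finished just before the deadline

    kill(pid, SIGKILL);                       // took too long: perform the predefined step
    waitpid(pid, nullptr, 0);                 // reap it so it does not stay a zombie
    return false;
}
```

The watcher process from the snippet above would call this instead of the bare waitpid(), leaving the original process free to carry on.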
To do this, you'll need to use some form of IPC, and named shared memory segments make perfect sense here. Your first child could read a value in a named segment which the other child will set once it has completed its work. Your first child could set a timeout and, once that timeout expires, check for the value: if the value is not set, then do what you need to do.
The code can vary greatly depending on C or C++; you need to select which. If C++, you can use boost::interprocess for this, which has lots of examples of shared memory usage. If C, then you'll have to put this together using native calls for your OS. Again, this should be fairly straightforward: start at shmget().
This is some illustrative code that could help you solve the problem in a Linux environment.
pid_t pid = fork();
if (pid == -1) {
printf("fork: %s", strerror(errno));
exit(1);
} else if (pid > 0) {
/* parent process */
int i = 0;
int secs = 60; /* 60 secs for the process to finish */
while(1) {
/* check if process with pid exists */
if (exist(pid) && i > secs) {
/* do something accordingly */
}
sleep(1);
i++;
}
} else {
/* child process */
/* child logic here */
exit(0);
}
Those 60 seconds are not very strict. You could use a timer if you need stricter timing. But if your system doesn't need critical real-time processing, this should be just fine.
exist(pid) refers to a function you would have to write yourself; it looks into /proc/pid, where pid is the process ID of the child process.
Optionally, you can implement exist(pid) using a library designed to extract information from the /proc directory, such as procps.
The only processes you can wait on are your own direct child processes - not siblings, not your parent, not grandchildren, etc. Depending on your program's needs, Matt's solution may work for you. If not, here are some other alternatives:
Forget about waiting and use another form of IPC. For robustness, it needs to be something where unexpected termination of the process you're waiting on results in your receiving an event. The best one I can think of is opening a pipe which both processes share, and giving the writing end of the pipe to the process you want to wait for (make sure no other processes keep the writing end open!). When the process holding the writing end terminates, it will be closed, and the reading end will then indicate EOF (read will block on it until the writing end is closed, then return a zero-length read).
Forget about IPC and use threads. One advantage of threads is that the atomicity of a "process" is preserved. It's impossible for individual threads to be killed or otherwise terminate outside of the control of your program, so you don't have to worry about race conditions with process ids and shared resource allocation in the system-global namespace (IPC objects, filenames, sockets, etc.). All synchronization primitives exist purely within your process's address space.