Simplest IPC from one Linux app to another in C++ on raspberry pi - c++

I need the simplest, most reliable IPC method from one C++ app running on the RPi to another app.
All I'm trying to do is send a string message of 40 characters from one app to another.
The first app runs as a service on boot; the other app is started later and is frequently exited and restarted for debugging.
The frequent debugging of the second app is what's causing problems with the IPC methods I've tried so far.
I've tried several different methods, and here is where they failed:
File FIFO: one program hangs while the other program is writing to the file.
Shared memory: cannot initialize on one thread and read from another thread. Also, frequent exiting while debugging causes GDB to crash with "the following GDB command is taking too long to complete -stack-list-frames --thread 1".
UDP socket with localhost: same issue as above, plus improper exits block the socket, forcing me to reboot the device.
Non-blocking pipe: not getting any messages on the receiving process.
What else can I try? I don't want to use the DBus library; it seems too complex for this application.
Any simple server and client code, or a link to it, would be helpful.
Here is my non-blocking pipe code, which doesn't work for me.
I assume it's because I don't have a reference to the pipe from one app to the other.
Code sourced from here: https://www.geeksforgeeks.org/non-blocking-io-with-pipes-in-c/
char* msg1 = "hello";
char* msg2 = "bye !!";
int p[2], i;
bool InitClient()
{
// error checking for pipe
if(pipe(p) < 0)
exit(1);
// error checking for fcntl
if(fcntl(p[0], F_SETFL, O_NONBLOCK) < 0)
exit(2);
//Read
int nread;
char buf[MSGSIZE];
// write link
close(p[1]);
while (1) {
// read call if return -1 then pipe is
// empty because of fcntl
nread = read(p[0], buf, MSGSIZE);
switch (nread) {
case -1:
// case -1 means pipe is empty and errono
// set EAGAIN
if(errno == EAGAIN) {
printf("(pipe empty)\n");
sleep(1);
break;
}
default:
// text read
// by default return no. of bytes
// which read call read at that time
printf("MSG = % s\n", buf);
}
}
return true;
}
bool InitServer()
{
// error checking for pipe
if(pipe(p) < 0)
exit(1);
// error checking for fcntl
if(fcntl(p[0], F_SETFL, O_NONBLOCK) < 0)
exit(2);
//Write
// read link
close(p[0]);
// write 3 times "hello" in 3 second interval
for(i = 0 ; i < 3000000000 ; i++) {
write(p[0], msg1, MSGSIZE);
sleep(3);
}
// write "bye" one times
write(p[0], msg2, MSGSIZE);
return true;
}

Please consider ZeroMQ
https://zeromq.org/
It is lightweight and has wrappers for all major programming languages.
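If pulling in a library is not an option, the asker's own suspicion is worth following up: pipe() creates an anonymous pipe that only the calling process (and its forked children) can see, so two independently started apps never share it. A named FIFO gives both apps a path to open. The following is only a minimal sketch under some assumptions: the function names and the FIFO path are made up for illustration, and opening a FIFO with O_RDWR is Linux behaviour that POSIX leaves unspecified. The receiver keeps its own write end open, so a sender that exits or restarts shows up as "no data" (EAGAIN) rather than end-of-file, and neither side blocks.

```cpp
#include <cerrno>
#include <string>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Receiver side: create the FIFO if it does not exist yet and open it
// non-blocking. O_RDWR (a Linux-specific choice, assumed here) keeps a
// write end open in this process, so read() never reports EOF just
// because the sender went away.
int OpenFifoReader(const char* path) {
    if (mkfifo(path, 0600) < 0 && errno != EEXIST) return -1;
    return open(path, O_RDWR | O_NONBLOCK);
}

// Sender side: a non-blocking open fails with ENXIO while no receiver
// has the FIFO open, instead of hanging.
int OpenFifoWriter(const char* path) {
    return open(path, O_WRONLY | O_NONBLOCK);
}

// Writes the whole message in one call; writes of up to PIPE_BUF bytes
// to a pipe are atomic, which covers a 40-character message.
bool SendMessage(int fd, const std::string& msg) {
    ssize_t n = write(fd, msg.data(), msg.size());
    return n == static_cast<ssize_t>(msg.size());
}

// Returns true and fills `out` if data was waiting, false if the FIFO
// was empty (EAGAIN) or an error occurred.
bool TryReceive(int fd, std::string& out) {
    char buf[64];
    ssize_t n = read(fd, buf, sizeof buf);
    if (n <= 0) return false;
    out.assign(buf, static_cast<size_t>(n));
    return true;
}
```

The receiver would call OpenFifoReader once and poll TryReceive; the sender would retry OpenFifoWriter until it succeeds. A FIFO is a byte stream, so fixed-size messages (such as the 40 characters here) or a delimiter are needed to tell messages apart, and the sender should ignore SIGPIPE in case the receiver exits.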

Related

C++ + linux handle SIGPIPE signal

Yes, I understand this issue has been discussed many times.
And yes, I've seen and read these and other discussions (1, 2, 3),
and I still can't fix my code myself.
I am writing my own web server. In the following loop, it listens on a socket, accepts each new client and stores it in a vector.
In my class I have this struct:
struct Connection
{
int socket;
std::chrono::system_clock::time_point tp;
std::string request;
};
with the following data structures:
std::mutex connected_clients_mux_;
std::vector<HttpServer::Connection> connected_clients_;
and the loop itself:
//...
bind (listen_socket_, (struct sockaddr *)&addr_, sizeof(addr_));
listen(listen_socket_, 4 );
while(1){
connection_socket_ = accept(listen_socket_, NULL, NULL);
//...
Connection connection_;
//...
connected_clients_mux_.lock();
this->connected_clients_.push_back(connection_);
connected_clients_mux_.unlock();
}
It works: clients connect, send and receive requests.
But the problem is that if the connection is broken (the client is killed with ^C), my program does not know about it, even at the moment of writing:
void SendRespons(HttpServer::Connection socket_){
write(socket_.socket,( socket_.request + std::to_string(socket_.socket)).c_str(), 1024);
}
As the title of this question suggests, my app receives a SIGPIPE signal.
Again, I have seen "solutions":
signal(SIGPIPE, &SigPipeHandler);
void SigPipeHandler(int s) {
//printf("Caught SIGPIPE\n%d",s);
}
But it does not help. At this moment we know the number (file descriptor) of the socket that was written to; is it possible to "remember" it and close this particular connection in the handler?
My system:
Operating System: Ubuntu 20.04.2 LTS
Kernel: Linux 5.8.0-43-generic
g++ --version
g++ (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
As stated in the links you give, the solution is to ignore SIGPIPE, and CHECK THE RETURN VALUE of the write calls. This latter is needed for correct operation (short writes) in all but the most trivial, unloaded cases anyways. Also the fixed write size of 1024 that you are using is probably not what you want -- if your response string is shorter, you'll send a bunch of random garbage along with it. You probably really want something like:
void SendRespons(HttpServer::Connection socket_){
    auto data = socket_.request + std::to_string(socket_.socket);
    size_t sent = 0;   // unsigned, to match data.size()
    while (sent < data.size()) {
        ssize_t len = write(socket_.socket, &data[sent], data.size() - sent);
        if (len < 0) {
            // there was an error -- might be EPIPE or EAGAIN or EINTR or even a few other
            // obscure corner cases. For EAGAIN or EINTR (which can only happen if your
            // program is set up to allow them), you probably want to try again.
            // Anything else, probably just close the socket and clean up.
            if (errno == EINTR)
                continue;
            close(socket_.socket);
            // should tell someone about it?
            break;
        }
        sent += len;
    }
}
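To see the behaviour this answer relies on in isolation, here is a minimal sketch: once SIGPIPE is ignored, write() to a socket whose peer has closed returns -1 with errno set to EPIPE instead of killing the process. The function name WriteToClosedPeer is made up for illustration, and socketpair() is only a stand-in for a real client connection.

```cpp
#include <cerrno>
#include <csignal>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Writes to a socket whose peer has already been closed and returns the
// errno that write() reported (0 if the write unexpectedly succeeded,
// -1 if the socket pair could not be created).
int WriteToClosedPeer() {
    // Process-wide: a broken connection no longer raises a fatal signal.
    signal(SIGPIPE, SIG_IGN);

    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return -1;

    close(sv[1]);                       // simulate the client going away
    ssize_t n = write(sv[0], "x", 1);   // would raise SIGPIPE by default
    int err = (n < 0) ? errno : 0;

    close(sv[0]);
    return err;
}
```

In the server, the same EPIPE result is what tells SendRespons which connection is dead: the file descriptor is already at hand at the call site, so nothing has to be remembered inside a signal handler.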

valgrind/helgrind gets killed on stress test

I'm making a web server on Linux in C++ with pthreads. I tested it with valgrind for leaks and memory problems: all fixed. I tested it with helgrind for thread problems: all fixed. Now I'm trying a stress test, and I get a problem when the program is run with helgrind:
valgrind --tool=helgrind ./chats
It just dies at random places with the text "Killed", as it would if I killed it with kill -9. The only report I sometimes get from helgrind is that the program exits while still holding some locks, which is normal when it gets killed.
When checking for leaks:
valgrind --leak-check=full ./chats
it's more stable, but I managed to make it die once with a few hundred concurrent connections.
I tried running the program alone and couldn't make it crash at all. I tried up to 250 concurrent connections. Each thread delays by 100 ms to make it easier to have multiple connections at the same time. No crash.
In all cases the number of threads and connections does not get above 10, and I see it crash even with 2 connections, but never with only one connection at a time (which, including the main thread and one helper thread, makes 3 threads in total).
Is it possible that the problem only happens when run with helgrind, or does helgrind just make it more likely to show?
What could be the reason that a program gets killed (by the kernel?): allocating too much memory, too many file descriptors?
I tested a bit more and I found out that it only dies when the client times out and closes the connection. So here is the code which detects that the client closed the socket:
void *TcpClient::run(){
int ret;
struct timeval tv;
char * buff = (char *)malloc(10001);
int br;
colorPrintf(TC_GREEN, "new client starting: %d\n", sockFd);
while(isRunning()){
tv.tv_sec = 0;
tv.tv_usec = 500*1000;
FD_SET(sockFd, &readFds);
ret = select(sockFd+1, &readFds, NULL, NULL, &tv);
if(ret < 0){
//select error
continue;
}else if(ret == 0){
// no data to read
continue;
}
br = read(sockFd, buff, 10000);
buff[br] = 0;
if (br == 0){
// client disconnected;
setRunning(false);
break;
}
if (reader != NULL){
reader->tcpRead(this, std::string(buff, br));
}else{
readBuffer.append(buff, br);
}
//printf("received: %s\n", buff);
}
free(buff);
sendFeedback((void *)1);
colorPrintf(TC_RED, "closing client socket: %d\n", sockFd);
::close(sockFd);
sockFd = -1;
return NULL;
}
// this method writes to socket
bool TcpClient::write(std::string data){
int bw;
int dataLen = data.length();
bw = ::write(sockFd, data.data(), dataLen);
if (bw != dataLen){
return false; // I don't close the socket in this case, maybe I should
}
return true;
}
P.S. Threads are:
main thread. connections are accepted here.
one helper thread which listens for signals and sends signals. It blocks signal delivery for the app and manually polls the signal queue. The reason is that it's hard to handle signals when using threads. I found this technique here on Stack Overflow and it seems to work pretty well in other projects.
client connection threads
The full code is pretty big, but I can post chunks if someone is interested.
Update:
I managed to trigger the problem with only one connection. It's all happening in client thread. This is what I do:
I read/parse headers. I put delay before writing so the client can timeout (which causes the problem).
Here the client times out and leaves (probably closes the socket)
I write back headers
I write back the html code.
Here is how I write back
bw = ::write(sockFd, data.data(), dataLen);
// bw is = dataLen = 108 when writing the headers
//then secondary write for HTML kills the program. there is a message before and after write()
bw = ::write(sockFd, data.data(), dataLen); // doesn't go past this point second time
Update 2: Got it :)
gdb says:
Program received signal SIGPIPE, Broken pipe.
[Switching to Thread 0x41401940 (LWP 10554)]
0x0000003ac2e0d89b in write () from /lib64/libpthread.so.0
Question 1: What should I do to avoid receiving this signal?
Question 2: How do I know that the remote side disconnected while writing? On read, select reports that there is data but the read returns 0 bytes. How about write?
Well, I just had to handle the SIGPIPE signal; write then returned -1, so I close the socket and quit the thread gracefully. Works like a charm.
I guess the easiest way is to set signal handler of SIGPIPE to SIG_IGN:
signal(SIGPIPE, SIG_IGN);
Note that first write was successful and didn't kill the program. If you have similar problem check if you are writing once or multiple times. If you are not familiar with gdb this is how to do it:
gdb ./your-program
> run
and gdb will tell you all about signals and segfaults.
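As an alternative to changing the SIGPIPE disposition for the whole process, Linux also lets you suppress the signal per call by using send() with the MSG_NOSIGNAL flag instead of write(). This is not what the code above does; it is a sketch of a different technique, and the helper name SendAll is made up for illustration. It also answers Question 2: a disconnect during writing shows up as a -1 return with errno set to EPIPE (or ECONNRESET on TCP).

```cpp
#include <cerrno>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Sends the whole string, retrying on short writes and EINTR.
// Returns false if the peer has gone away or another error occurred;
// errno then holds the reason (EPIPE for a closed peer).
bool SendAll(int fd, const std::string& data) {
    size_t sent = 0;
    while (sent < data.size()) {
        // MSG_NOSIGNAL: report a broken connection via errno instead
        // of raising SIGPIPE for this call.
        ssize_t n = send(fd, data.data() + sent, data.size() - sent,
                         MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, try again
            return false;                  // caller closes the socket
        }
        sent += static_cast<size_t>(n);
    }
    return true;
}
```

With this, TcpClient::write could return false on a dead connection and let the client thread close the socket and exit, without any signal handling involved.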

Linux Pipe replace stdio - issues with MPI

This question is the next step after resolving the issue discussed in:
Piping for input/output
I use pipes to pass a string via stdin to an external program called GULP, and receive the stdout of GULP as input for my program. This works fine on one processor, but on two or more processors there's a problem (let's say it's just 2 cores). The program GULP uses a temporary file and it seems that the two processors launch GULP simultaneously and then GULP tries to perform multiple operations on the same file at the same time (maybe simultaneous writes). GULP reports "error opening file".
I am testing this code on a laptop with multiple cores running Ubuntu, but the code is intended for a distributed-memory HPC (I'm using OpenMPI). Assume for the sake of this discussion that I cannot modify GULP.
I'm hoping that there's some straightforward way to get GULP to create two independent temporary files and continue functioning as normal. Am I asking for too much?
Hopefully this pseudo code will help (assume 2 processors):
int main()
{
MPI_Init(&argc,&argv);
MPI_Comm_rank(…);
MPI_Comm_size(…);
int loopmin, loopmax;//distributes the loop among each processor
for (int i = loopmin; i < loopmax; i++)
{
Launch_GULP(…);//launches external program
}
return 0;
}
Launch_GULP(…)
{
int fd_p2c[2], fd_c2p[2];
pipe(fd_p2c);
pipe(fd_c2p);
childpid = fork();
//the rest follows as in accepted answer in above link
//so i'll highlight the interesting stuff
if (childpid < 0)
{
perror("bad");
exit(-1);
}
else if (childpid == 0)
{
//call dup2, etc
execl( …call the program… );
}
else
{
//the interesting stuff
close(fd_p2c[0]);
close(fd_c2p[1]);
write(fd_p2c[1],…);
close(fd_p2c[1]);
while(1)
{
bytes_read = read(fd_c2p[0],…);//read GULP output
if (bytes_read <= 0)
break;
//pass info to read buffer & append null terminator
}
close(fd_c2p[0]);
if(kill(childpid,SIGTERM) != 0)
{
perror("Failed to kill child… tragic");
exit(1);
}
waitpid(childpid, NULL, 0);
}
//end piping… GULP has reported an error via stdout
//that error is stored in the buffer string
//consequently an error is triggered in my code and the program exits
}

Read Write issues with Pseudo Terminal in Linux

I am writing a C++ program that interacts with an external process. The external process is written in C# and runs on mono. Note that I cannot modify the C# code, as it is not a program written by me.
I first set out using pipes, but as I later realized, output to a pipe is fully buffered, and hence I faced a lot of sync issues. Essentially the external process would have to flush its output after every write, and this was not possible.
The next thing I was about to try was files, but I found out that pseudo-terminals would be more apt in my case. Here is some sample code that I have written:
int main()
{
int fdm, fds, rc, pid;
bool rValue;
/* Setup Master pty*/
rValue = rValue && (fdm = posix_openpt(O_RDWR)) >= 0 &&
(rc = grantpt(fdm)) == 0 && (rc = unlockpt(fdm) == 0);
if (rValue) {
/* Open Slave pty */
fds = open(ptsname(fdm), O_RDWR);
pid = fork();
if(pid < 0)
perror("fork failed");
else if(pid == 0) //child
{
close(fdm); //close master
struct termios slave_orig_term_settings;
struct termios new_term_settings;
tcgetattr(slaveTTY, &slave_orig_term_settings);
new_term_settings = slave_orig_term_settings;
cfmakeraw(&new_term_settings);
tcsetattr(slaveTTY, TCSANOW, &new_term_settings);
//redirect I/O of this process
close(0);
close(1);
close(2);
dup(slaveTTY);
dup(slaveTTY);
dup(slaveTTY);
close(slaveTTY);
setsid();
ioctl(0, TIOCSCTTY, 1);
//launch the external process and replace its image in this process
execve(argv[0],...);
}
else
{
close(fds); //close slave
//Perform some interaction
write(something using fdm);
//Assume fdsets declared and set somewhere here
select(fdm +1,&fdset,NULL,NULL,NULL);
int readBytes = read(someting using fds);
}
}
return EXIT_SUCCESS;
}
Assume that the fdset and fdclr for select are being taken care of.
The following issues are being observed in the parent process:
Sometimes read returns with readBytes > 0 but there is nothing present in the buffer
Sometimes whatever has been written to the terminal is read back
Some garbage values such as ^]]49]1R are being dumped on the terminal (this is the actual terminal i.e. my output window)
P.S.: When the external process is written in C/C++, this issue does not occur. It only happens when I run a C# program in mono.
I think pexpect in Python is a good choice if you don't have to do this in C++; it will save you a lot of time. You can also use Python freeze tools like PyInstaller to convert your Python script to a standalone binary.

waitpid/wexitstatus returning 0 instead of correct return code

I have the helper function below, used to execute a command and get the return value on posix systems. I used to use popen, but it is impossible to get the return code of an application with popen if it runs and exits before popen/pclose gets a chance to do its work.
The following helper function creates a process fork, uses execvp to run the desired external process, and then the parent uses waitpid to get the return code. I'm seeing odd cases where it's refusing to run.
When called with wait = true, waitpid should return the exit code of the application no matter what. However, I'm seeing stdout output indicating that the return code should be non-zero, yet the return code is zero. Running the external process in a regular shell and then echoing $? gives non-zero, so it's not a problem with the external process not returning the right code. If it's of any help, the external process being run is mount(8) (yes, I know I can use mount(2), but that's beside the point).
I apologize in advance for a code dump. Most of it is debugging/logging:
inline int ForkAndRun(const std::string &command, const std::vector<std::string> &args, bool wait = false, std::string *output = NULL)
{
std::string debug;
std::vector<char*> argv;
for(size_t i = 0; i < args.size(); ++i)
{
argv.push_back(const_cast<char*>(args[i].c_str()));
debug += "\"";
debug += args[i];
debug += "\" ";
}
argv.push_back((char*)NULL);
neosmart::logger.Debug("Executing %s", debug.c_str());
int pipefd[2];
if (pipe(pipefd) != 0)
{
neosmart::logger.Error("Failed to create pipe descriptor when trying to launch %s", debug.c_str());
return EXIT_FAILURE;
}
pid_t pid = fork();
if (pid == 0)
{
close(pipefd[STDIN_FILENO]); //child isn't going to be reading
dup2(pipefd[STDOUT_FILENO], STDOUT_FILENO);
close(pipefd[STDOUT_FILENO]); //now that it's been dup2'd
dup2(pipefd[STDOUT_FILENO], STDERR_FILENO);
if (execvp(command.c_str(), &argv[0]) != 0)
{
exit(EXIT_FAILURE);
}
return 0;
}
else if (pid < 0)
{
neosmart::logger.Error("Failed to fork when trying to launch %s", debug.c_str());
return EXIT_FAILURE;
}
else
{
close(pipefd[STDOUT_FILENO]);
int exitCode = 0;
if (wait)
{
waitpid(pid, &exitCode, wait ? __WALL : (WNOHANG | WUNTRACED));
std::string result;
char buffer[128];
ssize_t bytesRead;
while ((bytesRead = read(pipefd[STDIN_FILENO], buffer, sizeof(buffer)-1)) != 0)
{
buffer[bytesRead] = '\0';
result += buffer;
}
if (wait)
{
if ((WIFEXITED(exitCode)) == 0)
{
neosmart::logger.Error("Failed to run command %s", debug.c_str());
neosmart::logger.Info("Output:\n%s", result.c_str());
}
else
{
neosmart::logger.Debug("Output:\n%s", result.c_str());
exitCode = WEXITSTATUS(exitCode);
if (exitCode != 0)
{
neosmart::logger.Info("Return code %d", (exitCode));
}
}
}
if (output)
{
result.swap(*output);
}
}
close(pipefd[STDIN_FILENO]);
return exitCode;
}
}
Note that the command is run OK with the correct parameters, the function proceeds without any problems, and WIFEXITED returns TRUE. However, WEXITSTATUS returns 0, when it should be returning something else.
Probably isn't your main issue, but I think I see a small problem. In your child process, you have...
dup2(pipefd[STDOUT_FILENO], STDOUT_FILENO);
close(pipefd[STDOUT_FILENO]); //now that it's been dup2'd
dup2(pipefd[STDOUT_FILENO], STDERR_FILENO); //but wait, this pipe is closed!
But I think what you want is:
dup2(pipefd[STDOUT_FILENO], STDOUT_FILENO);
dup2(pipefd[STDOUT_FILENO], STDERR_FILENO);
close(pipefd[STDOUT_FILENO]); //now that it's been dup2'd for both, can close
I don't have much experience with forks and pipes in Linux, but I did write a similar function pretty recently. You can take a look at the code to compare, if you'd like. I know that my function works.
execAndRedirect.cpp
I'm using the mongoose library, and grepping my code for SIGCHLD revealed that using mg_start from mongoose results in setting SIGCHLD to SIG_IGN.
From the waitpid man page, on Linux a SIGCHLD set to SIG_IGN will not create a zombie process, so waitpid will fail if the process has already successfully run and exited - but will run OK if it hasn't yet. This was the cause of the sporadic failure of my code.
Simply re-setting SIGCHLD after calling mg_start to a void function that does absolutely nothing was enough to keep the zombie records from being immediately erased.
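The effect is easy to reproduce in isolation. In the sketch below (the function name RunChildAndWait is made up for illustration), the same fork/waitpid sequence is run twice: with SIGCHLD set to SIG_IGN, Linux reaps the child automatically and waitpid fails with ECHILD; with the default disposition restored, the exit status is available as expected.

```cpp
#include <cerrno>
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that immediately exits with `code`, then waits for it.
// Returns the child's exit status, -errno if fork/waitpid failed, or
// -1000 if the child did not exit normally.
int RunChildAndWait(int code) {
    pid_t pid = fork();
    if (pid < 0) return -errno;
    if (pid == 0) _exit(code);          // child: exit without cleanup

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -errno;                  // ECHILD when SIGCHLD is ignored
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1000;
}
```

Calling signal(SIGCHLD, SIG_IGN) before RunChildAndWait(3) yields -ECHILD, while signal(SIGCHLD, SIG_DFL) yields 3. Restoring SIG_DFL after mg_start should therefore work as well as installing an empty handler.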
Per @Geoff_Montee's advice, there was a bug in my redirect of STDERR, but this was not responsible for the problem as execvp does not store the return value in STDERR or even STDOUT, but rather in the kernel object associated with the parent process (the zombie record).
@jilles' warning about non-contiguity of vector in C++ does not apply for C++03 and up (only valid for C++98, though in practice, most C++98 compilers did use contiguous storage, anyway) and was not related to this issue. However, the advice on reading from the pipe before blocking and checking the output of waitpid is spot-on.
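For reference, the ordering recommended above (drain the pipe until EOF, only then block in waitpid, and check its return value) can be sketched as follows. This is a simplified illustration, not the original ForkAndRun: the name RunAndCapture is made up, logging is omitted, and it also applies the corrected dup2/close order for stderr. Waiting first can deadlock if the child fills the pipe buffer before exiting.

```cpp
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs args[0] with the given arguments, appends its stdout and stderr
// to `output`, and returns its exit code (or -1 on any failure).
int RunAndCapture(const std::vector<std::string>& args, std::string& output) {
    std::vector<char*> argv;
    for (const auto& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    int pipefd[2];
    if (pipe(pipefd) != 0) return -1;

    pid_t pid = fork();
    if (pid < 0) { close(pipefd[0]); close(pipefd[1]); return -1; }
    if (pid == 0) {
        close(pipefd[0]);                 // child does not read
        dup2(pipefd[1], STDOUT_FILENO);
        dup2(pipefd[1], STDERR_FILENO);
        close(pipefd[1]);                 // only after both dup2 calls
        execvp(argv[0], argv.data());
        _exit(127);                       // reached only if exec failed
    }

    close(pipefd[1]);                     // parent does not write
    char buf[256];
    ssize_t n;
    // Read until EOF first, so the child can never block on a full pipe.
    while ((n = read(pipefd[0], buf, sizeof buf)) > 0)
        output.append(buf, static_cast<size_t>(n));
    close(pipefd[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;   // e.g. ECHILD
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, RunAndCapture({"sh", "-c", "echo hi; exit 7"}, out) returns 7 and leaves "hi\n" in out, provided SIGCHLD is not set to SIG_IGN.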
I've found that pclose does NOT block and wait for the process to end, contrary to the documentation (this is on CentOS 6). I've found that I need to call pclose and then call waitpid(pid,&status,0); to get the true return value.