Is async_connect really asynchronous under GNU/Linux? - c++

I wanted to check whether Boost Asio really performs an asynchronous connect. According to the diagrams for asynchronous calls published in Asio's basics, the operation is started when the io_service signals the operating system, so I understand that the system attempts the connection right after the async_connect call is executed.
My understanding is that if you don't call run you just miss the results, but the operations are (probably) completed. To test that, I created a dummy server with nc -l -p 9000 and ran the code attached below.
With a debugger, I have gone statement by statement and I have stopped right before the run call to the io_service. In the server, nothing happens. Nothing about the connect —this is obvious as the dummy server does not tell you about it— nor the async_write.
Immediately after calling the run function the corresponding messages pop up on the server side. I have been asking on Boost's IRC channel and after showing my strace, a really clever person told me that it probably is because the socket is not ready until run is called. Apparently, this does not happen on Windows.
Does this mean that asynchronous is not really asynchronous under GNU/Linux operating systems? Does this mean that the diagram shown on the website does not apply to GNU/Linux environments?
NOTE about "is not really asynchronous": Yes, it does not block the call and therefore the thread keeps running and doing things, but I mean asynchronous by starting the operations right after they have been executed.
Thank you very much indeed in advance.
The code
#include <iostream>
#include <string>
#include <boost/asio.hpp>
#include <boost/bind.hpp>

void connect_handler(const boost::system::error_code& error)
{
    if (error)
        std::cout << error.message() << std::endl;
    else
        std::cout << "Successfully connected!" << std::endl;
}

void write_handler(const boost::system::error_code& error)
{
    if (error)
        std::cout << error.message() << std::endl;
    else
        std::cout << "Yes, we wrote!" << std::endl;
}

int main()
{
    boost::asio::io_service io_service;
    boost::asio::ip::tcp::socket socket(io_service);
    boost::asio::ip::tcp::endpoint endpoint(
        boost::asio::ip::address::from_string(""), 9000);
    socket.async_connect(endpoint, connect_handler);
    std::string hello_world("Hello World!");
    boost::asio::async_write(socket, boost::asio::buffer(hello_world.c_str(),
        hello_world.size()), boost::bind(write_handler,
        boost::asio::placeholders::error));
    io_service.run();
    return 0;
}
My strace
futex(0x7f44cd0ca03c, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0x7f44cd0ca048, FUTEX_WAKE_PRIVATE, 2147483647) = 0
eventfd2(0, O_NONBLOCK|O_CLOEXEC) = 3
epoll_create1(EPOLL_CLOEXEC) = 4
timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC) = 5
epoll_ctl(4, EPOLL_CTL_ADD, 3, {EPOLLIN|EPOLLERR|EPOLLET, {u32=37842600, u64=37842600}}) = 0
write(3, "\1\0\0\0\0\0\0\0", 8) = 8
epoll_ctl(4, EPOLL_CTL_ADD, 5, {EPOLLIN|EPOLLERR, {u32=37842612, u64=37842612}}) = 0
epoll_ctl(4, EPOLL_CTL_ADD, 6, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP|EPOLLET, {u32=37842704, u64=37842704}}) = 0
ioctl(6, FIONBIO, [1]) = 0
connect(6, {sa_family=AF_INET, sin_port=htons(9000), sin_addr=inet_addr("")}, 16) = -1 EINPROGRESS (Operation now in progress)
epoll_ctl(4, EPOLL_CTL_MOD, 6, {EPOLLIN|EPOLLPRI|EPOLLOUT|EPOLLERR|EPOLLHUP|EPOLLET, {u32=37842704, u64=37842704}}) = 0
epoll_wait(4, {{EPOLLIN, {u32=37842600, u64=37842600}}, {EPOLLOUT, {u32=37842704, u64=37842704}}}, 128, -1) = 2
poll([{fd=6, events=POLLOUT}], 1, 0) = 1 ([{fd=6, revents=POLLOUT}])
getsockopt(6, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
sendmsg(6, {msg_name(0)=NULL, msg_iov(1)=[{"Hello World!", 12}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 12
fstat(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 5), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f44cd933000
write(1, "Successfully connected!\n", 24Successfully connected!
) = 24
epoll_wait(4, {}, 128, 0) = 0
write(1, "Yes, we wrote!\n", 15Yes, we wrote!
) = 15
exit_group(0) = ?
+++ exited with 0 +++

A weird thing about boost.asio -- not unique to it, but generally different from OS-specific asynchronous networking frameworks -- is that it does not rely on auxiliary threads. That has a number of important consequences, which can be boiled down to this: boost.asio is NOT for doing things in the background. Rather, it is for doing multiple things in the foreground.
io_service::run() is the "hub" of boost.asio, and a single-threaded program using boost.asio should expect to spend most of its time waiting inside io_service::run(), or executing a completion handler called by it. Depending on the OS-specific internal implementation, it might or might not be possible to have particular asynchronous operations running before that function gets called, which is why calling it is basically the first thing you do once you've kicked off your initial asynchronous requests.
Think of async_connect as "arming" your io_service with an asynchronous operation. The actual asynchrony happens during io_service::run(). In fact, calling async_connect followed immediately by async_write is a weird thing to do, and I'm a little surprised that it works. Ordinarily, you'd execute (or, rather, "arm") the async_write from within the connect_handler, because it's only at that point that you have a connected socket.
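For reference, the system calls in the strace above (FIONBIO, connect returning EINPROGRESS, a wait for writability, then getsockopt with SO_ERROR) are the usual non-blocking connect sequence on Linux. The following is a minimal POSIX sketch of that sequence, not Asio's actual implementation. make_listener and nonblocking_connect are hypothetical helper names, the loopback address and ephemeral port are assumptions made so the example is self-contained, and poll is used where the trace shows epoll, for brevity.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: opens a loopback listener on a kernel-chosen port.
// Returns the listening fd (or -1) and stores the port in *port.
int make_listener(unsigned short* port)
{
    assert(port != nullptr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  // let the kernel pick a free port
    socklen_t len = sizeof(addr);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), len) != 0 ||
        listen(fd, 1) != 0 ||
        getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) != 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

// Hypothetical helper: starts a non-blocking connect to 127.0.0.1:port and
// then waits for its completion. Returns 0 on success, else an errno value.
int nonblocking_connect(unsigned short port, int timeout_ms)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return errno;
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    int result = 0;
    int rc = connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    if (rc != 0 && errno == EINPROGRESS) {
        // The kernel has started the handshake; the outcome is only learned
        // by waiting for writability and then reading SO_ERROR.
        pollfd p{fd, POLLOUT, 0};
        if (poll(&p, 1, timeout_ms) == 1) {
            socklen_t len = sizeof(result);
            if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &result, &len) != 0)
                result = errno;
        } else {
            result = ETIMEDOUT;
        }
    } else if (rc != 0) {
        result = errno;
    }
    close(fd);
    return result;
}
```

In this sketch connect() only starts the handshake; the result is picked up later by whichever loop waits on the descriptor, which is the part that io_service::run() takes care of in an Asio program.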


ZeroMQ IPC across several instances of a program

I am having some problems with inter process communication in ZMQ between several instances of a program
I am using Linux OS
I am using zeromq/cppzmq, header-only C++ binding for libzmq
If I run two instances of this application (say, each in its own terminal), passing one an argument that makes it a listener and the other an argument that makes it a sender, the listener never receives a message. I have tried TCP and IPC to no avail.
#include <zmq.hpp>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <string>
#include <iostream>
#include <vector>

int ListenMessage();
int SendMessage(std::string str);

zmq::context_t global_zmq_context(1);

int main(int argc, char* argv[] ) {
    std::string str = "Hello World";
    if (atoi(argv[1]) == 0) ListenMessage();
    else SendMessage(str);
    zmq_ctx_destroy(& global_zmq_context);
    return 0;
}

int SendMessage(std::string str) {
    std::cout << "Sending \n";
    zmq::socket_t publisher(global_zmq_context, ZMQ_PUB);
    int linger = 0;
    int rc = zmq_setsockopt(publisher, ZMQ_LINGER, &linger, sizeof(linger));
    rc = zmq_connect(publisher, "tcp://");
    if (rc == -1) {
        printf ("E: connect failed: %s\n", strerror (errno));
        return -1;
    }
    zmq::message_t message(static_cast<const void*> (str.data()), str.size());
    rc = publisher.send(message);
    if (rc == -1) {
        printf ("E: send failed: %s\n", strerror (errno));
        return -1;
    }
    return 0;
}

int ListenMessage() {
    std::cout << "Listening \n";
    zmq::socket_t subscriber(global_zmq_context, ZMQ_SUB);
    int rc = zmq_setsockopt(subscriber, ZMQ_SUBSCRIBE, "", 0);
    int linger = 0;
    rc = zmq_setsockopt(subscriber, ZMQ_LINGER, &linger, sizeof(linger));
    rc = zmq_bind(subscriber, "tcp://");
    if (rc == -1) {
        printf ("E: bind failed: %s\n", strerror (errno));
        return -1;
    }
    std::vector<zmq::pollitem_t> p = {{subscriber, 0, ZMQ_POLLIN, 0}};
    while (true) {
        zmq::message_t rx_msg;
        // when timeout (the third argument here) is -1,
        // then block until ready to receive
        std::cout << "Still Listening before poll \n";
        zmq::poll(p.data(), 1, -1);
        std::cout << "Found an item \n"; // not reaching
        if (p[0].revents & ZMQ_POLLIN) {
            // received something on the first (only) socket
            std::string rx_str;
            rx_str.assign(static_cast<char *>(rx_msg.data()), rx_msg.size());
            std::cout << "Received: " << rx_str << std::endl;
        }
    }
    return 0;
}
This code works if I run one instance of the program with two threads:
std::thread t_sub(ListenMessage);
sleep(1); // Slow joiner in ZMQ PUB/SUB pattern
std::thread t_pub(SendMessage, str);
But I am wondering why the code above won't work when running two instances of the program.
Thanks for your help!
In case one has never worked with ZeroMQ, one may enjoy first looking at "ZeroMQ Principles in less than Five Seconds" before diving into further details.
Q : wondering why when running two instances of the program the code above won't work?
This code will never fly, and it has nothing to do with thread-based or process-based [CONCURRENT] processing.
It was caused by a wrong design of the Inter Process Communication.
ZeroMQ can provide for this either one of the supported transport-classes :{ ipc:// | tipc:// | tcp:// | norm:// | pgm:// | epgm:// | vmci:// } plus having even smarter one for in-process comms, an inproc:// transport-class ready for inter-thread comms, where a stack-less communication may enjoy the lowest ever latency, being just a memory-mapped policy.
The selection of L3/L2-based networking stack for an Inter-Process-Communication is possible, yet sort of the most "expensive" option.
The Core Mistake :
Given that choice, any single process ( not speaking about a pair of processes ) will collide on an attempt to .bind() its AccessPoint onto the very same TCP/IP-address:port#
The Other Mistake :
Even when a single programme is launched, both of the spawned threads attempt to .bind() their private AccessPoints, yet neither attempts to .connect() to a matching "opposite" AccessPoint.
At least one has to successfully .bind(), and
at least one has to successfully .connect(), so as to get a "channel", here of the PUB/SUB Archetype.
decide about a proper, right-enough Transport-Class ( best avoid an overkill to operate the full L3/L2-stack for localhost/in-process IPC )
refactor the Address:port# management ( for 2+ processes not to fail on .bind()-(s) to the same ( hard-wired ) address:port# )
always detect and handle appropriately the returned {PASS|FAIL}-s from API calls
always set LINGER to zero explicitly ( you never know )

Crash in a modified version of an official ZeroMQ multithreaded example

I'm new to zmq and cppzmq. While trying to run the multithreaded example in the official guide:
My setup
macOS Mojave, Xcode 10.3
libzmq 4.3.2 via Homebrew
cppzmq GitHub HEAD
I hit a few problems.
Problem 1
When running the source code from the Guide, it hangs forever without printing anything to stdout.
Here is the code directly copied from the Guide.
// Multithreaded Hello World server in C
#include <pthread.h>
#include <unistd.h>
#include <cassert>
#include <string>
#include <iostream>
#include <zmq.hpp>

void *worker_routine (void *arg)
{
    zmq::context_t *context = (zmq::context_t *) arg;
    zmq::socket_t socket (*context, ZMQ_REP);
    socket.connect ("inproc://workers");
    while (true) {
        // Wait for next request from client
        zmq::message_t request;
        socket.recv (&request);
        std::cout << "Received request: [" << (char*) request.data () << "]" << std::endl;
        // Do some 'work'
        sleep (1);
        // Send reply back to client
        zmq::message_t reply (6);
        memcpy ((void *) reply.data (), "World", 6);
        socket.send (reply);
    }
    return (NULL);
}

int main ()
{
    // Prepare our context and sockets
    zmq::context_t context (1);
    zmq::socket_t clients (context, ZMQ_ROUTER);
    clients.bind ("tcp://*:5555");
    zmq::socket_t workers (context, ZMQ_DEALER);
    workers.bind ("inproc://workers");
    // Launch pool of worker threads
    for (int thread_nbr = 0; thread_nbr != 5; thread_nbr++) {
        pthread_t worker;
        pthread_create (&worker, NULL, worker_routine, (void *) &context);
    }
    // Connect work threads to client threads via a queue
    zmq::proxy (static_cast<void*>(clients),
                static_cast<void*>(workers),
                nullptr);
    return 0;
}
It crashes soon after I put a breakpoint in the while loop of the worker.
Problem 2
Noticing that the compiler prompted me to replace deprecated API calls, I modified the above sample code to make the warnings disappear.
// Multithreaded Hello World server in C
#include <pthread.h>
#include <unistd.h>
#include <array>
#include <cassert>
#include <string>
#include <iostream>
#include <cstdio>
#include <zmq.hpp>

void *worker_routine (void *arg)
{
    zmq::context_t *context = (zmq::context_t *) arg;
    zmq::socket_t socket (*context, ZMQ_REP);
    socket.connect ("inproc://workers");
    while (true) {
        // Wait for next request from client
        std::array<char, 1024> buf{'\0'};
        zmq::mutable_buffer request(buf.data(), buf.size());
        socket.recv(request, zmq::recv_flags::dontwait);
        std::cout << "Received request: [" << (char*) request.data() << "]" << std::endl;
        // Do some 'work'
        sleep (1);
        // Send reply back to client
        zmq::message_t reply (6);
        memcpy ((void *) reply.data (), "World", 6);
        try {
            socket.send (reply, zmq::send_flags::dontwait);
        }
        catch (zmq::error_t& e) {
            printf("ERROR: %X\n", e.num());
        }
    }
    return (NULL);
}

int main ()
{
    // Prepare our context and sockets
    zmq::context_t context (1);
    zmq::socket_t clients (context, ZMQ_ROUTER);
    clients.bind ("tcp://*:5555"); // who i talk to.
    zmq::socket_t workers (context, ZMQ_DEALER);
    workers.bind ("inproc://workers");
    // Launch pool of worker threads
    for (int thread_nbr = 0; thread_nbr != 5; thread_nbr++) {
        pthread_t worker;
        pthread_create (&worker, NULL, worker_routine, (void *) &context);
    }
    // Connect work threads to client threads via a queue
    zmq::proxy (clients, workers);
    return 0;
}
I'm not pretending to have a literal translation of the original broken example, but it's my effort to make things compile and run without obvious memory errors.
This code keeps giving me error number 0x9523DFB (156384763 in decimal) from the try-catch block. I can't find the definition of the error number in the official docs, but I learned from this question that it's the native ZeroMQ error EFSM:
The zmq_send() operation cannot be performed on this socket at the moment due to the socket not being in the appropriate state. This error may occur with socket types that switch between several states, such as ZMQ_REP.
I'd appreciate it if anyone can point out what I did wrong.
I tried polling according to @user3666197's suggestion, but the program still hangs. Inserting any breakpoint effectively crashes the program, making it difficult to debug.
Here is the new worker code
void *worker_routine (void *arg)
{
    zmq::context_t *context = (zmq::context_t *) arg;
    zmq::socket_t socket (*context, ZMQ_REP);
    socket.connect ("inproc://workers");
    zmq::pollitem_t items[1] = { { socket, 0, ZMQ_POLLIN, 0 } };
    while (true) {
        if(zmq::poll(items, 1, -1) < 1) {
            printf("Terminating worker\n");
            break;
        }
        // Wait for next request from client
        std::array<char, 1024> buf{'\0'};
        socket.recv(zmq::buffer(buf), zmq::recv_flags::none);
        std::cout << "Received request: [" << (char*) buf.data() << "]" << std::endl;
        // Do some 'work'
        sleep (1);
        // Send reply back to client
        zmq::message_t reply (6);
        memcpy ((void *) reply.data (), "World", 6);
        try {
            socket.send (reply, zmq::send_flags::dontwait);
        }
        catch (zmq::error_t& e) {
            printf("ERROR: %s\n", e.what());
        }
    }
    return (NULL);
}
Welcome to the domain of the Zen-of-Zero
Suspect #1: the code jumps straight into an unresolvable live-lock, because it moves the distributed Finite-State-Automaton into an ill-directed state:
While I have always advocated preferring non-blocking .recv()-s, the code above simply commits suicide right by using this step:
socket.recv( request, zmq::recv_flags::dontwait ); // socket being == ZMQ_REP
which kills all chances for any other future life but the very error The zmq_send() operation cannot be performed on this socket at the moment due to the socket not being in the appropriate state.
Going into the .send()-able state is possible if and only if a previous .recv() has delivered a real message.
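To make the state rule concrete, here is a toy model of a REP socket's two states. It is only a sketch written for this answer, not libzmq code: ToyRep and its members are hypothetical names. The two constants are taken from zmq.h (native ZeroMQ errors start at ZMQ_HAUSNUMERO, and EFSM is that base plus 51), which also decodes the number printed in the question.

```cpp
#include <cassert>

// Values as defined in zmq.h.
constexpr int kHausnumero = 156384712;   // ZMQ_HAUSNUMERO
constexpr int kEfsm = kHausnumero + 51;  // EFSM == 156384763 == 0x9523DFB

// Toy model (hypothetical, not libzmq): a REP socket strictly alternates
// between waiting for a request and owing a reply.
class ToyRep {
public:
    // Models recv(). `delivered` says whether a real request was available.
    // A dontwait recv() that finds nothing leaves the state unchanged.
    // (A real REP socket reports EFSM here if a reply is still owed; the toy
    // just returns false in that case.)
    bool recv(bool delivered) {
        if (state_ != State::AwaitRequest) return false;
        if (delivered) state_ = State::AwaitReply;
        return delivered;
    }

    // Models send(): succeeds (0) only when a request is pending,
    // otherwise fails with the EFSM value.
    int send() {
        if (state_ != State::AwaitReply) return kEfsm;
        state_ = State::AwaitRequest;
        return 0;
    }

private:
    enum class State { AwaitRequest, AwaitReply };
    State state_ = State::AwaitRequest;
};
```

In the modified worker, recv with dontwait corresponds to recv(false) whenever no client has sent anything, so the send() that follows fails with EFSM on every iteration.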
The Best Next Step :
Review the code and either use a blocking form of .recv() before going to .send() or, better, use a { blocking | non-blocking } form of .poll( { 0 | timeout }, ZMQ_POLLIN ) before attempting to .recv(), and keep doing other things if there is nothing to receive yet ( so as to avoid throwing the dFSA into an unresolvable collision and flooding your stdout/stderr with a second-spaced flow of printf(" ERROR: %X\n", e.num() ); )
Error Handling :
Better use const char *zmq_strerror ( int errnum ); being fed by int zmq_errno (void);
The Problem 1 :
In contrast to the suicidal ::dontwait flag behind Problem 2, the root cause of Problem 1 is that a blocking form of the first .recv() moves all the worker threads into an indeterminately long, possibly infinite, waiting state: .recv() blocks any further step until a real message arrives ( which, judging from the MCVE, never happens ), so your pool of threads remains in a pool-wide blocked-waiting state and nothing will happen until a message arrives.
Update on how the REQ/REP works :
The REQ/REP Scalable Communication Pattern Archetype works like a distributed pair of people: one, let's call her Mary, asks ( Mary .send()-s the REQ ), while the other one, say Bob the REP, listens in a potentially infinitely long blocking .recv() ( or takes due care, using .poll() to check in an orderly and regular way whether Mary has asked about something, and otherwise continues with his own hobbies or gardening ). Once Bob's end gets a message, Bob can go and .send() Mary a reply ( not before, as he knows nothing about when and what Mary would ( or would not ) ask in the nearer or farther future ). Mary, in turn, is fair enough not to ask her next REQ.send()-question before Bob has replied ( REP.send() ) and Mary has received Bob's message ( REQ.recv() ), which is fairer and more symmetric than real life may exhibit among real people under one roof :o)
The code?
The code is not a reproducible MCVE. The main() creates five Bobs ( hanging, waiting for a call from Mary, somewhere over the inproc:// transport-class ), but no Mary ever calls, or does she? There is no visible sign of any Mary trying to do so, much less of her ( or their: there could be a (even dynamic) community in an N:M herd-of-Mary(s):herd-of-5-Bobs relation ) attempt(s) to handle the REP-ly(s) coming from any of the 5 Bobs.
Persevere. ZeroMQ took me some time of scratching my own head, yet the years since I took due care to learn the Zen-of-Zero are still a rewarding eternal walk in the Gardens of Paradise. No localhost serial-code IDE will ever be able to "debug" a distributed system ( unless a distributed-inspector infrastructure is in place; a proper architecture for a distributed-system monitor/tracer/debugger is another distributed messaging/signaling layer atop the debugged distributed messaging/signaling system ), so do not expect it from a trivial localhost serial-code IDE.
If still in doubt, isolate potential troublemakers: replace inproc:// with tcp://, and if the toys do not work with tcp:// ( where one can wire-line trace the messages ) they won't work with inproc:// memory-zone tricks either.
About the hanging that I saw in my UPDATED question, I finally figured out what's going on. It's a false expectation on my part.
This very sample code in my question is never meant to be a self-contained service/client code: It is a server-only app with ZMQ_REP socket. It just waits for any client code to send request through ZMQ_REQ sockets. So the "hang" that I was seeing is completely normal!
As soon as I hook up a client app to it, things start rolling instantly. This chapter is somewhere in the middle of the Guide and I was only concerned with multithreading so I skipped many code samples and messaging patterns, which led to my confusion.
The code comments even said it's a server, but I expected to see explicit confirmation from the program. So to be fair the lack of visual cue and the compiler deprecation warning caused me to question the sample code as a new user, but the story that the code tells is valid.
Such a shame about the wasted time! But all of a sudden everything @user3666197 says in his answer starts to make sense.
For the completeness of this question, here is the updated server worker code that works:
// server.cpp
void *worker_routine (void *arg)
{
    zmq::context_t *context = (zmq::context_t *) arg;
    zmq::socket_t socket (*context, ZMQ_REP);
    socket.connect ("inproc://workers");
    while (true) {
        // Wait for next request from client
        std::array<char, 1024> buf{'\0'};
        socket.recv(zmq::buffer(buf), zmq::recv_flags::none);
        std::cout << "Received request: [" << (char*) buf.data() << "]" << std::endl;
        // Do some 'work'
        sleep (1);
        // Send reply back to client
        zmq::message_t reply (6);
        memcpy ((void *) reply.data (), "World", 6);
        try {
            socket.send (reply, zmq::send_flags::dontwait);
        }
        catch (zmq::error_t& e) {
            printf("ERROR: %s\n", e.what());
        }
    }
    return (NULL);
}
The much needed client code:
// client.cpp
#include <zmq.h>
#include <stdio.h>

int main (void)
{
    void *context = zmq_ctx_new ();
    // Socket to talk to server
    void *requester = zmq_socket (context, ZMQ_REQ);
    zmq_connect (requester, "tcp://localhost:5555");
    int request_nbr;
    for (request_nbr = 0; request_nbr != 10; request_nbr++) {
        zmq_send (requester, "Hello", 6, 0);
        char buf[6];
        zmq_recv (requester, buf, 6, 0);
        printf ("Received reply %d [%s]\n", request_nbr, buf);
    }
    zmq_close (requester);
    zmq_ctx_destroy (context);
    return 0;
}
The server worker does not have to poll manually because it has been wrapped into the zmq::proxy.

ZeroMQ: Address in use error when re-binding socket

After binding a ZeroMQ socket to an endpoint and closing the socket, binding another socket to the same endpoint requires several attempts. The previous calls to zmq_bind up until the successful one fail with the error "Address in use" (EADDRINUSE).
The following code demonstrates the problem:
#include <cassert>
#include <iostream>
#include "zmq.h"

int main() {
    void *ctx = zmq_ctx_new();
    assert( ctx );
    void *skt;
    skt = zmq_socket( ctx, ZMQ_REP );
    assert( skt );
    assert( zmq_bind( skt, "tcp://*:5555" ) == 0 );
    assert( zmq_close( skt ) == 0 );
    skt = zmq_socket( ctx, ZMQ_REP );
    assert( skt );
    int fail = 0;
    while ( zmq_bind( skt, "tcp://*:5555" ) ) { ++fail; }
    std::cout << fail << std::endl;
}
I'm using ZeroMQ 4.0.3 on Windows XP SP3, compiler is VS 2008. libzmq.dll has been built with the provided Visual Studio solution.
This prints 1 here when doing a "Debug" build (both of the code above and of libzmq.dll) and 0 using a "Release" build. Strangely enough, when running the code above with a mixed build configuration (Debug with Release lib), fail counts up to 6.
Pieter Hintjens gave me the hint on the mailing list:
The call to zmq_close initiates the socket shutdown. This is done in a special "reaper" thread started by ZeroMQ to make the call to zmq_close asynchronous and non-blocking. See "The reaper thread" in a whitepaper about ZeroMQ's architecture.
The code above does not wait for the thread doing the actual work, so the endpoint will not become available immediately.
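Since the endpoint is released asynchronously, one way to cope is to keep the retry loop from the question but bound it and pause between attempts. The following is only a sketch under that assumption; retry_until_ok is a hypothetical helper, not part of ZeroMQ, and the attempt count and delay are arbitrary.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: calls `attempt` until it returns 0 or `max_attempts`
// tries have been made, sleeping `delay` after each failure.
// Returns the number of failed attempts on success, or -1 if all failed.
int retry_until_ok(const std::function<int()>& attempt,
                   int max_attempts,
                   std::chrono::milliseconds delay)
{
    assert(max_attempts > 0);
    for (int failures = 0; failures < max_attempts; ++failures) {
        if (attempt() == 0) return failures;
        std::this_thread::sleep_for(delay);
    }
    return -1;
}
```

With the code from the question, the call could look like retry_until_ok([&] { return zmq_bind(skt, "tcp://*:5555"); }, 50, std::chrono::milliseconds(10)), since zmq_bind returns 0 on success and -1 on failure.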
When a TCP socket is closed, it enters a state called TIME_WAIT. This means that while the socket is in that state, it's not really closed, and that in turn means that the address used by the socket is not available until it leaves the state.
So if you run your program two times in close succession the socket will be in this TIME_WAIT state from the first run when you try the second run, and you get an error like this.
You might want to read more about TCP, and especially about its operation and states.

send() crashes my program

I'm running a server and a client. I'm testing my program on my computer.
This is the function in the server that sends data to the client:
int sendToClient(int fd, string msg) {
    cout << "sending to client " << fd << " " << msg <<endl;
    int len = msg.size()+1;
    cout << "10\n";
    /* send msg size */
    if (send(fd,&len,sizeof(int),0)==-1) {
        cout << "error sendToClient\n";
        return -1;
    }
    cout << "11\n";
    /* send msg */
    int nbytes = send(fd,msg.c_str(),len,0); //CRASHES HERE
    cout << "15\n";
    return nbytes;
}
When the client exits, it sends "BYE" to the server, and the server replies using the function above. I connect the client to the server (on one computer, in two terminals), and when the client exits the server crashes: it never prints the 15.
Any idea why? Any idea how to find out why?
Thank you.
EDIT: this is how i close the client:
void closeClient(int notifyServer = 0) {
    /** notify server before closing */
    if (notifyServer) {
        int len = SERVER_PROTOCOL[bye].size()+1;
        char* buf = new char[len];
        strcpy(buf,SERVER_PROTOCOL[bye].c_str()); //c_str - NEED TO FREE????
        delete[] buf;
    }
    close(_sockfd);
}
By the way, if I skip this code, meaning I just leave the close(_sockfd) without notifying the server, everything is OK: the server doesn't crash.
EDIT 2: this is the end of strace.out:
5211 recv(5, "BYE\0", 4, 0) = 4
5211 write(1, "received from client 5 \n", 24) = 24
5211 write(1, "command: BYE msg: \n", 19) = 19
5211 write(1, "BYEBYE\n", 7) = 7
5211 write(1, "response = ALALA!!!\n", 20) = 20
5211 write(1, "sending to client 5 ALALA!!!\n", 29) = 29
5211 write(1, "10\n", 3) = 3
5211 send(5, "\t\0\0\0", 4, 0) = 4
5211 write(1, "11\n", 3) = 3
5211 send(5, "ALALA!!!\0", 9, 0) = -1 EPIPE (Broken pipe)
5211 --- SIGPIPE (Broken pipe) # 0 (0) ---
5211 +++ killed by SIGPIPE +++
A broken pipe can kill my program?? Why doesn't send() just return -1?
You may want to specify MSG_NOSIGNAL in the flags:
int nbytes = send(fd,msg.c_str(), msg.size(), MSG_NOSIGNAL);
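A small way to see the effect of the flag, as a sketch: the helper below (send_to_closed_peer is a hypothetical name) uses a local socketpair instead of the asker's TCP connection, closes one end, and sends on the other with MSG_NOSIGNAL. On Linux the call then fails with EPIPE instead of killing the process.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Sends on a socket whose peer has already closed, using MSG_NOSIGNAL.
// Returns the errno left by the failed send(), 0 if send() unexpectedly
// succeeded, or -1 if the socket pair could not be created.
int send_to_closed_peer()
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return -1;
    close(fds[1]);  // the "client" goes away

    const char msg[] = "ALALA!!!";
    ssize_t n = send(fds[0], msg, sizeof(msg), MSG_NOSIGNAL);
    int err = (n == -1) ? errno : 0;

    close(fds[0]);
    return err;
}
```

The caller still has to check the return value of send() and treat EPIPE as "the peer is gone".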
You're getting SIGPIPE because of a "feature" in Unix that raises SIGPIPE when trying to send on a socket that the remote peer has closed. Since you don't handle the signal, the default signal-handler is called, and it aborts/crashes your program.
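To get the behavior you want (i.e. make send() return with an error, instead of raising a signal), add this to your program's startup routine (e.g. top of main()):
#include <signal.h>

int main(int argc, char ** argv)
{
    signal(SIGPIPE, SIG_IGN);  // ignore SIGPIPE so that send() fails with EPIPE instead
    // ... rest of main()
}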
Probably the client exits before the server has finished sending, which breaks the connection between them and makes send crash:
"This socket was connected but the connection is now broken. In this case, send generates a SIGPIPE signal first; if that signal is ignored or blocked, or if its handler returns, then send fails with EPIPE."
If the client exits before the second send from the server, and the connection is not disposed of properly, your server keeps hanging and this could provoke the crash.
Just a guess, since we don't know what server and client actually do.
I find the following line of code strange because you define int len = msg.size()+1;.
int nbytes = send(fd,msg.c_str(),len,0); //CRASHES HERE
What happens if you define int len = msg.size();?
If you are on Linux, try to run the server inside strace. This will write lots of useful data to a log file.
strace -f -o strace.out ./server
Then have a look at the end of the log file. Maybe it's obvious what the program did and when it crashed, maybe not. In the latter case: Post the last lines here.

How to correctly read data when using epoll_wait

I am trying to port existing Windows C++ code that uses IOCP to Linux. Having decided to use epoll_wait to achieve high concurrency, I already face a theoretical issue concerning how received data gets processed.
Imagine two threads calling epoll_wait, and two consecutive messages being received such that Linux unblocks the first thread and, soon after, the second.
Example :
Thread 1 blocks on epoll_wait
Thread 2 blocks on epoll_wait
Client sends a chunk of data 1
Thread 1 unblocks from epoll_wait, performs recv and tries to process data
Client sends a chunk of data 2
Thread 2 unblocks, performs recv and tries to process data.
Is this scenario conceivable? I.e. can it occur?
Is there a way to prevent it, so as to avoid implementing synchronization in the recv/processing code?
If you have multiple threads reading from the same set of epoll handles, I would recommend you put your epoll handles in one-shot level-triggered mode with EPOLLONESHOT. This will ensure that, after one thread observes the triggered handle, no other thread will observe it until you use epoll_ctl to re-arm the handle.
If you need to handle read and write paths independently, you may want to completely split up the read and write thread pools; have one epoll handle for read events, and one for write events, and assign threads to one or the other exclusively. Further, have a separate lock for read and for write paths. You must be careful about interactions between the read and write threads as far as modifying any per-socket state, of course.
If you do go with that split approach, you need to put some thought into how to handle socket closures. Most likely you will want an additional shared-data lock, and 'acknowledge closure' flags, set under the shared data lock, for both read and write paths. Read and write threads can then race to acknowledge, and the last one to acknowledge gets to clean up the shared data structures. That is, something like this:
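void OnSocketClosed(shareddatastructure *pShared, int writer)
{
    epoll_ctl(myepollhandle, EPOLL_CTL_DEL, pShared->fd, NULL);
    LOCK(pShared->common_lock);   // placeholder for taking the shared-data lock
    if (writer)
        pShared->close_ack_w = true;
    else
        pShared->close_ack_r = true;
    bool acked = pShared->close_ack_w && pShared->close_ack_r;
    UNLOCK(pShared->common_lock); // placeholder for releasing it
    if (acked)
        free(pShared);            // last one to acknowledge cleans up
}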
I'm assuming here that the situation you're trying to process is something like this:
You have multiple (maybe very many) sockets that you want to receive data from at once;
You want to start processing data from the first connection on Thread A when it is first received and then be sure that data from this connection is not processed on any other thread until you have finished with it in Thread A.
While you are doing that, if some data is now received on a different connection you want Thread B to pick that data and process it while still being sure that no one else can process this connection until Thread B is done with it etc.
Under these circumstances it turns out that using epoll_wait() with the same epoll fd in multiple threads is a reasonably efficient approach (I'm not claiming that it is necessarily the most efficient).
The trick here is to add the individual connections fds to the epoll fd with the EPOLLONESHOT flag. This ensures that once an fd has been returned from an epoll_wait() it is unmonitored until you specifically tell epoll to monitor it again. This ensures that the thread processing this connection suffers no interference as no other thread can be processing the same connection until this thread marks the connection to be monitored again.
You can set up the fd to monitor EPOLLIN or EPOLLOUT again using epoll_ctl() and EPOLL_CTL_MOD.
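The one-shot behaviour can be observed in isolation. The sketch below is an assumption-laden toy, not production code: it uses a pipe instead of a socket and a single thread, and OneShotResult and oneshot_demo are hypothetical names. It registers the read end with EPOLLIN | EPOLLONESHOT, makes it readable, and records what three successive epoll_wait calls report.

```cpp
#include <cassert>
#include <sys/epoll.h>
#include <unistd.h>

struct OneShotResult {
    int first_wait;   // events reported right after data arrives
    int second_wait;  // events reported before the fd is re-armed
    int third_wait;   // events reported after EPOLL_CTL_MOD re-arms it
};

OneShotResult oneshot_demo()
{
    OneShotResult r{-1, -1, -1};
    int pipefd[2];
    if (pipe(pipefd) != 0) return r;
    int ep = epoll_create1(0);
    if (ep == -1) {
        close(pipefd[0]);
        close(pipefd[1]);
        return r;
    }

    epoll_event ev{};
    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = pipefd[0];
    epoll_event out{};

    if (epoll_ctl(ep, EPOLL_CTL_ADD, pipefd[0], &ev) == 0 &&
        write(pipefd[1], "x", 1) == 1) {            // make the read end readable
        r.first_wait = epoll_wait(ep, &out, 1, 1000);  // event delivered once
        r.second_wait = epoll_wait(ep, &out, 1, 0);    // fd is disabled now
        epoll_ctl(ep, EPOLL_CTL_MOD, pipefd[0], &ev);  // re-arm the fd
        r.third_wait = epoll_wait(ep, &out, 1, 1000);  // unread data seen again
    }

    close(ep);
    close(pipefd[0]);
    close(pipefd[1]);
    return r;
}
```

In a multi-threaded server the EPOLL_CTL_MOD call would be issued by the thread that has finished processing the connection, which is what hands the descriptor back to the pool.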
A significant benefit of using epoll like this in multiple threads is that when one thread is finished with a connection and adds it back to the epoll monitored set, any other threads still in epoll_wait() are immediately monitoring it even before the previous processing thread returns to epoll_wait(). Incidentally that could also be a disadvantage because of lack of cache data locality if a different thread now picks up that connection immediately (thus needing to fetch the data structures for this connection and flush the previous thread's cache). What works best will sensitively depend on your exact usage pattern.
If you are trying to process messages received subsequently on the same connection in different threads then this scheme to use epoll is not going to be appropriate for you, and an approach using a listening thread feeding an efficient queue feeding worker threads might be better.
Previous answers that point out that calling epoll_wait() from multiple threads is a bad idea are almost certainly right, but I was intrigued enough by the question to try and work out what does happen when it is called from multiple threads on the same handle, waiting for the same socket. I wrote the following test code:
#include <netinet/in.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

struct thread_info {
    int number;
    int socket;
    int epoll;
};

void * thread(struct thread_info * arg)
{
    struct epoll_event events[10];
    int s;
    char buf[512];
    sleep(5 * arg->number);
    printf("Thread %d start\n", arg->number);
    do {
        s = epoll_wait(arg->epoll, events, 10, -1);
        if (s < 0) {
            perror("epoll_wait");
            exit(1);
        } else if (s == 0) {
            printf("Thread %d No data\n", arg->number);
        }
        if (recv(arg->socket, buf, 512, 0) <= 0) {
            perror("recv");
            exit(1);
        }
        printf("Thread %d got data\n", arg->number);
    } while (s == 1);
    printf("Thread %d end\n", arg->number);
    return 0;
}

int main()
{
    pthread_attr_t attr;
    pthread_t threads[2];
    struct thread_info thread_data[2];
    int s;
    int listener, client, epollfd;
    struct sockaddr_in listen_address;
    struct sockaddr_storage client_address;
    socklen_t client_address_len;
    struct epoll_event ev;
    listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0) {
        perror("socket");
        exit(1);
    }
    memset(&listen_address, 0, sizeof(struct sockaddr_in));
    listen_address.sin_family = AF_INET;
    listen_address.sin_addr.s_addr = INADDR_ANY;
    listen_address.sin_port = htons(6799);
    s = bind(listener,
             (struct sockaddr*)&listen_address,
             sizeof(listen_address));
    if (s != 0) {
        perror("bind");
        exit(1);
    }
    s = listen(listener, 1);
    if (s != 0) {
        perror("listen");
        exit(1);
    }
    client_address_len = sizeof(client_address);
    client = accept(listener,
                    (struct sockaddr*)&client_address,
                    &client_address_len);
    epollfd = epoll_create(10);
    if (epollfd == -1) {
        perror("epoll_create");
        exit(1);
    }
    ev.events = EPOLLIN;
    ev.data.fd = client;
    if (epoll_ctl(epollfd, EPOLL_CTL_ADD, client, &ev) == -1) {
        perror("epoll_ctl: listen_sock");
        exit(1);
    }
    thread_data[0].number = 0;
    thread_data[1].number = 1;
    thread_data[0].socket = client;
    thread_data[1].socket = client;
    thread_data[0].epoll = epollfd;
    thread_data[1].epoll = epollfd;
    s = pthread_attr_init(&attr);
    if (s != 0) {
        fprintf(stderr, "pthread_attr_init failed\n");
        exit(1);
    }
    s = pthread_create(&threads[0], &attr, (void *(*)(void *))thread, &thread_data[0]);
    if (s != 0) {
        fprintf(stderr, "pthread_create failed\n");
        exit(1);
    }
    s = pthread_create(&threads[1], &attr, (void *(*)(void *))thread, &thread_data[1]);
    if (s != 0) {
        fprintf(stderr, "pthread_create failed\n");
        exit(1);
    }
    pthread_join(threads[0], 0);
    pthread_join(threads[1], 0);
    return 0;
}
When data arrives, and both threads are waiting on epoll_wait(), only one will return, but as subsequent data arrives, the thread that wakes up to handle the data is effectively random between the two threads. I wasn't able to find a way to affect which thread was woken.
It seems likely that a single thread calling epoll_wait makes most sense, with events passed to worker threads to pump the IO.
I believe that the high performance software that uses epoll and a thread per core creates multiple epoll handles that each handle a subset of all the connections. In this way the work is divided but the problem you describe is avoided.
Generally, epoll is used when you have a single thread listening for data on a single asynchronous source. To avoid busy-waiting (manually polling), you use epoll to let you know when data is ready (much like select does).
It is not standard practice to have multiple threads reading from a single data source, and I, at least, would consider it bad practice.
If you want to use multiple threads, but you only have one input source, then designate one of the threads to listen and queue the data so the other threads can read individual pieces from the queue.
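A minimal sketch of that listener-plus-queue arrangement, using only the standard library. MessageQueue and process_with_workers are hypothetical names, and the calling thread merely stands in for the single listener by pushing pre-made strings; in a real server it would push whatever it reads after epoll_wait.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Minimal blocking queue: one producer (the listener) pushes, workers pop.
class MessageQueue {
public:
    void push(std::string msg) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(msg));
        }
        cv_.notify_one();
    }

    // Tells the workers that no more messages will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            closed_ = true;
        }
        cv_.notify_all();
    }

    // Blocks until a message is available; returns false once the queue
    // has been closed and drained.
    bool pop(std::string& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return closed_ || !q_.empty(); });
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop();
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::string> q_;
    bool closed_ = false;
};

// Hypothetical driver: feeds `incoming` to `worker_count` worker threads
// and returns how many messages the workers processed in total.
int process_with_workers(const std::vector<std::string>& incoming,
                         int worker_count)
{
    assert(worker_count > 0);
    MessageQueue queue;
    std::atomic<int> processed{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < worker_count; ++i) {
        workers.emplace_back([&] {
            std::string msg;
            while (queue.pop(msg)) ++processed;  // real work would go here
        });
    }
    for (const auto& msg : incoming) queue.push(msg);  // the "listener"
    queue.close();
    for (auto& t : workers) t.join();
    return processed.load();
}
```

Note that this hands messages to whichever worker is free, so it does not preserve the processing order of messages from the same connection across workers.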