Why is a ZMQ REQ-REP pattern not working? - C++

I have 1 master ( using REQ ) and 2 slaves ( A, B ) using REP. The master sends a request to one of the slaves and expects a response from it.
The message is being sent to the wrong slave, even though I set the address in the ZMQ envelope. How do I specify the slave's address? I think I am setting it correctly in the master, but it is not working: the master sends the requests in a round-robin fashion.
master.cpp
#include "zhelpers.hpp"
#include <string>
int main (int argc, char *argv[])
{
    zmq::context_t context(1);
    zmq::socket_t requester(context, ZMQ_REQ);
    requester.setsockopt(ZMQ_IDENTITY, "M");
    requester.bind("tcp://*:5559");
    for( int request = 0 ; request < 10 ; request++) {
        std::string cmd;
        std::cin >> cmd;
        s_sendmore (requester, "B");
        s_sendmore (requester, "");
        s_send (requester, cmd);
        s_dump(requester);
    }
}
slaveA.cpp
#include "zhelpers.hpp"
int main (int argc, char *argv[])
{
    zmq::context_t context(1);
    zmq::socket_t responder(context, ZMQ_REQ);
    responder.setsockopt(ZMQ_IDENTITY, "A", 1);
    responder.connect("tcp://localhost:5559");
    while(1)
    {
        s_dump(responder);
        sleep (1);
        // s_sendmore (responder, "M"); // Should I set this ??
        // s_sendmore (responder, "");
        s_send (responder, "FromSlaveA");
    }
}
slaveB.cpp
#include "zhelpers.hpp"
int main (int argc, char *argv[])
{
    zmq::context_t context(1);
    zmq::socket_t responder(context, ZMQ_REP);
    responder.setsockopt(ZMQ_IDENTITY, "B", 1);
    responder.connect("tcp://localhost:5559");
    while(1)
    {
        s_dump(responder);
        sleep (1);
        s_send (responder, "FromSlaveB");
    }
}
What is wrong?
OS: Ubuntu 16.04, ZMQ version 4.X.X
Update 1:
Changed the slaveA socket to REP, but the master is still sending the messages to slaveA and slaveB in a round-robin fashion. Now I wonder: am I setting the message envelope for slaveB correctly? When I print the envelope, I get this at the slaves, which seems to prove that I set the envelope to B correctly, doesn't it?
[001]B
[000]
[005]jjjjj

Your slaveA.cpp ought to use the proper ZeroMQ access-point archetype:
zmq::socket_t responder( context, ZMQ_REP ); // !ZMQ_REQ);
Post festum: the documentation is worth reading twice before generating issues.
ZMQ_IDENTITY: Set socket identity
The ZMQ_IDENTITY option shall set the identity of the specified socket when connecting to a ROUTER socket. The identity should be from 1 to 255 bytes long and may contain any values.
If two clients use the same identity when connecting to a ROUTER, the results shall depend on the ZMQ_ROUTER_HANDOVER option setting. If that is not set ( or set to the default of zero ), the ROUTER socket shall reject clients trying to connect with an already-used identity. If that option is set to 1, the ROUTER socket shall hand-over the connection to the new client and disconnect the existing one.
Option value type:       binary data
Option value unit:       N/A
Default value:           NULL
Applicable socket types: ZMQ_REQ, ZMQ_REP, ZMQ_ROUTER, ZMQ_DEALER.
Sorry, but this sort of .setsockopt() setting:
1) is excellently documented, with all relevant details of operation & possible concepts of use,
2) is meaningful only for socket archetypes ( ROUTER ) that are not present in your code at all,
3) decides on a functional behaviour ( identity hand-over ) that is not relevant to your code at all,
4) is absolutely irrelevant to the hard-wired round-robin delivery of outgoing traffic from the REQ socket that is present in your code.
So, read the documentation; it is "enough" and a fair way to professionally avoid generating issues.
Q.E.D.
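If the goal really is to pick which slave receives a request, the archetype that does honour the identity frame is a ROUTER socket on the master's side (the very case the ZMQ_IDENTITY documentation quoted above talks about). Below is a minimal sketch of that idea, reusing the question's zhelpers.hpp helpers and endpoint; the payload string and the hard-coded "B" target are illustrative only, not a tested drop-in replacement.
#include "zhelpers.hpp"
int main ()
{
    zmq::context_t context(1);
    zmq::socket_t router(context, ZMQ_ROUTER);   // ROUTER keeps an identity -> connection map
    router.bind("tcp://*:5559");

    // NB: the slave with identity "B" must already be connected,
    //     otherwise a ROUTER silently drops an unroutable message.
    s_sendmore(router, "B");                     // addressing frame: target identity
    s_sendmore(router, "");                      // empty delimiter expected by the REP slave
    s_send    (router, "cmd-for-B");             // payload

    s_dump(router);                              // the reply comes back as [B][""][payload]
    return 0;
}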

Related

C++ ZMQ Pub and Sub not connecting

I am currently working on a project that requires me to connect two terminals via ZMQ sockets, and my current solution does so via the PUB-SUB socket functionality. However, when I run the programs, the publisher sends the messages but the subscriber never receives any of them. I've tried changing the IP address between them and tried to "brute force send" messages between the sub and the pub, but to no avail.
Reduced form of the code:
Server.cpp:
#include <zmq.hpp>
#include <iostream>
using namespace std;
const char* C_TO_S = "tcp://127.0.0.1:5557";
const char* S_TO_C = "tcp://127.0.0.1:5558";
int main() {
zmq::context_t context(1);
zmq::socket_t pub(context, ZMQ_PUB);
zmq::socket_t sub(context, ZMQ_SUB);
int sndhwm = 0;
sub.connect(C_TO_S);
pub.bind(S_TO_C);
sub.setsockopt(ZMQ_SUBSCRIBE, &sndhwm, sizeof(sndhwm));
//cout << C_TO_S << endl;
while(true) {
zmq::message_t rx_msg;
sub.recv(&rx_msg);
cout << "b\n";
// other code goes here
}
}
Client.cpp:
#include <zmq.hpp>
#include <iostream>
#include <cstring>
using namespace std;
const char* C_TO_S = "tcp://127.0.0.1:5557";
const char* S_TO_C = "tcp://127.0.0.1:5558";
void network_thread() {
zmq::context_t context(1);
zmq::socket_t pub(context, ZMQ_PUB);
zmq::socket_t sub(context, ZMQ_SUB);
int sndhwm = 0;
sub.connect(S_TO_C);
pub.connect(C_TO_S);
sub.setsockopt(ZMQ_SUBSCRIBE, &sndhwm, sizeof(sndhwm));
while (true) {
cout << pub.send("a", strlen("a"), 0);
cout << "AA\n";
}
// Other code that doesn't matter
}
The main in Client.cpp calls network_thread in a separate thread and then spams the publisher to send the message "a" to the server. However, the server does not get any message from the client. If the server got a message, it would print out "b", but it never does. I also know that the publisher is sending messages, because it prints out "1" upon execution.
Also, assume that the client's subscriber and the server's publisher have a purpose. While they don't work at the moment either, a fix to the other pair should translate into a fix for those.
I have tried changing the port, spamming send messages, etc. Nothing resulted in the server receiving any messages.
Any help would be appreciated, thank you.
You set a message filter option on the SUB socket. This means that you will only receive messages that begin with the bytes set by the filter.
This code:
int sndhwm = 0;
sub.setsockopt(ZMQ_SUBSCRIBE, &sndhwm, sizeof(sndhwm));
Sets the filter to sizeof(sndhwm) bytes with value 0x00. But your message does not begin with this number of 0x00 bytes. Hence the message is ignored by the SUB socket.
You should remove the setsockopt call.
If your intent was to clear the message filter, you can do this with:
sub.setsockopt(ZMQ_SUBSCRIBE, nullptr, 0);
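If the intent was simply "subscribe to everything", the canonical idiom is an empty prefix, which matches every message. A minimal sketch of the server-side subscriber with that change — the endpoint and the older recv(&msg) API follow the question, everything else is illustrative:
#include <zmq.hpp>
#include <iostream>

const char* C_TO_S = "tcp://127.0.0.1:5557";

int main() {
    zmq::context_t context(1);
    zmq::socket_t sub(context, ZMQ_SUB);
    sub.connect(C_TO_S);
    sub.setsockopt(ZMQ_SUBSCRIBE, "", 0);   // empty filter: accept all messages
    while (true) {
        zmq::message_t rx_msg;
        sub.recv(&rx_msg);                  // blocks until a publication arrives
        std::cout << "got " << rx_msg.size() << " byte(s)" << std::endl;
    }
}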

ZMQ_DEALER with ZMQ_REP

I'm sorry for this kinda stupid question, but I didn't find any other answer. How can I send a message from ZMQ_DEALER to ZMQ_REP?
There is server code:
std::string ans;
zmq::context_t context;
zmq::socket_t socket(context, ZMQ_DEALER);
int port = bind_socket(socket);
std::cout<<port<<std::endl;
std::cout<<"sending\n";
send_message(socket,"test");
std::cout<<"SUCCESS\n";
std::cout<<"trying to get msg from client...\n";
ans=receive_message(socket);
std::cout<<"TOTAL SUCCESS\n";
std::cout<<ans<<std::endl;
close(port);
and there is client code:
zmq::context_t context;
zmq::socket_t socket(context, ZMQ_REP);
std::string recv;
recv=receive_message(socket);
std::cout<<" total successss\n";
send_message(socket,"success");
std::cout<<recv<<std::endl;
The client can't receive the message from the server. I tried to find something in the official ZeroMQ book, and I found this:
"When a ZMQ_DEALER socket is connected to a ZMQ_REP socket each message sent must consist of an empty message part, the delimiter, followed by one or more body parts."
As demanded by the ZeroMQ documentation, the sender has to take care of following the API requirement.
This should meet the need:
#include <string>
#include <string_view>
#include <zmq.hpp>
int main()
{
zmq::context_t aCTX;
// ----------------------------------------------------------------- Context() instance
zmq::socket_t aDemoSOCKET( aCTX, zmq::socket_type::dealer);
// ----------------------------------------------------------------- ZeroMQ Archetype
aDemoSOCKET.bind( "inproc://DEMO" );
// ----------------------------------------------------------------- ZeroMQ I/F
const std::string_view BLANK_FRAME_MSG = "";
const std::string_view PAYLOAD_FRAME_MSG = "Hello, world ...";
...
aDemoSOCKET.send( zmq::buffer( BLANK_FRAME_MSG ), zmq::send_flags::sndmore ); //[*]
aDemoSOCKET.send( zmq::buffer( PAYLOAD_FRAME_MSG ), zmq::send_flags::dontwait );
...
}
The API-requested empty frame is the trick [*], enforced by the flags parameter ( zmq::send_flags::sndmore here, ZMQ_SNDMORE in the C API ). There is no more magic behind this. If in doubt, feel free to look through the many real-world answers on this topic here.
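For completeness, a sketch of what the ZMQ_REP peer on the other side might look like, assuming it shares the inproc://DEMO transport with the DEALER above (inproc requires both sockets to live in the same context, so in practice you would pass aCTX around or switch to tcp://). The REP socket strips the empty delimiter itself, so only the payload frame is received; the names here are illustrative:
#include <string>
#include <string_view>
#include <zmq.hpp>
#include <iostream>
void rep_peer( zmq::context_t & aCTX )      // the same Context() instance as the DEALER side
{
    zmq::socket_t aRepSOCKET( aCTX, zmq::socket_type::rep );
    aRepSOCKET.connect( "inproc://DEMO" );

    zmq::message_t request;
    aRepSOCKET.recv( request, zmq::recv_flags::none );   // the delimiter frame never shows up here
    std::cout << std::string( static_cast<char*>( request.data() ), request.size() ) << std::endl;

    const std::string reply = "success";
    aRepSOCKET.send( zmq::buffer( reply ), zmq::send_flags::none );
}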

ZMQ_HEARTBEAT_TTL does not discard outgoing queue even if ZMQ_LINGER is set

I have a server which uses a ZMQ_ROUTER to communicate with ZMQ_DEALER clients. I set the ZMQ_HEARTBEAT_IVL and ZMQ_HEARTBEAT_TTL options on the client socket to make the client and server ping-pong each other. Besides, because of the ZMQ_HEARTBEAT_TTL option, the server will time out the connection if it does not receive any pings from the client within a certain period, according to the zmq man page:
The ZMQ_HEARTBEAT_TTL option shall set the timeout on the remote peer for ZMTP heartbeats. If
this option is greater than 0, the remote side shall time out the connection if it does not
receive any more traffic within the TTL period. This option does not have any effect if
ZMQ_HEARTBEAT_IVL is not set or is 0. Internally, this value is rounded down to the nearest
decisecond, any value less than 100 will have no effect.
Therefore, what I expect is that when the server does not receive any traffic from a client within that period, it will close the connection to that client and discard all the messages in the outgoing queue once the linger time expires. I created a toy example to check whether my hypothesis is correct, and it turns out that it is not. The chain of events is as follows:
The server sends a bunch of data to the client.
The client receives and processes the data, which is slow.
All send commands return successfully.
While the client is still receiving the data, I unplug the internet cable.
After a few seconds (set by the ZMQ_HEARTBEAT_TTL option), the server starts sending FIN signals to the client, which are not being ACKed back.
The outgoing messages are not discarded (I check the memory consumption) even after a while. They are discarded only if I call zmq_close on the router socket.
So my question is: is this supposed to be how one should use the ZMQ heartbeat mechanism? If it is not, is there any solution for what I want to achieve? I figure that I can do the heartbeat myself instead of using ZMQ's built-in mechanism. However, even if I do, it seems that ZMQ does not provide a way to close a connection between a ZMQ_ROUTER and a ZMQ_DEALER, although another variant of ZMQ_ROUTER, ZMQ_STREAM, provides a way to do this by sending an identity frame followed by an empty frame.
The toy example is below; any help would be appreciated.
Server's side:
#include <zmq.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
int main(int argc, char **argv)
{
void *context = zmq_ctx_new();
void *router = zmq_socket(context, ZMQ_ROUTER);
int router_mandatory = 1;
zmq_setsockopt(router, ZMQ_ROUTER_MANDATORY, &router_mandatory, sizeof(router_mandatory));
int hwm = 0;
zmq_setsockopt(router, ZMQ_SNDHWM, &hwm, sizeof(hwm));
int linger = 3000;
zmq_setsockopt(router, ZMQ_LINGER, &linger, sizeof(linger));
char bind_addr[1024];
sprintf(bind_addr, "tcp://%s:%s", argv[1], argv[2]);
if (zmq_bind(router, bind_addr) == -1) {
perror("ERROR");
exit(1);
}
// Receive client identity (only 1)
zmq_msg_t identity;
zmq_msg_init(&identity);
zmq_msg_recv(&identity, router, 0);
zmq_msg_t dump;
zmq_msg_init(&dump);
zmq_msg_recv(&dump, router, 0);
printf("%s\n", (char *) zmq_msg_data(&dump)); // hello
zmq_msg_close(&dump);
char buff[1 << 16];
for (int i = 0; i < 50000; ++i) {
if (zmq_send(router, zmq_msg_data(&identity),
zmq_msg_size(&identity),
ZMQ_SNDMORE) == -1) {
perror("ERROR");
exit(1);
}
if (zmq_send(router, buff, 1 << 16, 0) == -1) {
perror("ERROR");
exit(1);
}
}
printf("OK IM DONE SENDING\n");
// All send commands have returned successfully
// While the client is still receiving data, I unplug the internet cable on the client machine
// After a while, the server starts sending FIN signals
printf("SLEEP before closing\n"); // At this point, the messages are not discarded (memory usage is high).
getchar();
zmq_close(router);
zmq_ctx_destroy(context);
}
Client's side:
#include <zmq.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char **argv)
{
void *context = zmq_ctx_new();
void *dealer = zmq_socket(context, ZMQ_DEALER);
int heartbeat_ivl = 3000;
int heartbeat_timeout = 6000;
zmq_setsockopt(dealer, ZMQ_HEARTBEAT_IVL, &heartbeat_ivl, sizeof(heartbeat_ivl));
zmq_setsockopt(dealer, ZMQ_HEARTBEAT_TIMEOUT, &heartbeat_timeout, sizeof(heartbeat_timeout));
zmq_setsockopt(dealer, ZMQ_HEARTBEAT_TTL, &heartbeat_timeout, sizeof(heartbeat_timeout));
int hwm = 0;
zmq_setsockopt(dealer, ZMQ_RCVHWM, &hwm, sizeof(hwm));
char connect_addr[1024];
sprintf(connect_addr, "tcp://%s:%s", argv[1], argv[2]);
zmq_connect(dealer, connect_addr);
zmq_send(dealer, "hello", 6, 0);
size_t size = 0;
int i = 0;
while (size < (1ll << 16) * 50000) {
zmq_msg_t msg;
zmq_msg_init(&msg);
if (zmq_msg_recv(&msg, dealer, 0) == -1) {
perror("ERROR");
exit(1);
}
size += zmq_msg_size(&msg);
printf("i = %d, size = %ld, total = %ld\n", i, zmq_msg_size(&msg), size); // This causes the cliet to be slow
// Somewhere in this loop I unplug the internet cable.
// The client starts sending FIN signals as well as trying to reconnect. The recv command hangs forever.
zmq_msg_close(&msg);
++i;
}
zmq_close(dealer);
zmq_ctx_destroy(context);
}
PS: I know that setting the high-water mark to unlimited is bad practice; however, I figure the problem would be the same even if the high-water mark were low, so let's ignore it for now.
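Not an answer to the queued-message problem, but to make the "do the heartbeat myself" idea from the question concrete: the usual application-level variant is to timestamp the last frame seen from each DEALER identity on the ROUTER side and to stop queuing anything for identities that have gone silent. A rough, illustrative sketch of only that bookkeeping — the names, the 6-second TTL and the helper functions are made up for the example:
#include <string>
#include <unordered_map>
#include <chrono>

using Clock = std::chrono::steady_clock;
static const auto TTL = std::chrono::milliseconds(6000);   // arbitrary, mirrors the 6000 ms used above

static std::unordered_map<std::string, Clock::time_point> last_seen;

// Call on every frame received from a peer, keyed by its identity frame.
void mark_alive(const std::string &identity)
{
    last_seen[identity] = Clock::now();
}

// Call before zmq_send()-ing to a peer; if false, skip the send (i.e. drop at the application level).
bool peer_alive(const std::string &identity)
{
    auto it = last_seen.find(identity);
    return it != last_seen.end() && (Clock::now() - it->second) < TTL;
}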

ZeroMQ XPUB/XSUB Serious Flaw?

It seems as though the XPUB/XSUB socket types have a serious flaw that is difficult to work around.
This is my implementation of the center (proxy) node:
#include <zmq.hpp>
int main()
{
zmq::context_t context(1);
//Incoming publications come here
zmq::socket_t sub(context, ZMQ_XSUB);
sub.bind("ipc://subscriber.ipc");
//Outgoing publications go out through here.
zmq::socket_t pub(context, ZMQ_XPUB);
pub.bind("ipc://publisher.ipc");
zmq::proxy(sub, pub, nullptr);
return 0;
}
The problem is, of course, slow joiner syndrome. If I connect a new publisher to XSUB and publish some messages, they disappear into the void:
#include "zhelpers.hpp"
int main () {
// Prepare our context and publisher
zmq::context_t context(1);
zmq::socket_t publisher(context, ZMQ_PUB);
publisher.connect("ipc://subscriber.ipc");
s_sendmore (publisher, "B");
s_send (publisher, "Disappears into the void!!");
return 0;
}
However, if I sleep(1) after connecting to XSUB, it magically works:
#include "zhelpers.hpp"
int main () {
// Prepare our context and publisher
zmq::context_t context(1);
zmq::socket_t publisher(context, ZMQ_PUB);
publisher.connect("ipc://subscriber.ipc");
sleep(1);
s_sendmore (publisher, "B");
s_send (publisher, "Magically works!!");
return 0;
}
The Guide claims there is a simple solution to this "slow joiners" syndrome, but then never delivers a working synchronized XSUB/XPUB implementation. After much searching it looks like most people are just sleeping, which is really bad.
Why hasn't this ever been fixed? Are there any known workarounds? All my google queries just point me back to the guide...
I found one workaround here, and that is to use PUSH/PULL on the publisher side, and PUB/SUB on the subscriber side. The new topology looks like this:
This is the code you need for the center node:
#include <zmq.hpp>
int main()
{
zmq::context_t context(1);
//Incoming publications come here
zmq::socket_t sub(context, ZMQ_PULL);
sub.bind("ipc://subscriber.ipc");
//Outgoing publications go out through here.
zmq::socket_t pub(context, ZMQ_PUB);
pub.bind("ipc://publisher.ipc");
zmq::proxy(sub, pub, nullptr);
return 0;
}
And then for publishers:
#include "zhelpers.hpp"
int main () {
// Prepare our context and publisher
zmq::context_t context(1);
zmq::socket_t publisher(context, ZMQ_PUSH);
publisher.connect("ipc://subscriber.ipc");
s_sendmore (publisher, "B");
s_send (publisher, "No sleep!");
return 0;
}
This solution seems to work fairly well, and I hope people chime in if they see any downsides to it. If I come across a better answer, I'll post here.
The downside is that your publishers are always publishing. In the XSUB/XPUB case, subscriptions are forwarded to your publishers so that they can restrict what they send to the proxy. That results in less network traffic and less work for the proxy. In the PULL/PUB case, the publishers never see the subscription information. A worst-case scenario would be a subscriber whose subscription means they only want data from one publisher: all publishers would still be sending their data to the proxy, and the proxy would filter out everything but that one publisher's messages.
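To make the subscription-forwarding point concrete: an XPUB socket hands every (un)subscription to the application as an ordinary message whose first byte is 1 (subscribe) or 0 (unsubscribe), followed by the topic bytes, so a hand-written proxy or publisher can react to it, for example by only starting to publish once somebody has subscribed. A minimal sketch of reading one such event — the endpoint is the question's, the rest is illustrative:
#include <zmq.hpp>
#include <iostream>
#include <string>

int main()
{
    zmq::context_t context(1);
    zmq::socket_t xpub(context, ZMQ_XPUB);
    xpub.bind("ipc://publisher.ipc");

    zmq::message_t event;
    xpub.recv(&event);                                      // blocks until some SUB (un)subscribes

    const char *bytes = static_cast<const char*>(event.data());
    const bool subscribed = (bytes[0] == 1);                 // first byte: 1 = subscribe, 0 = unsubscribe
    const std::string topic(bytes + 1, event.size() - 1);    // remaining bytes: the topic prefix
    std::cout << (subscribed ? "+ " : "- ") << topic << std::endl;
    return 0;
}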

standard C++ TCP socket, connect fails with EINTR when using std::async

I am having trouble using std::async to have tasks execute in parallel when the task involves a socket.
My program is a simple TCP socket server written in standard C++ for Linux. When a client connects, a dedicated port is opened and a separate thread is started, so each client is serviced in its own thread.
The client objects are contained in a map.
I have a function to broadcast a message to all clients. I originally wrote it like below:
// ConnectedClient is an object representing a single client
// ConnectedClient::SendMessageToClient opens a socket, connects, writes, reads response and then closes socket
// broadcastMessage is the std::string to go out to all clients
// iterate through the map of clients
map<string, ConnectedClient*>::iterator nextClient;
for ( nextClient = mConnectedClients.begin(); nextClient != mConnectedClients.end(); ++nextClient )
{
printf("%s\n", nextClient->second->SendMessageToClient(broadcastMessage).c_str());
}
I have tested this and it works with 3 clients at a time. The message gets to all three clients (one at a time), and the response string is printed out three times in this loop. However, it is slow, because the message only goes out to one client at a time.
In order to make it more efficient, I was hoping to take advantage of std::async to call the SendMessageToClient function for every client asynchronously. I rewrote the code above like this:
vector<future<string>> futures;
// iterate through the map of clients
map<string, ConnectedClient*>::iterator nextClient;
for ( nextClient = mConnectedClients.begin(); nextClient != mConnectedClients.end(); ++nextClient )
{
printf("start send\n");
futures.push_back(async(launch::async, &ConnectedClient::SendMessageToClient, nextClient->second, broadcastMessage, wait));
printf("end send\n");
}
vector<future<string>>::iterator nextFuture;
for( nextFuture = futures.begin(); nextFuture != futures.end(); ++nextFuture )
{
printf("start wait\n");
nextFuture->wait();
printf("end wait\n");
printf("%s\n", nextFuture->get().c_str());
}
The code above functions as expected when there is only one client in the map. That is, you see "start send" quickly followed by "end send", quickly followed by "start wait", and then three seconds later (I have a three-second sleep on the client response side to test this) you see the trace from the socket read function showing that the response came in, and then you see "end wait".
The problem shows up when there is more than one client in the map. In the part of the SendMessageToClient function that opens and connects the socket, it fails in the code identified below:
// connected client object has a pipe open back to the client for sending messages
int clientSocketFileDescriptor;
clientSocketFileDescriptor = socket(AF_INET, SOCK_STREAM, 0);
// set the socket timeouts
// this part using setsockopt is omitted for brevity
// host name
struct hostent *server;
server = gethostbyname(mIpAddressOfClient.c_str());
if (server == 0)
{
close(clientSocketFileDescriptor);
return "";
}
//
struct sockaddr_in clientsListeningServerAddress;
memset(&clientsListeningServerAddress, 0, sizeof(struct sockaddr_in));
clientsListeningServerAddress.sin_family = AF_INET;
bcopy((char*)server->h_addr, (char*)&clientsListeningServerAddress.sin_addr.s_addr, server->h_length);
clientsListeningServerAddress.sin_port = htons(mPortNumberClientIsListeningOn);
// The connect function fails !!!
if ( connect(clientSocketFileDescriptor, (struct sockaddr *)&clientsListeningServerAddress, sizeof(clientsListeningServerAddress)) < 0 )
{
// print out error code
printf("Connected client thread: fail to connect %d \n", errno);
close(clientSocketFileDescriptor);
return response;
}
The output reads: "Connected client thread: fail to connect 4".
I looked this error code up, it is explained thus:
#define EINTR 4 /* Interrupted system call */
I searched around on the internet, all I found were some references to system calls being interrupted by signals.
Does anyone know why this works when I call my send message function one at a time, but it fails when the send message function is called using async? Does anyone have a different suggestion how I should send a message to multiple clients?
First, I would try to deal with the EINTR issue. connect() has been interrupted (this is the meaning of EINTR) and does not try again because you are using an asynchronous descriptor.
What I usually do in such a circumstance is to retry: I wrap the function (connect in this case) in a while loop. If connect succeeds, I break out of the loop. If it fails, I check the value of errno; if it is EINTR, I try again.
Mind that there are other values of errno that deserve a retry (EWOULDBLOCK is one of them).
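A minimal sketch of that retry loop, using the same POSIX calls as the question; treating EISCONN as success is a defensive extra, since an interrupted connect() may in fact complete in the background:
#include <errno.h>
#include <sys/socket.h>

// Keep calling connect() while it is merely interrupted by a signal.
static int connect_retry(int fd, const struct sockaddr *addr, socklen_t addrlen)
{
    while (connect(fd, addr, addrlen) < 0)
    {
        if (errno == EINTR)       // interrupted system call: try again
            continue;
        if (errno == EISCONN)     // the earlier, interrupted attempt already completed
            return 0;
        return -1;                // genuine failure; caller inspects errno
    }
    return 0;
}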