ZeroMQ PubSub using inproc sockets hangs forever - c++

I'm adapting a TCP PubSub example to use inproc with multiple threads. It ends up hanging forever.
My setup
macOS Mojave, Xcode 10.3
zmq 4.3.2
The source code reproducing the issue:
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include <thread>
#include "zmq.h"
void hello_pubsub_inproc() {
    void* context = zmq_ctx_new();
    void* publisher = zmq_socket(context, ZMQ_PUB);
    printf("Starting server...\n");
    int pub_conn = zmq_bind(publisher, "inproc://*:4040");

    void* subscriber = zmq_socket(context, ZMQ_SUB);
    printf("Collecting stock information from the server.\n");
    int sub_conn = zmq_connect(subscriber, "inproc://localhost:4040");
    sub_conn = zmq_setsockopt(subscriber, ZMQ_SUBSCRIBE, 0, 0);

    std::thread t_pub = std::thread([&]{
        const char* companies[2] = {"Company1", "Company2"};
        int count = 0;
        for(;;) {
            int which_company = count % 2;
            int index = (int)strlen(companies[0]);
            char update[12];
            snprintf(update, sizeof update, "%s", companies[which_company]);
            zmq_msg_t message;
            zmq_msg_init_size(&message, index);
            memcpy(zmq_msg_data(&message), update, index);
            zmq_msg_send(&message, publisher, 0);
            zmq_msg_close(&message);
            count++;
        }
    });

    std::thread t_sub = std::thread([&]{
        int i;
        for(i = 0; i < 10; i++) {
            zmq_msg_t reply;
            zmq_msg_init(&reply);
            zmq_msg_recv(&reply, subscriber, 0);
            int length = (int)zmq_msg_size(&reply);
            char* value = (char*)malloc(length);
            memcpy(value, zmq_msg_data(&reply), length);
            zmq_msg_close(&reply);
            printf("%s\n", value);
            free(value);
        }
    });

    t_pub.join();

    // Give publisher time to set up.
    sleep(1);

    t_sub.join();

    zmq_close(subscriber);
    zmq_close(publisher);
    zmq_ctx_destroy(context);
}

int main (int argc, char const *argv[]) {
    hello_pubsub_inproc();
    return 0;
}
The result
Starting server...
Collecting stock information from the server.
I've also tried adding this before joining threads to no avail:
zmq_proxy(publisher, subscriber, NULL);
The workaround: Replacing inproc with tcp fixes it instantly. But shouldn't inproc target in-process use cases?
Quick research tells me that it couldn't have been the order of bind vs. connect, since that problem is fixed in my zmq version.
The example below suggests that I don't have a missing shared-context issue, because it doesn't use one:
ZeroMQ Subscribers not receiving message from Publisher over an inproc: transport class
I read from the Guide in the section Signaling Between Threads (PAIR Sockets) that
You can use PUB for the sender and SUB for the receiver. This will correctly deliver your messages exactly as you sent them and PUB does not distribute as PUSH or DEALER do. However, you need to configure the subscriber with an empty subscription, which is annoying.
What does it mean by an empty subscription?
Where am I going wrong?

You can use PUB for the sender and SUB for the receiver. This will correctly deliver your messages exactly as you sent them and PUB does not distribute as PUSH or DEALER do. However, you need to configure the subscriber with an empty subscription, which is annoying.
Q : What does it mean by an empty subscription?
It means setting ( configuring ) the subscription, which drives the topic-list message-delivery filtering, using an empty subscription string.
Q : Where am I going wrong?
Here :
// sub_conn = zmq_setsockopt(subscriber, ZMQ_SUBSCRIBE, 0, 0); // Wrong
sub_conn = zmq_setsockopt(subscriber, ZMQ_SUBSCRIBE, "",0); // Empty string
There are also doubts here, about proper syntax and naming rules:
// int pub_conn = zmq_bind(publisher, "inproc://*:4040");
int pub_conn = zmq_bind(publisher, "inproc://<aStringWithNameMax256Chars>");
as the inproc:// transport-class does not use any kind of external stack, but maps the AccessPoint's I/O(s) onto one or more memory locations ( a stack-less transport-class, requiring no I/O-thread ).
Given this, there is nothing like an "<address>:<port#>" being interpreted by such a ( here missing ) protocol, so the string gets used as-is to identify which memory location the message-data are going to go into.
So "inproc://*:4040" does not get expanded, but is used "literally" as a named inproc:// transport-class I/O-memory-location identified as [*:4040]. A subsequent .connect( "inproc://localhost:4040" ) will, and must, lexically miss the prepared memory location ["*:4040"], as the strings do not match.
So this .connect() ought to fail. The error handling may, however, be silent: since versions 4.x it is no longer necessary to obey the historical requirement to first .bind() ( creating a "known" named memory location for inproc:// ) before one may call a .connect() to get cross-connected with an "already existing" named memory location. So v4.0+ will most probably not raise any error on creating the .bind( "inproc://*:4040" ) landing-zone and then asking for a non-matching .connect( "inproc://localhost:4040" ), which has no previously prepared landing-zone in an already existing named memory location.
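A minimal corrected sketch of the setup part, assuming a hypothetical endpoint name "inproc://stock-updates" ( any name of up to 256 characters, used identically on both sides ), with the two fixes applied and the rest of the question's code left as it was:
void* context    = zmq_ctx_new();

void* publisher  = zmq_socket(context, ZMQ_PUB);
int   pub_conn   = zmq_bind(publisher, "inproc://stock-updates");     // named memory location

void* subscriber = zmq_socket(context, ZMQ_SUB);
int   sub_conn   = zmq_connect(subscriber, "inproc://stock-updates"); // must match the .bind() string exactly
sub_conn         = zmq_setsockopt(subscriber, ZMQ_SUBSCRIBE, "", 0);  // empty subscription: receive everything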

Related

ZeroMQ with NORM - address already in use error was thrown on 2nd .bind() - why?

I'm using ZeroMQ with the NACK-Oriented Reliable Multicast ( NORM ) norm:// protocol. The documentation contains only Python code, so here is my C++ code:
PUB Sender :
string sendHost = "norm://2,127.0.0.1:5556";// <NormNodeId>,<addr:port>
string tag = "MyTag";
string sentMessage = "HelloWorld";
string fullMessage = tag + sentMessage;
zmq::context_t *context = new zmq::context_t( 20 );
zmq::socket_t publisher( *context, ZMQ_PUB );
zmq_connect( publisher, sendHost.c_str() );
zmq_send( publisher,
fullMessage.c_str(),
fullMessage.size(),
0
);
SUB Receiver :
char message[256];
string receiveHost = "norm://1,127.0.0.1:5556";// <NormNodeId>,<addr:port>
string tag = "MyTag";
zmq::context_t *context = new zmq::context_t( 20 );
zmq::socket_t subscriber( *context, ZMQ_SUB );
zmq_bind( subscriber, receiveHost.c_str() );
zmq_setsockopt( subscriber, ZMQ_SUBSCRIBE, tag.c_str(), tag.size() );
int bytesReceived = zmq_recv( subscriber,
                              message,
                              256,
                              0
                              );
cout << bytesReceived << endl;
cout << message << endl;
The problem I'm facing is that according to the documentation both .bind() and .connect() are interchangeable.
In my case they both do a .bind(), which causes ZeroMQ to throw an error saying the second bind fails with an "address already in use" error.
... they both do a bind, which causes ZeroMQ to throw an error saying the second bind fails
Yes, it is correct that this fails.
The first .bind() "takes ownership" of the port and this is an exclusive role.
The interchangeability of { .bind() | .connect() } is to be understood so that it does not matter which side .bind()-s and which one .connect()-s.
Until this moment, I have seen no one interpret this property as meaning that both sides should try to .connect() ( to a non-existent, not-.bind()-exposed AccessPoint ), much less that both should try to .bind() an already "occupied" port ( in case of residing on the same localhost ), or that both should remain in a nox-et-solitudo state, where each of the .bind()-s establishes a .connect()-ready state on a different localhost and both then remain in a silent solitude ( forever ), as there is, and will be, no attempt to make any .connect()-ion go live and operational.
No, you need just one .bind(), which may from that moment on handle 0+ future .connect()-requests arriving to establish a live PUB/SUB channel, for any respective <transport-class> protocol, including the newly added norm://.
Anyway, welcome, norm://, to the family of ZeroMQ protocols.
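A minimal sketch of the intended split, assuming the SUB side keeps the single .bind() and the PUB side only .connect()-s to it ( the reverse assignment would work just as well ); the endpoint strings are taken from the question:
// SUB receiver : the one and only .bind() for this channel
zmq::context_t subCtx( 20 );
zmq::socket_t  subscriber( subCtx, ZMQ_SUB );
zmq_bind( subscriber, "norm://1,127.0.0.1:5556" );            // <NormNodeId>,<addr:port>
zmq_setsockopt( subscriber, ZMQ_SUBSCRIBE, "MyTag", 5 );

// PUB sender : only .connect()-s, no second .bind()
zmq::context_t pubCtx( 20 );
zmq::socket_t  publisher( pubCtx, ZMQ_PUB );
zmq_connect( publisher, "norm://2,127.0.0.1:5556" );          // <NormNodeId>,<addr:port>
zmq_send( publisher, "MyTagHelloWorld", 15, 0 );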
Confused? You may enjoy a further five-second read about the main conceptual differences in [ ZeroMQ hierarchy in less than a five seconds ] or other posts and discussions here.

Is the subscriber able to receive updates from multiple publishers in ZeroMQ with cpp binding?

In my code I am only able to receive the messages from the first publisher (on port 5556) to which I connect.
So do I need to close the first connection (5556) before connecting to second (5557)?
If so, then in the statement of ZeroMQ guide
"A subscriber can connect to more than one publisher, using one connect call each time. Data will then arrive and be interleaved ("fair-queued") so that no single publisher drowns out the others."
Does the phrase "using one connect call each time" mean we need to close the first connection before connecting to the second publisher?
How can I connect to multiple publishers at the same time to receive messages from both?
Code:
#include <zmq.hpp>
#include <string>
#include <iostream>

int main (int argc, char *argv[])
{
    zmq::context_t context (1);
    zmq::socket_t subscriber (context, ZMQ_SUB);
    subscriber.connect("tcp://localhost:5556");
    subscriber.connect("tcp://localhost:5557");
    subscriber.setsockopt(ZMQ_SUBSCRIBE, "", 0); // subscribe to all messages

    // Process 10 updates
    int update_nbr;
    for (update_nbr = 0; update_nbr < 10; update_nbr++) {
        zmq::message_t update;
        subscriber.recv (&update);
        // Prints only the data from the publisher bound to port 5556
        std::string updt = std::string(static_cast<char*>(update.data()), update.size());
        std::cout << "Received Update/Messages/TaskList " << update_nbr << " : " << updt << std::endl;
    }
    return 0;
}
Does it mean we need to close the first before second?
No.
One need not .close() in order to launch another call to the .connect(...) method.
How can I connect to multiple PUB-s to receive messages from both?
In cases where both the fair-queueing SUB-side policy and identical topic-filter policy processing on the { PUB | SUB }-side ( version dependent ... ) remain plausible:
int main (int argc, char *argv[])
{
zmq::context_t context (1);
zmq::socket_t subscriber (context, ZMQ_SUB);
subscriber.connect( "tcp://localhost:5556" ); // ipc://first will have less
subscriber.connect( "tcp://localhost:5557" ); // ipc://second protocol overheads
subscriber.setsockopt( ZMQ_SUBSCRIBE, "", 0 );// subscribe to .recv() any message
...
}
In cases where it is not, use multiple SUB-side socket instances, each .connect()-ed to its respective non-balanced PUB, and use a non-blocking .poll() inside an event-loop, engineered so as to monitor and handle all of the non-balanced message-arrival event-streams under the non-coherent topic-filter policy processed per each co-existing PUB/SUB ( or XPUB/XSUB ) message "event-stream". A sketch of this second approach is shown below.
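A minimal sketch of that second approach, assuming two hypothetical topic filters ( "A" and "B" ) and the same older cppzmq calls the question already uses ( setsockopt(), recv( &msg ) ); treat it as an illustration under those assumptions, not a definitive implementation:
#include <zmq.hpp>
#include <string>
#include <iostream>

int main ()
{
    zmq::context_t context (1);

    // One SUB socket per publisher, each with its own ( non-coherent ) topic filter
    zmq::socket_t sub_a (context, ZMQ_SUB);
    sub_a.connect ("tcp://localhost:5556");
    sub_a.setsockopt (ZMQ_SUBSCRIBE, "A", 1);          // hypothetical topic "A"

    zmq::socket_t sub_b (context, ZMQ_SUB);
    sub_b.connect ("tcp://localhost:5557");
    sub_b.setsockopt (ZMQ_SUBSCRIBE, "B", 1);          // hypothetical topic "B"

    zmq::pollitem_t items[] = {
        { static_cast<void*>(sub_a), 0, ZMQ_POLLIN, 0 },
        { static_cast<void*>(sub_b), 0, ZMQ_POLLIN, 0 }
    };

    for (int received = 0; received < 10; )
    {
        zmq::poll (items, 2, -1);                      // wait until either socket has data
        for (int s = 0; s < 2; s++)
        {
            if (items[s].revents & ZMQ_POLLIN)
            {
                zmq::message_t update;
                (s == 0 ? sub_a : sub_b).recv (&update);
                std::cout << "From PUB#" << s << " : "
                          << std::string (static_cast<char*>(update.data()), update.size())
                          << std::endl;
                received++;
            }
        }
    }
    return 0;
}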

Any way to change the behavior of synchronous Windows API SendARP?

I'm writing a local network scanner on Windows to find online hosts with the IP Helper functions, which is equivalent to nmap -PR but without WinPcap. I know SendARP will block and send the ARP request 3 times if the remote host doesn't respond, so I use std::async to create one thread for each host, but the problem is that I want to send an ARP request every 20 ms so there wouldn't be too many ARP packets in a very short time.
#include <iostream>
#include <future>
#include <vector>
#include <cstdio>
#include <cstring>
#include <ctime>
#include <winsock2.h>
#include <iphlpapi.h>

#pragma comment(lib, "iphlpapi.lib")
#pragma comment(lib, "ws2_32.lib")

using namespace std;

int main(int argc, char **argv)
{
    ULONG MacAddr[2];       /* for 6-byte hardware addresses */
    ULONG PhysAddrLen = 6;  /* default to length of six bytes */
    memset(&MacAddr, 0xff, sizeof (MacAddr));
    PhysAddrLen = 6;

    IPAddr SrcIp = 0;
    IPAddr DestIp = 0;
    char buf[64] = {0};

    size_t start = time(NULL);

    std::vector<std::future<DWORD> > vResults;
    for (auto i = 1; i < 255; i++)
    {
        sprintf(buf, "192.168.1.%d", i);
        DestIp = inet_addr(buf);
        vResults.push_back(std::async(std::launch::async, std::ref(SendARP), DestIp, SrcIp, MacAddr, &PhysAddrLen));
        Sleep(20);
    }

    for (auto it = vResults.begin(); it != vResults.end(); ++it)
    {
        if (it->get() == NO_ERROR)
        {
            std::cout << "host up\n";
        }
    }

    std::cout << "time elapsed " << (time(NULL) - start) << std::endl;
    return 0;
}
At first I could control this by calling Sleep(20) after launching each thread, but once SendARP in these threads re-sends ARP requests because a remote host doesn't reply, it's out of my control, and I see many requests within a very short time (<10 ms) in Wireshark, so my questions are:
Is there any way to make SendARP asynchronous?
If not, can I control the send timing of SendARP in these threads?
There doesn't seem to be any way to force SendARP to act in a non-blocking manner; it would appear that when a host is unreachable, it will retry the query several times before giving up.
As for a solution, it's nothing you want to hear. The MSDN docs state that there's a newer API, ResolveIpNetEntry2, that deprecates SendARP and can do the same thing, but it appears to behave in the same manner.
The struct it receives contains a field called ReachabilityTime.LastUnreachable, which is: "The time, in milliseconds, that a node assumes a neighbor is unreachable after not having received a reachability confirmation."
However, it does not appear to have any real effect.
The best way to do it is to use WinPcap or some other driver; there doesn't seem to be a way of solving your problem in userland.
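For reference, a hedged and untested sketch of what a call to that newer API might look like; the helper name resolve_neighbour and the use of GetBestInterface to pick the local interface index are this sketch's own assumptions. It is still a blocking call, so it does not remove the retry behaviour:
#include <winsock2.h>
#include <ws2tcpip.h>
#include <iphlpapi.h>
#include <netioapi.h>
#include <iostream>
#pragma comment(lib, "iphlpapi.lib")
#pragma comment(lib, "ws2_32.lib")

// Resolve a single neighbour via ResolveIpNetEntry2 instead of SendARP.
bool resolve_neighbour(const char* ip)
{
    MIB_IPNET_ROW2 row;
    ZeroMemory(&row, sizeof(row));
    row.Address.si_family = AF_INET;
    row.Address.Ipv4.sin_family = AF_INET;
    inet_pton(AF_INET, ip, &row.Address.Ipv4.sin_addr);

    // The row must name the local interface to probe from.
    DWORD ifIndex = 0;
    if (GetBestInterface(row.Address.Ipv4.sin_addr.s_addr, &ifIndex) != NO_ERROR)
        return false;
    row.InterfaceIndex = ifIndex;

    return ResolveIpNetEntry2(&row, nullptr) == NO_ERROR;   // blocks, just like SendARP
}

int main()
{
    std::cout << (resolve_neighbour("192.168.1.1") ? "host up\n" : "no reply\n");
    return 0;
}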

Context independent C++ TCP Server Class

I'm coding a TCP Server class based on I/O multiplexing (select).
The basic idea is explained in this chunk of code:
GenericApp.cpp
TServer *server = new TServer(/*parameters*/);
server->mainLoop();
For now the behavior of the server is independent of the context, but in a way that I need to improve.
Actual Status
receive(sockFd, buffer);
MSGData * msg = MSGFactory::getInstance()->createMessage(Utils::getHeader(buffer, 1024));
EventHandler * rightHandler = eventBinder->getHandler(msg->type());
rightHandler->callback(msg);
In this version the main loop reads from the socket, instantiates the right type of message object and calls the appropriate handler (something may not work properly, because it compiles but I have not tested it).
As you can see, this allows a programmer to define his own message types and appropriate handlers, but once the main loop is started nothing more can be done.
I need to make this part of the server more customizable, to adapt this class to a wider range of problems.
MainLoop Code
void TServer::mainLoop()
{
    int sockFd;
    int connFd;
    int maxFd;
    int maxi;
    int i;
    int nready;

    maxFd = listenFd;
    maxi = -1;

    for(i = 0; i < FD_SETSIZE; i++) clients[i] = -1; // Should be in the constructor?
    FD_ZERO(&allset);                                // Should be in the constructor?
    FD_SET(listenFd, &allset);                       // Should be in the constructor?

    for(;;)
    {
        rset = allset;
        nready = select(maxFd + 1, &rset, NULL, NULL, NULL);

        if(FD_ISSET(listenFd, &rset))
        {
            cliLen = sizeof(cliAddr);
            connFd = accept(listenFd, (struct sockaddr *) &cliAddr, &cliLen);

            for (i = 0; i < FD_SETSIZE; i++)
            {
                if (clients[i] < 0)
                {
                    clients[i] = connFd;            /* save descriptor */
                    break;
                }
            }
            if (i == FD_SETSIZE)
            {
                //!! HANDLE ERROR: too many clients
            }

            FD_SET(connFd, &allset);                /* add new descriptor to set */
            if (connFd > maxFd) maxFd = connFd;     /* for select */
            if (i > maxi) maxi = i;                 /* max index in client[] array */
            if (--nready <= 0) continue;
        }

        for (i = 0; i <= maxi; i++)
        {
            /* check all clients for data */
            if ((sockFd = clients[i]) < 0) continue;
            if (FD_ISSET(sockFd, &rset))
            {
                //!! SHOULD CLEAN BUFFER BEFORE READ
                receive(sockFd, buffer);
                MSGData * msg = MSGFactory::getInstance()->createMessage(Utils::getHeader(buffer, 1024));
                EventHandler * rightHandler = eventBinder->getHandler(msg->type());
                rightHandler->callback(msg);
            }
            if (--nready <= 0) break;               /* no more readable descriptors */
        }
    }
}
Do you have any suggestions on a good way to do this?
Thanks.
Your question requires more than just a Stack Overflow answer. You can find good ideas in these books:
http://www.amazon.com/Pattern-Oriented-Software-Architecture-Concurrent-Networked/dp/0471606952/ref=sr_1_2?s=books&ie=UTF8&qid=1405423386&sr=1-2&keywords=pattern+oriented+software+architecture
http://www.amazon.com/Unix-Network-Programming-Volume-Networking/dp/0131411551/ref=sr_1_1?ie=UTF8&qid=1405433255&sr=8-1&keywords=unix+network+programming
Basically, what you're trying to build is a reactor. You can find open-source libraries implementing this pattern. For instance:
http://www.cs.wustl.edu/~schmidt/ACE.html
http://pocoproject.org/
If you want your handlers to have the possibility of doing more processing, you could give them a reference to your TCPServer and a way to register a socket for the following events (a rough sketch of such an interface is shown below):
read, the socket is ready for read
write, the socket is ready for write
accept, the listening socket is ready to accept (read with select)
close, the socket is closed
timeout, the time given to wait for the next event expired (select allow to specify a timeout)
That way the handler can implement all kinds of protocols, half-duplex or full-duplex:
In your example there is no way for a handler to answer the received message. It is the role of the write event to let a handler know when it can send on the socket.
The same is true for the read event: the read itself should not be in your main loop but in the socket's read handler.
You may also want to add the possibility to register a handler for an event with a timeout so that you can implement timers and drop idle connections.
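To make that concrete, here is a rough hypothetical sketch of such a registration interface (apart from TServer, every name here is invented for illustration, and SocketEventHandler is deliberately distinct from your existing message-level EventHandler); it is a design sketch under those assumptions, not a definitive implementation:
enum class Event { Read, Write, Accept, Close, Timeout };

class TServer;   // the existing select()-based server

class SocketEventHandler
{
public:
    virtual ~SocketEventHandler() = default;
    // Called by TServer::mainLoop() when 'event' fires on 'sockFd'.
    virtual void onEvent(TServer& server, int sockFd, Event event) = 0;
};

class TServer
{
public:
    // A handler asks to be notified when 'event' becomes ready on 'sockFd',
    // optionally with a timeout in milliseconds ( -1 = no timeout ).
    void registerInterest(int sockFd, Event event, SocketEventHandler* handler, int timeoutMs = -1);
    void unregisterInterest(int sockFd, Event event);
    void mainLoop();   // select() over all registered descriptors and dispatch onEvent()
    // ...
};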
This leads to some problems:
Your handler will have to implement a state-machine to react to the network events and update the events it wants to receive.
Your handler may want to create and connect new sockets (think about a Web proxy server, an IRC client with DCC, an FTP server, and so on...). For this to work it must have the possibility to create a socket and to register it in your main loop. This means the handler may now receive callbacks for either of its two sockets, and there should be a parameter telling the callback which socket it is. Or you will have to implement a handler for each socket and have them communicate with a queue of messages. The queue is needed because the readiness of one socket is independent of the readiness of the other, and you may read something on one while not being ready to send it on the other.
You will have to manage the timeouts specified by each handler, which may differ. You may end up with a priority queue for timeouts.
As you can see, this is no simple problem. You may want to reduce the genericity of your framework to simplify its design (for instance, handling only half-duplex protocols like simple HTTP).

Strange behavior with Isend

The following is a simple code which just sends data from processor i to processor i+1 using Isend and then checks whether all the sends have completed.
Code:
std::vector<double> sendbuffer;
int myrank, nprocs;
std::vector<MPI::Request> req_v;

MPI::Init ();
nprocs = MPI::COMM_WORLD.Get_size ();
myrank = MPI::COMM_WORLD.Get_rank ();

sendbuffer.resize(20000);
int startin = 0;

if (myrank != nprocs-1)
    for (int i = myrank+1; i <= (myrank+1); i++)
    {
        int sendrank = i;
        int msgtag = myrank * sendrank;
        int msgsz = sendbuffer.size();
        double *sdata = &(sendbuffer[startin]);

        MPI::Request req;
        req = MPI::COMM_WORLD.Isend (sdata, msgsz, MPI::DOUBLE, sendrank, msgtag);
        req_v.push_back (req);
    }

printf("Size-%d %d\n", myrank, (int) req_v.size());

if (req_v.size() > 0)
    MPI::Request::Waitall (req_v.size(), &(req_v[0]));

printf("Over-%d\n", myrank);

MPI::COMM_WORLD.Barrier();
MPI::Finalize ();
The code completes for a buffer size of 1500 but for 20000 it stalls. The behavior seems a bit strange. I thought matching receives are not needed for Isend. Please suggest possible reasons for this behavior.
You never actually receive the messages. Without the corresponding calls to Recv (or Irecv), the send calls may never complete.
You can get more details at this question, but generally, sends are allowed to complete without the corresponding receive having been posted as long as the messages can be buffered within MPI (either on the sender side or the receiver side). Eventually, your system will run out of buffers and the send call will stop completing until you call the corresponding receive calls and free up some system buffers.
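For illustration, a minimal sketch (untested, using the same deprecated C++ MPI bindings as the question) of the missing receive side: every rank except 0 posts one Irecv, before the existing Waitall, for the message its left neighbour sent with the same tag formula; the Waitall then completes both the sends and the receives.
std::vector<double> recvbuffer(20000);

if (myrank != 0)
{
    int recvrank = myrank - 1;
    int msgtag = recvrank * myrank;   // must match the sender's tag: myrank * sendrank
    MPI::Request rreq = MPI::COMM_WORLD.Irecv (&(recvbuffer[0]), (int) recvbuffer.size(),
                                               MPI::DOUBLE, recvrank, msgtag);
    req_v.push_back (rreq);
}
// ...the existing Waitall on req_v now waits for both sends and receives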