How to operate multiple ZeroMQ Socket Types In The Same Process? - c++

I am looking to use ZeroMQ to facilitate IPC in my embedded systems application, however, I'm not able to find many examples on using multiple 0MQ socket types in the same process.
For example, say I have a process called "antenna_mon" that monitors an antenna. I want to be able to send messages to this process and get responses back - a classic REQ-REP pattern. However, I also have a "cm" process, that publishes configuration changes to subscribers. I want antenna_mon to also subscribe to antenna configuration changes - PUB-SUB.
I found this example of reading from multiple sockets in the same process, but it seems suboptimal: instead of blocking while waiting for messages, you constantly check for messages and go back to sleep, which is inefficient.
Has anyone encountered this problem before? Am I just thinking about it wrong? Maybe I should have two threads - one for CM changes, one for REQ-REP servicing?
I would love any insights or examples of solving this type of problem.

Welcome to the very nature of distributed computing!
Yes, there are new perspectives one has to adopt when assembling a project for a multi-agent domain, where more than one process works and communicates with its respective peers ad hoc.
Experience with soft real-time or embedded systems design will help a lot here. If you have none, GUI design offers similar patterns: the centerpiece is something like a lightweight .mainloop() scheduler, most of the hard work is done in round-robin polled GUI devices, and internal-state changes or external MMI events are marshalled into event-triggered handlers.
The ZeroMQ infrastructure gives you all the tools needed for such a non-blocking, controllably poll-able ( poll timeouts may be fixed, scaled or adaptively adjusted ad hoc, as long as they do not exceed the design-defined round-trip duration of the controller .mainloop() ), transport-agnostic, asynchronously operated message dispatcher ( with thread-mapped performance scaling & priority tuning ).
What else one may need?
Well, just imagination and a lot of self-discipline to adhere to the Zero-Copy, Zero-Sharing and Zero-Blocking design maxims.
The rest is in your hands.
Many "academic" examples may seem trivial and simplified, because they illustrate just the feature currently under discussion, seen from some narrow perspective.
Not so in the real-life situations.
As an example, my distributed ML engine uses a tandem of several PUSH/PULL pipelines for moving state-data updates and prediction forecasts + another PUSH/PULL for a remote keyboard + a reversed .bind()/.connect() on PUB/SUB for easy broadcasting of distributed agents' telemetry to a remote, centrally operated syslog, and some additional PAIR/PAIR pipes, as processing requires.
( nota bene: always bear in mind that robust and error-resilient systems ought to avoid the default REQ/REP Scalable Formal Communication Pattern, as there is a non-zero probability of the pairwise-stepped REQ/REP dual-FSA falling into an unsalvageable deadlock. Do not hesitate to read more about this smart tool. )
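To make the "block on several sources at once" idea concrete, here is a minimal sketch of the underlying mechanism. It uses plain POSIX poll() on two file descriptors rather than ZeroMQ itself, so it is an illustration of the model, not the ZeroMQ API: zmq_poll() follows the same shape, taking an array of zmq_pollitem_t and a timeout in milliseconds, where -1 blocks indefinitely. The function name wait_and_dispatch and the "request:" / "config:" labels are made up for this example.

```cpp
#include <poll.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Sleeps inside poll() until at least one descriptor is readable (or the
// timeout expires), then reads once from each descriptor that is ready.
// Returns one labelled entry per source that delivered data.
// timeout_ms == -1 blocks indefinitely; no busy-waiting is involved.
std::vector<std::string> wait_and_dispatch(int request_fd, int config_fd,
                                           int timeout_ms) {
    assert(request_fd >= 0 && config_fd >= 0);
    pollfd items[2] = {
        {request_fd, POLLIN, 0},  // stands in for the REQ-REP side
        {config_fd, POLLIN, 0},   // stands in for the PUB-SUB side
    };
    std::vector<std::string> handled;
    int ready = poll(items, 2, timeout_ms);
    if (ready <= 0) return handled;  // timeout or error: nothing to do

    char buf[256];
    if (items[0].revents & POLLIN) {
        ssize_t n = read(request_fd, buf, sizeof buf);
        if (n > 0)
            handled.push_back("request:" +
                              std::string(buf, static_cast<std::size_t>(n)));
    }
    if (items[1].revents & POLLIN) {
        ssize_t n = read(config_fd, buf, sizeof buf);
        if (n > 0)
            handled.push_back("config:" +
                              std::string(buf, static_cast<std::size_t>(n)));
    }
    return handled;
}
```

With ZeroMQ, the two pollfd entries would become zmq_pollitem_t entries for the REP and SUB sockets, and poll() would become zmq_poll(); a single thread can then service both patterns while sleeping whenever neither has traffic.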

Related

ZeroMQ Session based request dispatching or conditional routing

I am trying to solve the following problem and was wondering what is the best approach to apply?
I'd like to set up versioned communication via ZeroMQ, which effectively means that any client first makes a handshake stating the version of its messaging protocol, and then all subsequent requests are forwarded only to a specific set of workers, i.e. the ones which can understand this protocol.
I saw the example of Router/Dealer but there forwarding occurs always to all workers.
IMO this is something like a session, which is established based on handshake and all future requests are made in a particular context. Can this be done with ZeroMQ?
I understand that I can send back some ID to the client and ask it to put in all future requests, but would like to avoid that kind of intrusiveness.
Just a side note: I implement this approach in C++. I don't mind if your answer presents a general idea, taking into account features available in either the ZeroMQ C API or the cppzmq wrapper. No need to write a fully fledged solution, just how it might be done.
Yes, this seems doable:
With full respect for the wish not to use "in-band" signalling via ID / multipart-message processing, one may build the desired infrastructure using a mix of static and dynamic use of as-is ZeroMQ resources.
Step 0: your central authority handles initial "client" contact / handshaking / identity validation
Step 1: each "client" receives a set of directions once its identity/version has been approved in Step 0
Step 2: ad-hoc instructed "client" may { .connect() | .bind() } with appropriate access-point
Step 3: as an architecture bonus, this can be enjoyed as distributed platform with re-negotiation(s) and node re-discoveries for even more robust, scalable-performance and raised security motivated scenarios
Our own imagination is the only ceiling:
In a few words, one may soon forget about the standard Scalable Formal Communication Patterns as ready-made solutions; they serve rather as a set of building blocks for ad-hoc defined architectures. Realising that is the biggest power of ZeroMQ or nanomsg.
You may read more on advanced use-cases in this ( and check the book there ... ).
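Steps 0 to 2 above can be sketched as a small directory that the central authority consults during the handshake. This is only an illustration under assumptions: the class name VersionDirectory, its methods, and the endpoint strings are all hypothetical, and the actual transport (for example a REQ/REP handshake followed by the client calling .connect() on the returned endpoint) is left out.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical directory kept by the central authority: it maps a protocol
// version to the access point of the worker pool that understands it.
class VersionDirectory {
public:
    // Announce that workers speaking `version` are reachable at `endpoint`.
    void register_pool(int version, std::string endpoint) {
        assert(!endpoint.empty());
        pools_[version] = std::move(endpoint);
    }

    // Handshake (Steps 0 and 1): the client states its version and gets back
    // the endpoint to use for all later requests (Step 2), or an empty
    // string if no worker pool supports that version.
    std::string handshake(int client_version) const {
        std::map<int, std::string>::const_iterator it =
            pools_.find(client_version);
        return it == pools_.end() ? std::string() : it->second;
    }

private:
    std::map<int, std::string> pools_;
};
```

Because each version gets its own access point, later requests carry no version ID at all; the "session" is implied by which endpoint the client connected to.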

How to avoid dropping messages zeromq pub sub

I have seen several questions about this, but none have answers I found satisfactory. This question, zeromq pattern: pub/sub with guaranteed delivery in particular is similar, though I am open to using any other zeromq mechanism to achieve the same effect.
My question is, is there any way to send messages in a fan-out pattern like publisher-subscriber in ZeroMQ with the assurance that the messages will be delivered? It seems as though a Dealer with zero-copy could do this okay, but it would be much messier than pub-sub. Is there a better option? What are the drawbacks of doing it this way besides having to write more code?
Reason for needing this:
I am writing code to analyze data coming from instrumentation. The module which connects to the instrumentation needs to be able to broadcast data to other modules for them to analyze. They, in turn, need to broadcast their analyzed data to output modules.
At first glance pub-sub with ZeroMQ seemed perfect for the job, but messages get dropped if any subscriber slows down and hits the high watermark. In the case of this system, it is not acceptable for messages to be dropped at only a fraction of the modules because of event continuity. All the modules need to analyze an event for the output to be meaningful. However, if no modules received the messages for an event, that would be fine. For this reason, it would be okay to block the publisher (the instrumentation module) if one of the analysis modules hit the high watermark.
I suppose another alternative is to deal with missed messages after the fact, but that just wastes processing time on events that would be discarded later.
EDIT:
I guess thinking about this further, I currently expect a message sent = message delivered because I'm using inproc and communicating between threads. However, if I were to send messages over TCP there is a chance that the message could be lost even if ZeroMQ does not drop it on purpose. Does this mean I probably need to deal with dropped messages even if I use a blocking send? Are there any guarantees about message delivery with inproc?
In general, I think there's no way of providing a guarantee for pub/sub on its own with 0MQ. If you really need completely reliable messaging, you're going to have to roll your own.
Networks are inherently unreliable, which is why TCP does so much handshaking just to get packets across.
As ever, it's a balance between latency and throughput. If you're prepared to sacrifice throughput, you can do message handshaking yourself - perhaps using REQ/REP - and handle the broadcasting yourself.
The 0MQ guide has some ideas on how to go about at least part of what you want here.
I agree with SteveL. If you truly need 100% reliability (or close to it), ZeroMQ is probably not your solution. You're better off going with commercial messaging products where guaranteed message delivery and persistence are addressed. Otherwise, you'll be coding reliability features on top of ZeroMQ and will likely pull your hair out in the process. Would you implement your own app server if you required ACID compliance between your application and database? Unless you want to implement your own transaction manager, you'd buy WebLogic, WebSphere, or JBoss to do it for you.
Does this mean I probably need to deal with dropped messages even if I use a blocking send?
I'd stay away from explicitly blocking anything, it's too brittle. A synchronous sender could hang indefinitely if something goes wrong on the consumption side. You could address this using polling and timeouts, but again, it's brittle and messy code; stick with asynchronous.
Are there any guarantees about message delivery with inproc?
Well, one thing is guaranteed; you're not dealing with physical sockets, so any network issues are eliminated.
This question comes up on search engines, so I just wanted to update.
You can stop ZeroMQ from silently dropping messages when using PUB sockets. You can set the ZMQ_XPUB_NODROP option, which instead makes the send fail with an error (EAGAIN) when the send buffer is full.
With that information, you could create something like a dead letter queue, as mentioned here, and keep trying to resend with sleeps in between.
Efficiently handling this problem may not be possible currently, as there does not appear to be a way to be notified when the send buffer in ZeroMQ is no longer full, which means timed sleeps / polling may be the only way to find out if the send queue has room again so the messages can be published.
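The "keep trying to resend with sleeps in between" idea can be sketched as a bounded retry loop. This is a minimal sketch, not ZeroMQ code: try_send is a hypothetical callable that stands in for a non-blocking send and returns false when the send buffer is full, and the function name, attempt limit and pause length are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <thread>

// Retries a non-blocking send until it succeeds or max_attempts is reached,
// sleeping for `pause` between attempts.
// try_send is a hypothetical stand-in: it should return true when the message
// was accepted and false when the send buffer was full.
// Returns false if every attempt failed, so the caller can park the message
// (for example in a dead letter queue) instead of losing it silently.
bool send_with_retry(const std::function<bool(const std::string&)>& try_send,
                     const std::string& message, int max_attempts,
                     std::chrono::milliseconds pause) {
    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        if (try_send(message)) return true;
        if (attempt < max_attempts) std::this_thread::sleep_for(pause);
    }
    return false;
}
```

Bounding the number of attempts keeps the publisher from hanging forever behind one stalled subscriber, which is the brittleness an unbounded blocking send would introduce.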

Looking for best approach to sending the same data to multiple destinations using sockets

Looking for the best approach to sending the same message to multiple destinations using TCP/IP sockets. I'm working with an existing VS 2010 C++ application on Windows. Hoping to use a standard library/design pattern approach that has many of the complexities already worked out if possible.
Here's one approach I'm thinking about. One main thread retrieves messages from a database and adds them to some sort of thread-safe queue. The application also has one thread for each client socket connection to some destination server. Each one of these threads would read from the thread-safe queue and send the message over a TCP/IP socket.
There may be better/simpler/more robust approaches than this one though.
The issues I have to be concerned about mostly are latency. The destinations could be anywhere, and there may be significant latency between one socket connection and another.
The messages must go in an exact FIFO order to all the destinations.
Also, one destination will be considered the primary destination: all messages must get to this destination, no exceptions. For the other, non-primary destinations, the messages are just copies, and it's not absolutely critical if they do not receive a few messages. At any point, one of the non-primary destinations could become the primary destination. If one of the destinations falls too far behind, then that thread would need to catch up to the primary destination by skipping some messages.
Looking for any suggestions. Preliminary research so far, my situation appears to be something akin to a single producer and multiple consumers pattern, or possibly master-worker pattern in Java.
I need to implement this in C++ on Windows, and the application must use tcp/ip sockets using an existing defined protocol.
Any help at all would be greatly appreciated.
You need exactly two threads, one that saturates the IO channel to the database and another that saturates the IO channel to the network leading to the 12 servers. Unless you have multiple network interfaces (which you should think about!) you don't send things faster by using multiple threads. Also, since you don't have multiple threads taking care of the network, you don't have to sync them.
What you definitely need to know about is select(). In the case of WinSock, also take a look at WSAEventSelect/WaitForMultipleObjects. Basically, you take a message from the queue and then send it to all clients when they're ready. select() tells you when one of a set of sockets is ready to accept data, so you don't waste time waiting or block trying to send data. What you need to come up with is a scheme for reconnecting after broken connections, for deciding when to drop messages to lagging clients, etc. Also, in case the throughput to the different targets varies a lot, you need to think about handling multiple messages in parallel. If they are small (less than a network packet's payload), it makes sense to combine them anyway to avoid overhead.
I hope this short overview helps you get started; otherwise I can elaborate on the details.
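One possible scheme for "when to drop messages to lagging clients" is a per-destination FIFO backlog with a cap that applies only to non-primary destinations. The sketch below covers only that bookkeeping, under assumptions: the Destination struct, the function names and the max_backlog parameter are hypothetical, and the actual socket I/O (calling next_to_send when select() reports a socket as writable) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical per-destination state: messages not yet sent, in FIFO order.
struct Destination {
    bool primary = false;
    std::deque<std::string> backlog;
    std::size_t dropped = 0;  // how many messages were skipped to catch up
};

// Appends `message` to every destination's backlog. The primary never drops
// anything; a non-primary whose backlog exceeds max_backlog skips its oldest
// messages, so it catches up while keeping the remaining ones in FIFO order.
void enqueue_for_all(std::vector<Destination>& destinations,
                     const std::string& message, std::size_t max_backlog) {
    assert(max_backlog > 0);
    for (Destination& d : destinations) {
        d.backlog.push_back(message);
        while (!d.primary && d.backlog.size() > max_backlog) {
            d.backlog.pop_front();
            ++d.dropped;
        }
    }
}

// Intended to be called when the destination's socket is reported writable.
// Copies the next message into `out` and returns true, or returns false if
// the backlog is empty.
bool next_to_send(Destination& d, std::string& out) {
    if (d.backlog.empty()) return false;
    out = d.backlog.front();
    d.backlog.pop_front();
    return true;
}
```

Promoting a non-primary destination to primary is then just a matter of flipping its flag; from that point on its backlog is no longer trimmed.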

Is it helpful to use ZeroMQ to build a peer-to-peer workload scheduler?

I am coding a workload scheduler. I would like my piece of software to be a peer-to-peer scheduler, i.e. a node only knows some neighbours (other nodes) and uses them to reach other nodes.
Each node would have its own weighted routing table to send messages to other peers (basically based on the number of hops), e.g. "I want the master to give me my schedule" or "is resource A available on node B?": which neighbour is the closest to my target?
For instance I have written my own routing protocol using XML-RPC (xmlrpc-c) and std::multimaps / std::maps.
I am thinking of using ZeroMQ to optimize my data streams:
queueing can reduce the network load between peers;
subscriptions can be used to publish upgrades.
As a consequence :
I would need to open as many sockets as I would create new types of connections ;
Each node would need to be a client, a server, a publisher, a subscriber, a broker and a directory ;
I am not sure that my "peer-to-peer architecture" is compatible with the main purpose of ZeroMQ.
Do you think that ZeroMQ can be a helpful concept to use ?
It would be helpful to know exactly what you mean by "routing protocol".
That sounds like you mean the business logic of routing to a particular peer.
Knowing more fully what you're looking to achieve with ZeroMQ would also be helpful.
Have you read the ZeroMQ Guide?
ZeroMQ is a pretty different beast and without spending some time to play with it, you'll likely find yourself confused. As a bonus, reading the guide will also help you answer this question for yourself, since you know your requirements better.
ZeroMQ was designed to build robust distributed and multi-threaded applications. Since distributed applications can often take the form of "peer-to-peer", ZeroMQ could indeed be a good fit for your needs.

Web application background processes, newbie design question

I'm building my first web application after many years of desktop application development (I'm using Django/Python but maybe this is a completely generic question, I'm not sure). So please beware - this may be an ultra-newbie question...
One of my user processes involves heavy processing on the server (i.e. user inputs something, server needs ~10 minutes to process it). In a desktop application, what I would do is throw the user input into a queue protected by a mutex, and have a dedicated low-priority background thread blocking on the queue using that mutex.
However in the web application everything seems to be oriented towards synchronization with the HTTP requests.
Assuming I will use the database as my queue, what is best practice architecture for running a background process?
There are two schools of thought on this (at least).
1) Throw the work on a queue and have something else outside your web-stack handle it.
2) Throw the work on a queue and have something else in your web-stack handle it.
In either case, you create work units in a queue somewhere (e.g. a database table) and let some process take care of them.
I typically work with number 1 where I have a dedicated windows service that takes care of these things. You could also do this with SQL jobs or something similar.
The advantage to item 2 is that you can more easily keep all your code in one place--in the web tier. You'd still need something that triggers the execution (e.g. loading the web page that processes work units with a sufficiently high timeout), but that could be easily accomplished with various mechanisms.
Since:
1) This is a common problem,
2) You're new to your platform
-- I suggest that you look in the contributed libraries for your platform to find a solution to handle the task. In addition to queuing and processing the jobs, you'll also want to consider:
1) status communications between the worker and the web-stack. This will enable web pages that show the percentage complete number for the job, assure the human that the job is progressing, etc.
2) How to ensure that the worker process does not die.
3) If a job has an error, will the worker process automatically retry it periodically?
Will you or an operations person be notified if a job fails?
4) As the number of jobs increase, can additional workers be added to gain parallelism?
Or, even better, can workers be added on other servers?
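The status and retry considerations above can be sketched as a small work-unit record. This is an illustration only, written in C++ rather than Django/Python; the Job struct, its fields and record_attempt are hypothetical names, and the same fields would map onto the columns of a database table used as the queue.

```cpp
#include <cassert>
#include <string>

// Possible states of one work unit in the queue.
enum class JobStatus { Pending, Running, Done, Failed };

// Hypothetical work-unit record; with a database-backed queue, each field
// would be a column that the web-stack can read to show progress.
struct Job {
    std::string input;
    JobStatus status = JobStatus::Pending;
    int attempts = 0;
    int percent_complete = 0;
};

// Records the outcome of one processing attempt by a worker.
// A failed job goes back to Pending (eligible for retry) until max_attempts
// is reached, after which it is marked Failed so that a person can be told.
void record_attempt(Job& job, bool succeeded, int max_attempts) {
    assert(max_attempts > 0);
    ++job.attempts;
    if (succeeded) {
        job.status = JobStatus::Done;
        job.percent_complete = 100;
    } else if (job.attempts < max_attempts) {
        job.status = JobStatus::Pending;
    } else {
        job.status = JobStatus::Failed;
    }
}
```

Keeping the status on the work unit itself is what lets several workers, possibly on other servers, share one queue: each picks up a Pending job and writes the result back.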
If you can't find a good solution in Django/Python, you can also consider porting a solution from another platform to yours. I use delayed_job for Ruby on Rails. The worker process is managed by runit.
Regards,
Larry
Speaking generally, I'd look at running background processes on a different server, especially if your web server has any kind of load.
Running long processes in Django: http://iraniweb.com/blog/?p=56