ZeroMQ - pub / sub latency - c++

I'm looking into ZeroMQ to see if it's a fit for a soft-realtime application. I was very pleased to see that the latency for small payloads was in the range of 30 microseconds or so. However, in my simple tests I'm getting about 300 microseconds.
I have a simple publisher and subscriber, basically copied from examples off the web and I'm sending one byte through localhost.
I've played around for about two days w/ different sockopts and I'm striking out.
Any help would be appreciated!
publisher:
#include <iostream>
#include <zmq.hpp>
#include <unistd.h>
#include <sys/time.h>
int main()
{
    zmq::context_t context (1);
    zmq::socket_t publisher (context, ZMQ_PUB);
    publisher.bind("tcp://*:5556");
    struct timeval timeofday;
    zmq::message_t msg(1);
    while(true)
    {
        gettimeofday(&timeofday,NULL);
        publisher.send(msg);
        std::cout << timeofday.tv_sec << ", " << timeofday.tv_usec << std::endl;
        usleep(1000000);
    }
}
subscriber:
#include <iostream>
#include <zmq.hpp>
#include <sys/time.h>
int main()
{
    zmq::context_t context (1);
    zmq::socket_t subscriber (context, ZMQ_SUB);
    subscriber.connect("tcp://localhost:5556");
    subscriber.setsockopt(ZMQ_SUBSCRIBE, "", 0);
    struct timeval timeofday;
    zmq::message_t update;
    while(true)
    {
        subscriber.recv(&update);
        gettimeofday(&timeofday,NULL);
        std::cout << timeofday.tv_sec << ", " << timeofday.tv_usec << std::endl;
    }
}

Is the Task Definition real?
When it comes to any kind of real-time design, validating that the architecture is capable of meeting the requirements matters more than the implementation that follows.
Taken as-is, your source code produces readings ( which would ideally be posted together with the code snippets, so that an MCVE re-test can cross-validate them ) that will not tell much: the numbers do not distinguish how much time was spent in the sending-side loop, in the sending-side ZeroMQ data acquisition / copy / scheduling / wire-level formatting / datagram dispatch, and in the receiving-side unloading from the media / copy / decode / pattern-match / propagation to the receiver buffer(s).
If interested in ZeroMQ internals, there are good performance-related application notes available.
If striving for a minimum-latency design:
- remove all overheads
- remove all TCP-header processing from the proposed PUB/SUB channel ( i.e. pick a transport class other than tcp:// )
- avoid all non-cardinal logic overheads in processing: there is no sense in spending time on subscribe-side pattern-matching encoded in the selected archetype's processing ( sure, newer versions of ZMQ have moved to publisher-side filtering, but the idea is clear ); using ZMQ_PAIR avoids any of it, independently of the transport class. If something is intended to block, then rather change the signalling socket layout accordingly, so as to avoid blocking in principle ( this ought to be a real-time system, as you have said above )
- apply "latency-masking" where possible on the target multi-core / many-core hardware architectures, so as to squeeze the last drops of spare time out of your hardware / tools capabilities ... benchmark experimental setups that use the help of more I/O-threads: zmq::context_t context( N );, where N > 1
Missing target:
As the Cheshire Cat told Alice more than a century ago, if you do not care where you want to get to, it does not matter which way you go.
With a soft-real-time ambition, it should not be a problem to state a maximum allowed end-to-end latency and to derive from it a constraint for the transport-layer latency.
Without that, 30 us, 300 us or even 3 ms have no meaning per se, so no one can decide whether these figures are "enough" for some subsystem or not.
A reasonable next step:
- define the real-time stability horizon(s) ... if used for real-time control
- define the real-time design constraints ... for signal / data acquisition(s), for processing task(s), for self-diagnostic & control services
- avoid any blocking, design-wise, & validate / prove that no blocking will ever appear under any possible real-world operating circumstances [ formal proof methods are ready for such a task ] ( no one would like to see an AlertPanel [ Waiting for data ] during the next jet landing, or have a lovely looking animated [hour-glass] icon, moving the sand while the control system got busy in a devastatingly blocking manner, whatever the reason, be the last thing seen before an autonomous car crashes right into a wall ).
Quantified targets make sense for testing.
If a given threshold permits a 500 ms stability horizon ( which may be a safe value for a slo-mo hydraulic actuator / control loop, but will fail for a guided-missile control system, and even more so for any [mass&momentum-of-inertia]-less system, like the DSP family of RT-control systems ), you can test end-to-end whether your processing fits within it.
If you know that your incoming data stream brings about 10 kB every 500 us, you can test whether your design can keep pace with the burst traffic or not.
If the test shows that your mock-up design misses the target ( does not meet the performance / time-constrained figures ), you know pretty well where the design or the architecture needs to be improved.
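To make the idea of a quantified target concrete, here is a minimal sketch that checks a set of measured latencies against a stated budget. The names percentile_us and meets_budget, the nearest-rank percentile method and the sample figures are all assumptions made for illustration; nothing here comes from ZeroMQ itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Nearest-rank percentile of latency samples given in microseconds.
// p must be in (0, 100] and the sample set must not be empty.
double percentile_us(std::vector<double> samples, double p)
{
    assert(!samples.empty() && p > 0.0 && p <= 100.0);
    std::sort(samples.begin(), samples.end());
    // Smallest sample such that at least p percent of all samples are <= it.
    const std::size_t rank =
        static_cast<std::size_t>(std::ceil(p / 100.0 * samples.size()));
    return samples[rank - 1];
}

// True when the chosen percentile stays within the stated latency budget.
bool meets_budget(const std::vector<double>& samples, double p, double budget_us)
{
    return percentile_us(samples, p) <= budget_us;
}
```

With samples of 30, 32, 35, 40 and 300 us and a budget of 100 us, the median passes while the worst case does not, which is exactly the kind of statement a bare "about 300 us" cannot support.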

First make sure you run producer and consumer on different physical cores (not HT).
Second, it depends A LOT on the hardware and OS. Last time I measured kernel IO (4-5 years ago) the results were indeed 10 to 20us around send/recv system calls.
You have to optimize your kernel settings for low latency and set TCP_NODELAY.

Related

ZeroMQ: how to reduce multithread-communication latency with inproc?

I'm using inproc and PAIR to achieve inter-thread communication and trying to solve a latency problem due to polling. Correct me if I'm wrong: Polling is inevitable, because a plain recv() call will usually block and cannot take a specific timeout.
In my current case, among N threads, each of the N-1 worker threads has a main while-loop. The N-th thread is a controller thread which can notify all the worker threads to quit at any time. However, worker threads have to use polling with a timeout to get that quit message. This introduces latency; the timeout is usually 1000 ms.
Here is an example
while (true) {
    const std::chrono::milliseconds nTimeoutMs(1000);
    std::vector<zmq::poller_event<std::size_t>> events(n);
    size_t nEvents = m_poller.wait_all(events, nTimeoutMs);
    bool isToQuit = false;
    for (auto& evt : events) {
        zmq::message_t out_recved;
        try {
            evt.socket.recv(out_recved, zmq::recv_flags::dontwait);
        }
        catch (std::exception& e) {
            trace("{}: Caught exception while polling: {}. Skipped.", GetLogTitle(), e.what());
            continue;
        }
        if (!out_recved.empty()) {
            if (IsToQuit(out_recved))
                isToQuit = true;
            break;
        }
    }
    if (isToQuit)
        break;
    //
    // main business
    //
    ...
}
To make things worse, when the main loop has nested loops, the worker threads then need to include more polling code in each layer of the nested loops. Very ugly.
The reason why I chose ZMQ for multithread communication is because of its elegance and the potential of getting rid of thread-locking. But I never realized the polling overhead.
Am I able to achieve the typical latency when using a regular mutex or an std::atomic data operation? Should I understand that the inproc is in fact a network communication pattern in disguise so that some latency is inevitable?
A statement ( a hypothesis ) posted above:
"...a plain recv() call will usually block and cannot take a specific timeout."
is not correct:
- a .recv( ZMQ_NOBLOCK )-call ( the flag is named ZMQ_DONTWAIT in current libzmq and zmq::recv_flags::dontwait in cppzmq ) will never "block",
- a .recv( ZMQ_NOBLOCK )-call can get decorated so as to mimic "a specific timeout"
Another statement ( a hypothesis ) posted above:
"...have to use polling with a timeout ... introduces a latency, the latency parameter is usually 1000ms."
is not correct:
- one need not use polling with a timeout
- even less need one set a 1000 ms code-"injected" latency, which is obviously spent only in the no-new-message state
Q : "Am I able to achieve the typical latency when using a regular mutex or an std::atomic data operation?"
Yes.
Q : "Should I understand that the inproc is in fact a network communication pattern in disguise so that some latency is inevitable?"
No. inproc-transport-class is the fastest of all these kinds as it is principally protocol-less / stack-less and has more to do with ultimately fast pointer-mechanics, like in a dual-end ring-buffer pointer-management.
The Best Next Step:
1 ) Re-factor your code, so as to always use only the zero-wait { .poll() | .recv() }-methods, properly decorated for both { event- | no-event- }-specific looping.
2 ) If then willing to shave the last few [us] from the smart-loop-detection turn-around-time, one may focus on an improved Context()-instance, setting it to work with a larger number of nIOthreads > N "under the hood".
optionally 3 ) For almost hard-real-time system designs, one may finally use a deterministic mapping of the Context()-threads and sockets onto specific, non-overlapping CPU-cores ( using a carefully crafted affinity-map ).
Having set 1000 [ms] in code, no one can fairly complain about spending those very 1000 [ms] waiting in a timeout that she / he coded. No excuse for doing this.
Do not blame ZeroMQ for behaviour that was coded on the application side of the API.
Never.
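A minimal sketch of the zero-wait looping from step 1 could look as follows. To keep it self-contained it does not touch ZeroMQ at all: TryReceive is a hypothetical callable that stands in for a non-blocking receive ( with cppzmq it would presumably wrap a recv() using zmq::recv_flags::dontwait, as in the question ), and run_worker and the "QUIT" payload are likewise names assumed for illustration.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Hypothetical non-blocking receive: returns a message if one is already
// queued and std::nullopt otherwise. It must never wait.
using TryReceive = std::function<std::optional<std::string>()>;

// Runs do_work() until a "QUIT" message is seen and returns the number of
// completed work iterations. The quit check costs one non-blocking call per
// iteration instead of a timed poll.
int run_worker(const TryReceive& try_receive, const std::function<void()>& do_work)
{
    int iterations = 0;
    while (true) {
        // Drain whatever is already queued, without waiting for more.
        while (auto msg = try_receive()) {
            if (*msg == "QUIT")
                return iterations;
        }
        do_work();
        ++iterations;
    }
}
```

The reaction time to a quit message is then bounded by the duration of one do_work() call, not by a timeout constant. If a worker has nothing to do between messages, libzmq's ZMQ_RCVTIMEO socket option is an alternative way to give a blocking recv() a bounded wait.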

C++ Boost Serialization: Input Stream Error

Hi fellow C++ developers,
I'm trying to send a C++ class over the network with zmq and boost::serialization.
The concept is to serialize the class PlayCommand on the client, send it to the server with zmq, and then deserialize it on the server.
This works fine in the rest of the application. For some reason I get input stream errors while deserializing the PlayCommand on the server from time to time. I can not figure out why it is sometimes throwing this exception and sometimes not.
It seems to be a time sensitive problem. Do I have to wait at some point to let boost do its thing ?
std::shared_ptr<PlayCommand> _exe(dynamic_cast<PlayCommand*>(_cmd.get()));
zmq::context_t _ctx(1);
zmq::socket_t _skt(_ctx, ZMQ_PUB);
_skt.connect("tcp://0.0.0.0:" + this->kinect_daemon_com_port);
std::stringstream _type_stream;
std::stringstream _exe_stream;
boost::archive::text_oarchive _type_archive(_type_stream);
boost::archive::text_oarchive _exe_archive(_exe_stream);
_type_archive << _type;
_exe_archive << *_exe.get();
std::string _type_msg_str = _type_stream.str();
std::string _exe_msg_str = _exe_stream.str();
zmq::message_t _type_msg(_type_msg_str.length());
zmq::message_t _exe_msg(_exe_msg_str.length());
memcpy(_type_msg.data(), _type_msg_str.data(), _type_msg_str.length());
memcpy(_exe_msg.data(), _exe_msg_str.data(), _exe_msg_str.length());
_skt.send(_type_msg, ZMQ_SNDMORE);
_skt.send(_exe_msg, 0);
void ZMQMessageResolver::resolve_message(std::shared_ptr<Event> _event, unsigned _unique_thread_id)
{
std::cout << "ZMQMessageResolver::resolve_message(std::shared_ptr<Event> _event, unsigned _unique_thread_id)" << std::endl;
std::shared_ptr<ZMQMessageEvent> _zmq_event = std::static_pointer_cast<ZMQMessageEvent>(_event);
//(static_cast<ZMQMessageEvent*>(_event.get()));
ZMQMessageType _type;
PlayCommand _cmd;
auto _messages = _zmq_event->get_data();
auto _type_string = std::string(static_cast<char*>(_messages->front()->data()), _messages->front()->size());
auto _cmd_string = std::string(static_cast<char*>(_messages->back()->data()), _messages->back()->size());
std::stringstream _type_stream{_type_string};
std::istringstream _cmd_stream{_cmd_string};
boost::archive::text_iarchive _type_archive{_type_stream};
boost::archive::text_iarchive _cmd_archive{_cmd_stream};
std::cout << "1" << std::endl;
_type_archive >> _type;
std::cout << "2" << std::endl;
_cmd_archive & _cmd;
std::cout << "3" << std::endl;
std::shared_ptr<ThreadEvent> _thread_event = std::make_shared<ThreadEvent>(_zmq_event->get_event_message());
_cmd.execute(_thread_event);
std::lock_guard<std::mutex> _lock{*this->thread_mutex};
this->finished_threads.push_back(_unique_thread_id);
}
The complete project is on github: rgbd-calib and rgbd-calib-py.
The important files are /framework/ZMQMessageResolver.cpp in rgbd-calib and /src/KinectDaemon.cpp in rgbd-calib-py.
I would appreciate any help.
First insights
I checked for shared zmq::socket_t instances. I could not find any so thread safety should be a non issue.
I found out that other developers are also experiencing problems with ZMQ multi-part messages. Maybe that could be an issue in my case as well. Maybe someone has experience with those. Do I have to take any safety measures when sending and receiving multi-part messages?
If it's timing sensitive, no doubt it's unrelated to boost: the boost code shown is completely synchronous and local. If you don't always receive the full streams you get this error. Likewise if there's a protocol error interpreting the received data you might get corrupt data.
Both cases would easily lead to "input stream error".
I have no experience with 0MQ so I don't know whether the code as shown could receive incomplete messages, but I'd look into that.
A minor note is that it's rather strange to have one stringstream and the other istringstream. There might be differences in seek behaviour.
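One cheap way to look into that is to validate what was received before any archive is constructed. The sketch below is only an assumption about what such a guard might look like: check_frames and FrameCheck are hypothetical names, and the frames are modelled as plain strings rather than zmq::message_t. It relies on one observation from the posted code: the resolver reads front() and back(), so if only a single part ever arrived, both would refer to the same frame and the command archive would be fed the type string.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class FrameCheck { Ok, WrongFrameCount, EmptyFrame };

// Validates the received parts before deserialization is attempted.
// Expects exactly two non-empty frames: [type, command].
FrameCheck check_frames(const std::vector<std::string>& frames)
{
    if (frames.size() != 2)
        return FrameCheck::WrongFrameCount; // front() and back() would not be type and command
    for (const auto& frame : frames)
        if (frame.empty())
            return FrameCheck::EmptyFrame;
    return FrameCheck::Ok;
}
```

Logging the result of such a check next to the raw frame sizes would show whether the "input stream error" coincides with an unexpected message shape or with well-formed input.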
Let me add a few cents on ZeroMQ part of the story:
Fact #1: ZeroMQ never delivers a piece of trash. It delivers either a complete message ( as it was sent ) or nothing at all.
This principal design feature helps to rule out one of the claimed potential issues.
If an application does receive a ZeroMQ message, one can be sure it is a sound copy of what was dispatched from the remote process. It is simply the same. Full stop.
Fact #2: ZeroMQ architects and evangelists have warned in every chapter, since the very beginning of the ZeroMQ API v2.xx, to never share sockets among threads. Whether the code depicted above obeys this is unclear.
If one instantiates a ZeroMQ-socket AccessPoint ( of a SUB-type in the above context ), the thread owning that AccessPoint is the only thread that may manipulate it, and it should never "let" any other thread touch this toy. Never. While there have been some recent talks and efforts to re-design the ZeroMQ core so as to add thread-safety ex-post, I remain skeptical about these moves, being sure in principle that non-blocking, high-performance, low-latency designs in distributed computing should never share a common piece, precisely because of the overhead costs and the lost principal safety ( which is not easily bought back ex-post by any inter-thread signalling / locking / blocking ).
You may review the code so as to confirm or rule out any sharing of ZeroMQ socket instances ( a Context being another, separate subject in this ); where shared pieces are detected, your team ought to re-design the code so as to avoid them.
Yes. Avoid sharing and the ZeroMQ tools will serve you well.

Busy Loop/Spinning sometimes takes too long under Windows

I'm using a Windows 7 PC to output voltages at a rate of 1 kHz. At first I simply ended the thread with sleep_until(nextStartTime); however, this has proven to be unreliable, sometimes working fine and sometimes being off by up to 10 ms.
I found other answers here saying that a busy loop might be more accurate, however mine for some reason also sometimes takes too long.
while (true) {
doStuff(); //is quick enough
logDelays();
nextStartTime = chrono::high_resolution_clock::now() + chrono::milliseconds(1);
spinStart = chrono::high_resolution_clock::now();
while (chrono::duration_cast<chrono::microseconds>(nextStartTime -
chrono::high_resolution_clock::now()).count() > 200) {
spinCount++; //a volatile int
}
int spintime = chrono::duration_cast<chrono::microseconds>
(chrono::high_resolution_clock::now() - spinStart).count();
cout << "Spin Time micros :" << spintime << endl;
if (spinCount > 100000000) {
cout << "reset spincount" << endl;
spinCount = 0;
}
}
I was hoping that this would work to fix my issue, however it produces the output:
Spin Time micros :9999
Spin Time micros :9999
...
I've been stuck on this problem for the last 5 hours and I'd be very thankful if somebody knows a solution.
According to the comments, this code waits correctly:
auto start = std::chrono::high_resolution_clock::now();
const auto delay = std::chrono::milliseconds(1);
while (true) {
    doStuff(); // is quick enough
    logDelays();
    auto spinStart = std::chrono::high_resolution_clock::now();
    // busy-wait until the absolute deadline start + delay has been reached
    while (std::chrono::high_resolution_clock::now() < start + delay) {}
    int spintime = std::chrono::duration_cast<std::chrono::microseconds>
        (std::chrono::high_resolution_clock::now() - spinStart).count();
    std::cout << "Spin Time micros :" << spintime << std::endl;
    start += delay;
}
The important part is the combination of the busy-wait while (std::chrono::high_resolution_clock::now() < start + delay) {} and start += delay;. Because the deadline advances by a fixed step from the original start rather than from the current time, each period lasts delay on average, even when outside factors (a Windows update keeping the system busy) disturb it. If an iteration takes longer than delay, the following iterations run without waiting until the loop has caught up (which may be never if doStuff is sufficiently slow).
Note that missing an update (due to the system being busy) and then sending 2 at once to catch up might not be the best way to handle the situation. You may want to check the current time inside doStuff and abort/restart the transmission if the timing is wrong by more than some acceptable amount.
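The absolute-deadline idea, together with the suggestion to notice late iterations, can be sketched as a small helper. This is an illustration under assumptions, not the code from the comments: run_periodic is a hypothetical name, the loop is bounded by a count so it can be tested, and std::chrono::steady_clock is swapped in for high_resolution_clock because it is guaranteed to be monotonic.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Calls task() `count` times, one call per `period`, using absolute deadlines
// so that one late iteration does not shift all later ones.
// Returns how many deadlines had already passed when the task finished.
int run_periodic(int count, std::chrono::microseconds period,
                 const std::function<void()>& task)
{
    using clock = std::chrono::steady_clock;
    int missed = 0;
    auto deadline = clock::now() + period;
    for (int i = 0; i < count; ++i) {
        task();
        if (clock::now() >= deadline)
            ++missed;                       // over budget: no waiting this round
        while (clock::now() < deadline) {}  // busy-wait until the deadline
        deadline += period;                 // next absolute deadline
    }
    return missed;
}
```

A caller could treat a non-zero return value as the signal to abort or restart a transmission instead of silently sending two updates back to back.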
On Windows I don't think it's possible to ever get such precise timing, because you cannot guarantee your thread is actually running at the time you desire. Even with low CPU usage and your thread set to real-time priority, it can still be interrupted (by hardware interrupts, as I understand it; I never fully investigated, but I have seen even a simple while(true) ++i; type loop at real-time priority get interrupted and then moved between CPU cores). While such interrupts and switches are very quick for a real-time thread, they are still significant if you're trying to directly drive a signal without buffering.
Instead you really want to read and write buffers of digital samples (so at 1 kHz each sample is 1 ms). You need to be sure to queue another buffer before the last one is completed, which constrains how small the buffers can be. At 1 kHz and real-time priority, if the code is simple and there is no other CPU contention, a single-sample buffer (1 ms) might even be possible, which is at worst 1 ms of extra latency over "immediate", but you would have to test. You then leave it up to the hardware and its drivers to handle the precise timing (e.g. to make sure each output sample is "exactly" 1 ms, to the accuracy the vendor claims).
This basically means your code only has to be accurate to 1 ms in the worst case, rather than trying to pursue something far smaller than the OS really supports, such as microsecond accuracy.
As long as you are able to queue a new buffer before the hardware has used up the previous one, it will be able to run at the desired frequency without issue (to use audio as an example: while the tolerated latencies, and thus the buffers, are often much larger there, if you overload the CPU you can still sometimes hear audible glitches where an application didn't queue up new raw audio in time).
With careful timing you might even be able to get down to a fraction of a millisecond by waiting to process and queue your next sample as long as possible (e.g. if you need to reduce latency between input and output), but remember that the closer you cut it the more you risk submitting it too late.

ZeroMQ (cppzmq) subscriber skips first message

I'm trying to use ZMQ with the CPPZMQ C++ wrapper, as it seems it is the one suggested in C++ Bindings.
The client/server (REQ/REP) seems to work fine.
When trying to implement a publish/subscribe pair of programs, it looks like the first message is lost in the subscriber. Why?
publisher.cpp:
#include <boost/date_time/posix_time/posix_time.hpp>
#include <boost/thread/thread.hpp>
#include <boost/format.hpp>
#include <zmq.hpp>
#include <string>
#include <iostream>
int main()
{
zmq::context_t context(1);
zmq::socket_t publisher(context, ZMQ_PUB);
publisher.bind("tcp://*:5555");
for(int n = 0; n < 3; n++) {
zmq::message_t env1(1);
memcpy(env1.data(), "A", 1);
std::string msg1_str = (boost::format("Hello-%i") % (n + 1)).str();
zmq::message_t msg1(msg1_str.size());
memcpy(msg1.data(), msg1_str.c_str(), msg1_str.size());
std::cout << "Sending '" << msg1_str << "' on topic A" << std::endl;
publisher.send(env1, ZMQ_SNDMORE);
publisher.send(msg1);
zmq::message_t env2(1);
memcpy(env2.data(), "B", 1);
std::string msg2_str = (boost::format("World-%i") % (n + 1)).str();
zmq::message_t msg2(msg2_str.size());
memcpy(msg2.data(), msg2_str.c_str(), msg2_str.size());
std::cout << "Sending '" << msg2_str << "' on topic B" << std::endl;
publisher.send(env2, ZMQ_SNDMORE);
publisher.send(msg2);
boost::this_thread::sleep(boost::posix_time::milliseconds(1000));
}
return 0;
}
subscriber.cpp:
#include <zmq.hpp>
#include <string>
#include <iostream>
int main()
{
zmq::context_t context(1);
zmq::socket_t subscriber(context, ZMQ_SUB);
subscriber.connect("tcp://localhost:5555");
subscriber.setsockopt(ZMQ_SUBSCRIBE, "B", 1);
while(true)
{
zmq::message_t env;
subscriber.recv(&env);
std::string env_str = std::string(static_cast<char*>(env.data()), env.size());
std::cout << "Received envelope '" << env_str << "'" << std::endl;
zmq::message_t msg;
subscriber.recv(&msg);
std::string msg_str = std::string(static_cast<char*>(msg.data()), msg.size());
std::cout << "Received '" << msg_str << "'" << std::endl;
}
return 0;
}
Program output:
$ ./publisher
Sending 'Hello-1' on topic A
Sending 'World-1' on topic B
Sending 'Hello-2' on topic A
Sending 'World-2' on topic B
Sending 'Hello-3' on topic A
Sending 'World-3' on topic B
$ ./subscriber
Received envelope 'B'
Received 'World-2'
Received envelope 'B'
Received 'World-3'
(note: subscriber is executed before executing publisher)
Bonus question: By the way, is it just my impression, or is this C++ wrapper quite low level? I see no direct support for std::string, and the code to transmit a simple string looks quite verbose.
Found the answer in the ZeroMQ Guide:
There is one more important thing to know about PUB-SUB sockets: you
do not know precisely when a subscriber starts to get messages. Even
if you start a subscriber, wait a while, and then start the publisher,
the subscriber will always miss the first messages that the publisher
sends. This is because as the subscriber connects to the publisher
(something that takes a small but non-zero time), the publisher may
already be sending messages out.
This "slow joiner" symptom hits enough people often enough that we're
going to explain it in detail. Remember that ZeroMQ does asynchronous
I/O, i.e., in the background. Say you have two nodes doing this, in
this order:
Subscriber connects to an endpoint and receives and counts messages.
Publisher binds to an endpoint and immediately sends 1,000 messages.
Then the subscriber will most likely not receive anything. You'll
blink, check that you set a correct filter and try again, and the
subscriber will still not receive anything.
Making a TCP connection involves to and from handshaking that takes
several milliseconds depending on your network and the number of hops
between peers. In that time, ZeroMQ can send many messages. For sake
of argument assume it takes 5 msecs to establish a connection, and
that same link can handle 1M messages per second. During the 5 msecs
that the subscriber is connecting to the publisher, it takes the
publisher only 1 msec to send out those 1K messages.
In Chapter 2 - Sockets and Patterns we'll explain how to synchronize a
publisher and subscribers so that you don't start to publish data
until the subscribers really are connected and ready. There is a
simple and stupid way to delay the publisher, which is to sleep. Don't
do this in a real application, though, because it is extremely fragile
as well as inelegant and slow. Use sleeps to prove to yourself what's
happening, and then wait for Chapter 2 - Sockets and Patterns to see
how to do this right.
The alternative to synchronization is to simply assume that the
published data stream is infinite and has no start and no end. One
also assumes that the subscriber doesn't care what transpired before
it started up. This is how we built our weather client example.
So the client subscribes to its chosen zip code and collects 100
updates for that zip code. That means about ten million updates from
the server, if zip codes are randomly distributed. You can start the
client, and then the server, and the client will keep working. You can
stop and restart the server as often as you like, and the client will
keep working. When the client has collected its hundred updates, it
calculates the average, prints it, and exits.
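The arithmetic in the quoted passage can be written down as a tiny model. It is a deliberate simplification made for illustration (constant send rate, nothing queued for a peer that is not yet connected), and messages_missed is a hypothetical helper, not part of ZeroMQ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Number of messages published while the subscriber's connection is still
// being established, capped at the total number of messages published.
// Simplified model: constant send rate, no queuing for unconnected peers.
std::int64_t messages_missed(double handshake_ms, double rate_per_second,
                             std::int64_t total_messages)
{
    assert(handshake_ms >= 0.0 && rate_per_second >= 0.0 && total_messages >= 0);
    const double sent_during_handshake = handshake_ms * rate_per_second / 1000.0;
    return std::min(total_messages,
                    static_cast<std::int64_t>(sent_during_handshake));
}
```

With the Guide's figures (5 ms handshake, 1M messages per second, 1,000 messages in total) the model reports all 1,000 messages as missed. In the program above the publisher sends only one pair of messages per second, so under the same model it is just the first "B" message that falls into the connection window.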
Bonus answer:
ZeroMQ has been designed for high-performance messaging / signalling and as such has some design-maxims, around which the core-parts have been developed.
Zero-Copy and Zero-Sharing are the better-known ones, Zero-(almost)-Latency might be a ( bit ) provocative one, and Zero-Warranty is perhaps the one you would least like to hear about.
Yes, ZeroMQ does not strive to provide any explicit warranty ( naturally, for many reasons common in the world of distributed systems ), yet it does give you one warranty of this kind: any message is either delivered atomically ( i.e. complete, error-free ) or not at all ( so one will never have to pay any extra costs associated with detecting and discarding runts and/or broken message payloads ).
So you may rather forget worrying about undelivered packets, about what would happen if they were delivered, etc. You simply get as much as possible, and the rest is not under your influence ( "late-joiner" cases can be considered a boundary: even if one were in a position to enforce more time for the "slow-joiner"(s), no such observable difference should change the code design, so rather try to design distributed systems to be robust against ( principally ) possible undelivered signals / messages ).
API? Wrapper...
If interested in this level of detail, I would recommend reading the native API documentation, from some v2.x onwards, so as to better realise all the thoughts that were put behind the striving for maximum performance ( the Zero-Copy motivated set of message-preparation steps, advanced API calls for messages that would get re-sent, memory-leak prevention, advanced IO-thread-pool maps for increased IO-throughput / reduced latency / relative prioritisation et al ).
After this, one may review how well ( or how poorly ) any respective non-native language binding ( wrapper ) reflects these initial design efforts in the cross-ported programming environment.
Most such efforts have run into trouble right at finding a reasonable balance between user-programming comfort, the expressivity constraints of the target programming environment and minimising the sins of leaking memory or of a compromised quality of the API binding / wrapper.
It is fair to note that designing a non-native language binding is one of the most challenging tasks. Thus one ought to bear with the brave teams who decided to step into this territory ( and sometimes failed to mirror all the native-API strengths without degraded performance and/or clarity of the original intents ). Needless to add, many native-API features might even be left inaccessible from environments that cannot integrate them seamlessly within the scope of the non-native language's expressivity, so care is to be taken when evaluating an API binding / wrapper ( and the original native API will always help to get to the roots of ZeroMQ's original powers ). Anyway, in most corner cases one may try to inline in critical sections.

0MQ telnet data C++

I'm trying to send telnet commands with 0MQ from C++ on VS2013.
I used the HW (Hello World) client sample code from the ZMQ homepage.
But what I see in Wireshark is a telnet packet with no data inside.
This code is a prototype; all I need is to be able to send this command.
Once it works, it will get some cleaning.
//
// Hello World client in C++
// Connects REQ socket to tcp://localhost:5555
// Sends "Hello" to server, expects "World" back
//
#include <zmq.hpp>
#include <zmq.h>
#include <string>
#include <iostream>
int main()
{
// Prepare our context and socket
zmq::context_t context(1);
zmq::socket_t socket(context, ZMQ_REQ);
std::cout << "Connecting to hello world server…" << std::endl;
socket.connect("tcp://10.40.6.226:23");
// Do 10 requests, waiting each time for a response
for (int request_nbr = 0; request_nbr != 1; request_nbr++) {
zmq::message_t request(5); // must hold all 5 bytes copied below
memcpy(request.data(), "Hello", 5);
std::cout << "Sending Hello " << request_nbr << "…" << std::endl;
socket.send(request);
//client_socket
// Get the reply.
/*zmq::message_t reply;
socket.recv(&reply);
std::cout << "Received World " << request_nbr << std::endl;*/
}
return 0;
}
So everything looks good, except that I cannot see the string "Hello" in the telnet packet.
Original sample http://zguide.zeromq.org/cpp:hwclient
Yes, one can send telnet commands over ZeroMQ
There is no principal obstacle in doing this. Once you correctly setup the end-to-end relation over ZeroMQ, your telnet-commands may smoothly flow across the link, meeting all the required underlying protocol-specific handshaking and event-handling.
Why it does not work here?
The strongest reason "behind" the observed scenario is, that you have missed the essence of the ZeroMQ Formal Communication Patterns framework.
ZeroMQ sockets are not "plain" sockets, as the re-use of the word socket might suggest. There would be close to no benefit if ZeroMQ just mimicked a dumb socket already available from the operating system. The greatest intellectual value one may gain from ZeroMQ is based on exactly the opposite approach. Thanks to the several thousand man-years of experience that were put into the birth of AMQP, ZeroMQ and their younger successors, there are smart features built into the framework, which we are happy to re-use in our application domains rather than re-inventing the wheel.
The best next step?
Supposing one's interest in smart messaging is not lost, the best next step IMHO is to spend one's time on reading a great book "Code Connected, Vol.1" from Pieter HINTJENS, a co-father of the ZeroMQ >>> https://stackoverflow.com/a/25742744/3666197
+ a minor note, why the code does not move any data over a wire
A good design practice built into the ZeroMQ architecture separates the transport per se from the connection-state of a socket-archetype. That said, one may "pump data into" the local end of a socket-archetype ( your code .send()-s in a for loop ) while the remote end need not be online throughout that whole episode ( or at all ). This means the PHY-layer ( the wire ) will see and transport data if-and-only-if both endpoints of the Formal Communication Pattern agree to do so.
In the REQ/REP scenario that means
1. {REQ|REP}.bind() <-online-visibility-episode-> {REP|REQ}.connect() state
2. REQ.send()-> REP.recv()
3. REP.send()-> ( REQ.recv())
4. REQ.send()->
keeping the nature of the Merry-Go-Round policy of the REQ/REP Formal Communication Pattern "forward-stepping".
In the posted for(){...} code-block this means that, if step 1. is met, you may wire-detect just the first and only message from REQ to REP, as you do not take care to perform the mandatory steps 2. & 3. and .recv() a response from REP before the REQ behavioural model will allow the next request to be sent ( which is the core nature of the REQ/REP pattern, isn't it? ).
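The strict alternation on the REQ side can be pictured with a toy state model. This is not libzmq code, only a sketch of the rule; ReqModel is a hypothetical class. On a real REQ socket, an out-of-turn call fails with the error EFSM.

```cpp
#include <cassert>

// Toy model of the REQ side's strict send / recv alternation; not libzmq code.
class ReqModel {
public:
    // A request may be sent only when no reply is outstanding.
    bool send()
    {
        if (awaiting_reply_)
            return false; // a real REQ socket would report EFSM here
        awaiting_reply_ = true;
        return true;
    }

    // A reply may be received only after a request has been sent.
    bool recv()
    {
        if (!awaiting_reply_)
            return false; // nothing was asked, so there is nothing to receive
        awaiting_reply_ = false;
        return true;
    }

private:
    bool awaiting_reply_ = false;
};
```

A second send() without a recv() in between is refused, which is why a loop with the receive part commented out cannot put more than its first request on the wire.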
As your ZeroMQ insight grows, you will also get used to checking the errors returned by the respective function calls.
A .connect() attempt directed ( over port 23 ) into the hands of a telnet daemon will be visible on the wire level. However, the telnet daemon is waiting for nothing else but a telnet-protocol session-setup dialogue, so it will hardly be happy with a correctly formed ZeroMQ wire-level protocol message ( which will also surprise a wire-level sniffer that assumes telnet ). In the described scenario the protocol-level handshake simply must fail.