ZeroMQ (cppzmq) subscriber skips first message - c++

I'm trying to use ZMQ with the CPPZMQ C++ wrapper, as it seems it is the one suggested in C++ Bindings.
The client/server (REQ/REP) seems to work fine.
When trying to implement a publish/subscribe pair of programs, it looks like the first message is lost in the subscriber. Why?
publisher.cpp:
#include <boost/date_time/posix_time/posix_time.hpp>
#include <boost/thread/thread.hpp>
#include <boost/format.hpp>
#include <zmq.hpp>
#include <cstring>   // std::memcpy
#include <string>
#include <iostream>
int main()
{
    zmq::context_t context(1);
    zmq::socket_t publisher(context, ZMQ_PUB);
    publisher.bind("tcp://*:5555");
    for(int n = 0; n < 3; n++) {
        // Topic "A": one envelope frame followed by one payload frame
        zmq::message_t env1(1);
        std::memcpy(env1.data(), "A", 1);
        std::string msg1_str = (boost::format("Hello-%i") % (n + 1)).str();
        zmq::message_t msg1(msg1_str.size());
        std::memcpy(msg1.data(), msg1_str.c_str(), msg1_str.size());
        std::cout << "Sending '" << msg1_str << "' on topic A" << std::endl;
        publisher.send(env1, ZMQ_SNDMORE);
        publisher.send(msg1);
        // Topic "B": same two-frame layout
        zmq::message_t env2(1);
        std::memcpy(env2.data(), "B", 1);
        std::string msg2_str = (boost::format("World-%i") % (n + 1)).str();
        zmq::message_t msg2(msg2_str.size());
        std::memcpy(msg2.data(), msg2_str.c_str(), msg2_str.size());
        std::cout << "Sending '" << msg2_str << "' on topic B" << std::endl;
        publisher.send(env2, ZMQ_SNDMORE);
        publisher.send(msg2);
        boost::this_thread::sleep(boost::posix_time::milliseconds(1000));
    }
    return 0;
}
subscriber.cpp:
#include <zmq.hpp>
#include <string>
#include <iostream>
int main()
{
    zmq::context_t context(1);
    zmq::socket_t subscriber(context, ZMQ_SUB);
    subscriber.connect("tcp://localhost:5555");
    subscriber.setsockopt(ZMQ_SUBSCRIBE, "B", 1);
    while(true)
    {
        zmq::message_t env;
        subscriber.recv(&env);
        std::string env_str = std::string(static_cast<char*>(env.data()), env.size());
        std::cout << "Received envelope '" << env_str << "'" << std::endl;
        zmq::message_t msg;
        subscriber.recv(&msg);
        std::string msg_str = std::string(static_cast<char*>(msg.data()), msg.size());
        std::cout << "Received '" << msg_str << "'" << std::endl;
    }
    return 0;
}
Program output:
$ ./publisher
Sending 'Hello-1' on topic A
Sending 'World-1' on topic B
Sending 'Hello-2' on topic A
Sending 'World-2' on topic B
Sending 'Hello-3' on topic A
Sending 'World-3' on topic B
$ ./subscriber
Received envelope 'B'
Received 'World-2'
Received envelope 'B'
Received 'World-3'
(note: subscriber is executed before executing publisher)
Bonus question: By the way, is it just my impression, or is this C++ wrapper quite low level? I see no direct support for std::string, and the code to transmit a simple string looks quite verbose.

Found the answer in the ZeroMQ Guide:
There is one more important thing to know about PUB-SUB sockets: you
do not know precisely when a subscriber starts to get messages. Even
if you start a subscriber, wait a while, and then start the publisher,
the subscriber will always miss the first messages that the publisher
sends. This is because as the subscriber connects to the publisher
(something that takes a small but non-zero time), the publisher may
already be sending messages out.
This "slow joiner" symptom hits enough people often enough that we're
going to explain it in detail. Remember that ZeroMQ does asynchronous
I/O, i.e., in the background. Say you have two nodes doing this, in
this order:
Subscriber connects to an endpoint and receives and counts messages.
Publisher binds to an endpoint and immediately sends 1,000 messages.
Then the subscriber will most likely not receive anything. You'll
blink, check that you set a correct filter and try again, and the
subscriber will still not receive anything.
Making a TCP connection involves to and from handshaking that takes
several milliseconds depending on your network and the number of hops
between peers. In that time, ZeroMQ can send many messages. For sake
of argument assume it takes 5 msecs to establish a connection, and
that same link can handle 1M messages per second. During the 5 msecs
that the subscriber is connecting to the publisher, it takes the
publisher only 1 msec to send out those 1K messages.
In Chapter 2 - Sockets and Patterns we'll explain how to synchronize a
publisher and subscribers so that you don't start to publish data
until the subscribers really are connected and ready. There is a
simple and stupid way to delay the publisher, which is to sleep. Don't
do this in a real application, though, because it is extremely fragile
as well as inelegant and slow. Use sleeps to prove to yourself what's
happening, and then wait for Chapter 2 - Sockets and Patterns to see
how to do this right.
The alternative to synchronization is to simply assume that the
published data stream is infinite and has no start and no end. One
also assumes that the subscriber doesn't care what transpired before
it started up. This is how we built our weather client example.
So the client subscribes to its chosen zip code and collects 100
updates for that zip code. That means about ten million updates from
the server, if zip codes are randomly distributed. You can start the
client, and then the server, and the client will keep working. You can
stop and restart the server as often as you like, and the client will
keep working. When the client has collected its hundred updates, it
calculates the average, prints it, and exits.
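The arithmetic in the quoted passage can be checked with a small model. This is only a sketch under the Guide's own illustrative figures (5 msec to connect, 1M messages per second, a burst of 1K messages); the function name and parameters are hypothetical and nothing here calls ZeroMQ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Idealised model of the "slow joiner" effect: every message published
// before the subscriber's connection completes is lost to that subscriber.
//   connect_ms  - assumed time to establish the connection, in milliseconds
//   rate_per_ms - assumed publisher throughput, in messages per millisecond
//   burst       - number of messages the publisher sends right after bind()
std::int64_t missed_messages(std::int64_t connect_ms,
                             std::int64_t rate_per_ms,
                             std::int64_t burst) {
    assert(connect_ms >= 0 && rate_per_ms >= 0 && burst >= 0);
    // The publisher can emit connect_ms * rate_per_ms messages while the
    // subscriber is still connecting, but never more than the burst itself.
    return std::min(burst, connect_ms * rate_per_ms);
}
```

With the Guide's numbers, missed_messages(5, 1000, 1000) gives 1000: the whole burst is gone before the subscriber is connected. In the question's publisher, the 1000 ms sleep between rounds is what lets the later rounds arrive while the first one is lost.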

Bonus answer:
ZeroMQ was designed for high-performance messaging and signalling, and its core parts were developed around a few design maxims.
Zero-Copy and Zero-Sharing are the better-known ones, Zero-(almost)-Latency may be a (slightly) provocative one, and Zero-Warranty is perhaps the one you would least like to hear about.
Yes, ZeroMQ does not strive to provide any explicit delivery warranty (naturally, for many reasons common in the world of distributed systems), yet it does give you one warranty of this kind: any message is either delivered atomically (i.e. complete and error-free) or not at all. So you never pay the extra costs of detecting and discarding runts or broken message payloads.
So you may stop worrying about undelivered packets and about what would have happened had they been delivered. You simply get as much as possible, and the rest is not under your influence. The "late-joiner" case can be considered a boundary case: even if you were in a position to grant the slow joiner(s) more time, that would not change the code design, so rather design distributed systems to be robust against signals / messages that may, in principle, never be delivered.
API? Wrapper...
If you are interested in this level of detail, I would recommend reading the native API documentation (from some v2.x onwards), so as to better appreciate all the thought put into the drive for maximum performance: the Zero-Copy-motivated set of message-preparation steps, advanced API calls for messages that get re-sent, memory-leak prevention, and advanced I/O-thread-pool maps for increased I/O throughput, reduced latency, relative prioritisation, et al.
After this, one may review how well (or how poorly) any given non-native language binding (wrapper) reflects these initial design efforts in its own programming environment.
Most such efforts ran into trouble right at finding a reasonable balance between programmer comfort, the expressivity constraints of the target programming environment, and minimising the sins of leaking memory or compromising the quality of the binding.
It is fair to note that designing a non-native language binding is one of the most challenging tasks. One ought to bear with the brave teams who decided to step into this territory and sometimes failed to mirror all the native API's strengths without degrading performance or the clarity of the original intent. Needless to add, some native-API features may not be accessible at all from environments that cannot integrate them seamlessly, so care is to be taken when evaluating a binding / wrapper, and the original native API will always help to get to the roots of ZeroMQ's original powers. Anyway, in most corner cases, one may try inlining native-API calls in the critical sections.
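On the verbosity raised in the bonus question: the std::string conversions repeated in the question's code can be folded into two small helpers. This is a minimal sketch, not part of cppzmq; the helper names are hypothetical, and they work on a raw pointer and length, which is what the question's code already obtains from data() and size() on a zmq::message_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Build a std::string from a raw frame buffer,
// e.g. frame_to_string(msg.data(), msg.size()).
std::string frame_to_string(const void* data, std::size_t size) {
    return std::string(static_cast<const char*>(data), size);
}

// Copy a string's bytes into a pre-sized frame buffer.
// Copies at most `capacity` bytes and returns the number copied,
// so an undersized buffer is truncated rather than overrun.
std::size_t string_to_frame(const std::string& text,
                            void* dest, std::size_t capacity) {
    const std::size_t n = text.size() < capacity ? text.size() : capacity;
    if (n > 0) {
        assert(dest != nullptr);
        std::memcpy(dest, text.data(), n);
    }
    return n;
}
```

In the question's publisher, the pair "zmq::message_t msg1(msg1_str.size()); memcpy(...)" would become a message_t construction followed by one string_to_frame call, and each receive-side std::string(static_cast<char*>(...), ...) expression would become one frame_to_string call.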

Related

ZeroMq: Too many open files.. Number of fd usage growing continuously on the same object

I have a class that holds 2 ZeroMQ subscriber sockets and 1 ZeroMQ request socket, and I create objects of this class in different threads. The sockets use the inproc transport and all belong to the same ZContext.
Each time I create an object, the number of open files (lsof | wc -l) on the server (running CentOS 7) increases. Creating the first object raises the count by 300, the second one raises it by 304, and it keeps growing.
As my program can use many of these objects at runtime, this can result in a "too many open files" error from ZeroMQ even though I set the limit to 524288 (ulimit -n). As the number of objects gets higher, each new object consumes more of the open-file limit, some of them around 1500.
At runtime my program crashes with the "too many open files" error when many objects have been created and the threads are doing their work on them (sending messages to another server or to clients).
How can I overcome this?
example code:
void Agent::run(void *ctx) {
    zmq::context_t *_context = (zmq::context_t *) ctx;
    zmq::socket_t dataSocket(*(_context),ZMQ_SUB);
    zmq::socket_t orderRequestSocket(*(_context),ZMQ_REQ);//REQ
    std::string bbpFilter = "obprice.1";
    std::string bapFilter = "obprice.2";
    std::string orderFilter = "order";
    dataSocket.connect("inproc://ordertrade_publisher");
    dataSocket.connect("inproc://orderbook_prices_pub");
    orderRequestSocket.connect("inproc://frontend_oman_agent");
    int rc;
    try {
        zmq::message_t filterMessage;
        zmq::message_t orderMessage;
        rc = dataSocket.recv(&filterMessage);
        dataSocket.recv(&orderMessage);
        //CALCULATION AND SEND ORDER
        // end:
        return;
    }
    catch(std::exception& e) {
        std::cerr<< "Exception:" << e.what() << std::endl;
        Order.cancel_order(orderRequestSocket);
        return;
    }
}
I'm running into this as well. I'm not sure I have a solution, but I see that a context (zmq::context_t) has a maximum number of sockets. See zmq_ctx_set for more detail. This limit defaults to ZMQ_MAX_SOCKETS_DFLT which appears to be 1024.
You might just need to increase the number of sockets your context can have, although I suspect there might be some leaking going on (at least in my case).
UPDATE:
I was able to fix my leak through a combination of socket options:
ZMQ_RCVTIMEO - I was already using this to avoid waiting forever if the other end wasn't there. My system handles this by only making one request on a socket, then closing it.
ZMQ_LINGER - set to 0 so the socket doesn't wait around trying to send the failed message. The default behavior is infinite linger. This is probably the key to your problem.
ZMQ_IMMEDIATE - this option restricts the queueing of messages to only completed connections. Without a queue, there's no need for the socket to linger.
I can't say for sure if I need both linger and immediate, but they both seemed appropriate to my use case; they might help yours. With these options set, my number of open files does not grow infinitely.

How to get ZeroMQ Timestamp?

I'm writing a C++/ZMQ script that has a subscriber getting data from a publisher run by a separate script. I can't edit the publisher code, and I need to get the time that the ZeroMQ subscriber receives a message.
Basically, I have:
void *zmq_subscriber_ = zmq_socket( context, ZMQ_SUB );
zmq_setsockopt( zmq_subscriber_, ZMQ_SUBSCRIBE, NULL, 0 );
while ( ( zmq_msg_recv( &msg, zmq_subscriber_, ZMQ_DONTWAIT ) ) < 0 )
{ usleep( 1000 ); }
I need to know when the subscriber receives the message. Is there a way to get this information from ZeroMQ? Thanks in advance to anyone that can help!
Is there a way to get this information from ZeroMQ ?
Not directly from the ZeroMQ API as it stands ( as of 2018/Q2 ).
Any options?
If a coarse time-domain resolution is fine, just store a timestamp every time your code re-loops the while(){...; <here> } code block. This approach has a blind spot of about the usleep() duration: a latency window within which the precise moment of receipt is undecidable.
If this does not suffice, switch to a non-blocking Poller.poll() and reduce that latency to a level your use case can work with. With an almost-zero-latency .poll() that uses a zero wait duration, and without spending any time in usleep(), the blind spot is minimised.
If in extreme need, refactor the code and introduce a new (private) API extension that reads such detail from the Context() instance's internal state. This would get you closer, if not the closest, to the actual moment a message arrives in the hands of the SUB-side Context()'s internal processing.
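The first option above (take a timestamp in application code as soon as the receive call returns) can be sketched as follows. This is an illustration only: Stamped and recv_stamped are hypothetical names, the callable stands in for whatever receive routine the application uses (for example the zmq_msg_recv loop in the question), and the timestamp is the moment the call returned to user code, not the moment the bytes arrived on the network.

```cpp
#include <chrono>
#include <type_traits>
#include <utility>

// A received value paired with the wall-clock time at which the
// receive call returned to the application.
template <typename T>
struct Stamped {
    T value;
    std::chrono::system_clock::time_point received_at;
};

// Calls `recv` (a stand-in for the application's own receive routine,
// assumed to return the message by value) and reads the clock
// immediately afterwards, before any other processing.
template <typename RecvFn>
auto recv_stamped(RecvFn&& recv)
    -> Stamped<std::decay_t<decltype(recv())>> {
    auto value = recv();
    const auto now = std::chrono::system_clock::now();
    return {std::move(value), now};
}
```

system_clock is used here because the question asks for the time of receipt; for measuring intervals between messages, std::chrono::steady_clock would be the safer choice since it never jumps backwards.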

C++ Boost Serialization: Input Stream Error

Hi fellow C++ developers,
I'm trying to send a C++ class over the network with zmq and boost::serialization.
The concept is to serialize the class PlayCommand on the client, then send it to the server with zmq, and then deserialize it on the server.
This works fine in the rest of the application. For some reason I get input stream errors while deserializing the PlayCommand on the server from time to time. I can not figure out why it is sometimes throwing this exception and sometimes not.
It seems to be a time sensitive problem. Do I have to wait at some point to let boost do its thing ?
std::shared_ptr<PlayCommand> _exe(dynamic_cast<PlayCommand*>(_cmd.get()));
zmq::context_t _ctx(1);
zmq::socket_t _skt(_ctx, ZMQ_PUB);
_skt.connect("tcp://0.0.0.0:" + this->kinect_daemon_com_port);
std::stringstream _type_stream;
std::stringstream _exe_stream;
boost::archive::text_oarchive _type_archive(_type_stream);
boost::archive::text_oarchive _exe_archive(_exe_stream);
_type_archive << _type;
_exe_archive << *_exe.get();
std::string _type_msg_str = _type_stream.str();
std::string _exe_msg_str = _exe_stream.str();
zmq::message_t _type_msg(_type_msg_str.length());
zmq::message_t _exe_msg(_exe_msg_str.length());
memcpy(_type_msg.data(), _type_msg_str.data(), _type_msg_str.length());
memcpy(_exe_msg.data(), _exe_msg_str.data(), _exe_msg_str.length());
_skt.send(_type_msg, ZMQ_SNDMORE);
_skt.send(_exe_msg, 0);
void ZMQMessageResolver::resolve_message(std::shared_ptr<Event> _event, unsigned _unique_thread_id)
{
    std::cout << "ZMQMessageResolver::resolve_message(std::shared_ptr<Event> _event, unsigned _unique_thread_id)" << std::endl;
    std::shared_ptr<ZMQMessageEvent> _zmq_event = std::static_pointer_cast<ZMQMessageEvent>(_event);
    //(static_cast<ZMQMessageEvent*>(_event.get()));
    ZMQMessageType _type;
    PlayCommand _cmd;
    auto _messages = _zmq_event->get_data();
    auto _type_string = std::string(static_cast<char*>(_messages->front()->data()), _messages->front()->size());
    auto _cmd_string = std::string(static_cast<char*>(_messages->back()->data()), _messages->back()->size());
    std::stringstream _type_stream{_type_string};
    std::istringstream _cmd_stream{_cmd_string};
    boost::archive::text_iarchive _type_archive{_type_stream};
    boost::archive::text_iarchive _cmd_archive{_cmd_stream};
    std::cout << "1" << std::endl;
    _type_archive >> _type;
    std::cout << "2" << std::endl;
    _cmd_archive & _cmd;
    std::cout << "3" << std::endl;
    std::shared_ptr<ThreadEvent> _thread_event = std::make_shared<ThreadEvent>(_zmq_event->get_event_message());
    _cmd.execute(_thread_event);
    std::lock_guard<std::mutex> _lock{*this->thread_mutex};
    this->finished_threads.push_back(_unique_thread_id);
}
The complete project is on github: rgbd-calib and rgbd-calib-py.
The important files are /framework/ZMQMessageResolver.cpp in rgbd-calib and /src/KinectDaemon.cpp in rgbd-calib-py.
I would appreciate any help.
First insights
I checked for shared zmq::socket_t instances. I could not find any so thread safety should be a non issue.
I found out that other developers are also experiencing problems with ZMQ multi-part messages. Maybe that could be an issue in my case as well. Maybe someone has experience with those. Do I have to take any safety measures when sending and receiving multi-part messages?
If it's timing sensitive, no doubt it's unrelated to boost: the boost code shown is completely synchronous and local. If you don't always receive the full streams you get this error. Likewise if there's a protocol error interpreting the received data you might get corrupt data.
Both cases would easily lead to "input stream error".
I have no experience with 0MQ so I don't know whether the code as shown could receive incomplete messages, but I'd look into that.
A minor note is that it's rather strange to have one stringstream and the other istringstream. There might be differences in seek behaviour.
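One way to test the "incomplete stream" hypothesis is to check the shape of the received message before building any archive. The resolver in the question reads front() and back() of the frame list, so if only one frame were ever present, both archives would be fed the same bytes. The sketch below is a hypothetical guard, assuming the frames have already been copied into std::string objects; it does not touch ZeroMQ or Boost.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true only if the multipart message has exactly `expected`
// frames and none of them is empty, i.e. the frames are at least
// structurally plausible input for the deserializer.
bool frames_look_complete(const std::vector<std::string>& frames,
                          std::size_t expected) {
    if (frames.size() != expected) {
        return false;  // a frame is missing, or extra frames arrived
    }
    for (const std::string& frame : frames) {
        if (frame.empty()) {
            return false;  // an empty frame cannot hold a text archive
        }
    }
    return true;
}
```

Logging the frame count and sizes whenever this guard fails would show whether the exception correlates with a malformed message or with something else entirely.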
Let me add a few cents on the ZeroMQ part of the story:
Fact #1: ZeroMQ never delivers a piece of trash. It delivers either a complete message ( as it was sent ) or nothing at all.
This principal design feature helps to rule out one of the suspected issues.
If an application does receive a ZeroMQ message, one can be sure it is a sound copy of what was dispatched from the remote process. It is simply the same. Full stop.
Fact #2: ZeroMQ architects and evangelists have warned in every chapter, since the very beginning of the ZeroMQ API v2.xx, never to share,
and from the code shown above it is unclear whether that rule is followed.
If one instantiates a ZeroMQ socket AccessPoint ( of a SUB type in the above context ), the thread owning that AccessPoint is the only thread that may manipulate it, and it should never let any other thread touch this toy. Never. While there have been recent talks and efforts to re-design the ZeroMQ core so as to add thread safety after the fact, I remain skeptical of these moves. Non-blocking, high-performance, low-latency designs in distributed computing should never share a common piece, precisely because of the overhead costs and the lost principal safety ( which is not easily bought back afterwards by any inter-thread signalling / locking / blocking ).
You may review the code so as to confirm or rule out any sharing of ZeroMQ instances ( a Context being another, separate subject here ), and wherever shared pieces are detected, your team ought to re-design the code so as to avoid it.
Yes. Avoid sharing, and the ZeroMQ tools will serve you extremely well.

0MQ telnet data C++

I'm trying to send telnet commands with 0MQ from C++ on VS2013.
I used the HW (Hello World) client sample code from the ZMQ homepage.
But what I see in Wireshark is a telnet packet with no data inside.
This code is prototype, what I need is just to be able to send this command.
After making it work, it will get some cleaning.
//
// Hello World client in C++
// Connects REQ socket to tcp://localhost:5555
// Sends "Hello" to server, expects "World" back
//
#include <zmq.hpp>
#include <zmq.h>
#include <cstring>   // std::memcpy
#include <string>
#include <iostream>
int main()
{
    // Prepare our context and socket
    zmq::context_t context(1);
    zmq::socket_t socket(context, ZMQ_REQ);
    std::cout << "Connecting to hello world server…" << std::endl;
    socket.connect("tcp://10.40.6.226:23");
    // Do 10 requests, waiting each time for a response
    for (int request_nbr = 0; request_nbr != 1; request_nbr++) {
        zmq::message_t request(5);   // sized for the 5 bytes of "Hello" (was 2, which overran the buffer)
        std::memcpy(request.data(), "Hello", 5);
        std::cout << "Sending Hello " << request_nbr << "…" << std::endl;
        socket.send(request);
        //client_socket
        // Get the reply.
        /*zmq::message_t reply;
        socket.recv(&reply);
        std::cout << "Received World " << request_nbr << std::endl;*/
    }
    return 0;
}
So everything looks good, except that I cannot see the string "Hello" in the telnet packet.
Original sample http://zguide.zeromq.org/cpp:hwclient
Yes, one can send telnet commands over ZeroMQ
There is no principal obstacle in doing this. Once you correctly setup the end-to-end relation over ZeroMQ, your telnet-commands may smoothly flow across the link, meeting all the required underlying protocol-specific handshaking and event-handling.
Why it does not work here?
The strongest reason behind the observed scenario is that you have missed the essence of ZeroMQ's Formal Communication Patterns framework.
ZeroMQ sockets are not "plain" sockets, as the re-use of the word socket might suggest. There would be close to no benefit if ZeroMQ just mimicked the dumb socket already available from the operating system. The greatest intellectual value one gets from ZeroMQ rests on exactly the opposite approach. Thanks to the several thousand man-years of experience that went into the birth of AMQP, ZeroMQ and their successors, there are smart features built into the framework that we are happy to re-use in our application domains, rather than trying to re-invent the wheel.
The best next step?
Supposing one's interest in smart messaging is not lost, the best next step IMHO is to read the great book "Code Connected, Vol. 1" by Pieter Hintjens, a co-father of ZeroMQ: https://stackoverflow.com/a/25742744/3666197
+ a minor note, why the code does not move any data over a wire
A good design practice built into the ZeroMQ architecture separates the transport per se from the connection state of a socket archetype. That said, one may pump data into the local end of a socket archetype ( your code .send()-s in a for loop ) while the remote end need not be online throughout that whole episode ( or at all ). This means the PHY layer ( the wire ) will see and transport any data if and only if both end-points of the Formal Communication Pattern agree to do so.
In the REQ/REP scenario that means
{REQ|REP}.bind() <-online-visibility-episode-> {REP|REQ}.connect() state
REQ.send()-> REP.recv()
REP.send()-> ( REQ.recv())
REQ.send()->
keeping the nature of the Merry-Go-Round policy of the REQ/REP Formal Communication Pattern "forward-stepping".
In the posted for(){...} code block this means that, if step 1. is met, you may detect on the wire just the first and only message from REQ to REP, because you do not perform the mandatory steps 2. & 3. to .recv() a response from REP before the REQ behavioural model allows any next request to be sent ( which is the core nature of the REQ/REP pattern, isn't it? ).
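The strict alternation described above can be modelled as a two-state machine. The class below is a toy model with hypothetical names, not ZeroMQ code: it only mirrors the rule that a REQ socket must .recv() a reply before it may .send() again. In libzmq itself, breaking that order makes the call fail with the EFSM error rather than throw this exception.

```cpp
#include <stdexcept>

// Toy model of the REQ side of the REQ/REP lock-step:
// send() and recv() must strictly alternate, starting with send().
class ReqStateModel {
public:
    // True when the next legal operation is send().
    bool can_send() const { return !awaiting_reply_; }

    void send() {
        if (awaiting_reply_) {
            throw std::logic_error("REQ: recv() the reply before the next send()");
        }
        awaiting_reply_ = true;   // now only recv() is legal
    }

    void recv() {
        if (!awaiting_reply_) {
            throw std::logic_error("REQ: nothing was sent, so no reply is due");
        }
        awaiting_reply_ = false;  // back to the initial state
    }

private:
    bool awaiting_reply_ = false;
};
```

In the posted loop the reply-handling code is commented out, so a second iteration would call send() while the model is still awaiting a reply.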
Once your ZeroMQ insight gets further, you will also get used to checking the errors returned by the respective function calls.
A .connect() attempt directed ( fortunately over port 23 ) into the hands of a telnet daemon will be visible at the wire level. At the protocol level, however, the handshake can hardly succeed: the telnet daemon is waiting for nothing but a telnet-protocol session-setup dialogue, so a correctly formed ZeroMQ wire-level protocol message ( which will for sure surprise a wire-level sniffer that assumes telnet ) will not make it happy, and in the described scenario the setup simply must fail.

ZeroMQ - pub / sub latency

I'm looking into ZeroMQ to see if it's a fit for a soft-realtime application. I was very pleased to see that the latency for small payloads was in the range of 30 microseconds or so. However, in my simple tests I'm getting about 300 microseconds.
I have a simple publisher and subscriber, basically copied from examples off the web and I'm sending one byte through localhost.
I've played around for about two days w/ different sockopts and I'm striking out.
Any help would be appreciated!
publisher:
#include <iostream>
#include <zmq.hpp>
#include <unistd.h>
#include <sys/time.h>
int main()
{
    zmq::context_t context (1);
    zmq::socket_t publisher (context, ZMQ_PUB);
    publisher.bind("tcp://*:5556");
    struct timeval timeofday;
    zmq::message_t msg(1);
    while(true)
    {
        gettimeofday(&timeofday,NULL);
        publisher.send(msg);
        std::cout << timeofday.tv_sec << ", " << timeofday.tv_usec << std::endl;
        usleep(1000000);
    }
}
subscriber:
#include <iostream>
#include <zmq.hpp>
#include <sys/time.h>
int main()
{
    zmq::context_t context (1);
    zmq::socket_t subscriber (context, ZMQ_SUB);
    subscriber.connect("tcp://localhost:5556");
    subscriber.setsockopt(ZMQ_SUBSCRIBE, "", 0);
    struct timeval timeofday;
    zmq::message_t update;
    while(true)
    {
        subscriber.recv(&update);
        gettimeofday(&timeofday,NULL);
        std::cout << timeofday.tv_sec << ", " << timeofday.tv_usec << std::endl;
    }
}
Is the Task Definition real?
When speaking about any kind of real-time design, validating the capability of the architecture is more important than the implementation that follows.
Taking your source code as-is, your readings ( which would ideally be posted together with your code snippets, so that the MCVE can be re-tested and cross-validated ) will not tell much, as the numbers do not distinguish how much time was spent in the sending-side loop, in the sending-side zmq data acquisition / copy / scheduling / wire-level formatting / datagram dispatch, and in the receiving-side unloading from the media / copy / decode / pattern-match / propagation to the receiver buffer(s).
If interested in ZeroMQ internals, there are good performance-related application notes available.
If striving for a minimum-latency design:
remove all overheads
take all tcp-header processing out of the proposed PUB/SUB channel
avoid all non-essential logic overheads in processing: there is no sense in spending time on subscriber-side pattern matching encoded in the selected archetype ( sure, newer versions of ZMQ have moved to publisher-side filtering, but the idea is clear ), and using ZMQ_PAIR avoids it altogether, independently of the transport class. If something is intended to block, then rather change the signalling socket layout so as to avoid blocking in principle ( this ought to be a real-time system, as you have said above )
apply "latency masking" where possible on the target multi-core / many-core hardware architectures, so as to squeeze the last drops of spare time from your hardware / tools; benchmark experimental setups that use more I/O threads, zmq::context_t context( N );, where N > 1
Missing target:
As Alice in Wonderland put it more than a century ago, when no goal is defined, any road leads to the target.
With a soft-real-time ambition, it should not be an issue to state a maximum allowed end-to-end latency and to derive from it a constraint for the transport-layer latency.
Without that, 30 us, 300 us or even 3 ms have no meaning per se, so no one can decide whether these figures are "enough" for some subsystem or not.
A reasonable next step:
define real-time stability horizon(s) ... if using for a real-time control
define real-time design constraints ... for signal / data acquisition(s), for processing task(s), for self-diagnostic & control services
avoid any blocking by design, and validate / prove that no blocking will ever appear under any possible real-world operating circumstances [formal proof methods are ready for such a task] ( no one would like to see an AlertPanel [Waiting for data] during their next jet landing, or have a lovely animated [hour-glass] icon moving its sand be the last thing they see before an autonomous car crashes right into a wall because the control system got busy, for whatever reason, in a devastatingly blocking manner ).
Quantified targets make sense for testing.
If a given threshold permits a 500 ms stability horizon (which may be a safe value for a slo-mo hydraulic actuator / control loop, but may fail for a guided-missile control system, and all the more so for any system without [mass & momentum of inertia], such as the DSP family of RT control systems), you can test end-to-end whether your processing fits within it.
If you know your incoming data stream brings about 10 kB every 500 us, you can test whether your design can keep pace with the burst traffic or not.
If your tests show that the mock-up design misses the target (not meeting the performance / time-constrained figures), you know pretty well where the design or the architecture needs to be improved.
First make sure you run producer and consumer on different physical cores (not HT).
Second, it depends A LOT on the hardware and OS. Last time I measured kernel IO (4-5 years ago) the results were indeed 10 to 20us around send/recv system calls.
You have to optimize your kernel settings to low latency and set TCP_NODELAY.
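To turn the printed readings into numbers that can be compared against a target, the publisher and subscriber in the question each print a (tv_sec, tv_usec) pair, and the one-way latency of a message is the difference between the matching pairs. A minimal sketch of that arithmetic follows; the function name is hypothetical, and it assumes both processes read the same clock, which holds for the localhost test described in the question.

```cpp
#include <cstdint>

// Elapsed microseconds between two gettimeofday()-style readings.
// Each reading is a (seconds, microseconds) pair as printed by the
// publisher (taken before send) and the subscriber (taken after recv).
std::int64_t elapsed_us(std::int64_t send_sec, std::int64_t send_usec,
                        std::int64_t recv_sec, std::int64_t recv_usec) {
    // Work in a single unit so that a microsecond field that wrapped
    // past a second boundary is handled by ordinary subtraction.
    return (recv_sec - send_sec) * 1000000 + (recv_usec - send_usec);
}
```

For example, a message stamped (10, 999900) by the publisher and (11, 200) by the subscriber took elapsed_us(10, 999900, 11, 200), which is 300 microseconds. Collecting many such samples, rather than one, is what makes a comparison against a stated latency budget meaningful.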
