How to "publish" a large number of actors in CAF? - c++

I've just learned about CAF, the C++ Actor Framework.
The one thing that surprised me is that the way to make an actor available over the network is to "publish" it to a specific TCP port.
This effectively limits the number of actors you can publish to the number of ports you have (64k). Since you need one port to publish an actor and one port to access a remote actor, I assume that two processes could share at best about 32k actors each, even though each of them could probably hold a million actors on a commodity server. This would be even worse if the cluster had, say, 10 nodes.
To make publishing scalable, each process should only need to open one port for all actors in its actor system, and one connection to each actor system it wants to access.
Is there a way to publish one actor as a proxy for all actors in an actor system (preferably without any significant performance loss)?

Let me add some background. The middleman::publish/middleman::remote_actor function pair does two things: connecting two CAF instances and giving you a handle for communicating with a remote actor. The actor you "publish" to a given port is meant to act as an entry point. This is a convenient rendezvous point, nothing more.
All you need to communicate between two actors is a handle. Of course you need to somehow learn new handles if you want to talk to more actors. The remote_actor function is simply a convenient way to implement a rendezvous between two actors. However, after you learn the handle you can freely pass it around in your distributed system. Actor handles are network transparent.
Also, CAF will always maintain a single TCP connection between two actor systems. If you publish 10 actors on host A and "connect" to all 10 actors from host B via remote_actor, you'll see that CAF initially opens 10 connections (because the target node could run multiple actor systems), but all but one of them will get closed.
If you don't care about the rendezvous for actors offered by publish/remote_actor, then you can also use middleman::open and middleman::connect instead. This will only connect two CAF instances without exchanging actor handles. Instead, connect will return a node_id on success. This is all you need for some features, for example remote spawning of actors.
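A minimal sketch of both variants, assuming a recent CAF release (the 0.15-style API); the host name and port are placeholders:

    #include <iostream>
    #include <string>
    #include "caf/all.hpp"
    #include "caf/io/all.hpp"

    using namespace caf;

    behavior entry_point(event_based_actor* self) {
      return {
        [=](const std::string& what) { aout(self) << what << std::endl; }
      };
    }

    void caf_main(actor_system& sys) {
      // Variant 1: rendezvous -- publish one entry actor, peers look it up.
      auto entry = sys.spawn(entry_point);
      auto port = sys.middleman().publish(entry, 4242);  // expected<uint16_t>
      if (!port)
        std::cerr << "publish failed: " << sys.render(port.error()) << std::endl;
      // On the peer instance:
      //   auto handle = sys.middleman().remote_actor("hostA", 4242);
      // Variant 2: connect the instances without exchanging any actor handle.
      //   auto node = sys.middleman().connect("hostA", 4242); // expected<node_id>
    }

    CAF_MAIN(io::middleman)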
Is there a way to publish one actor as a proxy for all actors in an actor system (preferably without any significant performance loss)?
You can publish one actor at a port whose sole purpose is to model a rendezvous point. If that actor then sends 1000 more actor handles to a remote actor, this does not cause any additional network connections.
Writing a custom actor that explicitly models the rendezvous between multiple systems by offering some sort of dictionary is the recommended way.
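Such a dictionary actor could look like the following sketch, assuming CAF's dynamically typed API (put_atom/get_atom are CAF's stock atoms; everything else is illustrative):

    #include <map>
    #include <string>
    #include "caf/all.hpp"

    using namespace caf;

    struct dict_state {
      std::map<std::string, actor> entries;
    };

    // Rendezvous actor: peers register handles under names and look them up
    // later; only this single actor ever needs to be published.
    behavior dictionary(stateful_actor<dict_state>* self) {
      return {
        [=](put_atom, const std::string& name, const actor& who) {
          self->state.entries[name] = who;
        },
        [=](get_atom, const std::string& name) {
          auto i = self->state.entries.find(name);
          // an invalid (default-constructed) handle signals "unknown name"
          return i != self->state.entries.end() ? i->second : actor{};
        }
      };
    }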
Just for the sake of completeness: CAF also has a registry mechanism. However, keys are limited to atom values, i.e., 10 characters or less. Since the registry is generic, it also only stores strong_actor_ptr and leaves type safety to you. However, if that's all you need: you put handles into the registry (see actor_system::registry) and then access this registry remotely via middleman::remote_lookup (you only need a node_id to do this).
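In code, that mechanism boils down to something like this sketch (sys, time_server and nid stand in for your own instances; the atom key is arbitrary, as long as it fits in 10 characters):

    #include "caf/all.hpp"
    #include "caf/io/all.hpp"

    using namespace caf;

    void register_and_lookup(actor_system& sys, actor time_server, node_id nid) {
      // Local side: store the handle under a short atom key.
      sys.registry().put(atom("timesvc"), actor_cast<strong_actor_ptr>(time_server));
      // Remote side: after middleman().connect(...) returned `nid`.
      auto ptr = sys.middleman().remote_lookup(atom("timesvc"), nid);
      auto handle = actor_cast<actor>(ptr); // type safety is up to you
    }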

Smooth scaling with (almost) no limits is the alpha & omega
One way, used in agent-based systems (I am not sure whether CAF has implemented tools for going this way), is to use multiple transport classes { inproc:// | ipc:// | tcp:// | .. | vmci:// } and thus be able to pick among them on an as-needed basis.
While building a proxy may sound attractive, welding together two different actor models, one "atop" the other, is not as simple to achieve as it sounds (event loops are fragile to tune, to keep from blocking and to keep handling events fairly; they do not like any other master trying to take their own Hat...).
In case CAF provides at the moment no transport means other than TCP:
one may still resort to O/S-level steps and measures and harness the features of the ISO/OSI model up to the limits, or as necessary:
    sudo ip address add 172.16.100.17/24 dev eth0
or better, make the additional IP addresses permanent, i.e. edit the file /etc/network/interfaces (on Ubuntu) and add as many stanzas as needed, so that it looks like:
    iface eth0 inet static
        address 172.16.100.17/24
    iface eth0 inet static
        address 172.16.24.11/24
This way the configuration space can be extended for cases where CAF provides no other means for such actors than the said TCP (address:port#) transport class.

Related

How do I use ZeroMQ to listen to and parse UDP data on a specific port?

I am trying to build a C++ application that must use ZeroMQ to listen to encoded packets being forwarded to port 8080 via UDP on my machine at a rate of 10 [Hz].
How do I set up a zmq socket/server/etc. such that I can receive and decode the incoming data?
I am on a Linux machine, running Ubuntu 16.04.
UPDATE + ANSWER:
ZMQ does not listen to generic UDP packets, as #tadman stated. Therefore, since I was unable to modify the system that was sending the packets, this was not an appropriate use for ZMQ. I ended up using a generic UDP endpoint, as #tadman recommended.
How do I use ZeroMQ to listen to and parse UDP data on a specific port?
Greetings to Dearborn/UoM, let's first demystify the problem, ok?
ZeroMQ is not a self-isolating tool; it can and does talk or listen to non-ZeroMQ sockets too.
#tadman was right and wrong at the same time.
ZeroMQ doesn't listen to UDP packets. // == True (as known in 2018-Q2, API ~ 4.2.2)
It listens to ZeroMQ packets. // == False
Since the ZeroMQ native API ~ 4.+, ZeroMQ can both listen and talk to non-ZeroMQ sockets, i.e. your wish may lead to a ZeroMQ Context()-engine working with a plain socket.
If new to the ZeroMQ distributed-system design eco-system, you may first like a brief disambiguation read in the [ ZeroMQ hierarchy in less than a five seconds ] section, so as to better touch the roots of the problem to solve.
ZeroMQ has a udp:// <transport-class>, usable for the { ZMQ_RADIO | ZMQ_DISH } archetypes only
While ZeroMQ has the udp:// transport class ready to use for both unicast and multicast AccessPoint addresses, it is not yet possible to make the Context() instantiate such a data pump for non-ZeroMQ, plain-socket peers.
ZeroMQ can talk to non-ZeroMQ peers, yet only over a tcp:// <transport-class>
Non-ZeroMQ peers can get connected using a plain socket, redressed (for many architecture / API design reasons) inside the ZeroMQ implementation into a ZeroMQ-compliant Scalable Formal Communication Archetype named ZMQ_STREAM. This is cool and permits homogeneous strategies to handle these types of communicating peers as well, yet it requires the tcp:// transport class.
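A minimal ZMQ_STREAM sketch against the libzmq C API (~ 4.2); the port is just an example:

    #include <zmq.h>
    #include <cstdio>

    // Accepts plain-TCP (non-ZeroMQ) peers. ZMQ_STREAM delivers two frames
    // per event: the peer identity first, then the raw payload bytes.
    int main() {
        void* ctx = zmq_ctx_new();
        void* srv = zmq_socket(ctx, ZMQ_STREAM);
        zmq_bind(srv, "tcp://*:8080");
        unsigned char id[256];
        char buf[4096];
        for (;;) {
            zmq_recv(srv, id, sizeof id, 0);            // identity frame
            int n = zmq_recv(srv, buf, sizeof buf, 0);  // payload frame
            if (n > 0)                                  // 0 = connect/disconnect
                std::printf("got %d raw bytes from a plain-TCP peer\n", n);
        }
    }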
How to ?
Given the source of the dataflow is under your control, try to make it use the ZeroMQ eco-system, after which it can be comfortably served like any other ZeroMQ udp://-cross-connected AccessPoint.
If design or "political" constraints prevent you from doing so, the receiving side cannot be ZeroMQ directly, so decide on building an application-specific protocol gateway, mediating the non-ZeroMQ UDP traffic into any form of ZeroMQ "consumable": be it a ZMQ_STREAM over plain tcp:// (a functionally minimalistic proxy design), or equip the proxy straight away with any other, smarter ZeroMQ archetype, to communicate with your main data collector / processor at a much higher level of comfort.
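Such a gateway can be surprisingly small. A sketch under stated assumptions (POSIX sockets plus libzmq; "tcp://collector:5555" and the PUSH archetype are placeholders for whatever your collector prefers):

    #include <zmq.h>
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>

    // Plain-UDP ingress on port 8080, re-published into the ZeroMQ world.
    int main() {
        int udp = socket(AF_INET, SOCK_DGRAM, 0);
        sockaddr_in addr{};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);
        bind(udp, reinterpret_cast<sockaddr*>(&addr), sizeof addr);

        void* ctx = zmq_ctx_new();
        void* out = zmq_socket(ctx, ZMQ_PUSH);
        zmq_connect(out, "tcp://collector:5555");   // placeholder endpoint

        char buf[2048];
        for (;;) {
            ssize_t n = recvfrom(udp, buf, sizeof buf, 0, nullptr, nullptr);
            if (n > 0)
                zmq_send(out, buf, static_cast<size_t>(n), 0);
        }
    }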
If audio is the intended payload and accumulating latency is a concern, best also read more details on how easily the main engine can get performance-tuned: scaling up the number of I/O threads and wisely mapping the ZMQ_AFFINITY and ZMQ_PRIORITY settings can all influence the target latency + throughput performance envelopes.
Last, but not least, the 10 [Hz] requirement
This one is indeed a nice part that will test one's insight into asynchronous process coordination. The ZeroMQ main engine (the Context() instance(s)) works in an asynchronous and uncoordinated manner.
This means there is no direct way to avoid accumulated latency or to inspect any of the broker-less, per-peer managed, async-by-design message-queue buffers, so as to "travel back in time" upon a hard real-time 10 [Hz] probing.
If this is going to work under a weak / "soft" (not strict R/T) flow-of-time coordination (having no control-system stability constraints, critical-system, life-supporting or similar responsibility, as hard R/T system designs do have), and can thus tolerate a certain amount of code-execution-related jitter in the RTT- / [ transport + (re-)processing ]-latencies, then smartly designed .poll()-based non-blocking inspections, and possibly some fast queue pre-emptying policies, may help you reach acceptably fast, soft-RT behaviour and make the 10 [Hz] monitor robust enough.
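One such queue pre-emptying policy, as a hypothetical sketch: every 100 ms, drain whatever has queued up and keep only the freshest message, so accumulated latency cannot build up (sub is assumed to be an already connected and subscribed libzmq socket):

    #include <zmq.h>
    #include <chrono>
    #include <thread>

    void monitor_10Hz(void* sub) {
        auto next = std::chrono::steady_clock::now();
        char buf[2048];
        for (;;) {
            next += std::chrono::milliseconds(100);  // the 10 [Hz] cadence
            int n, freshest = -1;
            while ((n = zmq_recv(sub, buf, sizeof buf, ZMQ_DONTWAIT)) >= 0)
                freshest = n;                        // drop the stale backlog
            if (freshest >= 0) {
                // ...decode the freshest sample in buf[0..freshest) here...
            }
            std::this_thread::sleep_until(next);
        }
    }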
So, indeed cool days with ZeroMQ ahead of you. Good luck, Sir. If not already overdue, with the Project's deadline coming on Monday, best take a read of Pieter HINTJENS' fabulous book "Code Connected, Volume 1", where most gems of the Zen-of-Zero are well discussed and inspected for distributed-system designs.

How to operate multiple ZeroMQ socket types in the same process?

I am looking to use ZeroMQ to facilitate IPC in my embedded-systems application; however, I'm not able to find many examples of using multiple 0MQ socket types in the same process.
For example, say I have a process called "antenna_mon" that monitors an antenna. I want to be able to send messages to this process and get responses back - a classic REQ-REP pattern. However, I also have a "cm" process, that publishes configuration changes to subscribers. I want antenna_mon to also subscribe to antenna configuration changes - PUB-SUB.
I found this example of reading from multiple sockets in the same process, but it seems suboptimal, because now you no longer block waiting for messages; you inefficiently check for messages constantly and go back to sleep.
Has anyone encountered this problem before? Am I just thinking about it wrong? Maybe I should have two threads - one for CM changes, one for REQ-REP servicing?
I would love any insights or examples of solving this type of problem.
Welcome to the very nature of distributed computing!
Yes, there are new perspectives one has to solve once assembling a project for a multi-agent domain, where more than one process works and communicates with its respective peers ad-hoc.
A knowledge base acquired from soft real-time or embedded-systems design experience will help a lot here. If none is available, some similarities might also be drawn from GUI design, where the centerpiece is something like a lightweight .mainloop() scheduler, most of the hard work is embedded into round-robin-polled GUI devices, and internal state changes or external MMI events are marshalled into event-triggered handlers.
The ZeroMQ infrastructure gives one all the tools needed for such a non-blocking, controllably pollable (scalable, with variable or adaptively ad-hoc adjustable poll timeouts, so as not to overrun the given, design-defined round-trip duration of the controller's .mainloop()), transport-agnostic, asynchronously operated message dispatcher (with thread-mapped performance scaling & priority tuning).
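For the concrete antenna_mon case, a single thread can service both archetypes at once; zmq_poll() sleeps until either socket has work, so there is no busy spinning. A minimal sketch, assuming rep and sub are already bound/connected libzmq sockets:

    #include <zmq.h>

    void antenna_mon_loop(void* rep, void* sub) {
        zmq_pollitem_t items[] = {
            { rep, 0, ZMQ_POLLIN, 0 },   // REQ/REP servicing
            { sub, 0, ZMQ_POLLIN, 0 },   // configuration updates
        };
        char buf[1024];
        for (;;) {
            zmq_poll(items, 2, -1);              // block: no timeout, no spin
            if (items[0].revents & ZMQ_POLLIN) {
                zmq_recv(rep, buf, sizeof buf, 0);
                zmq_send(rep, "ack", 3, 0);      // REP must reply in turn
            }
            if (items[1].revents & ZMQ_POLLIN) {
                int n = zmq_recv(sub, buf, sizeof buf, 0);
                // ...apply the configuration change in buf[0..n)...
            }
        }
    }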
What else one may need?
Well, just imagination and a lot of self-discipline to adhere to the Zero-Copy, Zero-Sharing and Zero-Blocking design maxims.
The rest is in your hands.
Many "academic" examples may seem trivial and simplified, so as to illustrate just the feature currently being discussed or demonstrated, in some narrow perspective.
Not so in the real-life situations.
As an example, my distributed ML engine uses a tandem of several PUSH/PULL pipelines for moving state-data updates and prediction forecasts, another PUSH/PULL for a remote keyboard, a reversed .bind()/.connect() on PUB/SUB for easy broadcasting of distributed agents' telemetry to a remote, centrally operated syslog, and some additional PAIR/PAIR pipes, as processing requires.
( Nota bene: one shall always bear in mind that robust and error-resilient systems ought to avoid using the default REQ/REP Scalable Formal Communication Pattern, as there is a non-zero probability of the pairwise-stepped REQ/REP dual-FSA falling into an unsalvageable mutual deadlock. Do not hesitate to read more about this smart tool. )

Simulate network conditions with a C/C++ Socket

I'm looking for a way to add network emulation to a socket.
The basic solution would be some way to add bandwidth limitation to a connection.
The ideal solution for me would:
Support advanced network properties (latency, packet-loss)
Be open-source
Have a similar API as standard sockets (or wraps around them)
Work on both Windows and Linux
Support IPv4 and IPv6
I saw a few options that work at the system level, or even as a proxy (Dummynet, WANem, neten, etc.), but those won't work for me, because I want to be able to emulate each socket individually (for example, open one socket with modem emulation and one with 3G emulation). Basically I want to know how these tools do it.
EDIT: I need to embed this functionality in my own product, therefore using an extra box or a third-party tool that needs manual configuration is not acceptable. I want to write code that does the same thing as those tools do, and my question is how to do it.
Epilogue: In hindsight, my question was a bit misleading. Apparently, there is no way to do what I wanted directly on the socket. There are two options:
Add delays to send/receive operations (based on #PaulCoccoli's answer):
by adding a delay before sending and receiving, you can get a very crude network simulation (a constant delay for latency; delaying sends, so as not to send more than X bytes per second, for bandwidth).
Paul's answer and comment were great inspiration for me, so I award him the bounty.
Add the network simulation logic as a proxy (based on #m0she's and others' answers):
either send the request through the proxy, or use the proxy to intercept the requests, then add the desired simulation. However, it makes more sense to use a ready-made solution instead of writing your own proxy implementation. From what I've seen, Dummynet is probably the best choice (it is what webpagetest.org uses). Other options are in the answers below; I'll also add DonsProxy.
This is the better way to do it, so I'm accepting this answer.
You can compile a proxy into your software that would do that.
It can be an implementation of a full-fledged SOCKS proxy (like this) or, probably better, something simpler that only serves your purpose (and doesn't require prefixing your communication with the destination and other SOCKS overhead).
That code could run as a separate process or a thread within your process.
Adding throttling to a proxy shouldn't be too hard. You can (see the sketch after this list):
delay forwarding of data if it passes some bandwidth limit
add latency by adding a timer before read/write operations on buffers.
If you're working with a connection-based protocol (like TCP), dropping packets would be senseless, but with a datagram-based protocol (UDP) it would be simple to implement as well.
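A crude per-connection shaper covering both list items, as a hedged sketch (the rate and latency numbers would come from your modem/3G profiles):

    #include <chrono>
    #include <cstddef>
    #include <thread>

    // Fixed latency sleep plus a byte budget that refills over time
    // (a token bucket), called by the proxy before each forward.
    class Shaper {
        using clk = std::chrono::steady_clock;
        double rate_;                        // allowed bytes per second
        std::chrono::milliseconds latency_;
        double budget_ = 0.0;                // bytes we may send right now
        clk::time_point last_ = clk::now();
    public:
        Shaper(double bytes_per_sec, std::chrono::milliseconds latency)
            : rate_(bytes_per_sec), latency_(latency) {}

        // Blocks until forwarding n bytes fits within the budget.
        void before_forward(std::size_t n) {
            std::this_thread::sleep_for(latency_);    // constant added delay
            for (;;) {
                auto now = clk::now();
                budget_ += rate_ * std::chrono::duration<double>(now - last_).count();
                if (budget_ > rate_) budget_ = rate_; // cap bursts at ~1 s worth
                last_ = now;
                if (budget_ >= static_cast<double>(n)) { budget_ -= n; return; }
                std::this_thread::sleep_for(std::chrono::milliseconds(5));
            }
        }
    };

Each proxied connection would own its own Shaper, e.g. Shaper(56000 / 8, std::chrono::milliseconds(300)) for a rough modem profile.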
The connection-creation API would be a bit different from normal POSIX/Winsock (unless you do some macro or other magic), but everything else (send/recv/select/close/etc.) is the same.
If you're building this into your product, then you should implement a layer of abstraction over the sockets API so you can select your own implementation at run time. Alternatively, you can implement wrappers of each socket function and select whether to call your own version or the system's version.
As for adding latency, you could have your implementation of the sockets API spin off a thread. In that thread, keep a priority queue ordered by delivery time (i.e., this background thread runs a very basic discrete-event simulation). Each "packet" you send or receive would be enqueued along with a delivery time, which has some amount of delay added. I would draw the delay from a random number generator with a Gaussian distribution.
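A sketch of that background thread; the 80 ms mean and 20 ms sigma are placeholders, and delivery is just a comment where the real socket write would go:

    #include <algorithm>
    #include <chrono>
    #include <condition_variable>
    #include <mutex>
    #include <queue>
    #include <random>
    #include <vector>

    using Clock = std::chrono::steady_clock;

    struct Packet {
        Clock::time_point deliver_at;        // when to hand the bytes over
        std::vector<char> data;
    };
    struct Later {                           // min-heap on delivery time
        bool operator()(const Packet& a, const Packet& b) const {
            return a.deliver_at > b.deliver_at;
        }
    };

    std::priority_queue<Packet, std::vector<Packet>, Later> q;
    std::mutex m;
    std::condition_variable cv;

    // Producer side: enqueue with a Gaussian-distributed delay.
    void enqueue(std::vector<char> data) {
        static std::mt19937 rng{std::random_device{}()};
        std::normal_distribution<double> d(80.0, 20.0); // ms, placeholder
        auto delay = std::chrono::milliseconds(
            static_cast<long>(std::max(0.0, d(rng))));
        std::lock_guard<std::mutex> lock(m);
        q.push({Clock::now() + delay, std::move(data)});
        cv.notify_one();
    }

    // Background thread: a very basic discrete-event simulation loop.
    void delay_loop() {
        std::unique_lock<std::mutex> lock(m);
        for (;;) {
            if (q.empty()) { cv.wait(lock); continue; }
            if (cv.wait_until(lock, q.top().deliver_at)
                    == std::cv_status::timeout) {
                Packet p = q.top(); q.pop();
                // ...write p.data to the real socket here...
            }
        }
    }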
The background thread would also have to simulate the other side of the connection, though it sounds like you may have already implemented that part?
I know only the Network Link Conditioner for Mac OS X Lion. You need to be a Mac developer to download it, so I cannot put a download link here. Only the description from 9to5mac.com: http://9to5mac.com/2011/08/10/new-in-os-x-lion-network-link-conditioner-utility-lets-you-simulate-internet-and-bandwidth-conditions/
This answer might be a partial solution for you when using linux:
Simulate delayed and dropped packets on Linux. It refers to a kernel module called netem, which can simulate all kinds of network problems.
If you want to work with TCP connections, having "packet loss" could be problematic, since a lot of error handling (like recovering lost packets) is done in the kernel. Simulating this in a cross-platform way could be hard.
You usually add a network device to your network that throttles bandwidth or latency on a port-by-port basis. You can then achieve what you want just by connecting to the port allocated to the particular type of crappy network you want to test, with no code changes or modifications required.
The easiest way to do this is to add iptables rules to a Linux server acting as a proxy.
If you want it to work without a separate device, try trickle, a software package that throttles your network on the client PC (or a counterpart for Windows).
You may want to check WANem (http://wanem.sourceforge.net/). WANem is open source and licensed under the GNU General Public License.
WANem allows the application development team to set up a transparent application gateway that can be used to simulate WAN characteristics like network delay, packet loss, packet corruption, disconnections, packet re-ordering, jitter, etc.
I think you could use a tool like Network Simulator. It's free, for Windows.
The only thing to do is to set up your program to use the right ports (and the settings for the network, of course).
If you want a software only solution that you control, you will have to implement it yourself. I know of no such existing package.
While a wrapper layer over a socket may give you the ability to introduce delay, it won't be sufficient to introduce loss or out-of-order delivery. In order to simulate those activities, you actually need to intercept the data in transit between the two TCP stacks.
The approach I would recommend is to use a tunneling device (say, tunX). Routes should be set up so the client believes the way to the server is through tunX. Additional code (perhaps running in a different thread) would promiscuously intercept traffic on tunX and perform your augmented behavior before forwarding packets over the true physical interface that will get the traffic to your server. The reverse would happen for packets arriving from the server on the physical interface: those packets would be intercepted by the client code, their behavior augmented, and then forwarded through tunX.
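Opening such a device on Linux takes only a few lines; a sketch of the user-space end (Linux-specific, needs CAP_NET_ADMIN, and the route setup is left out):

    #include <fcntl.h>
    #include <linux/if.h>
    #include <linux/if_tun.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <cstdio>
    #include <cstring>

    // Open tun0 and shuttle raw IP packets through user space, where loss,
    // reordering or delay can be injected before re-sending.
    int main() {
        int fd = open("/dev/net/tun", O_RDWR);
        if (fd < 0) { perror("open /dev/net/tun"); return 1; }
        ifreq ifr{};
        ifr.ifr_flags = IFF_TUN | IFF_NO_PI;   // raw IP, no packet-info header
        std::strncpy(ifr.ifr_name, "tun0", IFNAMSIZ);
        if (ioctl(fd, TUNSETIFF, &ifr) < 0) { perror("TUNSETIFF"); return 1; }
        char pkt[2048];
        for (;;) {
            ssize_t n = read(fd, pkt, sizeof pkt); // one full IP packet per read
            if (n <= 0) break;
            // ...drop / delay / reorder here, then send out the physical NIC...
        }
        close(fd);
    }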
However, since you are testing client software, I am unclear as to why you would want to embed this code in your released software, unless the software itself is a WAN simulating client.

How to count SYN/ESTABLISHED connections to a server?

I want to get the number of SYN and ESTABLISHED connections to my server with C/C++, but I don't want to call popen to run netstat or any other Linux command. I've managed to scan /proc/net/ip_conntrack and get the numbers, but I realized that scanning ip_conntrack requires considerable resources each time my application invokes that method. Is there any other simple way?
Scanning /proc/net/ip_conntrack is not reliable because it only works if netfilter/connection tracking is enabled. And it counts not only connections to your server but also connections through your server (if it's acting as a router).
Better would be to get the information from the same places netstat does: /proc/net/tcp and /proc/net/tcp6 (and the similar files for UDP and other protocols, if you care about those). That amounts more or less to reimplementing netstat inside your application, though; you have to wonder whether it's worth it. Also, calling netstat is (more or less) portable, whereas reading those files directly is Linux-specific.
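The parsing itself is straightforward; a sketch that counts the two states you asked about (the fourth column of /proc/net/tcp holds the state in hex: 01 = ESTABLISHED, 03 = SYN_RECV):

    #include <fstream>
    #include <iostream>
    #include <sstream>
    #include <string>

    int main() {
        std::ifstream tcp("/proc/net/tcp");
        std::string line;
        std::getline(tcp, line);                // skip the header row
        int established = 0, syn_recv = 0;
        while (std::getline(tcp, line)) {
            std::istringstream row(line);
            std::string sl, local, remote, st;
            row >> sl >> local >> remote >> st; // 4th column: state, in hex
            if (st == "01") ++established;      // TCP_ESTABLISHED
            else if (st == "03") ++syn_recv;    // TCP_SYN_RECV (half-open)
        }
        std::cout << "ESTABLISHED: " << established
                  << "  SYN_RECV: " << syn_recv << '\n';
    }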
I know you are concerned about the resources required to scan the full table every time, but I don't think there's a way to "subscribe" and get notifications when new connections are established or torn down. The closest thing I can think of would be to sniff the network interface (using libpcap) and keep track of connection setups and teardowns yourself.

Middleware with generic communication media layer

Greetings all,
I'm trying to implement middleware (a driver) for an embedded device with a generic communication media layer. I am not sure what the best way to do it is, so I'm seeking advice from more experienced Stack Overflow users :). Basically we've got devices around the country communicating with our servers (or a PDA/laptop used in the field). The usual form of communication is over TCP/IP, but it could also be over USB, an RF dongle, IR, etc. The plan is to have an object corresponding to each of these devices, handling the proprietary protocol on one side and requests/responses from other internal systems on the other.
The thing is how to create something generic between the media and the handling objects. I had a play around with a TCP dispatcher using boost.asio, but trying to create something generic seems like a nightmare :). Has anybody tried to do something like that? What is the best way to do it?
Example: a device connects to our Linux server. A new middleware instance is created (on the server) which announces itself to one of the running services (the details are not important). The service is responsible for making sure that the device's time is synchronized. So it asks the middleware for the device's time, the driver translates the request into the device's language (protocol) and sends the message, the device responds, and the driver translates the response back for the service. This might seem like overkill for such a simple request, but imagine there are more complex requests which the driver must translate. Also, there are several versions of the device which use different protocols, etc., but they would all use the same time-sync service. The goal is to abstract the devices through the middleware so the same service can communicate with any of them.
Another example: we find out that remote communication with a device is down, so we send somebody out with a PDA. He connects to the device using a USB cable and starts up an application which has the same functionality as the time-sync service. Again, a middleware instance is created (on the PDA) to translate the communication between the application and the device, this time over USB/serial media rather than TCP/IP as in the previous example.
I hope it makes more sense now :)
Cheers,
Tom
The thing is how to create something generic between the media and the handling objects. I had a play around with a TCP dispatcher using boost.asio, but trying to create something generic seems like a nightmare :). Has anybody tried to do something like that? What is the best way to do it?
I haven't used Boost, but the way I usually handle this kind of problem is to create a Device base class which the server interacts with, then subclass it for each device type and make the subclasses deal with the device oddities. That way, the Device class becomes a definition of your protocol. Also, the Device class needs to be portable, but the subclasses do not.
If you have to get fancier than that, you can use the Factory pattern to create the actual subclassed objects, as in the sketch below.
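A minimal sketch of that shape (class and method names are illustrative, and the wire-level bodies are stubbed out):

    #include <memory>
    #include <stdexcept>
    #include <string>

    // The base class defines the protocol the server speaks; each subclass
    // hides one device family's wire format and transport.
    class Device {
    public:
        virtual ~Device() = default;
        virtual std::string getTime() = 0;            // ask device for its clock
        virtual void setTime(const std::string&) = 0; // push the corrected time
    };

    class TcpDeviceV1 : public Device {               // protocol v1 over TCP/IP
    public:
        std::string getTime() override { /* ...speak v1... */ return {}; }
        void setTime(const std::string&) override { /* ... */ }
    };

    class UsbDevice : public Device {                 // USB/serial variant
    public:
        std::string getTime() override { /* ...speak serial... */ return {}; }
        void setTime(const std::string&) override { /* ... */ }
    };

    // Factory: pick the concrete driver from whatever the device announces.
    std::unique_ptr<Device> makeDevice(const std::string& kind) {
        if (kind == "tcp-v1") return std::make_unique<TcpDeviceV1>();
        if (kind == "usb")    return std::make_unique<UsbDevice>();
        throw std::runtime_error("unknown device type: " + kind);
    }

The time-sync service then only ever calls getTime()/setTime() on a Device*, regardless of which transport sits underneath.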
As far as actual communication goes, I'd see if I could just run one process per Device. If you have to have more than one Device per process, on Linux I'd just use select() and its friends to manage I/O between the various Device instances. I don't know how to do that on Windows; its select() only works for sockets, not serial ports or other file-like things.
Other things that come to mind that might be useful include dbus and the MPI (Message Passing Interface) library, though they aren't complete solutions for your problem (dbus doesn't do inter-computer communication, IIRC).
Does this help at all?
EDIT: Needed a formatted response to Tom's reply...
Does your Device class contain the communication-specific parts? Because that's the thing I wanted to avoid.
The subclasses contain the communication specific parts. That's the whole point of using subclasses here; the generic stuff goes in the base class, and the specifics go in the subclass.
I was thinking about something like this: say there is a dispatcher specific to the media used which creates a Connection object for each connection (media-specific); a Device object would be created as well, but just a generic one, and the Connection would pass the incoming data to the Device while the Device would pass the responses back to the Connection.
I think that may be a bit complex, and you're expecting a generic Device to deal with a specific Connection, which can quickly become hard to maintain.
What I'd recommend is a Device subclass specifically for handling that type of Connection which takes the Connection from the dispatcher and owns it until the connection closes. Then your manager can talk to the generic Device and the Connection can mess with the specifics.
An example: say you have a temperature-sensor USB thingamajig. You have some dispatcher that catches the "USB thing plugged in" signal. When it sees the USB thing plugged in:
Dispatcher creates a USBTemperatureThingConnection.
Dispatcher creates a USBTemperatureDevice, which is a subclass of Device, giving the USBTemperatureThingConnection to the USBTemperatureDevice as a constructor parameter.
USBTemperatureDevice::USBTemperatureDevice(USBTemperatureThingConnection* conn) goes and sets up whatever it needs locally to finish setting up the Connection, then sends a message to the Device Manager saying it has set itself up.
Some time later, the Device Manager wants to set the time on all devices. So it iterates through its list of devices and calls the generic (maybe even abstract) Device::SetTime(const struct timespec_t&) method on each of them.
When it gets to your temperature device, it calls USBTemperatureDevice::SetTime(const struct timespec_t&), since USBTemperatureDevice overrode the one in Device (which was either abstract, i.e. virtual void SetTime(const struct timespec_t&) = 0; or a no-op, i.e. virtual void SetTime(const struct timespec_t&) {}, so you don't have to override it for devices that can't set time). USBTemperatureDevice::SetTime(const struct timespec_t&) does whatever USB Temperature sensor-specific things are needed, using the USBTemperatureThingConnection, to get the time set.
Some time later, the device might send back a "Time Set Result" message saying whether it worked. That comes in on the USBTemperatureThingConnection, which wakes up your thread so it can be dealt with. Your USBTemperatureDevice::DealWithMessageFromSensor() method (which only exists in USBTemperatureDevice) dives into the message contents and figures out whether setting the time worked. It then turns that result into a value of enum Device::ResultCode and calls Device::TimeSetComplete(ResultCode result), which records the result, sets a flag (bool Device::timeComplete) saying the result is in, and then hits a semaphore or condition variable to wake up the Device Manager and get it to re-check all the Devices, in case it was blocked waiting for all of them to finish setting time before continuing.
I have no idea what that pattern is called. If pressed, I'd say "subclassing", or "object-oriented design" if I felt grumpy. The "middleware" is the Device class, the DeviceManager, and all their underlings. The application then just talks to the Device Manager, or at most to the generic Device interface of a specific device.
Btw. the Factory pattern was planned; each object would run in a separate thread :)
Good to hear.
I'm assuming by TCP/IP you mean remote nodes, and by USB, etc. the local devices connected to the same physical box. I think what is missing in your explanation is the part that announces new local devices to the server process (i.e., the analog of a listening socket). Assuming something along the lines of Linux uevent, I would start with the following structure:
Controller: knows correct time, manages event sources, reacts to events
Event source: produces "new/delete device" events, knows its device class
    server socket
    local device monitor
    etc.
Device class: encapsulates device-specific logic and manages/enumerates devices
    remote device type (connected socket)
    USB device type
    USB device version X.Y.Z type
    etc.
The high-level protocol is very simple: on receipt of a "new device" event, query the "Device class" for the time of the given device, then update the time on the device. The "Device class" is the driver/translator/bridge that implements the conversion from the query/update interface to device-specific commands (a network exchange for remote nodes). It also holds a list of its devices; a minimal sketch of these interfaces follows.
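One possible mapping to code, with all names hypothetical and the time type reduced to a plain long for brevity:

    #include <memory>
    #include <vector>

    struct Event { enum Kind { NewDevice, DeleteDevice } kind; int device_id; };

    // Device class: device-specific logic plus device enumeration.
    class DeviceClass {
    public:
        virtual ~DeviceClass() = default;
        virtual long queryTime(int device_id) = 0;
        virtual void updateTime(int device_id, long now) = 0;
    };

    // Event source: produces new/delete-device events, knows its device class.
    class EventSource {
    public:
        virtual ~EventSource() = default;
        virtual bool poll(Event& out) = 0;      // non-blocking event fetch
        virtual DeviceClass& deviceClass() = 0;
    };

    // Controller: knows the correct time, reacts to events from all sources.
    class Controller {
        std::vector<std::unique_ptr<EventSource>> sources_;
    public:
        void runOnce(long correct_time) {
            Event ev;
            for (auto& src : sources_)
                while (src->poll(ev))
                    if (ev.kind == Event::NewDevice) {
                        auto& dc = src->deviceClass();
                        if (dc.queryTime(ev.device_id) != correct_time)
                            dc.updateTime(ev.device_id, correct_time);
                    }
        }
    };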
This should easily map to a class diagram. Was there something else that I missed?