How do we handle persistence when using Akka routers? We want to persist the unprocessed messages in the mailbox and be able to process them after server crashes and restarts. I know classic dispatchers provide a file-based mailbox for persistence.
Related
If a remote actor dies, the parent actor gets notified, but what happens to the mailbox attached to the remote actor?
If there is no way to retrieve it, then how can we say Akka is fault tolerant?
One way is to implement Akka Persistence:
By default, a persistent actor is automatically recovered on start and on restart by replaying journaled messages. New messages sent to a persistent actor during recovery do not interfere with replayed messages. New messages will only be received by a persistent actor after recovery completes.
http://doc.akka.io/docs/akka/2.4.4/java/lambda-persistence.html#Recovery
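For illustration, here is a minimal sketch of such a persistent actor in Scala (the DurableWorker and Work names are made up for the example):

    import akka.actor.ActorLogging
    import akka.persistence.{PersistentActor, RecoveryCompleted}

    case class Work(payload: String)

    class DurableWorker extends PersistentActor with ActorLogging {
      override def persistenceId: String = "durable-worker-1"

      // Replayed on start/restart: journaled messages rebuild state first.
      override def receiveRecover: Receive = {
        case Work(payload)     => log.info("replaying {}", payload)
        case RecoveryCompleted => log.info("recovery complete")
      }

      // New messages are journaled before being handled; anything that
      // arrives during recovery is held back until recovery completes.
      override def receiveCommand: Receive = {
        case w: Work =>
          persist(w) { persisted => log.info("processing {}", persisted.payload) }
      }
    }

Messages persisted this way survive a crash: on restart the journal is replayed into receiveRecover before receiveCommand sees anything new.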
You can also make sure that the remote actor you're sending messages to is a supervisor that spawns child actors to handle the remote requests. That way the work and the failures are contained by those children rather than by your main remote receiver actor.
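A rough sketch of that supervision shape (all names hypothetical):

    import akka.actor.{Actor, OneForOneStrategy, Props, SupervisorStrategy}
    import akka.actor.SupervisorStrategy.Restart

    class RemoteReceiver extends Actor {
      // A failing child is restarted on its own; the receiver keeps running.
      override val supervisorStrategy: SupervisorStrategy =
        OneForOneStrategy() { case _: Exception => Restart }

      def receive: Receive = {
        case request =>
          // One worker per request: the work and any crash stay in the child.
          context.actorOf(Props[Worker]) forward request
      }
    }

    class Worker extends Actor {
      def receive: Receive = {
        case request => sender() ! s"handled: $request"
      }
    }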
I can't find any definitive answer here. My IoT service needs to tolerate flaky connections. Currently, I manage a local cache myself and retry a cloud-blob transfer as often as required. Could I replace this with an Azure Event Hubs service? That is, will the Event Hubs client (on IoT-Core) buffer events until the connection is available? If so, where is the info on this?
It doesn't seem so according to:
https://azure.microsoft.com/en-us/documentation/articles/event-hubs-programming-guide/
You are responsible for sending and caching, it seems:
Send asynchronously and send at scale
You can also send events to an Event Hub asynchronously. Sending asynchronously can increase the rate at which a client is able to send events. Both the Send and SendBatch methods are available in asynchronous versions that return a Task object. While this technique can increase throughput, it can also cause the client to continue to send events even while it is being throttled by the Event Hubs service and can result in the client experiencing failures or lost messages if not properly implemented. In addition, you can use the RetryPolicy property on the client to control client retry options.
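In other words, the local cache and retry loop stay your responsibility. A minimal sketch of that caller-side buffering pattern, in Scala, where sendToHub is a hypothetical stand-in for whatever client send call you use (it is not an Event Hubs API):

    import scala.collection.mutable
    import scala.util.{Failure, Success, Try}

    object BufferedSender {
      // Local cache of events that have not yet reached the hub.
      private val localCache = mutable.Queue.empty[String]

      // Hypothetical transport call; swap in your real client send here.
      private def sendToHub(event: String): Try[Unit] =
        Try { /* client.send(event) */ }

      // Enqueue first, then try to drain: events survive a flaky connection.
      def submit(event: String): Unit = {
        localCache.enqueue(event)
        drain()
      }

      private def drain(): Unit =
        while (localCache.nonEmpty) {
          sendToHub(localCache.head) match {
            case Success(_) => localCache.dequeue()
            case Failure(_) => return // keep the rest cached; retry later
          }
        }
    }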
I have a specific use-case for an Akka implementation.
I have a set of agents that send heartbeats to Akka. Akka takes each heartbeat and assigns actors to send it to my meta-data server (a separate server). This part is done.
Now my meta-data server also needs to send action information to the agents. However, since these agents may be behind firewalls, Akka cannot communicate with them directly, so it needs to send the action as a response to the heartbeat. Thus, when the meta-data server sends an action, Akka stores it in a DurableMessageQueue (a separate one for each agent ID) and keeps the mapping of agent ID to DurableMessageQueue in a HashMap. Then, whenever a heartbeat comes in, before responding it checks this queue and piggybacks the action onto the response.
The issue with this is that the HashMap will live in a single JVM, and therefore I cannot scale this. Am I missing something, or is there a better way to do it?
I have Akka running behind a Mina server, which receives and sends messages.
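For reference, the agent-ID-to-queue mapping described above might look like this sketch (Scala, all names hypothetical, and deliberately simplified; note the state is confined to one JVM, which is exactly the scaling limit in question):

    import scala.collection.concurrent.TrieMap

    object PendingActions {
      // agentId -> actions waiting to piggyback on the next heartbeat.
      private val pending = TrieMap.empty[String, Vector[String]]

      // Called when the meta-data server pushes an action for an agent.
      // (Sketch only: the read-modify-write below is not race-free.)
      def enqueue(agentId: String, action: String): Unit = {
        val current = pending.getOrElse(agentId, Vector.empty)
        pending.put(agentId, current :+ action)
      }

      // Called on each heartbeat: drain the queue and attach it to the response.
      def drainFor(agentId: String): Vector[String] =
        pending.remove(agentId).getOrElse(Vector.empty)
    }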
I have an AMQP application with a persistent RabbitMQ queue on the client side and a RabbitMQ queue on the server side. The client always writes to the local persistent queue, and those messages are transmitted to the server using the shovel plugin.
Producer -> Local Queue --------- SHOVEL ---------- Remote queue -> Consumer
Even when the server is not present, the app still works, and shovel does the send when possible. On the other hand, the server doesn't need to know the location of the clients because it always consumes from local queues. I would like to migrate this topology to Akka using the file-based persistent mailbox. Is it even possible? Is there something like the Federation or Shovel plugin in the Akka core libraries?
PS: What I want to achieve is to replace AMQP completely and get rid of RabbitMQ. It works fine, but it is another piece of software to install, configure, and maintain. I would like to provide all this functionality from my application using just libraries, not another server like RabbitMQ.
Just to clarify, what I'm looking to achieve is something like this:
Actor1 -> DurableMailBox1 ----Shovel? Federation?---- DurableMailbox2 <- Actor2
[EDIT]
It looks like there is no way for one mailbox to communicate directly with another. The possible topologies that can be implemented with Akka are these:
Remote Actor1 -> [DurableMailBox1 <- Actor2]
where the arrow can be made reliable in order to ensure message delivery, but it is not possible to copy messages from one mailbox to another automatically.
Take a look at Akka Remoting and the Reliable Proxy Pattern.
Sending via a ReliableProxy makes the message send exactly as reliable as if the represented target were to live within the same JVM, provided that the remote actor system does not terminate. In effect, both ends (i.e. JVM and actor system) must be considered as one when evaluating the reliability of this communication channel. The benefit is that the network in-between is taken out of that equation.
See this enhancement to the ReliableProxy that mitigates the problem with the remote actor system terminating.
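A minimal Scala sketch of wiring one up, assuming the akka-contrib module from the Akka 2.3/2.4 era (the remote path is hypothetical):

    import akka.actor.{Actor, ActorPath, Props}
    import akka.contrib.pattern.ReliableProxy
    import scala.concurrent.duration._

    class Forwarder extends Actor {
      // Hypothetical address of the remote target actor.
      val target: ActorPath =
        ActorPath.fromString("akka.tcp://Sys@remotehost:2552/user/receiver")

      // The proxy retries unacknowledged messages every 100 ms.
      val proxy = context.actorOf(ReliableProxy.props(target, 100.millis))

      def receive: Receive = {
        case msg => proxy ! msg // as reliable as a local send, per the quote above
      }
    }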
We have a significantly complex Django application currently served by apache/mod_wsgi and deployed on multiple AWS EC2 instances behind an AWS ELB load balancer. Client applications interact with the server using AJAX. They also periodically poll the server to retrieve notifications and updates to their state. We wish to remove the polling and replace it with "push", using web sockets.
Because arbitrary instances handle web socket requests from clients and hold onto those web sockets, and because we wish to push data to clients who may not be on the same instance that provides the source data for the push, we need a way to route data to the appropriate instance and then from that instance to the appropriate client web socket.
We realize that apache/mod_wsgi does not play well with web sockets and plan to replace these components with nginx/gunicorn and use the gevent-websocket worker. However, if one of several worker processes receives requests from clients to establish a web socket, and if the lifetime of worker processes is controlled by the main gunicorn process, it isn't clear how other worker processes, or in fact non-gunicorn processes, can send data to these web sockets.
A specific case is this one: a user who issues an HTTP request is directed to one EC2 instance (host), and the desired behavior is that data is to be sent to another user who has a web socket open on a completely different instance. One can easily envision a system where a message broker (e.g. rabbitmq) running on each instance can be sent a message containing the data to be sent via web sockets to the client connected to that instance. But how can the handler of these messages access the web socket, which was received in a worker process of gunicorn?

The high-level Python web socket objects created by gevent-websocket and made available to a worker cannot be pickled (they are instance methods with no support for pickling), so they cannot easily be shared by a worker process with some long-running, external process.
In fact, the root of this question comes down to this: how can web sockets that are initiated by HTTP requests from clients and handled by WSGI handlers in servers such as gunicorn be accessed by external processes? It doesn't seem right that gunicorn worker processes, which are intended to handle HTTP requests, would spawn long-running threads to hang onto web sockets and support handling messages from other processes to send messages to the web sockets that have been attached through those worker processes.
Can anyone explain how web sockets and WSGI-based HTTP request handlers can possibly interplay in the environment I've described?

Thanks.
I think you've made the correct assessment that mod_wsgi + websockets is a nasty combination.
You would find all of your WSGI workers hogged by the web sockets, and an attempt to (massively) increase the size of the worker pool would probably choke the server because of the memory usage and context switching.
If you'd like to stick with the synchronous WSGI worker architecture (as opposed to the reactive approach implemented by gevent, twisted, tornado, etc.), I would suggest looking into uWSGI as an application server. Recent versions can handle some URLs in the old way (i.e. your existing Django views would still work the same as before) and route other URLs to an async websocket handler. This might be a relatively smooth migration path for you.
It doesn't seem right that gunicorn worker processes, which are intended to handle HTTP requests would spawn long-running threads to hang onto web sockets and support handling messages from other processes to send messages to the web sockets that have been attached through those worker processes.
Why not? This is a long-running connection, after all. A long-running thread to take care of such a connection would seem... absolutely natural to me.
Often in these evented situations, writing is handled separately from reading.
A worker that is currently handling a websocket connection would wait for a relevant message to come down from a messaging server, and then pass that down the websocket.
You can also use gevent's async-friendly Queues to handle in-code message passing, if you like.
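gevent's Queue is Python-specific, but the shape of the pattern, one loop owning the socket and blocking on an in-process queue while any other code enqueues, can be sketched like this (Scala with a plain BlockingQueue; sendDownSocket is a hypothetical stand-in for the websocket write):

    import java.util.concurrent.LinkedBlockingQueue

    object SocketWriter {
      // In-process handoff between message producers and the loop that
      // owns the websocket (the role gevent's Queue plays).
      val outbound = new LinkedBlockingQueue[String]()

      // Hypothetical stand-in for the actual websocket write call.
      private def sendDownSocket(msg: String): Unit = println(s"-> $msg")

      // Writer loop: blocks until a message arrives, then pushes it out.
      def writerLoop(): Unit =
        while (true) sendDownSocket(outbound.take())
    }

    // Elsewhere, any thread can hand a message to the connection's writer:
    // SocketWriter.outbound.put("notification for this client")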