All,
I'm dealing with what seems like a simple case, but one that poses some design challenges:
1. There's a local actor system with a client, which reaches out to a remote system that runs the bulk of the business logic.
2. The remote system will have a fixed IP address, port, etc. - therefore, one can use the context.actorSelection(uri) strategy to get a hold of the ActorRef for the current incarnation of the actor (or a group of routees behind a router).
3. The remote system, being a server, shouldn't be in the business of knowing the location of the client.
Given this, it's pretty straightforward to propagate messages from the client to the server, process them, and send a message back to the client. Even if there are several steps, one can propagate the responses through the hierarchy until one reaches the top-level remote actor that the client called, which will know who the sender was on the client side.
Let's say on the server side, we have a Master actor that has a router of Worker actors. You can have the worker respond directly to the client, since the message received from the Client by the Master can be sent to the Worker via the router as "router.tell(message, sender)" instead of "router ! message." Of course, you can also propagate responses from the Worker to the Master and then to the Client.
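For example, a minimal sketch of that forwarding (the names Master, Worker and Work are illustrative, and a recent classic-Akka API with RoundRobinPool is assumed):

    import akka.actor.{Actor, ActorRef, Props}
    import akka.routing.RoundRobinPool

    case class Work(payload: String)

    class Worker extends Actor {
      def receive = {
        case Work(payload) =>
          // sender() is the Client here, because the Master forwarded the message
          sender() ! s"processed: $payload"
      }
    }

    class Master extends Actor {
      val router: ActorRef = context.actorOf(RoundRobinPool(5).props(Props[Worker]), "router")
      def receive = {
        case w: Work =>
          // keep the Client as the sender seen by the Worker
          router.tell(w, sender()) // equivalent to: router forward w
      }
    }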
However, let's say the Worker throws an exception. If its parent (the Master) is its supervisor and handles the Workers' failures, the Master can do the usual Restart/Resume/Stop. But let's say that we also want to notify the Client of the failure, e.g. for UI updating purposes. Even if we handle the failure of the Worker via the Master's SupervisorStrategy, at the point where the Master intercepts the Worker's failure we won't know who the original caller (the Client) was, or which request payload was being processed.
Here's a diagram:
Client (local) -> Master (remote) -> Router (remote) -> Worker (remote)
Worker throws an exception, Master handles it. Now the Master can restart the Worker, but it doesn't know which Client to notify, in case there are several, their IP addresses change, etc.
If there's one Client and the Client has a host/port/etc. known to the server, then one could use context.actorSelection(uri) to look up the client and send it a message. However, with the server not being in the business of knowing where the Client is coming from (point 3 above), this shouldn't be a requirement.
One obvious solution to this is to propagate messages from the Client to the Worker with the Client's ActorRef in the payload, in which case the Master would know whom to send the failure notification to. It seems ugly, though. Is there a better way?
I suppose the Client can have the Workers on DeathWatch, but the Client shouldn't really have to know the details of the actor DAG on the server. So, I guess I'm coming back to the issue of whether the message sent from the Client should contain not just the originally intended payload, but also the ActorRef of the Client.
Also, this brings up another point. Akka's "let it crash" philosophy suggests that the actor's supervisor should handle the actor's failures. However, if we have a Client, a Master (with a router) and a Worker, and the Worker fails, the Master can restart it - but it would have to tell the Client that something went wrong. In such a case, the Master would have to correlate the messages from the Client to the Workers to let the Client know about the failure. Another approach is to send the ActorRef of the Client along with the payload to the Worker, which would allow the Worker to use the standard try/catch approach to intercept a failure, send a message to the Client before failing, and then throw an exception that would be handled by the Master. However, this seems against Akka's general philosophy. Would Akka Persistence help in this case, since Processors track message IDs?
Thanks in advance for your help!
Best,
Marek
Quick answer: use this:
def preRestart(reason: Throwable, message: Option[Any]): Unit
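That is, override preRestart in the Worker and use the failed message (and its sender) to notify the Client before the restart happens. A rough sketch, with made-up message types:

    import akka.actor.Actor

    case class Work(payload: String)
    case class WorkFailed(payload: String, reason: String)

    class Worker extends Actor {
      def receive = {
        case Work(payload) =>
          // business logic that may throw goes here
          sender() ! s"processed: $payload"
      }

      override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
        // 'message' is the message being processed when the failure occurred;
        // sender() at this point is typically still the sender of that failed message
        // (the Client, if the Master forwarded it)
        message match {
          case Some(Work(payload)) => sender() ! WorkFailed(payload, reason.getMessage)
          case _                   =>
        }
        super.preRestart(reason, message)
      }
    }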
More elaborate answer that gives no easy answers (as I struggle with this myself):
There are several ideas on how you can achieve what you need.
You asked whether the worker should answer the client or the master. Well, that depends.
Let's assume that the client sends you some work W1 and you pass it to the worker. The worker fails. Now the question is whether that work was important. If so, the master should still hold a reference to W1, as it will probably retry the attempt in the near future. Maybe it was some data that should be persisted and the connection to the database was lost for a second?
If the work was not important, you may just let a timeout on the client indicate that the operation was unsuccessful, and you're done. This way you will lose the exception details. But maybe that does not matter? You only want to check the logs afterwards, and you just give a '500 Server Error' response.
This is not as easy to answer as it seems at first.
One possibility is to side-step most of this complexity by changing your approach. This may or may not be feasible for your use case.
For example, if the Exception can be anticipated and it is not of a sort that requires a restart of the worker actor, then don't let the master supervisor handle it. Simply build an appropriate response for that Exception (possibly the Exception itself, or something containing the Exception), and send that to the client as a normal response message. You could send a Scala Try message, for example, or create whatever messages make sense.
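A rough sketch of that idea, using scala.util.Try as the response envelope (the message names and the process method are made up):

    import akka.actor.Actor
    import scala.util.Try

    case class Work(payload: String)

    class Worker extends Actor {
      def receive = {
        case Work(payload) =>
          // anticipated failures become a normal reply instead of a supervisor escalation
          sender() ! Try(process(payload))
      }

      // stand-in for the real business logic; may throw an anticipated exception
      def process(payload: String): String = s"processed: $payload"
    }

The client then pattern-matches on Success/Failure and updates the UI accordingly.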
Of course, there are unexpected Exceptions, but in this case the actor dealing with the UI can simply time out and return a general error. Since the exception is unexpected, you probably wouldn't be able to do better than a general UI error anyway (e.g. a 500 error if the UI is HTTP-based), even if the exception were propagated to that layer. One downside, of course, is that the timeout will take longer to report the problem to the UI than if the error were propagated explicitly.
Lastly, I don't think there is anything wrong at all with sending ActorRefs as part of the payload to handle this case from within the master actor, as you suggested. I believe ActorRefs were designed explicitly with the intent of being sent between actors (including remote actors). From the ScalaDoc of ActorRef:
Immutable and serializable handle to an actor, which may or may not reside on the local host or inside the same ActorSystem.
...
ActorRefs can be freely shared among actors by message passing.
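So a request message carrying the Client's ActorRef could be as simple as the following (names are illustrative); the Master keeps the replyTo of in-flight work and uses it when it intercepts a Worker failure:

    import akka.actor.ActorRef

    // sent by the Client; 'replyTo' travels with the payload across the remote boundary
    case class ProcessRequest(payload: String, replyTo: ActorRef)

    // sent back by the Master (or Worker) on failure, e.g. for UI updates
    case class ProcessingFailed(payload: String, reason: String)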
Related
I want to implement long polling in a web service. I can set a sufficiently long time-out on the client. Can I give a hint to intermediate networking components to keep the response open? I mean NATs, virus scanners, reverse proxies or surrounding SSH tunnels that may sit between the client and the server and are not under my control.
A download may last for hours, but an idle connection may be terminated in less than a minute. This is what I want to prevent. Can I inform the intermediate network that this connection is intentionally idle, and not idle because the server has disconnected?
If so, how? I have been searching for around four hours now, but I can't find information on this.
Should I send 200 OK, maybe some headers, and then nothing?
Do I have to respond 102 Processing instead of 200 OK, and everything is fine then?
Should I send 0x16 (synchronous idle) bytes every now and then? If so, before or after the initial HTTP status code, before or after the headers? Do they make it into the transferred file, and might they break it?
The web service / server is in C++ using Boost and the content file being returned is in Turtle syntax.
You can't force proxies to extend their idle timeouts, at least not without having administrative access to them.
The good news is that you can design your long polling solution in such a way that it can recover from a connection being suddenly closed.
One such design would be as follows:
Since long polling is normally used for event notifications (think the Observer pattern), you associate a serial number with each event.
The client makes a GET request carrying the serial number of the last event it has seen, either as part of the URL or in a cookie.
The server maintains a buffer of recent events. Upon receiving a GET request from the client, it checks if any of the buffered events need to be sent to the client, based on their serial numbers and the serial number provided by the client. If so, all such events are sent in one HTTP response. The response finishes at that point, in case there is a proxy that wants to buffer the whole response before relaying it further.
If the client is up to date, that is, it didn't miss any of the buffered events, the server delays its response until another event is generated. When that happens, it is sent as one complete HTTP response.
When the client receives a response, it immediately sends a new request. When it detects that the connection was closed, it opens a new connection and makes a new request.
When using cookies to convey the serial number of the last event seen by the client, the client side implementation becomes really simple. Essentially you just enable cookies on the client side and that's it.
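To make the serial-number bookkeeping concrete, here is a tiny sketch of the server-side decision (written in Scala purely for illustration; it is not the C++/Boost implementation):

    object LongPollLogic {
      // each event carries a serial number; the client sends the last serial it has seen
      case class Event(serial: Long, data: String)

      def eventsToSend(buffered: Vector[Event], lastSeenByClient: Long): Vector[Event] =
        buffered.filter(_.serial > lastSeenByClient)
      // non-empty result -> reply immediately with all of them and finish the response;
      // empty result     -> park the request until the next event is generated
    }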
I have to design a server which is able to send the same objects to many clients. Clients may send requests to the server if they want to update something in the database.
Things which are confusing:
1. My server should start the program (where I perform some operation and produce 'results'; these will be sent to the client).
2. My server should listen for incoming connections from clients; if there are any, it should accept them and start sending the 'results'.
3. The server should accept as many clients as possible (not more than 100).
4. My 'result' should be secured. I don't want someone to take my 'result' and see what my program logic looks like.
I thought point 1 would be one thread, and point 2 another thread which creates multiple threads within its scope to serve point 3. Point 4 should be handled by my application logic when serialising the 'result', rather than by the server.
Is it a bad idea? If so, where can I improve?
Thanks
Putting every connection on its own thread is very bad, and is apparently a common mistake beginners make. Every thread costs about 1 MB of memory, and this will bloat your program for no good reason. I asked the very same question before, and I got a very good answer. I used Boost.Asio, and the server/client project has been finished for months; it's now running beautifully.
If you use C++ and SSL (to secure your connection), no one will see your logic, since your programs are compiled. But you have to write your own communication protocol/serialization in that case.
I'm learning Akka, and I'm struggling to find a good pattern to share a single, limited resource among the whole actor hierarchy.
My use case is that I have an HTTP REST endpoint to which I'm only allowed 10 simultaneous connections at any time. Different actors at different levels of the hierarchy need to be able to make HTTP REST calls. I'm using non-blocking I/O to make the HTTP requests (AsyncHttpClient).
The obvious solution is to have a single actor in charge of this REST resource, and have any actors who want to access it send a message to it and expect a reply at a later stage, however:
Having a single actor in charge of this resource feels a bit fragile to me
How should any "client" actor know how to reach this resource manager actor? Is it best to create it at a well-known location like /user/rest-manager and use an actor selection, or is it better to pass its ActorRef to every actor that needs it (meaning it will need to be passed down through a lot of actors that don't use it, just so they can in turn pass it down)?
In addition, how to deal with "blocking" the client actors when 10 connections are already in progress, especially since I'm using non-blocking I/O? Is it best practice to re-send a message to self (perhaps after some time) as a wait pattern?
I also thought of a token-based approach where the resource manager actor could reply with "access tokens" to client actors that need to access the resource, until exhaustion. However, it means that client actors are supposed to "return" the token once they're done, which doesn't sound ideal, and I would also need to cater for actors dying without returning the token (with some sort of expiration timeout, I guess).
What are the patterns / best practices to deal with that situation?
Updated: To indicate I'm using non-blocking I/O
My suggestions would be:
Use the Error Kernel pattern, as the REST endpoint, as you said, is fragile code (I/O operations can generate all kinds of errors). In other words, a Master/Worker actor hierarchy, where the Workers do the job while the Master does the supervision.
The connection limit can be handled by Akka's routing feature, where the number of routees is, in your case, 10. This also falls into the Master/Worker category (see the sketch after this list).
Addressing - either way sounds good
Connection timeouts - to be handled by the client code, as is done in the majority of networking libraries.
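A hedged sketch of that Master/Worker layout (actor and message names are made up, and the actual HTTP call is stubbed out):

    import akka.actor.{Actor, ActorRef, Props}
    import akka.routing.RoundRobinPool

    case class RestCall(uri: String)

    class RestWorker extends Actor {
      def receive = {
        case RestCall(uri) =>
          // issue the non-blocking request here (e.g. via AsyncHttpClient)
          // and pipe the eventual result back to sender()
          sender() ! s"stub response for $uri"
      }
    }

    class RestManager extends Actor {
      // at most 10 routees, matching the 10-connection limit
      val router: ActorRef =
        context.actorOf(RoundRobinPool(10).props(Props[RestWorker]), "rest-router")

      def receive = {
        case call: RestCall => router forward call
      }
    }

Creating the manager at a well-known path, e.g. system.actorOf(Props[RestManager], "rest-manager"), lets any actor reach it via context.actorSelection("/user/rest-manager").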
I have a specific use-case for an Akka implementation.
I have a set of agents that send heartbeats to Akka. Akka takes these heartbeats and assigns actors to send them to my meta-data server (a separate server). This part is done.
Now my meta-data server also needs to send action information to the agents. However, since these agents may be behind firewalls, Akka cannot communicate with them directly, so it needs to send the action as a response to the heartbeat. Thus, when the meta-data server sends an action, Akka stores it in a DurableMessageQueue (a separate one for each agent ID) and keeps the mapping of agent ID to DurableMessageQueue in a HashMap. Then, whenever a heartbeat comes in, before responding it checks this queue and piggybacks the action on the response.
The issue with this is that the HashMap will be in a single JVM and therefore I cannot scale this. Am I missing something, or is there a better way to do it?
I have Akka running behind a Mina server, which receives and sends the messages.
I'm new to Akka and intend to use it in my new project as a data replication mechanism.
In this scenario, there is a master server and a replica data server. The replica should contain the same data as the master. Each time a data change occurs in the master, it sends an update message to the replica server. Here the master server is the Sender, and the replica server is the Receiver.
But after digging the docs I'm still not sure how to satisfy the following use cases:
When the receiver crashes, the sender should pile up messages to send; no messages should be lost. It should be able to reconnect to the receiver later and continue from the last successful message.
When the sender crashes, it should restart, and no messages should be lost across the restart.
Messages are processed in the same order they were sent.
So my question is, how do I configure Akka to create a sender and a receiver that can do this?
I'm not sure an actor with a DurableMessageBox could solve this. If it could, how can I simulate the above situations for testing?
Update:
After reading the docs Victor pointed at, I now understand that what I wanted was the once-and-only-once pattern, which is extremely costly.
In the Akka docs it says:
Actual transports may provide stronger semantics, but at-most-once is the semantics you should expect. The alternatives would be once-and-only-once, which is extremely costly, or at-least-once which essentially requires idempotency of message processing, which is a user-level concern.
So in order to achieve guaranteed delivery, I may need to turn to some other MQ solution (for example Kafka), or try to implement once-and-only-once with DurableMessageBox and see whether the complexity involved can be reduced for my specific use case.
You'd need to write your own remoting that utilizes the durable subscriber pattern, as Akka message send guarantees are less strict than what you are going for: http://doc.akka.io/docs/akka/2.0/general/message-send-semantics.html
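For what it's worth, here is a minimal hand-rolled sketch of the at-least-once alternative mentioned in the docs quote above: resend until acknowledged, with an idempotent, order-checking receiver. The names and the redelivery interval are made up, and this only covers the receiver being down; surviving a sender crash would additionally require persisting the unacknowledged buffer (which is where a durable store or MQ comes in):

    import akka.actor.{Actor, ActorRef}
    import scala.concurrent.duration._

    case class Update(seqNr: Long, data: String)
    case class Ack(seqNr: Long)

    class ReplicatingSender(receiver: ActorRef) extends Actor {
      import context.dispatcher
      private var unacked = Map.empty[Long, Update]

      // periodically resend anything not yet acknowledged (at-least-once)
      context.system.scheduler.schedule(1.second, 1.second, self, "redeliver")

      def receive = {
        case u: Update   => unacked += u.seqNr -> u; receiver ! u
        case Ack(seqNr)  => unacked -= seqNr
        case "redeliver" => unacked.values.foreach(receiver ! _)
      }
    }

    class Replica extends Actor {
      private var lastApplied = 0L

      def receive = {
        case Update(seqNr, data) =>
          if (seqNr == lastApplied + 1) { /* apply data to the replica */ lastApplied = seqNr }
          // ack duplicates and the update just applied; ignore gaps so they get redelivered
          if (seqNr <= lastApplied) sender() ! Ack(seqNr)
      }
    }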
Cheers,
√