Configuring spray-servlet to avoid request bottleneck - akka

I have an application which uses spray-servlet to bootstrap my custom Spray routing Actor via spray.servlet.Initializer. The requests are then handed off to my Actor via spray.servlet.Servlet30ConnectorServlet.
From what I can gather, the Servlet30ConnectorServlet simply retrieves my Actor out of the ServletContext that the Initializer had set when the application started, and hands the HttpServletRequest to my Actor's receive method. This leads me to believe that only one instance of my Actor will have to handle all requests. If my Actor blocks in its receive method, then subsequent requests will queue waiting for it to complete.
Now I realize that I can code my routing Actor to use detach() or a complete that returns a Future; however, most of the documentation never alludes to having to do this.
If my above assumption is true (single Actor instance handling all requests), is there a way to configure the Servlet30ConnectorServlet to perhaps load balance the incoming requests amongst multiple instances of my routing Actor instead of just the one? Or is this something I'll have to roll myself by subclassing Servlet30ConnectorServlet?

I did some research and now I understand better how spray-servlet works. It's not spray-servlet that dictates how many Request Handler Actors are created, but rather the plumbing code provided with the example I based my application on.
My assumption all along was that spray-servlet would essentially work like a traditional Java EE application dispatcher, in a handler-per-request fashion (or some reasonable variant of that notion). That is not the case, because it routes the request to an Actor with a mailbox, not to some singleton HttpServlet.
I am now delegating the requests to a pool of actors in order to reduce our potential for bottleneck when our system is under load.
val serviceActor = system.actorOf(RoundRobinPool(config.SomeReasonableSize).props(Props[BootServiceActor]), "my-route-actors")
I am still a bit baffled by the fact that the examples and documentation assume everyone will be writing non-blocking Request Handler Actors under spray. All of their documentation essentially demonstrates complete with non-Future rendering, yet there is no mention in their literature that maybe, just maybe, you might want to create a reasonably sized pool of Request Handler Actors to prevent a slew of requests from bottlenecking the poor single overworked Actor. Or it's possible I've overlooked it.
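As an aside, the same round-robin pool can also be defined in configuration rather than in code, which lets you tune the size without recompiling. A sketch, assuming the actor path my-route-actors (the path and pool size here are illustrative):

```
# application.conf -- deployment-based router (illustrative names/sizes)
akka.actor.deployment {
  /my-route-actors {
    router = round-robin-pool
    nr-of-instances = 8
  }
}
```

The pool is then created with `system.actorOf(FromConfig.props(Props[BootServiceActor]), "my-route-actors")` (akka.routing.FromConfig), and the deployment section above supplies the router type and size.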

Related

Is it an anti-pattern in Akka HTTP to pass around the request context for completion?

Imagine I have a JobsEndpoint class, which contains a JobSupervisor class which then has two child actors, RepositoryActor and StreamsSupervisorActor. The behavior for different requests to this top level JobSupervisor will need to be performed in the appropriate child actor. For example, a request to store a job will be handled exclusively in the RepositoryActor, etc...
My question, then, is if it is an anti-pattern to pass the request context through these actors via the messages, and then complete the request as soon as it makes sense?
So instead of doing this:
Request -> Endpoint ~ask~> JobSupervisor ~ask~> RepositoryActor
Response <- Endpoint <- JobSupervisor <-|return result
I could pass the RequestContext in my messages, such as StoreJob(..., ctx: RequestContext), and then complete it in the RepositoryActor.
I admittedly haven't been using Akka long but I see a few opportunities for improvement.
First, you are chaining "ask" calls, which is expensive: each ask spins up a temporary actor, and if you await the results you are blocking threads. In some cases it's unavoidable, but I think in your case it is avoidable. When you block threads, you're potentially hurting your throughput.
I would have the Endpoint send a message with its ActorRef as a "reply to" field. That way, you don't have to block the Endpoint and JobSupervisor actors. Whenever the RepositoryActor completes the operation, it can send the reply directly to the Endpoint without having to traverse middlemen.
Depending on your messaging guarantee needs, the Endpoint could implement retrying and de-duplicating if necessary.
Ideally each actor will have everything it needs to process a message in the message. I'm not sure what your RequestContext includes but I would consider:
1) Is it hard to create one? This impacts testability. If the RequestContext is difficult to create, I would opt for pulling out just the needed members so that I could write unit tests.
2) Can it be serialized? If you deploy your actor system in a cluster environment, then you'll need to serialize the messages. Messages that are simple data holders work best.
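A minimal sketch of the reply-to pattern described above, with hypothetical message and actor names (nothing here comes from the original poster's code):

```scala
import akka.actor.{Actor, ActorRef, Props}

// Hypothetical messages: the reply-to reference travels with the request,
// so the final worker can answer the endpoint directly.
case class StoreJob(job: String, replyTo: ActorRef)
case class JobStored(id: Long)

class JobSupervisor(repository: ActorRef) extends Actor {
  def receive = {
    // No ask, no blocking: the supervisor just passes the message along.
    case msg: StoreJob => repository ! msg
  }
}

class RepositoryActor extends Actor {
  def receive = {
    // Reply straight to the endpoint carried in the message,
    // skipping the supervisor on the way back.
    case StoreJob(job, replyTo) => replyTo ! JobStored(job.hashCode.toLong)
  }
}
```

The messages are simple data holders, which also addresses points 1) and 2) above: they are trivial to construct in a unit test and straightforward to serialize.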

How to expose an asynchronous api as a custom akka stream Source now that ActorPublisher is deprecated?

With ActorPublisher deprecated in favor of GraphStage, it looks as though I have to give up my actor-managed state for GraphStageLogic-managed state. But with the actor-managed state I was able to mutate state by sending arbitrary messages to my actor, and with GraphStageLogic I don't see how to do that.
So previously if I wanted to create a Source to expose data that is made available via HTTP request/response, then with ActorPublisher demand was communicated to my actor by Request messages to which I could react by kicking off an HTTP request in the background and send responses to my actor so I could send its contents downstream.
It is not obvious how to do this with a GraphStageLogic instance if I cannot send it arbitrary messages. Demand is communicated by OnPull(), to which I can react by kicking off an HTTP request in the background. But then when the response comes in, how do I safely mutate the GraphStageLogic's state?
(aside: just in case it matters, I'm using Akka.Net, but I believe this applies to the whole Akka streams model. I assume the solution in Akka is also the solution in Akka.Net. I also assume that ActorPublisher will also be deprecated in Akka.Net eventually even though it is not at the moment.)
I believe that the question is referring to "asynchronous side-channels" and is discussed here:
http://doc.akka.io/docs/akka/2.5.3/scala/stream/stream-customize.html#using-asynchronous-side-channels.
Using asynchronous side-channels
In order to receive asynchronous events that are not arriving as stream elements (for example a completion of a future or a callback from a 3rd party API) one must acquire an AsyncCallback by calling getAsyncCallback() from the stage logic. The method getAsyncCallback takes as a parameter a callback that will be called once the asynchronous event fires.
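To make the side-channel concrete, here is a sketch of a custom Source stage fed by an asynchronous API. `fetchNext()` stands in for the HTTP request in the question, and all names are illustrative; this is the Akka (JVM) API, which the Akka.NET port mirrors closely:

```scala
import akka.stream._
import akka.stream.stage._
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// A Source stage driven by an asynchronous call (illustrative sketch).
class AsyncApiSource(fetchNext: () => Future[String]) extends GraphStage[SourceShape[String]] {
  val out: Outlet[String] = Outlet("AsyncApiSource.out")
  override val shape: SourceShape[String] = SourceShape(out)

  override def createLogic(attrs: Attributes): GraphStageLogic =
    new GraphStageLogic(shape) {
      // The "side channel": invoking this callback schedules the handler
      // to run inside the stage, so mutating stage state there is safe.
      private val onResponse = getAsyncCallback[String](elem => push(out, elem))

      setHandler(out, new OutHandler {
        override def onPull(): Unit =
          // Kick off the request; the future completes on another thread,
          // but onResponse.invoke delivers the result back safely.
          fetchNext().foreach(onResponse.invoke)
      })
    }
}
```

The key point is that any state the stage owns is only ever touched from handlers and async-callback bodies, which the stage executes one at a time, so no locking is needed.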

Should supervised actors in Akka receive messages directly or via its supervisor?

I recently watched a great video by Riccardo Terrell about Akka.NET and F#, but the question I am going to ask relates to Akka in general. I was puzzled by the part where he discusses supervisors and the supervision strategy on error. In his examples, clients send messages to a supervisor that forwards them to a worker actor by calling worker.Tell(msg, mailbox.Sender()). I wonder how common such a practice is. In our system there are a few places where a client first Asks a supervisor to obtain a worker instance and then makes Tell calls directly to the worker. I am not happy about using Ask, but in our case we need worker affinity, i.e. messages from the same client may need to be routed to the same worker, so handing the client a worker instance simplifies this. But again, Ask is bad. So in the case of supervised actors, are they supposed to receive messages via the supervisor (to avoid Ask), or "it depends"?
One solution could be to use a consistent hashing pool with a simple hashing function; then you can eliminate sending worker references back to the asker.
new ConsistentHashingPool(5).WithHashMapping(o =>
{
    if (o is IHasCustomKey keyed)
        return keyed.Key;
    return null;
});
Any comments welcome!

Use case for Akka PoisonPill

According to the Akka docs for PoisonPill:
You can also send an actor the akka.actor.PoisonPill message, which will stop the actor when the message is processed. PoisonPill is enqueued as ordinary messages and will be handled after messages that were already queued in the mailbox.
Although the usefulness/utility of such a feature may be obvious to an Akka Guru, to a newcomer, this sounds completely useless/reckless/dangerous.
So I ask: What's the point of this message and when would one ever use it, for any reason?!?
We use a pattern called disposable actors:
1) A new temporary actor is created for each application request.
2) This actor may create some other actors to do work related to the request.
3) The processed result is sent back to the client.
4) All temporary actors related to this request are killed. That's the place where PoisonPill is used.
Creating an actor carries very low overhead (about 300 bytes of RAM), so it's quite a good practice.
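The disposable-actor pattern above can be sketched as follows; the message and actor names are illustrative, not from any real codebase:

```scala
import akka.actor.{Actor, ActorRef, PoisonPill}

// Hypothetical per-request message carrying the client's reply address.
case class HandleRequest(payload: String, client: ActorRef)

// One temporary actor per request; it poisons itself once the reply is sent.
class RequestHandler extends Actor {
  def receive = {
    case HandleRequest(payload, client) =>
      client ! s"processed: $payload"
      // PoisonPill is enqueued behind anything already in the mailbox,
      // so in-flight messages are drained before the actor stops.
      self ! PoisonPill
  }
}
```

Because PoisonPill queues like an ordinary message, the handler finishes whatever work was already accepted before stopping, which is exactly what makes it safe for this fire-and-forget cleanup.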

Multiple dispatcher for spray

I am wondering how to handle this specific case.
I have two ClientServices that I want to provide to a web application. By ClientService, I mean a client API that calls some external REST service. So we are in spray-client territory here.
The thing is, one of the two services can be quite intensive and time-consuming but is less frequently called than the other one, which is quicker but called very frequently.
I was thinking of having two dispatchers for the two ClientServices. Let's say we have the query API (ClientService1) and the classification API (ClientService2).
Both services shall indeed be based on the same actor system. In other words, I would like to have two dispatchers in my actor system, then pass them to spray via the client-level API, for instance pipeline.
Is it feasible, scalable and appropriate?
Or would you recommend instead using one dispatcher but with a bigger thread pool?
Also, how can I obtain a dispatcher? Should I create a thread-pool executor myself and get a dispatcher out of it? How do I get an actor system to load/create multiple dispatchers, and how do I retrieve them so that I can pass them to the pipeline method?
I know how to create an actor with a specific dispatcher; there are examples of that, but this is a different scenario. I would not like to go lower than the client-level API, by the way.
Edit
I have found that the system.dispatchers.lookup method can create one, so that should do.
However, one thing remains unclear to me regarding Akka IO / Spray IO.
The manager IO(Http): it is not clear to me which dispatcher it runs on, or whether that can be configured.
Moreover, let's say I pass a different execution context to the pipeline method. What happens? Will IO(Http) still run on the default execution context, or on its own (I don't know how it is done internally)? Also, what exactly will run on the execution context that I pass (in other words, which actors)?
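For reference, dispatchers are declared in configuration and then looked up by path. A sketch of what the two-dispatcher setup could look like in application.conf (the dispatcher names and pool sizes are illustrative):

```
# Two dispatchers with independent thread pools
query-dispatcher {
  type = Dispatcher
  executor = "thread-pool-executor"
  thread-pool-executor { fixed-pool-size = 16 }  # frequent, short calls
}
classification-dispatcher {
  type = Dispatcher
  executor = "thread-pool-executor"
  thread-pool-executor { fixed-pool-size = 4 }   # rare, long-running calls
}
```

Each is then obtained with `system.dispatchers.lookup("query-dispatcher")`; since a dispatcher is an ExecutionContext, the result can be passed wherever the pipeline expects an implicit execution context, keeping the slow classification calls from starving the fast query calls.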