SelectChildName messages from RemoteDeadLetterActorRef - akka

I have two actor systems that communicate via akka remoting.
When I take a look into the JVM heap I am seeing (too) many instances of akka.dispatch.Envelope containing SelectChildName messages from akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef.
The retained heap of these messages is pretty large and causes memory problems.
What is the purpose of these SelectChildName messages? Is there a way to avoid them?
FYI: this seems to be related to Disassociation errors that occur between the two actor systems.
Thanks,
Michail

SelectChildName messages are used by Akka Remoting to resolve a remote actor. If you see a lot of them, there is a chance you are interacting directly with an ActorSelection, instead of an ActorRef.
Every time you send a message to an ActorSelection, for example (these are taken from the docs)
val selection = context.actorSelection("akka.tcp://actorSystemName@10.0.0.1:2552/user/actorName")
selection ! "Pretty awesome feature"
the (possibly remote) actor is resolved, and that involves an exchange of SelectChildName messages by the underlying Akka infrastructure.
If that's the case, try to use ActorRefs directly. You can obtain one from an ActorSelection by using the resolveOne method.
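For example, here is a minimal sketch, assuming the code runs inside an actor with ActorLogging; the remote address and the 5-second timeout are illustrative:

import akka.util.Timeout
import scala.concurrent.duration._
import scala.util.{Failure, Success}

implicit val resolveTimeout: Timeout = 5.seconds
import context.dispatcher // ExecutionContext for the Future callback

val selection = context.actorSelection("akka.tcp://actorSystemName@10.0.0.1:2552/user/actorName")
selection.resolveOne().onComplete {
  case Success(ref)   => ref ! "Pretty awesome feature" // keep and reuse the ref
  case Failure(cause) => log.warning("Could not resolve actor: {}", cause)
}

Once you hold the ActorRef, subsequent sends go straight to it, so no SelectChildName resolution happens per message.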
Citing the docs again:
It is always preferable to communicate with other Actors using
their ActorRef instead of relying upon ActorSelection. Exceptions are
sending messages using the At-Least-Once Delivery facility
initiating first contact with a remote system

Related

Akka.net load balancing and span out processing

I am looking to build a system that can process a stream of requests, each of which needs a long processing time, say 5 minutes. My goal is to speed up request processing with a minimal resource footprint, even when the requests arrive in bursts.
I could use something like a service bus to queue the requests and have multiple processes (i.e. actors in Akka) that subscribe for a message and start processing. I could also have a watchdog that looks at the queue length in the service bus and creates more actors/actor systems or stops a few.
If I want to do the same in an actor system like Akka.NET, how can this be done? Say something like this:
I may want to spin up/stop new Remote Actor systems based on my request queue length
Send the message to any one of the available actors that can start processing, without having to check on the sender side who has the bandwidth to process it.
Messages should not be lost, and if an actor fails, the message should be passed to the next available actor.
Can this be done with Akka.NET, or is this not a valid use case for an actor system? Can someone please share some thoughts or point me to resources where I can get more details?
I may want to spin up/stop new Remote Actor systems based on my request queue length
This is not supported out of the box by Akka.Cluster. You would have to build something custom for it.
However, Akka.NET has pool routers, which can resize automatically according to configurable parameters. You may be able to build something around them.
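For illustration, a resizable pool looks roughly like this in JVM Akka (Scala syntax; Akka.NET's DefaultResizer and pool routers are closely analogous, and Worker plus the bounds are made-up placeholders):

import akka.actor.Props
import akka.routing.{DefaultResizer, RoundRobinPool}

// Starts with 5 routees and grows/shrinks between 2 and 15
// based on mailbox pressure.
val resizer = DefaultResizer(lowerBound = 2, upperBound = 15)
val workers = context.actorOf(
  RoundRobinPool(nrOfInstances = 5, resizer = Some(resizer)).props(Props[Worker]),
  "worker-pool")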
Send the message to any one of the available actors that can start processing, without having to check on the sender side who has the bandwidth to process it.
If you look at Akka.NET routers, there are various strategies that can be used to assign work. SmallestMailbox is probably the closest to what you're after.
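A sketch of that strategy in Scala/Akka syntax (the Akka.NET equivalent is nearly identical; Worker is again a placeholder):

import akka.actor.Props
import akka.routing.SmallestMailboxPool

// Each message goes to the routee with the fewest queued messages.
val pool = context.actorOf(SmallestMailboxPool(5).props(Props[Worker]), "processing-pool")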
Messages should not be lost, and if an actor fails, the message should be passed to the next available actor.
Akka.NET supports At-Least-Once Delivery. Read more about it in the docs or on the Petabridge blog.
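To give a flavor of the approach, here is a compressed sketch using the JVM Akka Persistence AtLeastOnceDelivery trait in Scala (Akka.NET's Akka.Persistence module has an equivalent). The message types, the persistence id and the destination are illustrative, and the destination is expected to reply with Confirm(deliveryId) once it has processed a job:

import akka.actor.ActorSelection
import akka.persistence.{AtLeastOnceDelivery, PersistentActor}

case class Job(payload: String)           // commands coming from clients
case class JobSent(payload: String)       // events written to the journal
case class JobConfirmed(deliveryId: Long)
case class Confirm(deliveryId: Long)      // ack sent back by the destination

class ReliableSender(destination: ActorSelection) extends PersistentActor with AtLeastOnceDelivery {
  override def persistenceId: String = "reliable-sender"

  override def receiveCommand: Receive = {
    case Job(payload) =>
      // Persist first, then keep redelivering until the destination confirms.
      persist(JobSent(payload)) { e =>
        deliver(destination)(deliveryId => (deliveryId, e.payload))
      }
    case Confirm(deliveryId) =>
      persist(JobConfirmed(deliveryId))(e => confirmDelivery(e.deliveryId))
  }

  override def receiveRecover: Receive = {
    case JobSent(payload)         => deliver(destination)(deliveryId => (deliveryId, payload))
    case JobConfirmed(deliveryId) => confirmDelivery(deliveryId)
  }
}

Because delivery is at-least-once, the destination has to de-duplicate by deliveryId or otherwise be idempotent.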
While you may achieve some of your goals with Akka cluster, I wouldn't advise that. Your requirements clearly state that your concerns revolve around:
Reliable message delivery (where service buses and message queues are a better option). There are a lot of solutions here, depending on your needs, e.g. MassTransit, NServiceBus, or plain queues (RabbitMQ).
Scaling workers (which is an infrastructure problem and is not solved by actor frameworks themselves). From what you've said, you don't even need a cluster.
You could use Akka for building the message-processing logic, like workers. But as I said, you don't need it if your goal is to replace an existing service bus.

Akka / Actors: Share a single, limited resource among the actor hierarchy

I'm learning Akka, and I'm struggling to find a good pattern to share a single, limited resource among the whole actor hierarchy.
My use case is that I have an HTTP REST endpoint to which I'm only allowed 10 simultaneous connections at any time. Different actors at different levels of the hierarchy need to be able to make HTTP REST calls. I'm using non-blocking I/O to make the HTTP requests (AsyncHttpClient).
The obvious solution is to have a single actor in charge of this REST resource, and have any actors who want to access it send a message to it and expect a reply at a later stage, however:
Having a single actor in charge of this resource feels a bit fragile to me
How should any "client" actor know how to reach this resource manager actor? Is it best to create it at a well-known location like /user/rest-manager and use an actor selection, or is it better to pass its ActorRef to every actor that needs it (which means it will have to be passed down through a lot of actors that don't use it, just so they can pass it down in turn)?
In addition, how do I deal with "blocking" the client actors when 10 connections are already in progress, especially since I'm using non-blocking I/O? Is it best practice to re-send a message to self (perhaps after some delay) as a wait pattern?
I also thought of a token-based approach where the resource manager actor could reply with "access tokens" to client actors that need to access the resource, until exhaustion. However, it means that client actors are supposed to "return" the token once they're done, which doesn't sound ideal, and I would also need to cater for actors dying without returning the token (with some sort of expiration timeout, I guess).
What are the patterns / best practices to deal with that situation?
Updated: To indicate I'm using non-blocking I/O
My suggestions would be:
Use the Error Kernel pattern, since, as you said, the REST endpoint is fragile code (I/O operations can generate all kinds of errors). In other words, a Master/Worker actor hierarchy, where the Workers do the job and the Master does the supervision.
The connection limit can be handled by Akka's routing feature, where the number of routees is, in your case, 10. This also falls into the Master/Worker category (see the sketch after this list).
Addressing: either way sounds good.
Connection timeouts should be handled by the client code, as is done in the majority of network libraries.
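To make the first two suggestions concrete, here is a rough Scala sketch: a router at the well-known path /user/rest-manager fans work out to a fixed pool of 10 workers, and each worker handles one request at a time, so at most 10 REST calls are ever in flight. RestCall, RestResult and httpGet are placeholders for the real protocol and the AsyncHttpClient wiring:

import akka.actor.{Actor, ActorRef, ActorSystem, Props, Stash, Status}
import akka.pattern.pipe
import akka.routing.RoundRobinPool
import scala.concurrent.Future

case class RestCall(uri: String)      // illustrative protocol
case class RestResult(body: String)

class RestWorker extends Actor with Stash {
  import context.dispatcher

  def receive: Receive = idle

  def idle: Receive = {
    case RestCall(uri) =>
      val requester = sender()
      httpGet(uri).map(RestResult(_)).pipeTo(self)   // non-blocking HTTP call
      context.become(busy(requester))
  }

  def busy(requester: ActorRef): Receive = {
    case r: RestResult =>
      requester ! r
      unstashAll(); context.become(idle)             // ready for the next call
    case f: Status.Failure =>
      requester ! f
      unstashAll(); context.become(idle)
    case _: RestCall => stash()                      // hold further work while this call is in flight
  }

  // Placeholder for the real AsyncHttpClient-based request.
  def httpGet(uri: String): Future[String] = Future.successful(s"response for $uri")
}

object RestManager {
  // The "error kernel": a well-known router that fans out to 10 workers.
  def start(system: ActorSystem): ActorRef =
    system.actorOf(RoundRobinPool(10).props(Props[RestWorker]), "rest-manager")
}

Clients can then either receive the ActorRef from their parent or use context.actorSelection("/user/rest-manager"), and no token bookkeeping is required.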

Akka concurrent message processing while preserving message order

In our project, we publish and consume messages to/from a JMS broker. We have a PL/SQL producer and a Java consumer. The problem, however, is that the producer is 10 times faster than the consumer. Therefore we want to change the consumer to work with multiple threads while reading and processing the messages.
But we need to preserve the order of the messages as well. That is, the messages shall be sent to the target system in the order they were published to the JMS broker. I'm new to Akka and I'm trying to understand its features. Can we achieve that using Akka dispatchers?
Assuming you want to parallelize the consumption inside a single JVM instance, what you describe is a good case for Akka Streams. This can be solved with plain actors, but you risk running out of memory if the producer is too fast, because you'll need to queue the results for re-ordering.
Akka Streams handles this problem with backpressure: if the consumer can't keep up with the producer, it indicates this, and the producer reduces its rate. Akka Streams can also maintain the order of the messages.
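As a short illustration, mapAsync runs several elements in parallel but emits results in the upstream order. The source, the parallelism of 8 and the process function below are placeholders (API roughly as of Akka Streams 1.0/2.x):

import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Future

implicit val system = ActorSystem("consumer")
implicit val materializer = ActorMaterializer()
import system.dispatcher

def process(msg: String): Future[String] = Future(msg.toUpperCase) // stand-in for the slow work

Source(List("m1", "m2", "m3"))          // stand-in for the JMS source
  .mapAsync(parallelism = 8)(process)   // up to 8 messages in flight, output order preserved
  .runWith(Sink.foreach(result => println(s"forwarding $result")))

If strict ordering were not required, mapAsyncUnordered would squeeze out a bit more throughput.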
Akka Streams is at version 1.0, so it's not yet as battle-hardened as core Akka, but it's built on Akka and comes from the Akka team, so it should be good and become even better in the future. The documentation is also not organized in the best way yet.
It's also important to mention that Akka Streams, while implemented with Akka, is quite a different paradigm from the Actor Model or Future combinators. It's based on the stream-processing paradigm, so you'll have to adjust the way you think about your programs. That might be an issue for some teams.

How to use akka as a replication mechanism

I'm new to Akka and intend to use it in my new project as a data replication mechanism.
In this scenario, there is a master server and a replicate data server. The replicate server should contain the same data as the master. Each time a data change occurs in the master, it sends an update message to the replicate server. Here the master server is the Sender, and the replicate server is the Receiver.
But after digging the docs I'm still not sure how to satisfy the following use cases:
When the receiver crashes, the sender should pile up the messages to send; no messages should be lost. It should be able to reconnect to the receiver later and continue from the last successful message.
When the sender crashes, it should restart, and no messages should be lost across the restart.
Messages are processed in the same order they were sent.
So my question is, how do I configure Akka to create a sender and a receiver that can do this?
I'm not sure whether an actor with a DurableMessageBox could solve this. If it could, how can I simulate the above situations for testing?
Update:
After reading the docs Victor pointed to, I now understand that what I wanted was the once-and-only-once pattern, which is extremely costly.
The Akka docs say:
Actual transports may provide stronger semantics, but at-most-once is the semantics you should expect. The alternatives would be once-and-only-once, which is extremely costly, or at-least-once which essentially requires idempotency of message processing, which is a user-level concern.
So in order to achieve guaranteed delivery, I may need to turn to some other MQ solution (for example Kafka), or try to implement once-and-only-once with DurableMessageBox and see if the complexity can be relieved for my specific use case.
You'd need to write your own remoting that utilizes the durable subscriber pattern, as Akka message send guarantees are less strict than what you are going for: http://doc.akka.io/docs/akka/2.0/general/message-send-semantics.html
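To make that concrete, here is a very rough Scala sketch of one piece of such a scheme: a sender-side resend-until-acknowledged buffer. All names are illustrative, and a real version would also persist the buffer so it survives a sender restart:

import akka.actor.{Actor, ActorSelection}
import scala.collection.immutable.Queue
import scala.concurrent.duration._

case class Update(seqNr: Long, data: String)   // illustrative replication protocol
case class Ack(seqNr: Long)                    // cumulative ack from the receiver
case object Redeliver

class ReplicatingSender(receiver: ActorSelection) extends Actor {
  import context.dispatcher
  private var pending = Queue.empty[Update]    // unacknowledged updates, in send order

  // Periodically resend the oldest unacknowledged update until it is acked.
  context.system.scheduler.schedule(1.second, 1.second, self, Redeliver)

  def receive: Receive = {
    case u: Update =>
      pending = pending.enqueue(u)
      receiver ! u
    case Ack(seqNr) =>
      pending = pending.filter(_.seqNr > seqNr)
    case Redeliver =>
      pending.headOption.foreach(receiver ! _) // receiver must de-duplicate by seqNr
  }
}

The receiver applies updates strictly in seqNr order and acknowledges the highest contiguous seqNr it has seen, which gives ordering and at-least-once delivery, but not the much more expensive exactly-once semantics.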
Cheers,
√

When to use local vs remote actors?

When should I use Actors vs. Remote Actors in Akka?
I understand that both can scale a machine up, but only remote actors can scale out, so is there any practical production use of the normal Actor?
If a remote actor only has a minor initial setup overhead and does not have any other major overhead compared to a normal Actor, then I would think that using a Remote Actor would be the standard, since it can scale up and out with ease. Even if there is never a need to scale production code out, it would be nice to have the option (if it doesn't come with baggage).
Any insight on when to use an Actor vs. Remote Actor would be much appreciated.
Remote Actors cannot scale up; they are only remote references to a local actor on another machine.
For Akka 2.0 we will introduce clustered actors, which will allow you to write an Akka application and scale it up only using config.
Regular actors can be used to send messages within a local project.
As for remote actors, you can use them to send messages to dependent projects that are connected to the project sending the message.
Please refer to the Akka remoting documentation for remote actors:
http://doc.akka.io/docs/akka/snapshot/scala/remoting.html
The question asks "If a remote actor only has a minor initial setup overhead and does not have any other major overhead then I would think that using a Remote Actor would be the standard". Yet the fallacies of distributed computing make the point that it is a design error to assume that remoting with any technology has no overhead. You have the overhead of copying the messages to bytes and transmitting them across the network interface. You also have all the complexity of different processes being up, down, stalled or unreachable, and of the network having hiccups, leading to lost, duplicated or reordered messages.
This great article has real-world examples of weird network errors which make remoting hard to make bulletproof. The Akka project lead Roland Kuhn, in his free video course about Akka, says that in his experience for every 1TB of network messages sent he sees a corruption. Notes on Distributed Systems for Young Bloods says "distributed systems tend to need actual, not simulated, distribution to flush out their bugs", so even good unit tests won't make for a perfect system. There is lots of advice that remoting is not "free" but hard work to get right.
If you need to use remoting for availability, or to move to huge scale, then note that reliable delivery on top of Akka ends up being at-least-once, with possible duplication. So you must ensure that duplicated messages don't create bad results.
The moment you start to use remoting you have a distributed system, which creates challenges that are discussed in Distributed Systems for Fun and Profit. Unless you are doing very simple things, like stateless calculators that are idempotent to duplicated messages, things get tricky. One of the assignments on the Akka video course linked above is to build a replicated key-value store that can deal with lost messages by writing the logic yourself. It's far from being an easy assignment. State distributed across different processes gets very hard; actors encapsulate state, therefore distributing actors can get very hard, depending on the consistency and availability requirements of the system you are building.
This all implies that if you can avoid remoting and still achieve what you need, then you would be wise to avoid it. If you do need remoting, then Akka makes it easy due to its location transparency. So while it's a great toolbox to take with you on the job, you should double-check whether the job needs all the tools or only the simplest ones in the box.