Does it make sense to watch(self) in Akka?

As far as I understand, context.watch simply delivers an actor.Terminated message to the watcher. I wanted it to be the last message that the actor receives, yet I see that it is never delivered. I guess this is because the actor has terminated and no longer processes messages. What is the expected behaviour, and what is the right way to handle the stop condition?

Seems like you've already answered your own question: watching self will not result in that actor receiving a Terminated message for itself. The real question is why you need that message. If you just need to clean up resources, override postStop and put that logic there.
postStop is guaranteed to be executed after messages have stopped being enqueued in that actor's mailbox so you can be sure nothing will come after it.

Related

Actor is getting killed before processing the message

I am using Akka Streams, and I have one actor with this functionality.
Sometimes both messages are processed in order, but (I think) because of an async call, I get a dead letter with this code:
actor.tell(message,ActorRef.noSender());
actor.tell(PoisonPill.getInstance(), ActorRef.noSender());
Can anyone help how to make sure to run this code in specific order?
You are probably getting an exception that kills the stream. Are you using a supervision strategy? If not, I would recommend following the documentation: https://doc.akka.io/docs/akka/current/stream/stream-error.html#supervision-strategies
If the failure happens in an async call, the future might be failing with a TimeoutException.

How to prevent other workers from accessing a message which is being currently processed?

I am working on a project that will require multiple workers to access the same queue to get information about a file which they will manipulate. Files range in size from mere megabytes to hundreds of gigabytes. For this reason, a visibility timeout doesn't seem to make sense, because I cannot be certain how long processing will take. I have thought of a couple of ways, but if there is a better way, please let me know.
1. The message is deleted from the original queue and put into a ‘waiting’ queue. When the program finishes processing the file, it deletes the message; otherwise the message is deleted from the ‘waiting’ queue and put back into the original queue.
2. The message id is checked against a database. If the message id is found, the message is ignored. Otherwise the program starts processing the message and inserts the message id into the database.
Thanks in advance!
Use the default-provided SQS timeout but take advantage of ChangeMessageVisibility.
You can specify the timeout in several ways:
When the queue is created (default timeout)
When the message is retrieved
By having the worker call back to SQS and extend the timeout
If you are worried that you do not know the appropriate processing time, use a default value that is good for most situations, but don't make it so big that things become unnecessarily delayed.
Then, modify your workers to make a ChangeMessageVisibility call to SQS periodically to extend the timeout. If a worker dies, the timeout stops being extended and the message will reappear on the queue to be processed by another worker.
See: MessageVisibility documentation
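The periodic extension described above can be sketched with a scheduled task that runs alongside the job. This is only a minimal sketch: VisibilityExtender is a hypothetical interface standing in for whatever SQS client call you use to issue ChangeMessageVisibility, and the class and method names are made up for illustration.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class VisibilityHeartbeat {

    /** Hypothetical stand-in for a ChangeMessageVisibility call on an SQS client. */
    interface VisibilityExtender {
        void extend(String receiptHandle, int timeoutSeconds);
    }

    /**
     * Runs the job on the calling thread while a background task re-extends
     * the message's visibility timeout every periodMillis milliseconds.
     * The heartbeat is stopped when the job finishes or throws.
     */
    static void processWithHeartbeat(String receiptHandle, Runnable job,
                                     VisibilityExtender extender,
                                     int timeoutSeconds, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        ScheduledFuture<?> heartbeat = scheduler.scheduleAtFixedRate(
                () -> extender.extend(receiptHandle, timeoutSeconds),
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        try {
            job.run();
        } finally {
            // Stop extending so that, after a failure, the message reappears on the queue.
            heartbeat.cancel(false);
            scheduler.shutdown();
            try {
                scheduler.awaitTermination(1, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

In a real worker the extender would wrap the SQS client, and the period should be comfortably shorter than the timeout being set, so that a delayed heartbeat does not let the message become visible while it is still being processed. If the worker process dies, the heartbeat dies with it, which is what lets the message return to the queue.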

Kafka's ZookeeperConsumerConnector.createMessageStreams never returns

I'm trying to retrieve data from my Kafka 0.8.1 cluster. I have created an instance of ZookeeperConsumerConnector and then attempt to call createMessageStreams on it. However, no matter what I do, createMessageStreams seems to hang and never return, even if it is the only thing I have done with Kafka.
Reading mailing lists it seems this can sometimes happen for a few reasons, but as far as I can tell I haven't done any of those things.
Further, I'll point out that I'm actually doing this in Clojure using clj-kafka, but I suspect clj-kafka is not the issue because I have the problem even if I run this code:
(.createMessageStreams
  (clj-kafka.consumer.zk/consumer {"zookeeper.connect" "127.0.0.1:2181"
                                   "group.id" "my.consumer"
                                   "auto.offset.reset" "smallest"
                                   "auto.commit.enable" "false"})
  {"mytopic" (int 1)})
And clj-kafka.consumer.zk/consumer just uses Consumer.createJavaConsumerConnector to create a ZookeeperConsumerConnector without doing anything too fancy.
Also, there are definitely messages in "mytopic" because from the command line I can run the following and get back everything I've already sent to the topic:
% kafka-console-consumer.sh --zookeeper 127.0.0.1:2181 --topic mytopic --from-beginning
So it's also not that the topic is empty.
Feeling stumped at this point. Ideas?
ETA: By "hang" I guess what I really mean is that it seems to spin up a thread and then stay stuck in it never doing anything. If I run this code from the REPL I can get out of it by hitting control-c and then I get this error:
IllegalMonitorStateException java.util.concurrent.locks.ReentrantLock$Sync.tryRelease (ReentrantLock.java:155)
I was experiencing the same issue with the same exception when interrupting the REPL. The reason it hangs is due to the lazy-iterate function in the consumer.zk namespace. The queue from which messages are read is a LinkedBlockingQueue and the call to .hasNext in the lazy-iterate function calls .take on this queue. This creates a read lock on the queue and will block and wait until something is available to take off the queue. This means that the lazy-iterate function will never actually return. lazy-iterate is called by the 'messages' function and if you don't do something like
(take 2 (messages "mytopic" some-consumer))
then the messages function will never return and will hang indefinitely. In my opinion this is a bug (or design flaw) in clj-kafka. To illustrate that this is indeed what's happening, try setting "consumer.timeout.ms" "0" in your consumer config. It will throw a timeout exception (Kafka's ConsumerTimeoutException) and return control to the REPL.
This further creates a problem with the 'with-resource' macro. The macro takes a binding to a consumer, a shutdown function, and a body; it calls the body and then the shutdown fn. If inside the body, you make a call to 'messages', the body will never return and thus the shutdown function will never be called. The messages function WOULD return if shutdown was called, because shutdown puts a message on the queue that signals the consumer to clean up its resources and threads in preparation for GC. This macro puts the application into a state where the only way to get out of the main loop is to kill the application (or a thread calling it) itself. The library certainly has a ways to go before it's ready for a production environment.
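The blocking behaviour described above is easy to reproduce with a plain LinkedBlockingQueue, independent of Kafka or clj-kafka. The following is a small sketch, not the library's code: the class and method names are made up for illustration. It contrasts an unbounded take with a poll that gives up after a timeout.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class BlockingQueueDemo {

    /**
     * Returns the next element, or null if nothing arrives within
     * timeoutMillis. Unlike queue.take(), this cannot wait forever.
     */
    static String nextOrTimeout(LinkedBlockingQueue<String> queue, long timeoutMillis) {
        try {
            return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report "nothing available".
            Thread.currentThread().interrupt();
            return null;
        }
    }

    /**
     * Blocks until an element is available. On an empty queue that no
     * other thread ever fills, this call never returns, which is the
     * hang described in the answer.
     */
    static String nextBlocking(LinkedBlockingQueue<String> queue) throws InterruptedException {
        return queue.take();
    }
}
```

Calling nextOrTimeout on a queue holding one element returns that element; calling it again returns null once the timeout elapses. Calling nextBlocking at that point would wait until some other thread enqueues something, such as the shutdown signal mentioned above.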

Akka Terminate Dead Letters

I completed the first Akka assignment for the Coursera Reactive Programming Class (week five - binary trees).
My question is about Akka itself.
My app runs correctly, but I notice a lot of non-fatal dead letter warnings. Here is one:
[INFO] [01/16/2014 15:09:41.668] [PostponeSpec-akka.actor.default-dispatcher-23] [akka://PostponeSpec/user/$c/$b/$a/$b/$a/$a/$b/$b/$a/$a] Message [akka.dispatch.sysmsg.Terminate] from Actor[akka://PostponeSpec/user/$c/$b/$a/$b/$a/$a/$b/$b/$a/$a#570299303] to Actor[akka://PostponeSpec/user/$c/$b/$a/$b/$a/$a/$b/$b/$a/$a#570299303] was not delivered. [2] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
I notice others have asked about this, and the official answer is that this isn't a problem, it's just verbose information that can be ignored and hidden by updating the logging settings.
I understand the advice to simply ignore this, but it still seems like a sloppy flaw on Akka's part. In this simple learning exercise, I am confident that my actors never get sent a message after they initiate a graceful shutdown. Akka should not be putting anything in the dead letter queue in these idealized circumstances. What is the justification for these dead letters? I also see that the dead letter message isn't one that my app explicitly sends, but an internal message.
As someone who also took the course and asked questions in the course feed, I recall the following: a child actor may decide to stop itself, but its parent may decide to do the same thing. There is then an inherent race between the parent's termination and the delivery of the Terminate(child) message. If the parent manages to stop itself before receiving the message, the message ends up in the dead letters queue.

In Akka, if I get a RemoteClientWriteFailed error on the event stream, do I know for sure that the message was not received?

I know that Akka in general does not provide any strong reliability guarantees. However, if I receive a RemoteClientWriteFailed notification for a specific remote actor and message, can I know for sure that the message was not put into the mailbox in the remote system?
Since we haven't expressed such a guarantee, you can neither know that for sure nor rely on the current behaviour being preserved. Akka 2.1 drops RemoteClientWriteFailed and instead puts the message into DeadLetters (as it should be).
If you have something popping up in DeadLetters it's reasonable to assume that it hasn't been delivered.
Please refer to this for discussion on how to operate.