GridEngine Or Akka - akka

I am building an application that relies on some processing performed by a third-party product (TPP). The distributors of this TPP recommend deploying it on GridEngine (for parallelisation, etc.).
The interface to this TPP will be a REST API built on Scala & Akka.
Assuming the processing is something akin to handing off work to a database or similar TPP, would I be able to achieve this parallelisation using Akka and its Load Balancing, Routing, Cluster and Remote Actor features instead of GridEngine altogether?
My understanding of GridEngine is that it provides cluster management tools. It manages load among slaves: you hand it a job to complete and it allocates the job to an available slave. Is all of this achievable using just Akka? Would there be any specific reason to go with GridEngine?
Thanks

Related

Akka - understand the actors model

I have been learning Akka for a few days and I have a simple question to help me understand it well. How should the application architecture be designed for a REST service that uses actors? Should an actor be:
A simple component (for example a service layer, DAO, controller, etc.)?
A business logic element? For example, I have business logic that should be separated into tasks, and each task is an actor?
A microservice? That is, a high-level layer, where every microservice in the application works as a separate actor?
I cannot work out which of these is right. How should I use actors correctly? If I create a REST service with layers (controllers, services, DAO and database), how should I split it into actors in an Akka application?
There was a blog (likely this) that reflects my take on Akka Actors pretty well. I don't really use them.
It depends on who you talk to: some people are really into it, whereas others may see it as an underlying essential that maybe isn't that useful at the application level.
I use actors for handling state. That's all. Otherwise it's Futures or Akka Streams. I hope you like the blog. If you still have questions after it, please shoot. I have 5+ years of Akka behind me. Happy to help.
I wouldn't recommend building a REST service using raw Akka actors. Actors are better used for encapsulating state and behavior. For example, the loosely-coupled lightweight actors can be used for simulating individual IoT devices (e.g. thermostats), each of which maintains its own internal state (e.g. cool setting) and adjusts/reports its settings via non-blocking message passing.
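To make the thermostat example concrete, here is a minimal sketch of the idea in plain Python rather than Akka. It is only an illustration of "state owned by one actor, changed via messages": the class name `ThermostatActor`, the message shapes (`"set_cool"`, `"report"`, `"stop"`) and the `demo` helper are all assumptions made up for this sketch, and a thread plus a queue stands in for Akka's mailbox and dispatcher.

```python
import queue
import threading


class ThermostatActor:
    """Toy actor: owns its state; callers can only send it messages."""

    def __init__(self, cool_setting: float) -> None:
        # Private state, read and written only by the actor's own thread.
        self._cool_setting = cool_setting
        self._mailbox: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message: tuple) -> None:
        """Fire-and-forget send: enqueue the message and return immediately."""
        self._mailbox.put(message)

    def stop(self) -> None:
        """Ask the actor to stop and wait for its thread to finish."""
        self._mailbox.put(("stop",))
        self._thread.join()

    def _run(self) -> None:
        # Messages are handled one at a time, in arrival order,
        # so no locks are needed around the state.
        while True:
            message = self._mailbox.get()
            kind = message[0]
            if kind == "set_cool":
                self._cool_setting = message[1]
            elif kind == "report":
                reply_to = message[1]  # a queue the sender is listening on
                reply_to.put(self._cool_setting)
            elif kind == "stop":
                break


def demo() -> float:
    """Adjust the setting, then ask the actor to report it back."""
    actor = ThermostatActor(cool_setting=24.0)
    replies: queue.Queue = queue.Queue()
    actor.tell(("set_cool", 22.5))
    actor.tell(("report", replies))
    value = replies.get(timeout=2)
    actor.stop()
    return value
```

Because the actor processes its mailbox sequentially, the `report` message sent after `set_cool` sees the updated setting. In Akka the same shape would be an actor per device, with the runtime handling threads and mailboxes for you.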
For a REST API/service, you might want to consider using Play, which is built on top of Akka and supports non-blocking I/O, JSON as a first-class citizen, WebSockets, etc. Here's a basic example of creating a REST service using Play.
On microservices, as noted in the above link:
Building a REST API in Play does not automatically make it a
"microservice" because it does not cover larger scale concerns about
microservices such as ensuring resiliency, consistency, or monitoring.
To add microservice qualities to your REST API, consider the Lagom framework, which is built on top of Play/Akka and brings the reactive qualities with it.

How to configure Immutant to implement a Quartz cluster?

I want to start several web servers, each with its own Quartz instance, so that jobs are not interrupted when a server restarts.
I found that Immutant can configure a singleton job. But when I run the server, I find that the scheduler uses the non-clustered configuration, and I do not know how to configure it.
Immutant has built-in support for singleton jobs, but it requires running your application in a WildFly cluster, and does not use Quartz's clustering functionality.
Quartz clustering requires a JDBC JobStore, and Immutant does not currently expose a way to set a JobStore for the scheduler instance. The clustering works by using the database to lock the job - it would not be difficult to implement something similar yourself, by scheduling the same job on every node in the cluster, and using an external store as a synchronization mechanism, allowing the job to run on only one node at a time.
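The "schedule everywhere, run on one node" idea could be sketched roughly as follows. This is not Quartz's or Immutant's implementation: it is a simplified stand-in that uses a unique key in a shared table so that only the first node to claim a given firing runs the job. The table name `job_claims`, the function names, and the use of an in-memory SQLite database in place of a shared external store are all assumptions for illustration.

```python
import sqlite3
from typing import Callable


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the (assumed) shared store and create the claims table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_claims ("
        " job_name TEXT NOT NULL,"
        " fire_time TEXT NOT NULL,"
        " node_id TEXT NOT NULL,"
        " PRIMARY KEY (job_name, fire_time))"
    )
    return conn


def try_claim(conn: sqlite3.Connection, job_name: str,
              fire_time: str, node_id: str) -> bool:
    """Return True if this node is the first to claim this firing of the job."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO job_claims (job_name, fire_time, node_id)"
                " VALUES (?, ?, ?)",
                (job_name, fire_time, node_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Another node already inserted the same (job_name, fire_time) key.
        return False


def run_if_claimed(conn: sqlite3.Connection, job_name: str, fire_time: str,
                   node_id: str, job: Callable[[], None]) -> bool:
    """Every node calls this on each tick; only the claiming node runs the job."""
    if try_claim(conn, job_name, fire_time, node_id):
        job()
        return True
    return False
```

Each node would schedule the same job locally and call `run_if_claimed` with the scheduled fire time when it triggers. A real deployment would need a database reachable from every node, and would also have to decide what happens if the claiming node dies mid-job, which this sketch does not address.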
If you truly need the clustering implementation in Quartz, or need more control over scheduler creation than Immutant provides, please file an issue against Immutant to have those options exposed. In the interim, you could take a look at Quartzite; I believe it exposes more options for scheduler creation.

Akka clustering conflicts

The Akka doc talks about a variety of seemingly inter-related Akka technologies without distinguishing much between them:
Akka Networking
Akka Remoting
Akka Clustering
The Akka ZeroMQ module
My understanding is that "Akka Networking" is simply a module/lib that gives Akka the ability to speak to remote actor systems over TCP. Akka Remoting is another module/lib (not contained in the core Akka JAR) that gives Akka the use of Gossip protocols. And Akka Clustering is yet another module/lib that then uses these Gossip protocols to allow remote actor systems to cluster together and share state changes in a viral/"service discovery"-esque manner. And my understanding of Akka ZeroMQ is that it accomplishes the same thing as Akka Clustering, except using ZeroMQ as the basis of the network connections and protocols (instead of Gossip).
So first, if my understanding of these different modules/libs is incorrect, please begin by correcting me!
Assuming I'm more or less on target here, then my main concern is that I might have Remote Actor System 1 (RAS1) using Akka Clustering (and hence Gossip) trying to communicate with Remote Actor System 2 (RAS2) which uses Akka ZeroMQ. In this case, we're using two completely different clustering technologies and protocols, so does this mean these two remote systems can't speak to each other, or does special care need to be taken so that they are compatible with each other?
Akka Remoting is what allows for one actor to speak to another actor on a different machine. For Akka Remoting to work you need to know the specific IP address (or hostname), ActorSystem name, and Actor path of the actor you want to talk to. The ActorSystem name can be different in the 2 machines.
Akka Clustering takes away the problem of having to know the specific machine you are talking to (via Cluster-aware Routing or via a Receptionist that listens for machines joining or leaving the cluster). Cluster-aware Routing also allows for things like having a minimum of X instances of an actor running on any machine in the cluster. Akka Clustering uses the Gossip protocol to maintain the list of cluster members. A cluster-enabled application must know the address of at least one host which must always be running to be able to join the cluster. There might be 2, 3 or more, but the idea is that at least one of them must always be up. Akka Clustering is built on top of Akka Remoting.
Although I haven't worked with Akka ZeroMQ, I assume it works similarly to Akka AMQP. I see it more as an alternative to Remoting, in the sense that it enables actors on different machines to talk to each other, with the advantage that none of the actors need to know any specifics about any other machines where actors are running. However, as with Remoting, you need to manually create the actors that receive the messages, whereas with Clustering the cluster takes care of it (as long as you've configured your Routers correctly).
Regarding your last question: the easiest way I can think of for a Cluster to talk to Akka ZeroMQ would be to have one (or several?) actors in the Cluster that talk to ZeroMQ actors (i.e., you can actually mix and match). Have an actor inside the Cluster that listens on the queue, and have another one that publishes messages to the queue. Sort of an Adapter pattern.

Cross-service calls with Akka

I'm playing with newly released Akka.Net 1.0 (congratulations on the release!) so it's all quite new to me, but I'm pretty sure anyone with JVM Akka experience could also chime in since there's nothing that's runtime-dependent in the question.
Let's consider several (for the sake of example, 2) separate services that are a part of a larger system/application. Those services usually do their own things, but cross-service calls are sometimes needed. Let's say that Service 2 can be standalone and has a GetStuff action. Service 1 has a DoSomething action, which has to get the result of GetStuff action first.
What is the preferred way of handling that kind of situation when both services can be deployed separately and to different machines?
As I said, I don't know much about Akka, but digging through examples, docs and source I found two options:
Remoting. Separate actor systems in their own services using Remoting to get ActorSelection from remote host. It would be pretty much the same as Remoting docs example, just that two actor systems would be equal 'clients'.
Clustering. I'm trying to wrap my head around that, and the most I can figure out right now would be to set up a separate cluster service that would just set up the cluster system, creating a simple listener actor so that the seed node could be properly initialized (?). Then each separate actor system created in its own service would join said cluster system under a different role.
Maybe there's yet another solution that I'm not aware of...?
Personally, the clustering solution seems harder to grasp and set up at first glance, but maybe there are some significant advantages that I can't see right now.
To reiterate, what is the preferred way of handling such a situation and what should I look out for?
Akka.Cluster depends on Akka.Remote - here's what's fundamentally different about them:
Akka.Remote - allows you to connect and communicate with actor systems running in remote processes. Can be totally separate code-bases running entirely different Akka.NET applications ("services", if you will.) All you need to communicate between the two systems is a shared set of message classes that's visible in both processes.
Akka.Cluster - an abstraction on top of Akka.Remote that eliminates the need for each of your service instances to have to know the explicit address of every other possible service instance you might need to connect to. These can be instances of the same services or instances of different services. Enables dynamic discovery of services via a really simple "seed node" strategy.
I recommend you take a look at the Akka.Cluster microservices example that I wrote - it shows how you can use the Akka.Cluster "roles" feature to dynamically make cross-service calls to nodes in a different service without having to explicitly define any of their network addresses. In particular, take a look at how I use "cluster-aware" routers to do this.
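The core idea of role-based, cluster-aware routing can be sketched independently of Akka.NET. The toy model below is not the Akka.Cluster API: `Membership`, `RoleRouter`, the role names and the node addresses are all hypothetical, and a plain dictionary stands in for the gossip-maintained member list. It only shows why the caller never needs to hard-code the address of the other service.

```python
from itertools import count
from typing import Dict, FrozenSet, Iterable, List


class Membership:
    """Stand-in for the cluster's view of which nodes are up and their roles."""

    def __init__(self) -> None:
        self._roles_by_address: Dict[str, FrozenSet[str]] = {}

    def join(self, address: str, roles: Iterable[str]) -> None:
        self._roles_by_address[address] = frozenset(roles)

    def leave(self, address: str) -> None:
        self._roles_by_address.pop(address, None)

    def with_role(self, role: str) -> List[str]:
        """Addresses of current members that carry the given role, sorted."""
        return sorted(
            address
            for address, roles in self._roles_by_address.items()
            if role in roles
        )


class RoleRouter:
    """Round-robin router that targets whatever nodes currently have a role."""

    def __init__(self, membership: Membership, role: str) -> None:
        self._membership = membership
        self._role = role
        self._counter = count()

    def route(self) -> str:
        # The candidate list is re-read on every call, so joins and
        # leaves are picked up without reconfiguring the caller.
        candidates = self._membership.with_role(self._role)
        if not candidates:
            raise LookupError(f"no member with role {self._role!r}")
        return candidates[next(self._counter) % len(candidates)]
```

In this model, Service 1 would hold a `RoleRouter(membership, "service2")` and ask it where to send each `GetStuff` request; nodes of Service 2 can come and go without Service 1 changing its configuration.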

How to run distributed tasks on worker nodes in a Clojure app?

On Python/Django stack, we were used to using Celery along with RabbitMQ.
Everything was easily done.
However, when we tried doing the same thing in Clojure land, what we could get was Langohr.
In our current naive implementation we have a worker system which has three core parts.
Publisher module
Subscriber module
Task module
We can start the system on any node in either publisher or subscriber mode.
They are connected to RabbitMQ server.
They share one worker_queue.
What we are doing is creating tasks in the Task module, and then, when we want to run a task on a subscriber, we send an expression that calls the method, in EDN format, to the Subscriber, which then decodes it and runs the actual task using eval.
Now, is using eval safe? We are not running expressions generated by users or any third-party system. Initially we were planning to use JSON to send the payload message, but then EDN gave us a lot more flexibility and it works like a charm, as of now.
Also, is there a better way to do this?
Depending on your needs (and your team), I highly suggest the Storm project. You will get distributed, fault-tolerant, real-time computation, and it is really easy to use.
Another nice thing about Storm is that it supports a plethora of options as the data source for topologies, for example Apache Kafka, RabbitMQ, Kestrel or MongoDB. If you aren't satisfied, you can write your own driver.
It also has a web interface to see what is happening in your topology.
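On the eval question itself, one commonly used alternative is to send only a task name and arguments, and have the subscriber look the name up in a fixed registry instead of evaluating an arbitrary expression. The sketch below shows the idea in Python rather than Clojure; the names `TASKS`, `task`, `encode_call` and `handle_message`, and the JSON payload shape, are assumptions for illustration only. The same pattern would apply with EDN and a map from keywords to functions.

```python
import json
from typing import Any, Callable, Dict

# Registry of the only functions a subscriber is willing to run.
TASKS: Dict[str, Callable[..., Any]] = {}


def task(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that registers a function under a task name."""
    def register(fn: Callable[..., Any]) -> Callable[..., Any]:
        TASKS[name] = fn
        return fn
    return register


@task("add")
def add(a: int, b: int) -> int:
    # Hypothetical example task.
    return a + b


def encode_call(name: str, *args: Any) -> str:
    """Publisher side: describe the call as data, not as code."""
    return json.dumps({"task": name, "args": list(args)})


def handle_message(body: str) -> Any:
    """Subscriber side: decode, look up the task by name, and run it."""
    payload = json.loads(body)
    name = payload["task"]
    if name not in TASKS:
        raise KeyError(f"unknown task: {name}")
    return TASKS[name](*payload["args"])
```

With this shape, a malformed or unexpected message can at worst name a task that does not exist, rather than run arbitrary code on the worker.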