How to solve this with Akka actors?

I didn't know how to name this thread, but I'll try to explain the problem in a few lines.
I have a command which needs to calculate the price for a desired date range. To calculate it, the system needs to fetch the price for every day individually (from the DB, config, cache; it doesn't matter where).
My suggestion was to have one PriceRangeActor which would hold a pool of DailyPriceActors and send them commands like CalculateDailyPrice.
But how do I assemble all of that data back in the PriceRangeActor?
1. Having some big map with complex keys just smells a lot. How would I then determine whether the range is completely calculated? Is there an easier way of doing this?
2. Create a new PriceRangeActor for every command and use the ask pattern to query the list of DailyPriceActors?

Because you aren't utilizing any message passing/queuing, I suggest Futures rather than Actors as your concurrency abstraction. This blog entry makes a very compelling argument that Actors are for state and Futures are for computation.
With either Futures or an Actor's ? (ask, which returns a Future) you can use Future.sequence to bundle all of the separate query Futures into a single Future that completes only once all of the sub-queries are complete.
USING FUTURES (recommended)
import scala.concurrent.Future

object Foo extends App {
  type Date = Int
  type Prices = Seq[Float]
  type PriceMap = Map[Date, Prices]

  //expensive query function
  def fetchPrices(date : Date) : Prices = ???

  //the Dates to query Prices for
  val datesToQuery : Seq[Date] = ???

  import scala.concurrent.ExecutionContext.Implicits._

  def concurrentQuery(date : Date) : Future[Prices] = Future { fetchPrices(date) }

  //launches a Future per date query, D Dates => D Futures
  //Future.sequence converts the D Futures into 1 Future
  val dates2PricesFuture : Future[PriceMap] =
    Future.sequence(datesToQuery map concurrentQuery)
      .map(datesToQuery zip _)
      .map(_.toMap)

  dates2PricesFuture onSuccess { case priceMap : PriceMap =>
    //process the price data which is now completely available
  }
}//end object Foo
USING ACTORS
import scala.concurrent.Future
import scala.concurrent.duration._
import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.ask
import akka.util.Timeout

object Foo extends App {
  type Date = Int
  type Prices = Seq[Float]
  type PriceMap = Map[Date, Prices]

  def fetchPrices(date : Date) : Prices = ???
  val datesToQuery : Seq[Date] = ???

  class QueryActor() extends Actor {
    def receive = { case date : Date => sender ! fetchPrices(date) }
  }

  implicit val as = ActorSystem()
  implicit val queryTimeout = Timeout(1000.millis)
  import as.dispatcher

  def concurrentQuery(date : Date) : Future[Prices] =
    ask(as.actorOf(Props[QueryActor]), date).mapTo[Prices]

  val dates2PricesFuture : Future[PriceMap] =
    Future.sequence(datesToQuery map concurrentQuery)
      .map(datesToQuery zip _)
      .map(_.toMap)

  dates2PricesFuture onSuccess ... //same as first example
}//end object Foo
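If you do want to keep the asker's "pool of DailyPriceActors" idea, a rough variant of the actor example is to route the asks through a fixed-size router instead of creating a fresh QueryActor per date. This is only a sketch; it assumes the QueryActor, Date, Prices, the ActorSystem as and the implicit Timeout from the example above are in scope:
// inside object Foo from the actor example above
import akka.routing.RoundRobinPool

// a fixed pool of 8 workers shared by all queries
val queryRouter = as.actorOf(RoundRobinPool(8).props(Props[QueryActor]), "queryRouter")

def pooledQuery(date: Date): Future[Prices] =
  ask(queryRouter, date).mapTo[Prices]
A bounded pool also caps the number of concurrent fetchPrices calls, which can matter if the underlying store has a limited connection pool.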


Is it possible to build an OLTP/CRUD HTTP server using AkkaHttp, AkkaStreams, Alpakka and a database?

It is clear to me that with Actors it is of course possible: for instance https://github.com/chbatey/akka-http-typed.git uses AkkaHttp and typed actors.
But it is unclear to me whether, using just AkkaStreams and its Alpakka connectors library (which includes databases), it is possible to build regular CRUD / OLTP services, or only data replication from one database to another and other OLAP / batch / stream processing scenarios.
If you know how it can be done, please give a few details; an example on GitHub, for instance, would be great.
The way I am thinking it may be possible is that the server is involved in two conversations / stateful stream transformations: one with the outside world over HTTP, and one with the database. I am not sure whether it can be modelled like that.
https://doc.akka.io/docs/alpakka/current/slick.html seems to offer both UPDATE/INSERT as a Sink as well as a pointed SELECT of a certain id as a Source. Do you know of an example app, or can you broadly describe how the wiring with Akka Http would look?
I put a demo here; hope it can help you.
Create the table (the database is MySQL):
CREATE TABLE test(id VARCHAR(32))
sbt:
"com.lightbend.akka" %% "akka-stream-alpakka-slick" % "1.1.0",
"mysql" % "mysql-connector-java" % "5.1.40"
Code:
package tech.parasol.scala.crud
import java.sql.SQLException
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives.{complete, get, path, _}
import akka.stream.alpakka.slick.scaladsl.{Slick, SlickSession}
import akka.stream.scaladsl.Sink
import akka.stream.{ActorAttributes, ActorMaterializer, Supervision}
import com.typesafe.config.ConfigFactory
import scala.concurrent.Future
import scala.io.StdIn
import scala.util.{Failure, Success}
object CrudTest1 {

  def main(args: Array[String]): Unit = {
    implicit val system = ActorSystem("CrudTest1")
    implicit val materializer = ActorMaterializer()
    implicit val executionContext = system.dispatcher

    val hostName = "127.0.0.1"
    val rocketDbConfig =
      s"""
         |db-config {
         |  profile = "slick.jdbc.MySQLProfile$$"
         |  db {
         |    dataSourceClass = "slick.jdbc.DriverDataSource"
         |    properties = {
         |      driver = "com.mysql.jdbc.Driver"
         |      url = "jdbc:mysql://${hostName}:3306/rocket?useUnicode=true&characterEncoding=utf8&rewriteBatchedStatements=true&useSSL=false"
         |      user = "root"
         |      password = "passw0rd"
         |    }
         |  }
         |}
         |
      """.stripMargin

    implicit val session = SlickSession.forConfig("db-config", ConfigFactory.parseString(rocketDbConfig))
    import session.profile.api._

    def persistence(message: String) = {
      def insert(message: String): DBIO[Int] =
        sqlu"""INSERT INTO test(id) VALUES (${message})"""

      session.db.run(insert(message)).map {
        case _ => message
      }.recover {
        case e: SQLException => throw new Exception("Database error ===>")
        case e: Exception    => throw new Exception("Database error.")
      }
    }

    val route = path("hello" / Segment) { name =>
      get {
        val res = persistence(name)
        onComplete(res) {
          case Success(value) => complete(s"<h1>Say hello to ${name}</h1>")
          case Failure(e)     => complete(s"<h1>Failed to say hello to ${name}</h1>")
        }
      }
    }

    val bindingFuture = Http().bindAndHandle(route, "localhost", 8088)
    println(s"Server online at http://localhost:8088/\nPress RETURN to stop...")
    StdIn.readLine() // let it run until user presses return
    bindingFuture
      .flatMap(_.unbind())                 // trigger unbinding from the port
      .onComplete(_ => system.terminate()) // and shutdown when done
  }
}
Yes. Basically, for every request received in AkkaHttp, we create an AkkaStreams graph (typically just a pipeline): essentially the Slick Alpakka Source reading from the database, perhaps prefixed by some operators, which is then returned from AkkaHttp, which of course supports Source. More details at https://www.quora.com/Is-it-possible-to-build-an-OLTP-CRUD-HTTP-server-using-Akka-HTTP-Akka-Streams-Alpakka-and-a-database-Do-you-know-any-examples-of-code-on-GitHub-or-elsewhere/answer/Nicolae-Marasoiu
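To make the read/OLTP side concrete, here is a rough sketch (assuming the same test table with a single id column and the same db-config block as the demo above; the object and route names are made up) of an Akka HTTP GET route that completes directly with an Alpakka Slick Source:
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.{ContentTypes, HttpEntity}
import akka.http.scaladsl.server.Directives._
import akka.stream.ActorMaterializer
import akka.stream.alpakka.slick.scaladsl.{Slick, SlickSession}
import akka.util.ByteString

object CrudReadSketch extends App {
  implicit val system = ActorSystem("CrudReadSketch")
  implicit val materializer = ActorMaterializer()
  // re-uses the "db-config" section shown in the demo above
  implicit val session = SlickSession.forConfig("db-config")
  import session.profile.api._

  // each row streams from MySQL through the route into the chunked HTTP response
  val readRoute = path("ids") {
    get {
      val rows = Slick
        .source(sql"SELECT id FROM test".as[String])
        .map(id => ByteString(id + "\n"))
      complete(HttpEntity(ContentTypes.`text/plain(UTF-8)`, rows))
    }
  }

  Http().bindAndHandle(readRoute, "localhost", 8089)
}
Because an Akka HTTP response entity accepts a Source[ByteString, _], the rows are streamed to the client without being collected in memory first.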

How do I unit test a taglib that calls g.formatDate?

I have a tag library that calls formatDate:
out << g.formatDate(attrs)
In my unit test I have the following:
def output = applyTemplate('<tz:formatDate date="${date}"/>', [date: date])
When I run the test I get the following error:
org.grails.taglib.GrailsTagException: [Byte array resource [test_1520620408798]:1] Error executing tag <tz:formatDate>: Cannot invoke method getTimeZone() on null object
at org.grails.gsp.GroovyPage.throwRootCause(GroovyPage.java:472)
at org.grails.gsp.GroovyPage.invokeTag(GroovyPage.java:415)
at test_1520620408798.run(test_1520620408798:15)
at org.grails.gsp.GroovyPageWritable.doWriteTo(GroovyPageWritable.java:162)
at org.grails.gsp.GroovyPageWritable.writeTo(GroovyPageWritable.java:82)
at grails.testing.web.GrailsWebUnitTest$Trait$Helper.renderTemplateToStringWriter(GrailsWebUnitTest.groovy:242)
at grails.testing.web.GrailsWebUnitTest$Trait$Helper.applyTemplate(GrailsWebUnitTest.groovy:227)
at grails.testing.web.taglib.TagLibUnitTest$Trait$Helper.applyTemplate(TagLibUnitTest.groovy:49)
at grails.testing.web.GrailsWebUnitTest$Trait$Helper.applyTemplate(GrailsWebUnitTest.groovy:212)
at grails.testing.web.taglib.TagLibUnitTest$Trait$Helper.applyTemplate(TagLibUnitTest.groovy:44)
at com.captivatelabs.grails.timezone.detection.FormatTagLibSpec.test offset client to server time - formatDate(FormatTagLibSpec.groovy:22)
Caused by: java.lang.NullPointerException: Cannot invoke method getTimeZone() on null object
at org.grails.plugins.web.taglib.FormatTagLib$_closure2.doCall(FormatTagLib.groovy:170)
at groovy.lang.Closure.call(Closure.java:414)
at org.grails.taglib.TagOutput.captureTagOutput(TagOutput.java:64)
at org.grails.taglib.TagLibraryMetaUtils.methodMissingForTagLib(TagLibraryMetaUtils.groovy:139)
at org.grails.taglib.NamespacedTagDispatcher.methodMissing(NamespacedTagDispatcher.groovy:59)
at com.captivatelabs.grails.timezone.detection.FormatTagLib$_closure1.doCall(FormatTagLib.groovy:14)
at groovy.lang.Closure.call(Closure.java:414)
at org.grails.gsp.GroovyPage.invokeTagLibClosure(GroovyPage.java:439)
at org.grails.gsp.GroovyPage.invokeTag(GroovyPage.java:364)
... 9 more
Does anyone have any thoughts on this?
There are a number of ways you could orchestrate this. One is demonstrated in the project at https://github.com/jeffbrown/pietertest.
https://github.com/jeffbrown/pietertest/blob/master/grails-app/taglib/pieter/DemoTagLib.groovy
package pieter

class DemoTagLib {
    static defaultEncodeAs = [taglib:'html']
    static namespace = 'tz'

    def formatDate = { attrs ->
        out << g.formatDate(date: attrs.date, format: 'yyyy-MM-dd')
    }
}
https://github.com/jeffbrown/pietertest/blob/master/src/test/groovy/pieter/DemoTagLibSpec.groovy
package pieter

import grails.testing.web.taglib.TagLibUnitTest
import org.grails.plugins.web.DefaultGrailsTagDateHelper
import spock.lang.Specification

class DemoTagLibSpec extends Specification implements TagLibUnitTest<DemoTagLib> {

    Closure doWithSpring() {{ ->
        grailsTagDateHelper DefaultGrailsTagDateHelper
    }}

    void "test date format"() {
        given:
        def date
        Calendar cal = Calendar.instance
        cal.with {
            clear()
            set MONTH, JULY
            set YEAR, 1776
            set DATE, 4
            date = time
        }

        when:
        def output = applyTemplate('<tz:formatDate date="${date}"/>', [date: date])

        then:
        output == '1776-07-04'
    }
}
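The important part is the doWithSpring block: g.formatDate internally looks up a grailsTagDateHelper bean, which is not registered in a unit-test application context by default, and that missing bean appears to be what triggers the "Cannot invoke method getTimeZone() on null object" error. Registering DefaultGrailsTagDateHelper under that name lets the tag resolve the time zone in the test.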
I hope that helps.

Processing Akka stream in Slick transaction

Software versions:
Akka 2.4.4
Slick 3.1.0
I want to process elements from an Akka stream in a Slick transaction.
Here is some simplified code to illustrate one possible approach:
def insert(d: AnimalFields): DBIO[Long] =
  animals returning animals.map(_.id) += d

val source: Source[AnimalFields, _] = ???
val sourceAsTraversable = ???

db.run((for {
  ids <- DBIO.sequence(sourceAsTraversable.map(insert))
} yield { ids }).transactionally)
One solution I have come up with so far is to block on each future while traversing the elements:
import scala.annotation.tailrec
import scala.concurrent.Await
import scala.concurrent.duration.Duration

class TraversableQueue[T](sinkQueue: SinkQueue[T]) extends Traversable[T] {

  @tailrec private def next[U](f: T => U): Unit = {
    val nextElem = Await.result(sinkQueue.pull(), Duration.Inf)
    if (nextElem.isDefined) {
      f(nextElem.get)
      next(f)
    }
  }

  def foreach[U](f: T => U): Unit = next(f)
}
val sinkQueue = source.runWith(Sink.queue())
val queue = new TraversableQueue(sinkQueue)
Now I can pass the traversable queue to DBIO.sequence(). This defeats the purpose of streamed processing, though.
Another approach I found is this:
def toDbioAction[T](queue: SinkQueue[DBIOAction[T, NoStream, Effect.All]]):
    DBIOAction[Queue[T], NoStream, Effect.All] =
  DBIO.from(queue.pull() map { tOption =>
    tOption match {
      case Some(action) =>
        action.flatMap(t => toDbioAction(queue).map(_ :+ t))
      case None => DBIO.successful(Queue())
    }
  }).flatMap(r => r)
With this method, a sequence of DBIOActions can be generated without blocking:
toDbioAction(source.runWith(Sink.queue()))
Is there any better / more idiomatic way to achieve the desired result?
Here is my implementation of sourceAsTraversable:
import akka.stream.Materializer
import akka.stream.scaladsl.{Source, StreamConverters}
import scala.collection.JavaConverters._

def sourceAsTraversable[A](source: Source[A, _])(implicit mat: Materializer): Traversable[A] =
  source.runWith(StreamConverters.asJavaStream()).iterator().asScala.toIterable
The issue with TraversableQueue was that its foreach had to process the stream fully; it did not support the "break" concept, so methods like drop/take would still have to consume the whole source. That can be important for error handling and failing fast.
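For completeness, here is a rough usage sketch of plugging this helper into the transaction from the question. It assumes the animals table, insert, db and an implicit ExecutionContext from the question are in scope; note that DBIO.sequence still builds every action up front, so the whole stream is drained before the transaction runs:
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.Source
import scala.concurrent.Future

implicit val system = ActorSystem()
implicit val mat = ActorMaterializer()

// whatever produces the elements in your application
val animalSource: Source[AnimalFields, _] = ???

// elements are drawn from the stream via sourceAsTraversable,
// then turned into a single transactional batch of inserts
val insertedIds: Future[Seq[Long]] =
  db.run(DBIO.sequence(sourceAsTraversable(animalSource).toSeq.map(insert)).transactionally)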

HBase Get/Scan in a Scalding job

I'm using Scalding with Spyglass to read from/write to HBase.
I'm doing a left outer join of table1 and table2 and writing back to table1 after transforming a column.
Both table1 and table2 are declared as Spyglass HBaseSource.
This works fine. But I need to access a different row in table1, using its rowkey, to compute the transformed value.
I tried the following for HBase get:
val hTable = new HTable(conf, TABLE_NAME)
val result = hTable.get(new Get(rowKey.getBytes()))
I'm getting access to the Configuration in the Scalding job as mentioned in this link:
https://github.com/twitter/scalding/wiki/Frequently-asked-questions#how-do-i-access-the-jobconf
This works when I run the Scalding job locally.
But when I run it in the cluster, conf is null when this code is executed in the Reducer.
Is there a better way to do HBase get/scan in a Scalding/Cascading job for cases like this?
Ways to do this...
1) You can use a managed resource
class SomeJob(args: Args) extends Job(args) {
  val someConfig = HBaseConfiguration.create().addResource(new Path(pathtoyourxmlfile))
  lazy val hPool = new HTablePool(someConfig, 3)

  def getConf = {
    implicitly[Mode] match {
      case Hdfs(_, conf) => conf
      case _ => ??? // whatever you are doing for a local conf
    }
  }

  ... somePipe.someOperation.... {
    val gets = key.map { key => new Get(key) }
    managed(hPool.getTable("myTableName")) acquireAndGet { table =>
      val results = table.get(gets)
      // ...do something with these results
    }
  }
}
2) You can use some more specific Cascading code: write a custom Scheme and, inside it, override the source method (and possibly some others, depending on your needs). In there you can access the JobConf like this:
class MyScheme extends Scheme[JobConf, SomeRecordReader, SomeOutputCollector, ..] {

  @transient var jobConf: Configuration = super.jobConfiguration

  override def source(flowProcess: FlowProcess[JobConf], ...): Boolean = {
    jobConf = flowProcess match {
      case h: HadoopFlowProcess => h.getJobConf
      case _ => jobConf
    }
    // ... do something with the jobConf here
  }
}

Akka Actors logging processing time

I have a set of Akka Actors and I send a couple of hundred messages to each of them. I want to track how much time each instance of that Actor takes to process all the messages it receives. What I'm doing currently is keeping state in the Actor instance:
var startTime
var firstCall
I set both variables when the Actor instance is first called. Is there another way I could track the processing time for my Actor instances? I want to avoid having local state in my Actor instances.
This is a good use case for context.become.
Remember that a receive block in an Akka actor is just a PartialFunction[Any, Unit], so we can wrap it in another partial function. This is the same approach taken by Akka's built-in LoggingReceive.
import akka.actor.{Actor, ActorContext}
import akka.actor.Actor.Receive

class TimingReceive(r: Receive, totalTime: Long)(implicit ctx: ActorContext) extends Receive {

  def isDefinedAt(o: Any): Boolean = r.isDefinedAt(o)

  def apply(o: Any): Unit = {
    val startTime = System.nanoTime
    r(o)
    val newTotal = totalTime + (System.nanoTime - startTime)
    ctx.system.log.debug("Total time so far: " + newTotal + " nanoseconds")
    ctx.become(new TimingReceive(r, newTotal))
  }
}

object TimingReceive {
  def apply(r: Receive)(implicit ctx: ActorContext): Receive = new TimingReceive(r, 0)
}
Then you can use it like this:
class FooActor extends Actor {
  def receive = TimingReceive {
    case x: String => println("got " + x)
  }
}
After each message, the actor will log the total time taken so far. Of course, if you want to do something else with that value, you'll have to adapt this.
This approach doesn't measure the time the actor is alive, of course, only the time spent actually processing messages. Nor will it be accurate if your receive function creates a Future.
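If your handler does create a Future, one rough workaround (a sketch only; expensiveLookup is a hypothetical async operation standing in for real work) is to capture the start time before the asynchronous call and log the elapsed time when the Future completes, accepting that the measurement then includes scheduling and any downstream latency:
import akka.actor.Actor
import scala.concurrent.Future

class AsyncFooActor extends Actor {
  import context.dispatcher

  // placeholder for a genuinely asynchronous operation
  def expensiveLookup(x: String): Future[Int] = Future(x.length)

  def receive = {
    case x: String =>
      val startTime = System.nanoTime
      expensiveLookup(x).onComplete { _ =>
        context.system.log.debug("handled {} in {} ns", x, System.nanoTime - startTime)
      }
  }
}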