I want to reference the materialized value from the flow. Below is the code snippet, but it's not compiling; the error is:
type mismatch;
found : (akka.NotUsed, scala.concurrent.Future[akka.Done])
required: (Playground.DomainObj, scala.concurrent.Future[akka.Done])
Code:
import akka.actor.ActorSystem
import akka.stream.scaladsl._
import scala.concurrent.Future
import akka.NotUsed
import akka.Done
implicit val actorSystem = ActorSystem("example")
case class DomainObj(name: String, age: Int)
val customFlow1: Flow[String, DomainObj, NotUsed] = Flow[String].map(s => {
  DomainObj(s, 50)
})

val customFlow2 = Flow[DomainObj].map(s => {
  s.age + 10
})

val printAnySink: Sink[Any, Future[Done]] = Sink.foreach(println)

val c1 = Source.single("John")
  .viaMat(customFlow1)(Keep.right)
  .viaMat(customFlow2)(Keep.left)
  .toMat(printAnySink)(Keep.both)

val res: (DomainObj, Future[Done]) = c1.run()
Find the code in playground: https://scastie.scala-lang.org/P9iSx49cQcaOZfKtVCzTPA
I want to reference the DomainObj after the stream completes.
The materialized value of a Flow[String, DomainObj, NotUsed] is NotUsed, not a DomainObj, so c1's materialized value is (NotUsed, Future[Done]).
It looks like the intent here is to capture the DomainObj that is created in customFlow1. That can be accomplished with:
val customFlow1: Flow[String, DomainObj, Future[DomainObj]] =
  Flow[String]
    .map { s => DomainObj(s, 50) }
    .alsoToMat(Sink.head)(Keep.right)
val res: (Future[DomainObj], Future[Done]) = c1.run()
Note that Sink.head effectively requires that customFlow1 only be used downstream of something that emits exactly once.
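Once the stream is run, the captured value can be read like any other Future (a short sketch; the dispatcher import assumes the actorSystem defined above):

import actorSystem.dispatcher

res._1.foreach(obj => println(s"Materialized DomainObj: $obj")) // the captured element
res._2.foreach(_ => println("Stream completed"))                // completion of the sink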
Accessing the metrics of an Alpakka PlainSource seems fairly straightforward, but how can I do the same thing with a CommittableSource?
I currently have a simple consumer, something like this:
class Consumer(implicit val ma: ActorMaterializer, implicit val ec: ExecutionContext) extends Actor {
  private val settings = ConsumerSettings(
    context.system,
    new ByteArrayDeserializer,
    new StringDeserializer)
    .withProperties(...)

  override def receive: Receive = Actor.emptyBehavior

  RestartSource
    .withBackoff(minBackoff = 2.seconds, maxBackoff = 20.seconds, randomFactor = 0.2)(consumer)
    .runForeach { handleMessage }

  private def consumer() = {
    AkkaConsumer
      .committableSource(settings, Subscriptions.topics(Set(topic)))
      .log(getClass.getSimpleName)
      .withAttributes(ActorAttributes.supervisionStrategy(_ => Supervision.Resume))
  }

  private def handleMessage(message: CommittableMessage[Array[Byte], String]): Unit = {
    ...
  }
}
How can I get access to the consumer metrics in this case?
We are using the Java Prometheus client, and I solved my issue with a custom collector that fetches its metrics directly from JMX:
import java.lang.management.ManagementFactory
import java.util
import io.prometheus.client.Collector
import io.prometheus.client.Collector.MetricFamilySamples
import io.prometheus.client.CounterMetricFamily
import io.prometheus.client.GaugeMetricFamily
import javax.management.ObjectName
import scala.collection.JavaConverters._
import scala.collection.mutable
class ConsumerMetricsCollector(val labels: Map[String, String] = Map.empty) extends Collector {
  val metrics: mutable.Map[String, MetricFamilySamples] = mutable.Map.empty

  def collect: util.List[MetricFamilySamples] = {
    val server = ManagementFactory.getPlatformMBeanServer
    for {
      attrType <- List("consumer-metrics", "consumer-coordinator-metrics", "consumer-fetch-manager-metrics")
      name <- server.queryNames(new ObjectName(s"kafka.consumer:type=$attrType,client-id=*"), null).asScala
      attrInfo <- server.getMBeanInfo(name).getAttributes.filter { _.getType == "double" }
    } yield {
      val attrName = attrInfo.getName
      // The client id lives in the MBean's ObjectName, e.g. kafka.consumer:type=...,client-id=...
      val metricLabels = name.toString.split(",").map(_.split("=").toList).collect {
        case "client-id" :: (id: String) :: Nil => ("client-id", id)
      }.toList ++ labels
      val metricName = "kafka_consumer_" + attrName.replaceAll(raw"""[^\p{Alnum}]+""", "_")
      val labelKeys = metricLabels.map(_._1).asJava
      val metric = metrics.getOrElseUpdate(metricName,
        if (metricName.endsWith("_total") || metricName.endsWith("_sum")) {
          new CounterMetricFamily(metricName, attrInfo.getDescription, labelKeys)
        } else {
          new GaugeMetricFamily(metricName, attrInfo.getDescription, labelKeys)
        }: MetricFamilySamples
      )
      val metricValue = server.getAttribute(name, attrName).asInstanceOf[Double]
      val labelValues = metricLabels.map(_._2).asJava
      metric match {
        case f: CounterMetricFamily => f.addMetric(labelValues, metricValue)
        case f: GaugeMetricFamily   => f.addMetric(labelValues, metricValue)
        case _ =>
      }
    }
    metrics.values.toList.asJava
  }
}
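To expose the metrics, the collector then just needs to be registered with a Prometheus registry (a minimal sketch using the default registry; the extra label map is only an illustration):

import io.prometheus.client.CollectorRegistry

// Register once at startup; Prometheus calls collect() on every scrape.
val collector = new ConsumerMetricsCollector(Map("app" -> "my-consumer"))
CollectorRegistry.defaultRegistry.register(collector)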
I'm getting the following error:
java.lang.IllegalArgumentException: requirement failed: The inlets [] and outlets [BlockOut.out] must correspond to the inlets [] and outlets [BlockOut.out]
I have a very simple graph:
val g1 = GraphDSL.create() { implicit builder =>
  import GraphDSL.Implicits._
  val in: Source[ByteString, Any] = Source.single(ByteString(digest))
  val flow: GraphStage[FlowShape[ByteString, ByteString]] = new ReadBlockStage(dataStore, blockingExecutionContext)
  in ~> flow
  SourceShape(flow.shape.out)
}
val sourceGraph: Source[ByteString, NotUsed] = Source.fromGraph(g1)
and my flow is defined like this:
class ReadBlockStage(dataStore: DataStore, implicit val executionContext: ExecutionContext) extends GraphStage[FlowShape[ByteString, ByteString]] with DefaultJsonProtocol {
  val in = Inlet[ByteString]("DigestSpec.in")
  val out = Outlet[ByteString]("BlockOut.out")
  override val shape = FlowShape.of(in, out)
  ...
}
Why am I getting this error? The flow's "out" port is of type Outlet[ByteString], and my Source is Source[ByteString, NotUsed]. The error message is very confusing because it looks like the shape and the expected shape are the same.
I figured out the issue: I had forgotten to call builder.add() on each of the graph elements.
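In other words, every element has to be imported into the builder before it is wired. A sketch of the corrected graph, reusing the names from the question:

val g1 = GraphDSL.create() { implicit builder =>
  import GraphDSL.Implicits._
  // builder.add returns copies of the shapes that belong to *this* graph
  val in = builder.add(Source.single(ByteString(digest)))
  val flow = builder.add(new ReadBlockStage(dataStore, blockingExecutionContext))
  in.out ~> flow.in
  SourceShape(flow.out)
}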
I'm playing with Akka Streams 2.4.2 and am wondering if it's possible to set up a stream that uses a database table as a source, where any record added to the table is materialized and emitted downstream?
UPDATE: 2/23/16
I've implemented the solution from #PH88. Here's my table definition:
case class Record(id: Int, value: String)

class Records(tag: Tag) extends Table[Record](tag, "my_stream") {
  def id = column[Int]("id")
  def value = column[String]("value")
  def * = (id, value) <> (Record.tupled, Record.unapply)
}
Here's the implementation:
implicit val system = ActorSystem("Publisher")
implicit val materializer = ActorMaterializer()
val db = Database.forConfig("pg-postgres")
try{
val newRecStream = Source.unfold((0, List[Record]())) { n =>
try {
val q = for (r <- TableQuery[Records].filter(row => row.id > n._1)) yield (r)
val r = Source.fromPublisher(db.stream(q.result)).collect {
case rec => println(s"${rec.id}, ${rec.value}"); rec
}.runFold((n._1, List[Record]())) {
case ((id, xs), current) => (current.id, current :: xs)
}
val answer: (Int, List[Record]) = Await.result(r, 5.seconds)
Option(answer, None)
}
catch { case e:Exception => println(e); Option(n, e) }
}
Await.ready(newRecStream.throttle(1, 1.second, 1, ThrottleMode.shaping).runForeach(_ => ()), Duration.Inf)
}
finally {
system.shutdown
db.close
}
But my problem is that when I attempt to call flatMapConcat the type I get is Serializable.
UPDATE: 2/24/16
Updated to try the db.run suggestion from #PH88:
implicit val system = ActorSystem("Publisher")
implicit val materializer = ActorMaterializer()
val db = Database.forConfig("pg-postgres")
val disableAutoCommit = SimpleDBIO(_.connection.setAutoCommit(false))
val queryLimit = 1

try {
  val newRecStream = Source.unfoldAsync(0) { n =>
    val q = TableQuery[Records].filter(row => row.id > n).take(queryLimit)
    db.run(q.result).map { recs =>
      Some(recs.last.id, recs)
    }
  }
  .throttle(1, 1.second, 1, ThrottleMode.shaping)
  .flatMapConcat { recs =>
    Source.fromIterator(() => recs.iterator)
  }
  .runForeach { rec =>
    println(s"${rec.id}, ${rec.value}")
  }

  Await.ready(newRecStream, Duration.Inf)
} catch {
  case ex: Throwable => println(ex)
} finally {
  system.shutdown
  db.close
}
Which works (I changed the query limit to 1 since I only have a couple of items in my database table currently) - except that once it prints the last row in the table, the program exits. Here's my log output:
17:09:27,982 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.groovy]
17:09:27,982 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.xml]
17:09:27,982 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback.xml] at [file:/Users/xxxxxxx/dev/src/scratch/scala/fpp-in-scala/target/scala-2.11/classes/logback.xml]
17:09:28,062 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]
17:09:28,064 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [STDOUT]
17:09:28,079 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property
17:09:28,102 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting level of logger [application] to DEBUG
17:09:28,103 |-INFO in ch.qos.logback.classic.joran.action.RootLoggerAction - Setting level of ROOT logger to INFO
17:09:28,103 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [STDOUT] to Logger[ROOT]
17:09:28,103 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - End of configuration.
17:09:28,104 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator#4278284b - Registering current configuration as safe fallback point
17:09:28.117 [main] INFO com.zaxxer.hikari.HikariDataSource - pg-postgres - is starting.
1, WASSSAAAAAAAP!
2, WHAAAAT?!?
3, booyah!
4, what!
5, This rocks!
6, Again!
7, Again!2
8, I love this!
9, Akka Streams rock
10, Tuning jdbc
17:09:39.000 [main] INFO com.zaxxer.hikari.pool.HikariPool - pg-postgres - is closing down.
Process finished with exit code 0
Found the missing piece - need to replace this:
Some(recs.last.id, recs)
with this:
val lastId = if(recs.isEmpty) n else recs.last.id
Some(lastId, recs)
The call to recs.last.id was throwing java.lang.UnsupportedOperationException: empty.last when the result set was empty.
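An equivalent spelling of the same guard using lastOption (with n being the unfoldAsync state as above):

Some((recs.lastOption.fold(n)(_.id), recs))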
In general, a SQL database is a 'passive' construct and does not actively push changes like what you described. You can only 'simulate' a 'push' with periodic polling, for example:
val newRecStream = Source
  // Query for table changes
  .unfold(initState) { lastState =>
    // query for new data since lastState and save the current state into newState...
    Some((newState, newRecords))
  }
  // Throttle to limit the poll frequency
  .throttle(...)
  // break the query results down into individual records...
  .flatMapConcat { newRecords =>
    Source.unfold(newRecords) { pendingRecords =>
      if (pendingRecords.isEmpty) {
        None
      } else {
        // emit the first pending record; the remainder becomes the new state
        Some((pendingRecords.tail, pendingRecords.head))
      }
    }
  }
Updated: 2/24/2016
Pseudo code example based on the 2/23/2016 updates of the question:
implicit val system = ActorSystem("Publisher")
implicit val materializer = ActorMaterializer()
val db = Database.forConfig("pg-postgres")
val queryLimit = 10

try {
  val completion = Source
    .unfoldAsync(0) { lastRowId =>
      val q = TableQuery[Records].filter(row => row.id > lastRowId).take(queryLimit)
      db.run(q.result).map { recs =>
        Some(recs.last.id, recs)
      }
    }
    .throttle(1, 1.second, 1, ThrottleMode.shaping)
    .flatMapConcat { recs =>
      Source.fromIterator(() => recs.iterator)
    }
    .runForeach { rec =>
      println(s"${rec.id}, ${rec.value}")
    }

  // Block forever
  Await.ready(completion, Duration.Inf)
} catch {
  case ex: Throwable => println(ex)
} finally {
  system.shutdown
  db.close
}
It will repeatedly execute the query in unfoldAsync against the DB, retrieving at most 10 (queryLimit) records at a time, and send the records downstream (-> throttle -> flatMapConcat -> runForeach). The Await at the end will block forever.
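For intuition, unfoldAsync keeps invoking its function with the last returned state until the Future yields None (a tiny self-contained sketch):

// Emits 0, 1, 2, ... forever, because the Future never yields None.
val naturals = Source.unfoldAsync(0)(n => Future.successful(Some((n + 1, n))))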
Updated: 2/25/2016
Executable 'proof-of-concept' code:
import akka.actor.ActorSystem
import akka.stream.{ThrottleMode, ActorMaterializer}
import akka.stream.scaladsl.Source

import scala.concurrent.duration.Duration
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object Infinite extends App {
  implicit val system = ActorSystem("Publisher")
  implicit val ec = system.dispatcher
  implicit val materializer = ActorMaterializer()

  case class Record(id: Int, value: String)

  try {
    val completion = Source
      .unfoldAsync(0) { lastRowId =>
        Future {
          val recs = (lastRowId to lastRowId + 10).map(i => Record(i, s"rec#$i"))
          Some(recs.last.id, recs)
        }
      }
      .throttle(1, 1.second, 1, ThrottleMode.Shaping)
      .flatMapConcat { recs =>
        Source.fromIterator(() => recs.iterator)
      }
      .runForeach { rec =>
        println(rec)
      }

    Await.ready(completion, Duration.Inf)
  } catch {
    case ex: Throwable => println(ex)
  } finally {
    system.shutdown
  }
}
Here is working code for infinite streaming from a database. It has been tested with millions of records being inserted into a PostgreSQL database while the streaming app is running:
package infinite.streams.db

import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.alpakka.slick.scaladsl.SlickSession
import akka.stream.scaladsl.{Flow, Sink, Source}
import akka.stream.{ActorMaterializer, ThrottleMode}
import org.slf4j.LoggerFactory
import slick.basic.DatabaseConfig
import slick.jdbc.JdbcProfile

import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContextExecutor}

case class Record(id: Int, value: String) {
  val content = s"<ROW><ID>$id</ID><VALUE>$value</VALUE></ROW>"
}

object InfiniteStreamingApp extends App {
  println("Starting app...")
  implicit val system: ActorSystem = ActorSystem("Publisher")
  implicit val ec: ExecutionContextExecutor = system.dispatcher
  implicit val materializer: ActorMaterializer = ActorMaterializer()

  println("Initializing database configuration...")
  val databaseConfig: DatabaseConfig[JdbcProfile] = DatabaseConfig.forConfig[JdbcProfile]("postgres3")
  implicit val session: SlickSession = SlickSession.forConfig(databaseConfig)

  import databaseConfig.profile.api._

  class Records(tag: Tag) extends Table[Record](tag, "test2") {
    def id = column[Int]("c1")
    def value = column[String]("c2")
    def * = (id, value) <> (Record.tupled, Record.unapply)
  }

  val db = databaseConfig.db
  println("Prime for streaming...")

  val logic: Flow[(Int, String), (Int, String), NotUsed] = Flow[(Int, String)].map {
    case (id, value) => (id, value.toUpperCase)
  }

  val fetchSize = 5
  try {
    val done = Source
      .unfoldAsync(0) { lastId =>
        println(s"Fetching next: $fetchSize records with id > $lastId")
        val query = TableQuery[Records].filter(_.id > lastId).take(fetchSize)
        db.run(query.result.withPinnedSession)
          .map { recs => Some(recs.last.id, recs) }
      }
      .throttle(5, 1.second, 1, ThrottleMode.shaping)
      .flatMapConcat { recs =>
        Source.fromIterator(() => recs.iterator)
      }
      .map(x => (x.id, x.content))
      .via(logic)
      .log("*******Post Transformation******")
      // .runWith(Sink.foreach(r => println("SINK: " + r._2)))
      // Use runForeach or runWith(Sink)
      .runForeach(rec => println("REC: " + rec))

    println("Waiting for result....")
    Await.ready(done, Duration.Inf)
  } catch {
    case ex: Throwable => println(ex.getMessage)
  } finally {
    println("Streaming ended")
    db.close()
    system.terminate()
  }
}
application.conf
akka {
  loggers = ["akka.event.slf4j.Slf4jLogger"]
  loglevel = "INFO"
}

# Load using SlickSession.forConfig("slick-postgres")
postgres3 {
  profile = "slick.jdbc.PostgresProfile$"
  db {
    dataSourceClass = "slick.jdbc.DriverDataSource"
    properties = {
      driver = "org.postgresql.Driver"
      url = "jdbc:postgresql://localhost/testdb"
      user = "postgres"
      password = "postgres"
    }
    numThreads = 2
  }
}
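As an aside: since a SlickSession is already in scope, a single bounded read over the same table could also be written with Alpakka's Slick.source (a sketch; it streams one result set and then completes, so the unfoldAsync loop above is still what provides the endless polling):

import akka.stream.alpakka.slick.scaladsl.Slick

// Streams every row of the table once, then completes.
val oneShot = Slick
  .source(TableQuery[Records].result)
  .runForeach(rec => println(rec.content))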
Using spray-json (as I'm using spray-client), in order to get a latitude/longitude object from the Google Maps API I need to have the whole response structure set up:
case class AddrComponent(long_name: String, short_name: String, types: List[String])
case class Location(lat: Double, lng: Double)
case class ViewPort(northeast: Location, southwest: Location)
case class Geometry(location: Location, location_type: String, viewport: ViewPort)
case class EachResult(address_components: List[AddrComponent],
                      formatted_address: String,
                      geometry: Geometry,
                      types: List[String])
case class GoogleApiResult[T](status: String, results: List[T])

object AddressProtocol extends DefaultJsonProtocol {
  implicit val addrFormat = jsonFormat3(AddrComponent)
  implicit val locFormat = jsonFormat2(Location)
  implicit val viewPortFormat = jsonFormat2(ViewPort)
  implicit val geomFormat = jsonFormat3(Geometry)
  implicit val eachResFormat = jsonFormat4(EachResult)
  implicit def GoogleApiFormat[T: JsonFormat] = jsonFormat2(GoogleApiResult.apply[T])
}

import AddressProtocol._
Is there any way I can just get Location from the json in the response and avoid all this gumph?
The spray-client code:
implicit val system = ActorSystem("test-system")
import system.dispatcher
private val pipeline = sendReceive ~> unmarshal[GoogleApiResult[EachResult]]
def getPostcode(postcode: String): Point = {
val url = s"http://maps.googleapis.com/maps/api/geocode/json?address=$postcode,+UK&sensor=true"
val future = pipeline(Get(url))
val result = Await.result(future, 10 seconds)
result.results.size match {
case 0 => throw new PostcodeNotFoundException(postcode)
case x if x > 1 => throw new MultipleResultsException(postcode)
case _ => {
val location = result.results(0).geometry.location
new Point(location.lng, location.lat)
}
}
}
Or, alternatively, how can I use Jackson with spray-client?
Following jrudolph's advice to use json-lenses, I did quite a bit of fiddling but finally got things to work. I found it quite difficult (as a newbie), and I am sure this solution is far from the most elegant; nevertheless, it might help people or inspire others to improve on it.
Given JSON:
{
  "status": 200,
  "code": 0,
  "message": "",
  "payload": {
    "statuses": {
      "emailConfirmation": "PENDING",
      "phoneConfirmation": "DONE"
    }
  }
}
And a case class for unmarshalling the statuses only:
case class UserStatus(emailConfirmation: String, phoneConfirmation: String)
One can do this to unmarshal response:
import scala.concurrent.Future

import spray.http.HttpResponse
import spray.httpx.unmarshalling.{FromResponseUnmarshaller, MalformedContent}
import spray.json.DefaultJsonProtocol
import spray.json.lenses.JsonLenses._
import spray.client.pipelining._

object UserStatusJsonProtocol extends DefaultJsonProtocol {
  implicit val userStatusUnmarshaller = new FromResponseUnmarshaller[UserStatus] {
    implicit val userStatusJsonFormat = jsonFormat2(UserStatus)

    def apply(response: HttpResponse) = try {
      Right(response.entity.asString.extract[UserStatus]('payload / 'statuses))
    } catch { case x: Throwable =>
      Left(MalformedContent("Could not unmarshal user status.", x))
    }
  }
}

import UserStatusJsonProtocol._

def userStatus(userId: String): Future[UserStatus] = {
  val pipeline = sendReceive ~> unmarshal[UserStatus]
  pipeline(Get(s"/api/user/${userId}/status"))
}
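For reference, the lens extraction itself can be tried on a plain string (a minimal sketch using the JSON shape above):

import spray.json.DefaultJsonProtocol._
import spray.json.lenses.JsonLenses._

val json = """{"payload": {"statuses": {"emailConfirmation": "PENDING", "phoneConfirmation": "DONE"}}}"""
implicit val userStatusJsonFormat = jsonFormat2(UserStatus)

val status = json.extract[UserStatus]('payload / 'statuses)
// => UserStatus("PENDING", "DONE")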
First off, I'm new to Scala.
I'm trying to make a template parser in Scala (similar to Smarty in PHP). It needs to search through a document, replacing anything inside "{{ }}" tags with the corresponding value from the HashMap.
I'm currently stuck here:
import scala.collection.mutable.HashMap
import scala.io.Source

class Template(filename: String, vars: HashMap[Symbol, Any]) {
  def parse() = {
    var contents = Source.fromFile(filename, "ASCII").mkString
    var rule = """\{\{(.*)\}\}""".r
    //for(rule(v) <- rule findAllIn contents) {
    //  yield v
    //}
    //rule.replaceAllIn(contents, )
  }
}

var t = new Template("FILENAME", new HashMap[Symbol, Any])
println(t.parse)
The parts that I've commented out are things I've thought about doing.
Thanks
I've come a little further...
import scala.collection.mutable.HashMap
import scala.io.Source
import java.util.regex.Pattern
import java.util.regex.Matcher

class Template(filename: String, vars: HashMap[Symbol, Any]) {
  def findAndReplace(m: Matcher)(callback: String => String): String = {
    val sb = new StringBuffer
    while (m.find) {
      m.appendReplacement(sb, callback(m.group(1)))
    }
    m.appendTail(sb)
    sb.toString
  }

  def parse() = {
    var contents = Source.fromFile(filename, "ASCII").mkString
    val m = Pattern.compile("""\{\{(.*)\}\}""").matcher(contents)
    findAndReplace(m) { x => x }
  }
}

var t = new Template("FILENAME.html", new HashMap[Symbol, Any])
println(t.parse)
At the moment it just adds whatever was inside the tag back into the document. I'm wondering if there is an easier way of doing a find-and-replace style regexp in Scala?
I'd do it like this (String as key instead of Symbol):
var s: String = input // line, whatever
val regexp = """pattern""".r
while ((regexp findFirstIn s) != None) {
  s = regexp replaceFirstIn (s, vars(regexp.findFirstIn(s).get))
}
If you prefer not to use var, go recursive instead of using while (see the sketch at the end of this answer). And, of course, a StringBuilder would be more efficient; in that case, I might do the following:
val regexp = """^(.*?)(?:{{(pattern)}})?""".r
for(subs <- regexp findAllIn s)
subs match {
case regexp(prefix, var) => sb.append(prefix); if (var != null) sb.append("{{"+vars(var)+"}}")
case _ => error("Shouldn't happen")
}
That way you keep appending the non-changing part, followed by the next part to be replaced.
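For reference, the recursive variant mentioned above might look like this (a sketch, assuming the same regexp and vars as in the first snippet):

@annotation.tailrec
def replaceAll(s: String): String =
  regexp.findFirstIn(s) match {
    case Some(key) => replaceAll(regexp.replaceFirstIn(s, vars(key)))
    case None      => s
  }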
There is a flavor of replaceAllIn in util.matching.Regex that accepts a replacer callback. A short example:
import util.matching.Regex

def replaceVars(r: Regex)(getVar: String => String) = {
  def replacement(m: Regex.Match) = {
    import java.util.regex.Matcher
    require(m.groupCount == 1)
    Matcher.quoteReplacement( getVar(m group 1) )
  }
  (s: String) => r.replaceAllIn(s, replacement _)
}
This is how we would use it:
val r = """\{\{([^{}]+)\}\}""".r
val m = Map("FILENAME" -> "aaa.txt",
"ENCODING" -> "UTF-8")
val template = replaceVars(r)( m.withDefaultValue("UNKNOWN") )
println( template("""whatever input contains {{FILENAME}} and
unknown key {{NOVAL}} and {{FILENAME}} again,
and {{ENCODING}}""") )
Note that Matcher.quoteReplacement escapes $ characters in the replacement string. Otherwise you may get java.lang.IllegalArgumentException: Illegal group reference (see the blog post "replaceAll and dollar signs" on why this may happen).
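A minimal illustration of the failure mode (hypothetical values; only the dollar sign in the replacement matters):

import java.util.regex.Matcher

val r = """\{\{([^{}]+)\}\}""".r
val price = "US$ 99" // replacement text that happens to contain '$'

// Without quoting, "$ " is parsed as a group reference and throws
// java.lang.IllegalArgumentException: Illegal group reference:
// r.replaceAllIn("cost: {{PRICE}}", price)

// Quoting the replacement treats the dollar sign literally:
r.replaceAllIn("cost: {{PRICE}}", Matcher.quoteReplacement(price))
// => "cost: US$ 99"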
Here is also an interesting way to do the same using function composition:
val Regexp = """\{\{([^{}]+)\}\}""".r
val map = Map("VARIABLE1" -> "VALUE1", "VARIABLE2" -> "VALUE2", "VARIABLE3" -> "VALUE3")
val incomingData = "I'm {{VARIABLE1}}. I'm {{VARIABLE2}}. And I'm {{VARIABLE3}}. And also {{VARIABLE1}}"
def replace(incoming: String) = {
def replace(what: String, `with`: String)(where: String) = where.replace(what, `with`)
val composedReplace = Regexp.findAllMatchIn(incoming).map { m => replace(m.matched, map(m.group(1)))(_) }.reduceLeftOption((lf, rf) => lf compose rf).getOrElse(identity[String](_))
composedReplace(incomingData)
}
println(replace(incomingData))
//OUTPUT: I'm VALUE1. I'm VALUE2. And I'm VALUE3. And also VALUE1