Akka stream custom graph stage

I have an Akka stream from a WebSocket (as in akka stream consume web socket) and would like to build a reusable graph stage with an inlet (the stream), a FlowShape that adds an additional field to the JSON specifying the origin, i.e.
{
...,
"origin":"blockchain.info"
}
and an outlet to Kafka.
I face the following three problems:
unable to wrap my head around creating a custom Inlet from the WebSocket flow
unable to integrate Kafka directly into the stream (see the code below)
not sure whether the transformer that adds the additional field needs to deserialize the JSON in order to add the origin
The sample project (flow only) looks like:
import akka.Done
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.StatusCodes
import akka.http.scaladsl.model.ws.{ BinaryMessage, Message, TextMessage, WebSocketRequest }
import akka.kafka.ProducerSettings
import akka.kafka.scaladsl.Producer
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{ Flow, Keep, Sink, Source }
import org.apache.kafka.clients.producer.ProducerRecord
import org.apache.kafka.common.serialization.{ ByteArraySerializer, StringSerializer }
import scala.concurrent.Future

implicit val system = ActorSystem()
implicit val materializer = ActorMaterializer()
import system.dispatcher

val incoming: Sink[Message, Future[Done]] =
  Flow[Message].mapAsync(4) {
    case message: TextMessage.Strict =>
      println(message.text)
      Future.successful(Done)
    case message: TextMessage.Streamed =>
      message.textStream.runForeach(println)
    case message: BinaryMessage =>
      message.dataStream.runWith(Sink.ignore)
  }.toMat(Sink.last)(Keep.right)

val producerSettings = ProducerSettings(system, new ByteArraySerializer, new StringSerializer)
  .withBootstrapServers("localhost:9092")

val outgoing = Source.single(TextMessage("{\"op\":\"unconfirmed_sub\"}")).concatMat(Source.maybe)(Keep.right)

val webSocketFlow = Http().webSocketClientFlow(WebSocketRequest("wss://ws.blockchain.info/inv"))

val ((completionPromise, upgradeResponse), closed) =
  outgoing
    .viaMat(webSocketFlow)(Keep.both)
    .toMat(incoming)(Keep.both)
    // TODO not working integrating kafka here
    // .map(_.toString)
    // .map { elem =>
    //   println(s"PlainSinkProducer produce: ${elem}")
    //   new ProducerRecord[Array[Byte], String]("topic1", elem)
    // }
    // .runWith(Producer.plainSink(producerSettings))
    .run()

val connected = upgradeResponse.flatMap { upgrade =>
  if (upgrade.response.status == StatusCodes.SwitchingProtocols) {
    Future.successful(Done)
  } else {
    system.terminate()
    throw new RuntimeException(s"Connection failed: ${upgrade.response.status}")
  }
}

// kafka that works / writes dummy data
val done1 = Source(1 to 100)
  .map(_.toString)
  .map { elem =>
    println(s"PlainSinkProducer produce: ${elem}")
    new ProducerRecord[Array[Byte], String]("topic1", elem)
  }
  .runWith(Producer.plainSink(producerSettings))

One issue is around the incoming stage, which is modelled as a Sink, where it should be modelled as a Flow in order to subsequently feed messages into Kafka.
Because incoming text messages can be Streamed, the flow needs to handle both the Strict and Streamed cases; in the mapAsync stage below, streamed frames are folded back into a single String (note that this buffers each potentially large message in memory):
val incoming: Flow[Message, String, NotUsed] = Flow[Message].mapAsync(4) {
  case TextMessage.Strict(text) =>
    Future.successful(Some(text))
  case TextMessage.Streamed(src) =>
    src.runFold("")(_ + _).map { msg => Some(msg) }
  case msg: BinaryMessage =>
    msg.dataStream.runWith(Sink.ignore)
    Future.successful(None)
}.collect {
  case Some(msg) => msg
}
At this point you have something that produces strings and can be connected to Kafka:
val addOrigin: Flow[String, String, NotUsed] = ???

val ((completionPromise, upgradeResponse), closed) =
  outgoing
    .viaMat(webSocketFlow)(Keep.both)
    .via(incoming)
    .via(addOrigin)
    .map { elem =>
      println(s"PlainSinkProducer produce: ${elem}")
      new ProducerRecord[Array[Byte], String]("topic1", elem)
    }
    .toMat(Producer.plainSink(producerSettings))(Keep.both)
    .run()
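As for the third question: yes, the simplest way to add the origin field is to deserialize the JSON, add the field, and serialize it again. One way to fill in the addOrigin placeholder is sketched below using spray-json (any JSON library would do; the hard-coded origin value is just for illustration):
import akka.NotUsed
import akka.stream.scaladsl.Flow
import spray.json._

val addOrigin: Flow[String, String, NotUsed] = Flow[String].map { msg =>
  val obj = msg.parseJson.asJsObject
  // add the origin field alongside the existing fields
  JsObject(obj.fields + ("origin" -> JsString("blockchain.info"))).compactPrint
}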

Related

Akka actor for http request Java

Hello, I am trying to find a simple example in Akka (Java) of how to create an HTTP client in an Actor. So far I am able to create a request and get the response HttpEntity. I need to migrate it to an actor, so I can call multiple actors in parallel with a timeout.
final ActorSystem system = ActorSystem.create();
final Materializer materializer = ActorMaterializer.create(system);

final List<HttpRequest> httpRequests = Arrays.asList(
    HttpRequest.create(url) // Content-Encoding: gzip in response
);

Unmarshaller<ByteString, BitTweet> unmarshal = Jackson.byteStringUnmarshaller(BitTweet.class);
JsonEntityStreamingSupport support = EntityStreamingSupport.json();
final Http http = Http.get(system);

final Function<HttpResponse, HttpResponse> decodeResponse = response -> {
    // Pick the right coder
    final Coder coder;
    if (HttpEncodings.gzip().equals(response.encoding())) {
        coder = Coder.Gzip;
    } else if (HttpEncodings.deflate().equals(response.encoding())) {
        coder = Coder.Deflate;
    } else {
        coder = Coder.NoCoding;
    }
    // Decode the entity
    return coder.decodeMessage(response);
};

List<CompletableFuture<HttpResponse>> futureResponses = httpRequests.stream()
    .map(req -> http.singleRequest(req, materializer)
        .thenApply(decodeResponse))
    .map(CompletionStage::toCompletableFuture)
    .collect(Collectors.toList());

for (CompletableFuture<HttpResponse> futureResponse : futureResponses) {
    final HttpResponse httpResponse = futureResponse.get();
    system.log().info("response is: " + httpResponse.entity()
        .toStrict(1, materializer)
        .toCompletableFuture()
        .get());
    HttpEntity.Strict entity = HttpEntities.create(ContentTypes.APPLICATION_JSON, httpResponse.entity().toString());
    Source<BitTweet, Object> bitTweets =
        entity.getDataBytes()
            .via(support.framingDecoder()) // apply JSON framing
            .mapAsync(1, // unmarshal each element
                bs -> unmarshal.unmarshal(bs, materializer)
            );
}

How to send Akka broadcast message

I'm trying to send a single message to several actors. Investigation led me to the code below, but it doesn't work. The message that's "wrapped" in the Broadcast object disappears, and the plain string ends up in the dead letter box.
Can someone tell me what I'm missing? (Edit: I've added the correction below)
import akka.actor.{ Actor, ActorSystem, Props }
import akka.routing.{ Broadcast, BroadcastRoutingLogic, Router }

object RouterAndBroadcast {
  class MyRoutee extends Actor {
    override def receive: Receive = {
      case x => println(s"MyRoutee $this got message $x")
    }
  }

  def main(args: Array[String]): Unit = {
    val system = ActorSystem.create("system")
    val mr0 = system.actorOf(Props[MyRoutee])
    val mr1 = system.actorOf(Props[MyRoutee])
    val mr2 = system.actorOf(Props[MyRoutee])

    /* This was the error:
    val router = new Router(BroadcastRoutingLogic())
    router.addRoutee(mr1)
    router.addRoutee(mr2) */

    // This is the corrected version:
    val router = new Router(BroadcastRoutingLogic())
      .addRoutee(mr1)
      .addRoutee(mr2)

    mr1 ! "Hello number one!"
    mr2 ! "Ahoy two, me old mate!"
    router.route(new Broadcast("Listen up!"), mr0) // vanishes??
    router.route("Listen up!", mr0) // ends up in dead letters
    mr1 ! "Number one, are you still there?"
    mr2 ! "Two, where's the grog?"
    router.route(new Broadcast("Still shouting!"), mr0) // also vanishes

    Thread.sleep(5000)
    system.terminate()
  }
}
Router.addRoutee returns a copy with the routee added; it doesn't modify the Router in place, see:
https://github.com/akka/akka/blob/b94e064a34a7f6a9d1fea55317d5676731ac0778/akka-actor/src/main/scala/akka/routing/Router.scala#L140
/**
* Create a new instance with one more routee and the same [[RoutingLogic]].
*/
def addRoutee(routee: Routee): Router = copy(routees = routees :+ routee)
so instead reassign the result (or chain the calls at construction, as in the corrected version above):
router = router.addRoutee(mr1).addRoutee(mr2) // requires router to be a var

Akka stream get current value of an infinite stream from outside world

What is the best way to get the current value of an infinite stream which aggregates values and by definition never completes?
Source.repeat(1)
  .scan(0)(_ + _)
  .to(Sink.ignore)
I would like to query the current counter value from Akka HTTP. Should I use a dynamic stream, i.e. a BroadcastHub, and then subscribe to the infinite stream from Akka HTTP on each GET request?
One solution could be to use an actor to keep the state you need. Sink.actorRef will wrap an existing actor ref in a sink, e.g.
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.duration._

class Keeper extends Actor {
  var i: Int = 0
  override def receive: Receive = {
    case n: Int ⇒ i = n
    case Keeper.Get ⇒ sender() ! i
  }
}

object Keeper {
  case object Get
}

val actorRef = system.actorOf(Props(classOf[Keeper]))

val q = Source.repeat(1)
  .scan(0)(_ + _)
  .runWith(Sink.actorRef(actorRef, PoisonPill))

implicit val timeout: Timeout = 3.seconds
val result = (actorRef ? Keeper.Get).mapTo[Int]
Note that backpressure is not preserved when using Sink.actorRef. This can be improved by using Sink.actorRefWithAck. More about this can be found in the docs.
One possibility is using Sink.actorRefWithBackpressure.
Imagine having the following Actor to store the state coming from a Stream:
object StreamState {
  case object Ack

  sealed trait Protocol extends Product with Serializable
  case object StreamInitialized extends Protocol
  case object StreamCompleted extends Protocol
  final case class WriteState[A](value: A) extends Protocol
  final case class StreamFailure(ex: Throwable) extends Protocol
  case object GetState extends Protocol
}

class StreamState[A](implicit A: ClassTag[A]) extends Actor with ActorLogging {
  import StreamState._

  var state: Option[A] = None

  def receive: Receive = {
    case StreamInitialized =>
      log.info("Stream initialized!")
      sender() ! Ack // ack to allow the stream to proceed sending more elements
    case StreamCompleted =>
      log.info("Stream completed!")
    case StreamFailure(ex) =>
      log.error(ex, "Stream failed!")
    case WriteState(A(value)) =>
      log.info("Received element: {}", value)
      state = Some(value)
      sender() ! Ack // ack to allow the stream to proceed sending more elements
    case GetState =>
      log.info("Fetching state: {}", state)
      sender() ! state
    case other =>
      log.warning("Unexpected message '{}'", other)
  }
}
This actor can be then used in a Sink of a Stream as follows:
implicit val tm: Timeout = Timeout(1.second)

val stream: Source[Int, NotUsed] = Source.repeat(1).scan(0)(_ + _)
val receiver = system.actorOf(Props(new StreamState[Int]))

val sink = Sink.actorRefWithBackpressure(
  receiver,
  onInitMessage = StreamState.StreamInitialized,
  ackMessage = StreamState.Ack,
  onCompleteMessage = StreamState.StreamCompleted,
  onFailureMessage = (ex: Throwable) => StreamState.StreamFailure(ex)
)

stream.runWith(sink)

// Ask the receiver Actor for the Stream's current state
val futureState = (receiver ? StreamState.GetState).mapTo[Option[Int]]
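Tying this back to the question of querying from Akka HTTP: the receiver actor can back a route. A minimal sketch, assuming the receiver and tm defined above (the /counter path and port are made up for illustration):
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives._
import akka.pattern.ask
import system.dispatcher

// hypothetical route exposing the aggregated value on GET /counter
val route = path("counter") {
  get {
    complete((receiver ? StreamState.GetState).mapTo[Option[Int]].map(_.toString))
  }
}

Http().bindAndHandle(route, "localhost", 8080)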

Can this code be translated to stateful akka streams?

I'm trying to listen to SQS using Akka Streams, and I get messages from its queue
using the code snippet below.
Of course this snippet gets messages one by one (then acks each):
implicit val system = ActorSystem()
implicit val mat = ActorMaterializer()
implicit val ec = ExecutionContext.fromExecutor(Executors.newFixedThreadPool(ioThreadPoolSize))

val awsSqsClient: AmazonSQSAsync = AmazonSQSAsyncClientBuilder
  .standard()
  .withCredentials(new ClasspathPropertiesFileCredentialsProvider())
  .withEndpointConfiguration(new EndpointConfiguration(sqsEndpoint, configuration.regionName))
  .build()

val future = SqsSource(sqsEndpoint)(awsSqsClient)
  .takeWhile(_ => true)
  .mapAsync(parallelism = 2)(m => {
    val msgBody = SqsMessage.deserializeJson(m.getBody)
    msgBody match {
      case Right(body) => val id = getId(body) // do some stuff with the message, may save state according to the id
      case Left(error) => // handle the deserialization failure, e.g. log it
    }
    Future(m, Ack())
  })
  .to(SqsAckSink(sqsEndpoint)(awsSqsClient))
  .run()
My question is:
can I get several messages and save them, for example in a stateful map, for later use?
For example, after receiving 5 messages (all of them saved as state),
if a specific condition holds I would ack them all; if not, they would return to the queue (which happens anyway because of the visibility timeout)?
Thanks.
It could be that you're looking for the grouped (or groupedWithin) combinator. These allow you to batch messages and process them in groups. groupedWithin allows you to release a batch after a certain time in case it hasn't yet reached the determined size. Docs reference here.
In a subsequent check flow you can perform any logic you need, and emit the sequence in case you want the messages to be acked, or not emit them otherwise.
Example:
val yourCheck: Flow[Seq[MessageActionPair], Seq[MessageActionPair], NotUsed] = ???

val future = SqsSource(sqsEndpoint)(awsSqsClient)
  .takeWhile(_ => true)
  .mapAsync(parallelism = 2) { ... }
  .grouped(5)
  .via(yourCheck)
  .mapConcat(identity)
  .to(SqsAckSink(sqsEndpoint)(awsSqsClient))
  .run()
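For completeness, here is a sketch of what yourCheck might look like; the allShouldBeAcked predicate is hypothetical and stands in for whatever condition you need. Batches that fail the check are simply not emitted, so their messages reappear after the visibility timeout. If you also want to release incomplete batches after a while, swap .grouped(5) for .groupedWithin(5, 10.seconds):
import akka.NotUsed
import akka.stream.scaladsl.Flow

// hypothetical predicate over a batch of (message, action) pairs
def allShouldBeAcked(batch: Seq[MessageActionPair]): Boolean = ???

val yourCheck: Flow[Seq[MessageActionPair], Seq[MessageActionPair], NotUsed] =
  Flow[Seq[MessageActionPair]].mapConcat { batch =>
    if (allShouldBeAcked(batch)) List(batch) // emit: messages get acked downstream
    else Nil // drop: messages return to the queue via the visibility timeout
  }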

Processing Akka stream in Slick transaction

Software versions:
Akka 2.4.4
Slick 3.1.0
I want to process elements from an Akka stream in a Slick transaction.
Here is some simplified code to illustrate one possible approach:
def insert(d: AnimalFields): DBIO[Long] =
  animals returning animals.map(_.id) += d

val source: Source[AnimalFields, _] = ???
val sourceAsTraversable = ???

db.run((for {
  ids <- DBIO.sequence(sourceAsTraversable.map(insert))
} yield { ids }).transactionally)
One solution I could come up with so far is blocking each future to traverse the elements:
import scala.annotation.tailrec

class TraversableQueue[T](sinkQueue: SinkQueue[T]) extends Traversable[T] {
  @tailrec private def next[U](f: T => U): Unit = {
    val nextElem = Await.result(sinkQueue.pull(), Duration.Inf)
    if (nextElem.isDefined) {
      f(nextElem.get)
      next(f)
    }
  }
  def foreach[U](f: T => U): Unit = next(f)
}

val sinkQueue = source.runWith(Sink.queue())
val queue = new TraversableQueue(sinkQueue)
Now I can pass the traversable queue to DBIO.sequence(). This defeats the purpose of streamed processing, though.
Another approach I found is this:
def toDbioAction[T](queue: SinkQueue[DBIOAction[T, NoStream, Effect.All]]):
    DBIOAction[Queue[T], NoStream, Effect.All] =
  DBIO.from(queue.pull() map { tOption =>
    tOption match {
      case Some(action) =>
        action.flatMap(t => toDbioAction(queue).map(_ :+ t))
      case None => DBIO.successful(Queue())
    }
  }).flatMap(r => r)
With this method, a sequence of DBIOActions can be generated without blocking:
toDbioAction(source.runWith(Sink.queue()))
Is there any better / more idiomatic way to achieve the desired result?
Here is my implementation of sourceAsTraversable:
import scala.collection.JavaConverters._
def sourceAsTraversable[A](source: Source[A, _])(implicit mat: Materializer): Traversable[A] =
  source.runWith(StreamConverters.asJavaStream()).iterator().asScala.toIterable
The issue with TraversableQueue was that its foreach had to finish processing the stream fully: it did not support the "break" concept, so methods like drop, take, etc. would still have to process the whole source. This can be important from an error-handling point of view and for failing fast.
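Plugged into the question's original snippet this becomes, e.g. (a sketch assuming db, source, insert and an implicit Materializer are in scope as defined above):
val ids: Future[Traversable[Long]] =
  db.run(DBIO.sequence(sourceAsTraversable(source).map(insert)).transactionally)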