Can this code be translated to stateful Akka Streams? - akka

I'm trying to listen to SQS using Akka Streams, and I get messages from its queue
using the code snippet below.
Of course, this snippet gets messages one by one (and then acks each one):
implicit val system = ActorSystem()
implicit val mat = ActorMaterializer()
implicit val ec = ExecutionContext.fromExecutor(Executors.newFixedThreadPool(ioThreadPoolSize))

val awsSqsClient: AmazonSQSAsync = AmazonSQSAsyncClientBuilder
  .standard()
  .withCredentials(new ClasspathPropertiesFileCredentialsProvider())
  .withEndpointConfiguration(new EndpointConfiguration(sqsEndpoint, configuration.regionName))
  .build()

val future = SqsSource(sqsEndpoint)(awsSqsClient)
  .takeWhile(_ => true)
  .mapAsync(parallelism = 2) { m =>
    val msgBody = SqsMessage.deserializeJson(m.getBody)
    msgBody match {
      case Right(body) => val id = getId(body) // do some stuff with the message, may save state according to the id
    }
    Future(m, Ack())
  }
  .to(SqsAckSink(sqsEndpoint)(awsSqsClient))
  .run()
My question is: can I get several messages and save them, for example in a stateful map, for later use?
For example, after receiving 5 messages (all of them saved as state), if a specific condition holds I would ack them all, and if not they would return to the queue (which will happen anyway because of the visibility timeout)?
Thanks.

It could be that you're looking for the grouped (or groupedWithin) combinator. These allow you to batch messages and process them in groups. groupedWithin additionally releases a batch after a certain time in case it hasn't yet reached your chosen size; see the docs for reference.
In a subsequent check flow you can perform any logic you need, and emit the sequence if you want the messages to be acked, or not emit it otherwise.
Example:
val yourCheck: Flow[Seq[MessageActionPair], Seq[MessageActionPair], NotUsed] = ???

val future = SqsSource(sqsEndpoint)(awsSqsClient)
  .takeWhile(_ => true)
  .mapAsync(parallelism = 2) { ... }
  .grouped(5)
  .via(yourCheck)
  .mapConcat(identity)
  .to(SqsAckSink(sqsEndpoint)(awsSqsClient))
  .run()
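For illustration, the yourCheck placeholder could be a simple filter over the whole batch; batchIsValid here is a hypothetical stand-in for whatever condition you need:

// Hypothetical predicate standing in for your real condition.
def batchIsValid(batch: Seq[MessageActionPair]): Boolean = ???

// Emit the batch downstream (so the messages get acked) only when the condition
// holds; otherwise drop it and let the visibility timeout return the messages
// to the queue.
val yourCheck: Flow[Seq[MessageActionPair], Seq[MessageActionPair], NotUsed] =
  Flow[Seq[MessageActionPair]].filter(batchIsValid)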

Related

Akka stream get current value of an infinite stream from outside world

What is the best way to get the current value of an infinite stream which aggregates values and by definition never completes?
Source.repeat(1)
  .scan(0)(_ + _)
  .to(Sink.ignore)
I would like to query the current counter value from Akka HTTP. Should I use a dynamic stream? A BroadcastHub, and then subscribe from Akka HTTP to the infinite stream on each GET request?
One solution could be to use an actor to keep the state you need. Sink.actorRef will wrap an existing actor ref in a sink, e.g.
class Keeper extends Actor {
  var i: Int = 0

  override def receive: Receive = {
    case n: Int     => i = n
    case Keeper.Get => sender() ! i
  }
}

object Keeper {
  case object Get
}

val actorRef = system.actorOf(Props(classOf[Keeper]))

val q = Source.repeat(1)
  .scan(0)(_ + _)
  .runWith(Sink.actorRef(actorRef, PoisonPill))

// requires akka.pattern.ask and an implicit Timeout in scope
val result = (actorRef ? Keeper.Get).mapTo[Int]
Note that backpressure is not preserved when using Sink.actorRef. This can be improved by using Sink.actorRefWithAck. More about this can be found in the docs.
Another possibility is using Sink.actorRefWithBackpressure.
Imagine having the following actor to store the state coming from a stream:
object StremState {
  case object Ack

  sealed trait Protocol extends Product with Serializable
  case object StreamInitialized extends Protocol
  case object StreamCompleted extends Protocol
  final case class WriteState[A](value: A) extends Protocol
  final case class StreamFailure(ex: Throwable) extends Protocol
  final case object GetState extends Protocol
}

class StremState[A](implicit A: ClassTag[A]) extends Actor with ActorLogging {
  import StremState._

  var state: Option[A] = None

  def receive: Receive = {
    case StreamInitialized =>
      log.info("Stream initialized!")
      sender() ! Ack // ack to allow the stream to proceed sending more elements
    case StreamCompleted =>
      log.info("Stream completed!")
    case StreamFailure(ex) =>
      log.error(ex, "Stream failed!")
    case WriteState(A(value)) =>
      log.info("Received element: {}", value)
      state = Some(value)
      sender() ! Ack // ack to allow the stream to proceed sending more elements
    case GetState =>
      log.info("Fetching state: {}", state)
      sender() ! state
    case other =>
      log.warning("Unexpected message '{}'", other)
  }
}
This actor can then be used in a stream's Sink as follows:
implicit val tm: Timeout = Timeout(1.second)

val stream: Source[Int, NotUsed] = Source.repeat(1).scan(0)(_ + _)
val receiver = system.actorOf(Props(new StremState[Int]))

val sink = Sink.actorRefWithBackpressure(
  receiver,
  onInitMessage = StremState.StreamInitialized,
  ackMessage = StremState.Ack,
  onCompleteMessage = StremState.StreamCompleted,
  onFailureMessage = (ex: Throwable) => StremState.StreamFailure(ex)
)

// wrap each element so it matches the actor's WriteState handler
stream.map(StremState.WriteState(_)).runWith(sink)

// Ask the receiver actor for the stream's current state
val futureState = receiver ? StremState.GetState
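To address the Akka HTTP part of the question, a minimal sketch of a route that asks the receiver actor on each GET request (the /counter path is made up, and an implicit ExecutionContext plus the Timeout above are assumed in scope):

import akka.http.scaladsl.server.Directives._
import akka.pattern.ask

// Each GET asks the actor for the latest value and completes with it.
val route = path("counter") {
  get {
    complete {
      (receiver ? StremState.GetState).mapTo[Option[Int]].map(_.toString)
    }
  }
}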

Create AMQP queue in exchange with Alpakka

I want to create a queue bound to an existing exchange, for reading.
Another application is publishing messages to this exchange and fanning them out to all member queues. I want my new application to be an additional subscriber to these messages.
The following creates a queue:
implicit val system = ActorSystem("my-system")
implicit val materializer = ActorMaterializer()
implicit val executionCtx: ExecutionContext = system.dispatcher

val queueName: String = s"test-queue-${System.currentTimeMillis}"
val queueDeclaration = QueueDeclaration(queueName, autoDelete = true)

val amqpSource = AmqpSource(
  NamedQueueSourceSettings(AmqpConnectionUri(amqpUri), queueName)
    .withDeclarations(queueDeclaration),
  bufferSize = 10)
And this creates a sink for an exchange
val sink = AmqpSink.simple(
  AmqpSinkSettings(AmqpConnectionUri(amqpUri))
    .withExchange("exchange_name"))
But I'm not sure how to use them together, if that's the right approach.
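One way to subscribe to the exchange is to declare a binding between your queue and the exchange alongside the queue declaration. A sketch, assuming Alpakka's BindingDeclaration (exact declaration APIs vary between Alpakka versions):

// Bind the queue to the existing fanout exchange, so the broker
// copies messages published to the exchange into this queue.
val exchangeName = "exchange_name"
val bindingDeclaration = BindingDeclaration(queueName, exchangeName)

val boundSource = AmqpSource(
  NamedQueueSourceSettings(AmqpConnectionUri(amqpUri), queueName)
    .withDeclarations(queueDeclaration, bindingDeclaration),
  bufferSize = 10)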

akka stream custom graph stage

I have an Akka stream from a WebSocket (as in akka stream consume web socket) and would like to build a reusable graph stage with an inlet for the stream, a FlowShape that adds an additional field to the JSON specifying the origin, i.e.
{
  ...,
  "origin": "blockchain.info"
}
and an outlet to Kafka.
I face the following 3 problems:
unable to wrap my head around creating a custom Inlet from the WebSocket flow
unable to integrate Kafka directly into the stream (see the code below)
not sure whether the transformer that adds the additional field would need to deserialize the JSON in order to add the origin
The sample project (flow only) looks like:
implicit val system = ActorSystem()
implicit val materializer = ActorMaterializer()
import system.dispatcher

val incoming: Sink[Message, Future[Done]] =
  Flow[Message].mapAsync(4) {
    case message: TextMessage.Strict =>
      println(message.text)
      Future.successful(Done)
    case message: TextMessage.Streamed =>
      message.textStream.runForeach(println)
    case message: BinaryMessage =>
      message.dataStream.runWith(Sink.ignore)
  }.toMat(Sink.last)(Keep.right)

val producerSettings = ProducerSettings(system, new ByteArraySerializer, new StringSerializer)
  .withBootstrapServers("localhost:9092")

val outgoing = Source.single(TextMessage("{\"op\":\"unconfirmed_sub\"}")).concatMat(Source.maybe)(Keep.right)

val webSocketFlow = Http().webSocketClientFlow(WebSocketRequest("wss://ws.blockchain.info/inv"))

val ((completionPromise, upgradeResponse), closed) =
  outgoing
    .viaMat(webSocketFlow)(Keep.both)
    .toMat(incoming)(Keep.both)
    // TODO not working integrating kafka here
    // .map(_.toString)
    // .map { elem =>
    //   println(s"PlainSinkProducer produce: ${elem}")
    //   new ProducerRecord[Array[Byte], String]("topic1", elem)
    // }
    // .runWith(Producer.plainSink(producerSettings))
    .run()

val connected = upgradeResponse.flatMap { upgrade =>
  if (upgrade.response.status == StatusCodes.SwitchingProtocols) {
    Future.successful(Done)
  } else {
    throw new RuntimeException(s"Connection failed: ${upgrade.response.status}")
    system.terminate
  }
}

// kafka that works / writes dummy data
val done1 = Source(1 to 100)
  .map(_.toString)
  .map { elem =>
    println(s"PlainSinkProducer produce: ${elem}")
    new ProducerRecord[Array[Byte], String]("topic1", elem)
  }
  .runWith(Producer.plainSink(producerSettings))
One issue is around the incoming stage, which is modelled as a Sink where it should be modelled as a Flow, so messages can subsequently be fed into Kafka.
Because incoming text messages can arrive Strict or Streamed, you need to handle both cases; the Streamed frames can be folded into a single string as follows:
val incoming: Flow[Message, String, NotUsed] = Flow[Message].mapAsync(4) {
  case msg: BinaryMessage =>
    // drain the binary data stream and drop the message
    msg.dataStream.runWith(Sink.ignore)
    Future.successful(None)
  case TextMessage.Strict(text) =>
    Future.successful(Some(text))
  case TextMessage.Streamed(src) =>
    src.runFold("")(_ + _).map { msg => Some(msg) }
}.collect {
  case Some(msg) => msg
}
At this point you've got something that produces strings and can be connected to Kafka:
val addOrigin: Flow[String, String, NotUsed] = ???

val ((completionPromise, upgradeResponse), closed) =
  outgoing
    .viaMat(webSocketFlow)(Keep.both)
    .via(incoming)
    .via(addOrigin)
    .map { elem =>
      println(s"PlainSinkProducer produce: ${elem}")
      new ProducerRecord[Array[Byte], String]("topic1", elem)
    }
    .toMat(Producer.plainSink(producerSettings))(Keep.both)
    .run()
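As for the third question: one possible addOrigin does deserialize each message, since reliably appending a field to raw JSON text requires parsing it. A sketch, assuming spray-json is on the classpath (any JSON library would do):

import spray.json._

// Parse each message, add the "origin" field, and re-serialize.
val addOrigin: Flow[String, String, NotUsed] = Flow[String].map { s =>
  val fields = s.parseJson.asJsObject.fields
  JsObject(fields + ("origin" -> JsString("blockchain.info"))).compactPrint
}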

Order of ActorContext changes

I noticed that the actor first sends a message announcing the state change, and only afterwards actually changes its state. Is this correct?
class MyActor extends Actor {
  def receive = idle(Set.empty)

  def idle(isInSet: Set[String]): Receive = {
    case Add(key) =>
      // sending the result as a message back to our actor
      validate(key).map(Validated(key, _)).pipeTo(self)
      // waiting for validation
      context.become(waitForValidation(isInSet, sender()))
  }

  def waitForValidation(set: Set[String], source: ActorRef): Receive = {
    case Validated(key, isValid) =>
      val newSet = if (isValid) set + key else set
      // sending acknowledgement of completion
      source ! Continue // <- here the notification is sent...
      // go back to idle, accepting new requests
      context.become(idle(newSet)) // <- ...and only later the state is changed
    case Add(key) =>
      sender() ! Rejected
  }

  def validate(key: String): Future[Boolean] = ???
}

// Messages
case class Add(key: String)
case class Validated(key: String, isValid: Boolean)
case object Continue
case object Rejected
You should probably consider moving become() before pipeTo(self) if you want the actor to receive the message in the waitForValidation state:
context.become(waitForValidation(isInSet, sender()))
validate(key).map(Validated(key, _)).pipeTo(self)
I agree that piping the message will put it in the mailbox, and by the time the actor gets to processing it the actor should be in the new state, but most of the examples I have seen call become before piping, just to be on the safe side.

Processing Akka stream in Slick transaction

Software versions:
Akka 2.4.4
Slick 3.1.0
I want to process elements from an Akka stream in a Slick transaction.
Here is some simplified code to illustrate one possible approach:
def insert(d: AnimalFields): DBIO[Long] =
  animals returning animals.map(_.id) += d

val source: Source[AnimalFields, _] = ???
val sourceAsTraversable: Traversable[AnimalFields] = ???

db.run((for {
  ids <- DBIO.sequence(sourceAsTraversable.map(insert))
} yield { ids }).transactionally)
One solution I've come up with so far is blocking on each future to traverse the elements:
class TraversableQueue[T](sinkQueue: SinkQueue[T]) extends Traversable[T] {
  @tailrec private def next[U](f: T => U): Unit = {
    val nextElem = Await.result(sinkQueue.pull(), Duration.Inf)
    if (nextElem.isDefined) {
      f(nextElem.get)
      next(f)
    }
  }

  def foreach[U](f: T => U): Unit = next(f)
}
val sinkQueue = source.runWith(Sink.queue())
val queue = new TraversableQueue(sinkQueue)
Now I can pass the traversable queue to DBIO.sequence(). This defeats the purpose of streamed processing, though.
Another approach I found is this:
def toDbioAction[T](queue: SinkQueue[DBIOAction[T, NoStream, Effect.All]]):
    DBIOAction[Queue[T], NoStream, Effect.All] =
  DBIO.from(queue.pull() map { tOption =>
    tOption match {
      case Some(action) =>
        // prepend so the resulting queue preserves the source order
        action.flatMap(t => toDbioAction(queue).map(t +: _))
      case None => DBIO.successful(Queue())
    }
  }).flatMap(r => r)
With this method, a sequence of DBIOActions can be generated without blocking:
toDbioAction(source.map(insert).runWith(Sink.queue()))
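Running the resulting action in a single transaction then looks like this (a sketch, reusing db and insert from above):

val ids: Future[Queue[Long]] =
  db.run(toDbioAction(source.map(insert).runWith(Sink.queue())).transactionally)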
Is there any better / more idiomatic way to achieve the desired result?
Here is my implementation of sourceAsTraversable:
import scala.collection.JavaConverters._

def sourceAsTraversable[A](source: Source[A, _])(implicit mat: Materializer): Traversable[A] =
  source.runWith(StreamConverters.asJavaStream()).iterator().asScala.toIterable
The issue with TraversableQueue was that its foreach had to finish processing the stream fully: it did not support the "break" concept, so methods like drop/take, etc. would still have to process the whole source. This can matter for error handling and failing fast.
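For completeness, a usage sketch of sourceAsTraversable with the insert action from the question (names taken from the snippets above):

// Wrap the stream as a Traversable and run all inserts in one transaction.
val actions = sourceAsTraversable(source).map(insert)
val idsResult: Future[Seq[Long]] = db.run(DBIO.sequence(actions.toSeq).transactionally)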