Amazon SQS message multi-delivery - amazon-web-services

I understand that to bring vast scalability and reliability, SQS does extensive parallelization of resources. It uses redundant servers even for small queues, and messages posted to a queue are stored redundantly as multiple copies. These are the factors that prevent it from offering exactly-once delivery the way RabbitMQ does. I have even seen deleted messages being delivered again.
The implication for developers is that they need to be prepared for multiple deliveries of the same message. Amazon claims this is not a problem, but if it is, the developer must use some synchronization construct such as a database transaction lock or a DynamoDB conditional write. Both of these reduce scalability.
My question is:
In light of the duplicate-delivery problem, how does the message-invisibility-period feature hold up? The message is not guaranteed to be invisible. If the developer has to make their own arrangements for synchronization, what benefit does the invisibility period provide? I have seen messages re-delivered even when they were supposed to be invisible.
Edit
Here I include some references:
What is a good practice to achieve the "Exactly-once delivery" behavior with Amazon SQS?
http://aws.amazon.com/sqs/faqs/#How_many_times_will_I_receive_each_message
http://aws.amazon.com/sqs/faqs/#How_does_Amazon_SQS_allow_multiple_readers_to_access_the_same_message_queue_without_losing_messages_or_processing_them_many_times
http://aws.amazon.com/sqs/faqs/#Can_a_deleted_message_be_received_again

Message invisibility solves a different problem from guaranteeing one-and-only-one delivery. Consider a long-running operation on an item in the queue. If the processor craps out during the operation, you don't want to delete the message; you want it to reappear and be handled again by a different processor.
So the pattern is...
Write (push) item into queue
View (peek) item in queue
Mark item invisible
Execute process on item
Write results
Delete (pop) item from queue
So whether you get duplicate delivery or not, you still need to ensure that you process the item in the queue. If you delete it as soon as you pull it off the queue and your server then dies, you may lose that message forever. The approach enables aggressive scaling through the use of spot instances, and guarantees (using the above pattern) that you won't lose a message.
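To make the pattern concrete, here is a minimal sketch using boto3; the queue URL and process_item are placeholders. Note that in SQS the receive call itself starts the visibility timeout, so the "peek" and "mark invisible" steps are a single operation.

    import boto3

    sqs = boto3.client("sqs")
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/work-queue"  # hypothetical queue

    def process_item(body):
        ...  # execute the (possibly long-running) work and write results

    while True:
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL,
            MaxNumberOfMessages=1,
            WaitTimeSeconds=20,      # long polling
            VisibilityTimeout=300,   # item stays invisible to other readers while we work
        )
        for msg in resp.get("Messages", []):
            process_item(msg["Body"])
            sqs.delete_message(      # pop only after the work has succeeded
                QueueUrl=QUEUE_URL,
                ReceiptHandle=msg["ReceiptHandle"],
            )

If process_item throws, or the instance dies before delete_message runs, the message becomes visible again after the timeout and another worker picks it up, which is exactly the guarantee described above.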
But it doesn't guarantee once-and-only-once delivery. I don't think it's designed for that problem, and I also don't think it's an insurmountable one. In our case (and I can see why I've never noticed the issue before) we're writing results to S3. It's no big deal if it overwrites the same file with the same data. Of course, if it's a debit transaction going to a bank account, you'd probably want some sort of correlation ID... and most systems already have those. So if you get a duplicate correlation value, you throw an exception and move on.
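A rough sketch of that correlation-ID check, using the DynamoDB conditional write the question mentions (the table name and key are made up for illustration): the first writer wins, and duplicate deliveries see the condition fail.

    import boto3
    from botocore.exceptions import ClientError

    dedup_table = boto3.resource("dynamodb").Table("processed-messages")  # hypothetical table

    def claim(correlation_id):
        """Return True the first time an id is seen, False for duplicate deliveries."""
        try:
            dedup_table.put_item(
                Item={"correlation_id": correlation_id},
                ConditionExpression="attribute_not_exists(correlation_id)",
            )
            return True
        except ClientError as err:
            if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
                return False  # duplicate: throw an exception or skip, as described above
            raise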
Good question. Highlighted something for me.

Related

Redshift stv_wlm_query_state.state: QUEUED vs QUEUEDWAITING

In the Redshift stv_wlm_query_state system table, what are the differences between the QUEUED state and the QUEUEDWAITING state?
I've not seen an exact and authoritative set of definitions for queue states published, but I have a general understanding that has been useful to me. When a query is submitted it needs to be processed through many steps, like compiling, running, and returning data. These are all reflected in queue states, but there is also time before and between these steps as the query progresses. QUEUED just means that the query is in the queuing process but not in another defined state.
Since parallel execution of queries is limited by the WLM and the number of slots available, there is a defined state for queries that are waiting for other queries to finish before they can be executed. This specific waiting-for-an-execution-slot state is QUEUEDWAITING. It is generally the most common place for significant waiting to occur, and it can be addressed directly through the WLM (though not necessarily fixed there). Delays caused by a flurry of very complex queries needing to be compiled and optimized by the leader node would not create QUEUEDWAITING states; they would just show up as QUEUED.
This is my working understanding based on experience. If someone posts an authoritative set of definitions for queue states I'll be as interested as you are.
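If it helps, one quick (and unofficial) way to watch this in practice is to count queries per state. Here is a small sketch using psycopg2 against the cluster; the connection details are obviously placeholders.

    import psycopg2

    conn = psycopg2.connect(
        host="my-cluster.example.redshift.amazonaws.com",  # placeholder endpoint
        port=5439, dbname="dev", user="admin", password="...",
    )
    with conn.cursor() as cur:
        cur.execute("""
            SELECT state, COUNT(*) AS queries
            FROM stv_wlm_query_state
            GROUP BY state
            ORDER BY queries DESC;
        """)
        for state, queries in cur.fetchall():
            print(state, queries)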

Event Sourcing/CQRS doubts about aggregates, atomicity, concurrency and eventual consistency

I'm studying event sourcing and command/query segregation and I have a few doubts that I hope someone with more experience will easily answer:
A) should a command handler work with more than one aggregate? (a.k.a. should they coordinate things between several aggregates?)
B) If my command handler generates more than one event to store, how do you guys push all those events atomically to the event store? (how can I guarantee no other command handler will "interleave" events in between?)
C) In many articles I read people suggest using optimistic locking to write the new events generated, but in my use case I will have around 100 requests / second. This makes me think that a lot of requests will just fail at huge rates (a lot of ConcurrencyExceptions). How do you guys deal with this?
D) How to deal with the fact that the command handler can crash after storing the events in the event store but before publishing them to the event bus? (how to eventually push those "confirmed" events back to the event bus?)
E) How do you guys deal with eventual consistency in the projections? Do you just live with it? Or do people lock things there too in some cases (waiting for an update, for example)?
I made a sequence diagram to better illustrate all of these questions
(and sorry for the bad English).
If my command handler generates more than one event to store, how do you guys push all those events atomically to the event store?
Most reasonable event store implementations will allow you to batch multiple events into the same transaction.
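For illustration, here is a minimal sketch of such a batched append, using sqlite3 as a stand-in event store; the schema and ConcurrencyError are invented for the example, but the idea - all events committed in one transaction, guarded by an expected version - is the same in any store.

    import json
    import sqlite3

    conn = sqlite3.connect("events.db")
    conn.execute("""CREATE TABLE IF NOT EXISTS events (
        stream_id TEXT, version INTEGER, payload TEXT,
        PRIMARY KEY (stream_id, version))""")

    class ConcurrencyError(Exception):
        pass

    def append_events(stream_id, expected_version, events):
        with conn:  # one transaction: either every event commits or none do
            (current,) = conn.execute(
                "SELECT COALESCE(MAX(version), 0) FROM events WHERE stream_id = ?",
                (stream_id,)).fetchone()
            if current != expected_version:
                raise ConcurrencyError(f"expected {expected_version}, found {current}")
            conn.executemany(
                "INSERT INTO events (stream_id, version, payload) VALUES (?, ?, ?)",
                [(stream_id, expected_version + i + 1, json.dumps(e))
                 for i, e in enumerate(events)])

The composite primary key also means two handlers racing to write the same versions cannot interleave events; one of them fails and has to retry.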
In many articles I read people suggest using optimistic locking to write the new events generated, but in my use case I will have around 100 requests / second.
If you have lots of parallel threads trying to maintain a complex invariant, something has gone badly wrong.
For "events" that aren't expected to establish or maintain any invariant, then you are just writing things to the end of a stream. In other words, you are probably not trying to write an event into a specific position in the stream. So you can probably use batching to reduce the number of conflicting writes, and a simple retry mechanism. In effect, you are using the same sort of "fan-in" patterns that appear when you have concurrent writers inserting into a queue.
For the cases where you are establishing/maintaining an invariant, you don't normally have many concurrent writers. Instead, specific writers have authority to write events (think "sharding"); the concurrency controls there are primarily to avoid making a mess in abnormal conditions.
How to deal with the fact that the command handler can crash after storing the events in the event store but before publishing them to the event bus?
Use pull, rather than push, as the primary subscription mechanism. Make sure that subscribers can handle duplicate messages safely (aka "idempotent"). Don't use a message subscription that can re-order events when you need events strictly ordered.
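As a rough sketch of what pull-based, idempotent consumption can look like (read_events, the checkpoint store, and the event fields are all assumptions rather than any particular product's API):

    import time

    def run_subscriber(store, handle, checkpoint):
        position = checkpoint.load()  # last position this subscriber has processed
        while True:
            events = store.read_events(after_position=position)  # ordered, pulled in batches
            for event in events:
                if not checkpoint.already_processed(event.event_id):
                    handle(event)  # handlers must tolerate replays anyway
                position = event.position
                checkpoint.save(position, event.event_id)
            if not events:
                time.sleep(1)  # nothing new yet; poll again shortly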
How do you guys deal with eventual consistency in the projections? Do you just live with it?
Pretty much. Views and reports carry metadata to let you know at what fixed point in "time" the report was accurate.
Unless you lock out all writers while a report is being consumed, there's a potential for any data being out of date, regardless of whether you are using events vs some other data model, regardless of whether you are using a single data model or several.
It's all part of the tradeoff; we accept that there will be a larger window between report time and current time in exchange for lower response latency, an "immutable" event history, etc.
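In code this can be as small as stamping the view with the position it was built from, for example (read_model and the field names here are purely illustrative):

    from datetime import datetime, timezone

    def build_report(read_model, last_applied_position):
        # Readers can compare last_applied_position with the store's head
        # position to see exactly how far behind this view is.
        return {
            "as_of_position": last_applied_position,
            "generated_at": datetime.now(timezone.utc).isoformat(),
            "rows": read_model.query(),  # assumed read-model accessor
        }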
should a command handler work with more than one aggregate?
Probably not, which isn't the same thing as "never".
The usual framing goes something like this: an aggregate isn't a domain-modeling pattern the way an entity is. It's a lifecycle pattern, used to make sure that all of the changes we make at one time are consistent.
If you find that you want a command handler to modify multiple domain entities at the same time, and those entities belong to different aggregates, have you really chosen the correct aggregate boundaries?
What you can do sometimes is have a single command handler that manages multiple transactions, updating a different aggregate in each. But it might be easier, in the long run, to have two different command handlers that each receive a copy of the command and decide what to do, independently.
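A sketch of the two shapes described above, with repositories and command fields invented for the example; each save() is assumed to be its own transaction:

    # Option 1: one handler, two transactions, one aggregate per transaction.
    def handle_transfer(cmd, accounts):
        source = accounts.load(cmd.source_id)
        source.reserve(cmd.amount)
        accounts.save(source)   # transaction 1

        target = accounts.load(cmd.target_id)
        target.credit(cmd.amount)
        accounts.save(target)   # transaction 2 (needs compensation if it fails)

    # Option 2: two handlers, each receiving a copy of the command and
    # deciding independently what to do with its own aggregate.
    def handle_debit(cmd, accounts):
        source = accounts.load(cmd.source_id)
        source.reserve(cmd.amount)
        accounts.save(source)

    def handle_credit(cmd, accounts):
        target = accounts.load(cmd.target_id)
        target.credit(cmd.amount)
        accounts.save(target)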

Akka Persistence - AtLeastOnceDelivery: How to get unlimited number of unconfirmed messages without running out of memory

We are using an AtLeastOnceDelivery persistent actor. We are getting a huge volume of unconfirmed messages, and this causes an OutOfMemoryError when the number grows too large.
Is there a way we can configure the AtLeastOnceDelivery actor to keep the unconfirmed messages on disk rather than keeping them all in memory until they are confirmed?
Or better, is there an alternative to AtLeastOnceDelivery that gets around this limitation?
There is no way to tune the AtLeastOnceDelivery trait to persist only to disk and not keep the information in memory. AtLeastOnceDelivery follows semantics similar to persistent actors: the deliverable messages are backed by an Akka Persistence backend and kept in memory for fast access. The in-memory storage is hardcoded into the trait.
If you have trouble keeping the memory usage in bounds, perhaps you should investigate why that is happening and how to solve it. It seems to me that the actor that keeps running out of memory is not getting responses to its message deliveries fast enough, thus deliverable work keeps piling up.
There are a couple of tricks you can try to speed up message processing. You can optimize individual components in your processing pipeline to be faster, or you can parallelize the processing using techniques such as router actors. However, sometimes these tricks are not enough, which is when you need to consider controlling how messages are brought to the actor for processing.
Instead of the actor accepting every message thrown at it, you could make the actor pull its work. There are many ways to implement this pattern, but they share a common premise: instead of dumping all the messages on actors directly, you store the messages in a database (or something else that can handle huge volumes of data), and then have worker actors pull and work on the stored messages as fast as they can. I don't think there are many off-the-shelf solutions for this pattern, so you will probably have to do a bit of work implementing it yourself.
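A language-agnostic sketch of that pull pattern (plain Python rather than Akka, and the store methods are assumptions): messages go into durable storage first, and workers claim small batches as fast as they can, so nothing has to sit in an actor's memory waiting to be confirmed.

    import time

    def producer(store, message):
        store.insert_pending(message)  # durable write instead of an in-memory deliver()

    def worker_loop(store, process, batch_size=10):
        while True:
            batch = store.claim_pending(batch_size)  # atomically mark rows as in-flight
            if not batch:
                time.sleep(0.5)
                continue
            for msg in batch:
                process(msg)
                store.mark_done(msg.id)  # confirm; unconfirmed rows get re-claimed later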

When to use various Akka Mailbox types

I'm trying to understand when and where to use the different built-in Akka mailboxes, as well as when it is appropriate to roll your own. However, nowhere on that page does it explain what a "bounded mailbox" actually is, or how it behaves differently from an unbounded mailbox. The page also categorizes mailboxes as "blocking" vs "non-blocking", and while I have a rough idea of what they mean by this (a message can't be added to a full mailbox until it has drained somewhat), I'm not 100% sure that I understand it. So, given that I don't really know what the docs mean when they categorize a mailbox as bounded or blocking, it's tough for me to tell when I should be using each type.
Also, it seems to be the default Akka behavior to clear out an actor's mailbox if that actor is restarted. I'd like to prevent this, but I'm not sure whether the solution is to use one of these built-in mailbox types (message persistence isn't mentioned on that page) or to somehow use persistent actors to achieve that kind of losslessness.
First, if an actor crashes and is restarted you only lose the current message that was being processed and not the entire mailbox.
A bounded mailbox has a limit on the number of messages it can have queued; once that limit is reached it starts blocking the sender, and the item is not accepted if the queue doesn't drain while the sender is waiting to put it on. If you have concerns about memory and you can tolerate potential message loss, then you might want something like this. An unbounded mailbox has no capacity limit at all, so it could suffer memory issues if it gets flooded.
Whether it's bounded or not affects whether it blocks. Blocking is generally not great for performance and should be avoided when the situation does not call for a bounded mailbox. That's why the default mailbox is unbounded; it yields much better performance than a bounded counterpart.
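Not Akka itself, but a plain-Python analogy for the difference, with queue.Queue standing in for a mailbox: a bounded queue pushes back on the sender when it is full, while an unbounded one just keeps growing.

    import queue

    bounded = queue.Queue(maxsize=2)   # "bounded mailbox"
    bounded.put("a")
    bounded.put("b")
    try:
        bounded.put("c", timeout=0.1)  # queue is full: blocks, then gives up
    except queue.Full:
        print("sender was blocked and the message was not accepted")

    unbounded = queue.Queue()          # "unbounded mailbox": the sender never blocks
    for i in range(100_000):
        unbounded.put(i)               # but memory use grows without limit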
The single-consumer unbounded mailbox will most likely be the fastest because it is optimized for only ever having one consumer taking things off the queue. This means that you cannot use a dispatcher that allows an actor instance to steal items from another actor instance's mailbox (work distributing/stealing), but if you don't care about that, then this mailbox might be the best bet for performance.
The priority-based mailboxes let you provide code that determines where a message is placed in the queue, based on attributes of the message itself. This lets you define message priority yourself, shifting higher-priority items to the front of the queue regardless of the normal FIFO rules.
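Again as a plain-Python analogy rather than the Akka API: a priority mailbox behaves like queue.PriorityQueue, where the ordering you supply wins over arrival order.

    import queue

    mailbox = queue.PriorityQueue()
    mailbox.put((2, "ordinary work"))    # lower number = higher priority
    mailbox.put((0, "shutdown signal"))
    mailbox.put((1, "health check"))

    while not mailbox.empty():
        priority, message = mailbox.get()
        print(priority, message)         # prints in priority order: 0, 1, 2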

MPI distribution layer

I used MPI to write a distribution layer. Let's say we have n data sources and k data consumers. In my approach, each of the n MPI processes reads data and then distributes it to one (or many) of the k data consumers (other MPI processes) in a given manner (according to some logic).
So it seems to be very generic, and my question is: has something like this already been done?
It seems simple, but it might be very complicated. Let's say the distribution logic checks which of the data consumers is ready to work (dynamic work distribution). It may distribute data according to a given algorithm based on the data itself. There are plenty of possibilities, and like everyone else I don't want to reinvent the wheel.
As far as I know, there is no generic implementation of this other than the MPI API itself. You should use the appropriate functions according to the problem's constraints.
If what you're trying to build is a simple n-producers-and-k-consumers synchronized job/data queue, then of course there are already many implementations out there (just google it and you should get a few).
However, the way you present it seems very general - sometimes you want the data to only be sent to one consumer, sometimes to all of them, etc. In that case, you should figure out what you want and when, and use either point-to-point communication functions, or collective communication functions, accordingly (and of course everyone has to know what to expect - you can't have a consumer waiting for data from a single source, while the producer wishes to broadcast the data...).
All that aside, here is one implementation that comes to mind that seems to answer all of your requirements:
Make a synchronized queue: producers push data in at one end, consumers take it from the other (decide on whatever behavior you need for the queue - is its size limited, does adding an element to a full queue block or fail, does removing an element from an empty queue block or fail, etc.).
Assuming the data contains some flag that tells the consumers whether this data is for everyone or just for one of them, the consumers peek and either remove the element, or leave it there and just note that they have already handled it (either by keeping its id locally, or by changing a flag in the data itself).
If you don't want a single piece of collective data to block until everyone has dealt with it, you can use two queues, one for each type of data, and the consumers would take data from one queue at a time (by choosing a different queue each time, choosing a queue randomly, prioritizing one of the queues, or by some agreed order that is deducible from the data, e.g. lowest id first).
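To make the basic queue idea concrete, here is a rough mpi4py sketch (the rank split, tags, and message contents are all made up, and the broadcast/flag handling from the later paragraphs is left out): rank 0 plays the synchronized queue, producer ranks push items into it, and consumers ask for work only when they are ready, so distribution automatically adapts to consumer speed. Run it with at least n_producers + 2 ranks, e.g. mpiexec -n 6 python queue_demo.py.

    from collections import deque
    from mpi4py import MPI

    TAG_PUSH, TAG_REQUEST, TAG_WORK, TAG_DONE = 1, 2, 3, 4

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    n_producers = 2  # ranks 1..2 produce, every remaining rank consumes

    if rank == 0:  # the queue process
        pending, producers_done = deque(), 0
        consumers_left = size - 1 - n_producers
        while consumers_left > 0:
            status = MPI.Status()
            data = comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=status)
            tag, src = status.Get_tag(), status.Get_source()
            if tag == TAG_PUSH:
                if data is None:
                    producers_done += 1      # a producer has finished
                else:
                    pending.append(data)
            elif tag == TAG_REQUEST:
                if pending:
                    comm.send(pending.popleft(), dest=src, tag=TAG_WORK)
                elif producers_done == n_producers:
                    comm.send(None, dest=src, tag=TAG_DONE)
                    consumers_left -= 1
                else:
                    comm.send(None, dest=src, tag=TAG_WORK)  # nothing yet, ask again
    elif rank <= n_producers:  # producers
        for i in range(5):
            comm.send(f"item {rank}-{i}", dest=0, tag=TAG_PUSH)
        comm.send(None, dest=0, tag=TAG_PUSH)  # sentinel: this producer is done
    else:  # consumers
        while True:
            comm.send(None, dest=0, tag=TAG_REQUEST)
            status = MPI.Status()
            work = comm.recv(source=0, tag=MPI.ANY_TAG, status=status)
            if status.Get_tag() == TAG_DONE:
                break
            if work is not None:
                print(f"rank {rank} processing {work}")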
Sorry for the long answer, and I hope this helps :)