Reprocess AWS SQS Dead Letter Queue messages

I'm sending messages that failed in my Lambda to a dead letter queue using the AWS SDK. I want to wait a few hours before sending each message back to the main queue for reprocessing. I have a Lambda attached to my dead letter queue. I can set a delay when sending messages to the dead letter queue, but the maximum delay is 15 minutes and I want to wait longer than that. Has anyone done this before?

Amazon SQS is not intended to be used in this manner. Its primary purpose is to store messages and then provide them back when requested.
Some other options:
Store the message in a database and have the application search for relevant messages based on a timestamp field, or
Use AWS Step Functions, whose Wait state can pause an execution for much longer than 15 minutes

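As shown in this answer, you can extend the delay by giving each message a timestamp and processing only those messages whose timestamp has passed. A message that is not due yet is pushed back to the queue with the maximum delay of 15 minutes:
message = SQS.poll_messages
if message.perform_message_at > Time.now
  # Not due yet: re-queue with the same timestamp and the maximum delay (900 seconds)
  SQS.push_to_queue({ perform_message_at: message.perform_message_at }, delay: 900)
else
  # Due: process it now
  process_message(message)
end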
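The same idea can be sketched in Python. This is a minimal illustration, not a complete consumer: the `perform_at` field (epoch seconds) and the `requeue` and `process` callables are hypothetical names chosen for this sketch, and the 900-second cap reflects the 15-minute limit mentioned in the question. The delay for each hop is the remaining time, capped at that limit, so a message that is due in a few hours simply travels through the queue several times.

```python
import math
import time

MAX_SQS_DELAY_SECONDS = 900  # the 15-minute cap on DelaySeconds


def next_delay_seconds(perform_at: float, now: float) -> int:
    """Return the delay for the next hop, or 0 if the message is already due."""
    remaining = perform_at - now
    if remaining <= 0:
        return 0
    # Round up so the message never becomes visible before it is due.
    return min(MAX_SQS_DELAY_SECONDS, math.ceil(remaining))


def handle(message: dict, requeue, process, now=None) -> str:
    """Re-queue a message that is not due yet; otherwise process it.

    `requeue(message, delay_seconds)` and `process(message)` are hypothetical
    callables supplied by the caller (for example, thin wrappers around an
    SQS client).
    """
    now = time.time() if now is None else now
    delay = next_delay_seconds(message["perform_at"], now)
    if delay > 0:
        requeue(message, delay)
        return "requeued"
    process(message)
    return "processed"
```

With a real queue, `requeue` would send a new copy of the message with the computed delay and then delete the received one, so that the original does not reappear after its visibility timeout.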

Related

Concurrent processing of user Messages in SQS

We have several consumer processes which poll from a standard SQS and process the message. Each message is associated with a user. For each user, we can process 100 messages per minute. Beyond that, the API which we are using for processing would start giving 500 errors.
The queue also contains messages for other users who are still under their quota, and we would like to keep processing those, but SQS gives us no way to cherry-pick messages by user.
One solution to this is using FIFO and implementing message groups. But FIFO has a peculiar limitation.
You can have a maximum of 20,000 in-flight messages
This would have been completely fine, but the issue is that when a message is in flight from a message group, SQS adds the count of all the messages in that group to the in-flight count.
This article explains more in detail:
https://tomgregory.com/3-surprising-facts-about-aws-sqs-fifo-queues/#:~:text=A%20FIFO%20queue%20has%20a%20maximum%20inflight%20message%20limit%20of%2020%2C000.
In this article read "20,000 message buffer" header. That might explain what's happening.
https://aws.amazon.com/premiumsupport/knowledge-center/sqs-message-backlog/
The second solution which I could think of is to make the producer of the microservice smart. But in our case, the producer is a completely different microservice. And the owners of that microservice hardly listen.
We definitely want our consumers to scale to provide minimum wait time to each user but can't because of the above reasons.
I genuinely feel SQS was not the correct choice for this design, but can't convince my superiors of the same.
Is there a way we can overcome this situation or did we hit a dead end?
"This would have been completely fine, but the issue is that when a message is in flight from a message group, SQS adds the count of all the messages in that group to the in-flight count"
I do not think this is the case for a FIFO queue. I am using a FIFO queue with 3 consumers, each processing one message at a time. The queue holds multiple messages from the same message group, but my in-flight message count is always 3, i.e. one message per consumer. The processing time for each message varies, and whenever a consumer finishes a message it picks up the next one in the queue. The in-flight message count stays at 3 the whole time.

SQS Lambda Trigger polling rate

I'm trying to understand how SQS Lambda triggers work when polling for messages from the queue.
Criteria
I'm trying to make sure that not more than 3 messages are processed within a period of 1 second.
Idea
My idea is to set the trigger BatchSize to 3 and set the ReceiveMessageWaitTimeSeconds of the queue to 1 second. Am I thinking about this correctly?
Edit:
I did some digging and it looks like I can set a concurrency limit on my Lambda. If I set the concurrency limit to one, only one batch of messages gets processed at a time. If my Lambda runs for a second, then the next batch of messages gets processed at least a second later. The gotcha is that long polling automatically scales the number of asynchronous polls on the queue based on message volume. This means the Lambda can potentially throttle when a large number of messages comes in. When the Lambda throttles, the messages go back to the queue until they eventually end up in the DLQ.
ReceiveMessageWaitTimeSeconds is used for long polling. It is the length of time, in seconds, for which a ReceiveMessage action waits for messages to arrive (docs). Long polling does not mean that your client will wait for the full length of the time set. If you have it set to one second, but in the queue we already have enough messages, your client will consume them instantaneously and will try to consume again as soon as processing is completed.
If you want to consume a certain number of messages at a certain rate, you have to do this in your application (for example, by consuming messages on a scheduled basis). SQS by itself does not provide the kind of rate limiting you are trying to accomplish.
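As a rough sketch of what application-side limiting could look like, the loop below pulls at most three messages per one-second window and waits out the rest of the window before asking for more. The `receive` and `process` callables are hypothetical stand-ins for your own queue access and business logic, and the clock and sleep functions are injectable only to make the logic easy to test. Note that this is a fixed-window limit: it caps the number of messages started in each window, which is a weaker guarantee than a sliding one-second limit.

```python
import time
from typing import Callable, List


def consume_at_fixed_rate(
    receive: Callable[[int], List[dict]],
    process: Callable[[dict], None],
    max_per_window: int = 3,
    window_seconds: float = 1.0,
    windows: int = 1,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Process at most `max_per_window` messages per window; return the total processed."""
    total = 0
    for _ in range(windows):
        started = clock()
        # Never handle more than the allowed number, even if `receive` returns extra.
        for message in receive(max_per_window)[:max_per_window]:
            process(message)
            total += 1
        # Wait out the rest of the window before asking for more messages.
        leftover = window_seconds - (clock() - started)
        if leftover > 0:
            sleep(leftover)
    return total
```

A loop like this would run in a long-lived consumer process rather than behind a Lambda trigger, since the trigger controls its own polling.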

SQS - Schedule a message to be delivered

I would like to publish a message on SQS and process that message after a few hours.
How can I schedule a message delivery or select messages from SQS based on some attribute?
I've implemented an SQS consumer, but I'm receiving every message from the queue. Is it possible to implement something like that on SQS? I was thinking about receiving every message and sending it to the queue again if it's not yet time to process it.
There is a feature called Delay Queues in SQS: if you set a delay on the queue, any message put on the queue becomes available to consumers only after the delay has elapsed. However, the maximum delay is 15 minutes, so if you need a delay of a few hours this will not work for you directly.
The other option is to set a visibility timeout for the messages that is higher than the delay you want (the maximum visibility timeout is 12 hours). When you read a message, you can get its timestamp. If some of the delay still remains, you can put your consumer to sleep for the remaining time and process the message once it wakes up. However, this is not recommended and is highly inefficient, because your threads are blocked while they sleep. A better variant is to hold the message in a local list or array while some of the delay remains, keep checking for other messages, and process the held message once its delay has passed. All of this logic has to live in your own code, though; AWS does not provide a ready-made feature for it.
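The "hold it locally" variant could be sketched with a small in-memory buffer ordered by due time. `HoldBuffer` and its method names are hypothetical, invented for this illustration. Keep in mind that a held message is still in flight from the queue's point of view, so its visibility timeout has to outlast the hold, and that anything in the buffer is lost from memory if the consumer crashes (the message then reappears on the queue once its visibility timeout expires).

```python
import heapq
import itertools
from typing import Any, List, Tuple


class HoldBuffer:
    """Keeps messages in memory until their due time (epoch seconds)."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, Any]] = []
        # Tie-breaker so two messages with the same due time are never compared.
        self._counter = itertools.count()

    def hold(self, due_at: float, message: Any) -> None:
        """Store a message that must not be processed before `due_at`."""
        heapq.heappush(self._heap, (due_at, next(self._counter), message))

    def pop_due(self, now: float) -> List[Any]:
        """Remove and return every held message whose due time has passed."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due

    def __len__(self) -> int:
        return len(self._heap)
```

A consumer loop would call `hold` for each message that is not due yet and call `pop_due(time.time())` between polls to pick up the messages that are ready.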

Can I tell if an Amazon SQS message is still in flight?

Given an Amazon SQS message, is there a way to tell if it is still in flight via the API? Or, would I need to note the timestamp when I receive the message, subtract that from the current time, and check if that is less than the visibility timeout?
The normal flow for using Amazon Simple Queue Service (SQS) is:
A message is pushed onto a queue using SendMessage (it can remain in the queue for up to 14 days)
An application uses ReceiveMessage to retrieve a message from the queue (no guarantee of first-in-first-out)
When the application has finished processing the message, it calls DeleteMessage (it can also call ChangeMessageVisibility to extend the time until it times-out)
If the application does not delete the message within a pre-configured time period, SQS makes the message reappear on the queue
If a message is retrieved from the queue more than a pre-configured number of times, the message can be moved to a Dead Letter queue
It is not possible to obtain information about a specific message. Rather, the application asks for a message (or a batch of messages), upon which the message becomes invisible (or 'in flight'). This also gives access to a ReceiptHandle that can be used with DeleteMessage or ChangeMessageVisibility.
The closest option is to call GetQueueAttributes. The value for ApproximateNumberOfMessagesNotVisible will indicate the number of in-flight messages but it will not give insight into a particular message.
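Both ideas can be sketched in a few lines of Python. The first function assumes `sqs_client` is a boto3 SQS client (for example `boto3.client("sqs")`) and reads the queue-level attribute mentioned above. The second is the do-it-yourself estimate from the question; the function name and its parameters are hypothetical, and the result is only a guess, because a later ChangeMessageVisibility call or a delete by another worker changes the real state without your timestamp knowing about it.

```python
def in_flight_count(sqs_client, queue_url: str) -> int:
    """Return the approximate number of in-flight messages for a queue."""
    response = sqs_client.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=["ApproximateNumberOfMessagesNotVisible"],
    )
    # SQS returns attribute values as strings.
    return int(response["Attributes"]["ApproximateNumberOfMessagesNotVisible"])


def probably_in_flight(received_at: float, visibility_timeout: float, now: float) -> bool:
    """Estimate whether a message received at `received_at` is still invisible.

    All three arguments are in seconds on the same clock. This is an estimate
    based on local bookkeeping, not something SQS reports per message.
    """
    return now - received_at < visibility_timeout
```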

How to handle large Emailing queue and delivery with AWS SES?

We are developing an app that needs to handle large email queues. We plan to store emails in an SQS queue and use SES to send them, but we are a bit confused about how to actually handle and process the queue. Should I use a cron job to regularly read the SQS queue and send emails? What would be the best way to trigger the script that sends the emails from our app?
Using SQS with SES is a great way to handle this. If something goes wrong while emailing the request will still be on the queue and will be processed next time around.
I just use a cron job that starts my queue processing/email sending job once an hour. The job runs for an hour as a simple loop:
while I've been running < 1 hour:
    if there's a message in the queue:
        process the message
        delete the message from the queue
I set the WaitTimeSeconds parameter to the maximum (20 seconds) so that the check for a new message will wait a while for a new message if necessary so that the job isn't hitting AWS every few milliseconds. Otherwise, I could put a sleep statement of some kind in the loop.
The reason I run for just an hour is that the job might encounter some error that kills it, or have a memory leak, or some other unanticipated problem. This way any queued email requests will still get handled the next time the job is started.
If you want, you can start the job every fifteen minutes so you'll always have four worker processes handling queue requests. If one of them dies for some reason, you'll still be processing with the other three.
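The loop described above might look roughly like this in Python. It is a sketch under assumptions: `sqs_client` is taken to be a boto3 SQS client, while `send_email` is a hypothetical callable that would wrap the SES call, and the injectable `clock` exists only so the time budget can be tested. The message is deleted only after the send succeeds, so a failure leaves it on the queue for the next run.

```python
import time


def run_worker(sqs_client, queue_url, send_email, max_runtime_seconds=3600, clock=time.monotonic):
    """Poll the queue until the time budget is spent; return the number of messages handled."""
    deadline = clock() + max_runtime_seconds
    handled = 0
    while clock() < deadline:
        response = sqs_client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=1,
            WaitTimeSeconds=20,  # long polling: wait up to 20 seconds for a message
        )
        # The "Messages" key is absent when nothing arrived within the wait time.
        for message in response.get("Messages", []):
            send_email(message["Body"])
            # Delete only after a successful send so failures are retried later.
            sqs_client.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            handled += 1
    return handled
```

Started from cron once an hour with the default budget, this matches the one-hour job described above; starting it every fifteen minutes gives the four overlapping workers.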