AWS SQS Boto3 sending messages to dead letter manually - amazon-web-services

So I am building a small application that uses SQS. I have a simple handler process that determines if a given message is considered processed, marked for retry (to be re-queued) or is not able to be processed (should be sent to dead letter).
However, based on the docs, it appears the only way to truly send a message to the DL queue is a redrive policy, which operates on the number of receives a message has accumulated. Because of the nature of my application, I could have several valid retries if my process isn't ready to handle a given message, but there are also times I may want to DL a message I have just received. Does AWS/Boto3 not provide a way to mark a specific message for DL?
I know I can just send the message myself to another queue that I treat as my own DL queue; I would just rather use AWS's built-in tools for this.

I don't believe there is any limitation that would prevent you from sending the message to the dead-letter queue yourself.
So just read the message from the Q, if you know it needs to go to the DLQ directly, send it to the DLQ and remove it from the regular Q.
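A minimal sketch of that manual move, assuming `sqs` is a boto3 SQS client (for example `boto3.client("sqs")`) and `message` is one entry from a `receive_message` response. The function name and the two queue URL parameters are placeholders for illustration, and message attributes are not copied here.

```python
def move_to_dlq(sqs, source_url, dlq_url, message):
    """Copy one received message to the DLQ, then delete it from the source queue."""
    # Send first: if this call raises, the message is not deleted and will
    # reappear on the source queue after its visibility timeout.
    sqs.send_message(QueueUrl=dlq_url, MessageBody=message["Body"])
    # Delete only after the copy succeeded.
    sqs.delete_message(QueueUrl=source_url, ReceiptHandle=message["ReceiptHandle"])
```

Sending before deleting means a crash between the two calls leaves a duplicate rather than a lost message, which is usually the safer failure mode.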

Related

Is it possible to know how many times an SQS message has been read?

I have a use case where my code needs to know how many times an SQS message has been read.
For example, we read a message from SQS and, because of some reason or exception, we can't process it. The same message then becomes available in the queue again after the visibility timeout.
This creates an endless loop. Is there a way to know how many times a particular SQS message has been read and returned to the queue?
I am aware this can be handled with a dead letter queue. Since that requires more effort, I am checking whether there is any other option.
I don't want to retry the message if it fails more than x times; I want to delete it instead. Is that possible in SQS?
You can do this manually by looking at the ApproximateReceiveCount attribute of your messages, see this question on how to do so. You just need to implement the logic to read the count and decide whether to try processing the message or delete it. Note however that the receive count is affected by more than just programmatically processing messages: viewing messages in the console will increment it too.
That being said, a DLQ is a premade solution for exactly this use case. It's not a lot of additional work: all you have to do is create another SQS queue, set it as the DLQ of your processing queue, and set the number of retries. Then the DLQ handles all your redrive logic, and instead of being deleted after n failures, messages are moved to the DLQ, where you can manually look at them to understand why they're failing, set metric alarms on the queue, and, if you want, manually redrive the messages into your processing queue. Or just ignore them until they age out of the queue based on its retention policy. The important thing is that the DLQ gives you the option of seeing which messages failed after the fact, while deleting them outright does not.
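As a rough sketch of the "set it as the DLQ" step, assuming `sqs` is a boto3 SQS client: the redrive policy is a JSON string stored in the queue's `RedrivePolicy` attribute. The helper names and the default of 3 receives are illustrative choices, not values from the answer.

```python
import json


def build_redrive_attributes(dlq_arn, max_receive_count=3):
    """Build the Attributes dict that points a queue at its dead letter queue."""
    policy = {
        "deadLetterTargetArn": dlq_arn,
        # Number of receives after which SQS moves the message to the DLQ.
        "maxReceiveCount": str(max_receive_count),
    }
    return {"RedrivePolicy": json.dumps(policy)}


def attach_dlq(sqs, queue_url, dlq_arn, max_receive_count=3):
    """Apply the redrive policy to the processing queue."""
    sqs.set_queue_attributes(
        QueueUrl=queue_url,
        Attributes=build_redrive_attributes(dlq_arn, max_receive_count),
    )
```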
When calling ReceiveMessage(), you can specify a list of AttributeNames that you would like returned.
One of these attributes is ApproximateReceiveCount, which returns "the number of times a message has been received across all queues but not deleted".
It is an 'approximate' count due to the highly parallel nature of SQS -- it is possible that the count is slightly off if a message was processed around the same time as this request.

How to create a pipeline to read from both topic and dead letter and wait until the system is recovered to insert back the messages

I have a dead-letter queue for a Pub/Sub cloud function that receives messages through a PUSH subscription. At the moment the service successfully sends those messages to the dead-letter topic and the dead-letter subscription when it fails. However, I am uncertain how to carry on once a message reaches this dead-letter subscription.
I don't want to lose the messages that have been sent to the dead letter so my idea would be that in case of failures from the service to acknowledge the message after predefined delivery attempts, the message will be forwarded to a dead letter topic. The same service, when back to life, can pull the messages from the dead-letter topic as well to see what it missed during the times of unavailability.
There is a similar post in here but the answer only points to the options but not the solutions, and unfortunately, I haven't been able to find it.
There is also a mention in here about this issue where it's actually from where I have taken my question.
Please, could somebody point me in the right direction? Is there a better way?
If the main process isn't able to process the messages, you have to rely on the retry mechanism of PubSub.
If you put the messages in a dead letter topic, it's because you can't process them with the main function. So handling them is another process, another function. You shouldn't process the dead letter messages with the same function (in fact you can, but it's an anti-pattern).
A good pattern is to save your dead letter messages somewhere. When your main function is up again, trigger a process that reads the saved messages and republishes them to the main topic.
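The republish step could look roughly like the sketch below. It assumes `publisher` is a `google.cloud.pubsub_v1.PublisherClient` and that the saved messages were stored as dicts with a `data` field (bytes) and an optional `attributes` field; that storage shape and the function name are hypothetical and not part of the answer.

```python
def republish_saved(publisher, topic_path, saved_messages):
    """Publish each saved dead letter message back to the main topic."""
    message_ids = []
    for saved in saved_messages:
        # publish() returns a future; result() blocks until the message is
        # accepted and returns its server-assigned message ID.
        future = publisher.publish(
            topic_path, saved["data"], **saved.get("attributes", {})
        )
        message_ids.append(future.result())
    return message_ids
```

Waiting on each future keeps the sketch simple; removing a saved message from storage should only happen after its publish has succeeded.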

AWS SQS Dead Letter Queue notifications

I'm trying to design a small message processing system based on SQS, Lambda, and SNS. In case of failure, I'd like for the message to be enqueued in a Dead Letter Queue (DLQ) and for a webhook to be called.
I'd like to know what the most canonical or reasonable way of achieving that would look like.
Currently, if everything goes well, the process should be as follows:
1. SQS (in place to handle retries) enqueues a message
2. Lambda gets invoked by SQS and processes the message
3. Lambda sends a webhook and finishes normally
If something in the lambda goes wrong (success webhook cannot be called, task at hand cannot be processed), the easiest way to achieve what I want seems to be to set up a DLQ1 that SQS would put the failed messages in. An auxiliary lambda would then be called to process this message, pass it to SNS, which would call the failure webhook, and also forward the message to DLQ2, the final/true DLQ.
Is that the best approach?
One alternative I know of is Alarms, though I've been warned that they are quite tricky. Another one would be to have lambda call the error reporting webhook if there's a failure on the last retry, although that somehow seems inappropriate.
Thanks!
Your architecture looks good enough in case of success, but I personally find it quite confusing if anything goes wrong as I don't see why you need two DLQs to begin with.
Here's what I would do in case of failure:
1. Define a DLQ on your source SQS Queue and set the maxReceiveCount to e.g. 3, meaning if messages fail three times, they will be redirected to the configured DLQ.
2. Create a Lambda that listens to this DLQ.
3. Execute the webhook inside this Lambda.
4. Since step 3 automatically deletes the message from the Queue once it has been processed and, apparently, you want the messages to be persisted somewhere, store the content of the message in a file on S3 and store the file metadata (bucket and key) in a table in DynamoDB, so you can always query for failed messages.
I don't see any role for SNS here unless you want multiple subscribers for a given message, but as I see this is not the case.
This way, you need to maintain only one DLQ and you can get rid of SNS, as it's only adding an extra layer of complexity to your architecture.
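A sketch of what the DLQ-listening Lambda might do, under several assumptions: `s3` and `dynamodb` are boto3 clients, the bucket, table, webhook URL, key prefix, and DynamoDB attribute names are all placeholders, and the webhook accepts a JSON POST. The messages are persisted before the webhook is called so that a failing webhook does not lose them.

```python
import json
import urllib.request


def post_webhook(url, payload):
    """POST a JSON payload to the failure webhook and return the HTTP status."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def handle_dlq_event(event, s3, dynamodb, bucket, table, webhook_url, notify=post_webhook):
    """Persist each failed message to S3, index it in DynamoDB, call the webhook."""
    stored_keys = []
    for record in event["Records"]:
        message_id = record["messageId"]
        key = f"failed/{message_id}.json"  # hypothetical key layout
        s3.put_object(Bucket=bucket, Key=key, Body=record["body"].encode("utf-8"))
        dynamodb.put_item(
            TableName=table,
            Item={
                "message_id": {"S": message_id},
                "s3_bucket": {"S": bucket},
                "s3_key": {"S": key},
            },
        )
        notify(webhook_url, {"messageId": message_id, "bucket": bucket, "key": key})
        stored_keys.append(key)
    return stored_keys
```

In a real Lambda, the entry point would build the clients once at module level and call `handle_dlq_event(event, ...)` from the handler function.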

AWS SQS FIFO - How to get more than 10 messages at a time?

Currently we want to pull down an entire FIFO queue, and process the contents, and if any issues, release messages back into the queue.
The problem is, that currently AWS only gives us 10 messages, and won't give us 10 more (which is the way you get bulk messages in SQS, multiple 10 max message requests) until we delete or release the first 10.
We need to get more than 10 though. Is this not possible? We understand we can set the group_id to a random string, and that allows processing more, but then the order isn't guaranteed, which defeats the purpose of FIFO.
I managed to reproduce your results -- I could retrieve 10 messages, but then running the same command again would not return another set of messages.
The relevant documentation seems to be:
While messages with a particular MessageGroupId are invisible, no more messages belonging to the same MessageGroupId are returned until the visibility timeout expires. You can still receive messages with another MessageGroupId as long as it is also visible.
I suspect (just a theory!) that this is to preserve the ordering of messages... If a client asked for a set of messages and they are still being processed, there is the chance that the messages might be returned to the queue. Therefore, no further messages are provided until the original messages are deleted or pass their visibility timeout.
This is only a behaviour of FIFO queues.
It seems that you will need to receive and delete all messages to be able to access them all. I would suggest:
1. Receive one (or more) message.
2. Process it. If everything worked, delete the message.
3. If there were problems, push the message to a new queue.
4. Once the queue is empty, you would need to read from the new queue and send them back to the original queue (which should preserve ordering).
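The first three steps above can be sketched as follows, assuming `sqs` is a boto3 SQS client, the holding queue is also a FIFO queue, and `handler` raises when a message cannot be processed. The function name, the reuse of the original `MessageId` as the deduplication ID, and treating one empty long-poll response as "queue drained" are simplifying assumptions.

```python
def drain_fifo(sqs, queue_url, holding_url, handler):
    """Drain a FIFO queue, parking failed messages on a separate FIFO queue."""
    counts = {"processed": 0, "parked": 0}
    while True:
        response = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,
            AttributeNames=["MessageGroupId"],
            WaitTimeSeconds=1,
        )
        messages = response.get("Messages", [])
        if not messages:
            return counts
        for message in messages:
            try:
                handler(message["Body"])
                counts["processed"] += 1
            except Exception:
                # Park the failed message, keeping its original group so the
                # relative order within the group is kept on the holding queue.
                sqs.send_message(
                    QueueUrl=holding_url,
                    MessageBody=message["Body"],
                    MessageGroupId=message["Attributes"]["MessageGroupId"],
                    MessageDeduplicationId=message["MessageId"],
                )
                counts["parked"] += 1
            # Deleting either way unblocks the next messages of this group.
            sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
```

Step 4 would be the same loop run in the other direction, with a handler that always fails over to the original queue, or a plain receive/send/delete pass.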
If you frequently require more capabilities than Amazon SQS provides, you could consider using Amazon MQ – Managed message broker service for ActiveMQ. It has many more capabilities (but is accordingly less 'simple').
If you set another MessageGroupId, you can get another 10 messages, even if you don't release or delete the previous ones.
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/using-messagegroupid-property.html

Does SQS really send multiple S3 PUT object records per message?

I've set up an S3 bucket to emit an event on PUT object to SQS, and I'm handling the SQS queue in an EB worker tier.
The schema for the message that SQS sends is here: http://docs.aws.amazon.com/AmazonS3/latest/dev/notification-content-structure.html
Records is an array, implying that there can be multiple records sent in one POST to my worker's endpoint. Does this actually happen? Or will my worker only ever receive one record per message?
The worker can only return one response, either 200 (message handled successfully) or non-200 (message not handled successfully, which puts it back into the queue), regardless of how many records in the message it receives.
So if my worker receives multiple records in a message, and it handles some successfully (say by doing something with side effects such as inserting into a database) but fails on one or more, how should I handle that? If I return 200, then the ones that failed will not be retried. But if I return non-200, then the ones that were handled successfully will be retried unnecessarily, and possibly re-inserted. So I'd have to make my worker smart enough to retry only the failed ones -- which is logic I'd prefer not having to write.
This would be much easier if only one record was ever sent per message. So if that's the case in practice, despite records being an array, I'd really like to know!
To be clear, it's not the records that "SQS sends." It's the records that S3 sends to SQS (or to SNS, or to Lambda).
Currently, all S3 event notifications have a single event per notification message. We might include multiple records as we add new event types in the future. This is also a message format that is shared across other AWS services, and other services can include multiple records.
— https://forums.aws.amazon.com/thread.jspa?messageID=592264&#592264
So, for the moment, it appears there's only one record per message.
But... you are making a mistake if you assume your application need not be prepared to handle repeated or duplicate messages. In any massive and distributed system like SQS it is extremely difficult to absolutely guarantee that this can never happen, however unlikely:
Q: How many times will I receive each message?
Amazon SQS is engineered to provide “at least once” delivery of all messages in its queues. Although most of the time each message will be delivered to your application exactly once, you should design your system so that processing a message more than once does not create any errors or inconsistencies.
— http://aws.amazon.com/sqs/faqs/
Incidentally, in my platform, more than one entry in the records array is considered an error, causing the message to be abandoned and sent to the dead letter queue for review.
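Both points (rejecting multi-record messages and tolerating duplicate deliveries) can be sketched like this. The function names are hypothetical, and the in-memory `seen` set is only a stand-in for a durable store such as a database unique constraint; it also treats a repeated upload of the same key as a duplicate, which a real system may not want. Object keys in S3 notifications are URL-encoded, hence the `unquote_plus` call.

```python
import json
from urllib.parse import unquote_plus


def extract_single_record(body):
    """Return (bucket, key) from an S3 notification, refusing multi-record bodies."""
    records = json.loads(body).get("Records", [])
    if len(records) != 1:
        # Treat anything else as an error so the message ends up in the DLQ.
        raise ValueError(f"expected exactly 1 record, got {len(records)}")
    s3_info = records[0]["s3"]
    return s3_info["bucket"]["name"], unquote_plus(s3_info["object"]["key"])


def handle_once(body, seen, process):
    """Run process(bucket, key) only if this object has not been handled yet."""
    bucket, key = extract_single_record(body)
    if (bucket, key) in seen:
        return False  # duplicate delivery: acknowledge without re-processing
    process(bucket, key)
    seen.add((bucket, key))
    return True
```

With this shape the worker can return 200 for duplicates and a non-200 status whenever `ValueError` is raised.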