Subscribing to AWS SQS Messages - amazon-web-services

I have large number of messages in AWS SQS Queue. These messages will be pushed to it constantly by other source. There are no proper dynamic on how often those messages will be pushed to queue. Currently, I keep polling SQS every second and checking if there are any messages available in there. Is there any better way of handling this, like receiving notification from SQS or SNS that some messages are available so that I only request SQS when I needed instead of constant polling?

The way to do what you want is to use long polling - rather than constantly poll every second, you open a request that stays open until it either times out or a message comes into the queue. Take a look at the documentation for ReceiveMessageRequest
ReceiveMessageRequest req = new ReceiveMessageRequest()
.withWaitTimeSeconds(Integer.valueOf(20)); // set long poll timeout to 20 sec
// set other properties on the request as well
ReceiveMessageResult result = amazonSQS.receiveMessage(req);
A common usage pattern for this is to have a background thread running the long poll and pushing the results into an internal queue (such as LinkedBlockingQueue or an ExecutorService) for a worker thread to read from.
PS. Don't forget to call deleteMessage once you're done processing the result so you don't end up receiving it again.

You can also use the worker functionality in AWS Elastic Beanstalk. It allows you to build a worker to process each message, and when you use Elastic Beanstalk to deploy it to an EC2 instance, you can define it as subscribed to a specific queue. Then each message will be POST to the worker, without your need to call receive-message on it from the queue.
It makes your system wiring much easier, as you can also have auto scaling rules that will allow you to spawn multiple workers to handle more messages in time of peak load, and scale down back to a single worker, when the load is low. It will also delete the message automatically, if you respond with OK from your worker.
See more information about it here: http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features-managing-env-tiers.html

You could also have a look at Shoryuken and the property delay:
delay: 25 # The delay in seconds to pause a queue when it's empty
But being honest we use delay: 0 here, the cost of SQS is inexpensive:
First 1 million Amazon SQS Requests per month are free
$0.50 per 1 million Amazon SQS Requests per month thereafter ($0.00000050 per SQS Request)
A single request can have from 1 to 10 messages, up to a maximum total payload of 256KB.
Each 64KB ‘chunk’ of payload is billed as 1 request. For example, a single API call with a 256KB payload will be billed as four requests.
You will probably spend less than 10 dollars monthly polling messages every second 24x7 in a single host.
One of the advantages of Shoryuken is that it fetches in batch, so it saves some money compared with a fetch per message solutions.

Related

Kinesis vs SQS, which is the best for this particular case?

I have been reading about Kinesis vs SQS differences and when to use each but I'm struggling to know which is the appropriate solution for this particular problem:
Strava-like app where users record their runs
50 incoming runs per second
The processing of each run takes exactly 1 minute
I want the user to have their results in less than 5 minutes
A run is just a guid, the job that processes it will get al the info from S3
If i understand correctly in kinesis you can have 1 worker per shard, correct? That would mean 1 runs per minute. Since i have 3000 incoming runs per minute, to meet the 5 minute deadline would mean i would need to have 600 shards with 1 worker each.
Is this assumption correct?
With SQS I can just have 1 queue and as many workers as I like, up to SQS's limit of 120,000 inflight messages.
If 1 run errors during processing I want to reprocess it a few more times and then store it for further inspection.
I don't need to process messages in order, and duplicates are totally fine.
1 worker per message, after it's processed i no longer care about the message
In that case, a queuing services such as SQS should be used. Kinesis is a streaming service, which persist a data. This means that multiple works can read messages from a stream for as long as they are valid. Non of your workers would be able to remove the message from the stream.
Also with SQS you can setup dead-letter queues which would allow you capture messages with fail to process after a pre-defined number of trials.

SQS Lambda Trigger polling rate

I'm trying to understand how SQS Lambda Triggers works when polling for messages from the Queue.
Criteria
I'm trying to make sure that not more than 3 messages are processed within a period of 1 second.
Idea
My idea is to set the trigger BatchSize to 3 and setting the ReceiveMessageWaitTimeSeconds of the queue to 1 second. Am I thinking about this correctly?
Edit:
I did some digging and looks like I can set a concurrency limit on my Lambda. If I set my Lambda concurrency limit to one that ensures only one batch of message gets processed at a time. If my lambda runs for a second, then the next batch of messages gets processed at least a second later. The gotcha here is long-polling auto scales the number of asychronous polling on the queue based on message volume. This means, the lambdas can potentailly throttle when a large number of messages comes in. When the lambdas throttle, the message goes back to the queue until it eventually goes into the DLQ.
ReceiveMessageWaitTimeSeconds is used for long polling. It is the length of time, in seconds, for which a ReceiveMessage action waits for messages to arrive (docs). Long polling does not mean that your client will wait for the full length of the time set. If you have it set to one second, but in the queue we already have enough messages, your client will consume them instantaneously and will try to consume again as soon as processing is completed.
If you want to consume certain number of messages at certain rate, you have do this on your application (for example consumes messages on a scheduled basis). SQS by itself does not provide any kind of rate limiting similar to what you would want to accomplish.

Delay in getting messages from AWS SQS

I am adding messages in SQS on Lambda and then receiving the messages inside a container on ECS.
The problem is there is a 10-15 seconds of delay when I am receiving the messages on the container.
On the container a loop is running indefinitely every 1 second where I am getting messages and if available processing it.
Example:
Suppose the message is added in SQS at 15:20:00 but I am able to get that message at 15:20:15 on ECS. These 15 seconds are too long for my use case.
Can this time be reduced ?
Assuming that there are multiple producers and consumers is there any alternative solution ?
If your workers are continually polling the Amazon SQS queue, they can reduce the amount of requests by specifying WaitTimeSeconds=20 (which is its maximum value).
This tells Amazon SQS to wait until at least one message is available, to a maximum of 20 seconds. If no messages are available after 20 seconds, an empty set of messages is returned. However, if one or more messages appear in the queue, then the call returns immediately without waiting for 20 seconds.
This reduces the frequency of calls to SQS and might increase stability in your application.

Optimizing SQS Lambda configuration for single concurrency microservice composition

Apologies for the title. It's hard to summarise what I'm trying to accomplish.
Basically let me define a service to be an SQS Queue + A Lambda function.
A service (represented by square brackets below) performs a given task, where the queue is the input interface, processes the input, and outputs on to the queue of the subsequent service.
Service 1 Service 2 Service 3
[(APIG) -> (Lambda)] -> [(SQS) -> (Lambda)] -> [(SQS) -> (Lambda)] -> ...
Service 1: Consumes the request and payload, splits it into messages and passes on to the queue of the next service.
Service 2: This service does not have a reserved concurrency. It validates each message on the queue, and if valid, passes on to the next service.
Service 3: Processes each message in the queue (ideally in batches of approximately 100). The lambda here must have a reserved concurrency of 1 (as it hits an API that can't process multiple requests concurrently).
Currently I have the following configuration on Service 3.
Default visibility timeout of queue = 5 minutes
Lambda timeout = 5 minutes
Lambda reserved concurrency = 1
Problem 1: Service 3 consumes x items off the queue and if it finishes processing them within 30 seconds I expect the queue to process the next x items off the queue immediately (ideally x=100). Instead, it seems to always wait 5 minutes before taking the next batch of messages off the queue, even if the lambda completes in 30 seconds.
Problem 2: Service 3 typically consumes a few messages at a time (inconsistent) rather than batches of 100.
A couple of more notes:
In service 3 I do not explicitly delete messages off the queue using the lambda. AWS seems to do this itself when the lambda successfully finishes processing the messages
In service 2 I have one item per message. And so when I send messages to Service 3 I can only send 10 items at a time, which is kind of annoying. Because queue.send_messages(Entries=x), len(x) cannot exceed 10.
Does anyone know how I solve Problem 1 and 2? Is it an issue with my configuration? If you require any further information please ask in comments.
Thanks
Both your problems and notes indicate misconfigured SQS and/or Lambda function.
In service 3 I do not explicitly delete messages off the queue using
the lambda. AWS seems to do this itself when the lambda successfully
finishes processing the messages.
This is definitely not the case here as it would go agains the reliability of SQS. How would SQS know that the message was successfully processed by your Lambda function? SQS doesn't care about consumers and doesn't really communicate with them and that is exactly the reason why there is a thing such as visibility timeout. SQS deletes message in two cases, either it receives DeleteMessage API call specifying which message to be deleted via ReceiptHandle or you have set up redrive policy with maximum receive count set to 1. In such case, SQS will automatically send message to dead letter queue when if it receives it more than 1 time which means that every message that was returned to the queue will be send there instead of staying in the queue. Last thing that can cause this is a low value of Message Retention Period (min 60 seconds) which will drop the message after x seconds.
Problem 1: Service 3 consumes x items off the queue and if it finishes
processing them within 30 seconds I expect the queue to process the
next x items off the queue immediately (ideally x=100). Instead, it
seems to always wait 5 minutes before taking the next batch of
messages off the queue, even if the lambda completes in 30 seconds.
This simply doesn't happen if everything is working as it should. If the lambda function finishes in 30 seconds, if there is reserved concurrency for the function and if there are messages in the queue then it will start processing the message right away.
The only thing that could cause is that your lambda (together with concurrency limit) is timing out which would explain those 5 minutes. Make sure that it really finishes in 30 seconds, you can monitor this via CloudWatch. The fact that the message has been successfully processed doesn't necessarily mean that the function has returned. Also make sure that there are messages to be processed when the function ends.
Problem 2: Service 3 typically consumes a few messages at a time
(inconsistent) rather than batches of 100.
It can never consume 100 messages since the limit is 10 (messages in the sense of SQS message not the actual data that is stored within the message which can be anywhere up to 256 KB, possibly "more" using extended SQS library or similar custom solution). Moreover, there is no guarantee that the Lambda will receive 10 messages in each batch. It depends on the Receive Message Wait Time setting. If you are using short polling (1 second) then only subset of servers which are storing the messages will be polled and a single message is stored only on a subset of those servers. If those two subsets do not match when the message is polled, the message is not received in that batch. You can control this by increasing polling interval, Receive Message Wait Time, (max 20 seconds) but even if there are not enough messages in the queue when the timer finishes, the batch will still be received with fewer messages, possibly zero.
And as it was mentioned in the comments, using this strategy with concurrency set to low number can lead to some problems. Another thing is that you need to ensure that rate at which messages are produced is somehow consistent with the time it takes for one instance of lambda function to process the message otherwise you will end up with constantly growing queue, possibly losing messages after they outlive the Message Retention Period.

Rate-limiting a Worker for a Queue (e.g.: SQS)

Every day, I will have a CRON task run which populates an SQS queue with a number of tasks which needs to be achieved. So (for example) at 9AM every morning, and empty queue will receive ~100 messages that will need to be processed.
I would like a new worker to be spun up every second until the queue is empty. If any task fails, it's put at the back of the queue to be re-run.
For example, if each task takes up to 1.5 seconds to complete:
after 1 second, 1 worker will have started message A
after 2 seconds, 1 worker may still be running message A and 1 worker will have started running message B
after 100 seconds, 1 worker may still be running message XX and 1 worker will pick up message B because it failed previous
after 101 seconds, no more workers are propagated until the next morning
Is there any way to have this type of infrastructure configured within AWS lambda?
One way, though I'm not convinced it's optimal:
A lambda that's triggered by an CloudWatch Event (say every second, or every 10 seconds, depending on your rate limit). Which polls SQS to receive (at most) N messages, it then "fans-out" to another Lambda function with each message.
Some pseudo code:
# Lambda 1 (schedule by CloudWatch Event / e.g. CRON)
def handle_cron(event, context):
# in order to get more messages, we might have to receive several times (loop)
for message in queue.receive_messages(MaxNumberOfMessages=10):
# Note: the Event InvocationType so we don't want to wait for the response!
lambda_client.invoke(FunctionName="foo", Payload=message.body, InvocationType='Event')
and
# Lambda 2 (triggered only by the invoke in Lambda 1)
def handle_message(event, context):
# handle message
pass
Seems to me you would be better of publishing you messages to SNS, instead of SQS and then have your lambda functions subscribe to the SNS topic.
Let Lambda worry about how many 'instances' it needs to spinup in response to the load.
Here is one blog post on this method, but google may help you find one that is closer to your actual use case.
https://aws.amazon.com/blogs/mobile/invoking-aws-lambda-functions-via-amazon-sns/
Why not just have a Lambda function that starts polling sqs at 9am, getting one message at a time and sleeping for a second between each message? Dead letter queues can handle retries. Stop execution after not receiving a message from SQS after x seconds.
It is a unique case where you don't actually want parallel processing.