Elastic Beanstalk Workers and the VisibilityTimeout period - amazon-web-services

According to http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features-managing-env-tiers.html:
If the application returns any response other than 200 OK, then Elastic
Beanstalk waits to put the message back in the queue after the configured
VisibilityTimeout period.
I have set the VisibilityTimeout to 1 minute. My app is returning a 400 error when processing the request. I see from the logs that the request is being re-tried every 2 seconds! I was expecting, based on the above, for it to retry every 60 seconds.
What am I missing?

This might not be the issue of the SQS queue at all. It is true that the message is returned to the queue only after the specified VisibilityTimeout, but it depends on how you are polling the messages.
If you do not access the queue directly (but use some kind of service to do it for you), you have another layer of complexity there.
There's a worker process in Elastic Beanstalk called sqsd that does the polling: it forwards each message to your application and deletes it from the queue once you respond with 200.
sqsd has a similar setting called InactivityTimeout. It specifies how long the daemon waits for a response from your application; if no response arrives within that time, the message becomes visible in the queue again and is retried.
My guess is that the cause of your problem is this InactivityTimeout.
If this is not the cause, try looking into the WaitTimeSeconds parameter of your SQS queue. With it set, a receive call returns immediately if there are messages in the queue; otherwise it waits up to the specified time.
I had a similar issue with an EC2 instance and I specified all the timeouts. In the end, it turned out it was caused by a bug in Tomcat; see this: https://forums.aws.amazon.com/thread.jspa?threadID=183473

Related

AWS Lambda triggered twice for a single SQS Message

I have a system where a Lambda is triggered with an SQS queue as its event source. Each message gets our own internal unique ID to differentiate between requests.
Lambda keeps the message in flight while processing it and deletes it from the queue automatically after the invocation, so ideally a message should never be processed twice.
But when I checked my logs, a message with the same unique ID had been processed twice, within 100 milliseconds of each other.
So it seems that two Lambdas were triggered for one message and something failed on the AWS side, either the visibility timeout or something else. I have read online that a few others have run into the same situation.
Can anyone who has been through this explain how they solved it? Or can people running scalable systems without this issue help me with the reasons why I could be having it?
Note: a single message was successfully executed twice; this was not a retry on failure.
I faced a similar issue, where a Lambda (let's call it lambda-1) is triggered through a queue and lambda-1 in turn invokes lambda-2 synchronously (https://docs.aws.amazon.com/lambda/latest/dg/invocation-sync.html). The message goes in flight, returns to the queue when the visibility timeout expires, and triggers lambda-1 again. This goes on in a loop.
As per the link above:
"For functions with a long timeout, your client might be disconnected
during synchronous invocation while it waits for a response. Configure
your HTTP client, SDK, firewall, proxy, or operating system to allow
for long connections with timeout or keep-alive settings."
Making async calls in lambda-1 can resolve this issue. In the case above, invoking lambda-2 with InvocationType='Event' returns immediately, so lambda-1 finishes and the message is deleted from the queue.

How can I know if SQS has a new message?

I am writing an application that uses a Lambda function to send requests to a Spring Boot application, which will call another service. I have to use SQS (it is required), so SQS sits between Lambda and Spring. The question is: how does my Spring application know when there is a new message in SQS?
I heard about long polling, but I don't know if this is what I need.
Do I need to set up a loop that keeps long polling forever, or something like that?
Is it efficient? I mean, if there are 10 messages in SQS, will the connection be opened ten times?
I also found a while loop being used here: Check for an incoming message in aws sqs
Thanks
The answer you linked is accurate.
You must write a program that polls SQS for a message (or up to 10 messages). It is more efficient to use long polling because it requires fewer calls.
If you wish to know about a message very quickly, then you will need to poll continually. That is, as soon as a call comes back and says "nothing to receive", you should call it again. To reduce the frequency of these calls, you can set long polling, up to a maximum of 20 seconds. This means that, if there are no messages in the queue, the ReceiveMessage() call will take 20 seconds before it returns a response of "no messages". If, however, a message arrives in the meantime, it will respond immediately. The long polling option is specified when making the ReceiveMessage() request.
If you do not require instant notification, your application could call less often (e.g. every minute, or every few minutes). This would involve fewer calls to Amazon SQS.
When making the ReceiveMessage() call, your application can request up to 10 messages. This means that multiple messages might be returned.
Once your application has finished processing a message, it must call DeleteMessage() to have the message removed from the queue. If the message is not deleted before its visibility timeout expires, it automatically reappears on the queue. This is a failsafe for cases where there is a problem with the application and the message doesn't get correctly processed.
This is a great video from the AWS re:Invent conference that explains Amazon SQS (and Amazon SNS) in detail: AWS re:Invent SVC 105: AWS Messaging

AWS SQS: Delay making available a message that failed to process

Here's my scenario:
I have an SQS queue which processes a number of tasks. Those tasks can, and often times do, fail. Their failure is common and somewhat expected.
When a task fails, I want to retry it after a certain amount of time, and fail the item into a DLQ after a certain amount of retries. I do not want to retry immediately.
I have a worker EB app which processes these tasks. When it succeeds, I return 200 (and the task is successfully removed from the queue). When it fails I return 404, and the task is immediately returned to the queue (and, thus, immediately retried). This is not desired, I'd like to delay this failed item before it is retried.
Is it possible to do this with a combination of visibility timeouts and delay queues?
You can do this natively with SQS by calling ChangeMessageVisibility on a message you just failed to process and setting the VisibilityTimeout to whatever you want: http://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_ChangeMessageVisibility.html
Answered my own question: it turns out I was looking in the wrong place (SQS config options, not EB config options). The magic setting I was looking for is "error visibility timeout" in the EB config options, which lets you control how long a failed item waits before it is returned to its queue.
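If you take the ChangeMessageVisibility route mentioned above and want the delay to grow with each failure, one option is to derive the timeout from the message's ApproximateReceiveCount attribute. The following is only a sketch of that calculation under some assumptions: the class name, the method name, and the 30-second base used in main are hypothetical, and the code does not call SQS itself. You would pass the result as the VisibilityTimeout in your own ChangeMessageVisibility call. The 43,200-second cap is the 12-hour maximum that SQS allows for a visibility timeout.

```java
// Hypothetical helper: not part of any AWS SDK.
final class RetryDelay {
    // SQS does not accept a visibility timeout above 12 hours.
    static final int MAX_VISIBILITY_SECONDS = 43_200;

    private RetryDelay() {}

    /**
     * Returns baseSeconds doubled once for every receive after the first,
     * capped at MAX_VISIBILITY_SECONDS.
     *
     * @param receiveCount value of ApproximateReceiveCount (1 on first delivery)
     * @param baseSeconds  delay to apply after the first failed attempt
     */
    static int visibilityTimeoutSeconds(int receiveCount, int baseSeconds) {
        if (receiveCount < 1 || baseSeconds < 1) {
            throw new IllegalArgumentException("receiveCount and baseSeconds must be >= 1");
        }
        long delay = baseSeconds;
        // Stop doubling as soon as the cap is reached, so the value cannot overflow.
        for (int i = 1; i < receiveCount && delay < MAX_VISIBILITY_SECONDS; i++) {
            delay *= 2;
        }
        return (int) Math.min(delay, MAX_VISIBILITY_SECONDS);
    }

    public static void main(String[] args) {
        // Show the delay that would be requested after each of the first five failures.
        for (int attempt = 1; attempt <= 5; attempt++) {
            System.out.println("attempt " + attempt + " -> "
                    + visibilityTimeoutSeconds(attempt, 30) + "s");
        }
    }
}
```

With a 30-second base, the first five failures map to 30, 60, 120, 240 and 480 seconds. Combined with a redrive policy on the queue, the message still ends up in the DLQ after the configured number of receives.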

Subscribing to AWS SQS Messages

I have a large number of messages in an AWS SQS queue. These messages are pushed to it constantly by another source, and there is no predictable pattern to how often they arrive. Currently, I poll SQS every second and check whether any messages are available. Is there a better way of handling this, like receiving a notification from SQS or SNS that some messages are available, so that I only request SQS when I need to instead of polling constantly?
The way to do what you want is to use long polling - rather than constantly poll every second, you open a request that stays open until it either times out or a message comes into the queue. Take a look at the documentation for ReceiveMessageRequest
ReceiveMessageRequest req = new ReceiveMessageRequest(queueUrl) // queueUrl: the URL of your queue
    .withWaitTimeSeconds(20); // long poll: wait up to 20 seconds for a message
// set other properties on the request as well
ReceiveMessageResult result = amazonSQS.receiveMessage(req);
A common usage pattern for this is to have a background thread running the long poll and pushing the results into an internal queue (such as LinkedBlockingQueue or an ExecutorService) for a worker thread to read from.
PS. Don't forget to call deleteMessage once you're done processing the result so you don't end up receiving it again.
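The background-thread pattern described above can be sketched with only the standard library. This is a minimal illustration, not a production consumer: BackgroundPoller is a hypothetical class, and the Supplier passed to it stands in for whatever performs the long poll (for example, a call to amazonSQS.receiveMessage that maps the result to message bodies). It assumes the supplier blocks while the queue is empty, as a long poll does.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch: one background thread long-polls, workers read from a buffer.
final class BackgroundPoller {
    // Stand-in for the long-poll call; assumed to block while the queue is empty.
    private final Supplier<List<String>> receive;
    private final BlockingQueue<String> buffer = new LinkedBlockingQueue<>();
    private final Thread thread;
    private volatile boolean running = true;

    BackgroundPoller(Supplier<List<String>> receive) {
        this.receive = receive;
        this.thread = new Thread(this::pollLoop, "sqs-poller");
        this.thread.setDaemon(true);
    }

    void start() {
        thread.start();
    }

    void stop() {
        running = false;
        thread.interrupt(); // wake the thread if the supplier is blocked in an interruptible call
    }

    // Repeatedly fetch a batch and hand each message body to the internal queue.
    private void pollLoop() {
        while (running) {
            for (String body : receive.get()) {
                buffer.add(body);
            }
        }
    }

    /** Worker side: waits up to timeoutMillis for the next body; returns null if none arrives. */
    String next(long timeoutMillis) {
        try {
            return buffer.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}
```

A real implementation would carry the receipt handle alongside each body so that the worker can delete the message once it has been processed.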
You can also use the worker functionality in AWS Elastic Beanstalk. It allows you to build a worker to process each message, and when you use Elastic Beanstalk to deploy it to an EC2 instance, you can define it as subscribed to a specific queue. Each message will then be POSTed to the worker, without you needing to call receive-message on the queue.
It makes your system wiring much easier, as you can also have auto scaling rules that spawn multiple workers to handle more messages at times of peak load and scale back down to a single worker when the load is low. It will also delete the message automatically if you respond with OK from your worker.
See more information about it here: http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features-managing-env-tiers.html
You could also have a look at Shoryuken and the property delay:
delay: 25 # The delay in seconds to pause a queue when it's empty
But to be honest, we use delay: 0 here, since SQS is inexpensive:
First 1 million Amazon SQS Requests per month are free
$0.50 per 1 million Amazon SQS Requests per month thereafter ($0.00000050 per SQS Request)
A single request can have from 1 to 10 messages, up to a maximum total payload of 256KB.
Each 64KB ‘chunk’ of payload is billed as 1 request. For example, a single API call with a 256KB payload will be billed as four requests.
You will probably spend less than 10 dollars a month polling for messages every second, 24x7, from a single host.
One of the advantages of Shoryuken is that it fetches in batches, so it saves some money compared with fetch-per-message solutions.
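To put a number on the estimate above, here is a small worked calculation using only the prices quoted in this answer (they may have changed since it was written). The class and method names are hypothetical, and it assumes a 30-day month and that every poll carries a payload of 64 KB or less, so each call is billed as a single request.

```java
// Hypothetical calculator based on the pricing quoted above; check current AWS pricing.
final class SqsPollingCost {
    static final long FREE_REQUESTS_PER_MONTH = 1_000_000L;
    static final double USD_PER_MILLION_REQUESTS = 0.50;
    static final long CHUNK_BYTES = 64L * 1024;

    private SqsPollingCost() {}

    /** Billed requests for one API call: every started 64 KB chunk counts as one request. */
    static long billedRequests(long payloadBytes) {
        return Math.max(1L, (payloadBytes + CHUNK_BYTES - 1) / CHUNK_BYTES);
    }

    /** Monthly cost in USD after subtracting the free tier. */
    static double monthlyCostUsd(long requestsPerMonth) {
        long billable = Math.max(0L, requestsPerMonth - FREE_REQUESTS_PER_MONTH);
        return billable * USD_PER_MILLION_REQUESTS / 1_000_000.0;
    }

    public static void main(String[] args) {
        long requests = 60L * 60 * 24 * 30; // one poll per second for 30 days
        System.out.println(requests + " requests cost USD " + monthlyCostUsd(requests));
        System.out.println("256 KB call is billed as " + billedRequests(256L * 1024) + " requests");
    }
}
```

One poll per second is 2,592,000 requests in 30 days; after the free million, the remaining 1,592,000 requests come to roughly $0.80, well under the 10-dollar figure.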

SQS Messages never get removed/deleted after script runs

I'm having issues where my SQS Messages are never deleted from the SQS Queue. They are only removed when the lifetime ends, which is 4 days.
So to summarize the app:
Send URL to SQS Queue to wait to be crawled
Send message to Elastic Beanstalk App that crawls the data and store it in database
The script seems to be working in the sense that it does receive the message, crawls it successfully, and stores the data in the database. The only issue is that the messages remain in the queue, stuck at "Message Available".
So if I for example load the queue with 800 messages, it will be stuck at ~800 messages for 4 days and then they will all be deleted instantly because of the lifetime value. It seems like a few messages get deleted because the number changes slightly, but a large majority is never removed from the queue.
So question:
Isn't SQS supposed to remove the message as soon as it has been sent to and received by the script?
Is there a way for me to manually delete the current message from within the script itself? As far as I know, the message is only sent one way, from SQS -> App, so I cannot do SQS <-> App.
Any ideas?
A web application in a worker environment tier should only listen on
the local host. When the web application in the worker environment
tier returns a 200 OK response to acknowledge that it has received and
successfully processed the request, the daemon sends a DeleteMessage
call to the SQS queue so that the message will be deleted from the
queue. (SQS automatically deletes messages that have been in a queue
for longer than the configured RetentionPeriod.) If the application
returns any response other than 200 OK or there is no response within
the configured InactivityTimeout period, SQS once again makes the
message visible in the queue and available for another attempt at
processing.
http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features-managing-env-tiers.html
So I guess that answers my question. Some messages do not return HTTP 200 and then they are stuck in an infinite loop.
No, messages are not deleted when you read a queue item; the item is only hidden for a specific amount of time, called the visibility timeout. The idea behind the visibility timeout is to ensure that, if there are multiple consumers for a single queue, no two consumers pick up the same item and start processing it.
This is the change you need to make to your app to get the expected behavior:
Send URL to SQS Queue to wait to be crawled
Send message to Elastic Beanstalk App that crawls the data and stores it in the database
When the crawl completes successfully, use the receipt handle (not the message ID) to delete the item from the queue.
AWS Documentation - DeleteMessage