JMeter load request failures for AWS API Gateway + Lambda

I need to load test a Node.js-based AWS Lambda function behind AWS API Gateway. But if I run a basic test with 1000 users over 60 seconds, 10-20% of requests fail with the error
java.net.SocketException: Broken pipe (Write failed)
When I run the same Node.js function on a local server with Node.js / MongoDB, it handles 5K-10K requests without failures.
The Node.js script inserts a large JSON payload (approx. 100K) into MongoDB.
These are the configurations:
JMeter connect timeout: 10000 milliseconds
AWS Lambda RAM allocated: 512 MB
AWS Lambda timeout: 30 seconds
UPDATE :
After updating the heap to 2 GB, I am able to hit 1K requests, but now it fails (40% of requests) at 5K requests.
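To sanity-check the failure rate outside JMeter, a minimal concurrent load-test harness can help isolate whether the client or the API is the bottleneck. This is a generic sketch: the request callable, worker count, and totals are placeholders, not taken from the question.

```python
import concurrent.futures

def run_load_test(do_request, total=1000, workers=50):
    """Fire `total` requests through a thread pool and tally outcomes.

    `do_request` is any zero-argument callable returning True on success --
    e.g. a wrapper around requests.post(api_gateway_url, json=payload,
    timeout=10) that catches connection errors and returns False.
    """
    ok = failed = 0
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        for success in pool.map(lambda _: do_request(), range(total)):
            if success:
                ok += 1
            else:
                failed += 1
    return ok, failed
```

Running this against the API Gateway endpoint while watching the Lambda concurrency and throttle metrics can show whether failures correlate with Lambda scaling limits rather than the test client.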

Related

Increase idle timeout of AWS serverless websocket

We are trying to use the serverless WebSocket feature of API Gateway on the AWS platform.
During initial observation, we have seen that the idle connection timeout for such a WebSocket is 10 minutes. We have a requirement to increase this to 30 minutes so that the WebSocket connection does not close.
Is there any setting or alternate way of increasing this default idle time?
If you take a look at the table under API Gateway quotas for configuring and running a WebSocket API in this AWS documentation, you can see that increasing the idle timeout is currently not supported.
My solution was to send a heartbeat (without any data, since we just need to interact through the WebSocket to let API Gateway know that the connection is not idle) every 5 minutes, and it has been working well.
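The heartbeat can be a tiny coroutine running alongside the receive loop. Here is an asyncio sketch; the `{"action": "heartbeat"}` message shape and the `ws.send` interface are assumptions — any route your WebSocket API expects will do.

```python
import asyncio
import json

# 5 minutes, comfortably under API Gateway's 10-minute idle timeout
HEARTBEAT_INTERVAL = 5 * 60

async def keep_alive(ws, interval=HEARTBEAT_INTERVAL):
    """Periodically send a no-op message so API Gateway sees activity."""
    while True:
        await asyncio.sleep(interval)
        await ws.send(json.dumps({"action": "heartbeat"}))
```

Start it with `asyncio.create_task(keep_alive(ws))` next to your receive loop and cancel the task when the connection closes.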

Data Pipeline: URL request in Google Cloud Function ends with "crash" on VPC Connector

I am having a small problem with my Cloud Function; it crashes and the message is
Function execution took 242323 ms, finished with status: 'crash'
My Setup
There are two GCP projects set up. One is managed by Department A; I work in Department B and am getting access to a server that is set up in Department A's GCP project.
Department A's GCP project sits behind our internal network, and I am accessing a server on that project via a VPC Connector.
On my department B GCP project, I use Cloud Scheduler, Cloud Pub/Sub, Cloud Function and Cloud Storage.
Workflow
Cloud Scheduler publishes a message to a Pub/Sub topic once a day.
A Cloud Function subscribes to the Pub/Sub topic. When a new message arrives, the Cloud Function initiates an HTTP request to the server in the Department A GCP project. The request initiates a query on the server and returns data that is stored in Cloud Storage as a .csv file.
I have several URLs scheduled to run during a morning period; only one gives me a problem, as it takes the longest to execute. All other URLs complete with status OK, and the files are stored in Cloud Storage.
This specific URL always crashes around 242323 ms, even though my Cloud Function timeout is set to 540 seconds.
To be clear, the other URLs that work all complete before the 242323 ms mark.
Viewing the log, I can see that for the troublesome URL that crashes the Cloud Function, the message is ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
I am using python "requests" to make the HTTP Request.
In Department A we use a proxy server in their GCP Project which the VPC-Connector has been paired with.
We can use cURL to make the HTTP request from the Department B GCP project; with cURL we can complete the request to the problematic URL without any issue.
The issue is that the request from the Cloud Function gets terminated before the end of the Cloud Function's execution time of 540 seconds. I tested the proxy, and it has a timeout of 15 minutes, which is more than sufficient for the Cloud Function.
It may be an issue with the VPC Connector, but I cannot see any settings related to the hangup. I'm hoping someone here has an idea of what to look for.
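One thing worth ruling out is an idle-TCP reset: if the server computes for minutes before sending the first byte, an intermediary (proxy, NAT, VPC Connector) may silently drop the idle connection. Enabling OS-level TCP keepalive on the sockets `requests` uses keeps packets flowing. This is a Linux-specific sketch, and the probe timings are illustrative values, not recommendations from the thread.

```python
import socket

from urllib3.connection import HTTPConnection

# Send TCP keepalive probes on otherwise-idle connections (Linux-only options).
HTTPConnection.default_socket_options = HTTPConnection.default_socket_options + [
    (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),    # enable keepalive
    (socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60),  # first probe after 60 s idle
    (socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 30), # then probe every 30 s
    (socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5),    # give up after 5 failed probes
]

# Make HTTP calls with `requests` after this point; its urllib3
# connections will pick up the extended socket options.
```

If the crash stops with keepalive enabled, the ~242 s cutoff was an idle-connection timeout somewhere on the path rather than a Cloud Functions limit.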

How to receive an endless WebSocket data source using AWS Lambda?

I want to crawl data from a WebSocket data source. WebSocket data is usually an endless stream, while an AWS Lambda function has a timeout limit; the maximum allowed value is 900 seconds.
If my Lambda function acts as a WebSocket client and connects to a WebSocket URL, e.g., wss://ws-feed-public.sandbox.pro.coinbase.com, it starts receiving data and then gets terminated after 900 seconds.
How can I keep my Lambda function running forever? Thanks!
Right now I'm running my crawler inside a Linux VM, is it possible to migrate it to AWS Lambda?
AWS Lambda functions run for a maximum of 900 seconds (15 minutes).
There is no way to extend this.
You should continue using an Amazon EC2 instance or a container (ECS, Fargate).
Fun fact: When initially released, the limit was 3 minutes. It was later extended to 5 minutes, then to 15 minutes.
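On an EC2 instance or container, the crawler can simply loop forever and reconnect on failure. A sketch with exponential backoff; the `connect`/`handle` callables are placeholders (with the `websockets` package, `connect` could be `lambda: websockets.connect("wss://ws-feed-public.sandbox.pro.coinbase.com")`).

```python
import asyncio

def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff: 1 s, 2 s, 4 s, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))

async def crawl_forever(connect, handle):
    """Reconnect loop: `connect` returns an async context manager
    yielding an async-iterable of messages; `handle` processes each one."""
    attempt = 0
    while True:
        try:
            async with connect() as ws:
                attempt = 0  # connected: reset the backoff
                async for message in ws:
                    handle(message)
        except Exception:
            await asyncio.sleep(backoff_delay(attempt))
            attempt += 1
```

The reconnect-with-backoff part matters in practice: long-lived feeds drop connections routinely, and a capped exponential delay avoids hammering the endpoint during an outage.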

AWS cloudwatch metrics

I would like to know more details about AWS CloudWatch metrics and their impact; the AWS docs don't have much detail on them.
What is the difference between the metrics below?
What is the impact on the application or the AWS instance if the alerts below trigger?
Http Server Errors GreaterThan 0 (Count) in the last 5 minutes was activated
Requests GreaterThan 100 (Count) in the last 5 minutes
Http 404 GreaterThan 0 (Count) in the last 5 minutes was activated
Requests GreaterThan 500 (Count) in the last 5 minutes was activated
Is CloudWatch checking for these errors in logs?
These metrics are related to your load balancer. Here is my explanation:
1. The web server behind the load balancer throws an HTTP error with a 5XX code, indicating that your server cannot perform the request. This can be due to several reasons, such as Internal Server Error, Not Implemented (e.g., the server expects POST but the client sends GET), or Gateway Timeout (e.g., the server executes a slow DB query and the result does not come back in time).
2. The number of requests completed or connections made is more than 100, indicating exactly what it says.
3. The number of "Not Found" responses received by clients, indicating that a client is requesting a page which does not exist in your application (for instance, https://stackoverflow.com/test).
4. The number of requests completed or connections made is more than 500; the same as number 2, but indicating even more requests.
If all these alarms trigger at once, there is probably high load on your server and it is not functioning optimally. Beyond that, though, it is hard to say; you need to check the maximum number of errors. The most important one is 5XX (number 1).
The load balancers publish these metrics to CloudWatch; they don't have anything to do with your application log (if I understood the question correctly).
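For reference, an alarm like the 5XX one can be defined with boto3. The names, namespace, and dimension value below are examples for an Application Load Balancer, not values from the question; adjust them to your resources.

```python
# Alarm definition: fire when any 5XX is returned in a 5-minute window.
alarm = dict(
    AlarmName="alb-5xx-errors",
    Namespace="AWS/ApplicationELB",
    MetricName="HTTPCode_ELB_5XX_Count",
    Dimensions=[{"Name": "LoadBalancer", "Value": "app/my-alb/1234567890abcdef"}],
    Statistic="Sum",
    Period=300,                 # the "last 5 minutes" window
    EvaluationPeriods=1,
    Threshold=0,
    ComparisonOperator="GreaterThanThreshold",
    TreatMissingData="notBreaching",  # no traffic is not an error
)

# To create it (requires boto3 and AWS credentials):
# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**alarm)
```

`Period=300` with `Statistic="Sum"` is what makes the alarm read as "count in the last 5 minutes"; the alarm evaluates the metric stream, not your application logs.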

AWS throttling for Code Commit

I am getting the below error when doing Git operations on a CodeCommit repository. The number of operations is in the tens within a few minutes: adding/removing/pulling files.
Is this because of AWS throttling or something else?
If so, what's the limit and how do I increase it in AWS?
"interim_desc": "RequestId: 12e27770db854bf0a6034cd6f851717d. 'git fetch origin --depth 20' returned with exit code 128.
error: RPC failed; HTTP 429 curl 22 The requested URL returned error: 429 Too Many Requests: The remote end hung up unexpectedly'"
Here is the manual on how to handle the 429 error while accessing CodeCommit:
Access error: “Rate Exceeded” or “429” message when connecting to a CodeCommit repository
https://docs.aws.amazon.com/codecommit/latest/userguide/troubleshooting-ae.html#troubleshooting-ae3
I would copy here the most notable part:
Implement jitter in requests, particularly in periodic polling requests.
If you have an application that is polling CodeCommit periodically and this application is running on multiple Amazon EC2 instances, introduce jitter (a random amount of delay) so that different Amazon EC2 instances do not poll at the same second. We recommend a random number from 0 to 59 seconds to evenly distribute polling mechanisms across a one-minute timeframe.
......
Request a CodeCommit service quota increase in the AWS Support Center.
To receive a service limit increase, you must confirm that you have already followed the suggestions offered here, including implementation of error retries or exponential backoff methods. In your request, you must also provide the AWS Region, AWS account, and timeframe affected by the throttling issues.
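The two suggestions — jitter on periodic polls and backoff on retries — are each a few lines of Python. This sketch follows the 0-59 second guidance quoted above; the function names are mine, not AWS's.

```python
import random

def polling_delay(window=60):
    """Random start offset so multiple pollers spread across one minute
    instead of all hitting CodeCommit on the same second."""
    return random.uniform(0, window - 1)

def retry_delay(attempt, base=1.0, cap=60.0):
    """'Full jitter' exponential backoff for retrying after a 429:
    a random delay between 0 and min(cap, base * 2^attempt) seconds."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

Sleeping for `polling_delay()` before each scheduled poll, and for `retry_delay(attempt)` before each retry of a throttled request, is usually enough to stay under the rate limit without a quota increase.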