I have been researching the AWS documentation on how to invoke Lambda functions, and I've come across several ways to do it. Mainly, invocation is done by calling the Invoke() function, which can invoke a Lambda function synchronously or asynchronously.
Currently I am invoking my Lambda functions via HTTP request (as a REST API), but the HTTP request times out after 30 seconds, while asynchronous calls, as far as I know, time out after 15 minutes.
What are the advantages, besides the time limit I have already mentioned, of asynchronous Lambda invocation compared to invoking Lambda with an HTTP request? Also, what are the best (recommended) ways to invoke Lambdas in production? In the AWS docs (SDK for Go - https://docs.aws.amazon.com/sdk-for-go/api/service/lambda/#InvokeAsyncInput) I see that InvokeAsyncInput and InvokeAsyncOutput have been deprecated, so I am wondering what an async implementation would actually look like.
Lambda really is about event-driven-computing. This means Lambda always gets triggered in response to an event. This event can originate from a wide range of AWS Services as well as the AWS CLI and SDK.
All of these events invoke the Lambda function and pass it some information in the form of an event and a context object. What the event looks like depends on the service that triggered Lambda. You can find more information about the context object in the AWS Lambda documentation.
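As a minimal sketch of what a handler receives, assuming the Python runtime (the summary it builds is purely illustrative and not part of any AWS API):

```python
import json


def lambda_handler(event, context):
    """Entry point that Lambda calls for every invocation, whatever the trigger."""
    # `event` is the trigger-specific payload, already parsed from JSON
    # (an S3 notification, an API Gateway request, a custom SDK payload, ...).
    # `context` carries runtime metadata about this particular invocation.
    summary = {
        "event_keys": sorted(event.keys()) if isinstance(event, dict) else [],
        "function_name": context.function_name,
        "request_id": context.aws_request_id,
        "remaining_ms": context.get_remaining_time_in_millis(),
    }
    print(json.dumps(summary))  # printed output ends up in CloudWatch Logs
    return summary
```

The same function signature is used whether the call comes from S3, API Gateway, a schedule, the CLI or the SDK; only the shape of `event` changes.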
There is no real "best" way to invoke Lambda - this mostly depends on your use case - if you're building a webservice, let API Gateway invoke Lambda for you. If you want to process new files on S3 - let S3 trigger Lambda. If you're just testing the Lambda function you can invoke it via the CLI. If you have custom software that needs to trigger a Lambda function you can use the SDK. If you want to run Lambda on a schedule, configure CloudWatch events...
Please provide more information about your use case if you require a more detailed evaluation of the available options - right now this is very broad.
GCP has a service called "GCP Cloud Scheduler". I can simply call an api to schedule a REST endpoint call in 45 minutes OR can call the api to schedule a recurring call every 24 hours.
What is the AWS equivalent here? I see a bunch of stuff with Lambdas, but I don't really want the added complexity of needing a Lambda (i.e. GCP functions == Lambdas, and I don't need a GCP function to do what I need in GCP). A Lambda would be one more point of failure that I don't really want to monitor.
Two main questions:
1. Is there an equivalent (preferably without Lambdas)?
2. If Lambdas are the only way to go, which service do I call so that I can pass the REST endpoint through to the Lambda for it to call? (I am really hoping I don't have to create a Lambda PER job, as that is even more work.)
I am considering just using GCP's service and having it call our AWS endpoints as that may be a ton easier unless anyone knows of an AWS equivalent?
I have not tried anything yet as I can't quite find the correct API in AWS.
Create an EventBridge rule with an API destination as its target. EventBridge then calls your HTTP endpoint directly, with no Lambda involved.
Unfortunately, there is no other equivalent in AWS if your endpoint cannot respond within 5s.
In that case, as you mentioned, you'd need a Lambda to call your endpoint. This Lambda can be triggered at a regular interval using EventBridge. When creating the rule, you can specify a custom input (which could be your endpoint), so a single Lambda can serve every job.
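If you do end up on the Lambda route, one generic function can cover all jobs by reading the endpoint from the rule's custom input. A rough sketch, assuming a hypothetical input shape of {"url": ..., "method": ..., "body": ...} that you would define yourself on each rule:

```python
import json
import urllib.request


def lambda_handler(event, context, opener=urllib.request.urlopen):
    """Call the endpoint named in the rule's custom input.

    Assumed (hypothetical) input: {"url": "...", "method": "POST", "body": {...}}
    """
    url = event["url"]
    method = event.get("method", "GET")
    body = event.get("body")
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}

    request = urllib.request.Request(url, data=data, headers=headers, method=method)
    # `opener` is injectable only so the function can be exercised without network access.
    with opener(request, timeout=10) as response:
        return {"url": url, "status": response.status}
```

Each scheduled job is then just another rule pointing at the same function with a different input, so there is no need for a Lambda per job.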
I have an AWS Lambda which is triggered by an S3 push event. The lambda will call an API which will trigger a long-running process. I recognize that I can configure S3 to invoke the lambda function asynchronously, and so S3 will not wait for a response, but I am interested to find out if I can configure lambda to call my API asynchronously as well. I don't want lambda waiting for several minutes while the process completes. Can anyone point me to some documentation which outlines this process? Thanks in advance.
I don't think Lambda can do this, nor would I recommend a workaround. There is an article on SenseDeep which talks about this and specifically asks: "So what happens if we simply do not call "await" and thus not wait on the response from our HTTP request?" Its answer is "Strange things happen", which is to say that starting an async call in a Lambda and then returning immediately has unpredictable results.
Why do you need the Lambda to return quickly? If there is a valid reason (for example, you want a push notification that something in S3 has changed right away), then I'd recommend a different pattern.
1. Updating S3 triggers a lambda
2. That lambda writes to an SNS topic and does whatever fast action you want
3. There is a subscriber to the SNS topic that does the long running action you want
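The first Lambda in that pattern could be sketched roughly as follows. The TOPIC_ARN environment variable and the message shape are assumptions for illustration; boto3 is imported lazily so the handler can be tried locally with a stub client:

```python
import json
import os


def lambda_handler(event, context, sns_client=None, topic_arn=None):
    """Forward each S3 object notification to SNS, then return right away."""
    if sns_client is None:
        import boto3  # bundled with the Lambda Python runtime
        sns_client = boto3.client("sns")
    if topic_arn is None:
        topic_arn = os.environ["TOPIC_ARN"]  # hypothetical environment variable

    published = 0
    for record in event.get("Records", []):
        message = {
            "bucket": record["s3"]["bucket"]["name"],
            "key": record["s3"]["object"]["key"],
        }
        # The subscriber to this topic performs the long-running work.
        sns_client.publish(TopicArn=topic_arn, Message=json.dumps(message))
        published += 1
    return {"published": published}
```

The function finishes as soon as the messages are published, so it never sits waiting for the long-running process.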
I need to asynchronously invoke Lambda functions from my EC2 instance. At a high level, many services come to mind (most likely all of them support what I need):
AWS state machines (not sure), Step Functions, Active MQ, SQS, SNS. I am aware of the pros and cons of each at a high level, but I'm not sure which one I should go for. Please let me know your feedback.
PS: We expect the invocation in 1000s per second at peak for very short periods. Concurrency for lambda functions is not an issue as we can ask Amazon for increase in the limit along with the burst.
If you want to invoke asynchronously, then you cannot use SQS, because the SQS event source mapping invokes the Lambda function synchronously.
Of the options you listed above, you can use SNS to invoke the Lambda function asynchronously.
A better option would be to write a small piece of code with whichever AWS SDK you are comfortable with and invoke the Lambda function asynchronously from that code.
Example in Python using boto3. Pass Event as the InvocationType to invoke the Lambda function asynchronously, or RequestResponse to invoke it synchronously:
import boto3

client = boto3.client('lambda')
# payload3 is the JSON-encoded input for the function (bytes or str)
response = client.invoke(
    FunctionName="loadSpotsAroundPoint",
    InvocationType='Event',  # 'RequestResponse' for a synchronous call
    Payload=payload3
)
Given a REST API, outside of my AWS environment, which can be queried for json data:
https://someExternalApi.com/?date=20190814
How can I set up a serverless job in AWS to hit the external endpoint on a periodic basis and store the results in S3?
I know that I can launch an EC2 instance and just set up a cron job. But I am looking for a serverless solution, which seems to be more idiomatic.
Thank you in advance for your consideration and response.
Yes, you absolutely can do this, and probably in several different ways!
The pieces I would use would be:
CloudWatch Event using a cron-like schedule, which then triggers...
A Lambda function (with the right IAM permissions) that calls the API using e.g. Python requests or an equivalent HTTP library and then uses the AWS SDK to write the results to an S3 bucket of your choice:
An S3 bucket ready to receive!
This should be all you need to achieve what you want.
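A rough sketch of the Lambda piece, using only the standard library for the HTTP call. The bucket name and the key layout are made up for illustration, and the date parameter simply follows the URL format from the question:

```python
import datetime
import urllib.request

API_URL = "https://someExternalApi.com/?date={date}"  # endpoint from the question
BUCKET = "my-results-bucket"  # hypothetical bucket name


def fetch_json(url):
    """Return the raw response body of a GET request."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def lambda_handler(event, context, s3_client=None, fetch=fetch_json, today=None):
    """Fetch today's data from the external API and store it in S3."""
    if s3_client is None:
        import boto3  # bundled with the Lambda Python runtime
        s3_client = boto3.client("s3")

    day = (today or datetime.date.today()).strftime("%Y%m%d")
    body = fetch(API_URL.format(date=day))
    key = f"results/{day}.json"  # hypothetical key layout, one object per day
    s3_client.put_object(
        Bucket=BUCKET, Key=key, Body=body, ContentType="application/json"
    )
    return {"bucket": BUCKET, "key": key, "bytes": len(body)}
```

The function's execution role needs s3:PutObject on the target bucket; the extra keyword parameters exist only so the handler can be run locally with stubs.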
I'm going to skip the implementation details, as they are largely outside the scope of your question. As such, I'm going to assume your function is already written and targets Node.js.
AWS can do this on its own, but to make it simpler, I'd recommend using the Serverless Framework. We're going to assume you're using this.
Assuming you're entirely new to Serverless, the first thing you'll need to do is to create a service:
serverless create --template "aws-nodejs" --path my-service
This creates a service based on the aws-nodejs template on the provided path. In there, you will find serverless.yml (the configuration for your function) and handler.js (the code itself).
Assuming your function is exported as crawlSomeExternalApi on the handler export (module.exports.crawlSomeExternalApi = () => {...}), the functions entry on your serverless file would look like this if you wanted to invoke it every 3 hours:
functions:
  crawl:
    handler: handler.crawlSomeExternalApi
    events:
      - schedule: rate(3 hours)
That's it! All you need now is to deploy it through serverless deploy -v
Under the hood, what this does is create a CloudWatch schedule entry for your function. An example of it can be found in the documentation.
The first thing you need is a Lambda function. Implement your logic (hitting the API and writing the data to S3, or whatever else) inside the Lambda function. Next, you need a schedule to trigger your Lambda function periodically. A CloudWatch rule with a schedule expression can fire an event periodically, using either a cron expression or a rate expression. The Lambda function you created earlier should be configured as the target of this CloudWatch rule.
The resulting flow is: CloudWatch invokes the Lambda function whenever the rule fires (depending on your schedule expression), and Lambda then performs your logic.
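For illustration, the rule, the invoke permission and the target could be wired up with boto3 along these lines. This is a sketch rather than a complete deployment script; the helper name and the statement ID format are my own choices:

```python
def schedule_lambda(events_client, lambda_client, rule_name, schedule_expression, function_arn):
    """Create a schedule rule and make the given Lambda function its target."""
    # 1. The rule itself, e.g. "rate(3 hours)" or "cron(0 12 * * ? *)".
    rule = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=schedule_expression,
        State="ENABLED",
    )
    # 2. Allow the events service to invoke the function for this rule only.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",  # any ID that is unique within the function policy
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule["RuleArn"],
    )
    # 3. Point the rule at the function.
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": function_arn}],
    )
    return rule["RuleArn"]
```

You would call it with boto3.client("events") and boto3.client("lambda"). Note that add_permission fails if a statement with the same ID already exists, so re-running the helper against the same rule needs some handling for that case.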
I've been playing with Lambda recently and am working on creating an API using API Gateway and Lambda. I have a lambda function in place that returns a JSON and an API Gateway endpoint that invokes the function. Everything works well with this simple setup.
I tried loadtesting the API gateway endpoint with the loadtest npm module. While Lambda processes the concurrent requests (albeit with an increase in mean latency over the course of execution), when I send it 40 requests per second or so, it starts throwing errors, only partially completing the requests.
I read in the documentation that by default, Lambda invocation is of type RequestResponse (which is what the API does right now), which is synchronous in nature, and it looks like it is non-blocking. For asynchronous invocation, the invocation type is Event. But Lambda discards the return value for async invocations, so the API returns nothing.
Is there something I am missing either with the sync, async or concurrency definitions in regards to AWS? Is there a better way to approach this problem? Any insight is helpful. Thank you!
You will have to use Synchronous execution if you want to get a return response from API Gateway. It doesn't make sense to use Async execution in this scenario. I think what you are missing is that while each Lambda execution is blocking, single threaded, there will be multiple instances of your function running in multiple Lambda server environments.
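To make the difference concrete, here is a small sketch of both invocation types through the SDK (the helper name is hypothetical). With RequestResponse the caller gets the function's return value back; with Event, Lambda only acknowledges that the request was queued:

```python
import json


def invoke(lambda_client, function_name, payload, wait_for_result=True):
    """Invoke a function synchronously or asynchronously and return (status, result)."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse" if wait_for_result else "Event",
        Payload=json.dumps(payload).encode("utf-8"),
    )
    stream = response.get("Payload")
    raw = stream.read() if stream is not None else b""
    # Only a synchronous call carries the function's return value.
    result = json.loads(raw) if (wait_for_result and raw) else None
    return response["StatusCode"], result
```

A synchronous call normally reports status 200 together with the result, while an asynchronous one reports 202 and no result, which is why an API that has to return data to its caller needs the synchronous type.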
The default number of concurrent Lambda executions is fairly low, for safety reasons. This is to prevent you from accidentally writing a run-away Lambda process that would cost lots of money while you are still learning about Lambda. You need to request an increase in the Lambda concurrent execution limit on your account.
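Before requesting an increase, you can check what the account currently allows. A minimal sketch using the Lambda client's get_account_settings call (the helper name and the returned dictionary keys are my own):

```python
def concurrency_limits(lambda_client):
    """Return the account-level concurrency limits for the client's region."""
    settings = lambda_client.get_account_settings()
    limit = settings["AccountLimit"]
    return {
        # Maximum number of simultaneous executions across all functions.
        "concurrent_executions": limit["ConcurrentExecutions"],
        # What is left for functions without reserved concurrency.
        "unreserved": limit["UnreservedConcurrentExecutions"],
    }
```

Called as concurrency_limits(boto3.client("lambda")), this shows the ceiling that a load test will run into once enough requests arrive at the same time.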