The official AWS API Gateway guide shows how to use a Lambda function to respond to API calls from the gateway. But it only covers a single function, not the case where several functions are called one after another.
I can think of two solutions:
1. Use AWS Step Functions to bundle the functions into a workflow.
2. Use one main function for orchestration.
Obviously, method 1 brings extra fees, while method 2 needs an extra function that keeps running for a long time.
Could anyone suggest a better approach?
If they must be completed one after another before you can return the result, use AWS Step Functions (like you said) to orchestrate this. Synchronous invocation of the other Lambda is also an option, though it's commonly an architectural smell and points to other issues with your architecture (for example, why are the Lambdas not combined if they need each other?).
For asynchronous invocation of the other function (fire and forget), either invoke the Lambda using invocation type Event or even better would be to have an SQS queue in between the Lambdas for improved resiliency.
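A minimal sketch of the two fire-and-forget options, assuming a second function named second-step and a queue URL that are both made up for illustration. It only builds the request objects (the fields of the Lambda Invoke and SQS SendMessage APIs); passing them to the SDK client is left out.

```javascript
// Request for an asynchronous Lambda invocation.
// "second-step" and the payload fields are hypothetical names.
function buildAsyncInvokeParams(functionName, payload) {
  return {
    FunctionName: functionName,
    InvocationType: "Event", // Lambda queues the event and returns right away
    Payload: JSON.stringify(payload),
  };
}

// Request for putting the same payload on an SQS queue instead, so the
// second Lambda can consume it through an event source mapping.
function buildQueueMessage(queueUrl, payload) {
  return {
    QueueUrl: queueUrl, // placeholder URL, not a real queue
    MessageBody: JSON.stringify(payload),
  };
}

console.log(buildAsyncInvokeParams("second-step", { orderId: 42 }));
console.log(
  buildQueueMessage("https://sqs.us-east-1.amazonaws.com/123456789012/steps", { orderId: 42 })
);
```

With the queue in between, a failure in the second function leaves the message available for another attempt rather than losing it.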
I have a Lambda function that many different APIs may call to parse large chunks of data (which might take more than a few minutes) and store the results in their own separate S3 buckets.
In that case, is it better to have a separate copy of the same AWS Lambda function for each API, or is it OK to have the same Lambda function called from many APIs?
The goal is to avoid queuing and have the function run asynchronously for each request.
I'm not an expert, so perhaps other answers will help more, but I don't see why it would make a difference as long as the code involved in processing each separate call isn't enough to increase the cost of initializing an instance.
The reason I don't think it would make a difference is that Lambda will initialize a new instance if the function is invoked while it is still processing another event. Sharing one function is potentially better because you can sometimes have an already initialized instance from a previous request ready for the next one (although, again, I'm sure there are AWS experts who could confirm or deny this; contact AWS support if you want an authoritative answer).
Source:
If you invoke the function again while the first event is being processed, Lambda initializes another instance, and the function processes the two events concurrently. As more events come in, Lambda routes them to available instances and creates new instances as needed.
https://docs.aws.amazon.com/lambda/latest/dg/invocation-scaling.html
Seems a little inefficient the way it currently is:
response.body = {
  user: await userService(userID), // calls a user service to get info on the user
  friends: await friendsService(userID), // calls a friends service to get info on the user's friends
};
Let's say the userService and friendsService are configured on different API Gateway endpoints.
Then wouldn't the network requests make this slower than if I just packaged my entire backend into one zip file and uploaded it to AWS Lambda?
Seems like this is very inefficient.
Is there a way to call other Lambdas without having to make a network request? I understand I could put the Lambdas and their gateways in the same VPC as the main Gateway endpoint exposed to the internet, but isn't that expensive?
Is there any way to do this more efficiently?
You can call a Lambda function by using the AWS SDK (a LambdaClient object). So, for example assume you wrote two Lambda functions - funA and funB.
Next assume:
you want to call funB from funA
you wrote your Lambda function by using the Lambda Java runtime
You can use the Lambda Java API to invoke funB. There is no need to wrap either function in a RESTful call through API Gateway; the AWS SDK is enough. Here is the Java API example that shows how to invoke a Lambda function:
https://github.com/awsdocs/aws-doc-sdk-examples/blob/master/javav2/example_code/lambda/src/main/java/com/example/lambda/LambdaInvoke.java
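The linked example is Java. As a rough JavaScript sketch of the same idea, funA can call funB with a synchronous (RequestResponse) invocation. Here `invoke` is a hypothetical stand-in for the SDK's Lambda invoke call, and `fakeInvoke` is a stub so the sketch runs without AWS credentials; the response fields (StatusCode, FunctionError, Payload) follow the Lambda Invoke API.

```javascript
// Calls funB from funA and waits for its result.
// `invoke` is a hypothetical stand-in for the SDK call: it takes the request
// object and resolves to { StatusCode, FunctionError, Payload }.
async function callFunB(invoke, input) {
  const response = await invoke({
    FunctionName: "funB",
    InvocationType: "RequestResponse", // block until funB returns
    Payload: JSON.stringify(input),
  });
  // The payload arrives as bytes; decode it before parsing.
  const body = Buffer.from(response.Payload).toString("utf8");
  if (response.FunctionError) {
    throw new Error(`funB failed: ${body}`);
  }
  return JSON.parse(body);
}

// Stub used in place of a real client so the sketch runs offline.
async function fakeInvoke(request) {
  const { userID } = JSON.parse(request.Payload);
  const result = { userID, name: "example user" };
  return { StatusCode: 200, Payload: Buffer.from(JSON.stringify(result)) };
}

callFunB(fakeInvoke, { userID: "u1" }).then((user) => console.log(user));
```

This is still a network call to the Lambda service; it only removes the API Gateway hop.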
@smac2020's answer suggests using the SDK, but that is of course a network call too. It just skips API Gateway and calls the AWS API directly.
I think the key question about Lambda is whether it scales well. Try to think about the algorithm in a different way. For example, you can create a pipeline where each step enriches your "state object" with additional data. You can use Step Functions or SQS to pass the requests between the steps, or you can make the client responsible for managing the data. You should try to avoid one function waiting for another function, because you are then paying for two Lambdas running: the caller and the called one.
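One way to picture the pipeline idea, as a sketch only: a Step Functions definition (Amazon States Language) in which each step attaches its result to the shared state object through ResultPath. The state names and both ARNs are placeholders invented for this example.

```javascript
// Builds a state machine definition where each step enriches the state
// object instead of one Lambda waiting on another.
// Both ARNs passed in are placeholders, not real functions.
function buildEnrichmentPipeline(userFnArn, friendsFnArn) {
  return {
    StartAt: "GetUser",
    States: {
      GetUser: {
        Type: "Task",
        Resource: userFnArn,
        ResultPath: "$.user", // keep the input and attach the result as "user"
        Next: "GetFriends",
      },
      GetFriends: {
        Type: "Task",
        Resource: friendsFnArn,
        ResultPath: "$.friends", // attach the second result as "friends"
        End: true,
      },
    },
  };
}

const pipeline = buildEnrichmentPipeline(
  "arn:aws:lambda:us-east-1:123456789012:function:userService",
  "arn:aws:lambda:us-east-1:123456789012:function:friendsService"
);
console.log(JSON.stringify(pipeline, null, 2));
```

Each Lambda runs only for its own step, so no function is billed while waiting for another.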
If you are thinking "but microservices...", look at the design of the AWS API itself. You do not need the output of one service as an input to another one. It takes some time to adapt and to look at problems from a different perspective. In your case I would consider whether the user list could, for example, live in the user object so that the calls can be merged (look at some NoSQL database design principles).
I want to implement a simple sequence of tasks in AWS Step Functions, something like the following:
I can't fire and forget the External API call, because I need a response from it. So it is a bad idea to wrap it in a Lambda function.
I can't implement the External API task as a Lambda function, because the work exceeds Lambda's limits.
The best way I can see is to call the External API from a Step Functions task. If I understand correctly, this can be done with Activities and a Worker.
I have seen a Ruby example, but it isn't clear to me.
Could anybody suggest a good tutorial with clear examples of a similar implementation?
PS: I could wrap the External API in anything on EC2.
Unfortunately, at the time of writing, you can't call an external API directly from a Step Functions state machine.
You have to wrap the call in a Lambda function.
AWS documentation:
Step Functions supports the ability to call HTTP endpoints through API Gateway, but does not currently support the ability to call generic HTTP endpoints.
Which Lambda limitation prevents you from wrapping the external API? If Lambda really cannot be used, then depending on your usage you may choose one of the following:
If the traffic is not steady or continuous, take a look at ECS tasks, which Step Functions can run (https://docs.aws.amazon.com/step-functions/latest/dg/connectors-ecs.html), because that saves you from paying for idle capacity.
Otherwise, using EC2/ECS with an Activity is the way to go.
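The two options above can be sketched as Task states in a state machine definition. This is an illustration under assumptions: the ARNs, state name and timeout values are invented, and the ECS state leaves out networking settings that a real Fargate task would need.

```javascript
// Task state that waits for an activity worker (for example a process on
// EC2) to pick up the task and report the External API's response.
function activityTaskState(activityArn, nextState) {
  return {
    Type: "Task",
    Resource: activityArn,
    TimeoutSeconds: 3600, // assumed upper bound for the external call
    HeartbeatSeconds: 60, // assumed worker heartbeat interval
    ...(nextState ? { Next: nextState } : { End: true }),
  };
}

// Task state that starts an ECS task and waits until it has stopped.
// Networking and container overrides are omitted from this sketch.
function ecsTaskState(clusterArn, taskDefinitionArn, nextState) {
  return {
    Type: "Task",
    Resource: "arn:aws:states:::ecs:runTask.sync",
    Parameters: {
      LaunchType: "FARGATE",
      Cluster: clusterArn,
      TaskDefinition: taskDefinitionArn,
    },
    ...(nextState ? { Next: nextState } : { End: true }),
  };
}

const definition = {
  StartAt: "CallExternalApi",
  States: {
    CallExternalApi: activityTaskState(
      "arn:aws:states:us-east-1:123456789012:activity:external-api"
    ),
  },
};
console.log(JSON.stringify(definition, null, 2));
```

With the activity variant, the worker polls Step Functions for a task (GetActivityTask) and reports the outcome with SendTaskSuccess or SendTaskFailure.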
Context:
I have a use case where my backend service should compute one or more features. Each feature is a simple piece of computation (it can be as simple as adding two numbers) that takes an input and returns an output value, which can be a boolean or a number. A client can request any number of features (1, 10, etc.), and each feature can have multiple versions.
Design:
Lambda seems like a good choice, since it supports versioning and takes care of scaling. In my design, one Lambda function receives the request and then synchronously invokes further Lambda functions in parallel (say the user asked for 12 features: Lambda function L1 will invoke 12 Lambda functions in parallel) and returns all computed feature values as one HTTP response. This way, each feature can be versioned in its own Lambda function.
Questions:
Is it OK to call a Lambda function directly from another Lambda function? Is this a good use case for Lambda functions?
Thanks
I think Lambda would work just fine for your use case. For versioning, you could use the API versioning provided by API Gateway but I think that is a bit much for your case. Just create different functions.
Check out serverless.com. It is a solid framework and easy to get started with. It will take a lot of the work out of setting it up, plus you'll have your infrastructure as code.
Yes, it is okay to call Lambdas from other Lambdas, although there is not a 'clean' way to do that. On the other hand, Step Functions may be what you need. It supports chaining functions in a workflow: the previous Lambda is not 'calling' the next function so much as the workflow is proceeding to its next step. The Serverless framework also supports this approach, and it can be configured in the serverless.yml config file.
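For the direct-call variant described in the question, here is a rough sketch of L1 fanning out to the feature functions and merging their answers. `invoke` is a hypothetical stand-in for the SDK's Lambda invoke call, the feature names are invented, and `fakeInvoke` is a stub so the example runs offline. Qualifier is the Invoke API field that selects a published version or alias.

```javascript
// Invokes one Lambda per requested feature in parallel and merges the
// results into a single object keyed by feature name.
// `invoke` is a hypothetical stand-in for the SDK's Lambda invoke call.
async function computeFeatures(invoke, features, input) {
  const calls = features.map(async ({ name, version }) => {
    const response = await invoke({
      FunctionName: name,
      Qualifier: version, // selects a published version or alias
      InvocationType: "RequestResponse",
      Payload: JSON.stringify(input),
    });
    const value = JSON.parse(Buffer.from(response.Payload).toString("utf8"));
    return [name, value];
  });
  // Promise.all rejects as soon as any single feature call fails.
  return Object.fromEntries(await Promise.all(calls));
}

// Stub with two made-up features so the sketch runs without AWS.
async function fakeInvoke(request) {
  const { a, b } = JSON.parse(request.Payload);
  const value = request.FunctionName === "feature-sum" ? a + b : a > b;
  return { StatusCode: 200, Payload: Buffer.from(JSON.stringify(value)) };
}

const requested = [
  { name: "feature-sum", version: "2" },
  { name: "feature-greater", version: "1" },
];
computeFeatures(fakeInvoke, requested, { a: 2, b: 3 }).then((r) => console.log(r));
```

L1 stays running (and billed) until the slowest feature returns, which is the trade-off against the workflow approach.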
Say I have a Lambda function called 'TestExecutor' which takes in an argument containing the ARNs of N 'Tests', which are also implemented as Lambda functions.
The workflow:
TestExecutor is invoked with a list of ARNs of various 'Tests'
TestExecutor calls each Test concurrently; each Lambda is expected to return a JSON
TestExecutor waits for each Test to complete. It consolidates all the JSONs received
Consolidated JSON is stored in DynamoDB/S3
Problem statement - What is the best way to create this kind of workflow in a Serverless manner?
I considered two AWS Services to manage this:
AWS Step Functions - My step function would need states for each possible 'Test' Lambda that can be executed. I want to give flexibility to the user to invoke any Lambda without needing to 'register' it in my Step function.
AWS SWF - Just seems a little overkill. Suffers from the same problem as above too.
So right now the best I can think of is doing this in a simple manner:
In my TestExecutor Lambda, I could create N threads for N tests, where each thread invokes a particular Test's Lambda function and waits for it to return a JSON. Once all executions have succeeded, all the JSONs are consolidated and the consolidated JSON is stored in DynamoDB.
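A sketch of that simple approach, using promises rather than threads since this example is JavaScript. `invokeTest` is a hypothetical function that invokes one Test Lambda by ARN and resolves to its JSON; the retry count is an arbitrary assumption, and storing the report is left out.

```javascript
// Invokes one Test, retrying up to `attempts` times before giving up.
async function runWithRetry(invokeTest, arn, attempts) {
  let lastError;
  for (let i = 0; i < attempts; i += 1) {
    try {
      return await invokeTest(arn);
    } catch (error) {
      lastError = error; // remember the failure and try again
    }
  }
  throw lastError;
}

// Runs all Tests concurrently and separates successes from failures,
// so one failing Test does not discard the other results.
async function runAllTests(invokeTest, arns, attempts = 2) {
  const settled = await Promise.allSettled(
    arns.map((arn) => runWithRetry(invokeTest, arn, attempts))
  );
  const report = { results: {}, failures: {} };
  settled.forEach((outcome, index) => {
    if (outcome.status === "fulfilled") {
      report.results[arns[index]] = outcome.value;
    } else {
      report.failures[arns[index]] = String(outcome.reason);
    }
  });
  return report;
}

// Stub: the ARN "bad" always fails, every other Test passes.
async function fakeTest(arn) {
  if (arn === "bad") throw new Error("boom");
  return { arn, passed: true };
}

runAllTests(fakeTest, ["ok", "bad"]).then((report) => console.log(report));
```

The report keeps a per-ARN record of what failed, which is a starting point for the monitoring concern, not a replacement for a managed workflow.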
I'm not happy with this solution - it will be a little tricky to manually manage failures and retries of the Test Lambdas from within the TestExecutor Lambda. This is my first time trying something serverless, but it just seems like the wrong pattern. I'd like to get a nice top-down view of my workflow - it seems like monitoring this would be a little messy and scattered, since there's no formal link between TestExecutor and the Test Lambdas.
Maybe I could create an SQS queue for each Test Lambda. For each ARN supplied to the TestExecutor, I could push a message to the corresponding queue. But what then? I'd have to create a 'Listener' Lambda for each Test that polls its queue every T seconds and then invokes the actual Test Lambda. This also sounds needlessly complex.
Would love to hear some advice! Cheers.
AWS SWF doesn't suffer from the same problem as it doesn't require registration of a lambda function to invoke it.
The main limitation of SWF is that it is still not possible to run the decider process as a Lambda function, so you'll have to run it somewhere else. If you already have a host that can run it, implementing your use case with the AWS Flow Framework is pretty straightforward.
You could use the AWS SDK from within a Lambda function to generate a Step Functions state machine from the supplied ARNs.
You would need some way to clean up afterwards and/or avoid duplicates, or the console would quickly get messy.
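As a sketch of what that could look like, the function below builds a definition with one Parallel branch per Test ARN, plus a name derived from the ARNs so that the same set of Tests maps to the same state machine. The naming scheme is my own assumption, not an AWS convention, and the IAM role and the actual create call are left out.

```javascript
const crypto = require("node:crypto");

// Builds the `name` and `definition` for a state machine that runs every
// Test Lambda in its own Parallel branch.
function buildTestStateMachine(testArns) {
  const branches = testArns.map((arn, index) => ({
    StartAt: `Test${index}`,
    States: {
      [`Test${index}`]: { Type: "Task", Resource: arn, End: true },
    },
  }));
  const definition = {
    StartAt: "RunTests",
    States: {
      // The Parallel state outputs an array with one result per branch.
      RunTests: { Type: "Parallel", Branches: branches, End: true },
    },
  };
  // Same set of ARNs -> same name, so a repeated request can reuse the
  // existing state machine instead of creating a duplicate.
  const digest = crypto
    .createHash("sha256")
    .update([...testArns].sort().join("\n"))
    .digest("hex")
    .slice(0, 16);
  return { name: `tests-${digest}`, definition: JSON.stringify(definition) };
}

const machine = buildTestStateMachine([
  "arn:aws:lambda:us-east-1:123456789012:function:testA",
  "arn:aws:lambda:us-east-1:123456789012:function:testB",
]);
console.log(machine.name);
console.log(machine.definition);
```

Cleaning up state machines that are no longer needed would still have to be handled separately.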