I'm using serverless framework to enable hosting my functions on AWS Lambdas. For development, I'm using Kotlin.
Since I wanted to re-use resources ( like DB connection ) by a particular lambda, I have grouped apis which have the same handler function. Like all /posts related apis will be handled by PostHandler. Internally, based on routeKey, I'm assigning requests to concerned functions.
This means that for all /posts endpoints ( like GET /posts/{id}, POST /posts etc.) they all get logged to the same CloudWatch log group. This was becoming a problem. Since I was using an API Gateway, I also enable access logging at API Gateway level. This resolved my issue when I'm directly hitting an api.
However, I also have a service which would like to call these lambdas directly ( it could be lambda of that service invoking lambdas of my service or an EC2/ECS instance invoking lambdas of my service ). In this case, we would directly be using lambdas and no ApiGateway is involved. How can I maintain logging for different endpoints in this case?
It's hard to tell what the actual question is, I think you're saying that it's hard to differentiate between the different use cases and requests?
I would recommend using structured logs (logging as JSON) and then using CloudWatch Logs Insights to query for the specific information you need. For example, you could log a request-id with each message and then query on the request-id later. You could also do the same with a shared id between other services which would give you end-to-end log visibility.
Related
I am very new to AWS Lambda and am struggling to understand its functionalities based on many examples I found online (+reading endless documentations). I understand the primary goal of using such service is to implement a serverless architecture that is cost and probably effort-efficient by allowing Lambda and the API Gateway to take on the role of managing your server (so serverless does not mean you don't use a server, but the architecture takes care of things for you). I organized my research into two general approaches taken by developers to deploy a Flask web application to Lambda:
Deploy the entire application to Lambda using zappa-and zappa configurations (json file) will be the API Gateway authentication.
Deploy only the function, the parsing blackbox that transforms user input to a form the backend endpoint is expecting (and backwards as well) -> Grab a proxy url from the API Gateway that configures Lambda proxy -> Have a separate application program that uses the url
(and then there's 3, which does not use the API Gateway and invokes the Lambda function in the application itself-but I really want to get a hands on experience using the API Gateway)
Here are the questions I have for each of the above two approaches:
For 1, I don't understand how Lambda is calling the functions in the Flask application. According to my understanding, Lambda only calls functions that have the parameters event and context-or are the url calls (url formulated by the API Gateway) actually the events calling the separate functions in the Flask application, thus enabling Lambda to function as a 'serverless' environment-this doesn't make sense to me because event, in most of the examples I analyzed, is the user input data. That means some of the functions in the application have no event and some do, which means Lambda somehow magically figures out what to do with different function calls?
I also know that Lambda does have limited capacity, so is this the best way? It seems to be the standard way of deploying a web application on Lambda.
For 2, I understand the steps leading to incorporating the API Gateway urls in the Flask application. The Flask application therefore will use the url to access the Lambda function and have HTTP endpoints for user access. HOWEVER, that means, if I have the Flask application on my local computer, the application will only be hosted when I run the application on my computer-I would like it to have persistent public access (hopefully). I read about AWS Cloud9-would this be a good solution? Where should I deploy the application itself to optimize this architecture-without using services that take away the architecture's severless-ness (like an EC2 instance maybe OR on S3, where I would be putting my frontend html files, and host a website)? Also, going back to 1 (sorry I am trying to organize my ideas in a coherent way, and it's not working too well), will the application run consistently as long as I leave the API Gateway endpoint open?
I don't know what's the best practice for deploying a Flask application using AWS Lambda and the API Gateway but based on my findings, the above two are most frequently used. It would be really helpful if you could answer my questions so I could actually start playing with AWS Lambda! Thank you! (+I did read all the Amazon documentations, and these are the final-remaining questions I have before I start experimenting:))
Zappa has its own code to handle requests and make them compatible with the "Flask" format. Keep in mind that you aren't really using Flask as intended in either cases when using Lambda. Lambdas are invoked only when calls are made, flask usually keeps running looking for requests. But the continuously running part is handled by the API Gateway here. Zappa essentially creates a single ANY request on API gateway, this request is passed to your lambda handler which interprets it and uses it to invoke your flask function.
If you you are building API Gateway + Lambda you don't need to use Flask. It would be much easier to simply make a function that is invoked by the parameters passed to lambda handler by the API Gateway. The front end application you can host on S3 (if it is static or Angular).
I would say the best practice here would be to not use Flask and go with the API Gateway + Lambda option. This lets you put custom security and checks on your APIs as well as making the application a lot more stable as every request has its own lambda.
We have a requirement to write custom logs for the application to capture the things like who did what and when.
To do that we have created a Lambda to insert the logs in DynamoDb database. We need this Lambda to be called from a common place every time we call an API from frontend of the application instead of invoking it in each and every individual lambdas.
We tried invoking this in the API Gateway Authorizer but it doesn't work because our gateway authorizer is of type 'Token'. So, it does not accept any other parameters than access token. We cannot change the type of custom authorizer to type 'Request' because we need access token to be present for authorizing user in Cognito.
Question:
Is there any place where we can invoke this Logs Lambda so that it executes when each API is called?
Your last paragraph makes no sense but typically the best way to do this is streaming, as this minimises the amount of Lambda invocations you need to make.
You can stream API Access logs which contain things like the path, current time, principal to a cloudwatch log streams, or a lambda.
In this lambda you can do your custom logging logic there. If you have other sources which will have different types of events you may need to use Kinesis directly for streaming.
try using a different event trigger. If your lambda can get triggered by a queue or cloudfront you won't have authorization problems. however your application has to assume a suitable role to use some of these. If you're using Java, you can intercept your request in many ways and make the lambda call via SDK before processing the API. Need more details to provide a holistic solution.
I am contemplating if I should invoke lambda directly from another lambda or should I expose api through api-gateway in front of lambda. I am looking for pros and cons for both.
Approach #1 Using API Gateway
API Gateway and Lambda have one of the best integrations for serverless applications. It is very widely used and offers a ton of features - proxy integration, mapping templates, custom domain names and different types of authentication.
However, with these pros comes the cons due to some limits with using API Gateway. API Gateway has a default integration time out (a hard limit) of 29 seconds - which means the Lambda function needs to send back a response to API within this time frame or API fails with a 504 response. You may review other limits related to API Gateway here.
Approach #2 Lambda invoking Lambda
I'm not a big fan of this approach and have multiple reasons for it. I'll start with the additional code you have to write - same task with better features can be done by API Gateway with simple configurations on the AWS console.
A container calling another container(Lambda) can result it container-related problems - networking, container reuse and even managing IAM permissions properly.
Also, a Lambda function can be invoked by only three options - SDK, CLI or an entity that has "Invoke" permission. So basically, you need to have some kind of resource in front of your first Lambda to invoke it which will then invoke the second. In my opinion, API Gateway is the best front-end you can have for Lambda which is exactly AWS had in mind building these two services.
One of the pros I can think of this approach is the time out value - Lambda can run for up to max of 15 mins. Unless your client does not require a response back pretty quickly, you can run these two Lambda functions for a longer time to execute code.
Summary
All the above information was pretty general for anyone looking to use API Gateway and Lambda. I'll say it again that using API Gateway is a more convenient and useful approach however it may depend on your use-case. Hope this helps!
I have a few different services (generated by the Serverless Framework) that need to communicate between each other. The data is sensitive and requires authentication.
My current strategy is to create an api key for each service communicate between services using json web token like the token below.
fM61kaav8l3y_aLC/3ZZF7nlQGyYJsZVpLLiux5d84UnAoHOqLPu4dw3W7MiGwPiyN
What are some other options for communicating between services? Are there any downsides to this approach? To reiterate, the request needs to be authenticated and appropriately handle sensitive data.
Do you need sync or async communication?
A good approach would be to use events, because aws-lambda is designed as an event based system. So you could use SNS or SQS to decouple your services.
If you just want to make calls from one service to another you could invoke the lambda function directly via the aws-sdk see docs. So you would not add an API Gateways endpoint and your lambdas would stay private.
To better anwser your question you should give a short overview of your application and and an example of an interservice call you would make.
As I understand it, you intend to make the various functions in a given a service private. In doing so, each service will likely have serverless.yml file that resembles the following:
Image shows the setup for api keys used with a serverless framework rest api
While this is a suitable approach, it is less desirable than using ** Custom Authorizers**.
Custom Authorizers allow you to run an AWS Lambda Function before your targeted AWS Lambda Function. This is useful for Microservice Architectures or when you simply want to do some Authorization before running your business logic.
If you are familiar with the onEnter function when using ReactRouter, the logic among Custom Authorizers is similar.
Regarding implementation, since different services are leveraged to deploy various functions, consider deploying the function to AWS and noteing the ARN of the Lambda function. Follow these links to see the appropriate setup for the custom authorizer.
These images show the serverless.yml file for using custom authorizers when the authorizers are not part of the service but rather deployed on lambda already
The following github project aws-node-auth0-custom-authorizers-api/frontend is a good example of how to implement Custom Authorizers when the authorizer funciton is in the same service as the private function. Note your situation differs slightly yet you should expect their authorizer function logic to be simliar - only the project structure should differ
Is it possible to make some kind of HTTP request that will trigger Lambda and allow it to build a response for the request?
Is it possible for Lambda to access CloudFront cache directly or somehow get the data it needs. I guess it can be done making HTTP requests to CloudFront, but maybe there is more direct way to do that, no?
Or all this stuff I'm asking here is a peace of **** and I better go and buy a new server or optimize my code (actually, i would like to, but manager wants CloudFront + Lambda, so I'm trying to figure out if that is possible, but the docs don't give me an answer. Am I blind maybe?)
You can expose your lambda function via an API gateway. Then your lambda function can just run code that will access other services/resources (CloundFront, SNS, SQS, etc). Use the AWS SDK to access these services.
See Amazon API Gateway documentation: http://docs.aws.amazon.com/apigateway/latest/developerguide/getting-started.html