API Gateway Authorizer and Logout (Performance/Security Considerations) - amazon-web-services

I am using Cognito, API Gateway and Authorizers. Authorizers are configured to be cached for 5 minutes for performance. I felt that this was a nice feature.
I understand that authorizers are a good way to keep auth logic in one place and apps can assume users are already authorized. However, I am having my doubts.
The pentest report recommends that once a user has logged out, their tokens should no longer be usable. So does this mean that, for security, I should not enable authorizer caching? It also means every authenticated API call will have to go through the overhead of a Lambda authorizer ...
Also from a coding perspective, is it really a good idea to use authorizers, which are hard to test end to end? I could test the Lambda function as a unit. But what's more important to me is that they are attached to the correct APIs. There's currently no way I can see that allows me to test this easily.
Another problem is that, looking at the code, I can no longer easily tell what authorization is required ... I have to look up which authorizer is supposed to be attached (e.g. in CloudFormation) and then the Lambda code itself.
Is there a real benefit to using authorizers? Or what's the actual best practice here?

For security, I should not enable authorizer caching
If you have strict security requirements (e.g., as soon as a token is invalidated all requests should fail) you will need to turn off authorizer caching. See this answer from https://forums.aws.amazon.com/thread.jspa?messageID=703917:
Currently there is only one TTL for the authorizer cache, so in the scenario you presented API Gateway will continue to return the same cached policy (generating a 200) for the cache TTL, regardless of whether the token may be expired or not. You can tune your cache TTL down to a level you feel comfortable with, but this is set at the level of the authorizer, not on a per-token basis.
We are already considering updates to the custom authorizer features and will definitely consider your feedback and use case as we iterate on the feature.
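For reference, this TTL is exposed in CloudFormation as AuthorizerResultTtlInSeconds on the authorizer resource; a minimal sketch with illustrative resource names:

```yaml
MyAuthorizer:
  Type: AWS::ApiGateway::Authorizer
  Properties:
    Name: token-authorizer
    Type: TOKEN
    RestApiId: !Ref MyRestApi
    IdentitySource: method.request.header.Authorization
    AuthorizerUri: !Sub arn:aws:apigateway:${AWS::Region}:lambda:path/2015-03-31/functions/${AuthorizerFunction.Arn}/invocations
    AuthorizerResultTtlInSeconds: 0  # 0 disables caching; tokens are re-checked on every request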
It also means every authenticated API call will have to go through the overhead of a Lambda authorizer...
That it does. However, in practice, my team fought far harder with Lambda cold starts and ENI attachment than anything else performance-wise, so the overhead that our authorizer added ended up being negligible. This does not mean that the performance hit was not measurable, but it ended up increasing latency on the order of milliseconds over placing authorizer code directly in the Lambda, a tradeoff which made sense in our application. In stark contrast, Lambda cold starts could often take up to 30s.
Also from a coding perspective, is it really a good idea to use authorizers which are hard to test end to end?
In many applications built on a service-oriented architecture you are going to have "end-to-end" scenarios which cross multiple codebases and can only be tested in a deployed environment. Tests at this level are obviously expensive, so you should avoid testing functionality which could be covered by a lower-level (unit, integration, etc.) test. However, it is still extremely important to test the cohesiveness of your application, and the fact that you will need such tests is not necessarily a huge detractor for SOA.
I could test the Lambda function as a unit. But what's more important to me is that they are attached to the correct APIs. There's currently no way I can see that allows me to test this easily.
If you are considering multiple authorizers, one way to test that the right authorizers are wired up is to have each authorizer pass a fingerprint down to the endpoint. You can then ping your endpoints and have them return a health check status.
For example,
[ HTTP Request ] -> [ API Gateway ] -> [ Authorizer 1 ] -> [ Lambda 1 ] -> [ HTTP Response ]

                                       Authorizer payload:   Response payload:
                                         user identity         status: ok
                                         authorizer: 1         authorizer: 1
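As a concrete sketch of the fingerprint idea (all names and payload shapes here are illustrative, not from the original setup), the authorizer can pass its fingerprint through the authorizer context, and a health-check handler can echo it back:

```python
# Hypothetical sketch: the authorizer injects a fingerprint via the policy
# context, and a health-check Lambda echoes it so a test can verify wiring.

def authorizer_handler(event, context):
    # ... validate event["authorizationToken"] here ...
    return {
        "principalId": "user-123",
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": "Allow",
                "Resource": event["methodArn"],
            }],
        },
        # API Gateway forwards this map (string values only) to the integration
        "context": {"authorizer": "1"},
    }

def health_check_handler(event, context):
    # With Lambda proxy integration, the authorizer context is surfaced
    # under requestContext.authorizer
    fingerprint = event["requestContext"]["authorizer"].get("authorizer")
    return {
        "statusCode": 200,
        "body": f'{{"status": "ok", "authorizer": "{fingerprint}"}}',
    }
```

A test suite can then ping each endpoint with a valid token and assert that the fingerprint in the response matches the authorizer you expect to be attached.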
In practice, my team had one authorizer per service, which made testing this configuration non-critical (we only had to ensure that endpoints which should have been secured were).
Another problem is that, looking at the code, I can no longer easily tell what authorization is required... I have to look up which authorizer is supposed to be attached (e.g. in CloudFormation) and then the Lambda code itself.
Yes, this is true, and the nature of a hugely decoupled environment which was hard to test locally was one of the biggest gripes my team had when working with AWS infrastructure. However, I believe that it's mainly a learning curve when dealing with the AWS space. The development community as a whole is still relatively new to a lot of concepts that AWS exposes, such as infrastructure as code or microservices, so our tooling and education is lacking in this space when compared with traditional monolithic development.
Is this the right solution for your application? I couldn't tell you that without an in-depth analysis. There are plenty of opinions in the broader community which go both ways, but I would like to point you to this article, especially for the fallacies listed: Microservices – Please, don’t. Use microservices because you have developed a solid use case for them, not necessarily just because they're the newest buzzword in computer science.
Is there a real benefit to using authorizers? Or what's the actual best practice here?
My team used authorizers for AuthN (via a custom auth service), and handled AuthZ at the individual Lambda layer (via a different custom auth service). This was hugely beneficial to our architecture as it allowed us to isolate what were often extremely complex object-specific security rules from the simple question of identity. Your use case may be different, and I certainly won't claim to know best practices. However, I will point you to the API Gateway Authorizer examples for more ideas on how you can integrate this service into your application.
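As an illustration of this split (the names and data below are hypothetical, not from my team's actual services), the Lambda trusts the identity the authorizer established and applies its own object-level rules:

```python
# Hypothetical sketch of the AuthN/AuthZ split: the authorizer has already
# established identity; the Lambda enforces object-specific authorization.

OWNERS = {"order-42": "alice"}  # stand-in for an AuthZ service or datastore

def can_access(user, order_id):
    # Object-specific rule: only the owner may read the order.
    return OWNERS.get(order_id) == user

def get_order_handler(event, context):
    # Identity comes from the authorizer (AuthN already done upstream)
    user = event["requestContext"]["authorizer"]["principalId"]
    order_id = event["pathParameters"]["orderId"]
    if not can_access(user, order_id):
        return {"statusCode": 403, "body": "Forbidden"}
    return {"statusCode": 200, "body": f"order {order_id} for {user}"}
```

The benefit is that the complex, per-object rules live next to the business logic they protect, while the question of "who is this?" is answered exactly once at the gateway.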
Best of luck.

Related

AWS WAF Log Utilisation + Penetration Testing with Web Applications

How can AWS Web Application Firewall help me identify which penetration tests I should run against my web application? Once I have access to the WAF logs, how can I best utilise them to identify penetration testing activity?
I would say that the AWS WAF (or any WAF) is not a good indicator of what type of pentesting you should be doing. Determining the scope and type of pentesting is one of the most important first steps a qualified pentester or consultancy should be doing.
On the topic of WAFs, I would also say that they are not a good indicator of true manual pentesting. While the AWS WAF is great at catching SQL Injection and XSS test cases, it is not capable of detecting parameter tampering attacks.
So while it can create alerts that are likely to detect scans, it may fall short of detecting subtle human-driven test cases (which are often more dangerous).
To detect true pentest test-cases, it is always most valuable to add instrumentation at the application layer. This way you can create alerts for when a user tries to access pages or objects that belong to other users.
Also consider that if you create these alerts at the application layer, you can include more valuable data points such as the user and IP address. This provides a valuable distinction between alerts created by random scanners on the internet and those created by authenticated users.

Lambda AWS function for each endpoint

I have an application with 3 modules and 25 endpoints (between modules). Modules: Users, CRM, PQR.
I want to optimize AWS costs and generally respect the architecture best practices.
Should I build a lambda function for each endpoint?
Does using many functions cost more than using only one?
The link in Gustavo's answer provides a decent starting point. I'll elaborate on that based on the criteria you mentioned in the comments.
You mentioned that you want to optimize for cost and architecture best practices, let's start with the cost component.
Lambda pricing is fairly straightforward and you can check it out on the pricing page. Basically, you pay for how long your code runs, in 1 ms increments. How much each millisecond costs depends on how much memory you provision for your Lambda function. Lambda is typically not the most expensive item on your bill, so I'd only start optimizing it once it becomes a problem.
From a pricing perspective it doesn't really matter if you have fewer or more Lambda functions.
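A back-of-the-envelope sketch makes this concrete. The per-GB-second and per-request prices below are examples only (check the current pricing page), and the free tier is ignored; the point is that total invocations and duration drive the cost, not the number of functions they are split across:

```python
# Rough Lambda cost sketch. Prices are illustrative examples, not current
# quotes; always check the AWS Lambda pricing page.
GB_SECOND_PRICE = 0.0000166667    # example price per GB-second, USD
REQUEST_PRICE = 0.20 / 1_000_000  # example price per request, USD

def monthly_cost(invocations, avg_ms, memory_mb):
    # Billed compute = invocations * duration (s) * memory (GB)
    gb_seconds = invocations * (avg_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * GB_SECOND_PRICE + invocations * REQUEST_PRICE

# 5M invocations/month, 120 ms average, 256 MB memory:
print(round(monthly_cost(5_000_000, 120, 256), 2))  # → 3.5 (USD)
```

Whether those 5M invocations hit one function or twenty-five, the compute bill is the same; only duration, memory and request count matter.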
In terms of architecture best practices, there is no single one-size-fits-all reference architecture, but the post Gustavo mentioned is a good starting point: Best practices for organizing larger serverless applications. How you structure your application can depend on many factors:
Development team size
Development team maturity/experience (in terms of AWS technologies)
Load patterns in the application
Development process
[...]
You mention three main components/modules with 25 endpoints in total:
Users
CRM
PQR
Since you didn't tell us much about the technology stack, I'm going to assume you're trying to build a REST API that serves as the backend for some frontend application.
In that case you could think of the three modules as three microservices, which implement specific functionality for the application. Each of them implements a few endpoints (combination of HTTP-Method and path). If you start with an API Gateway as the entry point for your architecture, you can use that as an abstraction of the internal architecture for your clients.
The API Gateway can route requests to different Lambda functions based on the HTTP method and path. You can now choose how to implement the backend. I'd probably start off with a common codebase from which multiple Lambdas are built and use the API Gateway to map each endpoint to a Lambda function. You can also start with larger multi-purpose Lambdas and refactor them over time to extract specific endpoints, then use the API Gateway to route to the more specialized Lambdas.
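A sketch of the "larger multi-purpose Lambda" starting point (route paths and handler names are illustrative): one function dispatches internally on method and path, and individual routes can later be extracted into their own Lambdas without changing the API Gateway contract.

```python
# Hypothetical sketch: a multi-purpose Lambda that routes on method + path.
# Each entry in ROUTES is a candidate for extraction into its own function.

def list_users(event):
    return {"statusCode": 200, "body": '["alice", "bob"]'}

def list_contacts(event):
    return {"statusCode": 200, "body": '["acme"]'}

ROUTES = {
    ("GET", "/users"): list_users,
    ("GET", "/crm/contacts"): list_contacts,
}

def handler(event, context):
    route = ROUTES.get((event["httpMethod"], event["path"]))
    if route is None:
        return {"statusCode": 404, "body": "not found"}
    return route(event)
```

Because API Gateway abstracts the backend from clients, moving a route out of this dispatcher into a dedicated Lambda is invisible to the frontend.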
You might have noticed that this is a bit vague, and that's on purpose. I think you're going to end up with roughly as many Lambdas as you have endpoints, but that doesn't mean you have to start that way. If you're just getting started with AWS, managing a bunch of Lambdas and their interactions can seem daunting. Start with more familiar architectures and then refactor them to be more cloud-native over time.
It depends on your architecture and how decoupled you want it to be. Here is a good starting point for you to take a look into best practices:
https://aws.amazon.com/blogs/compute/best-practices-for-organizing-larger-serverless-applications/

Load testing AWS SDK client

What is the recommended way to performance test AWS SDK clients? I'm basically just listing/describing resources and would like to see what happens when I query 10k objects. Does AWS provide some type of mock API, or do I really need to request 10k of each type of resource to do this?
I can of course mock in at least two levels:
SDK: I wrap the SDK with my own interfaces and create mocks. This doesn't exercise the SDK's JSON-to-object mapping code, and my mocks affect the AppDomain with additional memory, garbage collection, etc.
REST API: As I understand it, the SDKs are just wrappers over the REST API (hence the HTTP response codes shown in the objects). It seems I can configure the SDK to go to custom endpoints.
This isolates the mocks from the main AppDomain and is more representative, but of course I'm still making some assumptions about response time, limits, etc.
Besides the above taking a long time to implement, I would like to make sure my code won't fail at scale, either locally or at AWS. The only way I see to guarantee that is creating (and paying for) the resources at AWS. Am I missing anything?
When you query 10k or more objects you'll have to deal with:
Pagination - the API usually returns only a limited number of items per call, providing a NextToken for the next call.
Rate limiting - if you hammer some AWS APIs too much they'll rate-limit you, which the SDK will typically report as some kind of Rate Limit Exceeded exception.
Memory usage - hopefully you don't collect all the results in memory before processing. Process them as they arrive to conserve your operating memory.
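The pagination loop is the same shape across most AWS list/describe APIs. A sketch with the page-fetching call injected as a parameter (so it can be exercised with a fake instead of real AWS calls; the key names mirror the common NextToken convention but vary per API):

```python
# Generic NextToken pagination sketch. `fetch_page` stands in for an SDK
# call such as boto3's list_* methods; items are yielded one at a time so
# 10k+ results never sit in memory at once.

def iter_items(fetch_page):
    token = None
    while True:
        page = fetch_page(next_token=token)
        for item in page["Items"]:
            yield item                    # process as they arrive
        token = page.get("NextToken")
        if token is None:                 # last page reached
            break
```

With boto3 specifically, built-in paginators (client.get_paginator(...)) implement this loop for you, and the SDK retries throttling errors with backoff by default.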
Other than that I don't see why it shouldn't work.
Update: also check out Moto, the AWS mocking library (for Python), which can also run in a standalone server mode for use with other languages. However, as with any mocking, it may not behave 100% the same as the real thing, for instance around the rate-limiting behaviour.

Can AWS Lambda Replace an entire Rest Api layer in an enterprise web application

I am new to AWS and have been reading about AWS Lambda. It's very useful, but you still have to write individual Lambda functions instead of the application as a whole. I am wondering whether, practically speaking, AWS Lambda can replace an entire REST API layer in an enterprise web application.
Of course, everything is possible in the computer world, but the question you need to answer is: is Lambda/serverless the best way for me?
For example, you need a smaller business flow per Lambda (Lambda has hardware limits and needs short compute and startup times for cost savings), which means you must split up your flow; how well that works depends on your business area and implementation. Does your domain fit this model? Lambda can handle almost everything in combination with other AWS services (to be honest, in some cases Lambda is harder than the current system and community support is smaller than for traditional stacks, but it also has lots of advantages, as you know). You can check this repo, a fully serverless booking app, and this serverless e-commerce repo.
To sum up, if your team is ready for it, you can start by converting some part of your application and checking that everything is OK. The answer totally depends on your team and business, because nothing is impossible; that's engineering.
That's just my opinion, since your question reads more like a discussion prompt.

How to build complex apps with AWS Lambda and SOA?

We currently run a Java backend which we're hoping to move away from and switch to Node running on AWS Lambda & Serverless.
Ideally during this process we want to build out a fully service orientated architecture.
My question is this: if our frontend Angular app requests the current user's ordered items, it would need to hit three services: the user service, the order service and the item service.
Does this mean we would need to make three GET requests to these services? At the moment we would have a single endpoint built for that specific request, which can take advantage of DB joins for optimal performance.
I understand the benefits of SOA, but how do we scale when performing more complex requests such as this? Are there any good resources I can take a look at?
Looking at your question, I would advise you to align your priorities first: why do you want to move away from the Java backend that you're running now? Which problems do you want to overcome?
You're combining the microservices architecture and the concept of serverless infrastructure in your question. Both can be used in conjunction, but they don't have to. A lot of companies are using microservices, even bigger enterprises like Uber (on NodeJS), but serverless infrastructures like Lambda are really just getting started. I would advise you to read up on microservices especially, e.g. here are some nice articles. You'll also find answers to your question about performance and joins.
When considering an architecture based on Lambda, do consider that no state whatsoever is possible in a Lambda function. This goes a step further than the stateless services we usually talk about; those generally just avoid keeping client or session state between requests. A Lambda function cannot rely on any state at all, so e.g. a persistent DB connection pool is not possible. For all the downsides, there's also a lot of stuff you don't have to deal with, which can be very beneficial, especially in terms of scalability.