AWS CDK: What is the best way to implement multiple Stacks? - amazon-web-services

I have a few things to get clear, specifically regarding modeling architecture for a serverless application using AWS CDK.
I’m currently working on a serverless application developed using AWS CDK in TypeScript. By convention, we follow these rules:
1. A stack should have only one table (DynamoDB).
2. A stack should have only one REST API (API Gateway).
3. A stack should not depend on any other stack (no cross-references), unless it is the Event-Stack (a stack dedicated to managing EventBridge operations).
The reason is that each stack can then be deployed independently, without interference from other stacks. In a way, our stacks are equivalent to microservices in a microservice architecture.
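As an illustration only, the three rules above could be checked mechanically against the CloudFormation templates that cdk synth produces. The sketch below assumes a template has already been loaded into a dict, that cross-stack references show up as Fn::ImportValue, and that the Event-Stack's export names start with a recognizable prefix ("EventStack" here is a made-up prefix, as are all resource names in the sample).

```python
from collections import Counter


def find_imports(node):
    """Recursively collect every Fn::ImportValue argument in a template fragment."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "Fn::ImportValue":
                found.append(value)
            else:
                found.extend(find_imports(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_imports(item))
    return found


def check_stack_rules(template, allowed_import_prefixes=("EventStack",)):
    """Return a list of rule violations for one synthesized template (a dict)."""
    resources = template.get("Resources", {})
    counts = Counter(res.get("Type") for res in resources.values())
    violations = []
    if counts["AWS::DynamoDB::Table"] > 1:
        violations.append("more than one DynamoDB table")
    if counts["AWS::ApiGateway::RestApi"] > 1:
        violations.append("more than one REST API")
    for name in find_imports(template):
        # Assumption: only exports owned by the Event-Stack carry an allowed prefix.
        if not (isinstance(name, str) and name.startswith(tuple(allowed_import_prefixes))):
            violations.append(f"cross-stack import: {name}")
    return violations


# Hypothetical template fragment: one table, one API, two imports.
sample = {
    "Resources": {
        "UsersTable": {"Type": "AWS::DynamoDB::Table"},
        "Api": {"Type": "AWS::ApiGateway::RestApi"},
        "Handler": {
            "Type": "AWS::Lambda::Function",
            "Properties": {
                "Environment": {
                    "Variables": {
                        "BUS_NAME": {"Fn::ImportValue": "EventStack:BusName"},
                        "ROLES_TABLE": {"Fn::ImportValue": "RolesStack:TableName"},
                    }
                }
            },
        },
    }
}

violations = check_stack_rules(sample)
```

With the sample above, only the import from the hypothetical RolesStack is reported; the Event-Stack import passes because of the prefix exception.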
At the moment all the REST APIs are public, and we have now decided to make them private by attaching custom Lambda authorizers to each API Gateway resource. In this custom Lambda authorizer, we have to perform certain operations (apart from token validation) before allowing the user's request to proceed. Those operations are:
1. Get the user’s role from the DB using the user ID in the token.
2. Get the user’s subscription plan (paid, free, etc.) from the DB using the user ID in the token.
3. Get the user’s current payment status (due, no due, fully paid, etc.) from the DB using the user ID in the token.
4. Get the scopes allowed for this user based on 1, 2, and 3.
5. Check whether the user can access this scope (the resource the user is currently requesting) based on 4.
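To make the five steps concrete, here is a minimal sketch of the decision logic as pure functions, independent of where the data lives. The scope names, the role and plan tables, and the rule that a "due" payment blocks writes are all invented for illustration; in the real authorizer each lookup would be a DynamoDB read. The returned dict follows the policy shape that API Gateway expects from a Lambda authorizer.

```python
# Hypothetical in-memory stand-ins for the role and plan lookups (steps 1 and 2).
ROLE_SCOPES = {
    "admin": {"orders:read", "orders:write", "reports:read"},
    "member": {"orders:read", "orders:write"},
}
PLAN_SCOPES = {
    "paid": {"orders:read", "orders:write", "reports:read"},
    "free": {"orders:read"},
}
# Assumed rule for step 3: users with a payment due lose write access.
BLOCKED_WHEN_DUE = {"orders:write"}


def allowed_scopes(role, plan, payment_status):
    """Step 4: combine role, plan and payment status into a set of scopes."""
    scopes = ROLE_SCOPES.get(role, set()) & PLAN_SCOPES.get(plan, set())
    if payment_status == "due":
        scopes = scopes - BLOCKED_WHEN_DUE
    return scopes


def authorize(user, requested_scope, method_arn):
    """Step 5: build the IAM policy that the authorizer returns to API Gateway."""
    scopes = allowed_scopes(user["role"], user["plan"], user["payment_status"])
    effect = "Allow" if requested_scope in scopes else "Deny"
    return {
        "principalId": user["user_id"],
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": method_arn,
                }
            ],
        },
    }


# Hypothetical user record and method ARN.
free_member = {"user_id": "u-1", "role": "member", "plan": "free", "payment_status": "no due"}
ARN = "arn:aws:execute-api:us-east-1:123456789012:abc123/prod/GET/orders"
decision = authorize(free_member, "orders:read", ARN)
```

Keeping the decision separate from the data access means the open question is only how the authorizer obtains role, plan and payment status, not how it evaluates them.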
This authorizer Lambda function needs to be used by all the other stacks to make their APIs private. The problem is that roles, scopes, subscriptions, payments and user data live in different stacks, each in its own dedicated DynamoDB table. Because of the rules explained above (especially rule 3), we cannot depend on resources defined in other stacks. Hence we are unable to create the authorizer we want.
Solutions we could think of and their problems:
Since EventBridge isn't bi-directional we cannot use it to fetch data from a different stack resource.
We can invoke a Lambda in a different stack using its ARN and get the required data from its response, but AWS discourages this as a CDK anti-pattern.
We cannot use a technology like gRPC because it requires a continuously running server, which is outside the scope of a serverless architecture.
There was also a proposal to re-design the CDK layout of our application. The main feature of this layout is going from non-crossed-references to adopting a fully-crossed-references pattern. (Inspired by layered architecture as described in this AWS best practice)
Based on that article, we came up with a layout like this.
Presentation Layer:
  Stack for deploying the consumer web app
  Stack for deploying the admin portal web app
Application Layer:
  Stack for REST API definitions using API Gateway
  Stack for Lambda functions running business-specific operations (e.g. CRUD)
  Stack for Lambda functions that run on event triggers
  Stack for authorisation (custom Lambda authorizer(s))
  Stack for authentication (Cognito user pool and client)
  Stack for events (EventBuses)
  Stack for storage (S3)
Data Layer:
  Stack containing all the database definitions
  There could be another stack for reporting, data engineering, etc.
As you can see, now stacks are going to have multiple dependencies with other stacks' resources (But no circular dependencies, as shown in the attached image). While this pattern unblocks us from writing an effective custom Lambda authorizer we are not sure whether this pattern won't be a problem in the long run, when the application's scope increases.
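One way to keep the "no circular dependencies" property honest as the application grows is to describe the stack graph as data and check it automatically. The sketch below uses Python's standard graphlib module; the stack names and edges are hypothetical and only loosely follow the layout above.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical stack graph: each stack maps to the stacks it references.
DEPENDENCIES = {
    "ConsumerWebApp": {"RestApi"},
    "AdminPortal": {"RestApi"},
    "RestApi": {"BusinessLambdas", "Authorisation"},
    "BusinessLambdas": {"Database", "Events", "Storage"},
    "EventLambdas": {"Database", "Events"},
    "Authorisation": {"Authentication", "Database"},
    "Authentication": set(),
    "Events": set(),
    "Storage": set(),
    "Database": set(),
}


def find_cycle(dependencies):
    """Return the stacks that form a cycle, or None if the graph is acyclic."""
    try:
        TopologicalSorter(dependencies).prepare()
    except CycleError as error:
        # graphlib reports the cycle as the second element of the exception args.
        return error.args[1]
    return None


cycle = find_cycle(DEPENDENCIES)
```

A check like this could run in CI, so that a new cross-reference which closes a loop is caught before a deployment fails.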
I would highly appreciate any help you could give us to resolve this problem. Thanks!

Multiple options:
Use Parameter Store rather than CloudFormation exports
Split stacks into a layered architecture like you described in your answer and import things between stacks using SSM Parameter Store like the other answer describes. This is the most obvious choice for breaking inter-stack dependencies. I use it all the time.
Use fixed resource names, easily referenceable and importable
Stack A creates S3 bucket "myapp-users"; Stack B imports the bucket by its fixed name using Bucket.fromBucketName(this, 'Users', 'myapp-users'). Fixed resource names have their own downsides, so this should be used only for resources that are genuinely shared between stacks. For example, they prevent easy replacement of the resource. You also need to enforce the correct stack deployment order yourself: CDK will no longer help you with that, since there are no cross-stack dependencies for it to enforce.
Combine the app into a single stack
This sounds extreme and counterintuitive, but I have found that most real-life teams don't actually have a pressing need for multi-stack deployment. If your only concern is separating the code owners of different parts of the application, you can get away with splitting the stack into multiple Constructs, composed into a single stack, where each team takes care of its Construct and its children. Think of it as combining multiple Git repos into a monorepo. A lot of projects are doing that.

A strategy I use to avoid hard cross-references involves storing shared resource values in AWS Systems Manager.
In the exporting stack, we can save the name of an S3 Bucket for instance:
    from aws_cdk import aws_ssm as ssm

    # Publish the bucket name under a well-known parameter name.
    ssm.StringParameter(
        scope=self,
        id="/example_stack/example_bucket_name",
        string_value=self.example_bucket.bucket_name,
        parameter_name="/example_stack/example_bucket_name",
    )
and then in the importing stack, retrieve the name and create an IBucket by using a .from_ method.
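    from aws_cdk import aws_s3 as s3, aws_ssm as ssm

    # Look up the bucket name that the exporting stack stored in SSM.
    example_bucket_name = ssm.StringParameter.value_for_string_parameter(
        scope=self,
        parameter_name="/example_stack/example_bucket_name",
    )
    # Wrap the name in an IBucket so it can be used like any other bucket.
    example_bucket = s3.Bucket.from_bucket_name(
        scope=self,
        id="example_bucket_from_ssm",
        bucket_name=example_bucket_name,
    )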
You'll have to figure out the right order to deploy your stacks but otherwise, I've found this to be a good strategy to avoid the issues encountered with stack dependencies.
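Working out that order by hand gets tedious as the number of parameters grows. As a rough sketch, if each stack's written and read parameter names are recorded in a small table, a valid order can be derived with the standard graphlib module. The stack names and the second parameter below are hypothetical; only /example_stack/example_bucket_name comes from the snippets above.

```python
from graphlib import TopologicalSorter

# Hypothetical inventory: which SSM parameters each stack writes and reads.
STACKS = {
    "ExampleStack": {
        "exports": {"/example_stack/example_bucket_name"},
        "imports": set(),
    },
    "DataStack": {
        "exports": {"/data_stack/users_table_name"},
        "imports": set(),
    },
    "ApiStack": {
        "exports": set(),
        "imports": {
            "/example_stack/example_bucket_name",
            "/data_stack/users_table_name",
        },
    },
}


def deployment_order(stacks):
    """Return stack names ordered so that every parameter exists before it is read."""
    owner = {
        param: name
        for name, spec in stacks.items()
        for param in spec["exports"]
    }
    graph = {}
    for name, spec in stacks.items():
        missing = spec["imports"] - set(owner)
        if missing:
            raise ValueError(f"{name} reads parameters nobody writes: {sorted(missing)}")
        # A stack depends on every stack that writes a parameter it reads.
        graph[name] = {owner[param] for param in spec["imports"]}
    return list(TopologicalSorter(graph).static_order())


order = deployment_order(STACKS)
```

The relative order of stacks that do not depend on each other is not significant; what matters is that ApiStack comes after both of the stacks it reads from.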

Related

What are the limitations of using AWS appsync api(GraphQL) through Amplify?

I want to avoid custom/manual resolvers in AppSync completely, so I'm using Amplify to set up the GraphQL AppSync API in my app. I make all changes by editing schema.graphql and running amplify push.
I have 2 questions :
1. What are the limitations and what problems I'm going to face in future?
2. Can GraphQL subscriptions receive updates when the app is not running (i.e. so that the user is still notified)?
Tons of business logic will be exposed in the client-side code.
For push notifications, I think you would still have to go through external integrations like FCM/APNS. Multiple integration options are available in SNS.
As a preamble to these answers: using an Amplify-generated GraphQL API and resolvers doesn't stop you from adding custom resolvers and pipeline functions later. You just need to learn quite a bit about where to put them in Amplify's backend file structure.
1. What are the limitations and what problems I'm going to face in future?
This depends on how well your application's use case matches the GraphQL schema design and on whether your application is relatively self-contained. Amplify becomes more complex when your application needs to talk to other back-end systems: you'll need to start using DynamoDB triggers to notify other state machines, EventBridge, SNS or similar services.
As mentioned, none of these problems is crippling. You can deal with them later, but it will be a step up in the AWS knowledge required to implement them.
For small high-volume/high-availability apps, Amplify with DynamoDB as it comes is great. If your application matures into many microservices and sites, you'll need to learn quite a bit more AWS to make them play together well. Amplify lays out DynamoDB on a table-per-object basis, and you'll probably be stuck with (and paying for) that. Think hard about whether you might ever want to move to a different, optimised data source (RDS or a single DynamoDB table) to reduce the number of queries required to fulfil your GraphQL requests.
2. Can GraphQL subscriptions receive updates when the app is not running (i.e. so that the user is still notified)?
No. Anurag mentions SNS, which would be a good option for notifying users outside the app; it is best to combine subscriptions with another service.

AWS CDK multi stack or single stack

I use CDK to deploy a Lambda function (along with an IAM role and a queue) plus monitoring resources for the Lambda, the Lambda log group and the queue. What I have right now is basically two classes, one that creates all the Lambda-related resources and another that creates the monitoring resources, and both are added to a single deployment stack.
Recently I deployed this to a new account and realized that my stack fails to create, because some of the monitoring resources look for the Lambda log group and can't find it since it has not been created yet.
So which is the better option:
have two deployment groups, one for the Lambda-related resources and one for the monitoring resources
use dependencies to create some ordering in my stack
Both seem like possible solutions, but which is the better long-term solution?
Assuming you mean a Stack for each of your two classes, you are better off making them both cdk.NestedStack subclasses and instantiating them in a single common stack. You can then expose constructs as class attributes in one stack and pass them as parameters into the other. Of course, this only works one way. If you have to go both ways, you need to re-evaluate how you have your stacks organized.
The advantage of doing this is considerable: exposing constructs as attributes is the best practice, as it gives you direct access to a construct before it generates its CloudFormation data. You have complete access to every part of that construct, including various ARNs (such as DynamoDB stream ARNs, which are difficult to import), and you automatically know the layer versions for Lambda layers, among many other things.
In addition, you never run into a stack dependency. If they are different top-level stacks and you share constructs between them, you can very easily run into lock situations where attempting to change something in one stack creates a dependency lock and prevents the stack from deploying.
The downside is that they are all part of the same deployment, so something may be updated when you didn't expect it to be. CDK does use CloudFormation change sets, so it should not update things that have no changes applied to them, but sometimes changes occur that you may not be aware of because of the way CDK generates tokens and similar values.
If you do not go this route, you are stuck using the various from* methods on CDK constructs to import the existing resource into your stack. This causes some issues, as it can't import everything about a given construct at synth time (layer versions and DynamoDB stream ARNs are the two notable ones I mentioned already). Plus, you need to know the name of the construct, and best practice says you shouldn't deliberately name your constructs, so that you can easily spin up ad hoc versions of your app without naming collisions.

AWS service for managing state data - dynamodb/step functions/sqs?

I am building a Desktop-on-Demand solution using AWS Workspaces product and I am trying to understand what is the best AWS service to fit my requirements for managing state data for new users.
In a nutshell, the solution will create a new AWS Workspace (virtual desktop instance) for a user when multiple conditions are met and checks are satisfied. These tasks would be performed by multiple Lambda functions.
DynamoDB would be used as a central point for storing configuration data such as user data, user group data and deployed virtual desktop data.
Logic for Desktops creation would be implemented using Step Functions like below:
Event hook comes from Identity Management system firing a lambda function that checks if user desktop already exists in DynamoDB table
If it does not exist, another lambda creates AWS AD connector
Once this is done, another lambda builds custom image for new desktop if needed
Another lambda pulls latest data from Identity Management system and updates DynamoDB table for users and groups.
Other lambda functions that may be fired up as a dependency
To ensure we have a transactional mechanism, we only deploy a new desktop when all conditions are met. I can think of a few ways of implementing this check:
Use a DynamoDB table to keep state data. When all attributes in the item are in the expected state, the desktop can be created. If any Lambda fails or produces data that does not fit, don't create the desktop.
Just use Step Functions and design its logic flow so that all conditions must be satisfied before the desktop is created.
Someone suggested using SQS queue but I don't see how this can be used for my purpose.
What is the best way to keep this data?
Step Functions is the method I would use for this. The DynamoDB solution would also work, but this seems like exactly the sort of thing Step Functions was designed to handle.
I agree that SQS would not be a correct solution.
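To illustrate why Step Functions fits, here is a rough sketch of the workflow from the question written as an Amazon States Language definition, built as a Python dict. Every task catches all errors and routes to a Fail state, so the desktop is only created when each earlier step has succeeded. The function names, region, account ID and the desktopExists output field are placeholders, not part of the original design.

```python
import json


def lambda_arn(name):
    # Placeholder ARN: region, account ID and function names are hypothetical.
    return f"arn:aws:lambda:us-east-1:123456789012:function:{name}"


def task(function_name, next_state):
    """A Task state that aborts the whole workflow if the Lambda fails."""
    return {
        "Type": "Task",
        "Resource": lambda_arn(function_name),
        "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "Abort"}],
        "Next": next_state,
    }


DEFINITION = {
    "StartAt": "CheckDesktopExists",
    "States": {
        "CheckDesktopExists": task("check-desktop-exists", "DesktopAlreadyExists"),
        "DesktopAlreadyExists": {
            "Type": "Choice",
            # Assumes the check Lambda returns {"desktopExists": true/false}.
            "Choices": [
                {"Variable": "$.desktopExists", "BooleanEquals": True, "Next": "Done"}
            ],
            "Default": "CreateAdConnector",
        },
        "CreateAdConnector": task("create-ad-connector", "BuildImage"),
        "BuildImage": task("build-image", "SyncIdentityData"),
        "SyncIdentityData": task("sync-identity-data", "CreateDesktop"),
        "CreateDesktop": task("create-desktop", "Done"),
        "Abort": {
            "Type": "Fail",
            "Error": "PreconditionFailed",
            "Cause": "A prerequisite step failed, so no desktop was created",
        },
        "Done": {"Type": "Succeed"},
    },
}


def undefined_targets(definition):
    """Return every transition target that is not a defined state."""
    states = definition["States"]
    targets = {definition["StartAt"]}
    for state in states.values():
        targets.update(state[key] for key in ("Next", "Default") if key in state)
        for rule in state.get("Choices", []) + state.get("Catch", []):
            targets.add(rule["Next"])
    return targets - set(states)


definition_json = json.dumps(DEFINITION, indent=2)
```

The DynamoDB table can still hold the configuration data; the state machine only owns the question of whether all preconditions have been met.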

Tracking of Common Services in AWS using X-ray

I have multiple Lambdas, and each of them invokes either another Lambda, a REST API, DynamoDB, S3, etc.
Example :
HotelBooking
FlightBooking
These invoke the common services like
BookingService
InvoiceService
I need to track which application (FlightBooking or HotelBooking) is invoking the booking service, how many times, how much CPU it uses, etc.
Is this possible through X-Ray in AWS, or is there a better way?
After some research, I believe annotations are the best way to solve the above problem.
So we need to add an annotation in Node.js:
    const AWSXRay = require('aws-xray-sdk-core');

    // Open a subsegment and tag it with an indexed annotation.
    AWSXRay.captureFunc('annotations', (subsegment) => {
      subsegment.addAnnotation('application', "BookingService");
    });
Annotations are indexed and can also be used to filter traces, using an expression like this: annotation.application = "BookingService"
More info:
X-Ray merges segments and subsegments into traces, so an annotation at the subsegment level is enough for filtering the traces.
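As a small illustration, the filter expressions can be generated rather than typed by hand, one per calling application. This sketch assumes each caller records its own name in the application annotation (the snippet above records "BookingService" instead, so adapt the value to whatever is actually annotated). The helper name is made up, and the resulting strings are what you would paste into the X-Ray console or pass as a filter expression to the API.

```python
import re


def annotation_filter(key, value):
    """Build an X-Ray filter expression that matches one string annotation."""
    # X-Ray annotation keys must be alphanumeric (underscores allowed) to be filterable.
    if not re.fullmatch(r"[A-Za-z0-9_]+", key):
        raise ValueError("annotation keys must be alphanumeric (underscores allowed)")
    # This sketch does not attempt any quoting, so it simply rejects embedded quotes.
    if '"' in value:
        raise ValueError("value must not contain double quotes in this sketch")
    return f'annotation.{key} = "{value}"'


# Hypothetical caller names taken from the question.
CALLERS = ["HotelBooking", "FlightBooking"]
FILTERS = {caller: annotation_filter("application", caller) for caller in CALLERS}
```

Counting the traces that match each expression over a time window gives the number of invocations per calling application.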
AWS X-Ray can trace the downstream calls made by your Lambda application as segments and subsegments, giving you an overall view of your application. I think X-Ray fits your use case: you will be able to trace the DynamoDB, S3 or REST API calls that your Lambda makes, and identify which application (in your case FlightBooking or HotelBooking) is invoking the booking service. You might not see performance numbers (e.g. memory and CPU usage), but you will be able to trace exceptions inside your application.

How to Deploy Lambdas from one code base?

After doing some brief research, I'm receiving conflicting answers regarding best practices for the AWS lambda service. I'm writing a few microservices for my company that will automate the steps for adding clients to our various services: creating api keys, uploading documents to a repo, sending an email, etc.
I have copied and pasted my code for 3 Lambdas now (only changing a few variable values), but before I start doing this for all of them, I wanted to ask whether anyone has an easier method. I do know about proxy integration, where I could use the same Lambda for similar requests and differentiate them by their resource paths; however, is there an easier way I could "map" the Lambdas to shared code?
I was thinking about using an S3 Object to hold the code, then change the variables by environment variables (which could very well work), but does anyone have any other recommendations or obvious solutions I'm not realizing?
Thanks!
There is a very recent feature called Lambda Layers that specifically allows you to share code between AWS Lambda functions.
You would build the common code as a library and deploy it as a Layer. Then each individual Lambda function would include that Layer.
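As a rough sketch of what "deploy it as a Layer" involves, the snippet below packages a shared module into a layer archive, assuming the functions use a Python runtime (the question does not say which runtime is in use). For Python, files placed under python/ inside the archive end up on the import path of every function that includes the layer. The module name and its contents are invented for illustration.

```python
import io
import zipfile


def build_layer_zip(modules):
    """Return the bytes of a layer archive; `modules` maps file name to source code."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for filename, source in modules.items():
            # Layers are extracted under /opt, and /opt/python is on sys.path.
            archive.writestr(f"python/{filename}", source)
    return buffer.getvalue()


# Hypothetical shared helper: the per-function values come from environment variables.
SHARED_MODULES = {
    "onboarding_common.py": (
        "import os\n"
        "\n"
        "def client_setting(name):\n"
        "    return os.environ[name]\n"
    ),
}

layer_bytes = build_layer_zip(SHARED_MODULES)
```

Each function's handler would then import onboarding_common and read its own values from environment variables, so only the configuration differs between functions and the shared code lives in one place.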