Tracking of Common Services in AWS using X-Ray

I have multiple Lambdas, and each of them invokes another Lambda, a REST API, DynamoDB, S3, etc.
Example:
HotelBooking
FlightBooking
These invoke common services like:
BookingService
InvoiceService
I need to track which application (i.e. FlightBooking / HotelBooking) is invoking the BookingService, and how many times, how much CPU it uses, etc.
Is this possible through X-Ray in AWS, or is there a better way?

After some research, I believe annotations are the best way to solve the above problem.
So we need to add an annotation in Node.js:
const AWSXRay = require('aws-xray-sdk-core');

// Record the calling application as an indexed annotation on a subsegment.
AWSXRay.captureFunc('annotations', (subsegment) => {
  subsegment.addAnnotation('application', 'BookingService');
});
Annotations are indexed and can be used for filtering too, using an expression like this: annotation.application = "BookingService"
More info: X-Ray merges segments and subsegments into traces, so an annotation at the subsegment level is enough for filtering traces.

AWS X-Ray can help you trace downstream calls made by a Lambda application, in the form of segments and subsegments, to give you an overall view of your application. I think AWS X-Ray can help in your use case: you will be able to trace the DynamoDB, S3, or REST API calls your Lambda is making and identify which application (in your case FlightBooking / HotelBooking) is invoking the service (BookingService). You might not see performance numbers (e.g. memory and CPU usage), but you will be able to trace exceptions inside your application.
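To illustrate, wrapping the AWS SDK with the X-Ray SDK records each downstream call as a timed subsegment. A minimal TypeScript sketch, assuming the aws-xray-sdk-core package and a hypothetical Bookings table (not from the original setup):

import * as AWSXRay from 'aws-xray-sdk-core';
import * as AWS from 'aws-sdk';

// Wrap the SDK so every DynamoDB/S3 call made through it is recorded
// as a subsegment of the current Lambda invocation's trace.
const instrumented = AWSXRay.captureAWS(AWS);
const dynamo = new instrumented.DynamoDB.DocumentClient();

export const handler = async (event: { bookingId: string }) => {
  // This call appears as a timed DynamoDB subsegment in the trace.
  const result = await dynamo
    .get({ TableName: 'Bookings', Key: { id: event.bookingId } }) // 'Bookings' is hypothetical
    .promise();
  return result.Item;
};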

Related

AWS CDK: What is the best way to implement multiple Stacks?

I have a few things to get clear, specifically regarding modeling the architecture of a serverless application using AWS CDK.
I'm currently working on a serverless application developed using AWS CDK in TypeScript. As a convention, we also follow the rules below:
A stack should only have one table (dynamo)
A stack should only have one REST API (api-gateway)
A stack should not depend on any other stack (no cross-references), unless it's the Event-Stack (a stack dedicated to managing EventBridge operations)
The reason for that is so that each stack can be deployed independently, without any interference from other stacks. In a way, our stacks are equivalent to microservices in a microservice architecture.
At the moment all the REST APIs are public, and now we have decided to make them private by attaching custom Lambda authorizers to each API Gateway resource. In this custom Lambda authorizer, we have to do certain operations (apart from token validation) in order to allow the user's request to proceed further. Those operations are:
Get the user’s role from DB using the user ID in the token
Get the user’s subscription plan (paid, free, etc.) from DB using the user ID in the token.
Get the user’s current payment status (due, no due, fully paid, etc.) from DB using the user ID in the token.
Get the scopes allowed for this user based on 1, 2, and 3.
Check whether the user can access this scope (the resource the user is currently requesting) based on 4.
This authorizer Lambda function needs to be used by all the other stacks to make their APIs private. But the problem is that roles, scopes, subscriptions, payments, and user data live in different stacks, in their own dedicated DynamoDB tables. Because of the rules I explained before (especially rule number 3), we cannot depend on resources defined in other stacks. Hence we are unable to create the authorizer we want.
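For concreteness, here is a minimal sketch of the flow we want (table names, token parsing, and the scope check are invented stand-ins, not our actual implementation):

import { DynamoDB } from 'aws-sdk';

const db = new DynamoDB.DocumentClient();

// Stand-in helpers: real code would verify a JWT and map the method ARN
// to a scope string.
const userIdFromToken = (token: string): string => token.split('.')[1] ?? 'anonymous';
const scopeFromArn = (arn: string): string => arn.split('/').slice(2).join('/'); // e.g. "GET/bookings"
const scopesFor = (role?: unknown, plan?: unknown, payment?: unknown): string[] =>
  role && plan && payment ? ['GET/bookings'] : [];

export const handler = async (event: { authorizationToken: string; methodArn: string }) => {
  const userId = userIdFromToken(event.authorizationToken);
  // Steps 1-3: role, subscription plan, and payment status live in tables
  // owned by *different* stacks, which is exactly the problem.
  const [role, plan, payment] = await Promise.all(
    ['Roles', 'Subscriptions', 'Payments'].map((table) =>
      db.get({ TableName: table, Key: { userId } }).promise()
    )
  );
  // Steps 4-5: derive the allowed scopes, then check the requested resource.
  const allowed = scopesFor(role.Item, plan.Item, payment.Item)
    .includes(scopeFromArn(event.methodArn));
  return {
    principalId: userId,
    policyDocument: {
      Version: '2012-10-17',
      Statement: [{
        Action: 'execute-api:Invoke',
        Effect: allowed ? 'Allow' : 'Deny',
        Resource: event.methodArn,
      }],
    },
  };
};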
Solutions we could think of and their problems:
Since EventBridge isn't bi-directional, we cannot use it to fetch data from a resource in a different stack.
We can invoke a Lambda in a different stack using its ARN and get the required data from its response, but AWS discourages this as a CDK anti-pattern.
We cannot use a technology like gRPC because it requires a continuously running server, which is out of scope for a serverless architecture.
There was also a proposal to redesign the CDK layout of our application. The main feature of this layout is moving from no cross-references to a fully cross-referenced pattern (inspired by the layered architecture described in this AWS best practice).
Based on that article, we came up with a layout like this:
Presentation Layer
Stack for deploying the consumer web app
Stack for deploying admin portal web app
Application Layer
Stack for REST API definitions using API Gateway
Stack for Lambda functions running business-specific operations (Ex: CRUDs)
Stack for Lambda functions that run on event triggers
Stack for authorization (custom Lambda authorizer(s))
Stack for Authentication implementation (Cognito user pool and client)
Stack for Events (EventBuses)
Stack for storage (S3)
Data Layer
Stack containing all the database definitions
There could be another stack for reporting, data engineering, etc.
As you can see, stacks are now going to have multiple dependencies on other stacks' resources (but no circular dependencies, as shown in the attached image). While this pattern unblocks us from writing an effective custom Lambda authorizer, we are not sure whether it will become a problem in the long run as the application's scope increases.
I highly appreciate any help you could give us to resolve this problem. Thanks!
Multiple options:
Use Parameter Store rather than CloudFormation exports
Split stacks into a layered architecture like you described in your answer, and import things between stacks using SSM Parameter Store, as the other answer describes. This is the most obvious choice for breaking inter-stack dependencies. I use it all the time.
Use fixed resource names, easily referencable and importable
Stack A creates S3 bucket "myapp-users"; Stack B imports the bucket by its fixed name using Bucket.fromBucketName(this, 'Users', 'myapp-users'). Fixed resource names have their own downsides, so this should be used only for resources that are genuinely shared between stacks. They prevent easy replacement of the resource, for example. Also, you need to enforce the correct stack deployment order yourself; CDK will no longer help you with that, since there are no cross-stack dependencies to enforce it.
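A minimal TypeScript sketch of the fixed-name approach (stack and construct names here are made up for illustration):

import { Stack, StackProps } from 'aws-cdk-lib';
import * as s3 from 'aws-cdk-lib/aws-s3';
import { Construct } from 'constructs';

// Stack A owns the bucket and pins its physical name.
class UsersStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);
    new s3.Bucket(this, 'Users', { bucketName: 'myapp-users' });
  }
}

// Stack B imports the bucket purely by its fixed name; no CloudFormation
// export is created, so there is no hard cross-stack dependency, but you
// must deploy UsersStack first yourself.
class ConsumerStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);
    const users = s3.Bucket.fromBucketName(this, 'Users', 'myapp-users');
    // ...grant access, pass users.bucketName to Lambdas, etc.
  }
}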
Combine the app into a single stack
This sounds extreme and counter-intuitive, but I found that most real-life teams don't actually have a pressing need for multi-stack deployment. If your only concern is separating code-owners of different parts of the application, you can get away with splitting the stack into multiple Constructs composed into a single stack, where each team takes care of their Construct and its children. Think of it as combining multiple Git repos into a monorepo. A lot of projects are doing that.
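For instance, a rough sketch of that composition (construct names are hypothetical):

import { App, Stack, StackProps } from 'aws-cdk-lib';
import { Construct } from 'constructs';

// Each team owns one Construct; everything composes into a single
// deployable Stack instead of several cross-referencing stacks.
class BookingConstruct extends Construct {
  constructor(scope: Construct, id: string) {
    super(scope, id);
    // team A's tables, functions, API routes...
  }
}

class InvoicingConstruct extends Construct {
  constructor(scope: Construct, id: string) {
    super(scope, id);
    // team B's resources...
  }
}

class AppStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);
    new BookingConstruct(this, 'Booking');
    new InvoicingConstruct(this, 'Invoicing');
  }
}

new AppStack(new App(), 'AppStack');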
A strategy I use to avoid hard cross-references involves storing shared resource values in AWS Systems Manager Parameter Store.
In the exporting stack, we can save the name of an S3 bucket, for instance:
ssm.StringParameter(
    scope=self,
    id="/example_stack/example_bucket_name",
    string_value=self.example_bucket.bucket_name,
    parameter_name="/example_stack/example_bucket_name",
)
and then, in the importing stack, retrieve the name and create an IBucket by using a .from_ method:
example_bucket_name = ssm.StringParameter.value_for_string_parameter(
    scope=self,
    parameter_name="/example_stack/example_bucket_name",
)
example_bucket = s3.Bucket.from_bucket_name(
    scope=self,
    id="example_bucket_from_ssm",
    bucket_name=example_bucket_name,
)
You'll have to figure out the right order in which to deploy your stacks, but otherwise I've found this to be a good strategy for avoiding the issues that come with stack dependencies.

What are the limitations of using an AWS AppSync API (GraphQL) through Amplify?

I just want to avoid the use of custom/manual resolvers in AppSync completely, so I'm using Amplify to set up a GraphQL AppSync API in my app. I'm doing everything by changing schema.graphql and running amplify push.
I have 2 questions:
1. What are the limitations, and what problems am I going to face in the future?
2. Can GraphQL subscriptions get updates when the app is not running (e.g. when the user should be notified)?
Tons of business logic will be exposed in the client-side code.
I think for push notifications you would still have to go via external integrations like FCM/APNS. Multiple integration options are available in SNS.
Just to preamble these answers: the fact that you use Amplify-generated GraphQL and resolvers doesn't stop you from later including custom resolvers and pipeline functions; it's just that you need to learn quite a bit about where to include them in Amplify's back-end file structure.
1. What are the limitations, and what problems am I going to face in the future?
This depends on how well your application's use case matches the GraphQL schema design, and on whether your application is relatively self-contained. Amplify becomes more complex when your application needs to talk to other back-end systems: you'll need to start using DynamoDB triggers to notify other state machines / EventBridge / SNS or similar services.
As mentioned, none of these problems are crippling; you can deal with them later, but it will be a step up in the AWS knowledge required to implement them.
For small high-volume/availability apps, Amplify and DynamoDB as-they-come are great. If your application matures into many microservices and sites, you'll need to learn quite a bit more AWS to make them play together well. Amplify determines your DynamoDB design on a table-per-object basis, and you'll probably be stuck with (and paying for) that. Think hard about whether you might ever want to move to a differently optimised data source (RDS or a single DynamoDB table) to reduce the number of queries required to fulfil your GraphQL requests.
2. Can GraphQL subscriptions get updates when the app is not running (e.g. when the user should be notified)?
No. Anurag mentions SNS, which would be a good option for notifying users outside the app; it's best to blend subscriptions with another service.

GCP: Best way to manage a flow across multiple Cloud Functions

I'm studying GCP, and while reading about the different ways to communicate between and manage Cloud Functions, I ended up wondering when to use each of the services GCP offers.
So, I have been reading about GCP Composer, GCP Workflows, and Cloud Pub/Sub, and I don't see clearly when to use each one, or when to just use simple HTTP calls.
I understand that it depends a lot on the application you are building, but for example: if I'm building a payment gateway and some functions should be fired after the payment is verified (sending emails, running unrelated business logic, adding the purchase to a sales platform), how should I manage this flow, and in which cases would the other options be better? Should I use events to create an async flow with Pub/Sub, use more complex solutions like Composer and Workflows, or just simple HTTP calls?
As always, it depends!! Even in your use case, it depends! OK, after a payment you want to send an email, run business logic, add the order to your databases...
But can all these actions be done in parallel, or do you need to execute them in a certain order, stopping the process if a step fails?
In the first case, you can use Cloud Pub/Sub with one message published (payment OK) and then a fan-out to several functions in parallel. Otherwise, you can use Workflows to test a function's response and then call, or not call, the following functions. With Composer you can perform many more checks and actions.
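A minimal TypeScript sketch of that fan-out, assuming the @google-cloud/pubsub client and made-up topic/function names:

import { PubSub } from '@google-cloud/pubsub';

const pubsub = new PubSub();

// Publisher: once the payment is verified, emit a single "payment OK" event.
export async function onPaymentVerified(orderId: string): Promise<void> {
  // 'payment-verified' is a hypothetical topic name.
  await pubsub.topic('payment-verified').publishMessage({ json: { orderId } });
}

// One of several parallel subscribers, each deployed as its own
// Pub/Sub-triggered Cloud Function (message.data is base64-encoded JSON).
export async function sendConfirmationEmail(message: { data: string }): Promise<void> {
  const { orderId } = JSON.parse(Buffer.from(message.data, 'base64').toString());
  // ...send the confirmation email for orderId
}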
You can also imagine sending another email 24 hours later to thank the customer for their order, using Cloud Tasks to delay the action.
You talked about Cloud Functions, but you also have other solutions for hosting code on GCP: App Engine and Cloud Run. A Cloud Function is, most of the time, single-purpose; sending an email is perfect for a function.
Now, if you have a "set of functions" to browse your stock, view an object's details, review the price, and book an object (validating an order "books" the order's content in your warehouse), the "functions" are all single-purpose but related to the same domain: warehouse management. Thus you can create a web server that exposes different paths to manage the warehouse (a microservice for the warehouse, if you prefer) and host it on Cloud Run or App Engine.
Each product has its strengths and weaknesses. You will also see this when you learn about storage on GCP. Most of the time, you can achieve things with several products, but if you don't use the right one, it will be slower or cost much more.

Stackdriver Trace across Clouds/Services

What if I have an application that works across cloud providers? E.g. an AWS Lambda will call a Google Cloud Run service, and I want my traces to work across these. Is it possible? I guess I will have to somehow pass a trace ID and set it when I need it, but I see no way to set a trace ID.
If we look at the list of supported language/backend combinations, we see that both GCP (Stackdriver) and AWS (X-Ray) are supported; see Exporters. This means you can instrument either (or both) of your AWS Lambda and GCP Cloud Run applications with OpenCensus calls. I suspect you will have to dig deep to determine the specifics, but this feels like a good starting point.
If an OpenCensus library is available for your programming language, you can simplify the process of creating and sending trace data by using OpenCensus. In addition to being simpler to use, OpenCensus implements batching, which might improve performance.
The Stackdriver Trace API allows you to send and retrieve latency data to and from Stackdriver Trace. There are two versions of the API:
Stackdriver Trace API v1 is fully supported.
Stackdriver Trace API v2 is in Beta release.
The client libraries for Trace automatically generate the trace_id and the span_id. You need to generate values for these fields yourself if you don't use the Trace client libraries or the OpenCensus client libraries. In that case, you should use a pseudo-random or random algorithm; don't derive these fields from need-to-know data or from personally identifiable information.
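As a rough illustration of propagating the trace context manually: Cloud Trace reads the X-Cloud-Trace-Context header on incoming requests, so the calling side could generate and forward it. A TypeScript sketch, assuming Node 18+ for the global fetch (the header format is TRACE_ID/SPAN_ID;o=1):

import { randomBytes } from 'crypto';

// Build an X-Cloud-Trace-Context header: a 32-hex-char trace ID, a decimal
// span ID, and ";o=1" to mark the trace as sampled. Both IDs are random,
// not derived from user data, per the guidance above.
export async function callCloudRun(url: string): Promise<Response> {
  const traceId = randomBytes(16).toString('hex');
  const spanId = BigInt('0x' + randomBytes(8).toString('hex')).toString();
  return fetch(url, {
    headers: { 'X-Cloud-Trace-Context': `${traceId}/${spanId};o=1` },
  });
}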

Is there a way to report custom DataDog metrics from AWS Lambda?

I'm looking to report custom metrics from Lambda functions to Datadog. I need things like counters, gauges, histograms.
Datadog documentation outlines two options for reporting metrics from AWS Lambda:
print a line into the log
use the API
The fine print in the document above mentions that the printing method only supports counters and gauges, so that's obviously not enough for my use case (I also need histograms).
Now, the second method (the API) only supports reporting time-series points, which I'm assuming are just gauges (right?), according to the API documentation.
So, is there a way to report metrics to Datadog from my Lambda functions, short of setting up a statsd server in EC2 and calling out to it using dogstatsd? Anyone have any luck getting around this?
The easier way is to use this library: https://github.com/marceloboeira/aws-lambda-datadog
It has no runtime dependencies, doesn't require authentication, and reports everything to CloudWatch too. You can read more about it here: https://www.datadoghq.com/blog/how-to-monitor-lambda-functions/
Yes, it is possible to emit metrics to Datadog from an AWS Lambda function.
If you are using Node.js, you could use https://www.npmjs.com/package/datadog-metrics to emit metrics to the API. It supports counters, gauges, and histograms. You just need to pass in your app/API key as environment variables.
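A minimal sketch of that approach in TypeScript, assuming the datadog-metrics package reads the API key from the DATADOG_API_KEY environment variable (metric names and values are illustrative):

import * as metrics from 'datadog-metrics';

// Disable the background flush timer; in Lambda we flush explicitly before
// returning, so points aren't lost when the execution environment freezes.
metrics.init({ prefix: 'myapp.', flushIntervalSeconds: 0 });

export const handler = async () => {
  metrics.increment('bookings.requests');        // counter
  metrics.gauge('bookings.queue_depth', 3);      // gauge
  metrics.histogram('bookings.latency_ms', 123); // histogram
  // flush() takes success/error callbacks; wrap it in a Promise to await it.
  await new Promise<void>((resolve, reject) => metrics.flush(resolve, reject));
  return { statusCode: 200 };
};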