I have newly created an API service that is going to be deployed as a pilot for a customer. It is built with AWS API Gateway, AWS Lambda, and AWS S3. With a SaaS pricing model, what's the best way for me to monitor this customer's usage and cost? At the moment, I have created a dedicated API Gateway, Lambda function, and S3 bucket specific to this customer. Is there a good way to create a dashboard that allows me (and perhaps the customer) to drill into this monitoring data?
Additional question: what's the best way to streamline this process when expanding to multiple customers? Each customer would have a unique API token, so what would be a better approach than the naive one of creating unique AWS resources per customer?
I am new to this (a college student), but any insights/resources would go a long way. Thanks.
Full disclosure: I work for Lumigo, a company that does exactly that.
Regarding your question:
As #gusto2 said, there are many tools that you can use, and the best tool depends on your specific requirements.
The main difference between the tools is the level of configuration that you need to apply.
CloudWatch default metrics - the first tool you should use. This is an out-of-the-box solution that provides many metrics for your services, such as duration, number of invocations, errors, and memory. You can view these metrics over different time windows and with different aggregations (p99, average, max, etc.).
This tool is great for basic monitoring.
Its greatest strength is also its limitation: the monitoring it provides is common to all services, so nothing is tailored specifically to serverless applications. https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/working_with_metrics.html
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AlarmThatSendsEmail.html
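For example, to get a feel for what the default metrics give you, here is a minimal boto3 sketch that pulls the hourly invocation count for a single Lambda function; the function name is a placeholder, and filtering per customer only works while you keep a dedicated function per customer as you do now:

```python
import boto3
from datetime import datetime, timedelta

cloudwatch = boto3.client("cloudwatch")

# Sum of invocations for one Lambda function over the last 24 hours.
# "customer-a-api" is a placeholder function name.
response = cloudwatch.get_metric_statistics(
    Namespace="AWS/Lambda",
    MetricName="Invocations",
    Dimensions=[{"Name": "FunctionName", "Value": "customer-a-api"}],
    StartTime=datetime.utcnow() - timedelta(days=1),
    EndTime=datetime.utcnow(),
    Period=3600,          # one datapoint per hour
    Statistics=["Sum"],
)

for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Sum"])
```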
CloudWatch custom metrics - the other end of the scale: much more precise metrics, since you can publish any metric data you want and monitor it: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/publishingMetrics.html
This is a great tool if you know exactly what you want to monitor and you are already familiar with your architecture's limitations and pain points.
And, of course, you can configure alarms over this data as well.
Lumigo - a 3rd-party company (again, as a disclosure, this is my workplace). It provides out-of-the-box monitoring created specifically for serverless applications, such as an abnormal number of invocations, costs, etc. The tool also provides troubleshooting capabilities for deeper observability.
Of course, there are more 3rd-party tools that you can find online. All are great; just find the one that best suits your requirements.
Is there a good way to create a dashboard
There are multiple ways and options depending on your scale, amount of data, and requirements, so you can start small and simple, but check whether each option remains feasible as you grow.
You can start with CloudWatch. You can monitor basic metrics, create dashboards, and even share them with other accounts.
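Dashboards can also be created programmatically. As a rough illustration, the sketch below builds a one-widget dashboard for a hypothetical per-customer Lambda function; the dashboard name, function name, and region are placeholders:

```python
import json
import boto3

cloudwatch = boto3.client("cloudwatch")

# A single-widget dashboard graphing invocations of one function.
dashboard_body = {
    "widgets": [
        {
            "type": "metric",
            "x": 0, "y": 0, "width": 12, "height": 6,
            "properties": {
                "title": "Customer A - API invocations",
                "region": "us-east-1",
                "metrics": [
                    ["AWS/Lambda", "Invocations", "FunctionName", "customer-a-api"]
                ],
                "stat": "Sum",
                "period": 3600,
            },
        }
    ]
}

cloudwatch.put_dashboard(
    DashboardName="customer-a-usage",
    DashboardBody=json.dumps(dashboard_body),
)
```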
naive way of making unique AWS resources per customer
To start with, I would consider creating custom CloudWatch metrics with the customer ID as a dimension and publishing the metrics from the Lambda functions.
It looks simple, but you should do the math and a PoC on the number of requested data points and dashboards to prevent a nasty surprise on the bill.
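To make the idea concrete, here is a hedged sketch of publishing per-customer usage from inside the Lambda handler; the namespace, metric names, and dimensions are assumptions, not something your stack already defines:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

def record_usage(customer_id: str, api_name: str, bytes_out: int) -> None:
    """Publish per-customer usage as custom metrics (namespace is made up)."""
    cloudwatch.put_metric_data(
        Namespace="MySaaS/Usage",
        MetricData=[
            {
                "MetricName": "ApiCalls",
                "Dimensions": [
                    {"Name": "CustomerId", "Value": customer_id},
                    {"Name": "Api", "Value": api_name},
                ],
                "Value": 1,
                "Unit": "Count",
            },
            {
                "MetricName": "BytesTransferred",
                "Dimensions": [{"Name": "CustomerId", "Value": customer_id}],
                "Value": bytes_out,
                "Unit": "Bytes",
            },
        ],
    )

# Called from the Lambda handler after serving a request, e.g.:
# record_usage(customer_id="cust-123", api_name="get-report", bytes_out=2048)
```

Note that each unique dimension combination (here, each customer/API pair) counts as a separate custom metric, which is exactly where the billing surprise mentioned above can come from.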
Another option is sending metrics/events to DynamoDB; using atomic counters you could build some basic aggregations directly (a kind of naive stream processing).
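A hedged sketch of that DynamoDB variant using an atomic counter; the table name and key schema are made up:

```python
import boto3

dynamodb = boto3.resource("dynamodb")
# Hypothetical table with partition key "customer_id" and sort key "period".
usage_table = dynamodb.Table("customer-usage")

def count_request(customer_id: str, period: str, bytes_out: int) -> None:
    """Atomically increment per-customer counters (a naive running aggregation)."""
    usage_table.update_item(
        Key={"customer_id": customer_id, "period": period},  # e.g. period="2024-05"
        UpdateExpression="ADD api_calls :one, bytes_out :bytes",
        ExpressionAttributeValues={":one": 1, ":bytes": bytes_out},
    )
```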
When scaling to a lot of events and clients, you may need some serious API analytics, but that is a different topic.
Is there a way to allow creation of a resource like a DynamoDB table only if the table being created is PAY_PER_REQUEST or is provisioned with capacity below a certain amount?
I initially looked at IAM condition keys, but they appear to be available only for table data operations (scan, update, put, etc.), not for table creation operations.
Alternatively, are there ways to reduce service quotas for an account?
Ideally, I'm wondering whether it is possible to scope down the ability to create DynamoDB tables beyond a certain capacity, and I'm not sure how to do it proactively instead of retroactively processing CloudTrail logs or listing existing table properties.
AWS Config
You can use AWS Config to retrospectively query AWS resources and their properties, and then determine whether or not they are compliant. There are rules available out of the box, but I can't see one that matches your use case, so you will then need to write a Lambda function to implement this yourself. Here is an example.
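Very roughly, and only as a sketch, such a custom-rule Lambda might look like the following; the configuration-item field names are assumptions from memory and should be verified against a real AWS::DynamoDB::Table configuration item:

```python
import json
import boto3

config = boto3.client("config")

MAX_CAPACITY_UNITS = 100  # assumed threshold

def lambda_handler(event, context):
    """Custom AWS Config rule: flag DynamoDB tables provisioned above a threshold.

    The configuration-item field names below are taken from memory and should be
    verified against an actual configuration item for AWS::DynamoDB::Table.
    """
    invoking_event = json.loads(event["invokingEvent"])
    item = invoking_event["configurationItem"]

    compliance = "COMPLIANT"
    if item["resourceType"] == "AWS::DynamoDB::Table":
        throughput = item["configuration"].get("provisionedThroughput") or {}
        read = throughput.get("readCapacityUnits", 0)
        write = throughput.get("writeCapacityUnits", 0)
        # On-demand (PAY_PER_REQUEST) tables have no provisioned throughput to check.
        if read > MAX_CAPACITY_UNITS or write > MAX_CAPACITY_UNITS:
            compliance = "NON_COMPLIANT"

    config.put_evaluations(
        Evaluations=[{
            "ComplianceResourceType": item["resourceType"],
            "ComplianceResourceId": item["resourceId"],
            "ComplianceType": compliance,
            "OrderingTimestamp": item["configurationItemCaptureTime"],
        }],
        ResultToken=event["resultToken"],
    )
```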
Once your rule is working you can create a remediation action to:
Delete the Table
Scale the Table Down
Send a Notification
Adjust Autoscaling (i.e. reduce max)
AWS Budgets
(My Preference)
For determining whether an account is using too much DynamoDB, probably the easiest approach is to set up a budget for the DynamoDB service. That has a couple of benefits:
Auto-Scaling: developers would be free to use high amounts of capacity (such as for load tests) for short periods of time.
Potentially Cheaper: what I have found is that if you put hard restrictions on projects, developers will often allocate 100% of the maximum, as opposed to using only what they need, for fear of another developer coming along and taking all the capacity.
Just as with AWS Config, you can set up billing alarms to take action and notify developers that they are using too much DynamoDB, for example when the budget reaches 50%, 80%, and so on.
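A rough boto3 sketch of such a DynamoDB-scoped budget with an 80% notification; the account ID, limit, e-mail address, and the exact CostFilters service name are assumptions:

```python
import boto3

budgets = boto3.client("budgets")

# Monthly cost budget for DynamoDB only, with an 80% alert.
budgets.create_budget(
    AccountId="123456789012",
    Budget={
        "BudgetName": "dynamodb-monthly",
        "BudgetLimit": {"Amount": "100", "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
        "CostFilters": {"Service": ["Amazon DynamoDB"]},
    },
    NotificationsWithSubscribers=[
        {
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 80.0,
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [
                {"SubscriptionType": "EMAIL", "Address": "dev-team@example.com"}
            ],
        }
    ],
)
```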
CloudWatch
You could also create CloudWatch alarms for certain DynamoDB metrics, looking at the capacity that has been consumed and again responding to excessive use.
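For instance, a hedged sketch of an alarm on consumed read capacity; the table name, threshold arithmetic, and SNS topic ARN are all placeholders:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when a table consumes more than ~80% of 1000 provisioned RCUs
# over 5 minutes. Table name, threshold, and SNS topic are placeholders.
cloudwatch.put_metric_alarm(
    AlarmName="orders-table-read-capacity-high",
    Namespace="AWS/DynamoDB",
    MetricName="ConsumedReadCapacityUnits",
    Dimensions=[{"Name": "TableName", "Value": "orders"}],
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=1,
    # ConsumedReadCapacityUnits is summed per period,
    # so 0.8 * 1000 RCU * 300 s = 240000 units in 5 minutes.
    Threshold=240000,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:capacity-alerts"],
)
```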
Conclusion
You have a lot of flexibility in how to approach this, so make sure you have gathered your requirements; then the appropriate response will be easier to see. AWS Config requires a bit more work than Budgets, so if you can get what you want out of Budgets, I would do that.
For example, say I build a workflow that uses 10 Lambda functions that trigger each other and are triggered by a DynamoDB table and an S3 bucket.
Is there any AWS tool that tracks how these triggers tie together so I can easily visualize the whole workflow I've created?
Bang on. A few months ago I was in a similar situation with my distributed architecture running on AWS.
So far, I have found the following options. I'm still figuring out which is most suitable, but I hope this information helps you.
1. AWS-native option :: Engineer your Lambda code to emit CloudWatch custom metrics for any important events from within the code. Later, you can use a CloudWatch dashboard to visualize them.
2. Non-AWS options :: There are several of them, but all of them require you to instrument your code with their respective libraries/packages to transmit the needed information. Some of them support async invocations, so tracing shouldn't keep your main Lambdas waiting.
IOPipe
Epsagon
3. Mix of AWS & non-AWS :: This is a more traditional approach to the problem. You log events to CloudWatch Logs (as Lambda does out of the box), then "ingest" these logs into popular log management and analysis SaaS tooling, which makes sense of them via pattern matching and other proprietary techniques; a small logging sketch follows the list below.
Splunk Cloud
Datadog
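As mentioned under option 3, the main thing you control is how you write the log lines. A minimal sketch of structured (JSON) logging from a Lambda function, with made-up field names, could look like this:

```python
import json
import time

def log_event(event_type: str, **fields) -> None:
    """Emit one structured JSON line to stdout, which Lambda forwards to CloudWatch Logs.

    Downstream tools (Splunk, Datadog, etc.) can then parse the JSON fields
    instead of relying on free-text pattern matching. Field names are examples.
    """
    print(json.dumps({
        "timestamp": time.time(),
        "event_type": event_type,
        **fields,
    }))

def lambda_handler(event, context):
    log_event("order_received", order_id="o-123", source="s3")
    # ... business logic ...
    log_event("order_processed", order_id="o-123", duration_ms=42)
    return {"status": "ok"}
```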
All the best! Keep me posted on how it goes.
cheers,
ram
If you use CloudFormation you can visualize the resource relationships with CloudFormation Designer. However, if you don't have the resources in a CloudFormation stack, you can create one from the existing resources.
I am trying to see if there is a way for me to bill and restrict users based on their usage of resources across projects in our GCP environment. I know billing and quotas are officially at the project level, but we will have shared projects used by people in separate cost codes. I was thinking of building an API per cost code and having people access the resources only through it, but I'm not sure if this is the best idea.
You can use a grouping system based on labels to track your resources. Using labels such as cost center, service, and environment will allow you to track your GCP resource usage and spending.
Then you can export your billing data to BigQuery where it can be filtered and segmented by labels.
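Once the export is in place, a hedged sketch of slicing cost by a cost-code label using the BigQuery client; the table name, label key, and schema details are assumptions to verify against your own export table:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Table name, dataset, and label key are placeholders; the billing export
# schema (labels as an array of key/value structs, plus a cost column)
# should be checked against your actual export table.
query = """
    SELECT
      label.value AS cost_code,
      SUM(cost) AS total_cost
    FROM `my-project.billing.gcp_billing_export_v1_XXXXXX`,
      UNNEST(labels) AS label
    WHERE label.key = 'cost-code'
      AND usage_start_time >= TIMESTAMP('2024-05-01')
    GROUP BY cost_code
    ORDER BY total_cost DESC
"""

for row in client.query(query).result():
    print(row.cost_code, row.total_cost)
```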
In terms of notifications and restricting access, you can set up budget alerts by following this documentation. Together with Cloud Pub/Sub budget notifications and Cloud Functions, you can have more control over your spending by capping costs and stopping your billing.
If I understood the concept correctly, a "serverless" architecture assumes that instead of using your own servers or containers, you use a bunch of AWS services. Usually such an architecture includes Amazon API Gateway, a bunch of Lambda functions, and DynamoDB (or an alternative) for storing data and state, since Lambda can't keep state. Services such as EC2 don't participate in this, because EC2 is a virtual server and using it diminishes the benefits of a serverless architecture.
All this looks really cool, but I feel like I'm missing something important, because right now it seems inapplicable to cases such as real-time applications.
Say I have 2 users online. One of them performs an action in the app, which triggers changes in the database, which in turn should trigger changes in the second user's app.
The conventional way to send data or a command from server to client is a WebSocket connection. But with a serverless architecture there seems to be no way to establish and maintain a WebSocket connection. So where did I misunderstand the concept? Or, if I understood everything correctly, how do I implement the interaction between 2 users described above?
where did I misunderstand the concept?
Your observation is correct. It doesn't work out of the box using API Gateway and Lambda.
An applicable solution, as described here, is to use AWS IoT - yes, another AWS service.
Serverless isn't just a matter of Lambda, API Gateway, and DynamoDB; it's much bigger than that. One of the big advantages of serverless is the operational burden it takes off your plate: no more patching, no more capacity planning, no more config management. Those may seem trivial, but doing them well across a significant fleet of instances is complex, expensive, and time consuming. Another benefit is the economics. Public cloud uses utility billing, but for most services you pay for what you have running whether or not you actually use it; with AWS most services bill by the hour, whereas Lambda bills per 100 ms. The cheapest EC2 instance running for a full month is about $10/month (double that for redundancy), while $20 of Lambda pricing gets you millions of invocations, so in most cases serverless is significantly cheaper.
Serverless isn't for everything though; it has its limitations. For example, it's not meant for running binaries: you can't run nginx in Lambda, since it's only meant to be a runtime environment for the programming languages it supports. It's also specifically meant for event-based workloads, which is perfect for microservice-based architectures: small, independent, discrete pieces of compute that do their work and, when done, send an event to one or more others to do something else and, if needed, return a response.
To address your concerns about real-time processing: depending on what your code is doing, your Lambda function could complete in anywhere from less than 100 ms up to 5 minutes. There are strategies to optimize its duration, but in general Lambda is for short-lived work, which is conducive to real-time scenarios.
In your example of the 2 users interacting with the web app and the database, that could very easily be built using serverless technologies with one or two functions and a DynamoDB table. The total round-trip time could be as low as milliseconds, or at most seconds; it really all depends on your code and what it's doing. These would all be HTTP calls, so no WebSockets are needed. Think of a number of APIs calling each other, with your Lambda code as the orchestrator.
You might want to look at SNS (Simple Notification Service). In your example, if app user 2 is a subscriber to an SNS topic, then when app user 1 makes a change that triggers an SNS message, it will be pushed to the subscriber (app user 2). The message can be pushed over several supported protocols (Amazon, Apple, Google, MS, Baidu) in addition to SMTP or SMS. The SNS message can be triggered by a Lambda function or directly from a DynamoDB stream after an update (a database trigger). It's up to the app developer to select a message protocol and format; the app only has to receive messages through its native channels. This may not exactly be millisecond-latency 'real-time', but it's fast enough for all but the most latency-sensitive applications.
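One way to wire that up is a Lambda function on the table's DynamoDB stream that publishes each change to an SNS topic. A minimal sketch, where the topic ARN and message format are placeholders:

```python
import json
import boto3

sns = boto3.client("sns")
# Hypothetical topic that user 2's app (or a push service) is subscribed to.
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:app-updates"

def lambda_handler(event, context):
    """Triggered by a DynamoDB stream; fan out each change as an SNS message.

    Assumes the stream is configured to include new images of changed items.
    """
    for record in event.get("Records", []):
        if record.get("eventName") not in ("INSERT", "MODIFY"):
            continue
        new_image = record["dynamodb"].get("NewImage", {})
        sns.publish(
            TopicArn=TOPIC_ARN,
            Subject="item-updated",
            Message=json.dumps(new_image),
        )
```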
I've been working on an AWS serverless application for several months now, and am amazed at the variety of services available. The rate of improvement and new features being added is enough to leave you out of breath.
I am currently trying to identify the API that handles the reporting for AWS instances.
I am looking for how the total hours and cost can be identified for all of the instances, or for just one instance.
I looked at the XHR tab and identified 2 APIs that fetch this data.
But I think there should be some way to get this data from the AWS SDK.
Any help would be appreciated. Thanks.
You will need to turn on the Detailed Billing Report. This will then send billing information to Amazon S3.
The billing files show every specific charge incurred by your account, broken down by resource, tag (needs configuration), region, etc.
Please note that this level of detail is only available after you have activated Detailed Billing Reports. You can only obtain high-level information prior to this time.
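Once the report is landing in S3, it is just a (large) CSV you can process yourself. A hedged sketch that sums EC2 cost per instance, where the bucket, file name, and column names are assumptions to verify against your own report files:

```python
import csv
import io
import zipfile
import boto3

s3 = boto3.client("s3")

# Bucket name, key, and the exact report file name are assumptions;
# check the bucket you configured for Detailed Billing Reports.
BUCKET = "my-billing-bucket"
KEY = "123456789012-aws-billing-detailed-line-items-with-resources-and-tags-2024-05.csv.zip"

obj = s3.get_object(Bucket=BUCKET, Key=KEY)
with zipfile.ZipFile(io.BytesIO(obj["Body"].read())) as archive:
    with archive.open(archive.namelist()[0]) as report:
        reader = csv.DictReader(io.TextIOWrapper(report, encoding="utf-8"))
        # Sum unblended cost per EC2 instance; column names may differ by report version.
        totals = {}
        for row in reader:
            if row.get("ProductName") == "Amazon Elastic Compute Cloud":
                resource = row.get("ResourceId", "")
                cost = float(row.get("UnBlendedCost") or 0)
                totals[resource] = totals.get(resource, 0.0) + cost

for resource, cost in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(resource, round(cost, 2))
```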
Most features in the AWS console are directly or indirectly accessing the same documented, exposed APIs that are accessed by the SDKs and CLI.
Most, but not all.
Some features, particularly reporting and graphing-type features -- like these billing/cost reports -- are console-only features. CloudWatch graphs and CloudFront graphs and reports are other examples that come to mind. There is no access provided to these other than what's provided in the console.
In each case, the raw underlying data is generally accessible through the documented APIs, but not necessarily the data in its aggregated form as presented on the screen or for download -- you'd need to do your own analysis/aggregation/summary, etc.