I would like to know if there is a way to get all of my Lambda invocation counts for the last hour (better if broken down every 5 minutes).
It would also be nice to get the cost usage, but from what I've read that only updates once a day.
From looking at the documentation it seems like I can use GetMetricData (CloudWatch); is there a better API for my use case?
You can get this information by region within CloudWatch metrics.
In the AWS/Lambda namespace there is a metric named Invocations, which can be viewed for the entire region or on a per-function basis.
If you look at the Sum over whichever period you want (you can get down to 1-minute granularity for this metric), you will be able to get these values in near real time.
You can get these values from within the console or by using the get-metric-data command within the CLI or SDK.
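For example, here is a minimal boto3 sketch (the region, function name, and time window are assumptions) that sums Invocations over the last hour in 5-minute buckets:

import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")  # assumed region

now = datetime.now(timezone.utc)
response = cloudwatch.get_metric_data(
    MetricDataQueries=[{
        "Id": "invocations",
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/Lambda",
                "MetricName": "Invocations",
                # Drop Dimensions to get the region-wide total instead
                "Dimensions": [{"Name": "FunctionName", "Value": "my-function"}],
            },
            "Period": 300,  # 5-minute buckets
            "Stat": "Sum",
        },
    }],
    StartTime=now - timedelta(hours=1),
    EndTime=now,
)

# One (timestamp, sum) pair per 5-minute bucket
for ts, value in zip(response["MetricDataResults"][0]["Timestamps"],
                     response["MetricDataResults"][0]["Values"]):
    print(ts, value)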
There are many tools to get metrics on your lambda, so it really depends on your needs.
What do you mean by "is there a better one for my use case"?
If you prefer, you can check it through the console: go to CloudWatch -> Metrics -> and navigate to your Lambda. You can aggregate the data differently (for example: average per 5 minutes, or total per day, etc.)
Here's a great doc: https://docs.aws.amazon.com/lambda/latest/dg/monitoring-metrics.html#monitoring-metrics-invocation
Moreover, here's a solution that I gave that surveys different approaches to monitor lambda resources: Best Way to Monitor Customer Usage of AWS Lambda
Disclosure: I work for Lumigo, a company that does exactly that.
I'm creating a report of the Lambda functions in my AWS accounts and looking to see if I could get the cost of previous invocations and how many times a Lambda function has been invoked. I've looked into the documentation for Lambda, pricing, CloudWatch, and Cost Explorer but haven't found anything. I'm now wondering if I have to go through multiple API calls. All help is appreciated!
The following Logs Insights query will tell you the cost of a Lambda function for a given period.
parse @message /Duration:\s*(?<duration_ms>\d+\.\d+)\s*ms\s*Billed\s*Duration:\s*(?<billed_duration_ms>\d+)\s*ms\s*Memory\s*Size:\s*(?<memory_size_mb>\d+)\s*MB/
| filter @message like /REPORT RequestId/
| stats sum(billed_duration_ms * memory_size_mb * 1.6279296875e-11 + 2.0e-7) as cost_dollars_total
You just need to go into Logs Insights and select the correct log group for your Lambda function. This can be modified pretty easily to give you an invocation count as well.
Important note: the 1.6279296875e-11 and 2.0e-7 factors are the per-MB-millisecond compute price and the per-request price for an x86 Lambda in us-east-1. You may need to adjust them if that doesn't apply to you.
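If you want to run the query programmatically rather than in the console, here is a rough boto3 sketch (the log group name and the 24-hour time range are assumptions):

import time
import boto3

logs = boto3.client("logs")

QUERY = r"""
parse @message /Duration:\s*(?<duration_ms>\d+\.\d+)\s*ms\s*Billed\s*Duration:\s*(?<billed_duration_ms>\d+)\s*ms\s*Memory\s*Size:\s*(?<memory_size_mb>\d+)\s*MB/
| filter @message like /REPORT RequestId/
| stats sum(billed_duration_ms * memory_size_mb * 1.6279296875e-11 + 2.0e-7) as cost_dollars_total
"""

now = int(time.time())
query_id = logs.start_query(
    logGroupName="/aws/lambda/my-function",  # assumed log group
    startTime=now - 24 * 3600,               # last 24 hours
    endTime=now,
    queryString=QUERY,
)["queryId"]

# Poll until the query finishes
while True:
    result = logs.get_query_results(queryId=query_id)
    if result["status"] in ("Complete", "Failed", "Cancelled"):
        break
    time.sleep(1)

print(result["results"])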
Since cost per function is not available under Billing, I think the best way is to monitor the Invocations and Duration metrics for that specific Lambda function, as you can Sum these and capture the total invocations and duration for a period of time.
Invocations and duration are the two components of Lambda cost, and with that information you should be able to calculate it based on the region and the amount of memory you allocated.
For example, if you allocated 1 GB of memory to your Lambda function in the us-east-1 region and you captured the following metrics for a month:
Sum of Invocations: 1,000,000
Sum of Duration (milliseconds): 10,000,000
The monthly cost of that function would be:
1,000,000 * $0.20 / 1,000,000 = $0.20 for invocations
0.0000166667 * 10,000,000 / 1,000 = $0.167 for duration
or around $0.37 total
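As a sanity check, here is that calculation as a small Python sketch (the default prices are the assumed us-east-1 x86 prices above and may change):

def lambda_monthly_cost(invocations, duration_ms, memory_gb,
                        price_per_request=0.20 / 1_000_000,
                        price_per_gb_second=0.0000166667):
    """Estimate Lambda cost from summed CloudWatch metrics (ignores the free tier)."""
    request_cost = invocations * price_per_request
    duration_cost = (duration_ms / 1000) * memory_gb * price_per_gb_second
    return request_cost + duration_cost

# 1,000,000 invocations and 10,000,000 ms at 1 GB -> ~$0.37
print(lambda_monthly_cost(1_000_000, 10_000_000, 1.0))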
One other option is to add a unique tag to each Lambda function. Cost Explorer should allow you to filter results based on tags. This will not give you historical data, but it should allow you to more easily track a single function's cost via the Billing console and APIs.
I'm planning to use Parameter Store to hold some dynamic config (a property) that will be updated programmatically. The apps using this config will poll for changes every 5 minutes. Is this a good use case for Parameter Store? The config is expected to be updated once a month or so and read about 10 times every 5 minutes. The rate at which it is read is not expected to increase.
Parameter Store is an event source for CloudWatch Events. It would be better to use CloudWatch Events to trigger a Lambda that updates the config these apps depend on.
Source: https://aws.amazon.com/blogs/mt/organize-parameters-by-hierarchy-tags-or-amazon-cloudwatch-events-with-amazon-ec2-systems-manager-parameter-store/
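As a rough sketch of that approach (the parameter name, rule name, and function ARN are assumptions), the rule can be created with boto3:

import json
import boto3

events = boto3.client("events")

# Match change events for one specific parameter
pattern = {
    "source": ["aws.ssm"],
    "detail-type": ["Parameter Store Change"],
    "detail": {"name": ["/myapp/config"]},
}

events.put_rule(Name="on-config-change", EventPattern=json.dumps(pattern))
events.put_targets(
    Rule="on-config-change",
    Targets=[{"Id": "update-config-lambda",
              "Arn": "arn:aws:lambda:us-east-1:123456789012:function:update-config"}],
)
# The target Lambda also needs a resource policy allowing
# events.amazonaws.com to invoke it (lambda add-permission).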
It sounds like your situation is:
A config is being used by multiple apps
The config could update at any time (but not very often)
The apps should use the latest config
If the apps need to use the latest config at all times, then the only reliable method is to check the config before every use. If you are willing to allow some leniency, then they could update at regular intervals rather than every time the config is required.
There are several places the config could be stored:
In a database
In an Amazon S3 object
In Parameter Store (as you have suggested)
Assuming that you are using a standard parameter (not an Advanced parameter), then there is no charge for the API calls nor the storage. Thus, using Parameter Store seems perfectly valid if it meets your requirements.
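A minimal polling sketch with boto3 (the parameter name is an assumption; the 5-minute interval comes from the question):

import time
import boto3

ssm = boto3.client("ssm")

def fetch_config():
    # GetParameter on a standard parameter is free at this call volume
    return ssm.get_parameter(Name="/myapp/config")["Parameter"]["Value"]

config = fetch_config()
while True:
    time.sleep(300)  # poll every 5 minutes
    config = fetch_config()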
The AWS documentation states the below, so I'd say it depends on what else you're using Parameter Store for, but you're well below the default 40 requests per second.
Max throughput (transactions per second)
Default throughput: 40 (Shared by the following API actions: GetParameter, GetParameters, GetParametersByPath)
Higher throughput: 100 (GetParametersByPath)
Higher throughput: 3000 (Shared by the following API actions: GetParameter and GetParameters)
I need to receive a notification each time a certain message does not appear in logs for 3-4 minutes. It is a clear sign that the system is not working properly.
But it is only possible to choose a period of 1 minute or 5 minutes. Is there any workaround?
"does not appear in logs for 3-4 minutes. It is a clear sign that the system is not working properly."
-- I know what you mean; a CloudWatch alarm on a metric which is not continuously pushed might behave a bit differently.
You should consider using the alarm's M out of N option, with 3 out of 4.
https://aws.amazon.com/about-aws/whats-new/2017/12/amazon-cloudwatch-alarms-now-alerts-you-when-any-m-out-of-n-metric-datapoints-in-an-interval-are-above-your-threshold/
Also, if the metric you are referring to was created using a metric filter on a CloudWatch log group, you should edit the metric filter to include a default value, so that each time a log is pushed and the filter expression does not match, it still pushes a default value (of, say, 0), thus giving the metric more continuous datapoints.
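For example, setting a default value on the metric filter with boto3 might look like this (the log group, pattern, and metric names are assumptions):

import boto3

logs = boto3.client("logs")

logs.put_metric_filter(
    logGroupName="/aws/lambda/my-function",  # assumed log group
    filterName="heartbeat-filter",
    filterPattern='"HEARTBEAT"',             # assumed message to match
    metricTransformations=[{
        "metricName": "HeartbeatCount",
        "metricNamespace": "MyApp",
        "metricValue": "1",
        # Emit 0 whenever logs arrive but none match, keeping the metric continuous
        "defaultValue": 0.0,
    }],
)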
If you work with a CloudWatch alarm through the AWS CLI (e.g. put-metric-alarm), you can specify the period in seconds; only the web interface limits the period to a fixed set of values.
https://docs.aws.amazon.com/cli/latest/reference/cloudwatch/describe-alarms.html
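Putting the two answers together, here is a boto3 sketch (names, threshold, and SNS topic are assumptions) of an alarm that fires when the heartbeat metric above is low or missing for 3 of the last 4 minutes:

import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="heartbeat-missing",
    Namespace="MyApp",            # assumed, matching the filter above
    MetricName="HeartbeatCount",
    Statistic="Sum",
    Period=60,                    # 1-minute datapoints
    EvaluationPeriods=4,
    DatapointsToAlarm=3,          # the M out of N option: 3 out of 4
    Threshold=1,
    ComparisonOperator="LessThanThreshold",
    TreatMissingData="breaching", # count missing datapoints as bad
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:alerts"],  # assumed topic
)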
I am currently using the boto3 SDK from a Lambda function to retrieve various information about the SageMaker notebook instances deployed in my account (almost 70, so not that many...).
One of the operations I am trying to perform is listing the tags for each instance.
However, from time to time it takes ages to return the tags: my Lambda either gets stopped (I could increase the timeout, but still...) or a ThrottlingException is raised by the sagemaker.list_tags call (which could be avoided by increasing the number of retries when creating the SageMaker boto3 client):
import time

import boto3
from botocore.config import Config

# Retry up to 10 times on throttling errors
sagemaker = boto3.client("sagemaker", config=Config(retries=dict(max_attempts=10)))

# (this snippet runs inside the Lambda handler)
instances_dict = sagemaker.list_notebook_instances()
if not instances_dict['NotebookInstances']:
    return "No Notebook Instances"

while instances_dict:
    for instance in instances_dict['NotebookInstances']:
        print(instance['NotebookInstanceArn'])
        start = time.time()
        # This call occasionally takes ages or raises ThrottlingException
        tags_notebook_instance = sagemaker.list_tags(ResourceArn=instance['NotebookInstanceArn'])['Tags']
        print(time.time() - start)
    instances_dict = sagemaker.list_notebook_instances(NextToken=instances_dict['NextToken']) if 'NextToken' in instances_dict else None
If you have any ideas on how to avoid such delays :)
TY
As you've noted, you're getting throttled. Rather than increasing the number of retries, you might try changing the delay (i.e. increasing the growth_factor). It seems to be configurable, looking at https://github.com/boto/botocore/blob/develop/botocore/data/_retry.json#L83
Note that throttling buckets (and refill rates) are usually at second granularity. So with 70 ARNs you're looking at some number of seconds; double digits would not surprise me.
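One possible alternative (an assumption on my part, depending on your botocore version) is the adaptive retry mode, which adds client-side rate limiting on top of the retries:

import boto3
from botocore.config import Config

sagemaker = boto3.client(
    "sagemaker",
    # 'adaptive' throttles the client itself instead of just retrying harder
    config=Config(retries={"max_attempts": 10, "mode": "adaptive"}),
)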
You might want to consider breaking up the work differently, since adding retries or a larger growth_factor will just increase how long the function runs. (See the sketch after the links below.)
I've had pretty good success breaking things up so that the Lambda function only processes a single ARN per invocation. The Lambda processes work from a queue (I'll typically use an SQS queue to manage what needs to be processed), and the rate of work is configurable via a combination of the Lambda configuration and the SQS message visibility timeout.
Not knowing what you're trying to accomplish outside of your original Lambda, I realize that breaking up the work this way might (or will) add challenges to what you're doing overall.
It's also worth noting that if you have CloudTrail enabled the tags will be part of the event data (request data) for the "EventName" (which matches the method called, i.e. CreateTrainingJob, AddTags, etc.).
A third option: if you are trying to find all of the notebook instances with a specific tag, you can use Resource Groups to create a query and find the ARNs with those tags fairly quickly.
CloudTrail: https://docs.aws.amazon.com/awscloudtrail/latest/APIReference/Welcome.html
Resource Groups: https://docs.aws.amazon.com/ARG/latest/APIReference/Welcome.html
Lambda with SQS: https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html
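A rough sketch of that queue pattern (the queue URL and function split are assumptions, not tested code): one Lambda enqueues the ARNs and a second, SQS-triggered Lambda lists the tags for one ARN per message:

import boto3

sagemaker = boto3.client("sagemaker")
sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/notebook-arns"  # assumed

def enqueue_handler(event, context):
    # Page through the notebook instances and enqueue one message per ARN
    paginator = sagemaker.get_paginator("list_notebook_instances")
    for page in paginator.paginate():
        for instance in page["NotebookInstances"]:
            sqs.send_message(QueueUrl=QUEUE_URL,
                             MessageBody=instance["NotebookInstanceArn"])

def worker_handler(event, context):
    # With an SQS trigger, each record carries a single ARN
    for record in event["Records"]:
        arn = record["body"]
        tags = sagemaker.list_tags(ResourceArn=arn)["Tags"]
        print(arn, tags)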
I'm still trying to understand how the pieces might fit together in an AWS workflow for FTPing weather forecast files on a daily basis and storing them in S3.
I've created a single lambda function that will accept a forecast start time (e.g. 2016-04-21_12Z) and a forecast hour (e.g. 3, 6, 9, etc...), retrieve the file and store it.
Once a day, I would like to call this Lambda function once for each forecast hour; each invocation retrieves and stores one file.
Of course, I could just set up a Python program on my own machine to launch each of these, but I would like to do this in an AWS workflow; I just don't understand which tools are available to help me with this.
I understand that "maybe" I could use CloudWatch to generate events periodically, and I suppose I could have one event for each forecast hour. But is there a straightforward mechanism by which I could launch a single Python program that would in turn launch a collection of Lambda calls, one for each forecast hour?
I suspect the tools are there, but I'm struggling to connect the dots right now.
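For what it's worth, the fan-out described here could be a single scheduled "dispatcher" Lambda that invokes the retrieval function asynchronously once per forecast hour. This is only a hedged sketch (the function name, hours, and payload shape are assumptions):

import json
import boto3

lambda_client = boto3.client("lambda")

def dispatcher_handler(event, context):
    # Triggered once a day by a CloudWatch Events schedule; fires one
    # async invocation of the retrieval Lambda per forecast hour
    for forecast_hour in (3, 6, 9, 12):           # assumed hours
        lambda_client.invoke(
            FunctionName="retrieve-forecast",     # assumed worker function
            InvocationType="Event",               # async, fire-and-forget
            Payload=json.dumps({
                "start_time": "2016-04-21_12Z",   # example from the question
                "forecast_hour": forecast_hour,
            }),
        )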