How to "merge" AWS Cloudwatch dimensions? - amazon-web-services

I'm trying to set up a CloudWatch alarm that follows 2 'rules':
CPU utilization is higher than 80%
DatabaseClass is of a certain 'type' (let's say db.t3.small)
Is there a way to do that?
I'm trying to "merge" both of those queries.
I hope to get a single graph grouped by instance id that also carries the database class property, so I can decide based on it whether or not to fire the alert.
I've tried writing it like this
SELECT MAX(CPUUtilization)
FROM SCHEMA("AWS/RDS", DBInstanceIdentifier)
WHERE DatabaseClass != 'db.t3.small'
GROUP BY DBInstanceIdentifier
but it seems like there is no such 'column' as DatabaseClass in this schema.
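For what it's worth, here is a minimal boto3 sketch (my own diagnostic, not an official recipe) that lists which dimension sets are actually published for the metric, i.e. what SCHEMA(...) has to work with:

import boto3

cloudwatch = boto3.client("cloudwatch")
paginator = cloudwatch.get_paginator("list_metrics")
# print every dimension combination published for RDS CPUUtilization
for page in paginator.paginate(Namespace="AWS/RDS", MetricName="CPUUtilization"):
    for metric in page["Metrics"]:
        print([d["Name"] for d in metric["Dimensions"]])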
Any suggestions?

Related

Is there a way to publish custom metrics from AWS Glue jobs?

I'm using an AWS Glue job to move and transform data across S3 buckets, and I'd like to build custom accumulators to monitor the number of rows that I'm receiving and sending, along with other custom metrics. What is the best way to monitor these metrics? According to this document: https://docs.aws.amazon.com/glue/latest/dg/monitoring-awsglue-with-cloudwatch-metrics.html I can keep track of general metrics on my Glue job, but there doesn't seem to be a good way to send custom metrics through CloudWatch.
I have done lots of similar projects like this; each micro-batch can be:
a file or a bunch of files
a time interval of data from an API
a partition of records from a database
etc ...
Your use case can be broken down into four parts:
given a bunch of input, how do you define a task_id?
how do you want to define the metrics for your task? You need a simple dictionary structure for this metrics data (see the sketch below)
find a backend data store to store the metrics data
find a way to query the metrics data
In some business use cases you also need to store status information to track each input: did it succeed, fail, get stuck, or is it still in progress? You may also want retry and concurrency control (to avoid multiple workers working on the same input).
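As a concrete illustration, the metrics dictionary mentioned above can be as simple as the sketch below; every field name here is an assumption for this example, not something prescribed by a library:

# illustrative metrics record for one micro-batch; all keys are made up
your_metrics = {
    "task_id": "s3://my-bucket/input/2023-01-01.csv",  # derived from the input
    "status": "succeeded",  # succeeded / failed / in_progress / stuck
    "rows_received": 10000,
    "rows_sent": 9950,
}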
DynamoDB is the perfect backend for this type of use case. It is a super fast, no-ops, pay-as-you-go, automatically scaling key-value store.
There's a Python library that implements this pattern: https://github.com/MacHu-GWU/pynamodb_mate-project/blob/master/examples/patterns/status-tracker.ipynb
Here's an example:
Put your Glue ETL job's main logic in a function:
def glue_job() -> dict:
    ...
    return your_metrics
Given an input, calculate the task_id identifier; then you just need:
tracker = Tracker.new(task_id)
# start the job, it will succeed
with tracker.start_job():
    # do some work
    your_metrics = glue_job()
    # save your metrics in dynamodb
    tracker.set_data(your_metrics)
Consider enabling continuous logging on your AWS Glue job. This will allow you to do custom logging via CloudWatch; custom logging can include information such as row counts.
More specifically
Enable continuous logging for your Glue job
Add logger = glueContext.get_logger() at the beginning of your Glue job
Add logger.info("Custom logging message that will be sent to CloudWatch") wherever you want to log information to CloudWatch. For example, if I have a data frame named df, I could log the number of rows to CloudWatch by adding logger.info("Row count of df " + str(df.count()))
Your log messages will be located in the CloudWatch log group /aws-glue/jobs/logs-v2, under the log stream named glue_run_id-driver.
You can also reference the "Logging Application-Specific Messages Using the Custom Script Logger" section of the AWS documentation Enabling Continuous Logging for AWS Glue Jobs for more information on application specific logging.
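Putting those steps together, a minimal sketch of a Glue script using the custom script logger might look like this (df stands in for whatever DataFrame your job actually builds):

from pyspark.context import SparkContext
from awsglue.context import GlueContext

# the Glue-provided logger writes to the continuous-logging log streams
glueContext = GlueContext(SparkContext.getOrCreate())
logger = glueContext.get_logger()

# ... read and transform your data into df ...
logger.info("Row count of df " + str(df.count()))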

How can I get AWS lambda usage for the last hour?

I would like to know if there is a way to get all of my Lambda invocation usage for the last hour (ideally at 5-minute resolution).
It would also be nice to get the cost usage (but from what I've read, that only updates once a day).
From looking at the documentation it seems like I can use GetMetricData (CloudWatch); is there a better option for my use case?
You can get this information by region within CloudWatch metrics.
In the AWS/Lambda namespace there is a metric named Invocations; it can be viewed for the entire region or on a per-Lambda basis.
If you look at the Sum over whichever period you want (you can get down to 1-minute values for this metric), you will be able to get these values in near real time.
You can get these values from within the console or by using the get-metric-data command within the CLI or SDK.
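A minimal boto3 sketch of that get-metric-data call, assuming the region-wide aggregate (add a FunctionName dimension to scope it to a single function):

import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch")
now = datetime.now(timezone.utc)

# summed Lambda invocations for the last hour, in 5-minute buckets
response = cloudwatch.get_metric_data(
    MetricDataQueries=[{
        "Id": "invocations",
        "MetricStat": {
            "Metric": {"Namespace": "AWS/Lambda", "MetricName": "Invocations"},
            "Period": 300,  # 5 minutes
            "Stat": "Sum",
        },
    }],
    StartTime=now - timedelta(hours=1),
    EndTime=now,
)
result = response["MetricDataResults"][0]
for timestamp, value in zip(result["Timestamps"], result["Values"]):
    print(timestamp, value)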
There are many tools to get metrics on your lambda, so it really depends on your needs.
What do you mean by "is there a better one for my use case"?
If you prefer, you can check it through the console: go to CloudWatch -> Metrics and navigate to your Lambda. You can aggregate the data differently (for example: average per 5 minutes, or total per day, etc.)
Here's a great doc: https://docs.aws.amazon.com/lambda/latest/dg/monitoring-metrics.html#monitoring-metrics-invocation
Moreover, here's a solution that I gave that surveys different approaches to monitor lambda resources: Best Way to Monitor Customer Usage of AWS Lambda
Disclosure: I work for Lumigo, a company that does exactly that.

Get all items in DynamoDB with API Gateway's Mapping Template

Is there a simple way to retrieve all items from a DynamoDB table using a mapping template in an API Gateway endpoint? I usually use a Lambda to process the data before returning it, but this is such a simple task that a Lambda seems like overkill.
I have a table that contains data with the following format:
roleAttributeName | roleHierarchyLevel | roleIsActive | roleName
"admin"           | 99                 | true         | "Admin"
"director"        | 90                 | true         | "Director"
"areaManager"     | 80                 | false        | "Area Manager"
I'm happy with getting the data, doesn't matter the representation as I can later transform it further down in my code.
I've been looking around but all tutorials explain how to get specific bits of data through queries and params like roles/{roleAttributeName} but I just want to hit roles/ and get all items.
All you need to do is:
create a resource (without curly braces, since we don't need a particular item)
create a GET method
use Scan instead of Query as the Action while configuring the integration request.
Now run a test; you should get the response.
To try it out in Postman, deploy the API first and then use the provided link in Postman, followed by your resource name.
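For reference, the integration request for the Scan action can use a minimal mapping template like the one below; the table name roles is an assumption for this example:

{
    "TableName": "roles"
}

And if you want to flatten DynamoDB's attribute-typed JSON on the way out, a hedged sketch of a response mapping template (matching the sample columns above) could be:

#set($items = $input.path('$.Items'))
[
#foreach($item in $items)
    {
        "roleAttributeName": "$item.roleAttributeName.S",
        "roleHierarchyLevel": $item.roleHierarchyLevel.N,
        "roleIsActive": $item.roleIsActive.BOOL,
        "roleName": "$item.roleName.S"
    }#if($foreach.hasNext),#end
#end
]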
API Gateway allows you to proxy DynamoDB as a service. Here you have an interesting tutorial on how to do it (you can ignore the part related to indexes to make it work).
To retrieve all the items from a table, you can use Scan as the action in API Gateway. Keep in mind that DynamoDB limits result sizes to 1 MB for both Scan and Query actions.
You can also cap the result set yourself, before that limit kicks in, by using the Limit parameter.
AWS DynamoDB Scan Reference
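For instance, applying that Limit parameter to the Scan request body from the sketch above (table name still assumed):

{
    "TableName": "roles",
    "Limit": 100
}

Note that Limit caps the number of items evaluated per Scan call, so fully reading a larger table still requires following ExclusiveStartKey/LastEvaluatedKey pagination.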

How do I create a custom metric for RDS Memory Utilization in Cloudwatch?

I have Enhanced Monitoring turned on for one of my RDS database instances, and accordingly it is publishing messages to the CloudWatch log group called "RDSOSMetrics" every 60 seconds. I found this article on how to create custom metrics, but it doesn't seem to be working.
I'm at the stage where I click "Create Metric Filter", but I'm not understanding the syntax to use for the filter pattern, as everything I try to use results in this error:
There is a problem with your filter pattern. The error is: Invalid metric filter pattern
One such example of a filter pattern I tried to use (but apparently is invalid) is the following:
{ $.memory.physAvailKb * 100 / $.memory.physTotKb }
How can I change this filter pattern to actually be valid?
I don't think you can do calculations within a metric filter.
To my understanding, what you can do is create a filter with {$.memory.physAvailKb > 0}, with the value to report for that filter being $.memory.physAvailKb.
Create another filter for {$.memory.physTotKb > 0} that reports the value of $.memory.physTotKb.
After that, you can use metric math for your operations.
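A hedged sketch of that last step, creating an alarm on a metric-math expression over the two filter metrics; the Custom/RDS namespace and metric names are assumptions and must match whatever you configured on the metric filters:

import boto3

cloudwatch = boto3.client("cloudwatch")
cloudwatch.put_metric_alarm(
    AlarmName="rds-available-memory-pct",
    ComparisonOperator="LessThanThreshold",
    Threshold=10.0,  # alarm when less than 10% of memory is available
    EvaluationPeriods=3,
    Metrics=[
        {"Id": "avail", "ReturnData": False,
         "MetricStat": {"Metric": {"Namespace": "Custom/RDS",
                                   "MetricName": "physAvailKb"},
                        "Period": 60, "Stat": "Average"}},
        {"Id": "total", "ReturnData": False,
         "MetricStat": {"Metric": {"Namespace": "Custom/RDS",
                                   "MetricName": "physTotKb"},
                        "Period": 60, "Stat": "Average"}},
        # the same calculation the filter pattern attempted, done in metric math
        {"Id": "pct", "Expression": "avail * 100 / total",
         "Label": "Available memory (%)", "ReturnData": True},
    ],
)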

Why playing with AWS DynamoDb "Hello world" produces read/write alarms?

I've started to play with DynamoDB and I've created a "dynamo-test" table with a hash PK on userid and a couple more columns (age, name). Read and write capacity is set to 5. I use Lambda and API Gateway with Node.js. Then I manually performed several API calls through API Gateway using a payload similar to this:
{
    "userId": "222",
    "name": "Test",
    "age": 34
}
I've tried to insert the same item a couple of times (which didn't produce an error but silently succeeded). Also, I used the DynamoDB console and browsed the inserted items several times (currently there are only 2). I haven't tracked exactly how many times I did those actions, but everything was done manually. And then, after an hour, I noticed 2 alarms in CloudWatch:
INSUFFICIENT_DATA
dynamo-test-ReadCapacityUnitsLimit-BasicAlarm
ConsumedReadCapacityUnits >= 240 for 12 minutes
No notifications
And a similar alarm with "...WriteCapacityLimit...". Write capacity became OK after 2 minutes, but then went back again after 10 minutes. Anyway, I'm still reading and learning how to plan and monitor these capacities, but this hello-world example scared me a bit, as if I'd exceeded my table's capacity :) Please point me in the right direction if I'm missing some fundamental part!
It's just an "INSUFFICIENT_DATA" message. It means that your table hasn't had any reads or writes in a while, so there is insufficient data available for the CloudWatch metric. This happens with the CloudWatch alarms for any DynamoDB table that isn't used very often. Nothing to worry about.
EDIT: You can now change a setting in CloudWatch alarms to ignore missing data, which will leave the alarm in its previous state instead of changing it to the INSUFFICIENT_DATA state.
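A hedged sketch of recreating the read-capacity alarm from the question with that setting; the alarm parameters are taken from the alarm text above, and TreatMissingData="ignore" keeps the previous state when the table sits idle:

import boto3

cloudwatch = boto3.client("cloudwatch")
cloudwatch.put_metric_alarm(
    AlarmName="dynamo-test-ReadCapacityUnitsLimit-BasicAlarm",
    Namespace="AWS/DynamoDB",
    MetricName="ConsumedReadCapacityUnits",
    Dimensions=[{"Name": "TableName", "Value": "dynamo-test"}],
    Statistic="Sum",
    Period=60,
    EvaluationPeriods=12,  # "for 12 minutes"
    Threshold=240.0,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    TreatMissingData="ignore",  # don't flip to INSUFFICIENT_DATA on idle tables
)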