I want to calculate the percentage of disk space used for AWS RDS via CloudWatch metrics.
We can see the FreeStorageSpace metric (the amount of available storage space).
Knowing the total storage allocated to the RDS instance would let me calculate the percentage.
Where can I get the total storage size, since no metric for it is available?
As far as I know, there's no standard CloudWatch metric for RDS used-space percentage or total instance size, only the already mentioned FreeStorageSpace, which is reported in bytes.
However, you can calculate the percentage by getting the total size via the AWS CLI command describe-db-instances [1]. The same call also exists in the RDS clients of the AWS SDKs (although I have only confirmed it in Python's boto3 library) [2]. The output is a list of instance objects in JSON format, each of which contains the parameter AllocatedStorage describing the total size of the instance in gibibytes (GiB). After converting to the same unit, you can calculate the percentage of free storage space. Depending on your use case, you can then perform some direct action or publish the calculated percentage as a custom CloudWatch metric.
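As a rough sketch of that calculation using boto3 (the instance identifier, custom namespace, and metric name below are placeholders, and it assumes at least one recent FreeStorageSpace datapoint exists):

import boto3
from datetime import datetime, timedelta, timezone

DB_INSTANCE_ID = "my-rds-instance"  # placeholder identifier

rds = boto3.client("rds")
cloudwatch = boto3.client("cloudwatch")

# AllocatedStorage is reported in GiB; convert to bytes to match FreeStorageSpace.
instance = rds.describe_db_instances(DBInstanceIdentifier=DB_INSTANCE_ID)["DBInstances"][0]
allocated_bytes = instance["AllocatedStorage"] * 1024 ** 3

# Most recent FreeStorageSpace datapoint (bytes) from the last hour.
now = datetime.now(timezone.utc)
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/RDS",
    MetricName="FreeStorageSpace",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": DB_INSTANCE_ID}],
    StartTime=now - timedelta(hours=1),
    EndTime=now,
    Period=300,
    Statistics=["Average"],
)
free_bytes = sorted(stats["Datapoints"], key=lambda d: d["Timestamp"])[-1]["Average"]

free_percent = 100.0 * free_bytes / allocated_bytes

# Optionally publish the percentage as a custom metric (namespace/name are made up).
cloudwatch.put_metric_data(
    Namespace="Custom/RDS",
    MetricData=[{
        "MetricName": "FreeStoragePercent",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": DB_INSTANCE_ID}],
        "Value": free_percent,
        "Unit": "Percent",
    }],
)

You could run something like this on a schedule (for example from a Lambda function) to keep the custom metric up to date.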
Another interesting solution which might help you was proposed by user alanc10n in a similar question [3].
[1] https://docs.aws.amazon.com/cli/latest/reference/rds/describe-db-instances.html
[2] https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/rds.html#RDS.Client.describe_db_instances
[3] https://stackoverflow.com/questions/58657063/how-do-i-get-totalstoragespace-or-usedstoragespace-metric-from-aws-rds
Related
Is there a quick way to check how much data (volume-wise: GBs, TBs, etc.) a specific DMS task transferred, for example within the last month?
I can't find any note on that in the documentation. I could probably try with boto3, but I want to double-check first. Thanks for the help!
Even with boto3, you can check the DescribeReplicationTasks API, but it most likely does not contain information about your data transfer volume.
Reference: https://docs.aws.amazon.com/dms/latest/APIReference/API_DescribeReplicationTasks.html
If you have only one replication task associated with only one replication instance, you can check that replication instance's network metrics in CloudWatch. Under the AWS DMS namespace in CloudWatch metrics there are several network metrics, such as NetworkTransmitThroughput or NetworkReceiveThroughput. You can choose one and try the following:
Statistic: Sum
Period: 30 days (or whatever suits you)
That gives you a rough 30-day throughput figure.
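For example, a rough boto3 version of that query could look like the following; the replication instance identifier is a placeholder (and the ReplicationInstanceIdentifier dimension is assumed), and it sums daily Sum datapoints in code instead of requesting a single 30-day period:

import boto3
from datetime import datetime, timedelta, timezone

REPLICATION_INSTANCE_ID = "my-dms-instance"  # placeholder identifier

cloudwatch = boto3.client("cloudwatch")
now = datetime.now(timezone.utc)

# Daily Sum datapoints of transmit throughput for the last 30 days, added up in code.
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/DMS",
    MetricName="NetworkTransmitThroughput",
    Dimensions=[{"Name": "ReplicationInstanceIdentifier", "Value": REPLICATION_INSTANCE_ID}],
    StartTime=now - timedelta(days=30),
    EndTime=now,
    Period=86400,
    Statistics=["Sum"],
)
total = sum(dp["Sum"] for dp in stats["Datapoints"])
print(f"Approximate 30-day NetworkTransmitThroughput sum: {total}")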
I'm trying to identify the initial creation date of a metric on CloudWatch using the AWS CLI but don't see any way of doing so in the documentation. I can kind of identify the start date if there is a large block of missing data but that doesn't work for metrics that have large gaps in data.
CloudWatch metrics are "created" with the first PutMetricData call that includes the metric. I use quotes around "created" because the metric doesn't have an independent existence; it's simply an entry in the time-series database. If there's a gap in time with no entries, the metric effectively does not exist for that gap.
Another caveat of CloudWatch metrics is that they only have a lifetime of 455 days, and individual metric values are aggregated into coarser resolutions as they age (see specifics at link).
All of which raises the question: what's the real problem that you're trying to solve?
I see that AWS RDS provides a FreeStorageSpace metric for monitoring disk usage. Now I am trying to create a generic pre-emptive alert for all my RDS instances, but setting an ideal threshold on FreeStorageSpace doesn't make sense on its own.
For example, 20 GB might be a good threshold for an RDS instance with 100 GB of total disk space, but it could be misleading for an instance with only 40 GB.
So I was wondering if there is a way to get a TotalStorageSpace or UsedStorageSpace metric from RDS (directly or indirectly).
Update
Since the fact is established that FreeStorageSpace is the only storage-related metric RDS provides, any ideas on whether and how we can build a custom metric for TotalStorageSpace or UsedStorageSpace?
p.s.: Creating a separate alarm for each RDS instance to evaluate disk usage percentage seems like a waste of time and resources.
If you enable Enhanced Monitoring, the RDSOSMetrics log group in CloudWatch Logs will have detailed JSON log messages that include filesystem statistics. I ended up creating a CloudWatch Logs metric filter to parse out the usedPercent value from the fileSys attribute for the root filesystem. At least for PostgreSQL, these detailed logs include both the / and /rdsdbdata filesystems; the latter is the one that matters for storage space.
You can create a metric filter of the form {$.instanceID = "My_DB_Instance_Name" && $.fileSys[0].mountPoint = "/rdsdbdata"} with a corresponding metric value of $.fileSys[0].usedPercent to get the used storage percentage for a given instance. This is then available as a log-based metric that you can use to trigger an alarm. You probably need to create another filter replacing fileSys[0] with fileSys[1], since the ordering of that array is not guaranteed. You'd probably want to create these for each RDS instance you have so you know which one is running out of space, but your question seems to indicate you don't want a per-instance alarm.
I suppose you could exclude $.instanceID from the metric filter and have all values written to a single metric. When it reaches the threshold and triggers an alarm, you'd then need to check which instance is responsible.
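If you want to script the filter creation, a minimal boto3 sketch might look like this; the filter name, custom metric name, and namespace are made up, and the instance name is a placeholder:

import boto3

logs = boto3.client("logs")

DB_INSTANCE_ID = "My_DB_Instance_Name"  # placeholder instance name

# The RDSOSMetrics log group is populated by Enhanced Monitoring.
logs.put_metric_filter(
    logGroupName="RDSOSMetrics",
    filterName=f"{DB_INSTANCE_ID}-used-storage-percent",  # made-up filter name
    filterPattern=(
        '{ $.instanceID = "' + DB_INSTANCE_ID + '" '
        '&& $.fileSys[0].mountPoint = "/rdsdbdata" }'
    ),
    metricTransformations=[{
        "metricName": "UsedStoragePercent",   # made-up metric name
        "metricNamespace": "Custom/RDS",      # made-up namespace
        "metricValue": "$.fileSys[0].usedPercent",
    }],
)

As noted above, you may need a second filter using fileSys[1], since the ordering of that array is not guaranteed.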
According to the docs, FreeStorageSpace is the only storage-space metric you can get.
I can only assume their logic is that you already know your total allocated space, so with the FreeStorageSpace value you can also calculate how much is used.
First, you can check storage-related info in the monitoring section of AWS RDS.
Now I am trying to create a generic pre-emptive alert for all my RDS but setting up an ideal threshold on FreeStorageSpace is not making sense. For example, 20G might be a good threshold with RDS having total disk space as 100G but might be misleading for a RDS with total disk space of 40G.
If your instances have different storage sizes, then you need to configure multiple alarms based on size. A generic alarm will not work, because the threshold does not accept a percentage.
How can I create CloudWatch alarms to monitor the Amazon RDS free storage space and prevent storage full issues?
Short Description
Create alarms in the CloudWatch console or use the AWS Command Line Interface (AWS CLI) to create alarms that monitor free storage space. By creating CloudWatch alarms that notify you when the FreeStorageSpace metric reaches a defined threshold, you can prevent storage-full issues. This can prevent the downtime that occurs when your RDS DB instance runs out of storage.
Resolution
Open the CloudWatch console, and choose Alarms from the navigation pane.
Choose Create alarm, and then choose Select metric.
From the All metrics tab, choose RDS.
Choose Per-Database Metrics.
Search for the FreeStorageSpace metric.
For the instance that you want to monitor, choose the DB instance Identifier FreeStorageSpace metric.
In the Conditions section, configure the threshold. For example, choose Lower/Equal, and then specify the threshold value.
Note: You must specify the value for the parameter in bytes. For example, 10 GB is 10737418240 bytes.
For more details, you can check storage-full-rds-cloudwatch-alarm.
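For completeness, a rough boto3 equivalent of those console steps (alarm name, DB instance identifier, and SNS topic ARN are placeholders; the threshold is the 10 GB example from the note above):

import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="rds-free-storage-low",  # placeholder name
    AlarmDescription="Alarm when RDS free storage drops to 10 GB or less",
    Namespace="AWS/RDS",
    MetricName="FreeStorageSpace",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "my-rds-instance"}],  # placeholder
    Statistic="Minimum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=10737418240,  # 10 GB in bytes
    ComparisonOperator="LessThanOrEqualToThreshold",
    TreatMissingData="missing",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:my-topic"],  # placeholder SNS topic
)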
I have recently come across this. I am setting up CloudWatch alarms for a wide mix of RDS instances. As you note, a static threshold does not make much sense when the allocated storage varies.
I am creating the alarms using PowerShell. I have a for-each loop that iterates through the RDS instances I need to create alarms for. The criterion for raising an alarm is the disk having only 10% free space, or 100 GB, whichever is less. Here is the important part of the script:
$AWSAccountName = "aws-account-name"
$Region = "us-east-1"
$DBInstanceIdentifier = "rds-name"
$DBInstance = Get-RDSDBInstance -DBInstanceIdentifier $DBInstanceIdentifier -Region $Region
$MetricName = "FreeStorageSpace"
$ThresholdPerCent = 0.10 * $DBInstance.AllocatedStorage * 1.074e+9 # 10% free disk space in bytes
$Threshold = ($ThresholdPerCent, 107374182400 | Measure-Object -Minimum).Minimum # whichever is smaller: 10% of allocated storage or 100 GB in bytes
I specify several other variables that are splatted into a hashtable, and then create the alarm:
# Specify Parameters
$params = @{"AlarmName" = $AlarmName;
"AlarmDescription" = $AlarmDesc;
"ActionsEnabled" = $true;
"AlarmAction" = $AlarmAction;
"ComparisonOperator" = "LessThanOrEqualToThreshold";
"Dimensions" = $dimensions;
"EvaluationPeriod" = 1;
"MetricName" = $MetricName;
"Namespace" = "AWS/RDS";
"Period" = 300;
"Statistic" = "Minimum";
"DatapointsToAlarm" = 1;
"Threshold" = $Threshold;
"TreatMissingData" = "missing";
"Region" = $Region
}
# Create Rule
Write-CWMetricAlarm @params -Force
If the allocated storage for the instance is increased, you can re-run this and it will update the threshold.
When configuring scaling for an EC2 Auto Scaling group, I have the option of scaling on ASGAverageNetworkIn, which is defined as the average number of bytes received in five minutes...
Averaged how?
In other words: if I wanted to maintain something near 50% use of a 10Gigabit connection, would that be 625000 or 187500000 average bytes?
It's looking like it's probably 187500000 average bytes, but I can't find any documentation to definitively confirm this. (If it said "total bytes received in 5 minutes by one ec2 instance", for example, that would definitively confirm this.)
I'm guessing it's the average of total bytes received by one EC2 instance across either 1 or 5 minutes, depending on whether you're using detailed monitoring (1 min) or not (5 min).
From https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-scaling-target-tracking.html :
ASGAverageNetworkIn—Average number of bytes received by a single instance on all network interfaces.
ASGAverageNetworkOut—Average number of bytes sent out from a single instance on all network interfaces.
Rather than attempting to calculate figures to use for the CloudWatch Alarm, I would recommend you look at historical information on the metric.
Identify periods where the server was "too busy" and you would have liked to scale-out, then check what the Network metric was showing in CloudWatch. This way, you can map "real world" situations back to the metric values, which will more reliably work in future.
In fact, look at all the metrics (CPU, Network, etc.) to see which ones can reliably identify when a scaling event should have occurred. This will be much more reliable than trying to calculate a value that might be an indication of load.
I want to use the AWS API to get logs of read/write operations, with the consumed read/write capacity, for the last two days or the previous day.
How can I do that?
CloudWatch tracks the read and write capacity units in the metrics below.
ConsumedReadCapacityUnits
The number of read capacity units consumed over the specified time period, so you can track how much of your provisioned throughput is used. You can retrieve the total consumed read capacity for a table and all of its global secondary indexes, or for a particular global secondary index.
ConsumedWriteCapacityUnits
The number of write capacity units consumed over the specified time period, so you can track how much of your provisioned throughput is used. You can retrieve the total consumed write capacity for a table and all of its global secondary indexes, or for a particular global secondary index.
To view metrics (console)
Metrics are grouped first by the service namespace, and then by the various dimension combinations within each namespace.
Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.
In the navigation pane, choose Metrics.
Select the DynamoDB namespace.
To view metrics (CLI)
At a command prompt, use the following command:
aws cloudwatch list-metrics --namespace "AWS/DynamoDB"
You can also use the CLI to get the metrics for a given time period.
http://docs.aws.amazon.com/cli/latest/reference/cloudwatch/get-metric-statistics.html
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/metrics-dimensions.html
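As a quick sketch of the same query through boto3 (the table name and the one-day window are placeholders), you could pull hourly consumed read capacity like this:

import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch")
now = datetime.now(timezone.utc)

# Hourly sums of consumed read capacity for a placeholder table over the previous day.
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/DynamoDB",
    MetricName="ConsumedReadCapacityUnits",
    Dimensions=[{"Name": "TableName", "Value": "my-table"}],  # placeholder table name
    StartTime=now - timedelta(days=1),
    EndTime=now,
    Period=3600,
    Statistics=["Sum"],
)
for dp in sorted(stats["Datapoints"], key=lambda d: d["Timestamp"]):
    print(dp["Timestamp"], dp["Sum"])

The same numbers are available from the CLI with aws cloudwatch get-metric-statistics and the equivalent parameters.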