Retrieve SageMaker instance metrics from CloudWatch via CLI or API - amazon-web-services

I've got a training job running on SageMaker. I would like to retrieve instance metrics such as MemoryUtilization via the CLI or a boto3 client.
Obviously I can see them in the console. However, I cannot see them in the CLI/API. For example, when running:
aws cloudwatch list-metrics --namespace "AWS/SageMaker"
I can see only metrics regarding endpoint invocation but not any training job related metrics.
Any idea?
Thanks!
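One possible explanation, offered as a sketch rather than a confirmed answer: SageMaker publishes instance-level metrics for training jobs (CPUUtilization, MemoryUtilization, and so on) under the namespace /aws/sagemaker/TrainingJobs rather than AWS/SageMaker, with a Host dimension of the form <training-job-name>/algo-<n>. Listing that namespace with aws cloudwatch list-metrics should show them. The helper names below (build_metric_query, fetch_datapoints) and the job name are hypothetical; the cloudwatch argument is assumed to be a boto3 CloudWatch client.

```python
from datetime import datetime, timedelta, timezone

# Namespace SageMaker uses for training job instance metrics.
TRAINING_NAMESPACE = "/aws/sagemaker/TrainingJobs"


def build_metric_query(job_name, metric_name="MemoryUtilization",
                       instance_index=1, hours=3, now=None):
    """Build keyword arguments for cloudwatch.get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": TRAINING_NAMESPACE,
        "MetricName": metric_name,
        # Each training instance reports as "<job-name>/algo-<n>".
        "Dimensions": [
            {"Name": "Host", "Value": f"{job_name}/algo-{instance_index}"}
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 60,
        "Statistics": ["Average", "Maximum"],
    }


def fetch_datapoints(cloudwatch, job_name, **kwargs):
    """Fetch datapoints for one training instance, oldest first."""
    response = cloudwatch.get_metric_statistics(
        **build_metric_query(job_name, **kwargs)
    )
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   points = fetch_datapoints(boto3.client("cloudwatch"), "my-training-job")
```

The request building is kept separate from the API call so the namespace and dimension can be checked without touching AWS.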

Related

Prometheus forward alerts to Cloudwatch

I am running a Kubernetes cluster on AWS EKS. All the alarms are managed in AWS CloudWatch. While that could change in the future, this is a requirement I have to deal with today.
I also have alerts in Prometheus. I wish to "export" them to CloudWatch. What would be the best solution for this? I see only two possibilities so far:
I create a Lambda function in AWS that queries the ALERTS{} series from Prometheus and exports the result to CloudWatch. I then create an additional alarm in CloudWatch that monitors the state of the Prometheus alert.
I create a webhook receiver in Alertmanager that calls an API Gateway endpoint in AWS, which would turn the alarm in CloudWatch on or off.
Any other suggestions?
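To make the second option more concrete, here is a minimal sketch of a Lambda-style handler behind API Gateway that turns an Alertmanager webhook payload into a 0/1 custom metric per alert name; a CloudWatch alarm on that metric would then mirror the Prometheus alert. The namespace, metric name, and function names are hypothetical, the event is assumed to be an API Gateway proxy event with a plain JSON string body, and cloudwatch is assumed to be a boto3 CloudWatch client.

```python
import json

NAMESPACE = "Prometheus/Alerts"   # hypothetical custom namespace
METRIC_NAME = "AlertFiring"       # hypothetical metric name


def alerts_to_metric_data(payload):
    """Map an Alertmanager webhook payload to PutMetricData entries.

    Each alert name becomes one datapoint: 1.0 if any alert with that
    name is firing in this notification, otherwise 0.0.
    """
    state = {}
    for alert in payload.get("alerts", []):
        name = alert.get("labels", {}).get("alertname", "unknown")
        firing = 1.0 if alert.get("status") == "firing" else 0.0
        state[name] = max(state.get(name, 0.0), firing)
    return [
        {
            "MetricName": METRIC_NAME,
            "Dimensions": [{"Name": "alertname", "Value": name}],
            "Value": value,
            "Unit": "Count",
        }
        for name, value in sorted(state.items())
    ]


def handle_webhook(event, cloudwatch):
    """Parse the webhook body and publish the alert states to CloudWatch."""
    payload = json.loads(event["body"])
    metric_data = alerts_to_metric_data(payload)
    if metric_data:
        cloudwatch.put_metric_data(Namespace=NAMESPACE, MetricData=metric_data)
    return {"statusCode": 200,
            "body": json.dumps({"published": len(metric_data)})}
```

This relies on Alertmanager also sending resolved notifications to the webhook, so that the metric drops back to 0 when the alert clears.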

How to see Powertools custom metrics in CloudWatch from ECS?

I have an ECS container deployed from a Docker image, and I use aws_powertools for logging and for creating metrics in CloudWatch. However, I can only see my logs and the printed metrics. My custom metrics do not show up in CloudWatch automatically, as they normally do when running in Lambda.
So, how can I solve this, or is there any way to get metrics from an ECS container into CloudWatch?
This is how I create the metrics:
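import json

from aws_lambda_powertools import Metrics
from aws_lambda_powertools.metrics import MetricUnit

# execution_duration is measured elsewhere in the application (seconds).
metrics = Metrics(service="servicename", namespace="namespace")
metrics.add_metric(
    name="name",
    unit=MetricUnit.Seconds,
    value=execution_duration,
)
# Serialize to an EMF (Embedded Metric Format) document and print it to stdout.
serialized_metric = metrics.serialize_metric_set()
metrics.clear_metrics()
print(json.dumps(serialized_metric, separators=(",", ":")))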
Thank you in advance!
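If the printed EMF lines are not being turned into metrics in the ECS setup (I have not verified why that would be the case here), one fallback is to publish the same data explicitly with PutMetricData. The sketch below converts an EMF document, such as the serialized_metric dict from the question, into PutMetricData requests. The function names are hypothetical, cloudwatch is assumed to be a boto3 CloudWatch client, and the task role is assumed to allow cloudwatch:PutMetricData.

```python
def emf_to_metric_data(blob):
    """Convert one EMF document (a dict) into (namespace, MetricData) pairs."""
    requests = []
    for directive in blob["_aws"]["CloudWatchMetrics"]:
        entries = []
        # EMF lists dimension *keys*; the values live at the document root.
        for dimension_set in directive.get("Dimensions") or [[]]:
            dimensions = [
                {"Name": key, "Value": str(blob[key])} for key in dimension_set
            ]
            for definition in directive["Metrics"]:
                raw = blob[definition["Name"]]
                # A metric value may be a single number or a list of numbers.
                values = raw if isinstance(raw, list) else [raw]
                entries.append({
                    "MetricName": definition["Name"],
                    "Dimensions": dimensions,
                    "Unit": definition.get("Unit", "None"),
                    "Values": [float(value) for value in values],
                })
        requests.append((directive["Namespace"], entries))
    return requests


def publish(blob, cloudwatch):
    """Send every metric in the EMF document through PutMetricData."""
    for namespace, entries in emf_to_metric_data(blob):
        cloudwatch.put_metric_data(Namespace=namespace, MetricData=entries)
```

With this in place, publish(serialized_metric, boto3.client("cloudwatch")) could replace or complement the print call.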

How to test a CloudWatch alarm locally?

I am using the AWS CDK library to create an alarm and a metric. Both components are created fine, and once I deploy the CloudFormation template with the cdk deploy command, they are visible in the AWS environment.
But sometimes things do not behave as expected, so I need to test locally.
Is there any way to test a CloudWatch alarm locally?
Any help would be appreciated.
One way is to write a test that sets the alarm state, in any language supported by the AWS SDK. For example, in Python:
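import boto3

client = boto3.client("cloudwatch")

# StateValue must be one of "OK", "ALARM" or "INSUFFICIENT_DATA".
response = client.set_alarm_state(
    AlarmName="YOUR_ALARM",
    StateValue="ALARM",
    StateReason="Testing alarm",
    StateReasonData='{"test": true}',  # optional, must be a JSON string
)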
or AWS CLI
aws cloudwatch set-alarm-state --alarm-name "{YOUR_ALARM}" --state-reason "Testing alarm" --state-value ALARM
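As a rough sketch of how this could be wrapped for a test, the helper below forces a state and reads it back. The function name force_alarm_state is hypothetical, and cloudwatch is assumed to be a boto3 CloudWatch client (or a stand-in with the same two methods, which is what lets the helper itself be exercised without AWS). Note that a forced state is temporary: the alarm returns to its real state at its next evaluation.

```python
import json

VALID_STATES = {"OK", "ALARM", "INSUFFICIENT_DATA"}


def force_alarm_state(cloudwatch, alarm_name, state="ALARM",
                      reason="Manual test"):
    """Force an alarm into a state and return the state reported afterwards."""
    if state not in VALID_STATES:
        raise ValueError(f"state must be one of {sorted(VALID_STATES)}")
    cloudwatch.set_alarm_state(
        AlarmName=alarm_name,
        StateValue=state,
        StateReason=reason,
        # StateReasonData is optional but must be valid JSON when given.
        StateReasonData=json.dumps({"source": "manual-test"}),
    )
    response = cloudwatch.describe_alarms(AlarmNames=[alarm_name])
    alarms = response.get("MetricAlarms", [])
    # None means no metric alarm with that name was found.
    return alarms[0]["StateValue"] if alarms else None
```

Forcing the ALARM state triggers the alarm's configured actions, so this is a way to check the notification path end to end rather than the threshold logic itself.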

Send custom metric data to CloudWatch agent from application

I am trying to send custom metric data from my application to CloudWatch using the CloudWatch agent.
I can successfully send the data to CloudWatch with the aws cloudwatch put-metric-data command in the AWS CLI, but I would instead like the CloudWatch agent to receive my application metrics (connection count, queue count, etc.) and forward them to CloudWatch. Please help.
First, check which metrics the CloudWatch agent supports and collects.
Reference: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/metrics-collected-by-CloudWatch-agent.html
From what I can tell, the CloudWatch agent mostly collects OS-level metrics that it can read directly from the machine, rather than application metrics such as a connection count.
Using put-metric-data as you did is a reasonable approach, and you can customize it in your own code.
More recently, AWS has documented OpenTelemetry support, which handles application metrics and integrates with the CloudWatch agent. I believe this is what you are looking for.
Reference:
https://aws-otel.github.io/docs/introduction
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-open-telemetry.html
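Another route worth checking, not mentioned above, is the agent's StatsD listener: the CloudWatch agent can accept custom metrics over the StatsD protocol when a statsd section is present under metrics_collected in its configuration. The sketch below sends a gauge over UDP using only the standard library. It assumes the agent listens on 127.0.0.1:8125 (the port is whatever service_address is set to), and the metric names and helper names are hypothetical.

```python
import socket


def format_statsd(name, value, metric_type="g", tags=None):
    """Build one StatsD line, e.g. 'app.connections:42|g|#env:prod'.

    metric_type is a StatsD type code such as 'c' (counter), 'g' (gauge)
    or 'ms' (timer). Tags use the DogStatsD-style '#key:value' suffix.
    """
    line = f"{name}:{value}|{metric_type}"
    if tags:
        line += "|#" + ",".join(f"{key}:{val}" for key, val in sorted(tags.items()))
    return line


def send_metric(name, value, metric_type="g", tags=None,
                host="127.0.0.1", port=8125):
    """Send one metric to a StatsD listener over UDP (fire and forget)."""
    payload = format_statsd(name, value, metric_type, tags).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return payload


# Usage, assuming the agent's StatsD listener is enabled on this host:
#   send_metric("app.connections", 42)
#   send_metric("app.queue_depth", 7, tags={"queue": "orders"})
```

Because UDP is fire and forget, a missing or misconfigured listener fails silently, so it is worth confirming the metrics appear in CloudWatch after enabling this.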

How to set up a CloudWatch SQL monitor?

I have a view on a PostgreSQL RDS instance that lists any ongoing deadlocks. Ideally, there are no deadlocks in the database, causing the view to show nothing, but on rare occasions, there are.
How would I set up an alarm in CloudWatch to query this view and raise an alarm if any records are returned?
I found a nice script on GitHub specifically for this:
A Serverless MySQL RDS Data Collection script to push Custom Metrics to CloudWatch on AWS
Basically, there are 2 main possibilities to publish any custom metrics on CloudWatch:
Via API
You can run it on a schedule on an EC2 instance (AWS example) or as a Lambda function (great manual with code examples)
With CloudWatch agent
Here is a nice example: Monitor your Microsoft SQL Server using custom metrics with Amazon CloudWatch and AWS Systems Manager.
Finally, set up CloudWatch alarms with Metric Math and relevant thresholds.
It is not possible to configure Amazon CloudWatch to look inside an Amazon RDS database.
You will need some code running somewhere that regularly runs a query on the database and sends a custom metric to Amazon CloudWatch.
For example, you could trigger an AWS Lambda function, or use cron on an Amazon EC2 instance to trigger a script.
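A minimal sketch of such a script, under stated assumptions: connection is any DB-API 2.0 connection to the database (for example one opened with psycopg2), cloudwatch is a boto3 CloudWatch client, and the view name, namespace, and metric name are hypothetical placeholders.

```python
import re

VIEW_NAME = "active_deadlocks"    # hypothetical name of the deadlock view
NAMESPACE = "Custom/RDS"          # hypothetical custom namespace
METRIC_NAME = "DeadlockCount"     # hypothetical metric name

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_.]*$")


def count_rows(connection, view_name=VIEW_NAME):
    """Return the number of rows currently in the view."""
    # The view name is interpolated into SQL, so restrict it to a plain
    # (optionally schema-qualified) identifier.
    if not _IDENTIFIER.match(view_name):
        raise ValueError(f"unexpected view name: {view_name!r}")
    cursor = connection.cursor()
    try:
        cursor.execute(f"SELECT COUNT(*) FROM {view_name}")
        return int(cursor.fetchone()[0])
    finally:
        cursor.close()


def publish_deadlock_count(connection, cloudwatch):
    """Query the view and publish the row count as a custom metric."""
    count = count_rows(connection)
    cloudwatch.put_metric_data(
        Namespace=NAMESPACE,
        MetricData=[{
            "MetricName": METRIC_NAME,
            "Value": float(count),
            "Unit": "Count",
        }],
    )
    return count
```

Run on a schedule (Lambda or cron), this gives CloudWatch a metric to alarm on, for example with a threshold of greater than 0.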