Cost of GetMetricData API calls when using metric math? - amazon-web-services

I'm working on ingesting metrics from Lambda into our centralized logging system. Our first idea is too costly, so I'm trying to figure out if there is a way to lower the cost (instead of ingesting 3 metrics from 200 Lambdas every 60s).
I've been messing around with metric math and have pretty much figured out what I want to do. I'd run this as a k8s cron-job-like thing and parameterize the start and end time.
How would this be charged? Is it the number of metrics used to perform the math or the number of values that I output?
i.e. m1 and m2 are pulling Errors and Invocations from 200 lambdas. To pull each of these individually would be 400 metrics.
In this method, would it only be 1, 3, or 401?
{
"MetricDataQueries": [
{
"Id": "m1",
"MetricStat": {
"Metric": {
"Namespace": "AWS/Lambda",
"MetricName": "Errors"
},
"Period": 300,
"Stat": "Sum",
"Unit": "Count"
},
"ReturnData": false
},
{
"Id": "m2",
"MetricStat": {
"Metric": {
"Namespace": "AWS/Lambda",
"MetricName": "Invocations"
},
"Period": 300,
"Stat": "Sum",
"Unit": "Count"
},
"ReturnData": false
},
{
"Id": "e1",
"Expression": "m1 / m2",
"Label": "ErrorRate"
}
],
"StartTime": "2020-02-25T02:00:0000",
"EndTime": "2020-02-26T02:05:0000"
}
Output:
{
"Messages": [],
"MetricDataResults": [
{
"Label": "ErrorRate",
"StatusCode": "Complete",
"Values": [
0.0045127626568890146
],
"Id": "e1",
"Timestamps": [
"2020-02-26T19:00:00Z"
]
}
]
}
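For reference, this is roughly how I'd parameterize the start and end time from the cron job. A minimal boto3 sketch, not production code; the 5-minute lookback window is just a placeholder:
# Minimal sketch: run the metric-math query above with a parameterized
# time window (the 5-minute lookback is a placeholder).
import boto3
from datetime import datetime, timedelta, timezone

queries = [
    {"Id": "m1", "ReturnData": False,
     "MetricStat": {"Metric": {"Namespace": "AWS/Lambda", "MetricName": "Errors"},
                    "Period": 300, "Stat": "Sum", "Unit": "Count"}},
    {"Id": "m2", "ReturnData": False,
     "MetricStat": {"Metric": {"Namespace": "AWS/Lambda", "MetricName": "Invocations"},
                    "Period": 300, "Stat": "Sum", "Unit": "Count"}},
    {"Id": "e1", "Expression": "m1 / m2", "Label": "ErrorRate"},
]

end = datetime.now(timezone.utc)
start = end - timedelta(minutes=5)

cloudwatch = boto3.client("cloudwatch")
response = cloudwatch.get_metric_data(
    MetricDataQueries=queries, StartTime=start, EndTime=end)
for result in response["MetricDataResults"]:
    print(result["Label"], result["Values"])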
Example 2:
Same principle. This is pulling the invocations of each function by FunctionName. It then sorts them and outputs the most invoked. Any idea how many metrics this would be?
{
"MetricDataQueries": [
{
"Id": "e2",
"Expression": "SEARCH(' {AWS/Lambda,FunctionName} MetricName=`Invocations` ', 'Sum', 60)",
"ReturnData" : false
},
{
"Id": "e3",
"Expression": "SORT(e2, SUM, DESC, 1)"
}
],
"StartTime": "2020-02-26T12:00:0000",
"EndTime": "2020-02-26T12:01:0000"
}
Same question. 1 or 201 metrics?
Output:
{
"MetricDataResults": [
{
"Id": "e3",
"Timestamps": [
"2020-02-26T12:00:00Z"
],
"Label": "1 - FunctionName",
"Values": [
91.0
],
"StatusCode": "Complete"
}
],
"Messages": []
}

Billing is based on the number of metrics requested: https://aws.amazon.com/cloudwatch/pricing/
In the first example, you're requesting only 2 metrics. These metrics are aggregates of the per-function metrics, but as far as you're concerned, that's only 2 metrics, and you will be billed for 2. You're not billed for the metric math, only for the metrics you request.
In the second example, the number of metrics the search returns is the number you will be billed for: 200 in your case.
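To put rough numbers on it, here is a sketch assuming the GetMetricData rate of $0.01 per 1,000 metrics requested (the published rate at the time of writing; check the pricing page for the current number) and one call every 60s as in your setup:
# Rough cost comparison. Assumption: GetMetricData is billed at
# $0.01 per 1,000 metrics requested (verify on the pricing page).
PRICE_PER_METRIC = 0.01 / 1000
CALLS_PER_MONTH = 30 * 24 * 60          # one GetMetricData call every 60s

naive = 400 * CALLS_PER_MONTH * PRICE_PER_METRIC       # 400 individual metrics
metric_math = 2 * CALLS_PER_MONTH * PRICE_PER_METRIC   # example 1: 2 metrics requested
search = 200 * CALLS_PER_MONTH * PRICE_PER_METRIC      # example 2: 200 metrics matched

print(f"naive: ${naive:.2f}/mo   metric math: ${metric_math:.2f}/mo   search: ${search:.2f}/mo")
# naive: $172.80/mo   metric math: $0.86/mo   search: $86.40/mo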

Related

AWS EventBridge: Add content filtering to ECS task state changes

I am trying to create an EventBridge rule that fires whenever an ECS task is deleted abnormally.
Normally ECS sends all events, even for the created or attached states, but I want to filter only the DELETED state.
I am using CDK to create my event rule. I am trying to implement content filtering based on the status field, which sits inside the attachments field, which is in turn part of the detail field.
Sample event from ECS Task ->
{
"version": "0",
"id": "3317b2af-7005-947d-b652-f55e762e571a",
"detail-type": "ECS Task State Change",
"source": "aws.ecs",
"account": "111122223333",
"time": "2020-01-23T17:57:58Z",
"region": "us-west-2",
"resources": [
"arn:aws:ecs:us-west-2:111122223333:task/FargateCluster/c13b4cb40f1f4fe4a2971f76ae5a47ad"
],
"detail": {
"attachments": [
{
"id": "1789bcae-ddfb-4d10-8ebe-8ac87ddba5b8",
"type": "eni",
"status": "ATTACHED",
"details": [
{
"name": "subnetId",
"value": "subnet-abcd1234"
},
{
"name": "networkInterfaceId",
"value": "eni-abcd1234"
},
{
"name": "macAddress",
"value": "0a:98:eb:a7:29:ba"
},
{
"name": "privateIPv4Address",
"value": "10.0.0.139"
}
]
}
],
"availabilityZone": "us-west-2c",
"clusterArn": "arn:aws:ecs:us-west-2:111122223333:cluster/FargateCluster",
"containers": [
{
"containerArn": "arn:aws:ecs:us-west-2:111122223333:container/cf159fd6-3e3f-4a9e-84f9-66cbe726af01",
"lastStatus": "RUNNING",
"name": "FargateApp",
"image": "111122223333.dkr.ecr.us-west-2.amazonaws.com/hello-repository:latest",
"imageDigest": "sha256:74b2c688c700ec95a93e478cdb959737c148df3fbf5ea706abe0318726e885e6",
"runtimeId": "ad64cbc71c7fb31c55507ec24c9f77947132b03d48d9961115cf24f3b7307e1e",
"taskArn": "arn:aws:ecs:us-west-2:111122223333:task/FargateCluster/c13b4cb40f1f4fe4a2971f76ae5a47ad",
"networkInterfaces": [
{
"attachmentId": "1789bcae-ddfb-4d10-8ebe-8ac87ddba5b8",
"privateIpv4Address": "10.0.0.139"
}
],
"cpu": "0"
}
],
"createdAt": "2020-01-23T17:57:34.402Z",
"launchType": "FARGATE",
"cpu": "256",
"memory": "512",
"desiredStatus": "RUNNING",
"group": "family:sample-fargate",
"lastStatus": "RUNNING",
"overrides": {
"containerOverrides": [
{
"name": "FargateApp"
}
]
},
"connectivity": "CONNECTED",
"connectivityAt": "2020-01-23T17:57:38.453Z",
"pullStartedAt": "2020-01-23T17:57:52.103Z",
"startedAt": "2020-01-23T17:57:58.103Z",
"pullStoppedAt": "2020-01-23T17:57:55.103Z",
"updatedAt": "2020-01-23T17:57:58.103Z",
"taskArn": "arn:aws:ecs:us-west-2:111122223333:task/FargateCluster/c13b4cb40f1f4fe4a2971f76ae5a47ad",
"taskDefinitionArn": "arn:aws:ecs:us-west-2:111122223333:task-definition/sample-fargate:1",
"version": 4,
"platformVersion": "1.3.0"
}
}
CDK code:
{
eventPattern: {
source: ['aws.ecs'],
detailType: ['ECS Task State Change'],
detail: {
clusterArn: [cluster.clusterArn],
attachments: [{ status: [{ prefix: 'DELETED' }] }] // this is not working
},
},
}
Edit: It *is* possible to filter on objects within arrays
detail: { "attachments": {"status": ["DELETED"] } }
EventBridge can match scalars in an array, but not arbitrary objects in an array.
From the docs: if the value in the event is an array, then the event pattern matches if the intersection of the event pattern array and the event array is non-empty.
That means EventBridge cannot match only "status": "DELETED". What are your options?
Base your pattern on a correlated non-array key-value pair, e.g. "lastStatus": "STOPPED".
Match all events and add logic to the event target to ignore the uninteresting ones (see the handler sketch after the CDK example below).
Note: because you say the array reliably has only one element, you can transform the event detail before it gets sent to the target. This does not help with the matching problem, but it can make downstream filtering easier. Here is a CDK example for a Lambda target:
// assumes CDK v1-style imports:
// import * as events from '@aws-cdk/aws-events';
// import * as targets from '@aws-cdk/aws-events-targets';
rule.addTarget(
  new targets.LambdaFunction(func, {
    event: events.RuleTargetInput.fromObject({
      status: events.EventField.fromPath('$.detail.attachments[0].status'),
      original: events.EventField.fromPath('$'),
    }),
  })
);
The Lambda then receives the reshaped event:
{
"status": "ATTACHED",
"original": <the original event>
}
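And a minimal sketch of the second option (target-side filtering), assuming the reshaped event from the input transform above; the handler body is illustrative, not a drop-in implementation:
# Minimal sketch of option 2: let the rule match broadly and drop
# uninteresting states in the target Lambda. Assumes the reshaped event
# produced by the input transform above: {"status": ..., "original": ...}.
def handler(event, context):
    if event.get("status") != "DELETED":
        return {"ignored": True}  # not the state we care about
    # Process the deletion using the full original event.
    task_arn = event["original"]["detail"]["taskArn"]
    print(f"task deleted: {task_arn}")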

How do I combine 2 search metrics with math expression in cloudwatch?

I am trying to get the percentage of memory used when running a Lambda to display in a graph on CloudWatch. I know there are other ways I can pull the data, but for reasons outside the scope of this question, I would like to stick to using search to pull the metrics.
I have the following graph
{
"metrics": [
[ { "expression": "SEARCH('{SomeMetricNamespace} MetricName=\"MemorySize\"', 'Average', 300)", "id": "m1", "visible": "true" } ],
[ { "expression": "SEARCH('{SomeMetricNamespace} MetricName=\"MaxMemoryUsed\"', 'Average', 300)", "id": "m2", "visible": "true" } ],
[ { "expression": "m2/m1*100", "label": "pecentage memory used", "id": "e1", "stat": "Average" } ]
],
"view": "timeSeries",
"stacked": false,
"region": "us-west-2",
"stat": "Average",
"period": 300,
"title": "Memory",
"yAxis": {
"left": {
"label": "Percentage Usage",
"showUnits": false
}
},
"liveData": false
}
The error I am getting
Error in expression e1 [Unsupported operand type(s) for /: '[Array[TimeSeries], Array[TimeSeries]]']
Is there a way to combine the first 2 expressions to give me the percentage memory used?
The results of the expressions are arrays of time series, so you cannot directly apply the operations (+ - * / ^) to them. As a workaround, you could transform each time series into single (average) values for each expression and then calculate the percentage.
The source should be similar to this:
{
"metrics": [
[ { "expression": "SEARCH('{SomeMetricNamespace} MetricName=\"MemorySize\"', 'Average', 300)", "id": "m1", "visible": "false" } ],
[ { "expression": "SEARCH('{SomeMetricNamespace} MetricName=\"MaxMemoryUsed\"', 'Average', 300)", "id": "m2", "visible": "false" } ],
[ { "expression": "AVG(m1)", "label": "AVGMemorySize", "id": "e1", "visible": "false" } ],
[ { "expression": "AVG(m2)", "label": "AVGMaxMemoryUsed", "id": "e2", "visible": "false" } ],
[ { "expression": "e2/e1*100", "label": "pecentage memory used", "id": "e3", "stat": "Average" } ]
],
"view": "timeSeries",
"stacked": false,
"region": "us-west-2",
"stat": "Average",
"period": 300,
"title": "Memory",
"yAxis": {
"left": {
"label": "Percentage Usage",
"showUnits": false
}
},
"liveData": false
}

Monitoring API in Google gives "By" as response

I am reading monitoring data through the Google Monitoring timeSeries API. The API is working correctly: if I give alignmentPeriod=3600s, it gives me the values for that time series between the start and end time for any metric type.
I am calling it through Python like this:
service.projects().timeSeries().list(
name=api_args["project_name"],
filter=api_args["metric_filter"],
aggregation_alignmentPeriod=api_args["aggregation_alignment_period"],
# aggregation_crossSeriesReducer=api_args["crossSeriesReducer"],
aggregation_perSeriesAligner=api_args["perSeriesAligner"],
aggregation_groupByFields=api_args["group_by"],
interval_endTime=api_args["end_time_str"],
interval_startTime=api_args["start_time_str"],
pageSize=config.PAGE_SIZE,
pageToken=api_args["nextPageToken"]
).execute()
and in Postman:
https://monitoring.googleapis.com/v3/projects/my-project/timeSeries?pageSize=500&interval.startTime=2020-07-04T16%3A39%3A37.230000Z&aggregation.alignmentPeriod=3600s&aggregation.perSeriesAligner=ALIGN_SUM&filter=metric.type%3D%22compute.googleapis.com%2Finstance%2Fnetwork%2Freceived_bytes_count%22+&pageToken=&interval.endTime=2020-07-04T17%3A30%3A01.497Z&alt=json&aggregation.groupByFields=metric.labels.key
I face an issue here:
{
"timeSeries": [
{
"metric": {
"labels": {
"instance_name": "insta-demo1",
"loadbalanced": "false"
},
"type": "compute.googleapis.com/instance/network/received_bytes_count"
},
"resource": {
"type": "gce_instance",
"labels": {
"instance_id": "1234343552",
"zone": "us-central1-f",
"project_id": "my-project"
}
},
"metricKind": "DELTA",
"valueType": "INT64",
"points": [
{
"interval": {
"startTime": "2020-07-04T16:30:01.497Z",
"endTime": "2020-07-04T17:30:01.497Z"
},
"value": {
"int64Value": "6720271"
}
}
]
},
{
"metric": {
"labels": {
"loadbalanced": "true",
"instance_name": "insta-demo2"
},
"type": "compute.googleapis.com/instance/network/received_bytes_count"
},
"resource": {
"type": "gce_instance",
"labels": {
"instance_id": "1234566343",
"project_id": "my-project",
"zone": "us-central1-f"
}
},
"metricKind": "DELTA",
"valueType": "INT64",
"points": [
{
"interval": {
"startTime": "2020-07-04T16:30:01.497Z",
"endTime": "2020-07-04T17:30:01.497Z"
},
"value": {
"int64Value": "579187"
}
}
]
}
],
"unit": "By". //This "By" is the value which is causing problem,
I am getting this "unit": "By" or "unit": "ms" value at the end of every response. I also get it when there is no data for a range: the response then contains only the single key "unit", and as I am evaluating the response in Python, I get a key error:
logMessage: "Key Error: ' '"
severity: "ERROR"
Is there any way to prevent that unit value from coming back in the response? I am new to Google Cloud APIs and Python. What can I try next?
The "unit" field expresses the kind of resource the metric is counting. For bytes, it is "By". Read this. I understand it is always returned, so there is no way of not receiving it; I recommend you to adapt your code to correctly deal with its appearance in the responses.

Can AWS EventBridge rules match an object inside an array?

I'm attempting to match events using an EventBridge rule. However, I need to match the event if its array contains an object with some particular properties, and I'm struggling with how to do that.
An example event:
{
"version": "0",
"id": "396bfea8-6311-c1ab-44cf-d44d93014a89",
"detail-type": "ExampleEvent",
"source": "example.com",
"account": "207772098559",
"time": "2020-05-31T19:44:55Z",
"region": "eu-west-1",
"resources": [],
"detail": {
"Id": "2fbf7f1b0b0f462ba16b6076812f1b77",
"Data": {
"entities": [
{
"entityType": "task",
"action": "update",
"entityId": "bbf74ec6-8762-48d6-b09f-23a97834fc2f"
},
{
"entityType": "note",
"action": "update",
"entityId": "bbf74ec6-8762-48d6-b09f-23a97834fc2f"
}
]
}
}
}
I would like the rule to match where the entities collection contains any item with both entityType task and action update. I'd imagined it would look like the below, but this gets the error "Unrecognized match type entityType", as EventBridge thinks the object inside the array means I'm trying to use one of the supported match types.
{
"source": [
"example.com"
],
"detail-type": [
"ExampleEvent"
],
"detail": {
"Data": {
"entities": [
{
"entityType": [
"Task"
],
"action": [
"update"
]
}
]
}
}
}
Hopefully I am answering your question and you are not trying to pull a single entity out of the event; I think you are asking how to match this event.
Here is an event pattern that matches the object array. I removed the array nesting from the pattern, and I lowercased the Task entityType value since matching is case-sensitive.
{
"source": [
"example.com"
],
"detail-type": [
"ExampleEvent"
],
"detail": {
"Data": {
"entities": {
"entityType": [
"task"
],
"action": [
"update"
]
}
}
}
}
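If you want to check a pattern against a sample event without deploying a rule, the EventBridge TestEventPattern API works well. A quick boto3 sketch, with the sample event trimmed to the required top-level fields plus the detail we match on:
# Quick sketch: verify the pattern matches the sample event using the
# EventBridge TestEventPattern API (no rule deployment needed).
import json
import boto3

pattern = {
    "source": ["example.com"],
    "detail-type": ["ExampleEvent"],
    "detail": {"Data": {"entities": {"entityType": ["task"], "action": ["update"]}}},
}

# Trimmed copy of the example event from the question.
event = {
    "id": "396bfea8-6311-c1ab-44cf-d44d93014a89",
    "detail-type": "ExampleEvent",
    "source": "example.com",
    "account": "207772098559",
    "time": "2020-05-31T19:44:55Z",
    "region": "eu-west-1",
    "resources": [],
    "detail": {"Data": {"entities": [
        {"entityType": "task", "action": "update",
         "entityId": "bbf74ec6-8762-48d6-b09f-23a97834fc2f"},
    ]}},
}

client = boto3.client("events")
resp = client.test_event_pattern(EventPattern=json.dumps(pattern), Event=json.dumps(event))
print(resp["Result"])  # True -> the pattern matches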
Example 2:
Here is another example with more nesting and a mixture of arrays and objects. As you can see, the pattern treats arrays and objects the same. This is a CloudWatch alarm for Redshift reporting high disk space usage.
{
"version": "0",
"id": "c4c1c1c9-6542-e61b-6ef0-8c4d36933a92",
"detail-type": "CloudWatch Alarm State Change",
"source": "aws.cloudwatch",
"account": "123456789012",
"time": "2019-10-02T17:04:40Z",
"region": "us-east-1",
"resources": ["arn:aws:cloudwatch:us-east-1:123456789012:alarm:ServerMemoryTooHigh"],
"detail": {
"alarmName": "ServerDiskSpaceTooHigh",
"configuration": {
"description": "Goes into alarm when server Disk Space utilization is too high!",
"metrics": [{
"id": "30b6c6b2-a864-43a2-4877-c09a1afc3b87",
"metricStat": {
"metric": {
"dimensions": {
"InstanceId": "i-12345678901234567"
},
"name": "PercentageDiskSpaceUsed",
"namespace": "AWS/Redshift"
},
"period": 300,
"stat": "Average"
},
"returnData": true
}]
},
"previousState": {
"reason": "Threshold Crossed: 1 out of the last 1 datapoints [0.0666851903306472 (01/10/19 13:46:00)] was not greater than the threshold (50.0) (minimum 1 datapoint for ALARM -> OK transition).",
"reasonData": "{\"version\":\"1.0\",\"queryDate\":\"2019-10-01T13:56:40.985+0000\",\"startDate\":\"2019-10-01T13:46:00.000+0000\",\"statistic\":\"Average\",\"period\":300,\"recentDatapoints\":[0.0666851903306472],\"threshold\":50.0}",
"timestamp": "2019-10-01T13:56:40.987+0000",
"value": "OK"
},
"state": {
"reason": "Threshold Crossed: 1 out of the last 1 datapoints [99.50160229693434 (02/10/19 16:59:00)] was greater than the threshold (50.0) (minimum 1 datapoint for OK -> ALARM transition).",
"reasonData": "{\"version\":\"1.0\",\"queryDate\":\"2019-10-02T17:04:40.985+0000\",\"startDate\":\"2019-10-02T16:59:00.000+0000\",\"statistic\":\"Average\",\"period\":300,\"recentDatapoints\":[99.50160229693434],\"threshold\":50.0}",
"timestamp": "2019-10-02T17:04:40.989+0000",
"value": "ALARM"
}
}
}
Here is the event pattern that matches it:
{
"detail-type": ["CloudWatch Alarm State Change"],
"source": ["aws.cloudwatch"],
"detail": {
"configuration": {
"metrics": {
"metricStat": {
"metric": {
"name": ["PercentageDiskSpaceUsed"],
"namespace": ["AWS/Redshift"]
}
}
}
},
"state": {
"value": ["ALARM"]
}
}
}

Alert policy "ALL conditions are met on matching resources" does not match filters or groupByFields

I want to create an alert in GCP Stackdriver for Pub/Sub along the lines of: "IF a subscription has more than XXX awaiting (unacked) messages for a topic AND the consumption rate of that consumer on that topic is near 0, THEN trigger an alert".
I'm more used to Prometheus, where I can simply rely on the labels to more or less join time series, but I'm wondering how to do that with Stackdriver.
At first, I thought about using 2 conditions with the "policy violates when ALL conditions are met on matching resources" combiner, but I'm wondering if that "matching resources" behaves the same way as in Prometheus.
Here is the alert I came up with, but it seems to trigger even when the 2 conditions are not both fulfilled:
{
"combiner": "AND_WITH_MATCHING_RESOURCE",
"conditions": [
{
"conditionThreshold": {
"aggregations": [
{
"alignmentPeriod": "60s",
"crossSeriesReducer": "REDUCE_SUM",
"groupByFields": [
"metadata.system_labels.topic_id",
"resource.label.subscription_id"
],
"perSeriesAligner": "ALIGN_RATE"
}
],
"comparison": "COMPARISON_LT",
"duration": "300s",
"filter": "metric.type=\"pubsub.googleapis.com/subscription/ack_message_count\" resource.type=\"pubsub_subscription\" resource.label.\"project_id\"=\"pl-service-prod-lm-fr\"",
"thresholdValue": 1,
"trigger": {
"count": 1
}
},
"displayName": "Ack message count"
},
{
"conditionThreshold": {
"aggregations": [
{
"alignmentPeriod": "60s",
"crossSeriesReducer": "REDUCE_SUM",
"groupByFields": [
"metadata.system_labels.topic_id",
"resource.label.subscription_id"
],
"perSeriesAligner": "ALIGN_MEAN"
}
],
"comparison": "COMPARISON_GT",
"duration": "300s",
"filter": "metric.type=\"pubsub.googleapis.com/subscription/num_undelivered_messages\" resource.type=\"pubsub_subscription\" resource.label.\"project_id\"=\"pl-service-prod-lm-fr\"",
"trigger": {
"count": 1
}
},
"displayName": "Unacked messages"
}
],
"displayName": "Pub/Sub is not consumed",
"enabled": true,
"incidentStrategy": {}
}