GCP Alerting Policy to Alert on KMS Key Deletion Using Terraform - google-cloud-platform

I am trying to alert on KMS Key deletions using terraform.
I have a log based metric, a policy and a notification channel to PagerDuty.
This all works, however, following the alert triggering it soon clears and there seems to be nothing I can do to stop this.
Here is my code:
resource "google_logging_metric" "logging_metric" {
name = "kms-key-pending-deletion"
description = "Logging metric used to alert on scheduled deletions of KMS keys"
filter = "resource.type=cloudkms_cryptokeyversion AND protoPayload.methodName=DestroyCryptoKeyVersion"
metric_descriptor {
metric_kind = "DELTA"
value_type = "INT64"
unit = "1"
display_name = "kms-key-pending-deletion-metric-descriptor"
}
}
resource "google_monitoring_notification_channel" "pagerduty_alerts" {
display_name = "pagerduty-notification-channel"
type = "pagerduty"
sensitive_labels {
service_key = var.token
}
}
resource "google_monitoring_alert_policy" "kms_key_deletion_alert_policy" {
display_name = "kms-key-deletion-alert-policy"
combiner = "OR"
notification_channels = [google_monitoring_notification_channel.pagerduty_alerts.name]
conditions {
display_name = "kms-key-deletion-alert-policy-conditions"
condition_threshold {
comparison = "COMPARISON_GT"
duration = "300s"
filter = "metric.type=\"logging.googleapis.com/user/kms-key-pending-deletion\" AND resource.type=\"global\""
threshold_value = "0"
}
}
documentation {
content = "Runbook: https://blah"
}
}
In the GCP GUI I can disable the option "Notify on incident closure" in the policy and it stops the alert from clearing.
However I cannot set this via terraform.
I have tried setting alert_strategy.auto_close to null and 0s but this did not work:
alert_strategy {
auto_close = "0s"
# auto_close = null
}
How do I keep the alert active and stop it from clearing when building the policy in terraform?
Am I using the correct resource type? - Should I be using cloudkms.cryptoKey.state that are in "DESTROY_SCHEDULED" state somehow?

For others wanting to find the answer to this:
The need to keep an alert open and not allow it to automatically close is missing in the API.
The issue is tracked here: https://issuetracker.google.com/issues/151052441?pli=1

Related

Add environment based Multiple Notification Channel to GCP Alert Policy with Terraform Lookup

I'm trying to add multiple notification channels to a GCP Alert policy with terraform.
My issue is that I need to add different notification channels based on the production environment where they are deployed.
As long as I keep the notification channel unique, I can easily deploy in the following way.
Here is my variables.tf file:
locals {
notification_channel = {
DEV = "projects/[PROJECT_ID]/notificationChannels/[CHANNEL_ID]"
PRD = "projects/[PROJECT_ID]/notificationChannels/[CHANNEL_ID]"
}
}
Here is my main.tf file:
resource "google_monitoring_alert_policy" "alert_policy" {
display_name = "My Alert Policy"
combiner = "OR"
conditions {
display_name = "test condition"
condition_threshold {
filter = "metric.type=\"compute.googleapis.com/instance/disk/write_bytes_count\" AND resource.type=\"gce_instance\""
duration = "60s"
comparison = "COMPARISON_GT"
aggregations {
alignment_period = "60s"
per_series_aligner = "ALIGN_RATE"
}
}
}
user_labels = {
foo = "bar"
}
notification_channels = [lookup(local.notification_channel, terraform.workspace)]
}
My issue here happens when I try to map multiple notification channels instead of one per environment.
Something like:
locals {
notification_channel = {
DEV = ["projects/[PROJECT_ID]/notificationChannels/[CHANNEL_ID]", "projects/[PROJECT_ID]/notificationChannels/[CHANNEL_ID]" ]...
}
}
However, if I try this way, system tells me that Inappropriate value for attribute "notification_channels": element 0: string.
Here's documentation of:
Terraform Lookup function Terraform
GCP Alert Policy
Could you help?
If I understood your question, you actually need only to remove the square brackets:
notification_channels = lookup(local.notification_channel, terraform.workspace)
Since the local variable notification_channel is already a list, you only need to use lookup to fetch the value based on the workspace you are currently in.

Terraform Google provider, create log-based alerting policy

I need to create a log-based alerting policy via Terraform Google cloud provider :
https://cloud.google.com/logging/docs/alerting/monitoring-logs#lba
I checked from the Terraform official documentation and i saw 'google_monitoring_alert_policy' resource : https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/monitoring_alert_policy
I don't found with this doc how creating a log based alerting policy.
I can create an alerting policy with type 'Metrics' but not with type 'Logs'
I use the latest version of Terraform Google cloud provider : https://registry.terraform.io/providers/hashicorp/google/latest
How can i create a log-based alerting policy with Terraform Google provider please ?
Thanks in advance for your help.
Problem is solved with version 4.7.0 of google provider, which adds condition_matched_log. Here is a working example :
resource "google_monitoring_notification_channel" "email-me" {
display_name = "Email Me"
type = "email"
labels = {
email_address = "me#mycompany.com"
}
}
resource "google_monitoring_alert_policy" "workflows" {
display_name = "Workflows alert policy"
combiner = "OR"
conditions {
display_name = "Error condition"
condition_matched_log {
filter = "resource.type=\"workflows.googleapis.com/Workflow\" severity=ERROR"
}
}
notification_channels = [ google_monitoring_notification_channel.email-me.name ]
alert_strategy {
notification_rate_limit {
period = "300s"
}
}
}
Thanks Guillaume.
Yes it's the way i solved the issue.
Now there is no way to directly create alerting with log type, via Terraform.
The steps to solve this problem :
Create un log based metric with expected filter
Create an alerting policy with type metric based on the previous created log based metric
resource "google_logging_metric" "my_log_metrics" {
project = var.project_id
name = "my-log-metric"
filter = "..."
description = "..."
metric_descriptor {
metric_kind = "..."
value_type = "..."
}
}
resource "google_monitoring_alert_policy" "my_policy" {
project = var.project_id
display_name = "my-policy"
combiner = "OR"
conditions {
display_name = "my-policy"
condition_threshold {
filter = "metric.type=\"logging.googleapis.com/user/my-log-metric\" AND resource.type=\"cloud_composer_environment\""
...
}
}
The format is logging.googleapis.com/user/<user metrics name>
Look at this example (no notification, only the alert policy)
resource "google_monitoring_alert_policy" "alert_policy" {
display_name = "My Alert Policy"
combiner = "OR"
conditions {
display_name = "test condition"
condition_threshold {
filter = "metric.type=\"logging.googleapis.com/user/test-metrics\" AND resource.type=\"cloud_run_revision\""
duration = "600s"
comparison = "COMPARISON_GT"
threshold_value = 1
}
}
user_labels = {
foo = "bar"
}
}

How to create a slack notification channel in Google Cloud Platform with terraform

I'm trying to create a slack notification channel in GCP with terraform. I am able to create a channel with the code below, but it's missing "Team" and "Owner" attributes.
resource "google_monitoring_notification_channel" "default" {
display_name = "Test Slack Channel"
type = "slack"
enabled = "true"
labels = {
"channel_name" = "#testing"
"auth_token" = "<my_slack_app_token>"
}
}
The first channel in the screenshot below was created via GUI and works fine. The second channel was created via terraform and is unable to send notificaitons:
Terraform registry does not mention these attributes, I have tried defining them in labels right after channel_name:
labels = {
"channel_name" = "#testing"
"team" = "<my_team>"
"owner" = "google_cloud_monitoring"
"auth_token" = "<my_slack_app_token>"
}
I got the following error:
Error creating NotificationChannel: googleapi: Error 400: Field "notification_channel.labels['owner']" is not allowed; labels must conform to the channel type's descriptor; permissible label keys for "slack" are: {"auth_token", "channel_name"}
Apparently, only channel_name and auth_token are valid labels.
What am I missing?
Slack needs the sensitive_lables option for tokens. There is an example in the docs
resource "google_monitoring_notification_channel" "default" {
display_name = "Test Slack Channel"
type = "slack"
labels = {
"channel_name" = "#foobar"
}
sensitive_labels {
auth_token = "...."
}
}

How to associate new "aws_wafregional_rule" with existing WAF ACL

I have created a WAF ACL using AWS Console. Now I need to create WAF Rule using Terraform, so I have implemented below rule.
resource "aws_wafregional_byte_match_set" "blocked_path_match_set" {
name = format("%s-%s-blocked-path", local.name, var.module)
dynamic "byte_match_tuples" {
for_each = length(var.blocked_path_prefixes) > 0 ? var.blocked_path_prefixes : []
content {
field_to_match {
type = lookup(byte_match_tuples.value, "type", null)
}
target_string = lookup(byte_match_tuples.value, "target_string", null)
positional_constraint = lookup(byte_match_tuples.value, "positional_constraint", null)
text_transformation = lookup(byte_match_tuples.value, "text_transformation", null)
}
}
}
resource "aws_wafregional_rule" "blocked_path_allowed_ipaccess" {
metric_name = format("%s%s%sBlockedPathIpaccess", var.application, var.environment, var.module)
name = format("%s%s%sBlockedPathIpaccessRule", var.application, var.environment, var.module)
predicate {
type = "ByteMatch"
data_id = aws_wafregional_byte_match_set.blocked_path_match_set.id
negated = false
}
}
But how do I map this new rule to existing "web_acl" which was created through AWS Console. As per documentation I can use "aws_wafregional_web_acl" to create new web_acl, but is there a way to associate rule created through terraform with existing waf_acl ? I have a gitlab pipeline which deploys terraform code to aws, so eventually I will pass id/arn of existing web_acl and through pipeline just add/update new rule without impacting existing rules which were created through console.
Please share your valuable feedback.
Thank you.
As per the WAF documentation you associate the rule via an AWS WAF resource, see the example below for a code snippet.
resource "aws_wafregional_web_acl" "foo" {
name = "foo"
metric_name = "foo"
default_action {
type = "ALLOW"
}
rule {
action {
type = "BLOCK"
}
priority = 1
rule_id = aws_wafregional_rule.blocked_path_allowed_ipaccess.id
}
}
However, as you said you have created the resource already in the AWS console. Terraform does support the import of an AWS resource, so you would need to go with this method if you would like to manage it via Terraform.

How to create an alert policy for unknown custom metric in GCP

Given the following alert policy in GCP (created with terraform)
resource "google_monitoring_alert_policy" "latency_alert_policy" {
display_name = "Latency of 95th percentile more than 1 second"
combiner = "OR"
conditions {
display_name = "Latency of 95th percentile more than 1 second"
condition_threshold {
filter = "metric.type=\"custom.googleapis.com/http/server/requests/p95\" resource.type=\"k8s_pod\""
threshold_value = 1000
duration = "60s"
comparison = "COMPARISON_GT"
aggregations {
alignment_period = "60s"
per_series_aligner= "ALIGN_NEXT_OLDER"
cross_series_reducer= "REDUCE_MAX"
group_by_fields = [
"metric.label.\"uri\"",
"metric.label.\"method\"",
"metric.label.\"status\"",
"metadata.user_labels.\"app.kubernetes.io/name\"",
"metadata.user_labels.\"app.kubernetes.io/component\""
]
}
trigger {
count = 1
percent = 0
}
}
}
}
I get the following this error (which is part of a terraform project also creating the cluster):
Error creating AlertPolicy: googleapi: Error 404: The metric referenced by the provided filter is unknown. Check the metric name and labels.
Now, this is a custom metric (by a Spring Boot app with Micrometer), therefore this metric does not exist when creating infrastructure. Does GCP have to know a metric before creating an alert for it? This would mean that a Spring boot app has to be deployed on a cluster and sending metrics before this policy can be created?
Am I missing something... (like this should not be done in terraform, infrastructure)?
interesting question, the reason for the 404 error is because the resource was not found, there seems to be a preexisting pre-requisite for the descriptor. I would create the metric descriptor first, you can use this as reference, then going forward on creating the alerting policy.
This is an ingenious way you may avoid it. Please comment if it makes sense and if you make it work like this, share it.
For reference (this can be referenced from the alert policy according to terraform doc):
resource "google_monitoring_metric_descriptor" "p95_latency" {
description = ""
display_name = ""
type = "custom.googleapis.com/http/server/requests/p95"
metric_kind = "GAUGE"
value_type = "DOUBLE"
labels {
key = "status"
}
labels {
key = "uri"
}
labels {
key = "exception"
}
labels {
key = "method"
}
labels {
key = "outcome"
}
}